Edit ‘large_language_model’

osmarks 2024-09-05 20:05:25 +00:00 committed by wikimind
parent 7f7485a66d
commit e2b93a9bbd

@@ -1,4 +1,4 @@
-A large language model is a [[neural net]] model of [[language]] which is [[large]], usually in the sense of parameter count or total [[compute]], making them [[good]] at text prediction by [[scaling laws]]. Usually these are [[autoregressive]] and pretrained on general text data with a next token prediction [[loss function]], though this is not necessarily required. The largest large LLMs known are around 2 trillion parameters, though the smallest LLM is not known.
+A large language model is a [[neural net]] model of [[language]] which is [[large]], usually in the sense of parameter count or total [[compute]], making it [[good]] at text prediction by [[scaling laws]]. Usually these are [[autoregressive]] and pretrained on general text data with a next token prediction [[loss function]], though this is not necessarily required. The largest large LLMs known are around 2 trillion parameters, though the smallest LLM is not known.
 == History ==
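
The next token prediction [[loss function]] mentioned in the diffed paragraph can be sketched concretely. Below is a minimal illustration in PyTorch of the standard autoregressive cross-entropy objective; the `model`, its output shape, and the token tensor are hypothetical assumptions for illustration, not anything specified on this page.

```python
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens):
    # Assumption: `model` is a hypothetical autoregressive network mapping
    # token ids of shape [batch, seq_len] to logits of shape
    # [batch, seq_len, vocab_size].
    inputs = tokens[:, :-1]   # model sees positions 0..n-2
    targets = tokens[:, 1:]   # and must predict positions 1..n-1
    logits = model(inputs)    # [batch, seq_len-1, vocab_size]
    # Flatten batch and time dimensions and compute mean cross-entropy
    # between predicted distributions and the actual next tokens.
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )
```

Pretraining then amounts to minimizing this loss over large amounts of general text, with no requirement beyond the shifted-by-one input/target alignment shown above.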