Edit ‘large_language_model’
parent 7f7485a66d
commit e2b93a9bbd
@@ -1,4 +1,4 @@
-A large language model is a [[neural net]] model of [[language]] which is [[large]], usually in the sense of parameter count or total [[compute]], making them [[good]] at text prediction by [[scaling laws]]. Usually these are [[autoregressive]] and pretrained on general text data with a next token prediction [[loss function]], though this is not necessarily required. The largest large LLMs known are around 2 trillion parameters, though the smallest LLM is not known.
+A large language model is a [[neural net]] model of [[language]] which is [[large]], usually in the sense of parameter count or total [[compute]], making it [[good]] at text prediction by [[scaling laws]]. Usually these are [[autoregressive]] and pretrained on general text data with a next token prediction [[loss function]], though this is not necessarily required. The largest large LLMs known are around 2 trillion parameters, though the smallest LLM is not known.
 
 == History
 
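For reference, the next token prediction [[loss function]] mentioned in the edited paragraph is just cross-entropy on tokens shifted by one position. A minimal PyTorch-style sketch (not part of the commit; `model` is a hypothetical autoregressive LM returning logits of shape `(batch, seq_len, vocab)`):

```python
import torch.nn.functional as F

def next_token_loss(model, tokens):
    # tokens: (batch, seq_len) integer token ids
    logits = model(tokens[:, :-1])            # predictions from each prefix
    targets = tokens[:, 1:]                   # the actual next tokens
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # (batch*(seq_len-1), vocab)
        targets.reshape(-1),                  # (batch*(seq_len-1),)
    )
```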