diff --git a/large_language_model.myco b/large_language_model.myco
index 59f564a..fb73502 100644
--- a/large_language_model.myco
+++ b/large_language_model.myco
@@ -1,4 +1,4 @@
-A large language model is a [[neural net]] model of [[language]] which is [[large]], usually in the sense of parameter count or total [[compute]], making them [[good]] at text prediction by [[scaling laws]]. Usually these are [[autoregressive]] and pretrained on general text data with a next token prediction [[loss function]], though this is not necessarily required. The largest large LLMs known are around 2 trillion parameters, though the smallest LLM is not known.
+A large language model is a [[neural net]] model of [[language]] which is [[large]], usually in the sense of parameter count or total [[compute]], making it [[good]] at text prediction by [[scaling laws]]. Usually these are [[autoregressive]] and pretrained on general text data with a next token prediction [[loss function]], though this is not necessarily required. The largest large LLMs known are around 2 trillion parameters, though the smallest LLM is not known.
 == History