Create ‘gpt-4’
parent 6be8909f4a
commit 1f669e2d3c
14
gpt-4.myco
Normal file
@@ -0,0 +1,14 @@
GPT-4 is a [[large language model]] by [[OpenAI]], trained in 2022 and released in 2023. Like [[GPT-3]], the model was released only through an [[API]]; unlike GPT-3, the [[https://arxiv.org/abs/2303.08774|technical report]] disclosed almost no technical details such as architecture, dataset or parameter count, though credible leaks later provided some of this information.

== Architecture

The following figures come from the leaks, not from OpenAI:
* About 1.8 trillion parameters in total.
* 16-way [[Mixture of Experts]], with two experts used per forward pass.
* Trained on about 25,000 [[A100]] [[GPU]]s for ~90 days.
* ~13T tokens of training data (not all unique; some data was repeated over multiple epochs).
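To make two of the figures above concrete, here is a minimal sketch of what "16 experts, two used per forward pass" can mean for a single token, plus the GPU-hour arithmetic implied by the cluster size and duration. The routing rule (softmax gate, keep the top two, renormalise their weights) is a common Mixture of Experts scheme and is an assumption here; nothing in the leaks as summarised on this page says GPT-4's router works this way. The names `route_top_k` and `gpu_hours` are hypothetical.

```python
import math
import random

NUM_EXPERTS = 16  # "16-way" from the list above
TOP_K = 2         # experts used per forward pass, from the list above


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=TOP_K):
    """Assumed routing rule: keep the k highest-probability experts for one
    token and renormalise their gate weights so they sum to 1."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def gpu_hours(num_gpus, days):
    """Total GPU-hours for a run that keeps num_gpus busy for the given days."""
    return num_gpus * days * 24


# Random router scores stand in for a real router's output (illustrative only).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = route_top_k(logits)
print(routing)                # two (expert_index, weight) pairs
print(gpu_hours(25_000, 90))  # 54000000
```

Under this kind of routing only 2 of the 16 experts run for any given token, so the compute per token is much smaller than the total parameter count suggests. The GPU-hour figure is a ceiling derived from the two leaked numbers and assumes the whole cluster was in use for the full ~90 days.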
== GPT-4-base

Unlike GPT-3, which was released as a [[base model]], GPT-4 was released to the general public only in various [[instruction tuning|instruction-tuned]] forms, for "safety". GPT-4's base model is accessible under NDA. According to the limited information from those who have used it,

== Sydney