mirror of https://github.com/osmarks/nanogpt-experiments.git
synced 2024-12-23 00:20:29 +00:00
add reference for 6ND to notebook too
parent eae986c2d2
commit 0bb96d3fff
2
transformer_sizing.ipynb
generated
@@ -358,7 +358,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "This is not a bad estimate at all. I trained this model and it converged in roughly 4 days."
+ "This is not a bad estimate at all. I trained this model and it converged in roughly 4 days. Btw as a good reference for where 6ND comes from and some intuition around it I recommend [Dzmitry's post](https://medium.com/@dzmitrybahdanau/the-flops-calculus-of-language-model-training-3b19c1f025e4)."
  ]
  },
  {
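For context on the "6ND" rule the added sentence refers to: training compute is commonly approximated as C ≈ 6·N·D FLOPs, where N is the parameter count and D is the number of training tokens (roughly 2·N FLOPs per token for the forward pass and 4·N for the backward pass). The sketch below shows how a wall-clock estimate could follow from that. The function names, parameters, and example inputs are hypothetical illustrations and are not taken from this commit or the notebook.

```python
SECONDS_PER_DAY = 86400.0


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute with the 6ND rule: C ~ 6 * N * D."""
    return 6.0 * n_params * n_tokens


def training_days(
    n_params: float,
    n_tokens: float,
    peak_flops_per_gpu: float,
    n_gpus: int,
    mfu: float,
) -> float:
    """Estimated wall-clock days given peak hardware FLOP/s and utilization.

    mfu is the assumed model FLOPs utilization, a fraction between 0 and 1.
    """
    sustained_flops_per_sec = peak_flops_per_gpu * n_gpus * mfu
    return training_flops(n_params, n_tokens) / sustained_flops_per_sec / SECONDS_PER_DAY


# Illustrative inputs only (assumptions): a 124M-parameter model, 300B tokens,
# 8 GPUs at 312 TFLOP/s peak each, and 30% utilization.
days = training_days(124e6, 300e9, 312e12, 8, 0.30)
print(f"estimated training time: {days:.2f} days")
```

The utilization figure dominates the uncertainty here, so an estimate of this kind is best read as an order-of-magnitude check against an observed training time.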