mirror of
https://github.com/osmarks/nanogpt-experiments.git
synced 2024-11-10 20:09:58 +00:00
add reference for 6ND to notebook too
parent eae986c2d2
commit 0bb96d3fff
2
transformer_sizing.ipynb
generated
@@ -358,7 +358,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"This is not a bad estimate at all. I trained this model and it converged in roughly 4 days."
+"This is not a bad estimate at all. I trained this model and it converged in roughly 4 days. Btw as a good reference for where 6ND comes from and some intuition around it I recommend [Dzmitry's post](https://medium.com/@dzmitrybahdanau/the-flops-calculus-of-language-model-training-3b19c1f025e4)."
 ]
 },
 {