mirror of https://github.com/osmarks/nanogpt-experiments.git synced 2024-12-18 14:10:28 +00:00
Commit Graph

17 Commits

Author SHA1 Message Date
Andrej
fb52554ca8 Merge pull request #1 from ankandrew/master (Minor Frozen GPTConfig) 2022-12-29 13:45:20 -08:00
ankandrew
7f0e6d9a71 Frozen GPTConfig 2022-12-29 17:07:19 -03:00
Andrej Karpathy
682a0ac8f1 properly resume training, also loading iter_num and best_val_loss from checkpoints 2022-12-29 18:23:15 +00:00
Andrej Karpathy
f88aa2c2fe add link to mingpt 2022-12-29 17:38:33 +00:00
Andrej Karpathy
f2fc4be69b mention 4gpu loss as well in readme 2022-12-29 17:26:42 +00:00
Andrej Karpathy
fa57d464d7 pull out dtype up top 2022-12-29 05:32:55 +00:00
Andrej Karpathy
e7bac659f5 oops missed one # have to fix 2022-12-29 05:24:14 +00:00
Andrej Karpathy
97e2ab1b8d enhance readme, add some todos 2022-12-29 05:23:36 +00:00
Andrej
cc11744131 Add MIT LICENSE file 2022-12-28 21:11:26 -08:00
Andrej Karpathy
dea1507252 add support for DDP training. the scaling timings right now do not look good by default, have to dig more into 2022-12-29 05:06:07 +00:00
Andrej Karpathy
ee6459f1d0 readme tweaks 2022-12-29 02:00:25 +00:00
Andrej Karpathy
3000cf5dda add pytorch profiler support. not sure how to support both profiler and simple benchmarking, a bit gnarly atm hmm 2022-12-29 01:49:53 +00:00
Andrej Karpathy
b760ef1358 add data loading into benchmarking as well, just for completeness 2022-12-29 00:05:32 +00:00
Andrej Karpathy
70b5d93aee add benchmarking script v0 2022-12-28 23:55:43 +00:00
Andrej Karpathy
5d2b4807bf adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm 2022-12-28 23:31:23 +00:00
Andrej Karpathy
c9fe00c0e9 small readme clarification and training script defaults changes 2022-12-28 01:45:55 +00:00
Andrej Karpathy
fe8042867c first very bad commit 2022-12-28 00:58:19 +00:00