nanogpt-experiments

mirror of https://github.com/osmarks/nanogpt-experiments.git synced 2025-11-25 11:34:51 +00:00

Author	SHA1	Message	Date
Andrej Karpathy	41184a27f5	rename compile_model to compile, shorter, version 2 stragglers	2023-01-02 01:15:55 +00:00
Andrej Karpathy	35f51974c4	rename to compile it's shorter	2023-01-02 01:14:46 +00:00
Andrej Karpathy	2febf4463c	candidate changes to apis, have to think through more	2023-01-01 01:29:48 +00:00
Andrej Karpathy	7c6ea8409e	simplify the prepare script a lot, write only using one process, seems sufficient for now. ty @LaihoE for suggestion and @proger for flagging	2022-12-30 22:18:20 +00:00
Andrej Karpathy	d8abd21258	typo fix in readme	2022-12-30 00:07:58 +00:00
Andrej Karpathy	5a725d9098	add torch.compile by default, shows almost 1.8X improvement in throughput nice	2022-12-30 00:07:13 +00:00
Andrej	fb52554ca8	Merge pull request #1 from ankandrew/master Minor Frozen GPTConfig	2022-12-29 13:45:20 -08:00
ankandrew	7f0e6d9a71	Frozen GPTConfig	2022-12-29 17:07:19 -03:00
Andrej Karpathy	682a0ac8f1	properly resume training, also loading iter_num and best_val_loss from checkpoints	2022-12-29 18:23:15 +00:00
Andrej Karpathy	f88aa2c2fe	add link to mingpt	2022-12-29 17:38:33 +00:00
Andrej Karpathy	f2fc4be69b	mention 4gpu loss as well in readme	2022-12-29 17:26:42 +00:00
Andrej Karpathy	fa57d464d7	pull out dtype up top	2022-12-29 05:32:55 +00:00
Andrej Karpathy	e7bac659f5	oops missed one # have to fix	2022-12-29 05:24:14 +00:00
Andrej Karpathy	97e2ab1b8d	enhance readme, add some todos	2022-12-29 05:23:36 +00:00
Andrej	cc11744131	Add MIT LICENSE file	2022-12-28 21:11:26 -08:00
Andrej Karpathy	dea1507252	add support for DDP training. the scaling timings right now do not look good by default, have to dig more into	2022-12-29 05:06:07 +00:00
Andrej Karpathy	ee6459f1d0	readme tweaks	2022-12-29 02:00:25 +00:00
Andrej Karpathy	3000cf5dda	add pytorch profiler support. not sure how to support both profiler and simple benchmarking, a bit gnarly atm hmm	2022-12-29 01:49:53 +00:00
Andrej Karpathy	b760ef1358	add data loading into benchmarking as well, just for completeness	2022-12-29 00:05:32 +00:00
Andrej Karpathy	70b5d93aee	add benchmarking script v0	2022-12-28 23:55:43 +00:00
Andrej Karpathy	5d2b4807bf	adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm	2022-12-28 23:31:23 +00:00
Andrej Karpathy	c9fe00c0e9	small readme clarification and training script defaults changes	2022-12-28 01:45:55 +00:00
Andrej Karpathy	fe8042867c	first very bad commit	2022-12-28 00:58:19 +00:00

73 Commits