nanogpt-experiments/config at c1ac2d58f13dc64863faa9b3ee3dbc4075a4d77f - nanogpt-experiments - osmarks projects hosting

osmarks/nanogpt-experiments

mirror of https://github.com/osmarks/nanogpt-experiments.git synced 2025-11-17 23:55:13 +00:00

Andrej Karpathy d17350a31d add support for character-level language models, a new character-level shakespeare dataset, a new config file that shows how to train a character-level baby GPT on it, and adjust the sample function to figure out if it should decode with characters or GPT2 bpe tokens. The current implementation is a bit hacky and basically assumes just these two possibilities. In the future we may want to support more general encoders or decoders.

2023-01-11 05:27:19 +00:00

eval_gpt2_large.py

adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm

2022-12-28 23:31:23 +00:00

eval_gpt2_medium.py

adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm

2022-12-28 23:31:23 +00:00

eval_gpt2_xl.py

adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm

2022-12-28 23:31:23 +00:00

eval_gpt2.py

adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm

2022-12-28 23:31:23 +00:00

finetune_shakespeare.py

rename compile_model to compile, shorter, version 2 stragglers

2023-01-02 01:15:55 +00:00

train_shakespeare_char.py

add support for character-level language models, a new character-level shakespeare dataset, a new config file that shows how to train a character-level baby GPT on it, and adjust the sample function to figure out if it should decode with characters or GPT2 bpe tokens. The current implementation is a bit hacky and basically assumes just these two possibilities. In the future we may want to support more general encoders or decoders.

2023-01-11 05:27:19 +00:00
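
The commit message for train_shakespeare_char.py mentions a character-level dataset and a sample function that decodes either characters or GPT-2 BPE tokens. As a rough illustration of what a character-level codec might look like, here is a minimal sketch. The helper name `build_char_codec` and the `stoi`/`itos` mappings are assumptions made for this example, not code taken from the repository.

```python
# Hypothetical sketch of a character-level encoder/decoder.
# Names here are illustrative and are not taken from the repository.

def build_char_codec(text):
    """Build encode/decode functions from the unique characters in `text`."""
    chars = sorted(set(text))                      # vocabulary: one entry per distinct character
    stoi = {ch: i for i, ch in enumerate(chars)}   # character -> integer id
    itos = {i: ch for ch, i in stoi.items()}       # integer id -> character

    def encode(s):
        # Raises KeyError for characters that were not in the training text.
        return [stoi[c] for c in s]

    def decode(ids):
        return "".join(itos[i] for i in ids)

    return encode, decode, len(chars)


encode, decode, vocab_size = build_char_codec("hello world")
ids = encode("hello")
roundtrip = decode(ids)
```

A sampling script could pick between a codec like this and a BPE tokenizer depending on which dataset the model was trained on; how the repository actually makes that choice is not shown in this listing.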