mirror of https://github.com/osmarks/nanogpt-experiments.git synced 2024-12-18 14:10:28 +00:00
Testing various LLM-related things.

nanoGPT

The cleanest, fastest repository for training/finetuning medium-sized GPTs.

This repo currently requires reading the code, but it's not that bad. Work ongoing...

Getting started:

We need a few dependencies:

  • pytorch, of course
  • numpy
  • pip install datasets for huggingface datasets
  • pip install tiktoken for OpenAI's fast bpe code
  • pip install wandb for optional logging
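
Before going further, it can help to confirm that the dependencies listed above are importable. The snippet below is a minimal sketch, not part of the repo: the helper name missing_modules is hypothetical, and it assumes each package is imported under the name shown (torch for pytorch, and so on).

```python
import importlib.util

# Import names for the dependencies listed above (pytorch is imported as "torch").
REQUIRED = ["torch", "numpy", "datasets", "tiktoken", "wandb"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all dependencies found")
```

Since wandb is only needed for optional logging, a report that it is missing can be ignored if logging is not used.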

Then prepare the dataset:

$ cd data/openwebtext
$ python prepare.py

This downloads and tokenizes the openwebtext dataset. It will create train.bin and val.bin, which hold the GPT-2 BPE token ids in one massive sequence. Then we're ready to kick off training. First open up train.py and read it to make sure the settings look ok. Then:

$ python train.py
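
For orientation, here is a rough sketch of what "token ids in a massive sequence" can look like on disk and how training batches might be drawn from such a file. It is an illustration under stated assumptions, not the code in train.py: the uint16 storage type, the helper names write_tokens and get_batch, and the toy data are all hypothetical.

```python
import os
import tempfile

import numpy as np

# Assumption: ids are stored as raw uint16 values, which is wide enough
# for the GPT-2 BPE vocabulary (50257 entries, below 65536).
TOKEN_DTYPE = np.uint16


def write_tokens(path, token_ids):
    """Write token ids to `path` as one flat binary sequence."""
    np.asarray(token_ids, dtype=TOKEN_DTYPE).tofile(path)


def get_batch(path, batch_size, block_size, rng):
    """Sample (x, y) pairs where y is x shifted one token to the right."""
    # memmap avoids loading the whole file into memory.
    data = np.memmap(path, dtype=TOKEN_DTYPE, mode="r")
    starts = rng.integers(0, len(data) - block_size, size=batch_size)
    x = np.stack([data[i : i + block_size] for i in starts]).astype(np.int64)
    y = np.stack([data[i + 1 : i + 1 + block_size] for i in starts]).astype(np.int64)
    return x, y


if __name__ == "__main__":
    # Toy data standing in for real token ids.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.bin")
        write_tokens(path, range(1000))
        x, y = get_batch(path, batch_size=4, block_size=8, rng=np.random.default_rng(0))
        print(x.shape, y.shape)
```

Each row of y is the corresponding row of x shifted by one position, which is the usual next-token prediction target for a GPT-style model.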

Once some checkpoints are written to the output directory (out), we're ready to sample from the model:

$ python sample.py