Mirror of https://github.com/osmarks/nanogpt-experiments.git, synced 2024-12-18 14:10:28 +00:00.
Testing various LLM-related things.
nanoGPT
The cleanest, fastest repository for training/finetuning medium-sized GPTs.
This repo currently requires reading the code to use, but it's not that bad. Work ongoing...
Getting started:
We need a few dependencies:
- pytorch, of course
- numpy
- pip install datasets for huggingface datasets
- pip install tiktoken for OpenAI's fast BPE code
- pip install wandb for optional logging
$ cd data/openwebtext
$ python prepare.py
This downloads and tokenizes the openwebtext dataset, creating train.bin and val.bin, which hold the GPT-2 BPE token ids in one massive sequence. Then we're ready to kick off training. First open up train.py and read it to make sure the settings look ok. Then:
$ python train.py
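During training, batches are read straight out of the .bin files rather than loading everything into memory. A minimal sketch of that pattern (the function name, path, and hyperparameters here are illustrative, not the repo's exact code):

```python
import numpy as np
import torch

def get_batch(path, block_size=8, batch_size=4):
    # tokens are stored on disk as a flat array of uint16 GPT-2 BPE ids
    data = np.memmap(path, dtype=np.uint16, mode="r")
    # pick random starting offsets for each sequence in the batch
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([torch.from_numpy(data[int(i):int(i) + block_size].astype(np.int64)) for i in ix])
    # targets are the same sequences shifted one token to the right
    y = torch.stack([torch.from_numpy(data[int(i) + 1:int(i) + 1 + block_size].astype(np.int64)) for i in ix])
    return x, y
```

memmap keeps only the touched pages in RAM, which is why a multi-GB token file is cheap to sample from.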
Once some checkpoints are written to the output directory out, we're ready to sample from the model:
$ python sample.py
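Sampling loads a checkpoint and generates tokens autoregressively. The core decoding loop looks roughly like this (a sketch with the usual temperature/top-k knobs, shown against a generic model that maps token ids to logits; not the repo's exact implementation):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens, temperature=1.0, top_k=None):
    # idx: (batch, time) tensor of token ids; appends one sampled token per step
    for _ in range(max_new_tokens):
        logits = model(idx)[:, -1, :] / temperature  # logits at the last position
        if top_k is not None:
            v, _ = torch.topk(logits, top_k)
            # mask out everything below the k-th largest logit
            logits[logits < v[:, [-1]]] = -float("inf")
        probs = F.softmax(logits, dim=-1)
        idx_next = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, idx_next), dim=1)
    return idx
```

Lower temperature sharpens the distribution toward the argmax; top_k cuts off the long tail of unlikely tokens.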