nanogpt-experiments

mirror of https://github.com/osmarks/nanogpt-experiments.git synced 2024-12-18 14:10:28 +00:00

openwebtext dataset

after running prepare.py (preprocess) we get:

this came from 8,013,769 documents in total.
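
a quick way to sanity-check the preprocessed output is to count the tokens in each file from its size on disk. the sketch below is not part of prepare.py. it assumes the output is a flat binary file of fixed-width token ids, 2 bytes each (uint16), and the file names `train.bin` and `val.bin` are hypothetical placeholders, so adjust both to match what the script actually wrote.

```python
import os

# Assumption: each token id is stored as a 2-byte unsigned integer (uint16).
BYTES_PER_TOKEN = 2


def count_tokens(path, bytes_per_token=BYTES_PER_TOKEN):
    """Return the number of fixed-width token ids stored in a flat binary file."""
    size = os.path.getsize(path)
    if size % bytes_per_token != 0:
        # A partial token means the file is truncated or the width is wrong.
        raise ValueError(
            f"{path}: size {size} is not a multiple of {bytes_per_token}"
        )
    return size // bytes_per_token


if __name__ == "__main__":
    # Hypothetical output file names; change to whatever prepare.py produced.
    for name in ("train.bin", "val.bin"):
        if os.path.exists(name):
            print(f"{name}: {count_tokens(name):,} tokens")
        else:
            print(f"{name}: not found")
```

this only reads file metadata, so it is cheap even on very large files. if the token width is different, pass `bytes_per_token` explicitly.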

references: