nanogpt-experiments

mirror of https://github.com/osmarks/nanogpt-experiments.git synced 2025-01-19 05:32:52 +00:00

Commit 542ac51d1f by Oleksandr Kuvshynov, 2023-06-17 20:35:38 -04:00: nanogpt: fix multiprocessing in load_dataset on OS X. The issue seems to be that `_fixup_main_from_path` in Python's multiprocessing module is unable to find the entry point, hence the added `if __name__ == '__main__'` guard.
prepare.py	nanogpt: fix multiprocessing in load_dataset on os x	2023-06-17 20:35:38 -04:00
readme.md	first very bad commit	2022-12-28 00:58:19 +00:00
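
The `if __name__ == '__main__'` guard named in the commit message above can be illustrated with a minimal sketch. This is not the code in prepare.py, which is not shown here; `count_words`, `main`, and the sample documents are hypothetical names made up for illustration, and the sketch uses the standard library's `multiprocessing.Pool` directly rather than `load_dataset`.

```python
import multiprocessing as mp


def count_words(doc: str) -> int:
    """Toy per-document work; a hypothetical stand-in for real preprocessing."""
    return len(doc.split())


def main() -> None:
    docs = ["hello world", "one two three", ""]  # made-up sample documents
    # Under the "spawn" start method (the macOS default since Python 3.8),
    # each worker process re-imports this file. Creating the pool at import
    # time would therefore try to start workers again inside every worker.
    with mp.Pool(processes=2) as pool:
        counts = pool.map(count_words, docs)
    print(counts)


# The guard keeps main() from running when a worker re-imports the module.
if __name__ == "__main__":
    main()
```

Keeping the worker function at module level and all process-spawning code behind the guard is the pattern the multiprocessing documentation recommends for start methods that re-import the main module.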

openwebtext dataset

after running prepare.py (preprocess) we get:

this came from 8,013,769 documents in total.

references: