mirror of https://github.com/osmarks/nanogpt-experiments.git
synced 2024-11-14 05:44:51 +00:00
nanogpt: fix multiprocessing in load_dataset on OS X
The issue seems to be that `_fixup_main_from_path` in Python's multiprocessing module is unable to find the entry point, so this change adds an `if __name__ == '__main__':` guard.
parent bb7e96754a
commit 542ac51d1f
@@ -16,6 +16,7 @@ num_proc = 8
 # it is better than 1 usually though
 num_proc_load_dataset = num_proc
 
+if __name__ == '__main__':
 # takes 54GB in huggingface .cache dir, about 8M documents (8,013,769)
 dataset = load_dataset("openwebtext", num_proc=num_proc_load_dataset)
 
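The guard added in this diff follows a general pattern for multiprocessing code. The sketch below illustrates that pattern and is not the repository's script. It assumes the "spawn" start method, which is the macOS default since Python 3.8. The names `count_words` and `main`, the sample documents, and the worker count are hypothetical.

```python
import multiprocessing as mp

num_proc = 2  # hypothetical small worker count for this demo


def count_words(doc):
    """Worker function; defined at module top level so child processes can import it."""
    return len(doc.split())


def main():
    docs = ["hello world", "a b c", ""]  # hypothetical stand-in for a dataset
    # "spawn" starts a fresh interpreter per worker, and each worker
    # re-imports this module before running any task.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=num_proc) as pool:
        return pool.map(count_words, docs)


# The guard keeps the pool creation out of the module's import path, so a
# spawned child that re-imports this file does not try to start workers itself.
if __name__ == "__main__":
    print(main())
```

Run as a script, this prints `[2, 3, 0]`. In a real script, everything that should run only in the parent process, such as the `load_dataset(...)` call with `num_proc` set, would be indented under the guard or moved into a function called from it.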