mirror of https://github.com/osmarks/nanogpt-experiments.git synced 2026-05-12 00:12:06 +00:00

213 Commits

Author SHA1 Message Date
osmarks 46c68ad1a2 nondeterminism test 2024-07-23 11:48:41 +01:00
osmarks 41ec7b3313 add image 2024-07-23 10:56:52 +01:00
osmarks a225b756e8 fix things 2024-07-23 10:56:47 +01:00
osmarks a64b2f2cfe tests 2024-07-08 19:36:49 +01:00
osmarks f3118fe74d Add note about fix 2024-06-24 20:13:10 +01:00
osmarks 0194d45e43 experiments 2024-06-24 19:10:15 +00:00
Andrej 9755682b98 Merge pull request #463 from goswamig/test1
Fixing eval path in README
2024-06-03 09:51:52 -07:00
Andrej 3ab86ce851 Merge branch 'master' into test1 2024-06-03 09:50:45 -07:00
Andrej 7c7e627108 Merge pull request #487 from jellehak/patch-1
Update README.md
2024-06-03 09:47:17 -07:00
Jelle Hak 5cb16fe66a Update README.md
Proper markdown code blocks
2024-05-28 08:38:35 +02:00
Gautam Kumar 1ab9ec1b83 Fixing eval path in README 2024-03-23 23:51:02 -07:00
Andrej 325be85d9b Merge pull request #420 from vinjn/fix-371-enc-is-not-defined
Move enc to global namespace to fix #371
2024-02-27 09:27:01 -08:00
Andrej a022d02ee2 Merge pull request #429 from adambala/fixes
Open "shakespeare" data in UTF-8 in "prepare.py"
2024-02-27 09:05:44 -08:00
Andrej f68ac2200d Merge pull request #428 from kjslag/memmap-memory-leak
fix np.memmap memory leak
2024-02-27 08:41:24 -08:00
Adam Isakov f35dc82437 fix: prepare.py - added input file opening in UTF-8 encoding 2024-01-26 01:34:44 +03:00
Adam Isakov b7e194a756 feature: .gitignore - added venv folders 2024-01-26 01:10:10 +03:00
Kevin Slagle 5156fef93c fix np.memmap memory leak
np.memmap doesn't free memory that it accesses, so the entire dataset ends up in RAM once it has been fully accessed. The simplest workaround on Stack Overflow is to just recreate the memmap for each batch. The extra overhead is negligible.

https://stackoverflow.com/questions/45132940/numpy-memmap-memory-usage-want-to-iterate-once/61472122#61472122
2024-01-25 11:41:01 -08:00
vinjn dccf362c2b Move enc to global namespace 2024-01-12 12:53:20 -08:00
Andrej eba36e8464 Merge pull request #309 from ho2103/master
Fix AssertionError on macOS - need to check CUDA availability for bf16
2023-06-22 08:24:17 -07:00
o 1eaceae193 Fix AssertionError on macOS - need to check CUDA availability for bf16 2023-06-19 18:05:09 -04:00
Andrej 4eb7a96b07 Merge pull request #305 from okuvshynov/fix_osx_dataload
nanogpt: fix multiprocessing in load_dataset on os x
2023-06-17 20:26:35 -07:00
Oleksandr Kuvshynov 542ac51d1f nanogpt: fix multiprocessing in load_dataset on os x
The issue seems to be that _fixup_main_from_path in Python's multiprocessing
module is unable to find the entry point; the fix is to add
```
if __name__ == '__main__':
```
2023-06-17 20:35:38 -04:00
Andrej 41d7014f7d Merge pull request #301 from okuvshynov/master
[easy] allow multithreading in load_dataset
2023-06-16 18:30:03 -07:00
Oleksandr Kuvshynov bb7e96754a nanogpt: allow multithreading in load dataset 2023-06-16 20:00:17 -04:00
Andrej Karpathy 7339b904ef use WORLD_SIZE instead of device_count; this supports both the case where the number of GPUs we train on is smaller than the number of GPUs available and multinode training. May be a bugfix 2023-06-14 23:33:07 +00:00
Andrej f08abb45bd Merge pull request #274 from apivovarov/gelu
Use nn.GELU - 1.27x faster training
2023-06-14 16:25:15 -07:00
Andrej 18ee6b62b6 Merge pull request #275 from apivovarov/rm_unsqueeze
Remove pos unsqueeze(0)
2023-06-14 15:38:45 -07:00
Andrej ed7887c888 Merge pull request #270 from LaihoE/master
fix np.sum overflows on windows
2023-06-14 15:36:26 -07:00
Andrej 8020bb582b Merge pull request #276 from apivovarov/gitign
Add more files to .gitignore
2023-06-14 15:30:39 -07:00
Andrej 0f06d9b889 Merge pull request #277 from apivovarov/is_bf16_supported
Use bf16 only if supported
2023-06-14 15:29:50 -07:00
Andrej cf4835ed6f Merge pull request #286 from ctjlewis/master
docs: simplify dependencies installation
2023-06-14 15:21:04 -07:00
Lewis eeac8732b9 docs: simplify dependencies installation
Adds a `pip install ...` command that will install all necessary dependencies, while retaining original dependency notes. Added quick description of `tqdm` as well.
2023-05-31 23:04:08 -05:00
Alexander Pivovarov eb33b8bf1c Use bf16 only if supported 2023-05-17 03:26:48 +00:00
Alexander Pivovarov b120c421bf Add more files to .gitignore 2023-05-17 02:50:22 +00:00
Alexander Pivovarov 39ae397a93 Remove pos unsqueeze(0) 2023-05-17 02:30:18 +00:00
Alexander Pivovarov 594068e7ae Use nn.GELU 2023-05-17 00:53:35 +00:00
Laiho 6649b299eb np.sum overflows on windows 2023-05-09 16:36:59 +03:00
Andrej Karpathy 7fe4a099ad simplify configure_optimizers by a lot 2023-05-06 14:40:28 +00:00
Andrej 196160b849 Merge pull request #247 from gnobre/macbook-run-instructions
Macbook run instructions
2023-04-17 20:16:31 -07:00
Andrej 21f9bff7e4 Merge pull request #225 from otaviogood/grad_accum
Fix for gradient_accumulation_steps training slow
2023-04-17 20:11:25 -07:00
Andrej a6a708c7f1 Merge branch 'master' into grad_accum 2023-04-17 20:11:00 -07:00
Guilherme Nobre e30c8fda23 Merge branch 'karpathy:master' into macbook-run-instructions 2023-04-15 09:50:58 +01:00
Guilherme 4732c43af3 add macbook specific instructions to generate samples 2023-04-15 09:49:38 +01:00
Andrej d9f4735f5e Merge pull request #10 from LaihoE/master
batch file write
2023-04-13 00:39:41 -07:00
Andrej b288f4cfb2 Merge pull request #146 from lutzroeder/master
Add .gitignore
2023-04-12 22:48:37 -07:00
Andrej 079df20748 Merge pull request #74 from venusatuluri/fix_decode
Small fix to decode fn in shakespeare_char/prepare.py
2023-04-12 22:45:01 -07:00
Andrej 01e48ec1ab Merge pull request #240 from YassineYousfi/master
don't dropout in eval mode
2023-04-12 22:43:59 -07:00
Andrej 7840a66859 Merge pull request #54 from MicroPanda123/luv
Give tqdm some love :)
2023-04-12 22:25:18 -07:00
Andrej 8abe215fba Merge pull request #128 from abrahamsangha/fix-typo
fix typo
2023-04-12 22:24:41 -07:00
Andrej ad62003d7a Merge pull request #142 from kovkev/patch-1
Fix the position of a comma
2023-04-12 22:24:06 -07:00