diff --git a/README.md b/README.md
index f3cd413..378feb7 100644
--- a/README.md
+++ b/README.md
@@ -67,7 +67,7 @@ I briefly tried finetuning gpt2 a bit more on our OWT and didn't notice dramatic
 
 For model benchmarking `bench.py` might be useful. It's identical what happens in the meat of the training loop of `train.py`, but omits much of the other complexities.
 
-# efficiency notes
+## efficiency notes
 
 Code by default now uses [PyTorch 2.0](https://pytorch.org/get-started/pytorch-2.0/). At the time of writing (Dec 29, 2022) this makes `torch.compile()` available in the nightly release. The improvement from the one line of code is noticeable, e.g. cutting down iteration time from ~250ms / iter to 135ms / iter. Nice work PyTorch team!
 