diff --git a/README.md b/README.md
index f3cd413..378feb7 100644
--- a/README.md
+++ b/README.md
@@ -67,7 +67,7 @@ I briefly tried finetuning gpt2 a bit more on our OWT and didn't notice dramatic
 
 For model benchmarking `bench.py` might be useful. It's identical to what happens in the meat of the training loop of `train.py`, but omits much of the other complexities.
 
-# efficiency notes
+## efficiency notes
 
 Code by default now uses [PyTorch 2.0](https://pytorch.org/get-started/pytorch-2.0/). At the time of writing (Dec 29, 2022) this makes `torch.compile()` available in the nightly release. The improvement from the one line of code is noticeable, e.g. cutting down iteration time from ~250ms / iter to 135ms / iter. Nice work PyTorch team!