From 88c415a14abdf7799ab456d0d2280ec89e8e4b14 Mon Sep 17 00:00:00 2001
From: osmarks
Date: Fri, 3 Oct 2025 11:01:19 +0000
Subject: [PATCH] =?UTF-8?q?Edit=20=E2=80=98the=5Fseventy=5Fmaxims=5Fof=5Fm?=
 =?UTF-8?q?aximally=5Feffective=5Fmachine=5Flearning=5Fengineers=E2=80=99?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 ...axims_of_maximally_effective_machine_learning_engineers.myco | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/the_seventy_maxims_of_maximally_effective_machine_learning_engineers.myco b/the_seventy_maxims_of_maximally_effective_machine_learning_engineers.myco
index 807a5b3..0ae4774 100644
--- a/the_seventy_maxims_of_maximally_effective_machine_learning_engineers.myco
+++ b/the_seventy_maxims_of_maximally_effective_machine_learning_engineers.myco
@@ -14,7 +14,7 @@ Based on [[https://schlockmercenary.fandom.com/wiki/The_Seventy_Maxims_of_Maxima
 *. A gentle learning rate turneth away divergence. Once the loss stabilizes, crank it up.
 *. Do unto others’ hyperparameters as you would have them do unto yours.
 *. “Innovative architecture” means never asking “did we implement a proper baseline?”
-*. Only you can prevent vanishing gradients.
+*. Only you can prevent reward hacking.
 *. Your model is in the leaderboards: be sure it has dropout.
 *. The longer training goes without overfitting, the bigger the validation-set disaster.
 *. If the optimizer is leading from the front, watch for exploding gradients in the rear.