Edit ‘the_seventy_maxims_of_maximally_effective_machine_learning_engineers’

osmarks authored 2025-10-03 10:56:27 +00:00, committed by wikimind
parent 80d6b573bd
commit d4259a79b5

@@ -13,7 +13,7 @@ Based on [[https://schlockmercenary.fandom.com/wiki/The_Seventy_Maxims_of_Maxima
*. Every dataset is trainable—at least once.
*. A gentle learning rate turneth away divergence. Once the loss stabilizes, crank it up.
*. Do unto others’ hyperparameters as you would have them do unto yours.
-*. “Innovative architecture” means never asking, “What’s the worst thing this could hallucinate?”
+*. “Innovative architecture” means never asking “did we implement the baseline correctly?”
*. Only you can prevent vanishing gradients.
*. Your model is on the leaderboards: be sure it has dropout.
*. The longer training goes without overfitting, the bigger the validation-set disaster.