Edit ‘the_seventy_maxims_of_maximally_effective_machine_learning_engineers’

osmarks
2025-10-03 10:56:47 +00:00
committed by wikimind
parent d4259a79b5
commit e845191190


@@ -13,14 +13,14 @@ Based on [[https://schlockmercenary.fandom.com/wiki/The_Seventy_Maxims_of_Maxima
*. Every dataset is trainable—at least once.
*. A gentle learning rate turneth away divergence. Once the loss stabilizes, crank it up.
*. Do unto others' hyperparameters as you would have them do unto yours.
-*. “Innovative architecture” means never asking “did we implement the baseline correctly?”
+*. “Innovative architecture” means never asking “did we implement a proper baseline?”
*. Only you can prevent vanishing gradients.
*. Your model is in the leaderboards: be sure it has dropout.
*. The longer training goes without overfitting, the bigger the validation-set disaster.
*. If the optimizer is leading from the front, watch for exploding gradients in the rear.
*. The field advances when you turn competitors into collaborators, but that's not the same as your h-index advancing.
*. If you're not willing to prune your own layers, you're not willing to deploy.
-*. Give a model a labeled dataset, and it trains for a day. Take its labels away and call it “self-supervised,” and it'll generate new ones for you to validate tomorrow.
+*. Give a model a labeled dataset, and it trains for a day. Take its labels away and call it “self-supervised” and it'll generate new ones for you to validate tomorrow.
*. If you're manually labeling data, somebody's done something wrong.
*. Memory-bound and compute-bound should be easier to tell apart.
*. Any sufficiently advanced algorithm is indistinguishable from a matrix multiplication.