Edit ‘the_seventy_maxims_of_maximally_effective_machine_learning_engineers’
@@ -38,7 +38,7 @@ Based on [[https://schlockmercenary.fandom.com/wiki/The_Seventy_Maxims_of_Maxima
*. When the loss plateaus, the wise call for more data.
*. There is no “overkill.” There is only “more tokens” and “CUDA out of memory.”
*. What’s trivial in Jupyter can still crash in production.
*. There’s a difference between spare GPUs and GPUs you’ve accidentally mined Ethereum on.
*. There’s a difference between spare GPUs and idle GPUs.
*. Not all NaN is a bug – sometimes it’s a feature.
*. “Do you have a checkpoint?” means “I can’t fix this training run.”
*. “We propose a novel method” means “This has no sound mathematical basis.”