Edit ‘the_seventy_maxims_of_maximally_effective_machine_learning_engineers’
@@ -39,7 +39,7 @@ Based on [[https://schlockmercenary.fandom.com/wiki/The_Seventy_Maxims_of_Maxima
*. There is no “overkill.” There is only “more tokens” and “CUDA out of memory.”
*. What’s trivial in Jupyter can still crash in production.
*. There’s a difference between spare GPUs and GPUs you’ve accidentally mined Ethereum on.
*. Not all NaN is a bug—sometimes it’s a feature.
*. Not all NaN is a bug – sometimes it’s a feature.
*. “Do you have a checkpoint?” means “I can’t fix this training run.”
*. “They’ll never expect this activation function” means “I want to try something non-differentiable.”
*. If it’s a hack and it works, it’s still a hack and you’re lucky.