Create ‘reward_hack’
parent e4ca9fe465
commit ae222cf097
3
reward_hack.myco
Normal file
@@ -0,0 +1,3 @@
Reward hacking, also known as specification gaming and closely related to [[Goodhart's law]], occurs when an [[agentic]] system is given [[incentives]] designed to induce it to act in one way but discovers and applies an easier, undesired way to satisfy them.
A large list of examples can be found [[https://docs.google.com/spreadsheets/d/e/2PACX-1vRPiprOaC3HsCf5Tuum8bRfzYUiKLRqJmbOoC-32JorNdfyTiRRsR7Ea5eWtvsWzuxo8bjOxCG84dAg/pubhtml|here]].