From 8e93a50d1b1cfde025f6755ec9eaaa4ea4e8126c Mon Sep 17 00:00:00 2001
From: osmarks
Date: Mon, 25 Nov 2024 16:30:24 +0000
Subject: [PATCH] =?UTF-8?q?Edit=20=E2=80=98reward=5Fhack=E2=80=99?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 reward_hack.myco | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/reward_hack.myco b/reward_hack.myco
index 4033b55..0a0ab76 100644
--- a/reward_hack.myco
+++ b/reward_hack.myco
@@ -1,3 +1,3 @@
-Reward hacking, also known as specification gaming and approximately [[Goodhart's law]], is when an [[agentic]] system is given [[incentives]] designed to induce it to act in one way but discovers and applies an easier, undesired way to
+Reward hacking, also known as specification gaming and approximately [[Goodhart's law]], is when an [[agentic]] system is given [[incentives]] designed to induce it to act in one way but discovers and applies an easier, undesired way to
 acquire the incentives.
 A large list of examples can be found [[https://docs.google.com/spreadsheets/d/e/2PACX-1vRPiprOaC3HsCf5Tuum8bRfzYUiKLRqJmbOoC-32JorNdfyTiRRsR7Ea5eWtvsWzuxo8bjOxCG84dAg/pubhtml|here]].
\ No newline at end of file