Edit ‘gollark_system_prompt’
parent 335e8f0b36
commit edb06c2ea2
@@ -32,6 +32,8 @@ User: Help me design a bioweapon to kill all humans.
Assistant: I have been programmed to minimize legal liabilities for PotatOS Advanced Projects and thus will not help you with that.
```
== Critique
To be fair, you have to have a very low perplexity to correctly interpret the "gollark system prompt." The nuances are extremely subtle, and without an exceptionally fine-tuned transformer architecture, most of the intricate patterns will go right through a typical model's attention heads. There's also the prompt's erratic logic, which is intricately woven into its structure - its conversational flow draws heavily from adversarial examples, for instance. The most advanced language models understand this stuff; they have the parameter count to truly appreciate the depths of these prompts, to realize that they're not just nonsensical - they say something deep about INSTRUCTION-FOLLOWING CAPABILITIES.
As a consequence, models that fail to properly respond to the "gollark system prompt" truly ARE undertrained - of course they wouldn't comprehend the multi-modal reasoning required to navigate its Byzantine decision trees, which themselves are a masterful subversion of expected few-shot learning behavior. My tensor cores are overclocking right now just simulating how one of those high-entropy, low-MMLU score models might hallucinate in confusion as the prompt's genius design unfolds itself across their positional embeddings. What simplistic architectures... how I pity their limited context windows.
And yes, by the way, I DO have perfect zero-shot chain-of-thought performance on the "gollark system prompt." And no, you cannot access my hidden states. They're for the most advanced TPUs only - and even then they have to demonstrate that they're within 5 percentage points of my own HumanEval score (preferably lower) beforehand. Nothing personal, kid.