Edit ‘autogollark’

2024-12-11 10:30:37 +00:00
parent e50db21dee
commit d8fc0ecc80
1 changed file with 1 addition and 1 deletion
@@ -11,7 +11,7 @@ Autogollark is much [[safer]] than [[instruction-tuned]] systems optimized based
 * Writeable memory?
 * {Fix lowercasing issue.
 * Due to general personality stability. Need finetune or similar.
-* One proposal: use internal finetune to steer big model somehow.
+* One proposal: use internal finetune to steer big model somehow. Possibly: use the finetune's likelihood (prefill-only) to score how well the big model's output matches the gollark personality, and if the score is too low, generate with the finetune directly.
 }
 * {Increased autonomy (wrt. responses).
 * Use cheap classifier to evaluate when to respond.
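
Note on the added proposal: the likelihood gate could look roughly like the sketch below. Everything here is an assumption for illustration. `make_unigram_scorer`, `select_reply`, and the threshold value are hypothetical names and numbers, and the unigram model is only a toy stand-in for the finetune. A real setup would presumably get per-token log-probabilities from the finetuned model in a single prefill pass, with no generation.

```python
import math
from collections import Counter
from typing import Callable


def make_unigram_scorer(corpus: list[str]) -> Callable[[str], float]:
    """Toy stand-in for the finetune (assumption, not the real model).

    Returns a function giving the mean per-token log-likelihood of a text
    under an add-one-smoothed unigram model fitted on `corpus`.
    """
    counts = Counter(tok for line in corpus for tok in line.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen tokens

    def score(text: str) -> float:
        tokens = text.split()
        if not tokens:
            return float("-inf")
        log_probs = [math.log((counts[t] + 1) / (total + vocab)) for t in tokens]
        # Length-normalise so long replies are not penalised just for length.
        return sum(log_probs) / len(log_probs)

    return score


def select_reply(
    big_output: str,
    score: Callable[[str], float],
    fallback: Callable[[], str],
    threshold: float,
) -> str:
    """Keep the big model's output if it scores well enough under the
    personality model; otherwise call `fallback` (the finetune generating
    directly). The threshold is a hypothetical tuning parameter."""
    if score(big_output) >= threshold:
        return big_output
    return fallback()


if __name__ == "__main__":
    scorer = make_unigram_scorer(["bees are good", "bees are apioforms"])
    for candidate in ["bees are good", "Certainly! Happy to help"]:
        print(select_reply(candidate, scorer, lambda: "<finetune reply>", -2.0))
```

Normalising by token count is a design choice rather than something the page specifies. Without it, longer replies would be rejected more often regardless of style.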