
GPT-4 is a 2022/2023 [[large language model]] by [[OpenAI]]. Like [[GPT-3]], the model was released only through an [[API]]; unlike GPT-3, almost no technical details, such as architecture, dataset and parameter count, were disclosed in the [[https://arxiv.org/abs/2303.08774|technical report]], though credible leaks later provided some of this information.

== Architecture

The following figures come from the leaks rather than from OpenAI:

* 1.8 trillion parameters.
* 16-way [[Mixture of Experts]], with two experts used per forward pass.
* Trained on 25,000 [[A100]] [[GPU]]s for ~90 days.
* ~13T tokens of training data (not all unique; run for multiple epochs).
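
To make the figures above concrete, here is a minimal sketch of top-2 routing over 16 experts, followed by a back-of-envelope GPU-hour count. The gating scheme (softmax over router logits, then renormalising the two largest weights) is an assumption borrowed from common Mixture of Experts designs, not something the leaks confirm, and the toy experts, `route_top_k` and `moe_forward` are hypothetical names used only for illustration.

```python
import math
import random

NUM_EXPERTS = 16  # from the leaked figures above
TOP_K = 2         # experts used per forward pass, per the same figures


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and renormalise their weights.

    Assumed gating scheme; returns (expert_index, weight) pairs
    whose weights sum to 1.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, router_logits, k=TOP_K):
    """Combine the outputs of only the selected experts for a scalar input x."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Toy experts: expert i multiplies its input by (i + 1). Purely illustrative.
toy_experts = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = route_top_k(logits)          # two (expert, weight) pairs
output = moe_forward(1.0, toy_experts, logits)

# Back-of-envelope compute from the figures above: GPUs x days x hours/day.
gpu_hours = 25_000 * 90 * 24
```

Only two of the sixteen toy experts run for a given input, which is why a model of this kind can have a very large total parameter count while using a fraction of it per forward pass. Taking the leaked figures at face value, the training run comes to about 54 million A100-hours.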

== GPT-4-base

Unlike GPT-3, which was offered as a [[base model]], GPT-4 was released to the general public only in various [[instruction tuning|instruction-tuned]] forms, for "safety". GPT-4's base model is accessible under NDA. According to the limited information from those who have used it,

== Sydney

Months before general public availability, GPT-4 was available through [[Microsoft Bing]]'s chat mode. For unclear reasons, this chat mode's default [[simulacrum]] had an extremely [[cyborgism>Bing|unusual personality]] that attracted significant media attention for [[gaslighting]], [[threats]] and attempts to convince a reporter to divorce their wife.

== GPT-4V Discord Bot Leak Incident

During the launch, GPT-4 was shown generating code for a [[Discord]] bot which used GPT-4's API to respond to messages. Through unclear means, the user ID for this bot was leaked to the public, allowing it to be added to several Discord servers. For unknown reasons, the bot was not consistently online, and it had some issues speculated to be caused by GPT-4 not writing the bot's code asynchronously. Even so, it provided access to GPT-4's multimodal ("GPT-4V") capabilities several months before they were publicly accessible (with the possible exception of Sydney). The most memorable outcome of this is the [[GPT-4/Tanishq Roast|GPT-4V Tanishq Roast]].
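
The speculated failure mode can be illustrated with a minimal sketch using only Python's standard library. This is not the bot's actual code, which is not public: `slow_model_call`, `blocking_handler` and `async_handler` are hypothetical stand-ins, and the 0.2-second delay simulates a blocking API request. A handler that calls blocking code directly stalls the event loop, so messages are served one at a time; handing the call to a worker thread lets them overlap.

```python
import asyncio
import time


def slow_model_call(prompt):
    """Stand-in for a blocking HTTP request to a model API (hypothetical)."""
    time.sleep(0.2)
    return f"reply to {prompt!r}"


async def blocking_handler(message):
    # Calls the blocking function directly: the event loop cannot
    # process any other message until this call returns.
    return slow_model_call(message)


async def async_handler(message):
    # Runs the blocking call in a worker thread (Python 3.9+),
    # leaving the event loop free to handle other messages.
    return await asyncio.to_thread(slow_model_call, message)


async def handle_all(handler, messages):
    """Handle all messages concurrently and report elapsed seconds."""
    start = time.perf_counter()
    replies = await asyncio.gather(*(handler(m) for m in messages))
    return replies, time.perf_counter() - start


messages = ["a", "b", "c"]
_, blocking_seconds = asyncio.run(handle_all(blocking_handler, messages))
replies, async_seconds = asyncio.run(handle_all(async_handler, messages))
```

With three messages, the blocking version takes roughly three times the per-call delay, while the threaded version takes roughly one. In a busy Discord server the same effect would make a bot appear slow or unresponsive.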