Create ‘softmax_bottleneck’
parent ae1d055899
commit 0bda0e4b0e
6
softmax_bottleneck.myco
Normal file
@@ -0,0 +1,6 @@
Almost all modern [[large language model|LLMs]] map a relatively low-dimensional hidden state to a high-dimensional probability distribution over [[tokenizer|tokens]] using a single [[matrix]] multiplication followed by a [[softmax]]. Because the logits are a linear function of the hidden state, the [[rank]] of this transformation is at most the hidden size, which is typically far smaller than the vocabulary size, so not every valid probability distribution over tokens can be represented. This has a number of [[consequences]].
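A toy illustration of the limitation, as a minimal sketch and not taken from the references: assume a hypothetical model with hidden size 1 and a vocabulary of 3 tokens, with made-up output weights `W`. The logit of token 1 is always 0, while either token 0 or token 2 has a positive logit whenever the hidden state `h` is nonzero, so no hidden state can make token 1 the most probable token. Any distribution that puts the most mass on token 1 is therefore unreachable.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy model: hidden size 1, vocabulary of 3 tokens.
# W holds one output weight per token (values chosen for illustration only).
W = [-1.0, 0.0, 1.0]

def token_distribution(h):
    """Map a scalar hidden state h to a distribution over the 3 tokens."""
    return softmax([w * h for w in W])

# Scan hidden states in [-10, 10] and record which token is most probable.
# Ties (only at h = 0) resolve to the lowest token index.
winners = set()
for i in range(-100, 101):
    h = i / 10.0
    probs = token_distribution(h)
    winners.add(max(range(len(W)), key=lambda t: probs[t]))

print(sorted(winners))  # prints [0, 2]: token 1 is never the argmax
```

The same effect appears at realistic scale, where the hidden size is in the thousands and the vocabulary in the tens or hundreds of thousands, though which distributions are unreachable then depends on the learned output matrix.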
References:
* https://x.com/kalomaze/status/1776341569542431150
* https://aclanthology.org/2022.acl-long.554/