Edit ‘good_ideas’: Synced 1767357336434

sync
2026-01-02 12:58:06 +00:00
committed by wikimind
parent 939cc827cd
commit a757152b02

@@ -101,6 +101,7 @@ Semantic search for:
Graph-based vector indices run beam search over a graph constructed so that beam search by dot product (or PQ-approximated dot product) "mostly" returns the closest (highest-dot-product) result. Could one of several RL/etc pathfinding schemes replace the beam search (like https://arxiv.org/abs/2502.18663)? We only have a compute budget of about 100μs per node read, though, so this is a bit tricky.
* We can also pack the graphs more tightly (at some retrieval-time cost from multi-page accesses and extra compute): sort and delta-compress node indices, offload metadata again, maybe apply general-purpose compressors or QAT/IAA codecs, and quantize the graph vectors a bit.
* The 100μs/node budget assumes SSD random-read latency. Under that assumption these algorithms still aren't useful even if we can throw more time at the query, because the extra time is better spent on more reads. But with really clever pathfinding, HDDs could also work. Maybe? They are about two orders of magnitude slower, though.
* https://arxiv.org/pdf/2501.10479
}
* SAEs/dictionary learning algorithms for residual quantization.
* Minecraft/Factorio crossover modpack via Clusterio-like thing.
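
A rough sketch of the "sorting and delta compression of node indices" idea from the graph-packing bullet above, assuming each node's adjacency list is a set of unsigned integer ids whose order does not matter. The function names (`encode_varint`, `pack_neighbors`, `unpack_neighbors`) and the LEB128-style varint layout are illustrative assumptions, not taken from any existing index format.

```python
def encode_varint(n: int) -> bytes:
    """LEB128-style unsigned varint: 7 payload bits per byte, high bit = 'more follows'."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def pack_neighbors(ids) -> bytes:
    """Sort and deduplicate neighbor ids, then store the first id and each gap as a varint."""
    out = bytearray()
    prev = 0
    for i in sorted(set(ids)):
        out += encode_varint(i - prev)  # gaps are small when ids cluster
        prev = i
    return bytes(out)


def unpack_neighbors(buf: bytes) -> list:
    """Inverse of pack_neighbors: decode varints and prefix-sum the gaps."""
    ids, cur, acc, shift = [], 0, 0, 0
    for b in buf:
        acc |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7          # continuation byte, keep accumulating
        else:
            cur += acc          # gap complete, add to running id
            ids.append(cur)
            acc, shift = 0, 0
    return ids


neighbors = [1000003, 1000001, 1000250, 999990]
packed = pack_neighbors(neighbors)
print(len(packed), "bytes packed vs", 4 * len(neighbors), "bytes as uint32")
```

The gain depends on how close together the neighbor ids are, so it would presumably pair with some node reordering that places graph neighbors near each other. Decoding is a sequential prefix sum, which is part of the extra retrieval-time compute the bullet mentions.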