Edit ‘osmarks.net_web_search_plan_(secret)’
parent d6f63d557d
commit f18044a7a6
@@ -18,7 +18,7 @@ The job of a search engine is to retrieve useful information for users. This is
= Indexing
* Google/Bing/etc are plausibly primarily keyword-based. This is not ideal for most (?) queries, which care about something being "the same sort of thing". Neural reranking since at least 2019.
- * Exa uses (mostly?) "Neural PageRank" i.e. contrastive link text/link target modelling. Rationale: link text roughly describes the kind of thing the link points to.
+ * Exa uses (mostly?) "Neural PageRank" i.e. contrastive link text/link target modelling. Rationale: link text (or text around link, or whole link-source document? probably mostly former) roughly describes the kind of thing the link points to.
* {Could also do contrastive link co-occurrence modelling. Rationale: things referenced in the same document are likely semantically related.
* This generalizes nicely to images too (Neural PageRank is like CLIP w/ captions). Could probably natively train in same embedding space.
* We benefit from contrastive advances like SigLIP, [[https://arxiv.org/abs/2005.10242]].
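
Rough sketch of the two contrastive objectives mentioned above, applied to (link text, link target) pairs. This is an illustration under assumptions, not Exa's actual method: the function names (`info_nce`, `sigmoid_pairwise`), the toy 3-dimensional embeddings, and the temperature/bias values are all hypothetical, and the embeddings would really come from trained encoders. `info_nce` is the CLIP-style symmetric softmax loss; `sigmoid_pairwise` is a SigLIP-style loss that treats every pair in the batch as an independent binary classification.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def similarity_matrix(anchor_embs, target_embs):
    """Cosine similarity between every anchor (link text) and every target (page)."""
    a = [normalize(v) for v in anchor_embs]
    t = [normalize(v) for v in target_embs]
    return [[dot(ai, tj) for tj in t] for ai in a]


def info_nce(anchor_embs, target_embs, temperature=0.07):
    """CLIP-style symmetric InfoNCE: pair i should match target i, not the others.

    The temperature value is an assumed placeholder, not a tuned setting.
    """
    sims = similarity_matrix(anchor_embs, target_embs)
    logits = [[s / temperature for s in row] for row in sims]

    def cross_entropy_rows(m):
        # Mean cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(m):
            mx = max(row)
            log_z = mx + math.log(sum(math.exp(x - mx) for x in row))
            total += log_z - row[i]
        return total / len(m)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_rows(logits) + cross_entropy_rows(transposed))


def sigmoid_pairwise(anchor_embs, target_embs, scale=10.0, bias=-10.0):
    """SigLIP-style loss: each (i, j) pair is its own binary classification.

    scale and bias are assumed fixed here; in training they would be learned.
    """
    sims = similarity_matrix(anchor_embs, target_embs)
    n = len(sims)
    total = 0.0
    for i in range(n):
        for j in range(n):
            label = 1.0 if i == j else -1.0
            # -log(sigmoid(z)) == softplus(-z)
            total += softplus(-label * (scale * sims[i][j] + bias))
    return total / n


# Toy data (hypothetical): row i of each list is one (link text, link target) pair.
link_text_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
target_embs = [[0.9, 0.0, 0.1], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

print("InfoNCE, correct pairing:", info_nce(link_text_embs, target_embs))
print("InfoNCE, wrong pairing:  ", info_nce(link_text_embs, target_embs[::-1]))
print("Sigmoid, correct pairing:", sigmoid_pairwise(link_text_embs, target_embs))
print("Sigmoid, wrong pairing:  ", sigmoid_pairwise(link_text_embs, target_embs[::-1]))
```

Both losses should come out lower when link text is paired with its own target than when the targets are shuffled. The same functions would apply unchanged to link co-occurrence pairs or image/caption pairs, since they only see two batches of embeddings.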