From f18044a7a68c8fac73fd66ec7df83664b93be031 Mon Sep 17 00:00:00 2001
From: osmarks
Date: Fri, 7 Mar 2025 14:43:36 +0000
Subject: [PATCH] =?UTF-8?q?Edit=20=E2=80=98osmarks.net=5Fweb=5Fsearch=5Fpl?=
 =?UTF-8?q?an=5F(secret)=E2=80=99?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 osmarks.net_web_search_plan_(secret).myco | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/osmarks.net_web_search_plan_(secret).myco b/osmarks.net_web_search_plan_(secret).myco
index 3ff7ec9..9d77870 100644
--- a/osmarks.net_web_search_plan_(secret).myco
+++ b/osmarks.net_web_search_plan_(secret).myco
@@ -18,7 +18,7 @@ The job of a search engine is to retrieve useful information for users. This is
 = Indexing
 * Google/Bing/etc are plausibly primarily keyword-based. This is not ideal for most (?) queries, which care about something being "the same sort of thing". Neural reranking since at least 2019.
-* Exa uses (mostly?) "Neural PageRank" i.e. contrastive link text/link target modelling. Rationale: link text roughly describes the kind of thing the link points to.
+* Exa uses (mostly?) "Neural PageRank" i.e. contrastive link text/link target modelling. Rationale: link text (or text around link, or whole link-source document? probably mostly former) roughly describes the kind of thing the link points to.
 * {Could also do contrastive link co-occurrence modelling. Rationale: things referenced in the same document are likely semantically related.
 * This generalizes nicely to images too (Neural PageRank is like CLIP w/ captions). Could probably natively train in same embedding space.
 * We benefit from contrastive advances like SigLIP, [[https://arxiv.org/abs/2005.10242]].
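
The edited line above concerns contrastive link text/link target modelling. As a rough illustration of what such an objective could look like, the following is a minimal PyTorch sketch: a link-text encoder and a target-document encoder trained with a SigLIP-style pairwise sigmoid loss over in-batch pairs. The encoders, dimensions and hyperparameters are placeholder assumptions for illustration, not Exa's or osmarks.net's actual implementation.

# Sketch of contrastive link-text / link-target training with a SigLIP-style
# sigmoid loss (https://arxiv.org/abs/2303.15343). Everything here is an
# illustrative assumption, not the real system.

import torch
import torch.nn as nn
import torch.nn.functional as F


class TextEncoder(nn.Module):
    """Toy bag-of-embeddings encoder standing in for a real transformer."""

    def __init__(self, vocab_size: int = 30_000, dim: int = 256):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim, mode="mean")
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> (batch, dim), L2-normalised
        return F.normalize(self.proj(self.embed(token_ids)), dim=-1)


def siglip_loss(anchor: torch.Tensor, target: torch.Tensor,
                temperature: float = 10.0, bias: float = -10.0) -> torch.Tensor:
    """Pairwise sigmoid loss: in-batch diagonal pairs are positives,
    all off-diagonal pairs are negatives."""
    logits = anchor @ target.T * temperature + bias                 # (B, B)
    labels = 2 * torch.eye(len(anchor), device=anchor.device) - 1   # +1 diag, -1 off-diag
    return -F.logsigmoid(labels * logits).mean()


if __name__ == "__main__":
    link_text_encoder = TextEncoder()    # encodes the anchor/link text
    target_doc_encoder = TextEncoder()   # encodes the linked-to document
    batch = 8
    link_text_tokens = torch.randint(0, 30_000, (batch, 16))
    target_doc_tokens = torch.randint(0, 30_000, (batch, 128))
    loss = siglip_loss(link_text_encoder(link_text_tokens),
                       target_doc_encoder(target_doc_tokens))
    loss.backward()
    print(f"loss = {loss.item():.4f}")

The sigmoid form scores each (link text, document) pair independently rather than normalising over the whole batch, which is the property that lets SigLIP-style training scale to large batches; a softmax InfoNCE loss over the same embeddings would be the more conventional alternative.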