From 5b5ef271aa07741edb777075bc03d1dc931ffe82 Mon Sep 17 00:00:00 2001
From: osmarks <osmarks@protonmail.com>
Date: Mon, 9 Oct 2023 12:35:26 +0100
Subject: [PATCH] ""documentation""

---
 README.md                 |  2 +-
 clipfront2/src/App.svelte | 10 ++++++++++
 requirements.txt          |  1 -
 3 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index 7be758c..d122927 100644
--- a/README.md
+++ b/README.md
@@ -45,4 +45,4 @@ This is untested. It might work.
 
 ## Scaling
 
-Meme Search Engine uses an in-memory FAISS index to hold its embedding vectors, because I was lazy and it works fine (~100MB total RAM used for my 8000 memes). If you want to store significantly more than that you will have to switch to a more efficient/compact index (see [here](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index)). As vector indices are held exclusively in memory, you will need to either persist them to disk or use ones which are fast to build/remove from/add to (presumably PCA/PQ indices). At some point if you increase total traffic the CLIP model may also become a bottleneck, as I also have no batching strategy. Indexing appears to actually be CPU-bound (specifically, it's limited by single-threaded image decoding and serialization) - improving that would require a lot of redesigns so I haven't. You may also want to scale down displayed memes to cut bandwidth needs.
\ No newline at end of file
+Meme Search Engine uses an in-memory FAISS index to hold its embedding vectors, because I was lazy and it works fine (~100MB total RAM used for my 8000 memes). If you want to store significantly more than that, you will have to switch to a more efficient/compact index (see [here](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index)). As vector indices are held exclusively in memory, you will need to either persist them to disk or use ones which are fast to build, add to and remove from (presumably PCA/PQ indices). If total traffic increases enough, the CLIP model may also become a bottleneck, as I have no batching strategy. Indexing is currently GPU-bound, since the new model appears somewhat slower at high batch sizes and I improved the image loading pipeline. You may also want to scale down displayed memes to cut bandwidth needs.
\ No newline at end of file
diff --git a/clipfront2/src/App.svelte b/clipfront2/src/App.svelte
index d6bcdcf..6782d50 100644
--- a/clipfront2/src/App.svelte
+++ b/clipfront2/src/App.svelte
@@ -67,6 +67,16 @@
 </style>
 
 <h1>Meme Search Engine</h1>
+<details>
+    <summary>Usage tips</summary>
+    <ul>
+        <li>This uses CLIP-like image/text embedding models. In general, search by thinking of the caption that random people on the internet might give your desired image.</li>
+        <li>The model can read some of the text in images, but not all of it.</li>
+        <li>In certain circumstances, it may be useful to append "meme" to your query.</li>
+        <li>Capitalization is ignored.</li>
+        <li>Only English is supported. Other languages might work to a limited extent.</li>
+    </ul>
+</details>
 <div class="controls">
     <ul>
         {#each queryTerms as term}
diff --git a/requirements.txt b/requirements.txt
index f0e2e73..c1e20a4 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,4 +1,3 @@
-open_clip_torch==2.20.0
 Pillow==10.0.1
 prometheus-client==0.17.1
 u-msgpack-python==2.8.0