mirror of https://github.com/osmarks/meme-search-engine.git (synced 2024-11-10 22:09:54 +00:00)
""documentation""
parent 68a14d7da9
commit 5b5ef271aa
@@ -45,4 +45,4 @@ This is untested. It might work.
 
 ## Scaling
 
-Meme Search Engine uses an in-memory FAISS index to hold its embedding vectors, because I was lazy and it works fine (~100MB total RAM used for my 8000 memes). If you want to store significantly more than that you will have to switch to a more efficient/compact index (see [here](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index)). As vector indices are held exclusively in memory, you will need to either persist them to disk or use ones which are fast to build/remove from/add to (presumably PCA/PQ indices). At some point if you increase total traffic the CLIP model may also become a bottleneck, as I also have no batching strategy. Indexing appears to actually be CPU-bound (specifically, it's limited by single-threaded image decoding and serialization) - improving that would require a lot of redesigns so I haven't. You may also want to scale down displayed memes to cut bandwidth needs.
+Meme Search Engine uses an in-memory FAISS index to hold its embedding vectors, because I was lazy and it works fine (~100MB total RAM used for my 8000 memes). If you want to store significantly more than that you will have to switch to a more efficient/compact index (see [here](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index)). As vector indices are held exclusively in memory, you will need to either persist them to disk or use ones which are fast to build/remove from/add to (presumably PCA/PQ indices). At some point if you increase total traffic the CLIP model may also become a bottleneck, as I also have no batching strategy. Indexing is currently GPU-bound since the new model appears somewhat slower at high batch sizes and I improved the image loading pipeline. You may also want to scale down displayed memes to cut bandwidth needs.
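As a rough illustration of the index-size tradeoff described in the Scaling paragraph above, the sketch below compares the storage needed for uncompressed float32 vectors with that of product-quantized (PQ) codes. The embedding dimension (512) and the number of PQ sub-quantizers (64) are assumptions picked for illustration, since the README states neither, and both functions are hypothetical helpers rather than part of Meme Search Engine or FAISS.

```python
# Back-of-envelope vector storage estimates; not code from the project.
# ASSUMPTIONS: dim=512 and m=64 are illustrative values, not from the README.


def flat_index_bytes(n_vectors: int, dim: int) -> int:
    """Uncompressed float32 vectors: 4 bytes per component."""
    return n_vectors * dim * 4


def pq_index_bytes(n_vectors: int, dim: int, m: int, nbits: int = 8) -> int:
    """Product quantization: m codes of nbits each per vector, plus codebooks."""
    if dim % m != 0:
        raise ValueError("dim must be divisible by m")
    # Each vector is stored as m codes of nbits bits, rounded up to whole bytes.
    code_bytes = n_vectors * ((m * nbits + 7) // 8)
    # m codebooks, each holding 2**nbits centroids of dim/m float32 components.
    codebook_bytes = m * (2 ** nbits) * (dim // m) * 4
    return code_bytes + codebook_bytes


if __name__ == "__main__":
    for n in (8_000, 1_000_000):
        flat = flat_index_bytes(n, 512)
        pq = pq_index_bytes(n, 512, 64)
        print(f"{n:>9} vectors: flat {flat / 1e6:.1f} MB, PQ {pq / 1e6:.1f} MB")
```

These figures cover only the stored vectors. They ignore index bookkeeping and everything else in the process (model weights, metadata), so they are not comparable to the ~100MB total quoted above.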
@@ -67,6 +67,16 @@
 </style>
 
 <h1>Meme Search Engine</h1>
+<details>
+<summary>Usage tips</summary>
+<ul>
+<li>This uses CLIP-like image/text embedding models. In general, search by thinking of what caption your desired image might be given by random people on the internet.</li>
+<li>The model can read text, but not all of it.</li>
+<li>In certain circumstances, it may be useful to postfix your query with "meme".</li>
+<li>Capitalization is ignored.</li>
+<li>Only English is supported. Other languages might work slightly.</li>
+</ul>
+</details>
 <div class="controls">
 <ul>
 {#each queryTerms as term}
@@ -1,4 +1,3 @@
-open_clip_torch==2.20.0
 Pillow==10.0.1
 prometheus-client==0.17.1
 u-msgpack-python==2.8.0