mirror of https://github.com/osmarks/meme-search-engine.git synced 2024-11-14 15:54:48 +00:00
Commit Graph

56 Commits

Author SHA1 Message Date
osmarks
8097ce8d91 improve dump processing and misc performance fixes 2024-11-11 19:43:07 +00:00
osmarks
c277b49dc1 fix resumption, oops 2024-11-07 20:43:26 +00:00
osmarks
b9bb629e6f performance improvements 2024-11-07 16:52:58 +00:00
osmarks
7fa14d45ae improve observability and fix up Reddit dump for full-scale run 2024-11-02 19:38:05 +00:00
1d0ff95955 Sparse autoencoder testing 2024-10-05 17:22:44 +01:00
fc6d0c9409 Fix crawler (rate limit changes) 2024-07-15 18:48:55 +01:00
43ff9215fb File metadata storage.
The backend now knows how big images are, so the frontend can size images correctly before they're loaded.
This should significantly improve the UI on slow connections.
Also fix bug where videos weren't erased from the index properly.
2024-06-26 20:02:12 +01:00
1ab254ff1d Adjust index storage for memory efficiency and fix SQLite interface type confusion 2024-06-25 08:23:30 +01:00
e7adf738f6 Fix typo, in the sense of application-killing bug. 2024-05-31 00:35:16 +01:00
747058e254 misc fixes
- thumbnails/OCR off was broken
- problematic video files caused segfaults (I blame ffmpeg for this)
2024-05-30 19:05:54 +01:00
3257521068 Video search 2024-05-30 15:58:31 +01:00
74d91d52e5 probably I should do better testing 2024-05-29 21:25:50 +01:00
5eae8674ce video parsing basics 2024-05-28 22:28:41 +01:00
129b769a56 hackily patch horrifyingly nondeterministic-but-fast image encoder in 2024-05-27 20:21:44 +01:00
d4e136b6a7 AITemplate builds of the image encoder work, at great personal cost 2024-05-27 19:05:25 +01:00
a8329e43fc more progress on Reddit 2024-05-27 15:22:28 +01:00
f8d68d9d54 WIP Reddit dump loader 2024-05-24 17:47:18 +01:00
978aadda6a Improved UI for sliders 2024-05-22 20:26:23 +01:00
d8c147df52 Predefined embedding modes in search 2024-05-22 20:17:13 +01:00
14387a61a3 refactor configuration 2024-05-22 19:02:34 +01:00
ffc3d648a6 basic monitoring implementation 2024-05-22 18:49:32 +01:00
ce590298a7 concurrent index queries and fix database typo yet again 2024-05-22 18:25:50 +01:00
349fe802f7 meme interpretability 2024-05-22 16:18:45 +01:00
bd426a30ba Port meme acquisition pipeline to new API, database
Also fix a really stupid oversight in crawling code.
2024-05-22 15:43:56 +01:00
30b1b72712 I really should test database queries better 2024-05-22 14:35:29 +01:00
9455438bab frontend fixes 2024-05-21 20:17:23 +01:00
24fbc0dd1f apparently, that quality value is too low 2024-05-21 20:09:28 +01:00
63a9b3d9a6 the consequences of my own actions 2024-05-21 12:39:04 +01:00
e705a9db21 I hate precedence 2024-05-21 12:33:32 +01:00
b7010b41dd oops 2024-05-21 01:58:50 +01:00
7cb42e028f Rewrite entire application (well, backend) in Rust and also Go
I decided I wanted to integrate the experimental OCR thing better, so I rewrote in Go and also integrated the thumbnailer.
However, Go is a bad language and I only used it out of spite.
It turned out to have a very hard-to-fix memory leak due to some unclear interaction between libvips and both sets of bindings I tried, so I had Claude-3 transpile it to Rust then spent a while fixing the several mistakes it made and making tweaks.
The new Rust version works, although I need to actually do something with the OCR data and make the index queryable concurrently.
2024-05-21 00:09:04 +01:00
fa863c2075 "release" unfinished scripts and miscellaneous JSON files 2024-05-18 14:34:30 +01:00
caa8306ff7 oops 2024-05-18 13:21:13 +01:00
6491e02e88 preliminary work on OCR 2024-05-18 00:39:05 +01:00
a3574674d0 "documentation" 2024-04-27 17:33:24 +01:00
2447e134ef There were more memes than anticipated. 2024-04-27 17:24:01 +01:00
80db16d02a full pipeline 2024-04-22 18:44:29 +01:00
7bae095384 accidentally lost some manual labels, oops, etc 2024-04-22 13:54:07 +01:00
cebb4f9d00 better evals 2024-04-22 13:43:06 +01:00
58ce70bb5e meme rater model code (documentation "later") 2024-04-21 23:50:48 +01:00
0b0261f625 preliminary meme rater work 2024-04-20 16:55:11 +01:00
e9a7493343 stop requiring internal aiosqlite patch 2024-01-25 00:01:02 +00:00
e3ffc426b7 Actually delete missing files 2024-01-02 14:12:26 +00:00
4626f53bcb Return to OpenCLIP 2023-11-13 17:31:43 +00:00
74bb1bc343 thumbnailer system 2023-10-27 15:50:21 +01:00
5b5ef271aa ""documentation"" 2023-10-09 12:35:26 +01:00
68a14d7da9 unignore device 2023-10-08 22:54:06 +01:00
20fcc9317f forgot the README 2023-10-08 22:52:57 +01:00
46fca3eb7f faster indexing, SigLIP models 2023-10-08 22:52:17 +01:00
2c9ce67ab2 I really should do better testing 2023-09-30 21:10:27 +01:00