mirror of https://github.com/osmarks/meme-search-engine.git synced 2026-05-25 23:52:06 +00:00

40 Commits

Author SHA1 Message Date
osmarks 00afb01e7a updates & minor fixes 2025-11-07 16:59:25 +00:00
osmarks a1e694ed6d oops 2025-03-26 12:02:42 +00:00
osmarks e57931d47f Multithread query server
While profiling suggests that most operations are cheap and IO-bound rather than CPU-bound, the GEMM for deduplication is pretty slow. As such, use multiple threads for higher throughput.
2025-01-31 13:47:47 +00:00
osmarks 5215822e39 mlock 2025-01-29 14:48:15 +00:00
osmarks 257486678d connect metrics correctly 2025-01-24 15:23:53 +00:00
osmarks ee23b81444 release version 2025-01-24 09:24:28 +00:00
osmarks 3852d0078d integrate rating model correctly 2025-01-23 13:45:59 +00:00
osmarks a5a6e960bb query code 2025-01-18 17:09:00 +00:00
osmarks 63caba2746 integrate rating model 2025-01-18 11:29:03 +00:00
osmarks f4376f62ed RobustVamana algorithm for big index run 2025-01-16 21:10:12 +00:00
osmarks 5ab91aa17e CLI switch for L 2025-01-14 08:31:03 +00:00
osmarks 4dd97631df fix entire index algorithm (very silly bug) 2025-01-12 19:48:53 +00:00
osmarks 0a196694b1 minor tweaks 2025-01-11 12:19:10 +00:00
osmarks 087419f470 remove vestigial r_cap 2025-01-11 07:36:46 +00:00
osmarks 8ce51bcb56 correct DiskANN algorithm (silly bug with greedy search) 2025-01-11 07:35:04 +00:00
osmarks e9ee563381 tweak some parameters 2025-01-03 09:22:39 +00:00
osmarks 265502f141 tweak index build, this had better work, aaa 2025-01-02 21:04:26 +00:00
osmarks f1283137d6 release WIP DiskANN index orchestration code 2025-01-01 14:40:24 +00:00
osmarks 512b776e10 use slightly worse image scaling 2024-11-13 18:31:18 +00:00
osmarks 8097ce8d91 improve dump processing and misc performance fixes 2024-11-11 19:43:07 +00:00
osmarks c277b49dc1 fix resumption, oops 2024-11-07 20:43:26 +00:00
osmarks b9bb629e6f performance improvements 2024-11-07 16:52:58 +00:00
osmarks 7fa14d45ae improve observability and fix up Reddit dump for full-scale run 2024-11-02 19:38:05 +00:00
osmarks 43ff9215fb File metadata storage.
The backend now knows how big images are, so the frontend can size images correctly before they're loaded.
This should significantly improve the UI on slow connections.
Also fix bug where videos weren't erased from the index properly.
2024-06-26 20:02:12 +01:00
osmarks 1ab254ff1d Adjust index storage for memory efficiency and fix SQLite interface type confusion 2024-06-25 08:23:30 +01:00
osmarks 747058e254 misc fixes
- thumbnails/OCR off was broken
- problematic video files caused segfaults (I blame ffmpeg for this)
2024-05-30 19:05:54 +01:00
osmarks 3257521068 Video search 2024-05-30 15:58:31 +01:00
osmarks 74d91d52e5 probably I should do better testing 2024-05-29 21:25:50 +01:00
osmarks 5eae8674ce video parsing basics 2024-05-28 22:28:41 +01:00
osmarks a8329e43fc more progress on Reddit 2024-05-27 15:22:28 +01:00
osmarks f8d68d9d54 WIP Reddit dump loader 2024-05-24 17:47:18 +01:00
osmarks d8c147df52 Predefined embedding modes in search 2024-05-22 20:17:13 +01:00
osmarks 14387a61a3 refactor configuration 2024-05-22 19:02:34 +01:00
osmarks ffc3d648a6 basic monitoring implementation 2024-05-22 18:49:32 +01:00
osmarks ce590298a7 concurrent index queries and fix database typo yet again 2024-05-22 18:25:50 +01:00
osmarks 30b1b72712 I really should test database queries better 2024-05-22 14:35:29 +01:00
osmarks 24fbc0dd1f apparently, that quality value is too low 2024-05-21 20:09:28 +01:00
osmarks 63a9b3d9a6 the consequences of my own actions 2024-05-21 12:39:04 +01:00
osmarks b7010b41dd oops 2024-05-21 01:58:50 +01:00
osmarks 7cb42e028f Rewrite entire application (well, backend) in Rust and also Go
I decided I wanted to integrate the experimental OCR thing better, so I rewrote in Go and also integrated the thumbnailer.
However, Go is a bad language and I only used it out of spite.
It turned out to have a very hard-to-fix memory leak due to some unclear interaction between libvips and both sets of bindings I tried, so I had Claude-3 transpile it to Rust then spent a while fixing the several mistakes it made and making tweaks.
The new Rust version works, although I need to actually do something with the OCR data and make the index queryable concurrently.
2024-05-21 00:09:04 +01:00