Edit ‘osmarks.net_web_search_plan’
@@ -20,6 +20,7 @@ The job of a search engine is to retrieve useful information for users. This is
* {Images, PDFs, etc. contain useful knowledge which hasn't been integrated properly into most things. We need these.
* Common Crawl doesn't even get PDFs because they're complicated to process!
* Obscure papers, product user manuals, shiny reports from organizations.
* https://arxiv.org/abs/2407.01449
}
* So much tacit knowledge is in videos. Oh no. Maybe we can get away with an autotranscriber and frame extraction.