diff --git a/osmarks.net_web_search_plan.myco b/osmarks.net_web_search_plan.myco
index 6ca5221..cc2c452 100644
--- a/osmarks.net_web_search_plan.myco
+++ b/osmarks.net_web_search_plan.myco
@@ -38,6 +38,7 @@ The job of a search engine is to retrieve useful information for users. This is
 * (Modern)ColBERT late interaction w/ pooling.
 * https://gwern.net/tree-embedding
 }
+* Train with native formatting rather than stripping it all. Adjust tokenizer to contain HTML-like thing.
 
 = Filtering
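
One possible reading of the added bullet, as a rough sketch rather than anything the plan specifies: keep a small set of structural tags as HTML-like marker tokens instead of stripping all markup, and register those markers as single tokens in the tokenizer. The tag list, the `preserve_formatting` and `marker_tokens` helpers, and the space-joined output format are all assumptions made for illustration; only Python's standard `html.parser` is used.

```python
from html.parser import HTMLParser

# Hypothetical choice of tags to keep as formatting markers; the plan names none.
KEPT_TAGS = {"h1", "h2", "h3", "p", "li", "code", "pre", "a", "em", "strong"}
# Elements whose contents are dropped entirely.
DROPPED_TAGS = {"script", "style"}


class FormattingPreserver(HTMLParser):
    """Reduce HTML to text plus a small vocabulary of HTML-like markers."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pieces = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_TAGS:
            self.skip_depth += 1
        elif tag in KEPT_TAGS and self.skip_depth == 0:
            self.pieces.append(f"<{tag}>")  # attributes are discarded

    def handle_endtag(self, tag):
        if tag in DROPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in KEPT_TAGS and self.skip_depth == 0:
            self.pieces.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            text = " ".join(data.split())  # collapse runs of whitespace
            if text:
                self.pieces.append(text)


def preserve_formatting(html):
    """Return text with kept tags as markers; other markup is removed."""
    parser = FormattingPreserver()
    parser.feed(html)
    parser.close()
    # Joining with spaces is lossy about original spacing; fine for a sketch.
    return " ".join(parser.pieces)


def marker_tokens():
    """Marker strings that would be added to the tokenizer vocabulary."""
    opening = sorted(f"<{t}>" for t in KEPT_TAGS)
    closing = sorted(f"</{t}>" for t in KEPT_TAGS)
    return opening + closing


if __name__ == "__main__":
    page = '<div class="x"><h1>Title</h1><script>var a=1;</script><p>Hello <em>world</em>!</p></div>'
    print(preserve_formatting(page))
    print(marker_tokens())
```

With a fixed marker vocabulary like this, each formatting marker can cost one token instead of several, which is presumably the point of adjusting the tokenizer; how the markers are actually registered depends on the tokenizer library in use and is left open here.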