Amazing project! Search is not working properly for me

Just found this project and have to say it’s the best approach I have seen to getting these files organized and readable. Unfortunately, the search function doesn’t seem to be working properly for me. I only get results from the “Persons” category, and those are inaccurate. Example: I’m searching for “WTC”. I only get 14 results in “Persons”, and those don’t contain the full three letters, just a W. Other words don’t show any results at all.

Hey Gorn, thanks for reporting this. You were right - the search was only hitting document titles and short summaries, not the actual full text of the documents. That meant most search terms came up empty or only matched person names.

I just shipped a fix that rewrites the search to also query the full OCR text of every document in the database. It now searches across three sources:

  1. Document title + summary (the original behavior)
  2. Full OCR extracted text (25,800+ documents)
  3. Recovered/redacted text layer (39,500+ pages)

Results are combined and deduplicated so you get the best match regardless of where the term appears.
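
For anyone curious how the combine-and-deduplicate step can work in principle, here is a minimal Python sketch. It is not the site’s actual code: the source names, the `Hit` type, the `combined_search` function, and the "count the occurrences" scoring are all assumptions made up for illustration, and the three sources are modeled as plain dictionaries rather than database tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str   # hypothetical document identifier
    source: str   # which of the three sources produced the match
    score: int    # assumed scoring: number of occurrences of the term


def search_source(term, source_name, texts):
    """Case-insensitive substring search over one source."""
    needle = term.lower()
    if not needle:
        return []  # an empty query matches nothing
    hits = []
    for doc_id, text in texts.items():
        count = text.lower().count(needle)
        if count > 0:
            hits.append(Hit(doc_id, source_name, count))
    return hits


def combined_search(term, sources):
    """Search every source, then keep only the best hit per document."""
    best = {}
    for source_name, texts in sources.items():
        for hit in search_source(term, source_name, texts):
            current = best.get(hit.doc_id)
            if current is None or hit.score > current.score:
                best[hit.doc_id] = hit
    # Highest score first; ties broken by document id for stable output.
    return sorted(best.values(), key=lambda h: (-h.score, h.doc_id))


# Toy data standing in for the three sources described above.
sources = {
    "title_summary": {"doc1": "Flight log summary", "doc2": "Bank records"},
    "ocr_text": {"doc1": "wire transfer, then a second wire", "doc3": "wire sent to bank"},
    "recovered_text": {"doc3": "wire wire wire"},
}

results = combined_search("wire", sources)
for hit in results:
    print(hit.doc_id, hit.source, hit.score)
```

In this toy data, `doc3` matches in both the OCR text and the recovered text layer but appears only once in the results, represented by whichever source scored higher.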

Try your WTC search again and let me know if it’s working better now. The difference should be dramatic - terms like “bank”, “wire”, “massage”, etc. that were invisible before now return hundreds of hits.

Similar issue. I tried searching the forum for certain terms, looking for threads that already discuss the document I am looking at. Let’s say I enter the exact EFTA number into the search bar: I get 50+ results with the AI-star icon, but most of them do not actually contain references to that EFTA file. The same happens when I search for specific names of people. This currently makes it impossible to easily locate threads that already discuss those files or persons.