Hoping for confirmation of a few high-level ideas? #921

plbremer · 2025-03-27T21:21:11Z

Hi,

Thanks for putting together this very compelling tooling. I was hoping to confirm a few points and ask some specific questions, to make sure everything works as we expect before we try to productionize :)

1. We can (and should) build a classic document-retrieval index with Tantivy up front when we have between 10,000 and 100,000 documents. This index does not involve a vector store at all.
2. In Figure 1a of the publication, the Tantivy document store is the tool that the Paper Search agent interacts with.
3. Any vectorization happens on the fly in the Gather Evidence agent. Where are these vectors stored? Is it possible to slowly accumulate them somewhere? I recognize that we can save a Docs object; however, every query will probably retrieve a unique set of documents, so it is not clear whether we can meaningfully aggregate previous vectorizations. (Obviously the system works even if we can't accumulate them meaningfully.)
4. The README mentions options for larger-than-memory vector stores. Is this relevant for anything other than opting for a tremendously large k? Can we avoid this through parameter settings?
5. If you have custom citations, or no citations, will the Citation Traversal agent simply not operate? Where does the citation graph come from? If I have internal documents, can I provide my own?
6. It looks like my answers triggered the creation of an index. Is there any documentation on interacting with that SearchIndex?
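
To make the question about accumulating vectors concrete, this is roughly the behavior I have in mind. It is only a minimal sketch under stated assumptions, not paper-qa's API: `EmbeddingCache` and `toy_embed` are hypothetical names, and `toy_embed` stands in for a real embedding model. Each chunk is keyed by a hash of its text, so a chunk retrieved by an earlier query is not embedded again.

```python
import hashlib
from typing import Callable, Dict, List, Sequence

Vector = List[float]


class EmbeddingCache:
    """Hypothetical cache that accumulates embeddings across queries.

    Vectors are keyed by the SHA-256 hash of the chunk text, so overlapping
    document sets from different queries reuse earlier work.
    """

    def __init__(self, embed_fn: Callable[[Sequence[str]], List[Vector]]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, Vector] = {}
        self.misses = 0  # number of chunks actually sent to embed_fn

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, texts: Sequence[str]) -> List[Vector]:
        keys = [self._key(t) for t in texts]
        # Collect chunks not seen before, deduplicated, in first-seen order.
        missing: Dict[str, str] = {}
        for key, text in zip(keys, texts):
            if key not in self._store and key not in missing:
                missing[key] = text
        if missing:
            vectors = self._embed_fn(list(missing.values()))
            for key, vector in zip(missing, vectors):
                self._store[key] = vector
            self.misses += len(missing)
        return [self._store[k] for k in keys]


def toy_embed(texts: Sequence[str]) -> List[Vector]:
    # Assumption: placeholder for a real embedding model call.
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


cache = EmbeddingCache(toy_embed)
first = cache.embed(["chunk about proteins", "chunk about enzymes"])
second = cache.embed(["chunk about enzymes", "chunk about ligands"])
```

In this sketch the second call embeds only the one chunk it has not seen before. Persisting `_store` to disk (or to a vector store) would be the "slow accumulation" part; whether paper-qa exposes a hook for something like this is exactly what I am asking.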

Thanks for your time.

dosubot (bot) added the documentation and question labels on Mar 27, 2025
