Multiple files - limited response

Open stef157 opened this issue 11 months ago • 3 comments

Hello,

I am currently testing LLPhant and I am getting rather odd results. I have a list of 200 products stored in a database.

Each product is converted into a JSON file.

The procedure is as follows: I pass the folder as an argument to FileDataReader. The next step is to create the embeddings and finally I add that to the FileSystemVectorStore.

When I call the getNumberOfDocuments method, I get the number 200.

Finally, I ask "How many products are there?" and I get the answer 4 (is that the k value?).
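A minimal sketch of why an answer of 4 would be expected, assuming the question-answering step retrieves only the top k most similar documents and passes just those to the model. It is written in plain Python rather than with LLPhant's PHP classes; the `top_k` helper, the random toy embeddings, and the `product-N` names are all hypothetical and stand in for real embeddings and files.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(store, query_vec, k=4):
    """Return the k documents most similar to the query (hypothetical helper)."""
    ranked = sorted(
        store,
        key=lambda doc: cosine(doc["embedding"], query_vec),
        reverse=True,
    )
    return ranked[:k]


random.seed(0)

# 200 toy documents with random 8-dimensional "embeddings" (made-up data).
store = [
    {"name": f"product-{i}", "embedding": [random.random() for _ in range(8)]}
    for i in range(200)
]
query_vec = [random.random() for _ in range(8)]

# Only these k documents would be placed in the prompt as context.
context = top_k(store, query_vec, k=4)

print(len(store), len(context))  # 200 4
```

Under this assumption the model never sees the other 196 products, so counting "the products" from its context can only give k.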

Is there something I should consider doing differently?

Thanks

stef157 avatar Jan 30 '25 20:01 stef157

Hello @stef157, maybe you could use tools (https://github.com/LLPhant/LLPhant?tab=readme-ov-file#tools): create a function that calls getNumberOfDocuments, then instruct the LLM to call it when you ask for information about the number of products in your vector store.
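The idea can be sketched roughly as follows. This is a language-agnostic illustration in Python, not LLPhant's actual tool API: the `ProductStore` class, the `count_products` tool name, the `dispatch` helper, and the JSON shape of the tool call are all assumptions made for the example.

```python
import json


class ProductStore:
    """Stand-in for a vector store that knows how many documents it holds."""

    def __init__(self, docs):
        self._docs = list(docs)

    def get_number_of_documents(self):
        return len(self._docs)


def make_tools(store):
    """Register the functions the model is allowed to call (hypothetical registry)."""
    return {
        "count_products": {
            "description": "Return the total number of products in the store.",
            "handler": store.get_number_of_documents,
        }
    }


def dispatch(tools, tool_call_json):
    """Run the tool the model asked for and return its result."""
    call = json.loads(tool_call_json)
    handler = tools[call["name"]]["handler"]
    return handler(**call.get("arguments", {}))


store = ProductStore(f"product-{i}.json" for i in range(200))
tools = make_tools(store)

# What a model might emit when asked "How many products are there?"
# (the exact format depends on the provider; this one is made up).
tool_call = '{"name": "count_products", "arguments": {}}'
result = dispatch(tools, tool_call)

print(result)  # 200
```

The count comes from the store itself, so it does not depend on how many documents the retrieval step returned.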

f-lombardo avatar Jan 31 '25 08:01 f-lombardo

Hi @f-lombardo,

Thanks for your response!

So, absolutely yes for the quantity. But if I ask which products are under 30€, which products are from the previous collection (based on the date), or even which ones are sweaters, it cannot answer.

However, if I ask for information related to the 4 products that come up, then it can answer.

Unless I misunderstood the example presented with the function? IMO a tool is meant for tasks that the AI does not know how to do or cannot do (a calculation, sending an email, performing a database insert based on a result, retrieving information that is not found in the dataset, etc.). Or do I need to create a tool that asks the model to generate an SQL query based on the table structure? Was that the idea?

Thank you, it made me think about different aspects/functions of LLPhant.

stef157 avatar Jan 31 '25 08:01 stef157

In my opinion, a vector store should be used to give the AI additional information about the external world, not information about the vector store itself. Information such as the number of records in a DB is obtained more effectively with SQL. So we could instruct the AI to call external functions that perform SQL queries when we need such data. Or we could let the AI generate the needed SQL and then run it in an external function, although this approach could lead to security, performance, and correctness problems.
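A minimal sketch of the first option, where the AI only picks a function and its arguments while the SQL itself is fixed and parameterized. It uses Python's built-in `sqlite3` for illustration; the `products` table, its columns, the sample rows, and the function names are all made up for the example.

```python
import sqlite3


def count_products(conn):
    """Total number of products; the query text is fixed."""
    return conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]


def products_under(conn, max_price):
    """Names of products cheaper than max_price.

    The value is bound as a parameter, so the model never writes SQL.
    """
    rows = conn.execute(
        "SELECT name FROM products WHERE price < ? ORDER BY name",
        (max_price,),
    )
    return [row[0] for row in rows]


def products_in_category(conn, category):
    """Names of products in the given category, e.g. 'sweaters'."""
    rows = conn.execute(
        "SELECT name FROM products WHERE category = ? ORDER BY name",
        (category,),
    )
    return [row[0] for row in rows]


# In-memory database with a few made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("wool sweater", "sweaters", 45.0),
        ("cotton t-shirt", "t-shirts", 19.9),
        ("beanie", "accessories", 12.5),
    ],
)

print(count_products(conn))                    # 3
print(products_under(conn, 30))                # ['beanie', 'cotton t-shirt']
print(products_in_category(conn, "sweaters"))  # ['wool sweater']
```

Exposing a small set of fixed queries like these avoids the security and correctness risks of running model-generated SQL, at the cost of covering only the questions that were anticipated.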

f-lombardo avatar Feb 08 '25 18:02 f-lombardo