llama_index
Showing raw OpenAI requests / responses
Is there any option to view the raw text request that's being submitted to OpenAI LLMs? I am playing around with custom prompts / Summary Templates and want to confirm that my query is being sent to the LLM correctly, but can't find any way of doing this.
I also looked into this and couldn't find a way. Would be great to see the raw request/responses, indeed.
I'll mark a TODO for this!
One option for this: use log level DEBUG once #391 is merged in.
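In case it helps, here is a minimal sketch of turning that on with just the standard library. The INFO:root: prefix in typical output indicates the messages go through the root logger, so configuring it should be enough; no library-specific setup is assumed:

import logging
import sys

# Route all records at DEBUG level and above to stdout.
# llama_index appears to log via the stdlib `logging` module,
# so configuring the root logger should surface its messages.
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)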
I would be happy to take this up if you're looking for an extra pair of hands!
I think this can be closed with release https://github.com/jerryjliu/gpt_index/releases/tag/v0.4.0
Does the DEBUG logging level actually address the main question of this ticket?
I was experimenting in the interactive Python shell and wanted to see how the requests to the upstream OpenAI API were chained.
Here is what I get without any logging customization:
>>> res = idx.query("<QUERY>")
INFO:root:> [query] Total LLM token usage: 3417 tokens
INFO:root:> [query] Total embedding token usage: 17 tokens
With logging level set to INFO:
>>> res = idx.query("<QUERY>")
INFO:root:> [query] Total LLM token usage: 3384 tokens
> [query] Total LLM token usage: 3384 tokens
INFO:root:> [query] Total embedding token usage: 17 tokens
> [query] Total embedding token usage: 17 tokens
With logging level set to DEBUG:
>>> res = idx.query("<QUERY>")
INFO:root:> [query] Total LLM token usage: 3428 tokens
> [query] Total LLM token usage: 3428 tokens
> [query] Total LLM token usage: 3428 tokens
INFO:root:> [query] Total embedding token usage: 17 tokens
> [query] Total embedding token usage: 17 tokens
> [query] Total embedding token usage: 17 tokens
So it looks like the same lines are simply emitted one additional time at each logging level, rather than more detailed information being exposed as the level gets more verbose.
Am I doing anything wrong?
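My current guess, for anyone hitting the same thing: the extra copies look like duplicate handlers on the root logger rather than extra detail. Each time another StreamHandler is attached without removing the previous one, every record is printed once per handler. A minimal stdlib-only reproduction of that effect (an assumption about the cause here, not confirmed against the library code):

import logging

root = logging.getLogger()
root.setLevel(logging.INFO)

# Two handlers on the same logger: every record is emitted twice.
# A third handler would make it three times, matching the pattern above.
root.addHandler(logging.StreamHandler())
root.addHandler(logging.StreamHandler())

root.info("> [query] Total LLM token usage: 3384 tokens")
# > [query] Total LLM token usage: 3384 tokens
# > [query] Total LLM token usage: 3384 tokens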