Steven Hillion comments

Results: 3 comments of Steven Hillion

Research:Enhance Testing and Automation for LLM Response Accuracy, accuracy monitoring DAG: automated tests to count docs, test queries, others

Thanks David. Let's leave this open until we've decided what to do with (1) auto-populating the test spreadsheet via a DAG, and (2) evaluating with a secondary LLM. We'll discuss...

Research:Enhance Testing and Automation for LLM Response Accuracy, accuracy monitoring DAG: automated tests to count docs, test queries, others

Discussed with @davidgxue — let's consider using another LLM to evaluate the quality of responses, and then add that to the DAG so that we can have it run regularly...

Need to specify tokenization for content

@sunank200 — is this issue still relevant?