feat: Insights Copilot
https://github.com/frappe/insights/assets/25369014/9a5e83ef-c3fc-4f46-93e7-e743ed25184c
Todo:
- [x] use openai function calls
- [x] find relevant tables based on the question
- [x] execute query if asked to
- [x] stream response using frappe.publish_realtime (see the sketch after this list)
- [ ] include only tables that have more recent and frequent data, to reduce the number of tables in the schema
- [ ] Integrate Documentation in chat
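
For context, here is a minimal sketch of how the function-call and streaming pieces could fit together. This is not the actual implementation in `insights/api/chat_bot_ai.py`; the function schema, prompt handling, and realtime event name are assumptions, and it uses the pre-1.0 `openai` Python SDK that was current at the time.

```python
import json

import frappe
import openai  # pre-1.0 SDK

# Assumed function schema: lets the model ask us to run a SQL query.
EXECUTE_QUERY_FN = {
    "name": "execute_query",
    "description": "Run a read-only SQL query against the selected data source",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The SQL query to execute"},
        },
        "required": ["query"],
    },
}


def ask_copilot(question, schema_prompt):
    """Ask the model for SQL; return the query it wants executed, if any."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=[
            {"role": "system", "content": schema_prompt},
            {"role": "user", "content": question},
        ],
        functions=[EXECUTE_QUERY_FN],
        function_call="auto",
    )
    message = response["choices"][0]["message"]
    if message.get("function_call"):
        # The model decided to call execute_query; its arguments arrive as a JSON string.
        return json.loads(message["function_call"]["arguments"])["query"]
    return None


def stream_answer(question, schema_prompt):
    """Stream a plain-text answer to the browser over Frappe's realtime channel."""
    for chunk in openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=[
            {"role": "system", "content": schema_prompt},
            {"role": "user", "content": question},
        ],
        stream=True,
    ):
        token = chunk["choices"][0]["delta"].get("content")
        if token:
            # "insights_copilot_message" is an assumed event name, not the real one.
            frappe.publish_realtime(
                event="insights_copilot_message",
                message={"token": token},
                user=frappe.session.user,
            )
```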
Codecov Report
Patch coverage: 10.73% and project coverage change: -2.42% :warning:
Comparison is base (c2a03df) 57.58% compared to head (b28b033) 55.16%.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           develop     #108      +/-   ##
===========================================
- Coverage    57.58%   55.16%   -2.42%
===========================================
  Files           49       51       +2
  Lines         2775     2922     +147
===========================================
+ Hits          1598     1612      +14
- Misses        1177     1310     +133
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| insights/api/chat_bot_ai.py | 0.00% <0.00%> (ø) | |
| .../insights/doctype/insights_chart/insights_chart.py | 0.00% <0.00%> (ø) | |
| insights/api/__init__.py | 37.55% <16.00%> (-2.45%) | :arrow_down: |
| insights/utils.py | 58.33% <23.07%> (-41.67%) | :arrow_down: |
| .../insights/doctype/insights_query/insights_query.py | 62.58% <33.33%> (-1.42%) | :arrow_down: |
| ...ctype/insights_data_source/insights_data_source.py | 72.72% <100.00%> (+0.27%) | :arrow_up: |
| ...hts/doctype/insights_settings/insights_settings.py | 36.00% <100.00%> (+4.08%) | :arrow_up: |
Nice work! What made you stop working on this?
> include only tables that have more recent and frequent data, to reduce the number of tables in the schema
Maybe with newer models with a bigger context window, like gpt-4-32k, this will be less of an issue.
The accuracy wasn't very good. Most of the time I had to point out what's wrong in the generated query, which isn't feasible for someone who isn't good with SQL.
However, I think it'll be useful for developers. I'll try to merge it soon, then.
Makes sense. Hopefully the accuracy will improve with newer models, bigger context and fine-tuning the prompt.
@barredterra, isn't it about giving the model the local knowledge: the database tables, their structure, and their meaning?
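
For illustration, a minimal sketch of how such "local knowledge" could be packed into the system prompt. The table metadata here is hand-written to match the demo e-commerce dataset used below; a real version would read it from the data source's schema.

```python
# Illustrative sketch: describe tables/columns in a system prompt so the model
# only writes SQL against tables it actually knows about.
def build_schema_prompt(tables):
    """tables: list of dicts like {"name": ..., "columns": [(name, type), ...]}."""
    lines = []
    for table in tables:
        columns = ", ".join(f"{name} ({dtype})" for name, dtype in table["columns"])
        lines.append(f"Table `{table['name']}`: {columns}")
    return (
        "You are a SQL assistant for an analytics tool. "
        "Answer questions by writing SQL that uses only these tables:\n"
        + "\n".join(lines)
    )


# Example with two tables from the demo e-commerce dataset
schema_prompt = build_schema_prompt(
    [
        {"name": "orders", "columns": [("order_id", "varchar"), ("customer_id", "varchar")]},
        {"name": "orderpayments", "columns": [("order_id", "varchar"), ("payment_value", "decimal")]},
    ]
)
```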
FYI:
Using the gpt-4 model, which costs on average $0.02 / 1k tokens, the generation below consumed $0.05, i.e. about ₹4 per query 😅
Question: find top 10 categories by order value
Answer:
```sql
SELECT p.product_category_name, SUM(op.payment_value) AS total_order_value
FROM orderpayments op
JOIN orders o ON op.order_id = o.order_id
JOIN orderitems oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
GROUP BY p.product_category_name
ORDER BY total_order_value DESC
LIMIT 10
```
Using gpt-3.5-turbo, which costs about $0.0013 / 1k tokens on average, the same generation consumed $0.003 (about ₹0.25 per query).
Question: find top 10 categories by order value
Answer:
```sql
SELECT product_category_name, SUM(price) AS order_value
FROM orderitems
JOIN products ON orderitems.product_id = products.product_id
GROUP BY product_category_name
ORDER BY order_value DESC
LIMIT 10
```
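
For a rough sense of where those per-query figures come from, here is the back-of-the-envelope arithmetic. The token counts and the USD→INR rate are assumptions reverse-engineered from the numbers above, not measured values.

```python
# Rough cost check; token counts and the INR rate are assumptions.
USD_TO_INR = 82  # approximate exchange rate at the time


def per_query_cost(tokens, usd_per_1k_tokens):
    usd = tokens / 1000 * usd_per_1k_tokens
    return round(usd, 4), round(usd * USD_TO_INR, 2)


# gpt-4 at ~$0.02 / 1k tokens: ~2,500 tokens per question -> ($0.05, ~₹4.1)
print(per_query_cost(2500, 0.02))
# gpt-3.5-turbo at ~$0.0013 / 1k tokens: ~2,300 tokens -> (~$0.003, ~₹0.25)
print(per_query_cost(2300, 0.0013))
```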