feat: Insights Copilot
https://github.com/frappe/insights/assets/25369014/9a5e83ef-c3fc-4f46-93e7-e743ed25184c
Todo:
- [x] use openai function calls
- [x] find relevant tables based on the question
- [x] execute query if asked to
- [x] stream response using frappe.publish_realtime (see the sketch after this list)
- [ ] include only tables that have more recent and frequent data, to reduce the number of tables in the schema
- [ ] Integrate Documentation in chat
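
For context, here is a minimal sketch of how the function-call and streaming pieces could fit together. This is not the actual implementation in `insights/api/chat_bot_ai.py`; the function schema, prompt handling, and realtime event name are assumptions, and it uses the pre-1.0 `openai` Python SDK that was current at the time.

```python
import json

import frappe
import openai  # pre-1.0 SDK

# Assumed function schema: lets the model ask us to run a SQL query.
EXECUTE_QUERY_FN = {
    "name": "execute_query",
    "description": "Run a read-only SQL query against the selected data source",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The SQL query to execute"},
        },
        "required": ["query"],
    },
}


def ask_copilot(question, schema_prompt):
    """Ask the model for SQL; return the query it wants executed, if any."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=[
            {"role": "system", "content": schema_prompt},
            {"role": "user", "content": question},
        ],
        functions=[EXECUTE_QUERY_FN],
        function_call="auto",
    )
    message = response["choices"][0]["message"]
    if message.get("function_call"):
        # The model decided to call execute_query; its arguments arrive as a JSON string.
        return json.loads(message["function_call"]["arguments"])["query"]
    return None


def stream_answer(question, schema_prompt):
    """Stream a plain-text answer to the browser over Frappe's realtime channel."""
    for chunk in openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=[
            {"role": "system", "content": schema_prompt},
            {"role": "user", "content": question},
        ],
        stream=True,
    ):
        token = chunk["choices"][0]["delta"].get("content")
        if token:
            # "insights_copilot_message" is an assumed event name, not the real one.
            frappe.publish_realtime(
                event="insights_copilot_message",
                message={"token": token},
                user=frappe.session.user,
            )
```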
Codecov Report
Patch coverage: 10.73% and project coverage change: -2.42% :warning:
Comparison is base (c2a03df) 57.58% compared to head (b28b033) 55.16%.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           develop     #108      +/-   ##
===========================================
- Coverage    57.58%   55.16%   -2.42%
===========================================
  Files           49       51       +2
  Lines         2775     2922     +147
===========================================
+ Hits          1598     1612      +14
- Misses        1177     1310     +133
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| insights/api/chat_bot_ai.py | 0.00% <0.00%> (ø) | |
| .../insights/doctype/insights_chart/insights_chart.py | 0.00% <0.00%> (ø) | |
| insights/api/__init__.py | 37.55% <16.00%> (-2.45%) | :arrow_down: |
| insights/utils.py | 58.33% <23.07%> (-41.67%) | :arrow_down: |
| .../insights/doctype/insights_query/insights_query.py | 62.58% <33.33%> (-1.42%) | :arrow_down: |
| ...ctype/insights_data_source/insights_data_source.py | 72.72% <100.00%> (+0.27%) | :arrow_up: |
| ...hts/doctype/insights_settings/insights_settings.py | 36.00% <100.00%> (+4.08%) | :arrow_up: |
Nice work! What made you stop working on this?
> include only tables that have more recent and frequent data, to reduce the number of tables in the schema
Maybe with newer models with a bigger context window, like gpt-4-32k, this will be less of an issue.
The accuracy wasn't very good. Most of the time I had to point out what's wrong in the generated query, which isn't feasible for someone who isn't good with SQL.
However, I think it'll be useful for developers. I'll try to merge it soon, then.
Makes sense. Hopefully the accuracy will improve with newer models, bigger context and fine-tuning the prompt.
@barredterra, isn't it about giving the model the local knowledge: the database tables, their structure, and their meaning?
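
For illustration, a minimal sketch of how such "local knowledge" could be packed into the system prompt. The table metadata here is hand-written to match the demo e-commerce dataset used below; a real version would read it from the data source's schema.

```python
# Illustrative sketch: describe tables/columns in a system prompt so the model
# only writes SQL against tables it actually knows about.
def build_schema_prompt(tables):
    """tables: list of dicts like {"name": ..., "columns": [(name, type), ...]}."""
    lines = []
    for table in tables:
        columns = ", ".join(f"{name} ({dtype})" for name, dtype in table["columns"])
        lines.append(f"Table `{table['name']}`: {columns}")
    return (
        "You are a SQL assistant for an analytics tool. "
        "Answer questions by writing SQL that uses only these tables:\n"
        + "\n".join(lines)
    )


# Example with two tables from the demo e-commerce dataset
schema_prompt = build_schema_prompt(
    [
        {"name": "orders", "columns": [("order_id", "varchar"), ("customer_id", "varchar")]},
        {"name": "orderpayments", "columns": [("order_id", "varchar"), ("payment_value", "decimal")]},
    ]
)
```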
FYI:
Using the gpt-4 model, which costs on average $0.02 / 1k tokens, the generation below consumed $0.05, i.e. about ₹4 per query 😅
Question: find top 10 categories by order value
Answer:
```sql
SELECT p.product_category_name, SUM(op.payment_value) AS total_order_value
FROM orderpayments op
JOIN orders o ON op.order_id = o.order_id
JOIN orderitems oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
GROUP BY p.product_category_name
ORDER BY total_order_value DESC
LIMIT 10
```
Using gpt-3.5-turbo, which costs about $0.0013 / 1k tokens on average, the same generation consumed $0.003 (about ₹0.25 per query).
Question: find top 10 categories by order value
Answer:
```sql
SELECT product_category_name, SUM(price) AS order_value
FROM orderitems
JOIN products ON orderitems.product_id = products.product_id
GROUP BY product_category_name
ORDER BY order_value DESC
LIMIT 10
```
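
For a rough sense of where those per-query figures come from, here is the back-of-the-envelope arithmetic. The token counts and the USD→INR rate are assumptions reverse-engineered from the numbers above, not measured values.

```python
# Rough cost check; token counts and the INR rate are assumptions.
USD_TO_INR = 82  # approximate exchange rate at the time


def per_query_cost(tokens, usd_per_1k_tokens):
    usd = tokens / 1000 * usd_per_1k_tokens
    return round(usd, 4), round(usd * USD_TO_INR, 2)


# gpt-4 at ~$0.02 / 1k tokens: ~2,500 tokens per question -> ($0.05, ~₹4.1)
print(per_query_cost(2500, 0.02))
# gpt-3.5-turbo at ~$0.0013 / 1k tokens: ~2,300 tokens -> (~$0.003, ~₹0.25)
print(per_query_cost(2300, 0.0013))
```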