lightdash icon indicating copy to clipboard operation
lightdash copied to clipboard

[CLOSE AT THE END OF MILESTONE] You should be able to save a chart from SQL Runner

Open TuringLovesDeathMetal opened this issue 2 years ago • 21 comments

Description: What is it?

You should be able to save charts made in SQL Runner

image

Acceptance criteria

  • [ ] You can save SQL runner charts
  • [ ] They appear in search results
  • [ ] They appear in spaces
  • [ ] They can be added to dashboards
  • [ ] They can be delivered on a schedule

Actions/metadata in the SQL Runner chart view:

  • [ ] Export CSV
  • [ ] Export chart
  • [ ] Add to dashboard
  • [ ] Move to space
  • [ ] Schedule delivery

Actions/metadata NOT in the SQL Runner chart view:

  • Explore from here

  • Filters

  • View underlying data (chart or table)

  • Drill into (chart or table)

  • [ ] Users can visualise and edit the SQL-generated chart, but there is some indication that it's non-interactive image

  • [ ] You can also see which charts are non-interactive from a list image

  • [ ] On a dashboard, explore from here is greyed out and on-hover it says this chart is non-interactive because it was generated from the SQL runner image

  • [ ] If you click on a value in the chart, only copy value is shown - the rest of the options are greyed-out with the same message on hover ^

image

Docs

  • [ ] create a new doc in guides/ called using the SQL runner
  • [ ] Explain the purpose of the SQL runner, how to access it
  • [ ] explain how you can save charts from the SQL runner, but also include the limitations fo charts from the SQL runner
  • [ ] Be sure to talk about how if you add SQL runner charts to a dashboard, then dashboard filters cannot be applied to the SQL runner chart

Analytics events

  • Add num_sql_runner_charts to dashboard_created, dashboard_updated events. This tells us how many of the charts on a dashboard come from sql runner charts.

  • add sql_runner_chart to saved_chart_created, saved_chart_updated and saved_chart_deleted events. This is a True/False value

TuringLovesDeathMetal avatar Mar 08 '22 09:03 TuringLovesDeathMetal

Blocked by #1135

TuringLovesDeathMetal avatar Mar 08 '22 09:03 TuringLovesDeathMetal

Is this issue still relevant? There have been no updates for 60 days, please close the issue or keep the conversation going!

stale[bot] avatar May 21 '22 11:05 stale[bot]

Regarding the ‘saving from SQL Runner topic’ -> The SQL knowledge across a good chunk of the organisation is quite high, and they would like to have that option :slightly_smiling_face: A big part of that comes from the fact that we are currently using Metabase, which works that way and this is what we are used to in general. The downside is that the equivalent of ‘Explore’ from Metabase is really weak. In conclusion, if LD were to give us the option to store charts from explore and SQL-Runner, we would keep the SQL as well as the non-SQL users happy :slightly_smiling_face:

https://lightdash-community.slack.com/archives/C03MG2VTR08/p1657891887680169?thread_ts=1657889231.371029&cid=C03MG2VTR08

ZeRego avatar Jul 15 '22 13:07 ZeRego

Is this issue still relevant? There have been no updates for 60 days, please close the issue or keep the conversation going!

stale[bot] avatar Oct 04 '22 19:10 stale[bot]

It's true that with Metabase, the equivalent "explore" functionality is sufficiently clunky that people who know SQL will just use that instead. However, even with Looker and within Lightdash, there will be edge cases where an SQL only chart makes more sense.

Example: We have a gapless table which is 1 row per user per day called user_stats it tracks the product each user is on e.g. "standard', 'premium', 'platinum' on any given day. We want to measure what % of people transition from one product to the other. Compare say 30 days ago to today. The result would be a 3x3 matrix showing:

  • standard --> standard 50%
  • standard --> premium 5%
  • ....

This would be quite difficult to achieve using standard Lightdash joins but relatively simple in SQL

WITH previous_states AS (

SELECT
user_id,
product,
FROM user_stats
WHERE date = '2022-10-01'

),

current_states AS (

SELECT
user_id,
product,
FROM user_stats
WHERE date = '2022-11-01'

)

SELECT
previous_states.product AS old_state,
current_states.product AS new_state,
COUNT(*) AS users
FROM previous_states
LEFT JOIN current_states
USING(user_id)
GROUP BY 1, 2

XiaozhouWang85 avatar Dec 07 '22 15:12 XiaozhouWang85

Also this sets up another very powerful feature that Metabase and Looker both have. In the above example, the "product" field and date values are obvious candidates to be parameterised. Allowing the end user to input values there from the chart / dashboard as filters allows for powerful tools to be built.

Metabase: https://www.metabase.com/glossary/parameter Looker: https://www.cloudskillsboost.google/focuses/21216?locale=ko&parent=catalog

I think the whole meta language for LookML thing that Looker does is overkill but Metabase's set up is simple and powerful.

Using the example from above, I would build a generic transition state explorer for my user_stats table where end user can input two dates to compare and select from a predefined list of states.

XiaozhouWang85 avatar Dec 07 '22 15:12 XiaozhouWang85

The other argument in favour of SQL runners is that yes this is bad practice and effectively tech debt but incurring debt in the short term can be valid.

For example: Finance person knows some SQL but doesn't know (or doesn't want to learn) git, VScode, dbt yml formats needed to properly contribute. Understands best practice is to use explore but the metric they want isn't there. Creates an SQL runner to unblock themselves.

Data analysts review SQL runner created regularly and migrate them back into proper Lightdash dashboards. If SQL runner was created for a good reason (i.e. metric missing) they can add the metric in asynchronously. If it's used for the wrong reasons, can reach out to the relevant person and advise them on best practice. Overall, users are less likely to be blocked and data analysts receive less "urgent requests".

XiaozhouWang85 avatar Dec 07 '22 15:12 XiaozhouWang85

Is this issue still relevant? There have been no updates for 60 days, please close the issue or keep the conversation going!

stale[bot] avatar Feb 05 '23 17:02 stale[bot]

Will this be added to the product at some point?

This feature would move us (100+ users) away from Metabase.

rcronin avatar Feb 25 '23 16:02 rcronin

This feature would also move us (800+ users) away from Looker. The lack of it turns migrating content barely impossible.

timsleeper avatar Mar 14 '23 22:03 timsleeper

Yes! Just to add further product feedback - we would entirely remove our dependence on Snowflake Dashboards (and be able to shift all of our BI to Lightdash) with this feature.

We definitely understand the philosophical desire to want to avoid allowing business logic to be created ad hoc, so some way of flagging charts that are "unstable" and/or "have logic that does not live in DBT" would be phenomenal, as well.

dsl16 avatar Mar 26 '23 17:03 dsl16

Similar to the comment from @dsl16 for any company without a green field install, there will likely be some migration process and allowing + flagging "ad hoc" charts from SQL queries is a huge value add as a intermediate step in that migration. I can imagine this would drive substantial adoption. As an example, we are migrating from periscope/sisense and this would be very very helpful.

akravetz avatar Mar 26 '23 17:03 akravetz

Similar to the comment from @dsl16 for any company without a green field install, there will likely be some migration process and allowing + flagging "ad hoc" charts from SQL queries is a huge value add as a intermediate step in that migration. I can imagine this would drive substantial adoption. As an example, we are migrating from periscope/sisense and this would be very very helpful.

To add to this, I would find it useful to have an approval layer on charts built with SQL runner. i.e. in addition to (or alternatively to) flagging a chart as being "built through SQL runner", one possible implementation could be for SQL runner charts to be displayed as "unvalidated" (or words to that effect) by default, and for there to be a validation page accessible only to admins (similar UI to the current validator for broken charts), where an admin can review the code and, if appropriate, mark the chart as "validated" (even better if you can give feedback to the person who wrote the code).

mhawkins-runa avatar Jul 19 '23 14:07 mhawkins-runa

+1

etadelta222 avatar Jul 24 '23 19:07 etadelta222

https://github.com/lightdash/lightdash/assets/9117144/ce00e37a-6a96-4b3b-a42c-8e0335d5ed55

This video shows a proof of concept of:

  • saving a chart from the SQL runner
  • update columns definitions: dimensions/metrics, value type, label
  • adding the SQL runner chart to a dashboard

Branch: feat/hackathon-save-sql-runner-charts

Insights

Feature limitations

As already mentioned in the ticket description there will be limitations in these charts since we can't manipulate the SQL.

  • Data interactivity ( drill down, underlying data ) won't be available in these charts
  • Dashboard filters won't be applied to these charts
  • Pivoting ( this limitation is already present in the SQL runner )
    • Cases where pivot shows incomplete/incorrect data
      • It is the editor/developer responsibility to write SQL that doesn't return duplicate rows
    • Cases where the column you selected isn't available
      • We generate a fake model ( dimensions / metrics ) based on the columns types. Any number column is marked as a metric. To work around this, you can convert the values to string.
  • only developers have permission to edit these charts

Open questions

  • Is there a way to help developers convert the SQL to dbt sql+yml ?
    • How would it work when there are multiple CTEs?
    • How to break down the SQL and identify what is a metric?

Implementation

I recommend that update columns definitions should be left out of this feature. Its not a blocker since there is a workaround for pivoting and labels can be changed in the chart configuration.

The biggest amount of work is:

  • change the chart types/interfaces
  • update chart code to handle the new chart type
    • disable data interactivity options
    • load different components based on new chart type

ZeRego avatar Sep 19 '23 16:09 ZeRego

Here is another option for how we could implement this: https://miro.com/app/board/uXjVMjWvUlQ=/?moveToWidget=3458764564607634429&cot=14

Overview of what's happening

Once you create a SQL query, you are basically creating a table that you can then explore. Think of this almost as like a "custom explore builder" rather than just a SQL runner.

Why is this different to the option above?

Pros:

  • We don't need to change how any of our other features interact with these charts (since they would have the same metric + dimension definition as existing explores)
  • Data interactivity would work in these charts.
  • Dashboard filters would also work in these charts.
  • By only allowing dimensions in these tables, we remove the issue of "metrics having duplicate dimension groups" (described here) because we aren't assuming anything is a metric
  • We could use this as a way to prototype .sql and .yml files in Lightdash so you could easily save them to dbt

Cons

  • By default, these tables would not have metrics, only dimensions (you'd need to add metrics after)
    • For now, we can just allow for out-of-the-box custom metrics.
    • This could be confusing to some users because you'd need to select "max" or something for a dimension to make it into a metric so it would be treated like a metric in the chart config.
  • It's two steps now instead of one to create a chart from your custom SQL.
  • Right now, we're making it so that you have the "explore builder" and the "chart builder" - but actually, if we want to use this as a way to create reusable custom explores, then we shouldn't have charts connected to them (there should be just a way to create a table with no chart). Maybe we're making this too confusing...but maybe this is also something we can worry about later?

Questions

  • do people want to be able to access all of the other drill down, etc. options? (e.g. if we add variables in here and people are using this to build funnels...would it be useful, for example, to know which user IDs went through each step? With this approach, you'd be able to get that drill-down functionality, but with the previous approach, you wouldn't).
  • is it annoying to split the visualization into an extra step?
  • Is it confusing to go through this option? (it's more complex than the other option we showed).

TuringLovesDeathMetal avatar Sep 21 '23 12:09 TuringLovesDeathMetal

Talked to some users about this and showed the prototypes. We're going to go with Jose's proposal here: https://github.com/lightdash/lightdash/issues/1519#issuecomment-1725949362

The reasons:

  1. The limitations of no drill down, underlying data or dashboards were intuitive to the data builders. These can also be made more clear in the UI when business users are interacting with these charts.
  2. Building custom explores feels like an entirely separate feature. We shouldn't just try to bolt it on to the SQL runner as a solution for "saving charts". It's actually much more complicated and the main purpose wouldn't be to save charts.

TuringLovesDeathMetal avatar Sep 26 '23 15:09 TuringLovesDeathMetal

This issue was mentioned by a user in slack: https://lightdash.slack.com/archives/C065AQZF5CG/p1701967718594729?thread_ts=1701967718.594729&cid=C065AQZF5CG

lightdash-bot avatar Apr 04 '24 07:04 lightdash-bot

This issue was mentioned by a user in slack: https://lightdash.slack.com/archives/C06J7R3HC5C/p1713573213441529?thread_ts=1713573213.441529&cid=C06J7R3HC5C

lightdash-bot avatar Apr 22 '24 07:04 lightdash-bot