
Summary data table display

Open mbostock opened this issue 2 years ago • 9 comments

We should have the equivalent of Observable’s data table display for a nice visual summary of tabular data.

mbostock avatar Oct 18 '23 01:10 mbostock

I'm interested in this one

cinxmo avatar Oct 18 '23 14:10 cinxmo

Do we only want to implement the visual summaries, or do we also want to port the filtering capabilities?

mootari avatar Oct 24 '23 14:10 mootari

I would like the visual summaries to support brushing/filtering, yes.

mbostock avatar Oct 24 '23 19:10 mbostock

I've shown a quick prototype of this using mosaic, which has the benefit of bringing in SQL and using great optimization techniques under the hood, making it possible to process millions of rows without breaking a sweat.

Its input is a SQL TABLE definition such as FROM 'my-file.csv' or FROM 'my-file.parquet'; nothing prevents adding support for a JavaScript array of objects, a DBClient instance with possibly multiple tables, etc.

We then show the column names of that table, which you can click on to create a default SQL query. The query is shown and editable.

The SQL query creates a VIEW in the DuckDB instance.
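To illustrate the two steps above (clicked columns become a default query, and the query defines a view), here is a minimal sketch. It only builds SQL strings; the function names `quoteIdent`, `quoteLiteral`, `defaultQuery`, and `createView` are hypothetical and not taken from the prototype, and the view name is an assumption.

```javascript
// Hypothetical helpers for illustration; none of these names come from the prototype.

// Quote a SQL identifier, doubling any embedded double quotes.
function quoteIdent(name) {
  return `"${String(name).replace(/"/g, '""')}"`;
}

// Quote a SQL string literal, doubling any embedded single quotes.
function quoteLiteral(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Build a default query from the clicked column names.
// `source` is a file name such as "my-file.csv"; no columns means SELECT *.
function defaultQuery(source, columns = []) {
  const select = columns.length ? columns.map(quoteIdent).join(", ") : "*";
  return `SELECT ${select} FROM ${quoteLiteral(source)}`;
}

// Wrap a query in the statement that (re)defines the view.
function createView(viewName, query) {
  return `CREATE OR REPLACE VIEW ${quoteIdent(viewName)} AS ${query}`;
}
```

Because the query is plain text, the same string can be shown in an editable field and re-run through `createView` whenever the user changes it.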

The rows from that view are then shown as a series of (cross-filter) synchronized histograms, atop a table showing ~20 rows with infinite scroll (using Inputs.table). In the future, the table could allow operations such as click-to-filter, or color-by-value.
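The cross-filter synchronization can be sketched in plain JavaScript: each column's histogram is computed from the rows that pass every brush except that column's own. This is only an in-memory illustration of the idea, not how mosaic or DuckDB implement it; `histogram`, `crossFilter`, and the shapes of `extents` and `brushes` are assumptions.

```javascript
// Count rows per equal-width bin for one numeric field.
// Values outside [min, max] and non-numeric values are skipped.
function histogram(rows, field, {min, max, bins = 10}) {
  const counts = new Array(bins).fill(0);
  const width = (max - min) / bins || 1; // avoid a zero width when min === max
  for (const row of rows) {
    const v = row[field];
    if (typeof v !== "number" || Number.isNaN(v) || v < min || v > max) continue;
    // Clamp so that v === max falls in the last bin.
    counts[Math.min(bins - 1, Math.floor((v - min) / width))]++;
  }
  return counts;
}

// extents: {field: [min, max]} for every summarized column.
// brushes: {field: [lo, hi]} for every column that currently has a selection.
// Each histogram ignores its own brush, so the brushed column keeps its context.
function crossFilter(rows, extents, brushes, bins = 10) {
  const result = {};
  for (const field of Object.keys(extents)) {
    const filtered = rows.filter((row) =>
      Object.entries(brushes).every(
        ([f, [lo, hi]]) => f === field || (row[f] >= lo && row[f] <= hi)
      )
    );
    const [min, max] = extents[field];
    result[field] = histogram(filtered, field, {min, max, bins});
  }
  return result;
}
```

Recomputing every histogram on each brush event is fine for small arrays; the appeal of the SQL-backed approach is that the same filtering is pushed down to the database for large tables.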

In terms of UX, the workflow works well for the use cases I had in mind, blending data table and SQL concepts. What's interesting is that even if you don't type your own SQL, you can copy the query you obtain by clicking, and paste it in your source code, in effect "ejecting". We could also have a button that ejects/copies the derived query, including the transient interactive ordering and filtering.
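As a rough sketch of what such an eject button could copy, the function below wraps the current query and appends the transient filters and ordering. The name `ejectQuery`, the shape of the state object, and the `base` alias are all hypothetical; only numeric range filters are handled here.

```javascript
// Quote a SQL identifier, doubling any embedded double quotes.
function quoteIdent(name) {
  return `"${String(name).replace(/"/g, '""')}"`;
}

// Compose the derived query from the view's query and the interactive state.
// filters: [{column, min, max}] numeric ranges (e.g. from histogram brushes).
// orderBy: {column, descending} or null (e.g. from clicking a table header).
function ejectQuery(viewQuery, {filters = [], orderBy = null} = {}) {
  const where = filters.map(
    ({column, min, max}) =>
      `${quoteIdent(column)} BETWEEN ${Number(min)} AND ${Number(max)}`
  );
  let sql = `SELECT * FROM (${viewQuery}) AS base`;
  if (where.length) sql += ` WHERE ${where.join(" AND ")}`;
  if (orderBy) {
    sql += ` ORDER BY ${quoteIdent(orderBy.column)}`;
    if (orderBy.descending) sql += " DESC";
  }
  return sql;
}
```

The resulting string could then be placed on the clipboard and pasted into source code, matching the "ejecting" workflow described above.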

The UI will need a lot of design, and we'll want to implement each component in the most lightweight way possible. But if you think this is the right way to address this, please hit the 👍 button; if not, press 👎.

Of course this means that it's quite a bit slower to boot than a JavaScript component, since the browser first needs to download and instantiate DuckDB (WASM). There's also room for a lightweight table component (I think we should just continue developing Inputs.table).

It's a bit ugly, but you have to start somewhere:

[Image: summary-table]

Note that at this point all of this is defined in user-land, in the notebook—there's no impact on the cli src/, and it might even be built as a completely separate component that a user could choose to add or not, with alternatives like Inputs.table or navio; I also have a prototype that uses navio—it's just 10 lines of javascript.

Fil avatar Nov 01 '23 16:11 Fil

+1 to expand Inputs.table

CobusT avatar Dec 14 '23 23:12 CobusT

I'd love that!

maelp avatar Mar 08 '24 17:03 maelp

I've shown a quick prototype of this using mosaic, which has the benefit of bringing SQL and using great optimization techniques under the hood, allowing to process millions of rows without a sweat.

[snip]

It's a bit ugly, but you have to start somewhere: summary-table

Note that at this point all of this is defined in user-land, in the notebook—there's no impact on the cli src/, and it might even be built as a completely separate component that a user could choose to add or not, with alternatives like Inputs.table or navio; I also have a prototype that uses navio—it's just 10 lines of javascript.

I think all of this is nice and interesting @Fil !

quak is the same concept as your mosaic direction, and it seems to work nicely. It's pitched for Jupyter, but I'm certain it's not hard to get in Framework or OHQ notebooks. Here's an example in their webapp:

https://manzt.github.io/quak/?source=https://raw.githubusercontent.com/uwdata/mosaic/main/data/athletes.csv (for some reason not working in my Firefox)

If this could also pivot... 😮

declann avatar Aug 01 '24 10:08 declann