superset icon indicating copy to clipboard operation
superset copied to clipboard

[SIP-111] Proposal for improved database, schema, and table selection UI in SQL Lab sidebar

Open justinpark opened this issue 1 year ago • 13 comments

Proposal for Improved Database and Schema Selection, Table Selection UI, Table Metadata and Schema View, and Search Functionality

Motivation

Currently, when users extend the SQL editor by collapsing the left sidebar, the database and schema selectors become inaccessible.

Fig. 1. SQL editor with the sidebar Fig. 2. SQL Editor with folding the sidebar
Image Image

This not only makes it difficult for users to utilize the expanded editor layout UI, but it also limits the use of the left sidebar area, which is dedicated to the database and schema selector options. (We aim to customize the left sidebar in a manner similar to the Visual Studio Code sidebar, enabling a more interactive layout that includes features like a ChatGPT prompt and more.) Additionally, the dropdown UI for the table selector lacks usability and does not allow for easy navigation to the corresponding datasource without the need for the database and schema selectors. Moreover, the table metadata, including column schema and record preview, is displayed separately from the table schema dropdown, making it challenging for users to manage and explore their table schema view effectively. Furthermore, there is no search functionality available to find matches in the datasource catalog and table metadata and schema, which hampers users' ability to locate specific information quickly.

Proposed Change

Original Plan Alternative Plan
Image Image

The proposed change is to:

  1. Move the database and schema selectors to the main content layout, ensuring their accessibility at all times.

(The investigation is still ongoing for the tree view UI due to concerns about performance.) ~~2. Change the dropdown UI for the table selector to a tree view for improved navigation.~~ ~~3. Merge the table metadata, including column schema and record preview, under a new "My table schema" root in the tree view for better management and exploration.~~ ~~4. Add a search box to allow users to find matches in both the datasource catalog and table metadata and schema for enhanced information retrieval.~~

(In light of @michael-s-molina's suggestion, it seems the advantages of the tree-view outweigh the limitations of the search functionality. Therefore, considering back to the original proposal.)

  1. Change the dropdown UI for the table selector to a tree view for improved navigation.
  2. Merge the table metadata, including column schema and record preview, under a new "My table schema" root in the tree view for better management and exploration. [see figure 2]
  3. Add a search box to allow users to find matches in both the datasource catalog and table metadata and schema for enhanced information retrieval. (Filter only within the fetched recordset only and use distinct colors to mark the assets that are yet to be discovered. This method will make it easier for users to identify which parts are not included in the search list.) [see figure 1]

New or Changed Public Interfaces

There will be changes to the existing visualizations, dashboards, and React components to incorporate the improved database and schema selection, table selection UI, table metadata and schema view, and search functionality.

New Dependencies

There are no new dependencies required for this proposed change.

Migration Plan and Compatibility

I do not for-see any migration required.

Rejected Alternatives

None.

to: @michael-s-molina @john-bodley @kasiazjc cc: @betodealmeida @eschutho @rusackas

justinpark avatar Jan 02 '24 21:01 justinpark

@justinpark it seems like the left-hand panel will likely define (for searching purposes) an exhaustive list of all database and namespaces (schemas) which i) our current API doesn't handle, and ii) could be rather costly to fetch if the corpus of databases, schemas, and table is large.

Would you mind expanding how these perceived interactions (including the table schema) will likely materialize?

john-bodley avatar Jan 02 '24 21:01 john-bodley

Nice!

I have a few questions and comments:

  1. Should we have the DB/schema selectors above both the sidebar and the SQL editor, to reinforce the fact that the selection applies to both panels? Something like (apologies for the poor ASCII art):
 _______________________________________
|                                       |
| [database selector] [schema selector] |
|_______________________________________|
|                   |                   |
| sidebar           | SQL editor        |
|___________________|___________________|

My impression with your proposed design is that it's not clear that the selection in the SQL editor panel applies to the sidebar, since the panels are siblings hierarchically.

  1. What happens when databases have thousands of tables? Are we going to paginate the sidebar?
  2. Currently we have some affordances to show extra table metadata (in Presto, Hive, Trino, BigQuery, and GSheets). How are these going to be displayed in the new design?
  3. We've been discussing adding support for catalogs (projects, in BigQuery terms) to Superset. It would be nice to keep that in mind during this redesign, to make that work easier.

betodealmeida avatar Jan 02 '24 22:01 betodealmeida

@betodealmeida regarding your first point, I originally had the same thought, however it seems like the left-hand panel is not database/schema specific, i.e., it represents the entire corpuses of databases and nested schemas and tables.

john-bodley avatar Jan 02 '24 22:01 john-bodley

@betodealmeida As @john-bodley mentioned, the left-side panel is no longer linked to the database/schema selectors as it now has its own hierarchical structure through the tree format.

However, I've noticed a performance issue when handling large record sets in the tree view layout. (as you both mentioned)

  1. While the tree list view has the capability to manage large lists through virtual rendering, it doesn't currently have a plan to provide an option for pagination, as the backend API does not support this feature yet.

  2. As @john-bodley pointed out, we lack an API to search for a specific table name across the entire database, hence it's slated to be executed by the frontend. Thus, it will only highlight a record from the already fetched recordset, leading to some limitations in discovery.

Consequently, I'm leaning towards maintaining the dropdown UI (as opposed to a treeview UI) for the table schema selector (and designing the layout as per @betodealmeida 's first point) for the initial proposal until the API is fully equipped to handle pagination and search functions.

justinpark avatar Jan 02 '24 23:01 justinpark

Thank you for the SIP @justinpark.

Some thoughts to help with the discussion:

@justinpark it seems like the left-hand panel will likely define (for searching purposes) an exhaustive list of all database and namespaces (schemas) which i) our current API doesn't handle, and ii) could be rather costly to fetch if the corpus of databases, schemas, and table is large.

  1. What happens when databases have thousands of tables? Are we going to paginate the sidebar?

While the tree list view has the capability to manage large lists through virtual rendering, it doesn't currently have a plan to provide an option for pagination, as the backend API does not support this feature yet.

I think the answer here is lazy loading and if necessary pagination. Even if the Ant Design tree does not natively support pagination, we can still implement it by adding a link to load more items of a specific node. We should also keep in mind that we already load all the tables once a particular database and schema are selected, and that's why lazy loading the items will probably be sufficient. There's an example of this type of tree here.

As @john-bodley pointed out, we lack an API to search for a specific table name across the entire database, hence it's slated to be executed by the frontend. Thus, it will only highlight a record from the already fetched recordset, leading to some limitations in discovery.

Consequently, I'm leaning towards maintaining the dropdown UI (as opposed to a treeview UI) for the table schema selector (and designing the layout as per @betodealmeida 's first point) for the initial proposal until the API is fully equipped to handle pagination and search functions.

My vote would be to enhance our current API to support the changes proposed here instead of trying to adapt the interface to our current API limitations. The tree view for displaying database/schema/tables is widely used in most database/sql editors and I think the UI and API changes are what justifies the SIP in the first place.

  1. Currently we have some affordances to show extra table metadata (in Presto, Hive, Trino, BigQuery, and GSheets). How are these going to be displayed in the new design?
  1. We've been discussing https://github.com/apache/superset/issues/22862 (projects, in BigQuery terms) to Superset. It would be nice to keep that in mind during this redesign, to make that work easier.

Great callout from @betodealmeida. Let's think about these concepts now while we're still in the design/prototype phase as it's way more costly to think about them after. If we could enhance the mockups with these concepts it would be great.

michael-s-molina avatar Jan 03 '24 12:01 michael-s-molina

I believe the current setup uses the schema dropdown to determine the content of the tables dropdown. Placing the schema dropdown directly above the SQL Editor might imply a strict binding between the SQL Editor and the selected schema. However, it's important to note that the SQL Editor allows for cross-schema queries, so the association between the dropdown and the editor isn't necessarily exclusive.

geido avatar Jan 03 '24 17:01 geido

@justinpark @john-bodley @rusackas About the searching feature and performance concerns, we can implement the same pattern used by DBeaver and other tools where the search is applicable only to loaded databases. When a user loads a database by expanding its node, we could display a connected icon which would indicate that the database is available for searching.

https://github.com/apache/superset/assets/70410625/78c0380e-3616-4980-ae1e-4d4296ed46fa

michael-s-molina avatar Jan 04 '24 11:01 michael-s-molina

Thanks for all feedback. Here is my answer for all opinions.

john-bodley: ... the left-hand panel will likely define (for searching purposes) an exhaustive list of all database and namespaces (schemas) which i) our current API doesn't handle, and ii) could be rather costly to fetch if the corpus of databases, schemas, and table is large. Would you mind expanding how these perceived interactions (including the table schema) will likely materialize?

michael-s-molina: ... the search is applicable only to loaded databases. When a user loads a database by expanding its node, we could display a connected icon which would indicate that the database is available for searching.

Considering @john-bodley's concerns, the recommendation put forth by @michael-s-molina might be a suitable interim solution until the Superset API is fully equipped for a full-text search of the entire database list. As suggested in Figure 1, we could use varying colors to denote the assets that are available for search. This way, users can easily distinguish which parts are excluded from the search list.

Figure 1
Screenshot 2024-01-23 at 11 30 25 AM

betodealmeida: 2. What happens when databases have thousands of tables? Are we going to paginate the sidebar?

Similar to the current setup, the entire list will be displayed (as the paginated API isn't supported yet). However, it will be virtually rendered (as-is) within the viewport for better efficiency.

betodealmeida: 3. Currently we have some affordances to show extra table metadata (in Presto, Hive, Trino, BigQuery, and GSheets). How are these going to be displayed in the new design?

Indeed, much like the existing UI, the display will be situated beneath the table hierarchy as shown in Figure 2.

Figure 2
treeview_extra_meta

betodealmeida: 4. We've been discussing https://github.com/apache/superset/issues/22862 (projects, in BigQuery terms) to Superset. It would be nice to keep that in mind during this redesign, to make that work easier.

I've contemplated introducing a new hierarchy to support the catalogue. Kindly refer to the newly added "catalogue" section in Figure 2.

geido: I believe the current setup uses the schema dropdown to determine the content of the tables dropdown. Placing the schema dropdown directly above the SQL Editor might imply a strict binding between the SQL Editor and the selected schema. However, it's important to note that the SQL Editor allows for cross-schema queries, so the association between the dropdown and the editor isn't necessarily exclusive.

You're right that our current setup uses the schema dropdown to influence the content of the tables dropdown. However, within the tree hierarchy structure, the schema dropdown might be seen as redundant since it's already incorporated into the tree structure. Yet, removing the schema selector could potentially be a step back, given how it's implicitly used when a table name is provided without a specified schema name. (+ We're planning to make the schema dropdown optional, allowing users to run queries even without making a schema selection.)

cc: @betodealmeida @geido

justinpark avatar Jan 23 '24 23:01 justinpark

@justinpark Thank you for the modified mockups. One last suggestion would be to display the blue buttons (View Key, Copy S, Show CR) as tooltip icons that are displayed when hovering the table name (as we do now) given that those seem like secondary actions. Another option, would be to support right-clicking on a table and offer a menu of options. This one is better if we have many options when interacting with a table.

https://github.com/apache/superset/assets/70410625/aad66450-1c63-45a8-9779-aa6a21510b7f

michael-s-molina avatar Jan 25 '24 13:01 michael-s-molina

would be to display the blue buttons (View Key, Copy S, Show CR) as tooltip icons that are displayed when hovering the table name (as we do now) given that those seem like secondary actions.

The button arrangement mentioned above is one of the prototype ideas I've been considering. I favor the suggestion of using tooltips instead of a right-click popup menu (in terms of a11y POV like keyboard accessibility), and I'm inclined to proceed with this approach.

justinpark avatar Jan 25 '24 18:01 justinpark

@geido @betodealmeida As we're nearing the voting process, I'd like to address any requests or concerns. I'm planning to advance to the voting stage next week. Please share any concerns or questions you may have before we move forward.

justinpark avatar Feb 08 '24 01:02 justinpark

@justinpark I am still not fully convinced of the schema dropdown on top of the editor as I am concerned that it creates the impression that the SQL is strictly tied to the schema selected, even though that isn't the case. @yousoph @kasiazjc any thoughts?

geido avatar Feb 09 '24 17:02 geido

@justinpark I am still not fully convinced of the schema dropdown on top of the editor as I am concerned that it creates the impression that the SQL is strictly tied to the schema selected, even though that isn't the case. @yousoph @kasiazjc any thoughts?

Agreed with @geido here. Is the schema field mandatory or could that be left blank?

It's also not clear to me what the proposed interactions between the left panel and the top db/schema selectors are. Do they impact each other at all or are they operating independently of each other?

What about between the left panel and the SQL editor portion - does the left panel still open previews like the current left panel does?

yousoph avatar Feb 10 '24 01:02 yousoph

@yousoph Sorry for the late response. Here is my answer.

Agreed with geido here. Is the schema field mandatory or could that be left blank?

The schema field is optional, and it can also be left blank. If it appears confusing, I can remove the schema selector to simplify the process.

It's also not clear to me what the proposed interactions between the left panel and the top db/schema selectors are. Do they impact each other at all or are they operating independently of each other?

The DB selector (located above the SQL editor) does not affect the left panel. The left panel has its own hierarchy that allows exploration of databases, schemas, and tables independently. Please take a look at the screenshot below. (Please note that it's a prototype version, but it gives an idea of how the tree will be displayed.)

Screenshot 2024-01-23 at 11 30 25 AM

What about between the left panel and the SQL editor portion - does the left panel still open previews like the current left panel does?

Yes. It still open (metadata) previews under the table name.

Cursor_and__SIP-111__Proposal_for_improved_database__schema__and_table_selection_UI_in_SQL_Lab_sidebar_·_Issue__26395_·_apache_superset

justinpark avatar Mar 23 '24 00:03 justinpark

Hey, loving the designs here! I wanted to bring up the topic of caching and autocomplete and make sure they're somewhat in-scope for this SIP.

Around caching, the sqlalchemy get_table_names dialect method can be really slow at times on some database engine, and creates strains on some systems, and I think we currently have some caching in place around that method/endpoint. Maybe we simply need a force-refresh that's more explicit than say collapsing+re-extending the table name list. As with charts, it's great if we can surface cache metadata (is this served from cache? when was the snapshot taken) and expose a clear way to force-refresh. Both speed and freshness matter here.

About autocomplete, currently the scope for autocomplete in the editor is related to what's in the left panel, meaning by we only auto-complete table names within the active schemas, and column names for the tables added in that accumulator section. Here we probably want to follow the same pattern where autocomplete would be limited to what's been retrieved by the frontend, as opposed to say trying to include everything.


Another topic related to this left panel - and maybe out of scope for this SIP - is the idea of allowing users to search/find tables by names, especially when not knowing the schema it might be in. Currently if you don't know the name of the schema and/or the full name of the table, you're at a loss. While this is probably out of scope for the SIP, it's good to think how we might allow users to search for tables across schemas in the future, maybe with a search box at the top of the left panel, and maybe leveraging some sort of cache (?)

mistercrunch avatar Apr 22 '24 18:04 mistercrunch

@justinpark (or anyone else): Do you think there's more discussion to be had here, or is it time to put this up for a [VOTE]?

rusackas avatar Jul 02 '24 15:07 rusackas

This would be good to move forward!

geido avatar Jul 31 '24 15:07 geido

+1 on moving forward. Few other thoughts while the discussion is still open:

  • about WHEN to fetch the metadata client-side: I think realistically, most of the fetching of metadata should 1. be done at "user clicks expand" time, though upon expanding a database connection we could fetch 2 layers (catalogs + schemas) all at once. Async/partial fetching has clear implication for searching as we'd probably limit to what's in-context client-side. For that, I like the previous suggestion of using text-muted for unloaded items makes it clear enough that we won't search within those objects
  • about caching: we should cache every "expand" call here (when expanding a database, a schema, or a table), but we should also expose a way to force-refresh as often in a SQL IDE people are creating tables, adding columns to table, ... I think it's totally ok to show cached data here - but only if we offer a way to reload the metadata
  • about action buttons: I also would prefer a more subtle, icon-shown-on-hover approach like we haven now instead of large primary buttons like shown in the designs that show incomplete/truncated labels
  • about metadata: I don't think we currently do a good job at fetching/surfacing table-level or column-level comments/description, I'd love to have better support for not only showing it, but maybe even making it easy to add/commit descriptions from SQL Lab - this probably out-of-scope for this SIP

One last note is that we have a "metadata well" currently that's used to show some metadata in some database engine, things like "latest partition" and "table description". I think that belongs somewhere under individual tables and close to the action buttons

mistercrunch avatar Jul 31 '24 20:07 mistercrunch