[Pinot 1.2.0 UI] No support for "database" construct from UI
Problem statement
Pinot 1.2.0 supports database constructs as mentioned here. However, the UI does not allow creating a table using a database or querying a table within a database.
Steps to reproduce
- Launch the Pinot 1.2.0 controller UI.
- On the table screen there is no option to create table within a database.
- Create a table using REST API passing the database http header.
- Now, try to query above table using the UI passing the database as below:
SET database='testDb';
select * from myTable limit 10;
- Above query gives following error:
Database name 'testDb' from table prefix does not match database name 'default' from header
Thanks for reporting this. cc @jayeshchoudhary
Proposal to Contribute a Fix for "Database Support in Pinot UI"
I came across this issue where in Pinot 1.2.0 the UI does not support creating or querying tables apart from the 'default' database. I propose to fix this issue by enhancing the REST API and updating the UI accordingly. (I am a backend developer with little UI experience, but the changes seem manageable.)
Problem Statement
- Pinot 1.2.0 added support for database constructs.
- See:
- #12333
- #12591
- #12695
- But the UI does not allow creating tables/schemas/etc within a database or querying tables within a database.
- The REST API currently returns only resources attached to the default database unless explicitly specified.
Proposed Solution
-
[ ] Enhance REST API to support fetching resources for all databases for all relevant APIs:
-
Option 1: Use asterisk (*) as db name
- Modify the REST API to support fetching all tables/schemas/etc. irrespective of the database name.
- To achieve this we modify the existing endpoint to accept a asterisk (
*) for the database header to return resources from all databases. - Only concern is currently Pinot allows even asterisk to be a DB Name, we may have to reserve this character!
curl -X GET \ http://localhost:9000/tables \ -H 'Accept: application/json' \ -H 'database: *'
-
Option 2: Introduce a new header, say 'fetch_all'
- Introduce a new boolean header to support fetching all tables/schemas/etc
- We should not pass database name to this one, it will be ignored anyways
curl -X GET \ http://localhost:9000/tables \ -H 'Accept: application/json' \ -H 'fetch_all: true'
-
Option 3: Add support for regex in filtering resources
- Introduce new query parameters or headers, e.g. regex=
, to indicate that resources from all databases should be returned and filtered using the provided regex pattern. - We should not pass database name to this one, it will be ignored anyways.
- This might be an overkill but maybe useful in other use cases
curl -X GET \ http://localhost:9000/tables \ -H 'Accept: application/json' \ -H 'regex: .*'
- Introduce new query parameters or headers, e.g. regex=
-
Option 1: Use asterisk (*) as db name
-
[ ] Update the Pinot controller UI for resource creation:
- Allow creating tables within any database by introducing a new, optional, input field 'Database'.
- Allow creating schemas within any database by introducing a new, optional, input field 'Database'.
-
[ ] Update the Pinot controller UI for resource retrieval:
- Modify the UI to fetch tables/schemas/etc. for all databases by setting the database to
*in the UI context, currently we donot have any database set, hence we retrieve only resources associated with 'default' database.
- Modify the UI to fetch tables/schemas/etc. for all databases by setting the database to
-
[ ] Support for non default DB Queries in UI:
-
Option 1: Implement support for queries on table in non default DB in the UI
- We may have to inject a db header here to the queries and maybe just display as
SET database='testDb'; select * from myTable limit 10;in UI. - Ensure that the UI sets the appropriate database context before executing queries.
- We may have to inject a db header here to the queries and maybe just display as
-
Option 2: Fix/Add database query option
- We currently seem to have a bug (or may not yet support) due to which the query does not run if we do not pass a header. It should be propose in #12591. Currently running
select * from testDb.myTable limit 10fails withorg.apache.pinot.spi.exception.DatabaseConflictException: Database name 'test' from table prefix does not match database name 'default' from header - We will fix the bug / add support for this. No UI change will be needed and it will work as expected since we will already be able to get right tablename on clicking the table link. Refer Image 3 below.
- We currently seem to have a bug (or may not yet support) due to which the query does not run if we do not pass a header. It should be propose in #12591. Currently running
-
Option 1: Implement support for queries on table in non default DB in the UI
-
[ ] Additional Tasks:
- Will add more tasks as needed, as I dive into the codebase and have more understanding of the relevant code.
PoC
I just tried doing a PoC of this by changing the behavior of the table API to return all tables when no database is specified. Next, I Created 2 tables with same name in different DBs, here's how the UI renders now.
- Image 1:
- Image 2:
- Image 3:
NOTE: We will have to fall back to Task 1 as we can't change behavior of current API, this was done just for a quick PoC.
Benefits
By implementing these changes, we can enhance the functionality of the Pinot UI and REST API to support database constructs more effectively. This will simplify the process for users working with multiple databases within the same cluster.
I am eager to contribute to this project and look forward to your feedback on this proposal. Thank you for considering my proposal.
Best regards,
Nihal Jain
Hi @ajeydudhe, @Jackie-Jiang, @shounakmk219, @xiangfu0 @npawar
Any thoughts on this proposal?
Hey @NihalJain thanks a lot for taking this up!
Few questions
- What is the usecase for showing tables/schemas across all databases in one view?
- Can we have a place on UI to set the database name context globally which gets passed in headers for all the relevant APIs? We already have an API that lists all the database names.
I am more inclined to a user flow where user is required to set the database context first and then proceed to do any operations, same flow that's followed by other standard SQL CLIs where we are required to set the database/catalog context before doing any operation. This will also ensure that we are not required to prefix the table name with database name while building query on UI. Let me know your thoughts. Thanks!
What is the usecase for showing tables/schemas across all databases in one view?
We want to do this because as an ADMIN i may want a single view of the cluster and all its resources. Just like we can see all servers and brokers, IMO admin should be able to view all tables in the UI.
Can we have a place on UI to set the database name context globally which gets passed in headers for all the relevant APIs? We already have an API that lists all the database names.
I am more inclined to a user flow where user is required to set the database context first and then proceed to do any operations, same flow that's followed by other standard SQL CLIs where we are required to set the database/catalog context before doing any operation.
Sure I agree for task 2 onwards we should have a flow where all user CRUD is tied to a particular DB.