metabase-athena-driver
metabase-athena-driver copied to clipboard
Connection to multiple catalogs not supported
I have made a connection using the driver in metabase but it showing me all the database and tables in default AwsDataCatalog.
I have created other catalogs also in athena using hive metastore but that is not visible in metabase. How can i specify catalog name that i want to connect.
Multiple catalogs weren't a thing when I first wrote the driver, so I think I have to dig into how multiple catalogs/schemas are supported in Metabase and see how I could do the same here. It should be feasible, but this will remain on the backlog for now as you can write native SQL queries to access other catalogs as a workaround.
Hi, how are you? Do you have any predictions, or any tips on how to make this improvement?
I looked into this a little bit more today. As noted in this comment, I previously tried using the Catalog and MetadataRetrievalMethod properties to see if it would work, but didn't have success.
I've learned a bit more since then:
MetadataRetrievalMethodwill return the extra catalogs, but the existing code can't handle them.- In order to support federated queries, I had to add a new
EXTERNALtable type to theget-tablesmethod Catalogsetting could help but I believe it only sets the default catalog for actual queries, not database/table metadata
In order to make this feasible, a rewrite to the entire set of table metadata methods is required starting from driver/describe-database. A few notes on that:
fast-active-tableswill list all the catalogs, but when passing the schema/database names toget-tables, it ignores the catalog name that is returned by the API.fast-active-tablescan take a catalog name, but we currently just pass in "nil"- Even if we did pass in a catalog name, say via a connection parameter,
describe-tablestill needs to be rewritten to handle that.
I was able to manage to get it to "mostly" work (DB sync worked, tables and columns showed up, and I could query the table) by hardcoding in a catalog name in driver/describe-database and driver/describe-table, so that's promising.
That leaves a couple options:
- Add a connection parameter to specify a different catalog then the default
- Low effort, but I need to figure out how to get the catalog name to the
describe-tablemethod
- Low effort, but I need to figure out how to get the catalog name to the
- "Truly" support multiple catalogs
- high effort, full rewrite
That said, I certainly don't have the bandwidth for the second option. I'll look into the first a little bit more to see if it's feasible. It would be nice to be able to at least query different catalogs.
I've looked into adding a connection parameter for a different catalog and have been able to make it work! 🙌
If you're interested in testing it out, leave a 👍 reaction here and I'll make a test release. Otherwise, I'll probably wrap it up into the next big release.
I am actually surprisingly close to making multiple catalogs work as well, but still a bit harder. :)
Hi @dacort. This plugin is pretty awesome. I'll be happy to help you by testing multiple catalogs. Where can I find the testing release?
Hi @pmoralesp - after chatting with the MB team, I don't think multiple catalogs is an option (due to no way to show it in the UI). But I could potentially wrap up the option to specify a non-default catalog and upload a release. Let me know if that's still of interest.
That would be very usefull!!. I tried with Catalog=<non_default_catalog> as "Additional Athena connection string options", but doesn't worked.
Yea, there was a lot of extra work in the core of the driver I had to change to support a different catalog. I'll see if I can clean it up and get something out by next week.
@pmoralesp I've built an image that can be used to test this out. You can run it with Docker with the following command:
docker run -p 3000:3000 ghcr.io/dacort/metabase-athena-driver:sha-4814b40
When you go to setup Athena, there should be a new "Catalog" option. Just enter the name of your additional catalog (or as it's shown in the Athena console, data source).
In addition, I've moved the access key and secret credentials behind a toggle box as in a production environments, folks should be using Instance profile credentials.
Feel free to give it a shot and post here if you have any issues.
It works like a charm! First try and I be able to retrive ~60 tables from DynamoDB through Athena. Thanks @dacort
Specific catalogs can now be selected as part of the v1.5.0 release. Also closing as part of cleanup now that Athena is officially supported by Metabase.