support for duckdb

Open geoHeil opened this issue 2 years ago • 10 comments
https://openmetadata.slack.com/archives/C02B6955S4S/p1697099341468359

https://github.com/open-metadata/OpenMetadata/blob/f63881b8b6f78b39a4a014eb3f67df62ce170780/ingestion/src/metadata/ingestion/lineage/models.py#L75 is listing duckdb

DBT supports duckdb

But for OM to ingest dbt's nodes, the duckdb tables would somehow need to be loaded beforehand.

geoHeil avatar Oct 12 '23 11:10 geoHeil

duckdb is a supported dialect for the lineage engine, but we do not have a connector yet.

It would be a good contribution from our community. Watching the tutorial or reviewing similar PRs is a good way to get started.

Thanks!

pmbrull avatar Oct 25 '23 08:10 pmbrull

I see this issue is still open. I am interested in this project and contributing to it. Please assign a good first issue to me to work on. Thank you!

saurabhyadavdev avatar Dec 14 '23 17:12 saurabhyadavdev

hi @saurabhyadav1985, assigned, thanks

pmbrull avatar Dec 16 '23 14:12 pmbrull

Hey @saurabhyadav1985, you forgot to add the DuckDB.md file in openmetadata-ui/src/main/resources/ui/public/locales/en-US/Database. Please check it.

Supan90-Shah3006 avatar Jan 25 '24 07:01 Supan90-Shah3006

hi @saurabhyadav1985 I see that the duckdb connection is a copy of the greenplum connection https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/entity/services/connections/database/duckdbConnection.json

What properties do we actually need to connect there?

Actually, looks like almost all the greenplum code has been copy-pasted into DuckDB. Not sure that's the best approach here, since DuckDB might have its own types, connection specifications etc. This might need another iteration to get the right information.

I think it's worth reverting the change and spending some time reviewing the requirements of the connector, rather than shipping it as-is and having trouble ingesting and migrating data after this gets updated.

pmbrull avatar Jan 29 '24 08:01 pmbrull

@pmbrull what is the state of this? What exactly is missing? Do you have some clear instructions? Maybe I can find time to contribute.

geoHeil avatar Mar 07 '24 12:03 geoHeil

@geoHeil the past contribution did not solve the actual problem at hand, so it was reverted to avoid any confusion.

If you'd like to contribute, you can follow this guide https://docs.open-metadata.org/v1.3.x/developers/contribute/developing-a-new-connector

Thanks

pmbrull avatar Mar 08 '24 09:03 pmbrull

This is really generic - if we want to reuse a DB connector, let's say postgres, as a template for duckdb - can we speed up the process? I.e. is it enough to create only the ingestion connector and, for the data model, keep whatever postgres is offering (as that should be the same on the OM server side)?

geoHeil avatar Mar 08 '24 13:03 geoHeil

> This is really generic - if we want to reuse a DB connector, let's say postgres, as a template for duckdb - can we speed up the process? I.e. is it enough to create only the ingestion connector and, for the data model, keep whatever postgres is offering (as that should be the same on the OM server side)?

You can take other PRs as examples. I shared one above. But in the end, type mapping, sqlalchemy etc. need to be specific to each connector. The overall framework is already designed to force you to touch as few things as possible.
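To illustrate why type mapping has to be connector-specific, here is a minimal sketch of how DuckDB type strings might be normalized. It is not taken from the OpenMetadata codebase: the `DUCKDB_TO_GENERIC` table, the `map_duckdb_type` helper and the target type names are all hypothetical, and a real connector would map to whatever types the OM schema actually defines.

```python
import re

# Hypothetical mapping from DuckDB base type names to generic catalog type
# names. The right-hand values are illustrative assumptions, not OM's enum.
DUCKDB_TO_GENERIC = {
    "BOOLEAN": "BOOLEAN",
    "TINYINT": "TINYINT",
    "SMALLINT": "SMALLINT",
    "INTEGER": "INT",
    "BIGINT": "BIGINT",
    "HUGEINT": "NUMBER",  # 128-bit integer, no direct Postgres equivalent
    "FLOAT": "FLOAT",
    "DOUBLE": "DOUBLE",
    "DECIMAL": "DECIMAL",
    "VARCHAR": "VARCHAR",
    "BLOB": "BLOB",
    "DATE": "DATE",
    "TIME": "TIME",
    "TIMESTAMP": "TIMESTAMP",
    "STRUCT": "STRUCT",
    "MAP": "MAP",
}


def map_duckdb_type(type_name: str) -> str:
    """Map a DuckDB type string such as 'DECIMAL(18,3)' to a generic name."""
    cleaned = type_name.strip().upper()
    # List and fixed-size array types are written with a trailing [] or [n].
    if cleaned.endswith("]"):
        return "ARRAY"
    # Drop parameters and qualifiers: 'DECIMAL(18,3)' -> 'DECIMAL',
    # 'TIMESTAMP WITH TIME ZONE' -> 'TIMESTAMP'.
    base = re.split(r"[(\s]", cleaned, maxsplit=1)[0]
    return DUCKDB_TO_GENERIC.get(base, "UNKNOWN")
```

Nested types such as `STRUCT(...)`, `MAP(...)` and `HUGEINT` are the kind of detail a copied Greenplum or Postgres mapping would likely miss.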

pmbrull avatar Mar 08 '24 15:03 pmbrull

I have created some preliminary DuckDB support - however, it lives outside of OM's standard ingestion framework and simply calls the API manually. Would anyone be interested in re-using this?

geoHeil avatar Aug 16 '24 07:08 geoHeil

> I have created some preliminary DuckDB support - however, it lives outside of OM's standard ingestion framework and simply calls the API manually. Would anyone be interested in re-using this?

Hi @geoHeil! What's the state of this? I would also contribute or even build a basic custom connector, as we will definitely need this.

pinkerltm avatar Jun 20 '25 11:06 pinkerltm

I am using a home-grown connector. It does not have all the features/bells & whistles - but if desired I could share it somehow.

geoHeil avatar Jun 22 '25 19:06 geoHeil

I am not sure if I would have the time to make a full-blown integration here - it is basically a monkey-patched version of the python integration, where I use some python code to compute the metadata which is needed and then push it to the OM API.

For a proper integration (i.e. showing the duckdb icon instead of the pg one used here) someone would have to do more.
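For readers wondering what "compute the metadata and push it" could look like, here is a rough sketch of the compute step only. This is not the actual home-grown connector: `COLUMNS_QUERY`, `build_table_payloads` and the payload shape are hypothetical, and the final call to the OM API is deliberately left out. The rows are assumed to come from running the query through the duckdb Python package.

```python
from collections import defaultdict

# Query one could run against DuckDB to list columns. It is shown for
# context only and is not executed by this sketch.
COLUMNS_QUERY = """
SELECT table_schema, table_name, column_name, data_type, ordinal_position
FROM information_schema.columns
ORDER BY table_schema, table_name, ordinal_position
"""


def build_table_payloads(rows):
    """Group (schema, table, column, type, position) rows into one dict per table.

    The dict layout is an illustrative assumption, not OM's request schema.
    """
    grouped = defaultdict(list)
    for schema, table, column, data_type, position in rows:
        grouped[(schema, table)].append((position, column, data_type))

    payloads = []
    for (schema, table), cols in sorted(grouped.items()):
        payloads.append(
            {
                "schema": schema,
                "name": table,
                # Sorting the tuples orders the columns by ordinal position.
                "columns": [
                    {"name": name, "dataType": dtype}
                    for _, name, dtype in sorted(cols)
                ],
            }
        )
    return payloads
```

Each resulting dict would then be translated into whatever request the OM API expects for creating or updating a table.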

geoHeil avatar Jun 22 '25 19:06 geoHeil