Features to Enhance Search Capabilities Using OpenSearch

Open dpengpeng opened this issue 3 months ago • 8 comments

What is the current progress on using OpenSearch to enhance the search capability? The most recent OpenSearch code is from a year ago, and the interface currently available is the beta v2 interface.

dpengpeng avatar Sep 29 '25 01:09 dpengpeng

Thanks for opening your first issue in the Marquez project! Please be sure to follow the issue template!

boring-cyborg[bot] avatar Sep 29 '25 01:09 boring-cyborg[bot]

Hi @dpengpeng

Just to let you know, we are working on updating OpenSearch to the newest version, but it is not released yet. In the meantime, you can already use the simple search functionality, which does not need OpenSearch to work. You can check the API (which will be updated in the few days after the release of 0.53.1).

In case it's useful, it's here: https://github.com/ilum-cloud/marquez

thijs-s avatar Sep 29 '25 16:09 thijs-s

@thijs-s Thank you. I will try out your enhanced version. Has the Marquez team stopped maintaining this project? There hasn't been any code update in the last six months.

dpengpeng avatar Sep 30 '25 08:09 dpengpeng

Unfortunately yes :(

This is why we created a fork while upstream activity was paused: so we could ship performance and security fixes quickly without breaking compatibility. All changes are additive (API-compatible). From 0.53.x we’re aligning with upstream and contributing improvements back. We also don’t use marquez-web internally; we run our own UI optimized for large lineages and plan to open-source it.

thijs-s avatar Sep 30 '25 11:09 thijs-s

@thijs-s Why did the Marquez project choose a PG database to store lineage relationships when, theoretically, a graph database should perform better? Do you have plans to switch your private project to a graph database?

dpengpeng avatar Oct 16 '25 12:10 dpengpeng

@dpengpeng Simplicity: it let a fresh project start without too many external dependencies. But you are totally right. For enterprise-level tools, there is no better way forward than putting a real graph DB behind it.

At ilum we are planning to migrate the current structure but still keep it in PG, using extensions like Apache AGE for full graph support. If you have any recommendations for an alternative approach, I'm all ears.
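
For context on the trade-off, lineage traversal in a plain relational store is typically written as a recursive query. Below is a minimal sketch, assuming a hypothetical `edges(src, dst)` table (not Marquez's actual schema) and using SQLite in place of PG so it runs standalone. The helper name `downstream` and the sample node names are also made up for illustration.

```python
import sqlite3


def downstream(conn, start):
    """Return every node reachable from `start` by following edges."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            -- direct successors of the start node
            SELECT dst FROM edges WHERE src = ?
            UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT e.dst FROM edges AS e JOIN reach AS r ON e.src = r.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical lineage: datasets and jobs are both stored as plain node names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [
        ("raw.orders", "job.clean_orders"),
        ("job.clean_orders", "stg.orders"),
        ("stg.orders", "job.build_report"),
        ("job.build_report", "mart.report"),
        ("raw.users", "job.build_report"),
    ],
)

print(downstream(conn, "raw.orders"))
```

The same `WITH RECURSIVE` form is available in PostgreSQL. A graph extension such as Apache AGE would express this kind of traversal as a Cypher path pattern instead, which is usually easier to read once queries need depth limits or path filtering.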

thijs-s avatar Oct 16 '25 13:10 thijs-s

Apache AGE is indeed a good idea; I will also look into this extension. My original idea was to completely replace the PG database with a graph database and insert the OpenLineage data into it. However, the rework involved would be large.

dpengpeng avatar Oct 17 '25 02:10 dpengpeng

Indeed, that might affect too many users, but I will ask my team to reconsider both options. Maybe we will come up with a strategy for supporting both during the transition period.

thijs-s avatar Oct 17 '25 13:10 thijs-s