feat(flink): temporal join support

Open mfatihaktas opened this issue 2 years ago • 3 comments

Description of changes

Exploratory PR towards addressing https://github.com/ibis-project/ibis/issues/8247.

This is my first time getting hands-on with sqlglot changes :) So, particularly seeking suggestions on

  • Temporal join API
  • The newly added op VersionedDatabaseTable and expression TemporalJoin.

Note: Added support only in the Flink backend for now. I have not yet worked out how to raise an error when the user calls temporal_join() on a backend that does not support temporal joins. I plan to address this once we reach an agreement on the API.

Issues closed

mfatihaktas avatar Feb 21 '24 17:02 mfatihaktas

Is there another ibis backend which supports this kind of temporal join? I would like to avoid introducing global APIs highly tailored towards certain backends since it would defeat the purpose of ibis. Having it implemented for another backend could validate the API and IR design we have here.

kszucs avatar Feb 27 '24 09:02 kszucs

Just had a closer look at the issue description which you have nicely collected the necessary information in, apparently the options are RisingWave and MySQL. How much work would it be to support risingwave here?

kszucs avatar Feb 27 '24 09:02 kszucs

> Is there another ibis backend which supports this kind of temporal join? I would like to avoid introducing global APIs highly tailored towards certain backends since it would defeat the purpose of ibis. Having it implemented for another backend could validate the API and IR design we have here.

> Just had a closer look at the issue description which you have nicely collected the necessary information in, apparently the options are RisingWave and MySQL. How much work would it be to support risingwave here?

Adding support for RisingWave would have been great, but RisingWave seems to support only processing-time temporal joins, whereas this PR adds support only for event-time temporal joins. In an event-time temporal join, each left row is joined with the right-table version that was valid at the time given by a time attribute of that row (at_time in this PR). In a processing-time temporal join, rows are always joined with the most recent version in the right table.
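
To make the difference concrete, here is a minimal pure-Python sketch of the two semantics. It does not use any ibis, Flink, or RisingWave API; the data, the `rate` column, and the helper names (`version_as_of`, `event_time_join`, `processing_time_join`) are hypothetical and exist only for illustration. It also assumes inner-join behaviour and treats "most recent" as the last version in a static list, which glosses over how a streaming engine sees versions arrive over time.

```python
from bisect import bisect_right

# Hypothetical versioned right table: per key, (valid_from, rate) sorted by valid_from.
rate_versions = {
    "EUR": [(1, 1.10), (5, 1.20), (9, 1.15)],
}

# Hypothetical left table; order_time plays the role of the time attribute (at_time).
orders = [
    {"order_id": "a", "currency": "EUR", "order_time": 3},
    {"order_id": "b", "currency": "EUR", "order_time": 6},
    {"order_id": "c", "currency": "EUR", "order_time": 0},
]


def version_as_of(versions, ts):
    """Return the value of the latest version with valid_from <= ts, or None."""
    idx = bisect_right([valid_from for valid_from, _ in versions], ts)
    return versions[idx - 1][1] if idx else None


def event_time_join(left_rows, right_versions, key, at_time):
    """Join each left row with the right version valid at the row's own time."""
    out = []
    for row in left_rows:
        value = version_as_of(right_versions.get(row[key], []), row[at_time])
        if value is not None:  # assumed inner join: rows without a version are dropped
            out.append({**row, "rate": value})
    return out


def processing_time_join(left_rows, right_versions, key):
    """Join each left row with the most recent right version, ignoring row time."""
    out = []
    for row in left_rows:
        versions = right_versions.get(row[key], [])
        if versions:
            out.append({**row, "rate": versions[-1][1]})
    return out


event_rows = event_time_join(orders, rate_versions, "currency", "order_time")
processing_rows = processing_time_join(orders, rate_versions, "currency")
```

In this sketch, orders "a" and "b" pick up different rates under the event-time join because their `order_time` values fall into different version intervals, and order "c" is dropped because no version existed yet. Under the processing-time join, all three orders get the latest rate.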

For MySQL, I could not find enough information on whether it supports temporal join or not. At the time, I could find only this article that discusses support for performing joins against temporal tables, which does not have the same semantics as Flink's temporal join. In my understanding, it is for joining against a particular version of the table that is specified by a given timestamp rather than a time attribute. So it is like first performing a time travel on the table(s) and then joining them.
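
As a rough illustration of that "time travel, then join" reading, the sketch below first materializes the right table as of one fixed timestamp and then performs an ordinary join. This is only my interpretation expressed in plain Python; the data and the names `snapshot_at` and `time_travel_join` are hypothetical and do not correspond to any MySQL or ibis API.

```python
# Hypothetical versioned right table: per key, (valid_from, rate) sorted by valid_from.
rate_versions = {
    "EUR": [(1, 1.10), (5, 1.20), (9, 1.15)],
}

# Hypothetical left table.
orders = [
    {"order_id": "a", "currency": "EUR", "order_time": 3},
    {"order_id": "b", "currency": "EUR", "order_time": 6},
]


def snapshot_at(versions_by_key, ts):
    """Time travel: keep, per key, the latest value with valid_from <= ts."""
    snapshot = {}
    for key, versions in versions_by_key.items():
        visible = [value for valid_from, value in versions if valid_from <= ts]
        if visible:
            snapshot[key] = visible[-1]
    return snapshot


def time_travel_join(left_rows, versions_by_key, key, ts):
    """Join every left row against the single snapshot taken at timestamp ts."""
    snapshot = snapshot_at(versions_by_key, ts)
    return [
        {**row, "rate": snapshot[row[key]]}
        for row in left_rows
        if row[key] in snapshot  # assumed inner join
    ]


joined = time_travel_join(orders, rate_versions, "currency", ts=5)
```

Here both orders receive the rate that was valid at the one timestamp passed in, regardless of their own `order_time`, which is what distinguishes this from an event-time temporal join.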

mfatihaktas avatar Feb 27 '24 15:02 mfatihaktas

Closing, as I don't think I can maintain this code at the moment. We can always revisit later!

cpcloud avatar Sep 30 '24 11:09 cpcloud