Where can we find examples of serializing a view as a plan?
The example use cases on the homepage include this interesting line:
Serialize a plan that represents a SQL view for consistent use in multiple systems (e.g. Iceberg views in Spark and Trino)
It's a really awesome example, but I can't find any related Substrait code in Iceberg, Spark, or Trino.
Did I miss something?
BTW, I think it would be friendlier to link an example from each use case on the homepage.
I think that these examples are "examples of potential future uses."
Interestingly enough, there has been a discussion on the Iceberg mailing list in the last few weeks to make exactly that envisioned use case a reality.
Gluten (a Spark plugin) has modified Substrait to read Iceberg files. Upstreaming that modification is on my list to do at some point:
https://github.com/apache/incubator-gluten/blob/main/gluten-substrait/src/main/resources/substrait/proto/substrait/algebra.proto#L152
More generally speaking, no interesting example currently exists. Making an interesting one depends on a database having an interesting way of querying a view.
I threw together a simple example using ibis and duckdb here: query-duckdb-view
Representing a query of a view can happen in a variety of ways: ReadRel and ExtensionLeafRel are two specific operators, and even ReadRel offers a handful of approaches via its oneof read_type group of attributes. The provided example just uses ReadRel.named_table (I think).
Then, various systems will likely present views in different ways, though I assume many will resolve it at the catalog level: a "table name" that matches a view name will read from the view and be otherwise transparent.
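To illustrate catalog-level resolution, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for any engine (it is not DuckDB, Iceberg, Spark, or Trino, and it involves no Substrait). The helper `read_named` and the table and view names are hypothetical; the point is only that the caller passes a name and the catalog decides whether it refers to a table or a view.

```python
import sqlite3

# In-memory database standing in for an engine's catalog (an assumption for illustration).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 250.0), (3, 75.5)],
)
con.execute(
    "CREATE VIEW large_orders AS SELECT id, amount FROM orders WHERE amount > 50"
)


def read_named(con, name):
    """Read all rows from `name`, which may be a table or a view.

    The caller never says which one it is; the catalog (sqlite_master here)
    is the only place that knows.
    """
    kind = con.execute(
        "SELECT type FROM sqlite_master WHERE name = ?", (name,)
    ).fetchone()
    if kind is None:
        raise LookupError(f"no table or view named {name!r}")
    # The name was just confirmed to exist in the catalog, so quoting it is safe here.
    rows = con.execute(f'SELECT * FROM "{name}" ORDER BY id').fetchall()
    return kind[0], rows


# Same call shape for both; only the catalog lookup differs.
print(read_named(con, "orders"))
print(read_named(con, "large_orders"))
```

A consumer that resolves ReadRel.named_table this way would not need to know that the name refers to a view at all.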
Altogether, a logical example would be:
1. Produce a Substrait plan that specifies the name of a view in either a ReadRel or an ExtensionLeafRel.
2. When consuming the Substrait plan, either:
   - resolve the view directly (if the plan explicitly mentions a view name)
   - resolve the view indirectly (e.g. if the plan specifies the view via ReadRel.named_table)
3. The query completes per usual.
How a producer does (1) and how a consumer does (2) is where you'd get a variety of interesting examples (maybe). If there are particular examples you'd like, maybe you can propose them? I don't use Iceberg, Spark, or Trino, so I don't have an environment in which I can produce examples.
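As a rough illustration of (1) and (2), here is a pure-Python sketch that uses plain dicts loosely shaped like the JSON form of a Substrait plan. It does not use the real Substrait protobuf classes or any real engine. `Catalog`, `produce_named_table_plan`, `produce_view_plan`, `consume`, and the `viewName` payload key are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy catalog: base tables hold rows, views hold a function that computes rows."""
    tables: dict = field(default_factory=dict)  # name -> list of row dicts
    views: dict = field(default_factory=dict)   # name -> callable(catalog) -> rows


def produce_named_table_plan(name):
    # Producer option A: reference the view by name, as if it were a table.
    # Loosely mirrors ReadRel.named_table; a real plan carries much more (schema, etc.).
    return {"read": {"namedTable": {"names": [name]}}}


def produce_view_plan(name):
    # Producer option B: say explicitly that this is a view.
    # The payload layout is an assumption; an ExtensionLeafRel's detail is engine-defined.
    return {"extensionLeaf": {"detail": {"viewName": name}}}


def consume(plan, catalog):
    if "extensionLeaf" in plan:
        # Direct resolution: the plan names a view explicitly.
        name = plan["extensionLeaf"]["detail"]["viewName"]
        return catalog.views[name](catalog)
    # Indirect resolution: the plan only has a name; the catalog decides what it is.
    name = plan["read"]["namedTable"]["names"][-1]
    if name in catalog.views:
        return catalog.views[name](catalog)
    return catalog.tables[name]


catalog = Catalog(
    tables={"orders": [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 250.0}]},
    views={"large_orders": lambda c: [r for r in c.tables["orders"] if r["amount"] > 50]},
)

# Both plans resolve to the same rows; only the resolution path differs.
print(consume(produce_named_table_plan("large_orders"), catalog))
print(consume(produce_view_plan("large_orders"), catalog))
```

In a real system the interesting part is what replaces the lambda: a stored SQL string, a nested Substrait plan, or an engine-specific view definition.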