Presto template processor functions are not available in Trino
Hi,
We are migrating from Presto to Trino and noticed that the template processor functions don't work with Trino. Many of our datasets use the first_latest_partition function.
How to reproduce the bug
The query below works with the Presto driver but fails with the Trino driver.
SELECT *
FROM foo
WHERE ds='{{ presto.first_latest_partition('foo') }}'
Expected results
Template processor functions should also work for Trino.
Actual results
When using the Trino driver, an error is thrown because there are no template processor functions for Trino.
https://github.com/apache/superset/blob/master/superset/jinja_context.py#L553
Screenshots
Environment
- superset version: 1.3.0
I looked at the master code, and it seems this will also be an issue for 1.4.0.
Checklist
Make sure to follow these steps before submitting your issue - thank you!
- [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [x] I have reproduced the issue with at least the latest released version of superset.
- [x] I have checked the issue tracker for the same issue and I haven't found one similar.
@aaronfeng any luck with this?
@pashkash no, the workaround is to use max(ds), but I believe that will scan all partitions.
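One possible way to avoid the full scan, sketched here as an assumption rather than a confirmed fix from this thread: if the table is served by Trino's Hive connector, the hidden "foo$partitions" table lists one row per partition, so the maximum can be read from partition metadata instead of the data files. The helper name latest_partition_filter is hypothetical and only builds the predicate string; it handles unqualified table names only.

```python
def latest_partition_filter(table: str, column: str = "ds") -> str:
    """Build a predicate that selects the latest partition value.

    Assumption: the table lives in a Trino Hive-connector catalog, where the
    hidden "<table>$partitions" table exposes one row per partition.
    Only unqualified table names are handled in this sketch.
    """
    # Quote identifiers; embedded double quotes are escaped by doubling them.
    partitions_table = '"' + (table + "$partitions").replace('"', '""') + '"'
    quoted_column = '"' + column.replace('"', '""') + '"'
    return f"{quoted_column} = (SELECT max({quoted_column}) FROM {partitions_table})"


# Rebuild the query from the issue description without the Jinja macro.
query = f"SELECT *\nFROM foo\nWHERE {latest_partition_filter('foo')}"
print(query)
```

Whether this is actually cheaper than max(ds) on the base table depends on the connector and table layout, so it is worth checking the query plan before relying on it.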
Maybe @bkyryliuk @nytai @villebro can confirm whether this is an issue we ought to keep open or close as inactive/deprioritized?
This is at risk of being closed as stale, but meanwhile, hopefully @dosu-bot can give us a little added insight.
Hey @rusackas! I'm Dosu, a bot here to assist you with bugs, questions, and contributions while you're waiting for a human maintainer. Let's squash those bugs together! 🤖
To make the first_latest_partition template processor function work with Trino in Superset, you can use the TrinoTemplateProcessor class. This class ensures compatibility with Trino by inheriting from PrestoTemplateProcessor and setting up the context correctly for Trino.
Here's how you can set it up:
- Define the TrinoTemplateProcessor class:

      class TrinoTemplateProcessor(PrestoTemplateProcessor):
          engine = "trino"

          def process_template(self, sql: str, **kwargs: Any) -> str:
              template = self.env.from_string(sql)
              kwargs.update(self._context)

              # Backwards compatibility if migrating from Presto.
              context = validate_template_context(self.engine, kwargs)
              context["presto"] = context["trino"]
              return template.render(context)

- Update the DEFAULT_PROCESSORS dictionary in your Superset configuration:

      DEFAULT_PROCESSORS = {
          "presto": PrestoTemplateProcessor,
          "hive": HiveTemplateProcessor,
          "spark": SparkTemplateProcessor,
          "trino": TrinoTemplateProcessor,
      }

- Ensure that the TrinoEngineSpec class handles the latest_partition functionality. This is already implemented in the TrinoEngineSpec class:

      class TrinoEngineSpec(PrestoBaseEngineSpec):
          engine = "trino"
          engine_name = "Trino"
          allows_alias_to_source_column = False

          @classmethod
          def get_extra_table_metadata(
              cls,
              database: Database,
              table: Table,
          ) -> dict[str, Any]:
              metadata = {}

              if indexes := database.get_indexes(table):
                  col_names, latest_parts = cls.latest_partition(
                      database,
                      table,
                      show_first=True,
                      indexes=indexes,
                  )

                  if not latest_parts:
                      latest_parts = tuple([None] * len(col_names))

                  metadata["partitions"] = {
                      "cols": sorted(
                          list(
                              {
                                  column_name
                                  for index in indexes
                                  if index.get("name") == "partition"
                                  for column_name in index.get("column_names", [])
                              }
                          )
                      ),
                      "latest": dict(zip(col_names, latest_parts)),
                      "partitionQuery": cls._partition_query(
                          table=table,
                          indexes=indexes,
                          database=database,
                      ),
                  }

              if database.has_view(Table(table.table, table.schema)):
                  with database.get_inspector(
                      catalog=table.catalog,
                      schema=table.schema,
                  ) as inspector:
                      metadata["view"] = inspector.get_view_definition(
                          table.table,
                          table.schema,
                      )

              return metadata

- Testing: Ensure that the template processor is tested for Trino. You can refer to the integration tests to see how the template processor is tested for different backends, including Trino:

      def test_template_trino(app_context: AppContext, mocker: MockerFixture) -> None:
          lp_mock = mocker.patch(
              "superset.jinja_context.TrinoTemplateProcessor.latest_partition"
          )
          lp_mock.return_value = "the_latest"
          database = mock.Mock()
          database.backend = "trino"
          template = "{{ trino.latest_partition('my_table') }}"
          tp = get_template_processor(database=database)
          assert tp.process_template(template) == "the_latest"

          # Backwards compatibility if migrating from Presto.
          template = "{{ presto.latest_partition('my_table') }}"
          tp = get_template_processor(database=database)
          assert tp.process_template(template) == "the_latest"
By following these steps, the first_latest_partition function should work with Trino in the same way it works with Presto [1][2][3].
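To make the aliasing idea above concrete, here is a minimal, self-contained sketch in plain Python. It is a toy model, not Superset's implementation: BaseProcessor, PrestoProcessor, TrinoProcessor, PROCESSORS, and get_context are all hypothetical stand-ins. It shows how a registry keyed by backend name and a "presto" alias in the Trino context would let templates written for Presto keep resolving.

```python
from typing import Any, Callable, Dict


class BaseProcessor:
    """Toy stand-in for a template processor (not Superset's class)."""

    engine = ""

    def namespace(self) -> Dict[str, Callable[..., Any]]:
        # Engines without macros expose nothing.
        return {}

    def context(self) -> Dict[str, Any]:
        # Macros are exposed under the engine name, e.g. {"presto": {...}}.
        ns = self.namespace()
        return {self.engine: ns} if ns else {}


class PrestoProcessor(BaseProcessor):
    engine = "presto"

    def namespace(self) -> Dict[str, Callable[..., Any]]:
        # Placeholder macro; a real one would query partition metadata.
        return {"first_latest_partition": lambda table: f"latest-of-{table}"}


class TrinoProcessor(PrestoProcessor):
    engine = "trino"

    def context(self) -> Dict[str, Any]:
        ctx = super().context()
        # Alias so templates that still say "presto." keep working.
        ctx["presto"] = ctx["trino"]
        return ctx


# Registry keyed by backend name; unknown backends fall back to the base class.
PROCESSORS = {"presto": PrestoProcessor, "trino": TrinoProcessor}


def get_context(backend: str) -> Dict[str, Any]:
    return PROCESSORS.get(backend, BaseProcessor)().context()


ctx = get_context("trino")
print(ctx["presto"]["first_latest_partition"]("foo"))
print(get_context("unregistered"))
```

In this model, a backend with no registered processor gets an empty context, which corresponds to the failure described in the issue: the template refers to a namespace that was never defined for that engine.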