metering-operator

presto other database support

Open kfox1111 opened this issue 6 years ago • 4 comments

Would it be possible to point Presto directly at MySQL/PostgreSQL and support disabling the Hive deployment? That may be much simpler to use for those who don't have large clusters.

kfox1111, Jul 11 '19 00:07

Supporting other databases is in the backlog and is something I've wanted for a while, but it's lower priority right now because we're working on a GA release. Adding support for other databases is relatively simple, but removing Hive is trickier.

Removing Hive is somewhat difficult because we use the map datatype in Presto/Hive for Prometheus metric data, which, for Presto, is basically only supported by Hive right now. We would need a way to configure prometheusMetricImporterDataSources with a flat column structure without the maps, mapping labels from the results into columns via configuration in the datasource. Then we would need to update some of the ReportQueries to handle the new data model, but that would actually be fairly easy since we already model the tables like this, using views. After this, we could use MySQL/PostgreSQL for storing metric data.
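To illustrate the flat column idea, here is a minimal Python sketch of mapping labels into columns via configuration. It is not code from metering-operator: the LABEL_COLUMNS mapping, the flatten_metric helper, and the sample field names (timestamp, amount, labels) are all assumptions made for the example.

```python
from typing import Any, Dict

# Hypothetical datasource configuration: Prometheus label name -> column name.
# Labels not listed here are dropped from the flat row.
LABEL_COLUMNS: Dict[str, str] = {
    "pod": "pod",
    "namespace": "namespace",
    "node": "node",
}


def flatten_metric(sample: Dict[str, Any],
                   label_columns: Dict[str, str]) -> Dict[str, Any]:
    """Turn one metric sample with a `labels` map into a flat row.

    The sample shape (timestamp, amount, labels) is assumed for this sketch.
    """
    labels = sample.get("labels", {})
    row: Dict[str, Any] = {
        "timestamp": sample["timestamp"],
        "amount": sample["amount"],
    }
    for label, column in label_columns.items():
        # A missing label becomes None, which would be stored as SQL NULL.
        row[column] = labels.get(label)
    return row


if __name__ == "__main__":
    sample = {
        "timestamp": 1562803200,
        "amount": 0.5,
        "labels": {"pod": "web-1", "namespace": "default", "job": "kubelet"},
    }
    print(flatten_metric(sample, LABEL_COLUMNS))
```

Each configured label becomes an ordinary column, so the resulting rows fit a plain MySQL/PostgreSQL table with no map type. The cost is that any label left out of the configuration is lost at import time.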

We're also exploring options like a native Prometheus connector for Presto, which would potentially allow us to stop importing metrics altogether and could make this a lot easier. In that case, we would have dataSources which simply map directly to Prometheus time series tables in Presto, we would write our reportQueries against those tables, and reports could then store their data in MySQL/PostgreSQL.

chancez, Jul 11 '19 03:07

If you plan to fully support Postgres for storing metrics (jsonb could be an alternative to the map type), consider the Timescale extension which has magic table partitioning under the hood. It has some limitations (no surrogate primary keys etc) but otherwise it's 98% compatible with much better performance.

mmariani, Jul 11 '19 09:07

Interesting. Thanks for the info.

kfox1111, Jul 11 '19 15:07

@mmariani jsonb isn't supported by Presto, and leveraging TimescaleDB wouldn't help much, since Presto would likely be unable to push down much of the filtering today.

chancez, Jul 11 '19 18:07