qbeast-spark
Support schema evolution
Schema evolution is a feature of Delta Lake that allows users to easily change a table’s current schema to accommodate data that is changing over time. Most commonly, it’s used when performing an append or overwrite operation, to automatically adapt the schema to include one or more new columns.
Qbeast format does not currently support this type of change; at the very least, users cannot specify new columns to index with Qbeast. We should investigate this topic, focusing on possible side effects. The proposal is to use SpaceRevision or Revision to store the new schema information and to treat each new revision as a new index to query.
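To make the proposal concrete, here is a minimal sketch in plain Python of how revisions could carry schema information. It is not Qbeast code: the `Revision` fields, the `RevisionLog` class, and the `plan` method are hypothetical names chosen for illustration. The assumption is that a write whose schema or indexed columns differ from the latest revision opens a new revision, and that a query visits every revision but can only use the index in revisions that cover a filtered column.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Revision:
    """Hypothetical revision record: one index per (schema, indexed columns) pair."""
    revision_id: int
    schema: Tuple[Tuple[str, str], ...]  # sorted (column name, type name) pairs
    indexed_columns: Tuple[str, ...]


class RevisionLog:
    """Hypothetical in-memory log of revisions, oldest first."""

    def __init__(self) -> None:
        self.revisions: List[Revision] = []

    def latest(self) -> Optional[Revision]:
        return self.revisions[-1] if self.revisions else None

    def append(self, schema: Dict[str, str], indexed_columns: List[str]) -> Revision:
        """Return the latest revision if it matches the write, else open a new one."""
        missing = [c for c in indexed_columns if c not in schema]
        if missing:
            raise ValueError(f"indexed columns not in schema: {missing}")
        schema_key = tuple(sorted(schema.items()))
        columns_key = tuple(indexed_columns)
        last = self.latest()
        if last and last.schema == schema_key and last.indexed_columns == columns_key:
            return last
        revision = Revision(len(self.revisions) + 1, schema_key, columns_key)
        self.revisions.append(revision)
        return revision

    def plan(self, filter_columns: List[str]) -> List[Tuple[int, bool]]:
        """For each revision, report whether its index covers any filtered column."""
        wanted = set(filter_columns)
        return [
            (r.revision_id, bool(wanted & set(r.indexed_columns)))
            for r in self.revisions
        ]
```

In this sketch, appending data with an extra column `c` after an initial write on `a` and `b` yields a second revision, and a filter on `c` can use the index only in that second revision; the first one would have to be scanned without pruning. Whether that trade-off is acceptable is one of the side effects to investigate.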
More information about the Delta feature can be found at: https://databricks.com/blog/2019/09/24/diving-into-delta-lake-schema-enforcement-evolution.html
In the meantime, the Qbeast commit log structure should support the upcoming development of this feature, meaning that information such as the number of dimensions or the indexed columns should be saved in the commit log.
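As a rough illustration of what such a commit log entry might hold, the sketch below serializes revision metadata to a JSON line and validates it on read. The field names (`revisionId`, `dimensionCount`, `indexedColumns`, `schema`) and both helper functions are assumptions made for this example and do not describe the actual Qbeast or Delta log layout.

```python
import json
from typing import Dict, List


def to_commit_entry(revision_id: int, indexed_columns: List[str],
                    schema: Dict[str, str]) -> str:
    """Serialize hypothetical revision metadata as a single JSON line."""
    metadata = {
        "revisionId": revision_id,
        "dimensionCount": len(indexed_columns),
        "indexedColumns": list(indexed_columns),
        "schema": schema,  # column name -> type name
    }
    return json.dumps(metadata, sort_keys=True)


def from_commit_entry(line: str) -> dict:
    """Parse a JSON line and check that the stored fields agree with each other."""
    metadata = json.loads(line)
    columns = metadata["indexedColumns"]
    if metadata["dimensionCount"] != len(columns):
        raise ValueError("dimensionCount does not match indexedColumns")
    unknown = [c for c in columns if c not in metadata["schema"]]
    if unknown:
        raise ValueError(f"indexed columns missing from schema: {unknown}")
    return metadata
```

Storing the dimension count alongside the column list is redundant, but it gives readers a cheap consistency check before they interpret an index built under an older schema.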