qbeast-spark
                        Add Convert To Qbeast
Currently, the only way to write data in the Qbeast format is to load it and write it again with the Spark DataFrame API.
It would be good to have easier ways to convert data from other formats to Qbeast, in a way that stays compatible with reading when no Qbeast metadata is found.
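For reference, the load-and-rewrite path described above looks roughly like the sketch below (not self-contained: it assumes a running Spark session with the qbeast-spark dependency on the classpath; the paths and column names are placeholders):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Load the existing data (Parquet as an example source format)
val df = spark.read.format("parquet").load("/data/source-table")

// Rewrite it in Qbeast format; "columnsToIndex" selects the index columns
df.write
  .format("qbeast")
  .option("columnsToIndex", "user_id,price") // placeholder column names
  .save("/data/qbeast-table")
```

This works, but it duplicates the whole dataset, which is what a convert command would avoid.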
For that, we can think of two approaches:
- Write the data in the same place, but organized with the Qbeast index. If more data is added while the conversion is taking place, we treat that data as non-indexed and read all of it when needed.
- Write the data in the same place and mark it as replicated cubes, so that we only duplicate the data needed for optimization.
Doubts/things we need to figure out:
- How to specify the columns to index in the API
- How to handle partitioning? Would it be useful to index the columns that appear in the partition values?
- Study the feasibility of the second approach
- Study the integration with the Keeper
- Other design problems that could arise
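As a starting point for the API discussion, one possible shape for the command could mirror Delta Lake's `CONVERT TO DELTA`. The sketch below is entirely hypothetical: `ConvertToQbeastCommand` and all of its parameters are invented here for illustration and do not exist in the codebase.

```scala
// Hypothetical command, invented for illustration only.
// It would register the existing files under a Qbeast index in place,
// taking the columns to index explicitly (one of the open questions above).
ConvertToQbeastCommand(
  path = "/data/parquet-table",       // existing table location
  sourceFormat = "parquet",           // format of the files being converted
  columnsToIndex = Seq("user_id", "price")
).run(spark)
```

A SQL surface (e.g. something like `CONVERT TO QBEAST parquet.`/data/parquet-table`` with an index-columns clause) could be layered on top of the same command, as Delta does.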