Allow CREATE TABLE without schema

Open osopardo1 opened this issue 1 year ago • 0 comments

Right now, trying to create a Qbeast table without a schema raises the following exception:


spark.sql("CREATE TABLE t USING qbeast LOCATION '/tmp/test'")

org.apache.spark.sql.AnalysisException: Trying to create an External Table without any schema. Please specify the schema in the command or use a path of a populated table.

Executing the same statement with Delta works: it creates a Delta table whose log contains the following information:

spark.sql("CREATE TABLE t USING delta LOCATION '/tmp/test'")

{
  "commitInfo": {
    "timestamp": 1709215123871,
    "operation": "CREATE TABLE",
    "operationParameters": {
      "isManaged": "false",
      "description": null,
      "partitionBy": "[]",
      "properties": "{}"
    },
    "isolationLevel": "Serializable",
    "isBlindAppend": true,
    "operationMetrics": {},
    "engineInfo": "Apache-Spark/3.5.0 Delta-Lake/3.1.0",
    "txnId": "6cf7a54c-cfc4-4d1a-831e-b61729e123ed"
  }
}
{
  "metaData": {
    "id": "e9b1abc8-7ab7-4796-8761-7fb588e2eb6c",
    "format": {
      "provider": "parquet",
      "options": {}
    },
    "schemaString": "{\"type\":\"struct\",\"fields\":[]}",
    "partitionColumns": [],
    "configuration": {},
    "createdTime": 1709215123802
  }
}
{
  "protocol": {
    "minReaderVersion": 1,
    "minWriterVersion": 2
  }
}
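
The relevant detail in the log above is that `schemaString` encodes a struct with no fields, so Delta accepts the table with an empty schema. A minimal sketch that parses that value with Python's standard `json` module (the string is copied from the `metaData` entry; nothing here touches Spark or Delta):

```python
import json

# schemaString value copied from the metaData entry in the Delta log above.
schema_string = '{"type":"struct","fields":[]}'

schema = json.loads(schema_string)

# The schema is a struct type with an empty list of fields.
print(schema["type"])
print(schema["fields"])
```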

We should allow the Qbeast data source to create a new table at a location even if no schema is specified. Once the first write is made, the schema should be enforced, relying on Delta properties.
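
The requested behavior can be sketched as a small in-memory model. This is only an illustration of the idea, not the qbeast-spark implementation: `TableSketch`, `create_table`, and `SchemaMismatchError` are hypothetical names, and the schema is simplified to a mapping of column name to type name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Schema = Dict[str, str]  # simplified: column name -> type name


class SchemaMismatchError(Exception):
    """Raised when a write does not match the schema fixed by the first write."""


@dataclass
class TableSketch:
    """Hypothetical stand-in for a table registered at a location."""
    location: str
    schema: Schema = field(default_factory=dict)  # empty, like "fields":[] in Delta
    rows: List[dict] = field(default_factory=list)

    def write(self, rows: List[dict], schema: Schema) -> None:
        if not self.schema:
            # First write on a schemaless table: adopt the incoming schema.
            self.schema = dict(schema)
        elif schema != self.schema:
            # Later writes must match the schema that is already set.
            raise SchemaMismatchError(
                f"expected {self.schema}, got {schema}"
            )
        self.rows.extend(rows)


def create_table(location: str, schema: Optional[Schema] = None) -> TableSketch:
    """Create a table; a missing schema is allowed and stored as empty."""
    return TableSketch(location=location, schema=dict(schema or {}))


# Creation without a schema succeeds instead of raising.
table = create_table("/tmp/test")

# The first write fixes the schema.
table.write([{"id": 1}], {"id": "int"})

# A later write with a different schema is rejected.
try:
    table.write([{"name": "a"}], {"name": "string"})
except SchemaMismatchError as err:
    print(f"rejected: {err}")
```

In the real data source, the enforcement step would presumably be delegated to Delta rather than reimplemented, as the issue suggests.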

osopardo1, Feb 29 '24 15:02