
[BUG] Can not override existing schemas even with override=true (POST /schemas API)

Open MeihanLi opened this issue 3 years ago • 5 comments

Hi, we just noticed a change introduced by #7374: a new check, schema.isBackwardCompatibleWith(oldSchema), was added to the POST /schemas API. With this check, we can no longer modify the maxLength of any existing column. This is a breaking change for us.

The previous behavior: if we set override to true, the existing schema was simply replaced with the new one, with no extra checks.

  /**
   * API 2.0
   */

  /**
   * Schema APIs
   */
  public void addSchema(Schema schema, boolean override) {
    ZNRecord record = SchemaUtils.toZNRecord(schema);
    String schemaName = schema.getSchemaName();
    Schema oldSchema = ZKMetadataProvider.getSchema(_propertyStore, schemaName);

    if (oldSchema != null && !override) {
      throw new RuntimeException(String.format("Schema %s exists. Not overriding it as requested.", schemaName));
    }

    if (schema.equals(oldSchema)) {
      LOGGER.info("New schema is the same with the existing schema. Not updating schema " + schemaName);
      return;
    }

    PinotHelixPropertyStoreZnRecordProvider propertyStoreHelper =
        PinotHelixPropertyStoreZnRecordProvider.forSchema(_propertyStore);
    propertyStoreHelper.set(schemaName, record);
  }

The current behavior is that even if we set override to true, the new schema must pass a backward-compatibility check against the old one in the isBackwardCompatibleWith method of the Schema class. If the check fails, the request is rejected with a SchemaBackwardIncompatibleException.

  /**
   * Check whether the current schema is backward compatible with oldSchema.
   * Backward compatibility requires that every column and its FieldSpec in oldSchema be retained unchanged.
   *
   * @param oldSchema old schema
   */
  public boolean isBackwardCompatibleWith(Schema oldSchema) {
    Set<String> columnNames = getColumnNames();
    for (Map.Entry<String, FieldSpec> entry : oldSchema.getFieldSpecMap().entrySet()) {
      String oldSchemaColumnName = entry.getKey();
      if (!columnNames.contains(oldSchemaColumnName)) {
        return false;
      }
      FieldSpec oldSchemaFieldSpec = entry.getValue();
      FieldSpec fieldSpec = getFieldSpecFor(oldSchemaColumnName);
      if (!fieldSpec.equals(oldSchemaFieldSpec)) {
        return false;
      }
    }
    return true;
  }
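
The relaxed check being asked for could look something like the following sketch. This is hypothetical illustration only, not Pinot code: field specs are modeled as plain property maps, and the set of "safely mutable" properties (maxLength, defaultNullValue) is an assumption about which changes should be tolerated.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch (not Pinot code): a relaxed compatibility check that
// still requires every old column to exist with the same immutable properties
// (e.g. dataType), but tolerates changes to properties assumed safe to mutate.
public class RelaxedSchemaCheck {
  // Assumption: these properties may safely differ between old and new specs.
  private static final Set<String> MUTABLE_PROPS = Set.of("maxLength", "defaultNullValue");

  static boolean isBackwardCompatible(Map<String, Map<String, String>> oldSchema,
                                      Map<String, Map<String, String>> newSchema) {
    for (Map.Entry<String, Map<String, String>> e : oldSchema.entrySet()) {
      Map<String, String> newSpec = newSchema.get(e.getKey());
      if (newSpec == null) {
        return false;  // dropping a column is never backward compatible
      }
      for (Map.Entry<String, String> prop : e.getValue().entrySet()) {
        if (MUTABLE_PROPS.contains(prop.getKey())) {
          continue;  // tolerated difference, e.g. a larger maxLength
        }
        if (!Objects.equals(prop.getValue(), newSpec.get(prop.getKey()))) {
          return false;  // an immutable property changed, e.g. dataType
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> oldSchema =
        Map.of("name", Map.of("dataType", "STRING", "maxLength", "512"));
    // Same column, only maxLength changed: accepted by the relaxed check.
    Map<String, Map<String, String>> grown =
        Map.of("name", Map.of("dataType", "STRING", "maxLength", "1024"));
    // Data type changed: still rejected.
    Map<String, Map<String, String>> retyped =
        Map.of("name", Map.of("dataType", "INT", "maxLength", "512"));
    System.out.println(isBackwardCompatible(oldSchema, grown));    // true
    System.out.println(isBackwardCompatible(oldSchema, retyped));  // false
  }
}
```

With this shape, the strict equality test on FieldSpec is replaced by a per-property comparison, so growing a column's maxLength no longer fails the POST /schemas request.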

MeihanLi avatar Aug 25 '22 21:08 MeihanLi

We saw this issue when upgrading from 0.7 to 0.9, and it breaks our current workflow. Can we define more clearly what counts as backward compatible? The current check is too strict and does not make sense to us. Also, there are other existing APIs that we could use. Either we could remove the check from the POST /schemas API and restore the previous behavior, or we could change the isBackwardCompatibleWith method to allow certain necessary updates. Please let me know your thoughts and concerns.

MeihanLi avatar Aug 25 '22 21:08 MeihanLi

cc: @yupeng9 @chenboat @Jackie-Jiang

MeihanLi avatar Aug 25 '22 21:08 MeihanLi

Thanks for reporting the issue. We should allow changing certain properties in the field spec, e.g. the default value or max length.

As a workaround, for now you may delete the schema and then add it back.
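
The delete-then-re-add workaround could be sketched against the controller's REST API roughly as follows. The controller address, schema name, and file name are placeholders; the `|| true` guards only exist so the sketch runs without a live cluster.

```shell
#!/bin/sh
# Hypothetical workaround sketch: assumes a Pinot controller at localhost:9000
# and the modified schema saved locally as myTable-schema.json.
CONTROLLER="http://localhost:9000"
SCHEMA_NAME="myTable"

# 1. Delete the existing schema (Pinot rejects this if a realtime table uses it).
curl -s -X DELETE "${CONTROLLER}/schemas/${SCHEMA_NAME}" || true

# 2. Re-add the modified schema from the local JSON file.
curl -s -F "schemaName=@myTable-schema.json" "${CONTROLLER}/schemas" || true
```

Because step 1 removes the schema entirely, step 2 is a fresh create and the backward-compatibility check never runs.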

Jackie-Jiang avatar Aug 25 '22 21:08 Jackie-Jiang

@Jackie-Jiang i think there are schemas shared by realtime tables and offline tables.

will it be a problem if we drop it and recreate it later?

deemoliu avatar Aug 25 '22 21:08 deemoliu

@deemoliu Good point. Pinot won't allow you to delete a schema while a realtime table is using it. We should loosen the checks, and then probably add a force flag to the API to force-update the schema (the user takes the risk of breaking the table).
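
The decision logic being proposed could be sketched as below. This is a hypothetical illustration of how a `force` flag might interact with `override` and the compatibility check, not the actual controller code.

```java
// Hypothetical sketch: controller-side decision for POST /schemas, where a
// "force" flag (an assumption, per the discussion above) skips the
// backward-compatibility check but "override" alone does not.
public class SchemaUpdatePolicy {
  static String decide(boolean schemaExists, boolean override, boolean force,
                       boolean backwardCompatible) {
    if (!schemaExists) {
      return "CREATE";                 // no existing schema: plain create
    }
    if (!override) {
      return "REJECT_EXISTS";          // caller did not ask to override
    }
    if (backwardCompatible || force) {
      return "UPDATE";                 // compatible, or caller forces the risk
    }
    return "REJECT_INCOMPATIBLE";      // override alone is not enough
  }

  public static void main(String[] args) {
    // A maxLength change flagged incompatible today: rejected without force...
    System.out.println(decide(true, true, false, false));
    // ...but accepted when the caller explicitly forces the update.
    System.out.println(decide(true, true, true, false));
  }
}
```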

Jackie-Jiang avatar Aug 25 '22 22:08 Jackie-Jiang

I've raised a fix here: #9382. cc: @Jackie-Jiang @deemoliu

MeihanLi avatar Sep 12 '22 16:09 MeihanLi