iceberg-go

feat(types): Support Unknown Type for v3 table spec

Open dttung2905 opened this issue 2 months ago • 4 comments

  • [x] Write the logic for nested type checking
  • [ ] Add more tests for the nested field validator
  • [ ] Move the unknown type validation from init() to somewhere like checkSchemaCompatibility

dttung2905, Oct 19 '25 17:10

@zeroshade I'm not sure my understanding of the spec is correct: do we support UnknownType as a nested type, or only as a top-level column type? :thinking: Could you advise me on this?

dttung2905, Oct 26 '25 17:10

@dttung2905, I'll dig a bit into the Java code to see how they handle that. I haven't done that yet, but to me it makes perfect sense to allow Unknown types within nested types. A nested type may mix a bunch of known types with a single unsupported one, and allowing unknown there makes it possible to read that data.

twuebi, Oct 27 '25 10:10

@dttung2905, it's supported in nested types:

org.apache.iceberg.TestSchema#testUnknownSupport

  @Test
  public void testUnknownSupport() {
    // this needs a different schema because it cannot be used in required fields
    Schema schemaWithUnknown =
        new Schema(
            Types.NestedField.required(1, "id", Types.LongType.get()),
            Types.NestedField.optional(2, "top", Types.UnknownType.get()),
            Types.NestedField.optional(
                3, "arr", Types.ListType.ofOptional(4, Types.UnknownType.get())),
            Types.NestedField.required(
                5,
                "struct",
                Types.StructType.of(
                    Types.NestedField.optional(6, "inner_op", Types.UnknownType.get()),
                    Types.NestedField.optional(
                        7,
                        "inner_map",
                        Types.MapType.ofOptional(
                            8, 9, Types.StringType.get(), Types.UnknownType.get())),
                    Types.NestedField.optional(
                        10,
                        "struct_arr",
                        Types.StructType.of(
                            Types.NestedField.optional(11, "deep", Types.UnknownType.get()))))));

    assertThatThrownBy(() -> Schema.checkCompatibility(schemaWithUnknown, 2))
        .isInstanceOf(IllegalStateException.class)
        .hasMessage(
            "Invalid schema for v%s:\n"
                + "- Invalid type for top: %s is not supported until v%s\n"
                + "- Invalid type for arr.element: %s is not supported until v%s\n"
                + "- Invalid type for struct.inner_op: %s is not supported until v%s\n"
                + "- Invalid type for struct.inner_map.value: %s is not supported until v%s\n"
                + "- Invalid type for struct.struct_arr.deep: %s is not supported until v%s",
            2,
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN),
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN),
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN),
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN),
            Types.UnknownType.get(),
            MIN_FORMAT_VERSIONS.get(Type.TypeID.UNKNOWN));

    assertThatCode(() -> Schema.checkCompatibility(schemaWithUnknown, 3))
        .doesNotThrowAnyException();
  }

twuebi, Oct 27 '25 10:10

Thanks @twuebi for digging in and confirming that.

zeroshade, Oct 27 '25 16:10
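
For the Go side, the nested check exercised by the Java test quoted above could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Type`, `Field`, `findUnsupported`, and `checkCompatibility` names are hypothetical stand-ins and not the iceberg-go API, and printing the type as `unknown` in the message is assumed. The idea is a depth-first walk that builds dotted paths (`arr.element`, `struct.inner_map.value`) and reports every unknown-typed position when the format version is below 3.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for schema types; these are NOT the iceberg-go types.
type Type interface{ isType() }

type PrimitiveType struct{ Name string }
type UnknownType struct{}
type ListType struct{ Element Type }
type MapType struct{ Key, Value Type }
type StructType struct{ Fields []Field }

type Field struct {
	Name string
	Type Type
}

func (PrimitiveType) isType() {}
func (UnknownType) isType()   {}
func (ListType) isType()      {}
func (MapType) isType()       {}
func (StructType) isType()    {}

// unknownMinVersion is the first table format version that allows unknown.
const unknownMinVersion = 3

// findUnsupported walks the schema depth-first and returns one message per
// unknown-typed position when formatVersion is below unknownMinVersion.
func findUnsupported(root StructType, formatVersion int) []string {
	var problems []string
	var visit func(t Type, path string)
	visit = func(t Type, path string) {
		switch tt := t.(type) {
		case UnknownType:
			if formatVersion < unknownMinVersion {
				problems = append(problems, fmt.Sprintf(
					"Invalid type for %s: unknown is not supported until v%d",
					path, unknownMinVersion))
			}
		case ListType:
			visit(tt.Element, path+".element")
		case MapType:
			visit(tt.Key, path+".key")
			visit(tt.Value, path+".value")
		case StructType:
			for _, f := range tt.Fields {
				child := f.Name
				if path != "" {
					child = path + "." + f.Name
				}
				visit(f.Type, child)
			}
		}
	}
	visit(root, "")
	return problems
}

// checkCompatibility (hypothetical name) folds all problems into one error.
func checkCompatibility(root StructType, formatVersion int) error {
	problems := findUnsupported(root, formatVersion)
	if len(problems) == 0 {
		return nil
	}
	return fmt.Errorf("invalid schema for v%d:\n- %s",
		formatVersion, strings.Join(problems, "\n- "))
}

// exampleSchema mirrors the shape of the schema in the Java test.
func exampleSchema() StructType {
	str := PrimitiveType{Name: "string"}
	return StructType{Fields: []Field{
		{Name: "id", Type: PrimitiveType{Name: "long"}},
		{Name: "top", Type: UnknownType{}},
		{Name: "arr", Type: ListType{Element: UnknownType{}}},
		{Name: "struct", Type: StructType{Fields: []Field{
			{Name: "inner_op", Type: UnknownType{}},
			{Name: "inner_map", Type: MapType{Key: str, Value: UnknownType{}}},
			{Name: "struct_arr", Type: StructType{Fields: []Field{
				{Name: "deep", Type: UnknownType{}},
			}}},
		}}},
	}}
}

func main() {
	fmt.Println(checkCompatibility(exampleSchema(), 2)) // lists every nested unknown position
	fmt.Println(checkCompatibility(exampleSchema(), 3)) // no error for v3
}
```

Collecting all problems before returning, rather than failing on the first one, matches the multi-line message the Java test asserts on. The sketch leaves out the rule noted in the Java test comment that unknown cannot be used in required fields.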