ion-schema-kotlin icon indicating copy to clipboard operation
ion-schema-kotlin copied to clipboard

Schema.newType doesn't add Type to Schema

Open r0b0ji opened this issue 1 year ago • 9 comments

Intuitively, I'd expect Schema.newType() to not just create the type but also register the type to Schema, but it doesn't. For ex: in this test, after I create a Type [email protected], I would expect to get the type when I call Schema.getType("[email protected]") but I get null.

@Test
    public void testType() {
        Schema schema = IonSchemaSystemBuilder.standard().build().newSchema();
        Type type = schema.newType("type::{\n"
            + "  name:'[email protected]',\n"
            + "  type:struct,\n"
            + "  fields:{\n"
            + "    a:{\n"
            + "      type: string,\n"
            + "    }\n"
            + "  }\n"
            + "}");
        Type gotType = schema.getType("[email protected]");
        assertEquals(type, gotType);
    }

If not this then what is the way to add Type to Schema. The only way I see it to initialize the schema with all the type as construction time itself through IonSchemaSystem.loadSchema() or IonSchemaSystem.newSchema(). Why can't I add type to Schema after creation.

r0b0ji avatar Feb 18 '24 04:02 r0b0ji

Using the plusType() method, you can create a new copy of the schema that has the new type added to it.

popematt avatar Feb 19 '24 20:02 popematt

I was looking for usecase where I add type one by one. It looks odd that we do this schema = schema.plusType(type) and create lots of intermediate schemas just to throw away. Either 1) newType adding type to schema, or 2) a separate api addType looks more intuitive.

r0b0ji avatar Feb 19 '24 21:02 r0b0ji

I don't think it's a good idea to change the behavior of the existing newType() API, but I definitely think it could be useful to have a plusTypes() API that acts like plusType(), but accepts multiple types. Then you can add types, one-by-one, to a list, and then add them to a schema all at once.

Why don't we have an API that modifies a schema in place? If a schema is immutable, then it is safe to share between threads. If schemas are mutable, and a new type has the same name as an existing type (thus replacing it), then we might have to replace all references to that type in every other type that is loaded into the schema system.

Are you looking at adding types one by one mixed with validation, or would it work to have a SchemaBuilder (that cannot be used for validation) where you can add types one by one and then build a Schema?

popematt avatar Feb 19 '24 21:02 popematt

That makes sense, immutability will make it easier to share between threads.

I tried using plusType(), but looks like it is re-ordering types or probably adding the type at head and then it fails. For ex: with this test I am getting a failure like

com.amazon.ionschema.InvalidSchemaException: Ion Schema version marker must come before any header, types, and footer.

	at com.amazon.ionschema.internal.SchemaImpl_2_0.<init>(SchemaImpl_2_0.kt:97)
	at com.amazon.ionschema.internal.IonSchemaSystemImpl.createSchema(IonSchemaSystemImpl.kt:100)
	at com.amazon.ionschema.internal.IonSchemaSystemImpl.access$createSchema(IonSchemaSystemImpl.kt:32)
	at com.amazon.ionschema.internal.IonSchemaSystemImpl$newSchema$1.invoke(IonSchemaSystemImpl.kt:163)
	at com.amazon.ionschema.internal.IonSchemaSystemImpl$newSchema$1.invoke(IonSchemaSystemImpl.kt:32)
	at com.amazon.ionschema.internal.IonSchemaSystemImpl.usingReferenceManager(IonSchemaSystemImpl.kt:115)
	at com.amazon.ionschema.internal.IonSchemaSystemImpl.newSchema(IonSchemaSystemImpl.kt:163)
	at com.amazon.ionschema.internal.SchemaImpl_2_0.plusType(SchemaImpl_2_0.kt:346)
	at com.amazon.itemknowledge.registry.ISLSchemaTest.test_hierarchy(ISLSchemaTest.java:37)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:135)
	at org.testng.internal.invokers.TestInvoker.invokeMethod(TestInvoker.java:673)
	at org.testng.internal.invokers.TestInvoker.invokeTestMethod(TestInvoker.java:220)
	at org.testng.internal.invokers.MethodRunner.runInSequence(MethodRunner.java:50)
	at org.testng.internal.invokers.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:945)
	at org.testng.internal.invokers.TestInvoker.invokeTestMethods(TestInvoker.java:193)
	at org.testng.internal.invokers.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
	at org.testng.internal.invokers.TestMethodWorker.run(TestMethodWorker.java:128)
public class ISLSchemaTest {
    private IonSystem ionSystem;
    private IonSchemaSystem schemaSystem;

    @BeforeMethod
    public void setup() {
        ionSystem = IonSystemBuilder.standard().build();
        schemaSystem = IonSchemaSystemBuilder.standard().build();
    }

    @Test(enabled = true)
    public void test_hierarchy() {
        Schema schema = schemaSystem.newSchema("$ion_schema_2_0");
        Iterator<IonValue> it = getISL();
        while (it.hasNext()) {
            Type type = schema.newType((IonStruct) it.next());
            schema = schema.plusType(type);
        }
        Type type = schema.getType("[email protected]");
        assertNotNull(type);

        IonValue data = ionSystem.singleValue("{a:\"a\", b:1, c:\"string\"}");
        Violations result = type.validate(data);
        assertTrue(result.isValid());
    }

    private Iterator<IonValue> getISL() {
        String schema =
            ""
                + "type::{\n"
                + "  name:'[email protected]',\n"
                + "  type:struct,\n"
                + "  fields:{\n"
                + "    a:{\n"
                + "      type: string, occurs: 1\n"
                + "    }\n"
                + "  }\n"
                + "}\n"
                + "type::{\n"
                + "  name:'[email protected]',\n"
                + "  type:'[email protected]',\n"
                + "  fields:closed::{\n"
                + "    b:{\n"
                + "      type:int\n"
                + "    },"
                + "    c:ignore::{\n"
                + "      type:string, "
                + "    }\n"
                + "  }\n"
                + "}";
        Reader schemaReader = new StringReader(schema);
        return ionSystem.iterate(schemaReader);
    }
}

r0b0ji avatar Feb 19 '24 23:02 r0b0ji

Here, I want to create Schema v2 and that's why I start with schemaSystem.newSchema("$ion_schema_2_0"); to force it to create schema v2, but it fails saying now version marker is not the first. If I don't add the version marker then it creates schema v1.

r0b0ji avatar Feb 20 '24 00:02 r0b0ji

Thanks for this example. You've helped identify an edge case here where a schema with only one top-level Ion value is not being handled correctly. We'll have a fix for this bug soon.

popematt avatar Feb 20 '24 00:02 popematt

I'd also like to understand—do you need to be adding the types one by one or could you do something like this?

String islString = "type::{ name: foo, type: int }"; // Insert your own ISL.
ListIterator<IonValue> isl = new ArrayList(ionSystem.loader.load(islString)).listIterator();
isl.add(ionSystem.newSymbol("$ion_schema_2_0"));
Schema schema = schemaSystem.newSchema(isl);

I know it's not very ergonomic to have to insert the version marker, but we can try to improve that. Even with the extra List, this is still going to be more efficient than using plusType() in a tight loop.

popematt avatar Feb 20 '24 01:02 popematt

We have a usecase to add Type one by one. I started with schemaSystem.newSchema("$ion_schema_2_0"); just to force creation of ISL-v2 and then add type one by one. If there is any alternate way to force initialization of ISL-v2, something like schemaSystem.newSchema2() or schemaSystem.newSchema() with some parameter, I can use that.

r0b0ji avatar Feb 21 '24 23:02 r0b0ji

Though, ideally I'd have preferred to just use schema.addType() and updating the same schema instead of creating new schema on every add but I can understand about the benefit of immutability.

r0b0ji avatar Feb 21 '24 23:02 r0b0ji