sdk icon indicating copy to clipboard operation
sdk copied to clipboard

bug: Schema warning with `AnyType`

Open ReubenFrankel opened this issue 1 year ago • 4 comments

Singer SDK Version

0.34.1

Is this a regression?

  • [ ] Yes

Python Version

3.7 (EOL)

Bug scope

Taps (catalog, state, etc.)

Operating System

Ubuntu 22.04

Description

Unsure if I am using this correctly, but my understanding is that AnyType should represent a value of any JSON type (i.e. string, integer, object, array or null). Maybe this has been fixed in a later version, but 0.34.1 is the latest compatible version available since we haven't yet dropped Python 3.7 support for the tap.

Code

th.Property("value", th.AnyType)
Could not append type because the JSON schema for the dictionary `{}` appears to be invalid

ReubenFrankel avatar Jul 03 '24 13:07 ReubenFrankel

This seems to be the intended behaviour when using AnyType: https://github.com/meltano/sdk/blob/2874a15abefff83a994dc0e3b2805f660e0e0bf6/tests/core/test_jsonschema_helpers.py#L183-L186

Not sure why this is the case. The main issue here is the amount of times this warning message seems to get thrown out in the logs. For example:

$ meltano invoke tap-capsulecrm 2>&1 | grep -c 'Could not append type'

237

ReubenFrankel avatar Jul 03 '24 23:07 ReubenFrankel

This seems to be the intended behaviour when using AnyType:

https://github.com/meltano/sdk/blob/2874a15abefff83a994dc0e3b2805f660e0e0bf6/tests/core/test_jsonschema_helpers.py#L183-L186

Not sure why this is the case. The main issue here is the amount of times this warning message seems to get thrown out in the logs. For example:

$ meltano invoke tap-capsulecrm 2>&1 | grep -c 'Could not append type'

237

Looking at the code, my first guess is that it's because the schema property is not cached, so the PropertiesList gets serialized a bunch of times: https://github.com/Matatika/tap-capsulecrm/blob/f7c360329868a70c445d73943887174f07231a3b/tap_capsulecrm/client.py#L73-L75

edgarrmondragon avatar Jul 04 '24 03:07 edgarrmondragon

Thanks, that reduces the amount of warnings significantly (once for each stream with a schema referencing AnyType):

$ meltano invoke tap-capsulecrm 2>&1 | grep -c 'Could not append type'

5

I guess this is then more a matter of should the SDK throw that warning message for explicit usage of AnyType?

ReubenFrankel avatar Jul 04 '24 08:07 ReubenFrankel

Thanks, that reduces the amount of warnings significantly (once for each stream with a schema referencing AnyType):

$ meltano invoke tap-capsulecrm 2>&1 | grep -c 'Could not append type'

5

I guess this is then more a matter of should the SDK throw that warning message for explicit usage of AnyType?

Problem is AnyType is just a wrapper for a "<prop>": {} schema element, and that's the same warning you'd get. I increasingly dislike the append_type implementation. I'd rather it be part of the JSONTypeHelper base class. So this:

Property("name", StringType(nullable=True))  # {"properties": {"name": {"type": ["string", "null"]}}, "required": []}

instead of

Property("name", StringType, required=False)  # {"properties": {"name": {"type": ["string", "null"]}}, "required": []}

That'd allow us to refactor these lines

https://github.com/meltano/sdk/blob/dfdbdde32bd92fe11b1c3c44c25398db8362bed6/singer_sdk/typing.py#L688-L689

and get rid of most of the instances of that warning.

edgarrmondragon avatar Jul 05 '24 22:07 edgarrmondragon