optimade-python-tools

anyOf, allOf, etc. in /info/{entry_type} endpoints

Open CasperWA opened this issue 4 years ago • 3 comments

It seems that only by sheer luck this has not broken in the current implementation. The use of retrieve_queryable_properties() breaks if the type of any attribute/property is a subclass of pydantic.BaseModel or an enumeration.

Since this function is only used for /info/structures and /info/references, and neither StructureResourceAttributes nor ReferenceResourceAttributes has attributes typed directly as BaseModel subclasses or enumerations (they're always wrapped in a list or similar), this never becomes an issue. It will, however, be an issue for custom resources in implementations that are based on this package and use its utility functions, etc.
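To make the difference concrete, here is a minimal sketch using hand-written schema fragments. The fragments imitate what pydantic is assumed to emit for a list-wrapped enum versus a directly typed enum; the property names and the naive_type() helper are hypothetical and are not part of optimade-python-tools.

```python
# Hand-written fragments imitating what pydantic is assumed to emit;
# the property and definition names are made up for illustration.
properties = {
    # Enum wrapped in a list: a top-level "type" key is present.
    "wrapped": {
        "description": "A list of enum values",
        "type": "array",
        "items": {"$ref": "#/definitions/SomeEnum"},
    },
    # Enum used directly: only "allOf" with a reference, no "type" key.
    "direct": {
        "description": "A single enum value",
        "allOf": [{"$ref": "#/definitions/SomeEnum"}],
    },
}


def naive_type(prop: dict) -> str:
    """Hypothetical stand-in for a lookup that assumes "type" always exists."""
    return prop["type"]


print(naive_type(properties["wrapped"]))

try:
    naive_type(properties["direct"])
except KeyError as exc:
    # The directly typed enum has no "type" key, so the lookup fails.
    print(f"missing key: {exc}")
```

Any lookup that expects a top-level "type" succeeds for the wrapped case and fails for the direct one, which would explain why the built-in resources never trigger the problem.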

For the BaseModel subclasses one can utilize the __modify_schema__ method (see the pydantic docs here and here). However, for enumerations, one needs to set the type value directly when specifying the attribute that has the enumeration type, e.g.:

from enum import Enum
from pydantic import BaseModel, Field

class SomeEnum(Enum):
    test = "test"

class SomeModel(BaseModel):
    my_enum: SomeEnum = Field(..., description="My enumeration", type="enum")

Either one does it like this or (perhaps more straightforwardly) we handle the allOf and anyOf keys of an OpenAPI schema property: go to the referenced model(s) and retrieve the type from there, which should be the same and present. The main issue here will be anyOf, which results from a Union type annotation, i.e., it could be a set of different types. This is, however, (essentially) not supported by OPTIMADE, so this should probably raise an exception if the models/types listed in the anyOf list do not share a single unique type.
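The allOf/anyOf handling described above could look roughly like the following. This is only a sketch on plain dictionaries: resolve_property_type() is a hypothetical helper, it assumes local references of the form "#/definitions/<name>", and it assumes every referenced definition carries a "type" key (which may not hold for every enumeration).

```python
def resolve_property_type(prop: dict, definitions: dict) -> str:
    """Return the single JSON type of a property, following allOf/anyOf and $ref."""
    if "type" in prop:
        return prop["type"]

    if "$ref" in prop:
        # Assumption: local references look like "#/definitions/<name>"
        name = prop["$ref"].rsplit("/", 1)[-1]
        return resolve_property_type(definitions[name], definitions)

    # Collect the type of every alternative listed under allOf and anyOf
    candidates = prop.get("allOf", []) + prop.get("anyOf", [])
    types = {resolve_property_type(sub, definitions) for sub in candidates}
    if len(types) != 1:
        # No type at all, or a Union of different types
        raise ValueError(f"Expected exactly one type, found: {sorted(types)}")
    return types.pop()


# Hand-written example data (not generated by pydantic)
definitions = {
    "SomeEnum": {"title": "SomeEnum", "enum": ["test"], "type": "string"},
}
enum_prop = {
    "description": "My enumeration",
    "allOf": [{"$ref": "#/definitions/SomeEnum"}],
}
union_prop = {
    "anyOf": [{"$ref": "#/definitions/SomeEnum"}, {"type": "integer"}],
}

print(resolve_property_type(enum_prop, definitions))

try:
    resolve_property_type(union_prop, definitions)
except ValueError as exc:
    print(exc)
```

Raising on mixed types keeps the behaviour strict, in line with the point that a Union of different types is essentially unsupported by OPTIMADE.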

I hope this explanation makes a bit of sense. It's an issue I encountered in the OPTIMADE gateway development, and it will require me to develop a specialized retrieve_queryable_properties() function to deal with it.

CasperWA avatar Apr 08 '21 14:04 CasperWA

Note: the solution I am using now is simply to specify type explicitly in Field() for each attribute with a BaseModel or Enum type.

CasperWA avatar Apr 08 '21 14:04 CasperWA

I've also just run into this when trying to use retrieve_queryable_properties inside the entry mapper code (which means it also gets used for the enum-heavy LinksResource model)...

ml-evs avatar May 27 '21 15:05 ml-evs

You can find a code snippet below for getting rid of all "$ref"s:

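import copy


def schema_without_refs(schema: dict) -> dict:
    """Return a copy of `schema` with every local "$ref" replaced by its target."""
    # Top-level keys that hold referenced definitions (e.g., "definitions")
    definitions = set()
    # Work on a copy so the caller's schema is left untouched
    schema = copy.deepcopy(schema)

    def get_reference(path):
        # Follow a path such as ["definitions", "SomeEnum"] from the schema root
        dct = schema
        for key in path:
            dct = dct[key]
        return dct

    # Recursively walk through the schema
    def inner_loop(item):
        if isinstance(item, list):
            for i, v in enumerate(item):
                item[i] = inner_loop(v)

        elif isinstance(item, dict):
            if "$ref" in item:
                # Drop the leading "#/" of a local reference like "#/definitions/SomeEnum"
                path = item["$ref"][2:].split("/")
                definitions.add(path[0])
                return get_reference(path)

            for k, v in item.items():
                item[k] = inner_loop(v)

        return item

    new_schema = inner_loop(schema)

    # Remove the original definitions, which are now inlined
    for k in definitions:
        new_schema.pop(k, None)

    return new_schema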

Usage:

from odbx.models.structure import MatadorHamiltonian
import json

m = MatadorHamiltonian.schema()
d = schema_without_refs(m)
print(json.dumps(d, indent=2))
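For trying the idea without the odbx package, a hand-written schema works as well. The sketch below is a compact variant, not the snippet above: inline_refs() is a hypothetical helper that builds a new structure instead of mutating one, raises on circular references, and assumes the referenced models live under a top-level "definitions" key.

```python
import json


def inline_refs(node, root, seen=()):
    """Return a copy of `node` with local "$ref" entries replaced by their targets."""
    if isinstance(node, list):
        return [inline_refs(v, root, seen) for v in node]
    if isinstance(node, dict):
        if "$ref" in node:
            ref = node["$ref"]
            if ref in seen:
                # A self-referencing model cannot be inlined
                raise ValueError(f"Circular reference: {ref}")
            target = root
            for key in ref[2:].split("/"):  # drop the leading "#/"
                target = target[key]
            return inline_refs(target, root, seen + (ref,))
        return {k: inline_refs(v, root, seen) for k, v in node.items()}
    return node


# Hand-written schema (assumed shape, not generated by pydantic)
schema = {
    "title": "SomeModel",
    "type": "object",
    "properties": {
        "my_enum": {"$ref": "#/definitions/SomeEnum"},
        "many": {"type": "array", "items": {"$ref": "#/definitions/SomeEnum"}},
    },
    "definitions": {
        "SomeEnum": {"title": "SomeEnum", "enum": ["test"], "type": "string"},
    },
}

# Inline everything except the "definitions" section itself
body = {k: v for k, v in schema.items() if k != "definitions"}
flat = inline_refs(body, schema)
print(json.dumps(flat, indent=2))
```

As in the snippet above, any keys that sit next to a "$ref" in the same object are dropped when the reference is replaced.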

fekad avatar Sep 25 '21 00:09 fekad