openff-bespokefit
openff-bespokefit copied to clipboard
Use base classes rather than unions in type hints
In BespokeWorkflowFactory, and possibly elsewhere, several field's types are given as unions of classes that share a base class. For example:
target_templates: List[Union[TorsionProfileTargetSchema, AbInitioTargetSchema, VibrationTargetSchema, OptGeoTargetSchema]]
fragmentation_engine: Optional[Union[WBOFragmenter, PfizerFragmenter]]
And several others. It might be better to replace these unions with the base class, both to make it clear to the user what they need to do to extend functionality themselves, and to make adding new options easier in the future.
We currently need to use the parent classes here in order for pydantic to be able to distinguish which type the field should actually be when validating, otherwise it will drop the extra fields on the child class or more likely raise an exception that extra fields are not allowed.
This is currently a difficulty with using pydantic that is a bit tricky to engineer around, and that currently makes it difficult for users to extend the fields to use their classes.
Spitballing: you could do some metaclass hackery for this. I haven't thought through if the added complexity is worth it.
class BaseSchemaMeta(ModelMetaclass):
registry = []
def __init__(self, name, *args, **kwargs):
if not name.startswith("Base"):
self.registry.append(self)
class BaseSchema(BaseModel, metaclass=BaseSchemaMeta):
...
target_templates: List[Union[*BaseSchemaMeta.registry]]
We've looked into this kind of approach in the past and IIRC this approach kind of worked, but had some limitations that I can't remember off the top of my head and did have hidden complexity. I think mostly related to making sure the users classes were properly found at the right time and in the right order.
Ah that's annoying. Is Pydantic likely to improve things here? Is there any work around for a user looking to extend BespokeFit? I'd like to mention it in the docs, would be a shame if users couldn't implement their own targets and so on.
I think this issue is one that will be run into across the OpenFF ecosystem as pydantic is adopted more (e.g. in @mattwthompson 's interchange), and a uniform solution should be developed that can be applied across the board.
This is likely something that @mattwthompson and @j-wags will need to work on and roll-out.