[connectors-sdk] Add ObservedData to models (#5397)
Proposed changes
- Add ObservedData to sdk models
- Add tests for it
Related issues
- https://github.com/OpenCTI-Platform/connectors/issues/5397
Checklist
- [x] I consider the submitted work as finished
- [x] I have signed my commits using GPG key.
- [x] I tested the code for its functionality using different use cases
- [x] I added/update the relevant documentation (either on github or on notion)
- [x] Where necessary I refactored code to improve the overall quality
Further comments
Thank you for your PR and good job for identifying this specific stix2 error for Observed Data (MutuallyExclusivePropertiesError).
I suggest a few adjustments even though the code works perfectly:
-
Renaming the field
-
object→entities(field mandatory) This renaming aligns the model with the business definition, on the front end,objectsare calledentities.
-
-
Add missing fields, consistent with the front end
-
labels(not mandatory, default=None) -
associated_files(not mandatory, default=None)- Add import:
from connectors_sdk.models.associated_file import AssociatedFile
- Add import:
-
labels: list[str] | None = Field(
default=None,
description="Labels of the observed data",
)
associated_files: list[AssociatedFile] | None = Field(
default=None,
description="Files to upload with the observed data, e.g. observed data as a PDF.",
)
The following fields are not added because they are already inherited from BaseIdentifiedEntity:
-
author -
markings -
external_references
PS: Although on the front end the entities field of ObservedData corresponds to objects in GraphQL,
connectors generally use object_refs, providing only the standard_id.
The conversion between object_refs and objects seems to be handled automatically by OpenCTI.
Context: STIX error with ObservedData
Here is the specific STIX error that occurs when object_refs (i.e., entities) is an empty list:
_check_mutually_exclusive_properties
raise MutuallyExclusivePropertiesError(self.__class__, list_of_properties)
stix2.exceptions.MutuallyExclusivePropertiesError: The (object_refs, objects) properties for ObservedData are mutually exclusive.
When object_refs is an empty list, the STIX library interprets it as a missing field and implicitly initializes objects = {}. However, in STIX, the object_refs and objects properties are mutually exclusive, which triggers this exception.
In our case:
- the
entitiesfield is mandatory and must never be empty - since connectors
object_refsis used when creating the STIX Observed Data object, and is also used forgenerate_id
Current error handling:
You handle this case correctly with an explicit error return from Pydantic when entities is empty:
def model_post_init(self, context__: Any) -> None:
"""Validate objects before calling id."""
if not self.objects:
raise ValueError("objects must contain at least one element")
super().model_post_init(context__)
We obtain:
pydantic_core._pydantic_core.ValidationError: 1 validation error for ObservedData
Value error, entities must contain at least one element
[type=value_error, input_value={'entities': [], 'first_o...', 'number_observed': 2}, input_type=dict]
Two different suggestions:
- Add a native constraint on entities (
min_length=1):
entities: list[BaseIdentifiedEntity] = Field(
min_length=1,
description="List of OpenCTI identified entities observed.",
)
We obtain:
pydantic_core._pydantic_core.ValidationError: 1 validation error for ObservedData
entities
List should have at least 1 item after validation, not 0
[type=too_short, input_value=[], input_type=list]
Simple and native, like the gt=0 constraint approach for number_observed, but the error message may seem less explicit from a business perspective.
- Use a
field_validatorwith an explicit error message as you did.
@field_validator("entities")
@classmethod
def entities_must_not_be_empty(cls, value):
if not value:
raise ValueError("The 'entities' field must contain at least one element.")
return value
We obtain:
pydantic_core._pydantic_core.ValidationError: 1 validation error for ObservedData
entities
Value error, The 'entities' field must contain at least one element.
[type=value_error, input_value=[], input_type=list]
Clear, business-oriented message (avoids a lower-level STIX error)
Personally, like you, I prefer an explicit, business-oriented error message.
However, this raises a broader question:
- What is our strategy in the SDK regarding the use of native constraints (
gt,min_length, etc.) vs.field_validatorwith explicit error messages?
If we systematically choose clear business messages, then we should consider replacing certain native constraints with explicit validators for greater consistency.
PS: We will probably need to adapt the tests!