connectors icon indicating copy to clipboard operation
connectors copied to clipboard

[connectors-sdk] Add ObservedData to models (#5397)

Open throuxel opened this issue 1 month ago • 1 comments

Proposed changes

  • Add ObservedData to sdk models
  • Add tests for it

Related issues

  • https://github.com/OpenCTI-Platform/connectors/issues/5397

Checklist

  • [x] I consider the submitted work as finished
  • [x] I have signed my commits using GPG key.
  • [x] I tested the code for its functionality using different use cases
  • [x] I added/update the relevant documentation (either on github or on notion)
  • [x] Where necessary I refactored code to improve the overall quality

Further comments

throuxel avatar Dec 12 '25 11:12 throuxel

image

mariot avatar Dec 12 '25 16:12 mariot

Thank you for your PR and good job for identifying this specific stix2 error for Observed Data (MutuallyExclusivePropertiesError).

I suggest a few adjustments even though the code works perfectly:

  • Renaming the field

    • objectentities (field mandatory) This renaming aligns the model with the business definition, on the front end, objects are called entities.
  • Add missing fields, consistent with the front end

    • labels (not mandatory, default=None)
    • associated_files (not mandatory, default=None)
      • Add import: from connectors_sdk.models.associated_file import AssociatedFile
labels: list[str] | None = Field(  
   default=None,  
   description="Labels of the observed data",  
)  
associated_files: list[AssociatedFile] | None = Field(  
   default=None,  
   description="Files to upload with the observed data, e.g. observed data as a PDF.",  
)

The following fields are not added because they are already inherited from BaseIdentifiedEntity:

  • author
  • markings
  • external_references

PS: Although on the front end the entities field of ObservedData corresponds to objects in GraphQL,
connectors generally use object_refs, providing only the standard_id.
The conversion between object_refs and objects seems to be handled automatically by OpenCTI.


Context: STIX error with ObservedData

Here is the specific STIX error that occurs when object_refs (i.e., entities) is an empty list:

_check_mutually_exclusive_properties
    raise MutuallyExclusivePropertiesError(self.__class__, list_of_properties)
stix2.exceptions.MutuallyExclusivePropertiesError: The (object_refs, objects) properties for ObservedData are mutually exclusive.

When object_refs is an empty list, the STIX library interprets it as a missing field and implicitly initializes objects = {}. However, in STIX, the object_refs and objects properties are mutually exclusive, which triggers this exception.

In our case:

  • the entities field is mandatory and must never be empty
  • since connectors object_refs is used when creating the STIX Observed Data object, and is also used for generate_id

Current error handling: You handle this case correctly with an explicit error return from Pydantic when entities is empty:

def model_post_init(self, context__: Any) -> None:
	"""Validate objects before calling id."""
	if not self.objects:
		raise ValueError("objects must contain at least one element")

	super().model_post_init(context__)

We obtain:

pydantic_core._pydantic_core.ValidationError: 1 validation error for ObservedData
  Value error, entities must contain at least one element
  [type=value_error, input_value={'entities': [], 'first_o...', 'number_observed': 2}, input_type=dict]

Two different suggestions:

  • Add a native constraint on entities (min_length=1):
entities: list[BaseIdentifiedEntity] = Field(  
    min_length=1,  
    description="List of OpenCTI identified entities observed.",  
)

We obtain:

pydantic_core._pydantic_core.ValidationError: 1 validation error for ObservedData
entities
  List should have at least 1 item after validation, not 0
  [type=too_short, input_value=[], input_type=list]

Simple and native, like the gt=0 constraint approach for number_observed, but the error message may seem less explicit from a business perspective.

  • Use a field_validator with an explicit error message as you did.
@field_validator("entities")  
@classmethod  
def entities_must_not_be_empty(cls, value):  
    if not value:  
        raise ValueError("The 'entities' field must contain at least one element.")  
    return value

We obtain:

pydantic_core._pydantic_core.ValidationError: 1 validation error for ObservedData
entities
  Value error, The 'entities' field must contain at least one element.
  [type=value_error, input_value=[], input_type=list]

Clear, business-oriented message (avoids a lower-level STIX error)

Personally, like you, I prefer an explicit, business-oriented error message.
However, this raises a broader question:

  • What is our strategy in the SDK regarding the use of native constraints (gt, min_length, etc.) vs. field_validator with explicit error messages?

If we systematically choose clear business messages, then we should consider replacing certain native constraints with explicit validators for greater consistency.

PS: We will probably need to adapt the tests!

Megafredo avatar Dec 15 '25 18:12 Megafredo