[BUG] Data validation when update_data components are not present in input_data
Describe the bug
When validating with assert_valid_batch_data, a KeyError is raised if update_data contains a component that is not present in input_data.
To Reproduce
from power_grid_model import initialize_array
from power_grid_model.validation import assert_valid_batch_data
input_data = {"node": initialize_array("input", "node", 1)}
update_data = {"sym_load": initialize_array("update", "sym_load", (1,1))}
assert_valid_batch_data(input_data, update_data)
Expected behavior
A clear error message with a ValidationError should be given instead of a bare KeyError.
Screenshots
Error:
Cell In[3], line 3
1 input_data = {"node": initialize_array("input", "node", 1)}
2 update_data = {"sym_load": initialize_array("update", "sym_load", (1,1))}
----> 3 assert_valid_batch_data(input_data, update_data)
File z:\zzz\zzz\.venv\Lib\site-packages\power_grid_model\validation\assertions.py:90, in assert_valid_batch_data(input_data, update_data, calculation_type, symmetric)
60 def assert_valid_batch_data(
61 input_data: SingleDataset,
62 update_data: BatchDataset,
63 calculation_type: Optional[CalculationType] = None,
64 symmetric: bool = True,
65 ):
66 """
67 The input dataset is validated:
68
(...)
88 ValidationException: if the contents are invalid.
89 """
---> 90 validation_errors = validate_batch_data(
91 input_data=input_data, update_data=update_data, calculation_type=calculation_type, symmetric=symmetric
92 )
93 if validation_errors:
94 raise ValidationException(validation_errors, "update_data")
...
--> 690 invalid = np.isin(data[component]["id"], ref_data[component]["id"], invert=True)
691 if invalid.any():
692 ids = data[component]["id"][invalid].flatten().tolist()
KeyError: 'sym_load'
I agree that throwing a ValidationError here would be necessary
@nitbharambe maybe we need to think about this. Is raising an error always the right logic?
Users may treat a non-existing component as a zero-length array for some reason. Maybe the logic should be: if a component exists in the batch dataset but not in the input, we only raise an error if the width of this batch component array is not zero.
Yes, good point! It's better to cover that situation too.
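A minimal sketch of the check proposed above, to make the zero-width case concrete. This is not the library's implementation: find_missing_components is a hypothetical helper, it uses plain NumPy structured arrays in place of initialize_array, and it assumes dense batch arrays of shape (n_batches, n_elements) only.

```python
import numpy as np


def find_missing_components(input_data, update_data):
    """Hypothetical helper: list components updated but absent from input_data.

    A component whose batch array has zero width, i.e. shape (n_batches, 0),
    is treated as "nothing to update" and is not reported.
    """
    missing = []
    for component, array in update_data.items():
        if component in input_data:
            continue
        # Width = number of elements per scenario = size of the last axis.
        width = array.shape[-1] if array.ndim > 0 else 1
        if width > 0:
            missing.append(component)
    return missing


# Stand-in dtype with only an "id" field (assumption; real arrays have more).
id_dtype = np.dtype([("id", "i4")])
input_data = {"node": np.zeros(1, dtype=id_dtype)}
update_nonempty = {"sym_load": np.zeros((1, 1), dtype=id_dtype)}
update_empty = {"sym_load": np.zeros((1, 0), dtype=id_dtype)}

print(find_missing_components(input_data, update_nonempty))
print(find_missing_components(input_data, update_empty))
```

With a check like this run before the id lookup, the non-empty case could be reported as a validation error, while the zero-width case would pass without touching ref_data[component].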