[BUG] Data validation when update_data components are not present in input_data

Open nitbharambe opened this issue 1 year ago • 3 comments

Describe the bug

Calling assert_valid_batch_data with update_data that contains a component not present in input_data raises a KeyError.

To Reproduce

from power_grid_model import initialize_array
from power_grid_model.validation import assert_valid_batch_data

input_data = {"node": initialize_array("input", "node", 1)}
update_data = {"sym_load": initialize_array("update", "sym_load", (1,1))}
assert_valid_batch_data(input_data, update_data)

Expected behavior

A ValidationError with a clear error message should be raised instead.

Screenshots

Error:

Cell In[3], line 3
      1 input_data = {"node": initialize_array("input", "node", 1)}
      2 update_data = {"sym_load": initialize_array("update", "sym_load", (1,1))}
----> 3 assert_valid_batch_data(input_data, update_data)

File z:\zzz\zzz\.venv\Lib\site-packages\power_grid_model\validation\assertions.py:90, in assert_valid_batch_data(input_data, update_data, calculation_type, symmetric)
     60 def assert_valid_batch_data(
     61     input_data: SingleDataset,
     62     update_data: BatchDataset,
     63     calculation_type: Optional[CalculationType] = None,
     64     symmetric: bool = True,
     65 ):
     66     """
     67     The input dataset is validated:
     68 
   (...)
     88         ValidationException: if the contents are invalid.
     89     """
---> 90     validation_errors = validate_batch_data(
     91         input_data=input_data, update_data=update_data, calculation_type=calculation_type, symmetric=symmetric
     92     )
     93     if validation_errors:
     94         raise ValidationException(validation_errors, "update_data")
...
--> 690     invalid = np.isin(data[component]["id"], ref_data[component]["id"], invert=True)
    691     if invalid.any():
    692         ids = data[component]["id"][invalid].flatten().tolist()
    
KeyError: 'sym_load'

nitbharambe avatar Sep 09 '24 07:09 nitbharambe

I agree that throwing a ValidationError here would be necessary

petersalemink95 avatar Sep 09 '24 08:09 petersalemink95

@nitbharambe maybe we need to think about this. Is raising an error always the right logic?

Users may, for some reason, treat a non-existent component as a zero-length array. Maybe the logic should be: if a component exists in the batch dataset but not in the input, we only raise an error if the width of this batch component array is not zero.

TonyXiang8787 avatar Sep 20 '24 09:09 TonyXiang8787

Yes, good point! It's better to cover that situation too.

nitbharambe avatar Sep 20 '24 09:09 nitbharambe
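
The rule discussed in the thread (only report a component that is missing from the input when its batch array has non-zero width) could be sketched roughly as below. This is a minimal illustration, not the library's implementation: components_missing_from_input is a hypothetical helper, and it assumes dense batch arrays whose shape is (n_scenarios, n_elements_per_scenario). It works on shapes only, so it has no dependency on power-grid-model or NumPy.

```python
from typing import Iterable, List, Mapping, Tuple


def components_missing_from_input(
    input_components: Iterable[str],
    update_shapes: Mapping[str, Tuple[int, ...]],
) -> List[str]:
    """Return update components absent from the input that have non-zero width.

    Hypothetical helper, not part of power-grid-model. Each shape is assumed
    to belong to a dense batch array: (n_scenarios, n_elements_per_scenario).
    """
    known = set(input_components)
    missing = []
    for component, shape in update_shapes.items():
        if component in known:
            continue
        # Width = elements per scenario. A zero-width array updates nothing,
        # so it is treated the same as a component that is not there at all.
        width = shape[-1]
        if width != 0:
            missing.append(component)
    return missing


# Shapes mirroring the reproduction in this issue, plus a zero-width "line" batch.
update_shapes = {"sym_load": (1, 1), "line": (3, 0)}
missing = components_missing_from_input(["node"], update_shapes)
print(missing)  # ['sym_load']
```

With real datasets, the shapes could be collected with something like {name: arr.shape for name, arr in update_data.items()}, and a non-empty result would then be turned into a validation error rather than letting the lookup fail with a KeyError. Sparse batch data is not covered by this sketch.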