
Serialization / Deserialization error for empty arrays with dim>1

Open michaelbuehlmann opened this issue 7 months ago • 2 comments

I'm running into an issue when I serialize and deserialize numpy arrays with dimensions > 1 in cases where the array is empty.

Example Code:

import numpy as np
import pydantic_numpy.typing as pnt
from pydantic import BaseModel

class A(BaseModel):
    d: pnt.Np2DArray

a = A(d=np.empty((0,2), dtype=np.float32))

s = a.model_dump_json()
# s = '{"d":{"data_type":"float32","data":[]}}'


a2 = A.model_validate_json(s)
# ValidationError: 1 validation error for A
# d
#  Value error, Array 1-dimensional; the target dimensions is 2 [type=value_error, input_value={'data_type': 'float32', 'data': []}, input_type=dict]
#    For further information visit https://errors.pydantic.dev/2.11/v/value_error

# However, this works:
s2 = s.replace("[]", "[[]]")
a3 = A.model_validate_json(s2)

It seems the dimension information is lost when the array is serialized ([] instead of [[]]).

michaelbuehlmann avatar May 27 '25 22:05 michaelbuehlmann

Thanks for pointing this out; it is definitely a bug.

Fixing this requires either a medium-sized rewrite of the validation or a small hacky solution. The issue stems from our use of numpy's tolist, which returns [] for an array of shape (0, 2), so the trailing dimension is lost.
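To illustrate the tolist behaviour, here is a minimal sketch using only numpy and the standard library. The helpers dump_array and load_array and the extra "shape" field are hypothetical and not part of pydantic-numpy's actual format; they only show one way the shape could survive a round trip if it were stored next to the data.

```python
import json

import numpy as np


def dump_array(arr: np.ndarray) -> str:
    """Hypothetical helper: serialize dtype, shape, and nested-list data."""
    return json.dumps(
        {
            "data_type": str(arr.dtype),
            "shape": list(arr.shape),  # assumption: extra field, not in pydantic-numpy
            "data": arr.tolist(),
        }
    )


def load_array(payload: str) -> np.ndarray:
    """Hypothetical helper: rebuild the array and restore its recorded shape."""
    obj = json.loads(payload)
    flat = np.array(obj["data"], dtype=obj["data_type"])
    return flat.reshape(obj["shape"])


empty = np.empty((0, 2), dtype=np.float32)

# tolist() collapses everything after the zero-length leading axis.
as_list = empty.tolist()
rebuilt_naively = np.array(as_list, dtype=np.float32)

# With the shape stored alongside the data, the round trip keeps both axes.
restored = load_array(dump_array(empty))

print(as_list, rebuilt_naively.shape, restored.shape)
```

Note that the loss only happens when a zero-length axis precedes other axes: an array of shape (2, 0) serializes to a list of two empty lists and keeps both dimensions.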

Do you need a solution urgently, or do you use the workaround?

caniko avatar May 29 '25 11:05 caniko

Thanks @caniko! For now I'm just using NpNDArray in cases where the data might be empty, but being able to type the dimensions will be useful in the future (not urgent).

michaelbuehlmann avatar May 29 '25 12:05 michaelbuehlmann