special edge case: len(output) == len(batch)
Situation:
Imagine you are building a classifier that returns a response like {"class_A": 0.2, "class_B": 0.3, "class_C": 0.5} (the "decoding of probabilities" is handled in predict rather than encode_response). When multiple requests are received and batched, the user mistakenly returns dict[str, list[float]] instead of list[dict[str, float]]. This is a reasonable mistake to make, since there's no explicit enforcement of, or guidance on, how to treat this situation; a user could presume that predict should return a dict instead of a list when switching to batching. A minimal sketch of the mistaken API is shown after this paragraph.
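To make the scenario concrete, here is a hedged sketch of what the mistaken API could look like. Only the LitAPI hooks (setup, decode_request, predict, encode_response) and the LitServer batching arguments (max_batch_size, batch_timeout) come from LitServe; the class name, the stand-in model, and the class labels are hypothetical.

```python
import litserve as ls


class MistakenBatchedClassifier(ls.LitAPI):
    def setup(self, device):
        # Placeholder "model": returns fixed three-class probabilities per input
        self.model = lambda batch: [[0.2, 0.3, 0.5] for _ in batch]

    def decode_request(self, request):
        return request["input"]

    def predict(self, inputs):
        # inputs is the batched list of decoded requests
        probs = self.model(inputs)  # shape: (num_requests, 3)
        # WRONG shape for batching: a single dict keyed by class, where each
        # value is a column of per-request scores -> dict[str, list[float]]
        return {
            "class_A": [p[0] for p in probs],
            "class_B": [p[1] for p in probs],
            "class_C": [p[2] for p in probs],
        }

    def encode_response(self, output):
        # With the wrong shape above, "output" ends up being a dict key
        # (a class name) rather than one request's probabilities
        return {"prediction": output}


if __name__ == "__main__":
    server = ls.LitServer(MistakenBatchedClassifier(), max_batch_size=8, batch_timeout=0.05)
    server.run(port=8000)
```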
There's a very specific edge case that can occur here if batching is implemented with these contracts.
Suppose batch_size is greater than the number of classes, and the number of requests received within batch_timeout happens to equal the number of classes predicted.
In this case len(outputs) == len(inputs), so the following check will pass
https://github.com/Lightning-AI/LitServe/blob/9e9005ffbe5f9b747024944a00cac312176a5470/src/litserve/loops/simple_loops.py#L328
and the user will receive a garbage response with an incorrectly decoded prediction.
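The snippet below is only an illustration of why a length check alone can't distinguish "one output per request" from "one output per class"; the values and the zip-based splitting are stand-ins, not LitServe's actual unbatching logic.

```python
# Three requests arrive within batch_timeout, and the model predicts three classes.
batch = ["req1", "req2", "req3"]

# Mistaken predict output: dict keyed by class, values are per-request columns.
outputs = {
    "class_A": [0.2, 0.1, 0.7],
    "class_B": [0.3, 0.6, 0.2],
    "class_C": [0.5, 0.3, 0.1],
}

# A sanity check that only compares lengths passes here: 3 == 3.
assert len(outputs) == len(batch)

# Iterating a dict yields its keys, so splitting this "batch" of outputs
# hands each caller a class name rather than its probabilities.
for request, item in zip(batch, outputs):
    print(request, "->", item)  # req1 -> class_A, req2 -> class_B, req3 -> class_C
```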
Similar issues could occur with custom classes (rather than dictionaries) that implement __len__.
For return types other than list or ArrayLike (NumPy arrays, torch tensors, etc.), the user must be very careful with batching/unbatching, as the only check is on the length of the returned iterable.
I don't think enforcing output types on predict is necessary to avoid this (though it would help with the logic), but it's perhaps worth noting in the documentation, as I don't know how you'd robustly catch this otherwise:
"When writing predict methods that return fixed-size objects in the outer dimension, be aware of this edge case..." ... just something to nudge people towards returning list[custom_class] or list[dict] instead if they want to use batching; a sketch of the safer shape follows.