special edge case: len(output) == len(batch)
Situation:
Imagine you are building a classifier that returns a response like {"class_A": 0.2, "class_B": 0.3, "class_C": 0.5} (the "decoding of probabilities" is handled in predict rather than encode_response). When multiple requests are received and batched, the user mistakenly returns dict[str, list[float]] instead of list[dict[str, float]]. This is a reasonable mistake to make, since there's no explicit enforcement of, or guidance on, how to treat this situation; a user could presume that predict should return a dict instead of a list when switching to batching. A minimal sketch of the mistaken API is shown after this paragraph.
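To make the scenario concrete, here is a hedged sketch of what the mistaken API could look like. Only the LitAPI hooks (setup, decode_request, predict, encode_response) and the LitServer batching arguments (max_batch_size, batch_timeout) come from LitServe; the class name, the stand-in model, and the class labels are hypothetical.

```python
import litserve as ls


class MistakenBatchedClassifier(ls.LitAPI):
    def setup(self, device):
        # Placeholder "model": returns fixed three-class probabilities per input
        self.model = lambda batch: [[0.2, 0.3, 0.5] for _ in batch]

    def decode_request(self, request):
        return request["input"]

    def predict(self, inputs):
        # inputs is the batched list of decoded requests
        probs = self.model(inputs)  # shape: (num_requests, 3)
        # WRONG shape for batching: a single dict keyed by class, where each
        # value is a column of per-request scores -> dict[str, list[float]]
        return {
            "class_A": [p[0] for p in probs],
            "class_B": [p[1] for p in probs],
            "class_C": [p[2] for p in probs],
        }

    def encode_response(self, output):
        # With the wrong shape above, "output" ends up being a dict key
        # (a class name) rather than one request's probabilities
        return {"prediction": output}


if __name__ == "__main__":
    server = ls.LitServer(MistakenBatchedClassifier(), max_batch_size=8, batch_timeout=0.05)
    server.run(port=8000)
```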
There's a very specific edge case that can occur here if batching is implemented with these contracts.
Suppose batch_size is greater than the number of classes, and the number of requests received within batch_timeout happens to equal the number of classes predicted.
In this case len(outputs) == len(inputs), so the following check will pass
https://github.com/Lightning-AI/LitServe/blob/9e9005ffbe5f9b747024944a00cac312176a5470/src/litserve/loops/simple_loops.py#L328
and the user will receive a garbage response with an incorrectly decoded prediction.
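The snippet below is only an illustration of why a length check alone can't distinguish "one output per request" from "one output per class"; the values and the zip-based splitting are stand-ins, not LitServe's actual unbatching logic.

```python
# Three requests arrive within batch_timeout, and the model predicts three classes.
batch = ["req1", "req2", "req3"]

# Mistaken predict output: dict keyed by class, values are per-request columns.
outputs = {
    "class_A": [0.2, 0.1, 0.7],
    "class_B": [0.3, 0.6, 0.2],
    "class_C": [0.5, 0.3, 0.1],
}

# A sanity check that only compares lengths passes here: 3 == 3.
assert len(outputs) == len(batch)

# Iterating a dict yields its keys, so splitting this "batch" of outputs
# hands each caller a class name rather than its probabilities.
for request, item in zip(batch, outputs):
    print(request, "->", item)  # req1 -> class_A, req2 -> class_B, req3 -> class_C
```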
Similar issues could occur with custom classes (rather than dictionaries) that implement __len__.
For return types other than list or ArrayLike (NumPy arrays, torch tensors, etc.), the user must be very careful with batching/unbatching, as the only check is on the length of the returned iterable.
I don't think enforcing output types on predict is necessary to avoid this (though it would help with the logic), but it's perhaps worth noting in the documentation, as I don't know how you'd robustly catch this otherwise:
"When writing predict methods that return fixed-size objects in the outer dimension, be aware of this edge case..." ... just something to nudge people towards returning list[custom_class] or list[dict] instead if they want to use batching; a sketch of the safer shape follows.