1x1 input image can crash /v1/embeddings endpoint until pod restart while health check endpoint continues to return 200
System Info
We have an image processing pipeline where we're using Infinity behind KubeAI to compute CLIP embeddings for images.
curl -vv http://kubeai/openai/v1/embeddings \
--http1.1 -H "Authorization: Bearer blah" \
-H "Content-Type: application/json" \
-d '{
"input": "https://some.image.url/image.png",
"model": "infinity-hosted-clip-embedding-model",
"modality": "image"
}'
Every now and then a single bad request comes along (I'm still working on isolating it). From that point on, our steady stream of embedding, health, and metrics calls becomes a steady stream of just health and metrics calls (both of which continue to return 200s). The model stays running but stops receiving any embedding requests from KubeAI, and we essentially lose 1/n of our throughput.
INFO: 10.15.5.245:57645 - "POST /v1/embeddings HTTP/1.1" 200 OK
ERROR 2025-06-26 13:23:51,860 infinity_emb ERROR: broken data stream when reading image file  batch_handler.py:557
Traceback (most recent call last):
  File "/app/infinity_emb/inference/batch_handler.py", line 541, in _preprocess_batch
    feat = self._model.encode_pre(items_for_pre)
  File "/app/infinity_emb/transformer/vision/torch_vision.py", line 159, in encode_pre
    preprocessed = self.processor(
  File "/app/.venv/lib/python3.10/site-packages/transformers/models/clip/processing_clip.py", line 109, in __call__
    image_features = self.image_processor(images, return_tensors=return_tensors, **image_processor_kwargs)
  File "/app/.venv/lib/python3.10/site-packages/transformers/image_processing_utils.py", line 41, in __call__
    return self.preprocess(images, **kwargs)
  File "/app/.venv/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 307, in preprocess
    images = [convert_to_rgb(image) for image in images]
  File "/app/.venv/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 307, in <listcomp>
    images = [convert_to_rgb(image) for image in images]
  File "/app/.venv/lib/python3.10/site-packages/transformers/image_transforms.py", line 776, in convert_to_rgb
    image = image.convert("RGB")
  File "/app/.venv/lib/python3.10/site-packages/PIL/Image.py", line 995, in convert
    self.load()
  File "/app/.venv/lib/python3.10/site-packages/PIL/ImageFile.py", line 312, in load
    raise _get_oserror(err_code, encoder=False)
OSError: broken data stream when reading image file
Any ideas?
Information
- [x] Docker + cli
- [ ] pip + cli
- [ ] pip + usage of Python interface
Tasks
- [x] An officially supported CLI command
- [ ] My own modifications
Reproduction
I'm trying to find the image URL that reproduces this. Will update back if I find one.
I was able to find a repro that works consistently when run directly against infinity (updated the issue to remove KubeAI).
curl -vv http://localhost:50574/v1/embeddings \
--http1.1 -H "Authorization: Bearer doesnt-matter" \
-H "Content-Type: application/json" \
-d '{
"input": "data:image/jpeg;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAusB9sXsdrgAAAAASUVORK5CYII=",
"model": "repro",
"modality": "image"
}'
The issue has to do with data:-encoded images that are 1x1 with 1 or 3 channels (RGB): the normalization step in huggingface/transformers can't infer the channel order for a [1,1,3] array (channels-first vs. channels-last) because the shape is ambiguous.
The bigger problem here is that infinity doesn't handle this error gracefully.
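To make the ambiguity concrete, here's a numpy-only sketch (not transformers or infinity code) of why a [1,1,3] array gives shape-based channel inference nothing to go on:

```python
import numpy as np

# A 1x1 RGB image. Shape (1, 1, 3) reads two ways:
#   channels-last : height=1, width=1, channels=3  (a single RGB pixel)
#   channels-first: channels=1, height=1, width=3  (a 1x3 grayscale strip)
# Both 1 and 3 are plausible channel counts, so there is no way to
# resolve the channel order from the shape alone.
pixel = np.zeros((1, 1, 3), dtype=np.uint8)

print(pixel.shape[-1])  # 3 -> "looks like" channels-last
print(pixel.shape[0])   # 1 -> "looks like" channels-first
```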
* Host localhost:50574 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:50574...
* Connected to localhost (::1) port 50574
> POST /v1/embeddings HTTP/1.1
> Host: localhost:50574
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer doesnt-matter
> Content-Type: application/json
> Content-Length: 204
>
* upload completely sent off: 204 bytes
We never get a reply; the request just eventually times out. All future requests to the /v1/embeddings endpoint also fail from that point on until we kill and replace the pod, while the /health and /metrics endpoints keep returning 200s.
KubeAI assumed the pod was healthy because of the 200s and kept spreading traffic across all infinity instances. This turns into a slow-motion DoS: once every pod has encountered the bad input once, no more CLIP embeddings (of text or images) get produced at all.
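As a stopgap on our side, a probe that exercises the real endpoint (instead of /health) would at least let the orchestrator recycle a wedged pod. A rough sketch, assuming the local port and model name from the repro above and an arbitrary 10-second timeout:

```python
#!/usr/bin/env python3
"""Exec-style probe: POST a tiny text embedding request and fail on timeout,
so a wedged /v1/embeddings is noticed even while /health still returns 200.
The port, model name, and 10s timeout are assumptions from the repro above."""
import json
import sys
import urllib.request

URL = "http://localhost:50574/v1/embeddings"
PAYLOAD = {"input": "ping", "model": "repro", "modality": "text"}

req = urllib.request.Request(
    URL,
    data=json.dumps(PAYLOAD).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer doesnt-matter",
    },
    method="POST",
)

try:
    with urllib.request.urlopen(req, timeout=10) as resp:
        sys.exit(0 if resp.status == 200 else 1)
except OSError:
    # No reply within the timeout (or a connection error): report unhealthy
    # so the orchestrator replaces the pod instead of routing traffic to it.
    sys.exit(1)
```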
❯ curl -vv http://localhost:50574/v1/embeddings \
--http1.1 -H "Authorization: Bearer doesnt-matter" \
-H "Content-Type: application/json" \
-d '{
"input": "hi",
"model": "repro",
"modality": "text"
}'
* Host localhost:50574 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:50574...
* Connected to localhost (::1) port 50574
> POST /v1/embeddings HTTP/1.1
> Host: localhost:50574
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer doesnt-matter
> Content-Type: application/json
> Content-Length: 90
>
* upload completely sent off: 90 bytes
Which version of infinity are you using? Newer releases have validation that rejects 1x1 images.
Just checked: the current check is only applied to images provided via URL. The linked PR applies the check to all input methods. With the check, the server returns a "Bad Request" for images smaller than 3x3, which also covers the example provided. This circumvents the problem described.
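For anyone stuck on an older image in the meantime, a client-side guard along these lines keeps such payloads out of the request path. This is only a sketch: the 3x3 threshold mirrors the server-side check described above, and PIL plus the data: URI layout are the only assumptions.

```python
import base64
import io

from PIL import Image

MIN_SIDE = 3  # mirrors the server-side "smaller than 3x3 -> Bad Request" rule


def is_safe_data_uri_image(data_uri: str) -> bool:
    """Reject data: URI images that can't be fully decoded or are tiny,
    before they ever reach /v1/embeddings."""
    try:
        _, b64 = data_uri.split(",", 1)
        raw = base64.b64decode(b64, validate=True)
        img = Image.open(io.BytesIO(raw))
        img.load()  # force a full decode; "broken data stream" surfaces here
    except Exception:
        return False
    width, height = img.size
    return width >= MIN_SIDE and height >= MIN_SIDE


# The 1x1 payload from the repro above gets rejected (too small / undecodable):
repro = ("data:image/jpeg;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwC"
         "AAAAC0lEQVR42mP8/x8AAusB9sXsdrgAAAAASUVORK5CYII=")
print(is_safe_data_uri_image(repro))  # False
```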
Thanks! Just want to make sure this didn't get lost above:
This unhandled exception broke the /v1/embeddings call that triggered it (we never got a response) and prevented all future calls to that endpoint from working at all. Maybe a with block or try/finally is missing around a queue slot or something?
Image is: michaelf34/infinity:0.0.76
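To illustrate the kind of pattern I mean (a hypothetical sketch, not Infinity's actual batch handler; every name below is made up): a per-item failure during preprocessing should complete the caller's future with the exception and always release whatever slot the request holds.

```python
import asyncio

# Hypothetical sketch of the pattern, not Infinity's real batch handler:
# `slots` caps in-flight work, `fut` is what the HTTP handler is awaiting.
slots = asyncio.Semaphore(32)


async def handle_item(fut: asyncio.Future, preprocess, payload) -> None:
    async with slots:  # the slot is released even if preprocessing blows up
        try:
            result = preprocess(payload)  # e.g. PIL decode + CLIP preprocessing
        except Exception as exc:
            # Complete the caller's future with the error instead of leaving it
            # pending forever; a pending future is exactly what makes the
            # endpoint hang for this and every later request.
            if not fut.done():
                fut.set_exception(exc)
            return
        if not fut.done():
            fut.set_result(result)
```

With something like that in place, a single undecodable image would turn into an error response for that one request instead of leaving the whole queue wedged.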
I added additional changes in this branch: #614
@scottt732 Could you give this branch a spin and report back if it solves your issue?
I think @scottt732 was not on the most recent branch. This issue should have been fixed a long time ago.