bug: ClientOSError raised from aiohttp.streams in read
Describe the bug
I am serving a PyTorch ResNet model with BentoML in production. On top of this model, I built an API that extracts features from two images and compares their similarity.
However, a `ClientOSError` is sometimes raised from `aiohttp.streams` in `read`, and I would like to know whether the cause can be fixed in my code.

To reproduce
[1] save_model.py

```python
import bentoml
from torchvision.models import resnet50, ResNet50_Weights
from torchvision.models.feature_extraction import create_feature_extractor

weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights)
model.eval()

# extract the pooled feature and expose it under the "avgpool" key
return_nodes = {"avgpool": "avgpool"}
new_model = create_feature_extractor(model, return_nodes=return_nodes)

bentoml.pytorch.save_model("model", new_model)
```
[2] service.py

```python
import asyncio
import logging

import bentoml
import torch
from aiohttp import ServerDisconnectedError
from bentoml.exceptions import RemoteException
from PIL import UnidentifiedImageError

# input_spec / output_spec, ImageModelFeatures, url_to_processed_img and
# cos_sim are defined elsewhere in the service module.
runner = bentoml.pytorch.get("model:latest").to_runner()
svc = bentoml.Service("service", runners=[runner])

@svc.api(input=input_spec, output=output_spec)
async def predict(input_data: ImageModelFeatures) -> dict:
    """
    input: {"source_url": "...jpg", "target_url": "....jpg"}
    return: similarity
    """
    ZERO_DISTANCE = 0.0
    try:
        source_img = url_to_processed_img(input_data.source_url)
        source_embedding = await runner.async_run(source_img)
        source_embedding = torch.flatten(source_embedding["avgpool"])
        source_embedding = source_embedding.detach().numpy()

        # note: this previously read input_data.source_url (copy-paste bug)
        target_img = url_to_processed_img(input_data.target_url)
        target_embedding = await runner.async_run(target_img)
        target_embedding = torch.flatten(target_embedding["avgpool"])
        target_embedding = target_embedding.detach().numpy()

        distance = cos_sim(source_embedding, target_embedding)
        logging.info(f"[Predict] distance: {distance}")
        await asyncio.sleep(0.001)
        return {"distance": distance}
    except (RuntimeError, UnidentifiedImageError) as e:
        logging.error(f"[Predict][RuntimeError] {e}", extra=dict(error=str(e)))
        return {"distance": ZERO_DISTANCE}
    except (ServerDisconnectedError, RemoteException) as e:
        logging.error(f"[Predict][DisconnectedError] {e}", extra=dict(error=str(e)))
        return {"distance": ZERO_DISTANCE}
```
[3] save the model and serve

```shell
python save_model.py
bentoml serve service.py:svc
```

[4] Output
It works well most of the time, but sometimes returns the error above.
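For distributing this to production, BentoML builds a deployable Bento from a `bentofile.yaml`; a minimal sketch, where the paths and include patterns are assumptions about this project's layout:

```yaml
service: "service:svc"
include:
  - "*.py"
python:
  requirements_txt: "./requirements.txt"
```

Building and containerizing then uses the standard `bentoml build` and `bentoml containerize` commands.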
Expected behavior
I would like to know whether the model and service code are well organized for deployment to a production environment, and what kind of defensive code is needed to prevent this error from occurring.
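One possible defensive measure (a sketch, not an official BentoML recipe): since `aiohttp.ClientOSError` is a subclass of `OSError` and is usually transient, the runner call can be wrapped in a retry with exponential backoff. `retry_async` is a hypothetical helper introduced here for illustration:

```python
import asyncio

async def retry_async(coro_factory, retries=3, base_delay=0.1, exceptions=(OSError,)):
    """Retry an async call with exponential backoff on transient errors.

    coro_factory must be a zero-argument callable returning a fresh coroutine,
    because a coroutine object can only be awaited once.
    """
    for attempt in range(retries):
        try:
            return await coro_factory()
        except exceptions:
            if attempt == retries - 1:
                raise  # out of attempts: surface the original error
            await asyncio.sleep(base_delay * (2 ** attempt))

# usage inside predict() would look like:
#     source_embedding = await retry_async(lambda: runner.async_run(source_img))
```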
Environment
[1] python: python3.9-slim-buster docker image
[2] requirements.txt

```
numpy==1.23.*
requests==2.28.*
bentoml==1.0.7
torch==1.12.*
torchvision==0.13.*
pydantic==1.10.*
sentry-sdk==1.11.*
```
Hi, does this reproduce on the latest version of BentoML? You can use the new service APIs.