Async HTTP Python Client not working properly
Description
The following error is raised when requesting a model config using the async HTTP client: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa6 in position 5: invalid start byte.
When observing the HTTP response directly using tritonclient.http.aio without the pytriton wrapper, I noticed that the async client does not decompress the HTTP response by itself, so the issue seems fixable by setting auto_decompress=True on the aiohttp.ClientSession. I believe in my case the compression is imposed elsewhere (by nginx?).
If that was intended (since auto_decompress=True is the default setting), then some additional logic is required to process the compressed response; calling brotli.decompress() fixed it in my case (see the sketch below).
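A minimal sketch of that workaround, assuming the body is brotli-compressed (e.g. by an nginx proxy in front of Triton). The endpoint URL reuses the host and model name from the repro below and is only a placeholder; it is not part of tritonclient or pytriton:

import asyncio
import json

import aiohttp
import brotli

# Placeholder: the standard KServe v2 model-config route on the host from the repro below.
MODEL_CONFIG_URL = "http://HOST:80/v2/models/intent_classifier/config"


async def fetch_model_config() -> dict:
    # auto_decompress=False reproduces a client that does not decode the body itself;
    # with aiohttp's default (auto_decompress=True) the body would already arrive decoded.
    async with aiohttp.ClientSession(auto_decompress=False) as session:
        async with session.get(MODEL_CONFIG_URL) as response:
            raw = await response.read()
            if response.headers.get("Content-Encoding") == "br":
                raw = brotli.decompress(raw)  # manual workaround for brotli-encoded bodies
            return json.loads(raw)


if __name__ == "__main__":
    print(asyncio.run(fetch_model_config()))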
In any case I'll be happy to provide the necessary fixes.
Triton Information
- Triton server version 2.38.0
- Triton container 23.09
- tritonclient==2.33.0, 2.42.0
- nvidia-pytriton==0.2.5, 0.3.0 and 0.5.1 when built from source
To Reproduce
import asyncio

import numpy
from pytriton.client import AsyncioModelClient, ModelClient
from pytriton.client.utils import get_model_config

HOST = "HOST:80"
model_name = "intent_classifier"  # does not work
message = "Simple and correct query for testing"


async def run_classification(inferer_server_endpoint: str, clf_model_name: str, message: str, **_):
    async with AsyncioModelClient(inferer_server_endpoint, clf_model_name) as client:
        inference_client = client.create_client_from_url(inferer_server_endpoint)
        config = await inference_client.get_model_config(model_name)
        print(config)


def sync_classification(inferer_server_endpoint: str, clf_model_name: str, message: str, **_):
    with ModelClient(inferer_server_endpoint, clf_model_name) as client:
        inference_client = client.create_client_from_url(inferer_server_endpoint)
        config = inference_client.get_model_config(model_name)
        print(config)


if __name__ == "__main__":
    sync_classification(HOST, model_name, message)  # Works
    # asyncio.run(run_classification(HOST, model_name, message))  # Does not
The model config is attached: classifier_config.txt
Expected behavior
The model configuration should be returned as valid JSON that can be parsed into a Python dict.
Thanks for reporting the issue; I have filed a ticket for us to investigate further.
> In any case I'll be happy to provide the necessary fixes.
Any contribution is welcome!
Hi @mutkach, I took a deeper look into the Python AsyncIO client and it seems like we already have decompression built in. When calling the async infer(), it will:
- read the Content-Encoding header from the response headers
- pass the Content-Encoding header to the InferResult class that reads the response body
- the InferResult class will auto-decompress the response body based on the Content-Encoding header set by the server (roughly as in the sketch below).
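For illustration only (this is not the actual tritonclient source): a minimal sketch of dispatching decompression on the Content-Encoding value. The brotli branch is an assumption covering the nginx case mentioned above, not an encoding Triton itself sends:

import gzip
import zlib
from typing import Optional

import brotli  # assumed installed; only needed if a proxy adds brotli encoding


def decompress_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    # Decode the response body according to the Content-Encoding header
    # set by the server (or by a proxy such as nginx).
    if content_encoding in (None, "identity"):
        return body
    if content_encoding == "gzip":
        return gzip.decompress(body)
    if content_encoding == "deflate":
        return zlib.decompress(body)
    if content_encoding == "br":
        return brotli.decompress(body)
    raise ValueError(f"Unsupported Content-Encoding: {content_encoding}")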
> I believe in my case the compression is imposed elsewhere (by nginx?).
Would you be able to share the response headers received, when encountering this issue?
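If it helps, one way to capture them (a sketch using aiohttp directly against the assumed model-config endpoint, outside of pytriton):

import asyncio

import aiohttp


async def dump_headers(url: str) -> None:
    # Print the raw response headers so the Content-Encoding set by the
    # server (or a proxy) is visible.
    async with aiohttp.ClientSession(auto_decompress=False) as session:
        async with session.get(url) as response:
            for name, value in response.headers.items():
                print(f"{name}: {value}")


asyncio.run(dump_headers("http://HOST:80/v2/models/intent_classifier/config"))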