BentoML Support custom content types for I/O descriptors

We may want to support content types other than the default for these descriptors.

I think there are two options here:

Add a parameter to the I/O descriptors that lets users specify which content type they would like to accept, or
Add a content_type field or method to the I/O descriptor API, which users can override to set the content types they would like to accept.

I'm leaning somewhat towards the latter, though I also feel that the File I/O descriptor specifically should ignore the content type, since it should probably act as a catch-all for I/O descriptors we may want to implement in the future.
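To make the two options concrete, here is a minimal sketch of what each could look like from the user's side. The Descriptor and CsvDescriptor classes are hypothetical stand-ins written for this illustration; they are not BentoML's actual classes, and the default of application/octet-stream is an assumption.

```python
from typing import Optional


class Descriptor:
    """Hypothetical stand-in for an I/O descriptor (not BentoML's real class)."""

    # Assumed default, used when the user does not customize anything.
    default_content_type = "application/octet-stream"

    def __init__(self, content_type: Optional[str] = None) -> None:
        # Option 1: the accepted content type is passed as a parameter.
        self._content_type = content_type

    def content_type(self) -> str:
        # Option 2: subclasses override this method to change the accepted type.
        return self._content_type or self.default_content_type


class CsvDescriptor(Descriptor):
    """Option 2 in practice: override content_type instead of passing a parameter."""

    def content_type(self) -> str:
        return "text/csv"


by_parameter = Descriptor(content_type="text/csv")
by_override = CsvDescriptor()
unchanged = Descriptor()

print(by_parameter.content_type())
print(by_override.content_type())
print(unchanged.content_type())
```

The parameter keeps the customization at the call site, while the override needs a subclass but leaves the constructor signature alone.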

See also: https://github.com/bentoml/BentoML/discussions/2381

Apr 01 '22 18:04 sauyon

Notes:

Will allow content_type to be anything
input.name should return the file name when sending File with form-data
generate OpenAPI schema for both octet-stream and form-data
future work: add audio/video IO descriptor
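As a rough illustration of the third note, the sketch below builds an OpenAPI 3 requestBody object that advertises both application/octet-stream and multipart/form-data for a file input. The helper name file_request_body and the field name "data" are made up for this example; this is not how BentoML generates its schema, only the shape such a schema could take.

```python
import json


def file_request_body(field_name: str = "file") -> dict:
    """Build a requestBody that accepts a raw upload or a form-data upload.

    Hypothetical helper for illustration only.
    """
    # OpenAPI 3 describes binary payloads as a string with format "binary".
    binary = {"type": "string", "format": "binary"}
    return {
        "required": True,
        "content": {
            # Raw body: the whole request payload is the file.
            "application/octet-stream": {"schema": binary},
            # Form upload: the file is one named field of the form.
            "multipart/form-data": {
                "schema": {
                    "type": "object",
                    "properties": {field_name: binary},
                }
            },
        },
    }


body = file_request_body("data")
print(json.dumps(body, indent=2))
```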

Apr 07 '22 00:04 parano

Hello, I am running into this issue when using the Swagger UI to test my endpoint.

I am working on updating our models from bento v0.0.13 to v1.0.13 because we need access to bentoml.Context. The model I am currently working with is "google/tapas-base-finetuned-wtq" from Hugging Face. I need to provide multiple inputs to the model: a CSV file for the data and a list of questions.

This is what I have done so far:

Set up the svc.api as follows, per the BentoML documentation:
This is how Swagger UI presents the input options
Add my data.csv file and submit the data
Receive a response of 500 Internal Server Error
Console output reports the following

2023-02-24T16:15:14-0500 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "service.py:svc" can be accessed at http://localhost:3000/metrics.
2023-02-24T16:15:14-0500 [INFO] [cli] Starting development HTTP BentoServer from "service.py:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2023-02-24T16:15:18-0500 [INFO] [dev_api_server:tapas-base-service] 127.0.0.1:37266 (scheme=http,method=GET,path=/,type=,length=) (status=200,type=text/html; charset=utf-8,length=2859) 0.425ms (trace=b68a91e57842e88a03ba1423f8b83a1b,span=cfe6e3d93e2526f1,sampled=0)
2023-02-24T16:15:18-0500 [INFO] [dev_api_server:tapas-base-service] 127.0.0.1:37266 (scheme=http,method=GET,path=/docs.json,type=,length=) (status=200,type=application/json,length=5394) 14.696ms (trace=70b2dff02b3b5235d396faf1dd3022ff,span=df6ef270a7d27631,sampled=0)
2023-02-24T16:16:47-0500 [ERROR] [dev_api_server:tapas-base-service] Exception on /qna [POST] (trace=8e21fc42b1ce1cb218e74d94d4184d2b,span=b86eb79ab2652c98,sampled=0)
Traceback (most recent call last):
  File "/home/nvadmin/projects/neurons/Neurons/pytorch/tapas/.env/lib/python3.9/site-packages/bentoml/_internal/server/http_app.py", line 311, in api_func
    input_data = await api.input.from_http_request(request)
  File "/home/nvadmin/projects/neurons/Neurons/pytorch/tapas/.env/lib/python3.9/site-packages/bentoml/_internal/io_descriptors/multipart.py", line 271, in from_http_request
    res[field] = await descriptor.from_http_request(form_values[field])
  File "/home/nvadmin/projects/neurons/Neurons/pytorch/tapas/.env/lib/python3.9/site-packages/bentoml/_internal/io_descriptors/file.py", line 235, in from_http_request
    raise BentoMLException(
bentoml.exceptions.BentoMLException: multipart File should have Content-Type 'application/octet-stream', got files with content types text/csv
2023-02-24T16:16:47-0500 [INFO] [dev_api_server:tapas-base-service] 127.0.0.1:43352 (scheme=http,method=POST,path=/qna,type=multipart/form-data; boundary=---------------------------80648029493694037053934203,length=562) (status=500,type=application/json,length=2) 1.871ms (trace=8e21fc42b1ce1cb218e74d94d4184d2b,span=b86eb79ab2652c98,sampled=0)

Previous API implementation looks like the following and still works as expected

Feb 24 '23 21:02 cdeeran

I think in your case it should be sufficient to use File(mime_type="text/csv"), though I do think we should change the file descriptor to be a catch-all. I'll put up a PR to fix that.
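For anyone wondering why the mime_type setting matters here, this is a simplified model of the check that produced the error in the traceback above. check_part and accepted are hypothetical helpers written for this comment, not BentoML code; passing expected=None stands in for the proposed catch-all behavior.

```python
from typing import Optional


def check_part(expected: Optional[str], actual: str) -> None:
    """Raise ValueError when a multipart part's Content-Type is not accepted.

    expected=None models a catch-all descriptor that accepts any type.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = actual.split(";", 1)[0].strip().lower()
    if expected is not None and media_type != expected.lower():
        raise ValueError(
            f"multipart File should have Content-Type {expected!r}, got {media_type!r}"
        )


def accepted(expected: Optional[str], actual: str) -> bool:
    """Return True if the part passes the check, False otherwise."""
    try:
        check_part(expected, actual)
    except ValueError:
        return False
    return True


strict_default = accepted("application/octet-stream", "text/csv")
configured = accepted("text/csv", "text/csv; charset=utf-8")
catch_all = accepted(None, "text/csv")

print(strict_default, configured, catch_all)
```

With the default expectation the CSV upload is rejected, which matches the 500 response reported above; configuring the expected type or treating File as a catch-all lets it through.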

Mar 01 '23 09:03 sauyon

This will be supported in the upcoming new IO descriptor design!

Oct 31 '23 16:10 parano

Notes:

Will allow content_type to be anything

input.name should return the file name when sending File with form-data

generate OpenAPI schema for both octet-stream and form-data

future work: add audio/video IO descriptor

Are audio/video supported now?

Dec 25 '23 07:12 HaithemH
