
Offline batch serving with dataset on S3

Open vincentclaes opened this issue 2 years ago • 0 comments

Is your feature request related to a problem? Please describe.

I would like to use BentoML to run offline batch predictions on a dataset stored on S3. Datasets can be large, so I would prefer the data to be read in chunks or streamed from S3.
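To illustrate what "chunked or streamed" could mean here, below is a minimal sketch in plain Python, independent of BentoML. It assumes boto3 is installed and credentials are configured; the function names `iter_csv_batches` and `iter_s3_csv_batches` and the batch size are hypothetical, chosen only for this example. It reads the object line by line through the `iter_lines()` method of the boto3 response body and groups the parsed rows into fixed-size batches, so the whole file never has to sit in memory at once.

```python
import csv
from typing import Dict, Iterable, Iterator, List


def iter_csv_batches(
    lines: Iterable[bytes], batch_size: int = 1000, encoding: str = "utf-8"
) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of CSV rows (as dicts) of at most batch_size rows each.

    `lines` is any iterable of raw byte lines, the first being the header.
    Caveat: quoted fields containing line breaks are not handled, because
    the input is assumed to be already split into lines.
    """
    decoded = (line.decode(encoding) for line in lines)
    reader = csv.DictReader(decoded)
    batch: List[Dict[str, str]] = []
    for row in reader:
        batch.append(row)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def iter_s3_csv_batches(
    bucket: str, key: str, batch_size: int = 1000
) -> Iterator[List[Dict[str, str]]]:
    """Stream a CSV object from S3 and yield it in batches (hypothetical helper)."""
    import boto3  # assumption: boto3 is available and AWS credentials are set up

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    # StreamingBody.iter_lines() reads the object incrementally.
    yield from iter_csv_batches(body.iter_lines(), batch_size)
```

Each batch could then be passed to the model's prediction function, for example `for batch in iter_s3_csv_batches("some-bucket", "test_input.csv"): ...`. How this would plug into a BentoML input adapter is exactly the open question of this issue.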

Describe the solution you'd like

Call the BentoML CLI with an S3 URI as the input file:

bentoml run SKlearnClassifier:latest batch_inference --input-file "s3://some-bucket/test_input.csv" --format "csv"

Describe alternatives you've considered

  • I tried the command above, but it does not work.
  • I looked at implementing a custom input adapter (https://docs.bentoml.org/en/latest/guides/custom_input_adapter.html) by subclassing StringInput or FileInput.

I was wondering:

  • Are there any plans to implement S3 support for datasets?
  • If not, can you point me in the right direction to implement this myself?
  • Would you consider accepting this as a new feature? I can help with this ...

Best regards,

Vincent

vincentclaes, Jan 02 '22 16:01