
Add a Dockerfile build for the converter only

Open jondot opened this issue 2 years ago • 3 comments

Currently, converting an existing HF model requires (1) a working Rust environment, (2) a checkout of the rust-bert repo, and (3) a Python environment set up just for the conversion.

Consider two use cases:

  (a) A Rust developer wants to use an HF model: they would need a Python environment.
  (b) A data scientist wants to experiment with different models in a Rust project that Rust devs created for them: they would need a Rust environment and a rust-bert checkout.

These two groups seem to be mostly mutually exclusive.

I've created a Dockerfile, which I think is minimal, that only does the conversion. It:

  1. Builds the Rust project
  2. Sets up a python environment with the prebuilt Rust converter
  3. Takes a conversion command

With this, developers and data scientists depend only on Docker. Assuming the built image is called rustbert-converter, they only need to run:

docker run -v "$(pwd)"/<path to model on host>:/model rustbert-converter pytorch_model.bin

The image expects a /model folder, shared between the container and the host, that contains the raw PyTorch model files.
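
For illustration only, here is a minimal sketch of a wrapper around the command above. The image name and the /model mount point come from this post; the function names, the `IMAGE` variable, and the default weights file name `pytorch_model.bin` (the usual Hugging Face name) are assumptions, not part of the proposed Dockerfile.

```shell
#!/bin/sh
# Hypothetical helper script; only the image name and the /model mount
# point are taken from the post, the rest is assumed for illustration.
IMAGE="${IMAGE:-rustbert-converter}"

# Print the docker command for a host model directory and weights file,
# without running anything (useful as a dry run).
converter_cmd() {
    model_dir="$1"
    weights="${2:-pytorch_model.bin}"
    printf 'docker run -v "%s":/model %s %s\n' "$model_dir" "$IMAGE" "$weights"
}

# Check that the weights file exists on the host, then run the conversion.
convert_model() {
    model_dir="$1"
    weights="${2:-pytorch_model.bin}"
    if [ ! -f "$model_dir/$weights" ]; then
        echo "missing weights file: $model_dir/$weights" >&2
        return 1
    fi
    docker run -v "$model_dir":/model "$IMAGE" "$weights"
}
```

The model directory should be passed as an absolute path (for example `convert_model "$(pwd)/my-model"`, where `my-model` is a placeholder), because Docker treats a non-absolute name given to `-v` as a named volume rather than a host folder.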

jondot, Sep 26 '23 06:09

Thank you @jondot - this is great! The model conversion for Python is currently tested in the CI here; would it be possible to add a test using Docker as well? This would ensure everything still works as expected, and the test would double as documentation showing how to run the conversion.

guillaume-be, Sep 26 '23 17:09

Sure, I can try. Do you mean we want to build the Docker image in the CI, and then run a container from it to convert a sample model?

jondot, Sep 27 '23 06:09

Yes
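
A rough sketch of what such a CI step could look like, under stated assumptions: the function name, the `DOCKER` and `IMAGE` variables, the sample model directory, and the expected output file name (`rust_model.ot`, the name the existing Python converter uses by default) are all hypothetical here, and the real image may behave differently.

```shell
#!/bin/sh
# Hypothetical CI smoke test: build the image, convert a sample model,
# and verify that a non-empty output file appears in the shared folder.
IMAGE="${IMAGE:-rustbert-converter}"
DOCKER="${DOCKER:-docker}"   # overridable, e.g. with a stub for local checks

ci_smoke_test() {
    model_dir="$1"                      # absolute path to a sample model
    expected="${2:-rust_model.ot}"      # assumed output file name
    "$DOCKER" build -t "$IMAGE" . || return 1
    "$DOCKER" run -v "$model_dir":/model "$IMAGE" pytorch_model.bin || return 1
    # Succeed only if the converted file exists and is not empty.
    [ -s "$model_dir/$expected" ]
}
```

In a CI job this would be called from the repository root, for example `ci_smoke_test "$(pwd)/sample-model"`, with `sample-model` standing in for wherever the test model is downloaded.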

guillaume-be, Sep 27 '23 18:09