
Keep root dir for extra files when generating .mar file in model-archiver/model_packaging_utils.py

Open csJoax opened this issue 3 years ago • 0 comments

🚀 The feature

We may need to keep the root directory of each --extra-files entry because:

  • sometimes the directory structure is complicated;
  • sometimes different directories contain files with the same name.
(torch) ➜  pytorch_serve git:(master) ✗ tree -L 2 examples/image_classifier/mnist 
examples/image_classifier/mnist
├── mnist_cnn.pt
├── mnist_handler.py
├── mnist.py
├── my_lib1
│   ├── my_file.py
│   └── ...
├── my_lib2
│   ├── my_file.py
│   └── ...
└── ...

Each my_file.py may be used by its own library.

The structure of the .mar file should look like this:

(torch) ➜  pytorch_serve git:(mar_extra_root) ✗ unzip -l mnist.mar
Archive:  mnist.mar
  Length      Date    Time    Name
---------  ---------- -----   ----
     1367  2022-06-15 16:19   mnist_handler.py
  4800893  2022-06-15 16:19   mnist_cnn.pt
      757  2022-06-15 16:19   mnist.py
       58  2022-06-02 01:41   my_lib2/my_file.py
       59  2022-06-02 01:41   my_lib2/my_module/my_file1.py
       58  2022-06-02 01:14   my_lib1/my_file.py
       59  2022-06-02 01:13   my_lib1/my_module/my_file1.py
      265  2022-06-15 16:19   MAR-INF/MANIFEST.json
---------                     -------
  4803516                     8 files
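The listing above comes from `unzip -l`, so the archive can also be inspected from Python. The following is a minimal sketch, assuming the .mar file is a regular zip archive (as the `unzip` output suggests); `top_level_entries` is a hypothetical helper name, not part of the model archiver.

```python
import zipfile


def top_level_entries(mar_path):
    """Return the sorted set of top-level names inside a zip-format .mar file."""
    with zipfile.ZipFile(mar_path) as archive:
        # "my_lib1/my_module/my_file1.py" -> "my_lib1"; "mnist.py" -> "mnist.py"
        return sorted({name.split("/", 1)[0] for name in archive.namelist()})


# Usage (path is illustrative):
# print(top_level_entries("mnist.mar"))
```

If the root directories are kept, `my_lib1` and `my_lib2` should appear in the result next to the model files and `MAR-INF`.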

Motivation, pitch

Current:

(torch) ➜  pytorch_serve git:(master) ✗     torch-model-archiver \
      --model-name mnist \
      --version 1.0 \
      --model-file examples/image_classifier/mnist/mnist.py \
      --serialized-file examples/image_classifier/mnist/mnist_cnn.pt \
      --handler  examples/image_classifier/mnist/mnist_handler.py \
      --extra-files examples/image_classifier/mnist/my_lib
(torch) ➜  pytorch_serve git:(master) ✗ unzip -l mnist.mar        
Archive:  mnist.mar
  Length      Date    Time    Name
---------  ---------- -----   ----
     1365  2022-06-02 01:35   mnist_handler.py
  4800893  2022-06-02 01:35   mnist_cnn.pt
       58  2022-06-02 01:14   my_file.py
      757  2022-06-02 01:35   mnist.py
       59  2022-06-02 01:13   my_module/my_file1.py
      265  2022-06-02 01:35   MAR-INF/MANIFEST.json
---------                     -------
  4803397                     6 files

If we instead pass both my_lib1/ and my_lib2/ to the --extra-files option, an error occurs:

(torch) ➜  pytorch_serve git:(master) ✗     torch-model-archiver \
      --model-name mnist \
      --version 1.0 \
      --model-file examples/image_classifier/mnist/mnist.py \
      --serialized-file examples/image_classifier/mnist/mnist_cnn.pt \
      --handler  examples/image_classifier/mnist/mnist_handler.py \
      --extra-files examples/image_classifier/mnist/my_lib1,examples/image_classifier/mnist/my_lib2
Traceback (most recent call last):
  File "/opt/miniconda3/envs/torch/bin/torch-model-archiver", line 8, in <module>
    sys.exit(generate_model_archive())
  File "/opt/miniconda3/envs/torch/lib/python3.9/site-packages/model_archiver/model_packaging.py", line 56, in generate_model_archive
    package_model(args, manifest=manifest)
  File "/opt/miniconda3/envs/torch/lib/python3.9/site-packages/model_archiver/model_packaging.py", line 36, in package_model
    model_path = ModelExportUtils.copy_artifacts(model_name, **artifact_files)
  File "/opt/miniconda3/envs/torch/lib/python3.9/site-packages/model_archiver/model_packaging_utils.py", line 159, in copy_artifacts
    shutil.copytree(src, dst, False, None)
  File "/opt/miniconda3/envs/torch/lib/python3.9/shutil.py", line 565, in copytree
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
  File "/opt/miniconda3/envs/torch/lib/python3.9/shutil.py", line 466, in _copytree
    os.makedirs(dst, exist_ok=dirs_exist_ok)
  File "/opt/miniconda3/envs/torch/lib/python3.9/os.py", line 225, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/tmp/mnist/my_module'

Additionally, it is surprising that the my_lib1 or my_lib2 root directory is dropped from the archive.
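The failure can be reproduced outside the archiver. The sketch below assumes, based on the traceback, that the children of every extra directory are copied straight into one staging directory; `flatten_copy`, `make_lib` and `demo_collision` are hypothetical names used only for illustration and are not the archiver's real code.

```python
import os
import shutil
import tempfile


def flatten_copy(extra_dirs, staging):
    """Copy the children of each directory directly into staging (root dropped)."""
    for directory in extra_dirs:
        for item in os.listdir(directory):
            src = os.path.join(directory, item)
            dst = os.path.join(staging, item)
            if os.path.isdir(src):
                shutil.copytree(src, dst)  # raises FileExistsError if dst exists
            else:
                shutil.copy2(src, dst)  # silently overwrites a same-named file


def make_lib(root, name):
    """Create <root>/<name>/my_file.py and <root>/<name>/my_module/my_file1.py."""
    lib = os.path.join(root, name)
    os.makedirs(os.path.join(lib, "my_module"))
    for rel in ("my_file.py", os.path.join("my_module", "my_file1.py")):
        with open(os.path.join(lib, rel), "w") as handle:
            handle.write("# from " + name + "\n")
    return lib


def demo_collision():
    """Return the name of the directory that collides, or None if the copy succeeds."""
    with tempfile.TemporaryDirectory() as tmp:
        libs = [make_lib(tmp, "my_lib1"), make_lib(tmp, "my_lib2")]
        staging = os.path.join(tmp, "staging")
        os.makedirs(staging)
        try:
            flatten_copy(libs, staging)
        except FileExistsError as exc:
            return os.path.basename(exc.filename)
    return None


if __name__ == "__main__":
    print(demo_collision())
```

Under this assumption there are two separate problems: the second `my_module` directory cannot be created, and the second `my_file.py` would replace the first without any warning.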

Alternatives

We can add a new option, --keep-extra-root, to decide whether to keep the my_lib? root directories:

(torch) ➜  pytorch_serve git:(mar_extra_root) ✗ torch-model-archiver \
      --model-name mnist \
      --version 1.0 \
      --model-file examples/image_classifier/mnist/mnist.py \
      --serialized-file examples/image_classifier/mnist/mnist_cnn.pt \
      --handler  examples/image_classifier/mnist/mnist_handler.py \
      --extra-files examples/image_classifier/mnist/my_lib1,examples/image_classifier/mnist/my_lib2 \
      --keep-extra-root
(torch) ➜  pytorch_serve git:(mar_extra_root) ✗ unzip -l mnist.mar
Archive:  mnist.mar
  Length      Date    Time    Name
---------  ---------- -----   ----
     1367  2022-06-15 16:19   mnist_handler.py
  4800893  2022-06-15 16:19   mnist_cnn.pt
      757  2022-06-15 16:19   mnist.py
       58  2022-06-02 01:41   my_lib2/my_file.py
       59  2022-06-02 01:41   my_lib2/my_module/my_file1.py
       58  2022-06-02 01:14   my_lib1/my_file.py
       59  2022-06-02 01:13   my_lib1/my_module/my_file1.py
      265  2022-06-15 16:19   MAR-INF/MANIFEST.json
---------                     -------
  4803516                     8 files

We have implemented this feature.
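For readers who want to see the idea in isolation, here is a minimal sketch of how a keep-root copy step could behave. It is not the code from the mar_extra_root branch; `copy_extra_dir`, `make_lib`, `list_files` and `demo_keep_root` are hypothetical helpers, and `dirs_exist_ok` requires Python 3.8 or newer.

```python
import os
import shutil
import tempfile


def copy_extra_dir(src_dir, staging, keep_root):
    """Copy src_dir into staging, optionally under its own directory name."""
    src_dir = os.path.normpath(src_dir)  # drop a trailing slash so basename works
    if keep_root:
        dst = os.path.join(staging, os.path.basename(src_dir))
    else:
        dst = staging  # flattened layout: same-named files overwrite each other
    shutil.copytree(src_dir, dst, dirs_exist_ok=True)


def make_lib(root, name):
    """Create <root>/<name>/my_file.py and <root>/<name>/my_module/my_file1.py."""
    lib = os.path.join(root, name)
    os.makedirs(os.path.join(lib, "my_module"))
    for rel in ("my_file.py", os.path.join("my_module", "my_file1.py")):
        with open(os.path.join(lib, rel), "w") as handle:
            handle.write("# from " + name + "\n")
    return lib


def list_files(root):
    """Return sorted file paths under root, relative to root, with '/' separators."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            rel = os.path.relpath(os.path.join(dirpath, filename), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def demo_keep_root():
    """Stage two libraries with keep_root=True and return the resulting layout."""
    with tempfile.TemporaryDirectory() as tmp:
        libs = [make_lib(tmp, "my_lib1"), make_lib(tmp, "my_lib2")]
        staging = os.path.join(tmp, "staging")
        os.makedirs(staging)
        for lib in libs:
            copy_extra_dir(lib, staging, keep_root=True)
        return list_files(staging)


if __name__ == "__main__":
    for path in demo_keep_root():
        print(path)
```

With `keep_root=True` each library lands in its own subdirectory, so the two `my_file.py` files and the two `my_module` directories no longer collide.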

Additional context

No response

csJoax, Jun 15 '22 09:06