HuggingFaceModel does not properly accept script mode environment variables

Open athewsey opened this issue 1 year ago • 1 comment

Describe the bug

While Model / FrameworkModel's prepare_container_def() supports (here) manually configuring script mode environment variables for an existing model.tar.gz package, HuggingFaceModel's override implementation does not (here). User-configured entries like env={"SAGEMAKER_PROGRAM": ..., "SAGEMAKER_SUBMIT_DIRECTORY": ..., ...} are ignored, regardless of whether re-packing of new entrypoint code is requested.

This is important when importing large (multi-GB) pre-trained models to SageMaker inference, because it forces us to use the SDK class's re-packing functionality to add inference code, which is significantly slower in some cases: it can add tens of minutes of extra delay.

To reproduce

  • Prepare a model.tar.gz in S3 that already contains a code/inference.py alongside (whatever) model artifacts. For a simple reproduction, you could use no model artifacts at all and add a trivial custom model loader to inference.py, something like def model_fn(model_dir): return lambda x: x (a minimal sketch follows below).

In my current use case, my model artifacts are about 5GB and constructing/uploading this archive takes ~10min - regardless of whether the small script code is included.
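
For reference, a minimal inference.py for this reproduction could look like the below (a sketch only: the identity "model" just proves the entry point was loaded):

def model_fn(model_dir):
    # Trivial model loader: ignore model_dir and return an identity function,
    # so any invocation simply echoes its input back.
    return lambda x: x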

  • Create and deploy a Hugging Face Model from the archive on S3 via SageMaker Python SDK, indicating what code directory and entry point should be used:
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

model = HuggingFaceModel(
    model_data = "s3://.../model.tar.gz",  # (Contains code/inference.py)  
    role=sagemaker.get_execution_role(),
    py_version="py38",
    pytorch_version="1.10",
    transformers_version="4.17",
    env={
        "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
        "SAGEMAKER_PROGRAM": "inference.py",
        "SAGEMAKER_REGION": "ap-southeast-1",
        "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
    },
)
predictor = model.deploy(instance_type="ml.g4dn.xlarge", initial_instance_count=1)

Observed behavior

The endpoint fails to find the inference.py entry point (and therefore does not use the custom model_fn(), so the model fails to load).

This is because HuggingFaceModel overrides the SAGEMAKER_PROGRAM and SAGEMAKER_SUBMIT_DIRECTORY environment variables with empty values, even though no entry_point or source_dir were provided.
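
You can confirm this without deploying an endpoint, by inspecting the container definition the SDK generates (a minimal sketch; the instance type is illustrative):

container_def = model.prepare_container_def(instance_type="ml.g4dn.xlarge")
# SAGEMAKER_PROGRAM and SAGEMAKER_SUBMIT_DIRECTORY come back empty here,
# even though they were passed via env= above:
print(container_def["Environment"])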

Expected behavior

The HuggingFaceModel should correctly propagate the user-specified environment variables, to support using a pre-prepared model.tar.gz without re-packing. In this case, the container would find the pre-packed inference.py entry point and correctly use the overriding model_fn().
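
For reference, such a pre-prepared archive could be built once locally with something like the below (a sketch; the artifact file names are illustrative):

import tarfile

# Pack model artifacts plus the code/ directory into the layout the
# container expects at /opt/ml/model, so no SDK re-packing is needed:
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("pytorch_model.bin")        # (whatever) model artifacts
    tar.add("code/inference.py")        # entry point named by SAGEMAKER_PROGRAM
    tar.add("code/requirements.txt")    # optional extra dependencies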

Screenshots or logs

N/A

System information

  • SageMaker Python SDK version: 2.92.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): HuggingFace
  • Framework version: 4.17
  • Python version: py38
  • CPU or GPU: GPU
  • Custom Docker image (Y/N): N

Additional context

I am able to deploy a working endpoint by having my code folder and inference.py locally and adding these options to the model: HuggingFaceModel(source_dir="code", entry_point="inference.py", ...).
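
For completeness, that working (but slow) configuration looks something like the below (a sketch; versions copied from the reproduction above):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

model = HuggingFaceModel(
    model_data="s3://.../model-raw.tar.gz",  # Raw artifacts only; no code/ needed
    role=sagemaker.get_execution_role(),
    py_version="py38",
    pytorch_version="1.10",
    transformers_version="4.17",
    source_dir="code",             # Local folder containing inference.py
    entry_point="inference.py",    # Triggers the SDK's slow re-packing step
)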

The problem is that this more than doubles the time and resources needed to prepare the package:

  • 10min to produce an initial "model-raw.tar.gz" and load to S3
  • 10min for the SageMaker SDK to download that archive, extract and re-pack it to add the code folder, and re-upload it to a new location

Since the use case here is just to prepare the model from local artifacts + code, it would also be fine if model_data were able to accept a local, uncompressed folder, as the ~10min tarball creation would then only need to be done once. From my tests, though, this doesn't seem to be possible?

athewsey · Sep 16 '22 10:09

As an interim measure, users could override this behaviour with a patch like this:

from sagemaker.huggingface import HuggingFaceModel

class PatchedHuggingFaceModel(HuggingFaceModel):
    """Modified Model class to allow manually setting SM Script Mode env vars"""

    def prepare_container_def(self, *args, **kwargs):
        # Call the parent function:
        result = super().prepare_container_def(*args, **kwargs)
        # ...But allow our manual env vars configuration to override the internals:
        manual_env = dict(self.env)
        result["Environment"].update(manual_env)
        return result

Now you are free to:

model = PatchedHuggingFaceModel(
    model_data="s3://doc-example-bucket/pre-packed/model.tar.gz"
    env={
        # "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
        "SAGEMAKER_PROGRAM": "inference.py",
        # "SAGEMAKER_REGION": region,
        "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
    },
    ...
)
model.deploy(...)

(...where the pre-uploaded tarball already contains code/inference.py, code/requirements.txt, and whatever else.)

athewsey · Sep 19 '22 07:09