
[BUG] Can't load logged transformer model that has remote code

Open hugocool opened this issue 5 months ago • 1 comments

Issues Policy acknowledgement

  • [X] I have read and agree to submit bug reports in accordance with the issues policy

Where did you encounter this bug?

Other

Willingness to contribute

Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.

MLflow version

  • Client: 2.8.1

System information

  • Amazon linux
  • python 3.9.10

Describe the problem

When attempting to load a custom transformers model using MLflow's transformers integration, the process fails with an AttributeError. The specific error is: module transformers has no attribute LSGDistilBertForSequenceClassification. This suggests an issue with how MLflow is handling or recognizing the custom model class.

The error occurs within the load_model function, which points to a problem in how MLflow interfaces with the transformers library for custom models. It probably has to do with the _add_code_from_conf_to_system_path function at mlflow/transformers/__init__.py line 866, which has a default for the CODE_KEY that does not match (it should be 'auto_map' in this case), but I can't set that.

Tracking information

REPLACE_ME

Code to reproduce issue

from transformers import AutoModelForSequenceClassification
import mlflow

# Attempt to load the custom transformers model
model = AutoModelForSequenceClassification.from_pretrained(
    "ccdv/lsg-distilbert-base-uncased-4096", trust_remote_code=True
)
with mlflow.start_run():
    mlflow.transformers.log_model(
        model,
        "custom_transformer_model",
        code_paths=[
            "/home/ec2-user/.cache/huggingface/modules/transformers_modules/ccdv/lsg-distilbert-base-uncased-4096/79cac77ade3abac244e8037579f4cba2cf62736c/modeling_lsg_distilbert.py"
        ],
    )

# Load model with MLflow
loaded_model = mlflow.transformers.load_model("path/to/custom_transformer_model")

Stack trace

Traceback (most recent call last):
  ...
  File "/path/to/mlflow/transformers/__init__.py", line 928, in _load_model
    model_instance = getattr(transformers, flavor_config[_PIPELINE_MODEL_TYPE_KEY])
  ...
AttributeError: module transformers has no attribute LSGDistilBertForSequenceClassification
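
The failing line in the trace above is a plain attribute lookup by class name on the transformers package. A minimal sketch of why that fails for remote-code classes, using a stand-in module rather than the real library (`fake_transformers`, `resolve_model_class`, and both stub classes are hypothetical names for illustration only):

```python
import types

# Stand-in for the real `transformers` package (hypothetical, illustration only).
fake_transformers = types.ModuleType("fake_transformers")


class DistilBertForSequenceClassification:
    """Stands in for a class that ships inside the library."""


class LSGDistilBertForSequenceClassification:
    """Stands in for a class defined in a model repo's remote code."""


# Classes shipped with the library are attributes of the package itself.
fake_transformers.DistilBertForSequenceClassification = (
    DistilBertForSequenceClassification
)


def resolve_model_class(module, class_name):
    """Look up a class by name on a module, like the getattr call in the trace."""
    return getattr(module, class_name)


# The built-in class resolves without trouble.
builtin_cls = resolve_model_class(
    fake_transformers, "DistilBertForSequenceClassification"
)

# The remote-code class was never attached to the package, so the lookup fails.
try:
    resolve_model_class(fake_transformers, "LSGDistilBertForSequenceClassification")
    lookup_error = None
except AttributeError as exc:
    lookup_error = exc
```

Under this reading, any model whose class lives outside the package namespace would hit the same AttributeError on load, regardless of whether its source file was logged through code_paths.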

Other info / logs

REPLACE_ME

What component(s) does this bug affect?

  • [X] area/artifacts: Artifact stores and artifact logging
  • [ ] area/build: Build and test infrastructure for MLflow
  • [ ] area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
  • [ ] area/docs: MLflow documentation pages
  • [ ] area/examples: Example code
  • [X] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [X] area/models: MLmodel format, model serialization/deserialization, flavors
  • [ ] area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • [ ] area/server-infra: MLflow Tracking server backend
  • [ ] area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • [ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

What language(s) does this bug affect?

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients
  • [ ] language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations
  • [ ] integrations/databricks: Databricks integrations

hugocool avatar Jan 24 '24 15:01 hugocool

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

github-actions[bot] avatar Feb 01 '24 00:02 github-actions[bot]

I just found a simple temporary solution and it worked for me! In the module where you load the MLflow model, add the following code:

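import transformers

# LSGDistilBertForSequenceClassification must already be imported from the model's remote-code module.
transformers.LSGDistilBertForSequenceClassification = LSGDistilBertForSequenceClassification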
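
The workaround above patches the package globally. If that feels too invasive, one possible variation is to scope the patch to the load call with a context manager. This is only a sketch: `temporarily_registered` is a hypothetical helper, and a stand-in module and stub class are used so it runs without transformers or MLflow installed.

```python
import contextlib
import types


@contextlib.contextmanager
def temporarily_registered(module, cls):
    """Expose `cls` as `module.<cls.__name__>` for the duration of the block."""
    name = cls.__name__
    missing = object()  # sentinel: distinguishes "absent" from any real value
    previous = getattr(module, name, missing)
    setattr(module, name, cls)
    try:
        yield module
    finally:
        # Restore the module to its previous state, even if loading raised.
        if previous is missing:
            delattr(module, name)
        else:
            setattr(module, name, previous)


# Stand-ins (hypothetical) for the real package and the remote-code class.
fake_transformers = types.ModuleType("fake_transformers")


class LSGDistilBertForSequenceClassification:
    pass


with temporarily_registered(fake_transformers, LSGDistilBertForSequenceClassification):
    # In real use, the model load call would go here.
    resolved = getattr(fake_transformers, "LSGDistilBertForSequenceClassification")

still_registered = hasattr(fake_transformers, "LSGDistilBertForSequenceClassification")
```

With the real libraries, the assumption is that you would pass `transformers` and the actual imported class, then call `mlflow.transformers.load_model(...)` inside the `with` block. This has not been verified against MLflow 2.8.1.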

rhajou avatar Feb 26 '24 18:02 rhajou