`get_imports` failing to respect conditionals on imports
System Info
- `transformers` version: 4.36.2
- Platform: macOS-13.5.2-arm64-arm-64bit
- Python version: 3.11.7
- Huggingface_hub version: 0.20.2
- Safetensors version: 0.4.1
- Accelerate version: not installed
- Accelerate config: not found
- PyTorch version (GPU?): 2.1.2 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: no
- Using distributed or parallel set-up in script?: no
Who can help?
From git blame: @Wauplin @sgugger
From issue template (it's an LLM): @ArthurZucker @you
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)
Reproduction
Running the snippet below on a MacBook without an Nvidia GPU and transformers==4.36.2 throws an ImportError telling you to pip install flash_attn. However, flash_attn isn't actually a requirement for this model, so something is off here.
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)
Leads to:
File "/Users/user/code/project/venv/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 315, in get_cached_module_file
modules_needed = check_imports(resolved_module_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user/code/project/venv/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 180, in check_imports
raise ImportError(
ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn`
python-BaseException
Investigating this, it seems https://github.com/huggingface/transformers/blob/v4.36.2/src/transformers/dynamic_module_utils.py#L154 is picking up flash_attn from https://github.com/huggingface/transformers/blob/v4.36.2/src/transformers/models/phi/modeling_phi.py#L50-L52. However, if you look at the file, it's within an if statement.
Therein lies the bug: transformers.dynamic_module_utils.get_imports does not respect conditionals placed before imports.
Please see https://huggingface.co/microsoft/phi-1_5/discussions/72 for more info.
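For context, the guarded import in modeling_phi.py has roughly this shape (paraphrased from the linked lines rather than copied verbatim):

```python
from transformers.utils import is_flash_attn_2_available

# flash_attn is only imported when it is actually available, so it should not
# be treated as a hard requirement of the modeling file.
if is_flash_attn_2_available():
    from flash_attn import flash_attn_func, flash_attn_varlen_func
    from flash_attn.bert_padding import index_first_axis, pad_input, unpad_input
```

The regex in get_imports still matches the from flash_attn import ... lines even though they sit inside the if block.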
Expected behavior
My goal is to find some way to avoid monkey-patching get_imports just to remove the spuriously inferred flash_attn dependency.
The most general solution is probably to move get_imports away from regex-searching the source and towards inspect (see here) or some other AST-walking approach. I am pretty sure there is a simple fix here; it just involves moving away from a regex.
For reference, this only happens when trust_remote_code=True. Thus, we switched from using if is_flash_attn_2_available(): to a try/except block when trying to import the flash_attn package.
Seems to be working!
Thanks @gugarosa for finding a workaround; it works because get_imports includes a special regex for try/except: https://github.com/huggingface/transformers/blob/v4.36.2/src/transformers/dynamic_module_utils.py#L149.
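Concretely, the change on the Hub side amounts to something like the following (a sketch based on the description above, not the exact remote-code diff; the imported names are illustrative):

```python
# Sketch: replace the is_flash_attn_2_available() guard with a try/except,
# so get_imports (which strips try/except blocks before scanning for imports)
# treats flash_attn as optional rather than required.
try:
    from flash_attn import flash_attn_func, flash_attn_varlen_func
    from flash_attn.bert_padding import index_first_axis, pad_input, unpad_input
except ImportError:
    pass
```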
For reference, adding the case below to https://github.com/huggingface/transformers/blob/v4.36.2/tests/utils/test_dynamic_module_utils.py will expose the issue:
...
TOP_LEVEL_CONDITIONAL_IMPORT = """
import os
if False:
    import pathlib
"""
...
CASES = [
    ...,
    TOP_LEVEL_CONDITIONAL_IMPORT
]
Looking at the other test cases, I now think a proper fix will involve the ast module, as shown in https://stackoverflow.com/a/42195575.
Note that a generalized import scanner should also take contextlib.suppress into account:
import contextlib
with contextlib.suppress(ImportError):
    from flash_attn import flash_attn_func
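For illustration, a minimal ast-based scanner in the spirit of that Stack Overflow answer might look like the sketch below (this is only a sketch of the idea, not the actual transformers implementation). Because it walks only the module's top-level body, imports nested under if, try/except, or contextlib.suppress are all treated as optional:

```python
import ast


def get_top_level_imports(source: str) -> list[str]:
    """Collect only unconditional, top-level imports from a module's source."""
    tree = ast.parse(source)
    found = []
    # Only the module's direct children are inspected, so anything inside
    # `if`, `try`, `with contextlib.suppress(...)`, functions, or classes
    # is skipped and therefore never reported as a hard requirement.
    for node in tree.body:
        if isinstance(node, ast.Import):
            found.extend(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            found.append(node.module.split(".")[0])
    return sorted(set(found))
```

Run over the TOP_LEVEL_CONDITIONAL_IMPORT case above, this returns only ['os'], which is the behaviour the new test case is after.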
Same problem for [DeepSeek-MoE](https://github.com/deepseek-ai/DeepSeek-MoE).
Fixed by:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
model_name = "/root/models/deepseek-moe-16b-base"
# model_name = "/root/models/Llama-2-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# With Python 3.11.7, transformers==4.36.2
import os
from unittest.mock import patch
from transformers import AutoModelForCausalLM
from transformers.dynamic_module_utils import get_imports
def fixed_get_imports(filename: str | os.PathLike) -> list[str]:
    """Work around for https://huggingface.co/microsoft/phi-1_5/discussions/72."""
    if not str(filename).endswith("/modeling_deepseek.py"):
        return get_imports(filename)
    imports = get_imports(filename)
    imports.remove("flash_attn")
    return imports

with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    # model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True)

model.generation_config = GenerationConfig.from_pretrained(model_name)
model.generation_config.pad_token_id = model.generation_config.eos_token_id
text = "An attention function can be described as mapping a query and a set of key-value pairs to an output, where the query, keys, values, and output are all vectors. The output is"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs.to(model.device), max_new_tokens=100)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
I think all custom models (which need trust_remote_code=True) trigger this problem.
(^ ping @LysandreJik about the trust_remote_code mechanism?)
Yes, the code is here:
Yep have already heard of such feedback! Would you like to open a PR for a fix?
Of course, I will make a PR for the fix.
Here is my pull request: https://github.com/huggingface/transformers/pull/28811
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I think a fix here would be useful, @github-actions, so let's keep it open.
Looks like the issue has been fixed. I'm able to load the model without flash_attn installed.
% python3
Python 3.12.3 (main, Apr 9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1_5", trust_remote_code=True)
>>> import flash_attn
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'flash_attn'
Thanks for sharing @jla524! cc @Rocketknight1 as I think you were working on a related issue?
Hi guys, I encountered the same error with version 4.41.2.
I am confused about which package flash_attn refers to.
I have tried installing both packages (xformers and https://github.com/Dao-AILab/flash-attention), but the problem still exists.
Hi @congchan,
flash_attn refers to the package you linked to, i.e. the one installed when running pip install flash-attn; however, you may need to follow the specific installation instructions for your setup.
If flash attention is properly installed, you should be able to run python -c "import flash_attn; print(flash_attn.__version__)" and see the installed version. If running on CUDA, you'll need version 2.1 or above to run a lot of the modeling code.
I'm on 4.41 and still get ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run pip install flash_attn.
I cannot use flash_attn, but the modeling code should support both cases.
I am using the Florence-2 modeling code.
@lucasjinreal Without full environment info (run transformers-cli env in the terminal and copy-paste the output) and a reproducible code snippet, we won't be able to help you
hey @amyeroberts, I am having this same issue running Florence 2 on Mac. env info:
- `transformers` version: 4.41.2
- Platform: macOS-14.1-arm64-arm-64bit
- Python version: 3.12.4
- Huggingface_hub version: 0.23.4
- Safetensors version: 0.4.3
- Accelerate version: not installed
- Accelerate config: not found
- PyTorch version (GPU?): 2.3.1 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: no
- Using distributed or parallel set-up in script?: no
script in case it helps (copy/pasted from Florence 2 tutorial):
from transformers import AutoProcessor, AutoModelForCausalLM
from PIL import Image
import requests
import copy
import matplotlib.pyplot as plt
import matplotlib.patches as patches
# Load model and processor
device = 'cpu'
model_id = 'microsoft/Florence-2-large-ft'
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).eval().to(device)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
# Define the prediction function
def run_example(task_prompt, text_input=None):
    if text_input is None:
        prompt = task_prompt
    else:
        prompt = task_prompt + text_input
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"].to(device),
        pixel_values=inputs["pixel_values"].to(device),
        max_new_tokens=1024,
        early_stopping=False,
        do_sample=False,
        num_beams=3,
    )
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed_answer = processor.post_process_generation(
        generated_text,
        task=task_prompt,
        image_size=(image.width, image.height)
    )
    return parsed_answer
# Initialize the image
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true"
image = Image.open(requests.get(url, stream=True).raw)
# Run pre-defined tasks without additional inputs
task_prompt = '<CAPTION>'
print(run_example(task_prompt))
task_prompt = '<DETAILED_CAPTION>'
print(run_example(task_prompt))
task_prompt = '<MORE_DETAILED_CAPTION>'
print(run_example(task_prompt))
# Object detection
task_prompt = '<OD>'
results = run_example(task_prompt)
print(results)
def plot_bbox(image, data):
    fig, ax = plt.subplots()
    ax.imshow(image)
    for bbox, label in zip(data['bboxes'], data['labels']):
        x1, y1, x2, y2 = bbox
        rect = patches.Rectangle((x1, y1), x2-x1, y2-y1, linewidth=1, edgecolor='r', facecolor='none')
        ax.add_patch(rect)
        plt.text(x1, y1, label, color='white', fontsize=8, bbox=dict(facecolor='red', alpha=0.5))
    ax.axis('off')
    plt.show()
plot_bbox(image, results['<OD>'])
Hi @derickmr, thanks for sharing this snippet and env information!
As there are a few different behaviours being reported, I just want to confirm the issue you're experiencing: is it that the snippet does not run if flash attention isn't installed in the environment (conditionals are respected when running from the Hub); or is it that even when flash attention is installed, you're still prompted to install it?