
filter flash_attn optional imports loading remote code

Open eaidova opened this issue 1 year ago • 2 comments

What does this PR do?

In some models that ship remote code, the flash_attn module is imported optionally, but not inside a try-except block. Examples:

  • phi3-vision - https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/blob/main/modeling_phi3_v.py#L52
  • orion-14b - https://huggingface.co/OrionStarAI/Orion-14B-Chat/blob/main/modeling_orion.py#L36
  • deepseek-moe - https://huggingface.co/deepseek-ai/deepseek-moe-16b-base/blob/main/modeling_deepseek.py#L54
  • nanoLLAVA - https://huggingface.co/qnguyen3/nanoLLaVA/blob/main/modeling_llava_qwen2.py#L861

Loading such a model with the trust_remote_code flag fails in an environment where the flash_attn package is not installed. Installing and importing this package can be problematic in some environments (e.g. environments without CUDA where torch is installed for CPU only). This PR updates the dependency search logic used to check imports when loading dynamic modules.
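To illustrate the idea, here is a minimal sketch of a dependency scan that ignores guarded imports. It is not the code in this PR: the function name `required_top_level_imports` and the sample source are hypothetical, and it uses the standard-library `ast` module. It assumes that only imports written as module-level statements are hard requirements, so anything nested under `if`, `try`, or a function body is treated as optional.

```python
import ast


def required_top_level_imports(source: str) -> list[str]:
    """Return the top-level packages imported unconditionally by `source`.

    Hypothetical helper for illustration only. Imports nested inside
    `if`, `try`, functions, or classes are skipped on the assumption
    that they are optional.
    """
    tree = ast.parse(source)
    packages = set()
    # Look only at statements directly in the module body.
    for node in tree.body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                packages.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import (e.g. `from .config import X`),
            # which refers to a sibling file rather than an installed package.
            if node.level == 0 and node.module:
                packages.add(node.module.split(".")[0])
    return sorted(packages)


# Made-up modeling file that mimics the pattern described above.
SAMPLE_SOURCE = '''
import torch
from transformers.utils import is_flash_attn_2_available

if is_flash_attn_2_available():
    from flash_attn import flash_attn_func

try:
    import apex
except ImportError:
    apex = None

from .configuration import Config
'''

print(required_top_level_imports(SAMPLE_SOURCE))
```

With this sample, `flash_attn` and `apex` are not reported as required, so a check for missing packages would not fail in an environment where they are absent.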

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline, Pull Request section?
  • [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

eaidova avatar May 22 '24 08:05 eaidova

cc @Rocketknight1 as I think you were working on a related issue

amyeroberts avatar May 22 '24 18:05 amyeroberts

@amyeroberts @Rocketknight1 do you have any good news for us here?

andrei-kochin avatar Jun 03 '24 12:06 andrei-kochin

@amyeroberts @Rocketknight1 Hi! Any updates on this? We are working on improving SDPA operation support on the OpenVINO side and are using these models to test our changes:

  • phi3-vision - https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/blob/main/modeling_phi3_v.py#L52
  • orion-14b - https://huggingface.co/OrionStarAI/Orion-14B-Chat/blob/main/modeling_orion.py#L36
  • deepseek-moe - https://huggingface.co/deepseek-ai/deepseek-moe-16b-base/blob/main/modeling_deepseek.py#L54
  • nanoLLAVA - https://huggingface.co/qnguyen3/nanoLLaVA/blob/main/modeling_llava_qwen2.py#L861

But unfortunately we encountered the same issue as described in the ticket. Do you have any plans to merge the fix?

itikhono avatar Jul 02 '24 12:07 itikhono

Gentle ping @Rocketknight1

amyeroberts avatar Jul 22 '24 16:07 amyeroberts

Ping me if you need help to fix the CI / a review 🤗

ArthurZucker avatar Aug 06 '24 14:08 ArthurZucker

@ArthurZucker thank you for the review. I fixed the code style CI issue, but I have no idea about the flax test failures (I do not think they are related to my changes).

eaidova avatar Aug 07 '24 09:08 eaidova

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@eaidova thanks for committing the suggestion, merging now!

Rocketknight1 avatar Aug 08 '24 16:08 Rocketknight1