
[Model] New model support for Phi-4-multimodal-instruct

Open congcongchen123 opened this pull request 10 months ago • 2 comments

New model for https://huggingface.co/microsoft/Phi-4-multimodal-instruct/tree/main

Co-authors: Jacob Platin and Vadim Mazalov for the speech encoder, and Yen-Chun Chen for the vision encoder.

FIX #13936
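
For anyone waiting to try the model, a rough offline-inference sketch is below. The trust_remote_code flag and the Phi-style chat markers are assumptions taken from the Hugging Face model card rather than anything finalized in this PR:

# Rough sketch only: text-only offline inference once this PR is merged.
# The chat markers (<|user|>, <|end|>, <|assistant|>) and trust_remote_code
# are assumptions from the Hugging Face model card, not part of this PR.
from vllm import LLM, SamplingParams

llm = LLM(model="microsoft/Phi-4-multimodal-instruct", trust_remote_code=True)
prompt = "<|user|>Describe what multimodal support adds to Phi-4.<|end|><|assistant|>"
outputs = llm.generate([prompt], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)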

congcongchen123 avatar Mar 03 '25 07:03 congcongchen123

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

github-actions[bot] avatar Mar 03 '25 07:03 github-actions[bot]

Please update the tests as described here: https://docs.vllm.ai/en/latest/contributing/model/tests.html

DarkLight1337 avatar Mar 03 '25 07:03 DarkLight1337

Hi @DarkLight1337, I have addressed all the new comments; could you please approve this PR if it looks good to you? I would like to ship it as soon as possible, since people are waiting to try out Phi-4-multimodal-instruct on vLLM.

congcongchen123 avatar Mar 04 '25 20:03 congcongchen123

BTW @ywang96, should I merge the latest main into my branch one more time to make it ready to ship?

congcongchen123 avatar Mar 04 '25 21:03 congcongchen123

> BTW @ywang96, should I merge the latest main into my branch one more time to make it ready to ship?

Yep that would be great!

ywang96 avatar Mar 04 '25 21:03 ywang96

Hi @ywang96, the buildkite/ci/pr/basic-models-test test failed because of the following check in vllm/model_executor/models/vision_siglip_navit.py:

self.attn_backend: _Backend = get_vit_attn_backend(support_fa=True)
if self.attn_backend != _Backend.FLASH_ATTN:
    raise RuntimeError(
        "Phi-4-multimodal-instruct model does not support"\
            f" {self.attn_backend} backend now."
    )

Looks like the testing container does not have the flash_attn package installed, and flash_attn is required by vllm/model_executor/models/vision_siglip_navit.py. What is the suggested way to fix that? Thanks!


To answer my own question: as a workaround, maybe I could log an error message instead of raising a RuntimeError. We will come back to this in follow-up PRs.
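
For illustration, one possible shape of that workaround is sketched below; the helper name and the importlib probe are my own assumptions, and only the _Backend comparison mirrors the snippet above:

# Sketch of the "log instead of raise" idea, plus a probe for the flash_attn
# package. The helper name and the importlib check are illustrative
# assumptions; only the _Backend comparison mirrors the check quoted above.
import importlib.util

from vllm.logger import init_logger

logger = init_logger(__name__)

def warn_if_flash_attn_unusable(attn_backend, flash_attn_backend) -> bool:
    """Return True only if FLASH_ATTN is selected and flash_attn is importable."""
    if attn_backend != flash_attn_backend:
        logger.warning(
            "Phi-4-multimodal-instruct currently expects the FLASH_ATTN "
            "backend but got %s; continuing without it.", attn_backend)
        return False
    if importlib.util.find_spec("flash_attn") is None:
        logger.warning(
            "flash_attn is not installed; the vision encoder will not use "
            "the FLASH_ATTN backend.")
        return False
    return True

Inside the encoder this would be called as warn_if_flash_attn_unusable(self.attn_backend, _Backend.FLASH_ATTN) in place of the raise.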

congcongchen123 avatar Mar 04 '25 23:03 congcongchen123

> Hi @ywang96, the buildkite/ci/pr/basic-models-test test failed because of the following check in vllm/model_executor/models/vision_siglip_navit.py:
>
> self.attn_backend: _Backend = get_vit_attn_backend(support_fa=True)
> if self.attn_backend != _Backend.FLASH_ATTN:
>     raise RuntimeError(
>         "Phi-4-multimodal-instruct model does not support"\
>             f" {self.attn_backend} backend now."
>     )
>
> Looks like the testing container does not have the flash_attn package installed, and flash_attn is required by vllm/model_executor/models/vision_siglip_navit.py. What is the suggested way to fix that? Thanks!
>
> To answer my own question: as a workaround, maybe I could log an error message instead of raising a RuntimeError. We will come back to this in follow-up PRs.

        # Detect attention implementation.
        self.attn_backend: _Backend = get_vit_attn_backend(support_fa=True)
        if self.attn_backend != _Backend.FLASH_ATTN:
            # Only print out errors for now to make ci/pr/basic-models-test
            # happy since the testing environment does not have flash_attn
            # installed.
            logger.error("Phi-4-multimodal-instruct model does not support "\
                         "%s backend now.", self.attn_backend)

This snippet was added in the wrong place, and I just corrected it.

ywang96 avatar Mar 04 '25 23:03 ywang96

The kernel test failure looks unrelated to this PR. Is there a way to bypass it and merge this PR? If not, what steps should I take?

congcongchen123 avatar Mar 05 '25 04:03 congcongchen123

It is a known failure on main; I have force-merged this PR.

DarkLight1337 avatar Mar 05 '25 04:03 DarkLight1337

Got it! Thanks @DarkLight1337!

congcongchen123 avatar Mar 05 '25 05:03 congcongchen123