feat: Add support for Phi-4-mini and Phi-4-MM
This PR adds support for the Phi-4-mini and Phi-4-multimodal models.
/bot run
PR_Github #210 [ run ] triggered by Bot
PR_Github #210 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #222 completed with status: 'SUCCESS'
Hi @amukkara @symphonylyh, I'm descoping this PR to include only the changes for the Phi-4-mini model. The changes for Phi-4-MM will be handled in this PR: https://github.com/NVIDIA/TensorRT-LLM/pull/3177
I've addressed all your comments (including those related to the multimodal model). However, the current Phi-4-MM changes need some overhauling (using a PyTorch implementation for the encoders, as with Phi-3 Vision, instead of a TRT engine), so I felt separating the two out is cleaner than blocking one on the other.
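For context, here is a minimal sketch of the "PyTorch encoder, TRT-LLM decoder" split described above. A CLIP vision tower is used purely as a stand-in for the Phi-4-MM image encoder (an assumption for illustration; the real model, projection layers, and how embeddings are handed to the decoder differ in the actual change):

```python
# Sketch: run the vision encoder eagerly in PyTorch instead of
# building a separate TRT engine for it. CLIP stands in for the
# Phi-4-MM image encoder here; this is not the PR's implementation.
import torch
from PIL import Image
from transformers import CLIPVisionModel, CLIPImageProcessor

encoder = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (336, 336))  # placeholder image
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    # Vision features computed in PyTorch; in a multimodal pipeline these
    # would then be fed to the LLM decoder as extra (prompt-table) embeddings.
    vision_features = encoder(pixel_values).last_hidden_state

print(vision_features.shape)  # (1, num_patches + 1, hidden_size)
```

The point of the split is that the encoder stays flexible (no engine rebuild when preprocessing or the vision tower changes), while the decoder keeps running through TensorRT-LLM.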
/bot run
PR_Github #804 [ run ] triggered by Bot
/bot run
PR_Github #805 [ run ] triggered by Bot
PR_Github #804 [ run ] completed with state ABORTED
PR_Github #805 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #651 completed with status: 'SUCCESS'
/bot run
PR_Github #917 [ run ] triggered by Bot
PR_Github #917 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #724 completed with status: 'SUCCESS'
/bot reuse-pipeline
PR_Github #944 [ reuse-pipeline ] triggered by Bot
PR_Github #944 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #917 for commit 179d06d