[Bug] Inference - Phi-4 mini instruct

Open j0h0k0i0m opened this issue 8 months ago • 0 comments

I previously raised this issue on MLC LLM, but the root cause appears to lie in PagedKVCache. The recently released Phi-4-mini-instruct introduces a partial_rotary_factor variable, which leads to a dimension mismatch. Manually adjusting rope_ext_factors lets inference proceed, but the output is garbage, so I am reporting the issue here. Is there any way to resolve it?

Expected behavior

What you were expecting

Actual behavior

Traceback (most recent call last):
  File "/opt/anaconda3/envs/mlc/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/opt/anaconda3/envs/mlc/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "tvm/_ffi/_cython/./packed_func.pxi", line 339, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 270, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./packed_func.pxi", line 259, in tvm._ffi._cy3.core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 185, in tvm._ffi._cy3.core.CHECK_CALL
  File "/opt/anaconda3/envs/mlc/lib/python3.11/site-packages/tvm/_ffi/base.py", line 465, in raise_last_ffi_error
    raise py_err
tvm._ffi.base.TVMError: TVMError: Assert fail: T.Cast("int32", fused_rope_longrope_scaling_ext_factors_handle_shape[0]) == 64, Argument fused_rope_longrope_scaling.ext_factors_handle.shape[0] has an unsatisfied constraint: 64 == T.Cast("int32", fused_rope_longrope_scaling_ext_factors_handle_shape[0])
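
One possible reading of the assertion, sketched under assumptions: the fused_rope_longrope_scaling kernel appears to expect head_dim / 2 extension factors (the 64 in the message), while a model with partial rotary embeddings would only supply rotary_dim / 2 of them. The head_dim of 128 and partial_rotary_factor of 0.75 used below are assumed values for Phi-4-mini-instruct and are not taken from this report; the helper name is hypothetical and is not part of TVM or MLC LLM.

```python
def rope_ext_factor_lengths(head_dim: int, partial_rotary_factor: float) -> tuple[int, int]:
    """Return (length a full-rotary kernel would expect, length a partial-rotary model supplies).

    Hypothetical helper for illustration only; not a TVM or MLC LLM API.
    """
    # Only a fraction of each head is rotated when partial_rotary_factor < 1.
    rotary_dim = int(head_dim * partial_rotary_factor)
    # RoPE scaling factors are stored per rotated pair, so half of each dimension.
    return head_dim // 2, rotary_dim // 2


# Assumed values for Phi-4-mini-instruct (not confirmed by this report).
expected, provided = rope_ext_factor_lengths(head_dim=128, partial_rotary_factor=0.75)
print(f"kernel expects {expected} ext factors, model provides {provided}")
```

If these assumptions hold, padding rope_ext_factors up to the expected length would satisfy the shape check without changing how many dimensions the kernel rotates, which might explain the garbage output described above.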

Environment

Any environment details, such as: Operating System, TVM version, etc

Steps to reproduce

Preferably a minimal script to cause the issue to occur.

Triage

Please refer to the list of label tags here to find the relevant tags and add them below in a bullet format (example below).

  • needs-triage

j0h0k0i0m, Mar 07 '25 02:03