
whisper failed on ARC B580

Open aitss2017 opened this issue 8 months ago • 3 comments

Describe the bug
cannot import name 'top_k_top_p_filtering' from 'trl.core'

How to reproduce
Steps to reproduce the error:

  1. check out commit 45f7bf6688abbd0beb12aba36ad51b14 of ipex-llm
  2. start intelanalytics/multi-arc-serving:0.2.0-b1 docker image
  3. cd ipex-llm/python/llm/dev/benchmark/whisper
  4. python run_whisper.py /llm/model/whisper-medium --data_type other --device xpu

Screenshots

(screenshot of the ImportError attached to the original issue)

Environment information
root@b580:/llm/models/ipex-llm/python/llm/scripts# bash env-check.sh

PYTHON_VERSION=3.11.12

[W513 01:43:54.591534205 OperatorEntry.cpp:154] Warning: Warning only once for all operators, other operators may also be overridden. Overriding a previously registered kernel for the same operator and the same dispatch key operator: aten::_validate_compressed_sparse_indices(bool is_crow, Tensor compressed_idx, Tensor plain_idx, int cdim, int dim, int nnz) -> () registered at /pytorch/build/aten/src/ATen/RegisterSchema.cpp:6 dispatch key: XPU previous kernel: registered at /pytorch/build/aten/src/ATen/RegisterCPU.cpp:30477 new kernel: registered at /build/intel-pytorch-extension/build/Release/csrc/gpu/csrc/aten/generated/ATen/RegisterXPU.cpp:468 (function operator())
(same OperatorEntry warning repeated at 01:43:55)
transformers=4.51.3

[W513 01:44:01.254327982 OperatorEntry.cpp:154] Warning: Warning only once for all operators, other operators may also be overridden. Overriding a previously registered kernel for the same operator and the same dispatch key operator: aten::_validate_compressed_sparse_indices(bool is_crow, Tensor compressed_idx, Tensor plain_idx, int cdim, int dim, int nnz) -> () registered at /pytorch/build/aten/src/ATen/RegisterSchema.cpp:6 dispatch key: XPU previous kernel: registered at /pytorch/build/aten/src/ATen/RegisterCPU.cpp:30477 new kernel: registered at /build/intel-pytorch-extension/build/Release/csrc/gpu/csrc/aten/generated/ATen/RegisterXPU.cpp:468 (function operator())
(same OperatorEntry warning repeated at 01:44:04)
torch=2.6.0+xpu

ipex-llm Version: 2.3.0b20250427

[W513 01:44:10.333946657 OperatorEntry.cpp:154] Warning: Warning only once for all operators, other operators may also be overridden. Overriding a previously registered kernel for the same operator and the same dispatch key operator: aten::_validate_compressed_sparse_indices(bool is_crow, Tensor compressed_idx, Tensor plain_idx, int cdim, int dim, int nnz) -> () registered at /pytorch/build/aten/src/ATen/RegisterSchema.cpp:6 dispatch key: XPU previous kernel: registered at /pytorch/build/aten/src/ATen/RegisterCPU.cpp:30477 new kernel: registered at /build/intel-pytorch-extension/build/Release/csrc/gpu/csrc/aten/generated/ATen/RegisterXPU.cpp:468 (function operator())
(same OperatorEntry warning repeated at 01:44:12)
ipex=2.6.10+xpu

CPU Information:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 20
On-line CPU(s) list: 0-19
Vendor ID: GenuineIntel
Model name: 12th Gen Intel(R) Core(TM) i7-12700
CPU family: 6
Model: 151
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 1
Stepping: 2
CPU(s) scaling MHz: 50%
CPU max MHz: 4900.0000
CPU min MHz: 800.0000

Total CPU Memory: 61.317 GB

Operating System: Ubuntu 24.04.1 LTS


Linux b580 6.11.0-25-generic #25-Ubuntu SMP PREEMPT_DYNAMIC Fri Apr 11 23:29:18 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

env-check.sh: line 148: xpu-smi: command not found

env-check.sh: line 154: clinfo: command not found

Driver related package version:
ii intel-level-zero-gpu 1.6.32961.7 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii intel-level-zero-gpu-dbgsym 1.6.32961.7 amd64 debug symbols for intel-level-zero-gpu

igpu detected
[level_zero:gpu][level_zero:1] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.6.32961.700000]
[opencl:gpu][opencl:2] Intel(R) OpenCL Graphics, Intel(R) UHD Graphics 770 OpenCL 3.0 NEO [25.09.32961.7]

xpu-smi is not installed. Please install xpu-smi according to README.md


aitss2017 avatar May 12 '25 01:05 aitss2017

Hi, this should be an issue caused by dependency versions; you may try trl==0.11.0, which is the version we recommend and test.

Similar issue: https://github.com/intel/ipex-llm/issues/13087#issue-3001412276

hkvision avatar May 12 '25 02:05 hkvision
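For context, `top_k_top_p_filtering` was dropped from `trl.core` in releases after the 0.11.x line, which is why newer trl versions raise the ImportError above. The helper implements standard top-k / top-p (nucleus) filtering; below is a minimal stdlib sketch of that behavior, as an illustration only, not the trl implementation:

```python
import math

def top_k_top_p_filter(logits, top_k=0, top_p=1.0, filter_value=-math.inf):
    """Illustrative top-k / nucleus filtering over a plain list of logits.

    Logits outside the top-k set, or outside the smallest prefix of the
    sorted distribution whose softmax mass exceeds top_p, are replaced
    with filter_value so they can never be sampled.
    """
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    keep = set(order)
    if top_k > 0:
        keep &= set(order[:top_k])
    if top_p < 1.0:
        # softmax over the sorted logits, shifted by the max for stability
        m = logits[order[0]]
        exps = [math.exp(logits[i] - m) for i in order]
        total = sum(exps)
        nucleus, cum = set(), 0.0
        for i, e in zip(order, exps):
            nucleus.add(i)  # always keep at least the top token
            cum += e / total
            if cum > top_p:
                break
        keep &= nucleus
    return [x if i in keep else filter_value for i, x in enumerate(logits)]
```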

Another error after installing trl==0.11.0:

root@b580:/llm/models/ipex-llm/python/llm/dev/benchmark/whisper# python run_whisper.py --model_path /llm/models/whisper-medium --data_type other --device xpu
[W513 02:47:24.597734290 OperatorEntry.cpp:154] Warning: Warning only once for all operators, other operators may also be overridden. Overriding a previously registered kernel for the same operator and the same dispatch key operator: aten::_validate_compressed_sparse_indices(bool is_crow, Tensor compressed_idx, Tensor plain_idx, int cdim, int dim, int nnz) -> () registered at /pytorch/build/aten/src/ATen/RegisterSchema.cpp:6 dispatch key: XPU previous kernel: registered at /pytorch/build/aten/src/ATen/RegisterCPU.cpp:30477 new kernel: registered at /build/intel-pytorch-extension/build/Release/csrc/gpu/csrc/aten/generated/ATen/RegisterXPU.cpp:468 (function operator())
(same OperatorEntry warning repeated at 02:47:25)
Repo card metadata block was not found. Setting CardData to empty.
2025-05-13 02:47:25,778 - huggingface_hub.repocard - WARNING - Repo card metadata block was not found. Setting CardData to empty.
2025-05-13 02:47:26,324 - ipex_llm.transformers.utils - WARNING - sym_int4 is deprecated, use woq_int4 instead, if you are loading saved sym_int4 low bit model, please resaved it with woq_int4
2025-05-13 02:47:26,324 - ipex_llm.transformers.utils - INFO - Converting the current model to woq_int4 format......
INFO 05-13 02:47:27 init.py:180] Automatically detected platform xpu.
Map:   0%| | 0/500 [00:00<?, ? examples/s]
/usr/local/lib/python3.11/dist-packages/transformers/models/whisper/tokenization_whisper.py:503: UserWarning: The private method _normalize is deprecated and will be removed in v5 of Transformers. You can normalize an input string using the Whisper English normalizer using the normalize method.
  warnings.warn(
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
onednn_verbose,v1,info,oneDNN v3.7.0 (commit 83033303c072c3b18e070d1abfedfd1f50248eac)
onednn_verbose,v1,info,cpu,runtime:threadpool,nthr:10
onednn_verbose,v1,info,cpu,isa:Intel AVX2 with Intel DL Boost
onednn_verbose,v1,info,gpu,runtime:DPC++
onednn_verbose,v1,info,gpu,engine,sycl gpu device count:2
onednn_verbose,v1,info,gpu,engine,0,backend:Level Zero,name:Intel(R) Arc(TM) B580 Graphics,driver_version:1.6.32961,binary_kernels:enabled
onednn_verbose,v1,info,gpu,engine,1,backend:Level Zero,name:Intel(R) UHD Graphics 770,driver_version:1.6.32961,binary_kernels:enabled
onednn_verbose,v1,info,graph,backend,0:dnnl_backend
onednn_verbose,v1,info,experimental features are enabled
onednn_verbose,v1,info,use batch_normalization stats one pass is enabled
onednn_verbose,v1,info,GPU convolution v2 is disabled
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,implementation,backend,exec_time
onednn_verbose,v1,common,error,runtime,device not found in the given context,src/xpu/sycl/engine_factory.cpp:66
Map:   0%| | 0/500 [00:10<?, ? examples/s]
Traceback (most recent call last):
  File "/llm/models/ipex-llm/python/llm/dev/benchmark/whisper/run_whisper.py", line 74, in <module>
    result = speech_dataset.map(map_to_pred, keep_in_memory=True)
  File "/usr/local/lib/python3.11/dist-packages/datasets/arrow_dataset.py", line 557, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/datasets/arrow_dataset.py", line 3074, in map
    for rank, done, content in Dataset._map_single(**dataset_kwargs):
  File "/usr/local/lib/python3.11/dist-packages/datasets/arrow_dataset.py", line 3492, in _map_single
    for i, example in iter_outputs(shard_iterable):
  File "/usr/local/lib/python3.11/dist-packages/datasets/arrow_dataset.py", line 3466, in iter_outputs
    yield i, apply_function(example, i, offset=offset)
  File "/usr/local/lib/python3.11/dist-packages/datasets/arrow_dataset.py", line 3389, in apply_function
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/llm/models/ipex-llm/python/llm/dev/benchmark/whisper/run_whisper.py", line 61, in map_to_pred
    predicted_ids = model.generate(input_features.to(args.device), forced_decoder_ids=forced_decoder_ids, use_cache=True)[0]
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/whisper/generation_whisper.py", line 774, in generate
    ) = self.generate_with_fallback(
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/whisper/generation_whisper.py", line 950, in generate_with_fallback
    seek_outputs = super().generate(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/ipex_llm/transformers/lookup.py", line 125, in generate
    return original_generate(self,
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/ipex_llm/transformers/speculative.py", line 127, in generate
    return original_generate(self,
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/ipex_llm/transformers/pipeline_parallel.py", line 283, in generate
    return original_generate(self,
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py", line 2280, in generate
    model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(
  File "/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py", line 778, in _prepare_encoder_decoder_kwargs_for_generation
    model_kwargs["encoder_outputs"]: ModelOutput = encoder(**encoder_kwargs)  # type: ignore
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/transformers/models/whisper/modeling_whisper.py", line 1029, in forward
    inputs_embeds = nn.functional.gelu(self.conv1(input_features))
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/conv.py", line 375, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/conv.py", line 370, in _conv_forward
    return F.conv1d(
RuntimeError: could not create an engine

aitss2017 avatar May 12 '25 02:05 aitss2017
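The traceback above bottoms out in `F.conv1d` on the xpu device, so the failure can be isolated from the benchmark entirely. A minimal sketch (the shapes only approximate whisper-medium's first encoder conv, 80 mel bins to 1024 channels with kernel size 3; it falls back to cpu where no XPU build is present):

```python
# Isolate the failing op: on a healthy xpu stack this runs on the B580;
# on the machine above it should reproduce "could not create an engine"
# without involving run_whisper.py at all.
import torch
import torch.nn.functional as F

xpu_ok = getattr(torch, "xpu", None) is not None and torch.xpu.is_available()
device = "xpu" if xpu_ok else "cpu"  # fall back so the snippet runs anywhere

x = torch.randn(1, 80, 300, device=device)   # (batch, mel bins, frames)
w = torch.randn(1024, 80, 3, device=device)  # (out ch, in ch, kernel)
y = F.conv1d(x, w, padding=1)                # same conv1d the traceback dies in
print(device, tuple(y.shape))
```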

Synced offline; the error occurs when running plain torch operations on XPU. The user will try running ipex on this machine to further check the environment.

hkvision avatar May 12 '25 07:05 hkvision
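The environment check mentioned above can start with a few lines of plain torch. This sketch (assuming the torch 2.6+xpu build from the container) lists the XPU devices torch can see; since oneDNN reported two GPUs (the B580 and the UHD 770 iGPU), restricting visibility to one card, e.g. with `ONEAPI_DEVICE_SELECTOR=level_zero:0` set before launching Python, is one way to rule out mixed-device context issues:

```python
import torch

def xpu_devices():
    """Names of XPU devices torch can see, or [] on a CPU-only build."""
    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return []
    return [xpu.get_device_name(i) for i in range(xpu.device_count())]

# On the machine above this should list the Arc B580 (and the iGPU if visible);
# an empty list means torch itself cannot see the card, pointing at the
# driver/runtime rather than the whisper benchmark.
print(xpu_devices())
```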