Sergey Shlyapnikov
Hi @liuxingbin, can you share how you are running vLLM? Did you try setting a lower max_model_length value? We assume there is enough GPU memory to run max_model_length tokens...
Hi @awayzjj, thank you for checking the issue! Let me add more details. The issue is related to an incorrect performance profiling report for the IF operation. It is...
Hi @JulienMaille, Could you please share the installed GPU driver version? Also, could you please check if the issue can be reproduced using [benchmark_app](https://docs.openvino.ai/nightly/get-started/learn-openvino/openvino-samples/benchmark-tool.html#examples-of-running-the-tool) tool?
By the way, the current version handles dynamism by recompiling the kernel for each new dynamic shape configuration. However, we could support a shape-agnostic kernel version that is compiled once...
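To illustrate the trade-off described above, here is a minimal, hypothetical sketch (not actual OpenVINO GPU plugin code; the class and method names are made up for illustration) contrasting per-shape kernel recompilation with a shape-agnostic kernel compiled once:

```python
class KernelCache:
    """Toy model of a GPU plugin's compiled-kernel cache."""

    def __init__(self):
        self.compile_count = 0  # how many times we paid the compilation cost
        self._cache = {}

    def get_shape_specific(self, shape):
        # Current approach: one compilation per distinct dynamic
        # shape configuration encountered at runtime.
        if shape not in self._cache:
            self.compile_count += 1  # compile a kernel specialized for `shape`
            self._cache[shape] = f"kernel_for_{shape}"
        return self._cache[shape]

    def get_shape_agnostic(self):
        # Alternative approach: compile once; actual shapes are passed
        # to the kernel as runtime arguments instead of being baked in.
        if "agnostic" not in self._cache:
            self.compile_count += 1
            self._cache["agnostic"] = "shape_agnostic_kernel"
        return self._cache["agnostic"]
```

With the shape-specific path, every new shape configuration triggers another compilation; the shape-agnostic path compiles once regardless of how many shapes are seen, trading some per-shape specialization for a stable first-inference latency.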
@xipingyan, can you please check the CI test failures?
```
ov_gpu_func_tests-0 INFO: FAILED TESTS (1/39269):
ov_gpu_func_tests-0 INFO: 2909 ms: ov_gpu_func_tests smoke_CustomOpDynamic.Accuracy
```
@AKochin , @dmitry-gorokhov, could you please review the changes from Transformations and CPU sides?
Hi @WoosukKwon, could you please take a look at these changes?
@mgoin, thank you for your comments! I [applied them](https://github.com/vllm-project/vllm/pull/8192/commits/1723d77e7352d7138b14d1427cc16f1987ef5761) and rebased the branch on top of the recent main, please take a look
@Kotomi-Du, how about the following implementation?
1) Keep the existing order of allocations and memory reuse for the sum post-op
2) Move the logic related to onednn impls node memory...