openvino issues

[CPU] dnn3.8 test

### Details: - *item1* - *...* ### Tickets: - *ticket-id*

azhai219

category: CPU

category: build

category: CI

category: dependency_changes

do_not_review

github_actions

category: dockerfiles

Bump paddlepaddle from 2.6.2 to 3.0.0 in /tests

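Bumps [paddlepaddle](https://github.com/paddlepaddle/paddle) from 2.6.2 to 3.0.0. Release notes, sourced from paddlepaddle's releases. PaddlePaddle 3.0.0 Release Note (Full Chinese Version | English Version). Overview of the official PaddlePaddle 3.0 release: As China's first independently developed, industrial-grade deep learning platform, PaddlePaddle has consistently followed an open-source path and supported the intelligent upgrading of industry. PaddlePaddle framework 3.0 not only carries forward the 2.0 series' unified dynamic and static graphs and its integrated training and inference, but also achieves breakthroughs in automatic parallelism, the neural network compiler, and higher-order automatic differentiation. It provides strong support for technical innovation and industrial application in the era of large models and gives developers a one-stop, high-performance deep learning development experience. Whether for cutting-edge algorithm research or for deploying industrial-grade large models, PaddlePaddle framework 3.0 aims to be the developer's tool of choice. The key features are as follows. Unified dynamic-static automatic parallelism: this feature substantially lowers the cost of industrial development and training. Users only need to add a small number of tensor sharding annotations to single-card code; the framework then automatically derives the distributed sharding information and inserts communication operators to ensure logical correctness. In addition, based on the model structure and cluster information, and combined with optimizations at the GPU memory and scheduling layers, PaddlePaddle can automatically search for the most efficient distributed parallel strategy, which greatly reduces the development cost of hybrid parallel training and lets developers focus on model and algorithm innovation. The automatic parallel architecture has been thoroughly validated and polished to better support the pre-training plus fine-tuning workflow for common large-model scenarios such as text-only dense models, text-only sparse models (MoE), and multimodal understanding models. The operator sharding derivation rules have been completed, and automatic-parallel training parameters can be converted into manual-parallel parameters for downstream inference. Automatic parallelism has reached a fully usable state, helping users reduce the development cost of large-model parallel programs. To further simplify the distributed development workflow, a new paddle.distributed.parallel interface has been introduced. Built on an encapsulation of the distributed tensor annotation syntax, it lets users configure common parallel strategies such as data parallelism, model parallelism, and pipeline parallelism non-intrusively, outside the model definition. Furthermore, the static-graph automatic parallel architecture has been fully upgraded on top of PIR: the underlying basic components, core modules, parallel strategies, and performance optimization strategies are all implemented uniformly on the extended PIR DistDialect, which further strengthens the dynamic-static consistency of automatic parallelism, and on Llama-series models its performance matches or even exceeds manual parallelism. Integrated training and inference for large models: since version 2.0, PaddlePaddle has followed the design philosophy of "unified dynamic and static graphs, integrated training and inference", and version 3.0 continues to uphold it. Thanks to the unified architecture and interface design, PaddlePaddle fully supports both the dynamic-graph and static-graph execution modes and has excellent whole-graph export capability. PaddlePaddle's dynamic-to-static whole-graph export success rate is as high as 95%, higher than PyTorch's 62%. "Integrated training and inference" means that, within a single framework, training and inference code can be reused as much as possible, especially the model definition code. Once a model has been developed and trained, only a small amount of additional work is needed for fast inference deployment. This gives industry an excellent development experience: training and inference capabilities can be reused by each other, providing a unified development experience and excellent training efficiency across the full large-model workflow. Through dynamic-to-static conversion, training and inference connect seamlessly. Multiple mainstream large models are supported, and the full version of DeepSeek-R1 can be deployed on a single machine with throughput doubled. Higher-order differentiation for scientific computing: PaddlePaddle framework 3.0 provides scientific computing with higher-order automatic differentiation, compiler optimization, and distributed training capabilities. Experiments on 41 different equations from NVIDIA Modulus show that PaddlePaddle solves differential equations faster on average than PyTorch with compiler optimizations enabled...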

dependabot[bot]

ExternalPR

category: dependency_changes

python

dependencies

[Draft] [CPU] Improve performance of top-1 stable sort in the innermost dimension

### Details: - *Improve performance of top-1 stable sort in the innermost dimension for Topk node.* ### Tickets: - *[CVS-160564](https://jira.devtools.intel.com/browse/CVS-160564)*

xuchen-intel

category: CPU
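The PR above does not describe how it speeds up the top-1 case, so the following is only a minimal Python sketch of the general idea behind such an optimization, not the CPU plugin's code. It assumes that "stable" means ties resolve to the lowest index, and it shows that for k = 1 a single linear scan returns the same result as a full stable sort. The function names `top1_stable` and `top1_stable_reference` are hypothetical.

```python
from typing import Sequence, Tuple


def top1_stable(row: Sequence[float], largest: bool = True) -> Tuple[float, int]:
    """Return (value, index) of the top-1 element of one innermost-dimension row.

    Ties go to the lowest index, which matches what a stable sort followed by
    taking the first element would produce (an assumption about "stable").
    """
    if not row:
        raise ValueError("row must not be empty")
    best_idx = 0
    for i in range(1, len(row)):
        # Strict comparison keeps the earliest index when values are equal.
        better = row[i] > row[best_idx] if largest else row[i] < row[best_idx]
        if better:
            best_idx = i
    return row[best_idx], best_idx


def top1_stable_reference(row: Sequence[float], largest: bool = True) -> Tuple[float, int]:
    """Reference: full stable sort of the indices, then take the first one."""
    order = sorted(range(len(row)), key=lambda i: -row[i] if largest else row[i])
    return row[order[0]], order[0]
```

The linear scan does O(n) work per row, whereas the reference sorts the whole row only to discard all but one element.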

Xp/qwen2 5 vl

### Details: - *item1* - *...* ### Tickets: - *ticket-id*

xipingyan

category: Core

category: GPU

category: Python API

category: transformations

do not merge

do_not_review

category: CPP API

do_not_merge

[GPU][QWen2-VL][QWen2.5-VL] improve SDPA performance with cu_seqlens and cu_window_seqlens

### Details: - *item1* - *...* ### Tickets: - *[168519](https://jira.devtools.intel.com/browse/CVS-168519)* Should work along with - https://github.com/openvinotoolkit/openvino.genai/pull/2330

ceciliapeng2011

category: Core

category: GPU

category: Python API

category: transformations

category: CPP API

[Feature Request]: Support for PP-OCRv5's New Model Format (inference.json, inference.pdiparams, inference.yml)

### Request Description PaddleOCR recently updated the model format for PP-OCRv5. The previous format used three files: inference.pdmodel, inference.pdiparams, and inference.pdiparams.info. The new format now...

lxw112190

enhancement

good first issue

feature

category: PDPD FE
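As an illustration of the format change described in the PP-OCRv5 request above, here is a small Python sketch that tells the two layouts apart by file name. The file names are taken from the issue title and body. The helper `detect_paddle_format` and its return values are hypothetical, and this is not an OpenVINO or PaddlePaddle API.

```python
from pathlib import Path
from typing import Union

# File sets as listed in the issue; nothing is assumed about their contents.
LEGACY_FILES = {"inference.pdmodel", "inference.pdiparams", "inference.pdiparams.info"}
NEW_FILES = {"inference.json", "inference.pdiparams", "inference.yml"}


def detect_paddle_format(model_dir: Union[str, Path]) -> str:
    """Classify a model directory as 'new', 'legacy', or 'unknown' by file names."""
    names = {p.name for p in Path(model_dir).iterdir() if p.is_file()}
    # If both complete sets are present, the new layout takes precedence.
    if NEW_FILES <= names:
        return "new"
    if LEGACY_FILES <= names:
        return "legacy"
    return "unknown"
```

A loader could call such a check first and report a clear error for the new layout until the frontend supports it.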

[CPU] Optimize tail 'Convert' nodes time cost in f16 precision mark-up transformation

1

### Details: - *First, all tail nodes (which are non-computationally intensive nodes) of the model will be collected. The process begins from the model's output, traversing upwards until encountering 'Convert'...

liubo-intel

category: CPU
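The collection step described in the PR above (start at the model outputs and walk upward until a 'Convert' node is reached) can be sketched on a toy graph as follows. The PR text is truncated, so what happens at the 'Convert' boundary is an assumption here: the sketch simply stops there and reports the boundary nodes separately. The `Graph` type and `collect_tail_nodes` are hypothetical names, not OpenVINO APIs.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Toy graph: node name -> (op type, list of input node names).
Graph = Dict[str, Tuple[str, List[str]]]


def collect_tail_nodes(graph: Graph, outputs: List[str]) -> Tuple[Set[str], Set[str]]:
    """Walk upward from the outputs, stopping at 'Convert' nodes.

    Returns (tail_nodes, boundary_converts). Stopping at 'Convert' is an
    assumption; the original description is cut off at that point.
    """
    tail: Set[str] = set()
    converts: Set[str] = set()
    seen: Set[str] = set()
    queue = deque(outputs)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        op_type, inputs = graph[name]
        if op_type == "Convert":
            # Boundary reached: record it and do not traverse further up.
            converts.add(name)
            continue
        tail.add(name)
        queue.extend(inputs)
    return tail, converts


toy_graph: Graph = {
    "param": ("Parameter", []),
    "matmul": ("MatMul", ["param"]),
    "convert": ("Convert", ["matmul"]),
    "softmax": ("Softmax", ["convert"]),
    "result": ("Result", ["softmax"]),
}
tail_nodes, boundary = collect_tail_nodes(toy_graph, ["result"])
```

On this toy graph the tail set contains `result` and `softmax`, and `convert` is the boundary. `matmul` and `param` are never visited.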

Setting parallel to tbb static partitioner or tbb auto partitioner by thread pool

### Details: - *Set the parallel implementation to use the TBB static partitioner or the TBB auto partitioner through the thread pool* - *...* ### Tickets: - *CVS-165229*

sunxiaoxia2022

category: inference

category: Core

category: CPU

category: build

category: Python API

do_not_review

category: CPP API

do_not_merge

update mem_bandwidth_pressure_tolerance() for CPU plugin

### Details: - *When the user sets the inference precision to bf16, mem_bandwidth_pressure_tolerance() gets f32 from the graph during thread scheduling* ### Tickets: - *ticket-id*

wangleis

category: inference

category: CPU

[Good First Issue]: Enable Tensor.copyTo() method

21

### Context OpenVINO works in the Node.js environment! We are looking for new contributors who can help enable C++ API methods on the JavaScript side. First of all, read Node.js API...

almilosz

good first issue

category: JS API

no_stale

gsoc-prerequisite-task

javascript
