Edward Shogulin
### Details:
- *[ARM] [INT8] FullyConnected*

### Tickets:
- *ticket-id*
### Context

[JIT Emitters](https://github.com/openvinotoolkit/openvino/blob/42f1cb095143f19c0b9ee25836c29748bc8d9bf2/src/plugins/intel_cpu/src/emitters/README.md) are part of the code generation feature (a.k.a. tensor compiler) that automatically produces highly efficient, optimized binary code for fused subgraphs. Each emitter implements a specific operation from low level...
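To illustrate the pattern, here is a minimal sketch of what "one emitter per operation" means; all names below (`CodeGenerator`, `Emitter`, `AddEmitter`) are hypothetical placeholders, not the real OpenVINO classes:

```cpp
#include <cstddef>
#include <vector>

// Placeholder for the JIT assembler backend (e.g. Xbyak on x64, Xbyak_aarch64 on ARM).
struct CodeGenerator {};

// Base interface: one emitter owns the binary-code generation for one operation.
class Emitter {
public:
    virtual ~Emitter() = default;
    // `in`/`out` are vector-register indices assigned by the register allocator.
    virtual void emit(CodeGenerator& gen,
                      const std::vector<std::size_t>& in,
                      const std::vector<std::size_t>& out) const = 0;
};

// Example: an element-wise Add emitter would issue a single vector add over the
// registers it is given, so fused neighbour operations avoid memory round-trips.
class AddEmitter : public Emitter {
public:
    void emit(CodeGenerator& /*gen*/,
              const std::vector<std::size_t>& /*in*/,
              const std::vector<std::size_t>& /*out*/) const override {
        // e.g. gen.fadd(out[0], in[0], in[1]);  // hypothetical assembler call
    }
};
```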
Issues:
1. The low-precision kernel selected by default is not optimal for the platform described below.
2. The low-precision kernel gives only a ~30% performance gain vs. fp16 in multithreaded mode...
In accordance with the documentation, [NEGEMMLowpMatrixMultiplyCore](https://arm-software.github.io/ComputeLibrary/v24.06/classarm__compute_1_1_n_e_g_e_m_m_lowp_matrix_multiply_core.xhtml) supports only limited combinations of `QSYMM8` and `QASYMM8_SIGNED` precisions on inputs:

| src0 | src1 | src2 | dst |
| -- | -- | -- | -- |
| ... | ... | ... | ... |
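A minimal sketch (with assumed shapes and quantization parameters) of checking a candidate precision combination up front through the function's static `validate()` hook, before `configure()` is called:

```cpp
#include "arm_compute/core/QuantizationInfo.h"
#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/functions/NEGEMMLowpMatrixMultiplyCore.h"

#include <iostream>

int main() {
    using namespace arm_compute;
    const int M = 64, N = 64, K = 64;  // assumed GEMM shape

    // src0: M x K, src1: K x N, both QASYMM8_SIGNED (TensorShape is width-first).
    TensorInfo src0(TensorShape(K, M), 1, DataType::QASYMM8_SIGNED, QuantizationInfo(0.05f, 0));
    TensorInfo src1(TensorShape(N, K), 1, DataType::QASYMM8_SIGNED, QuantizationInfo(0.05f, 0));
    // Without a fused output stage the destination accumulates in S32.
    TensorInfo dst(TensorShape(N, M), 1, DataType::S32);

    Status s = NEGEMMLowpMatrixMultiplyCore::validate(&src0, &src1, /*c=*/nullptr, &dst);
    std::cout << (bool(s) ? "supported" : s.error_description()) << std::endl;
    return bool(s) ? 0 : 1;
}
```

Running this check first avoids configuring a combination the library would reject at runtime.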
Model:

```mermaid
graph TD;
    Input1["Input src1: fp32"]
    Quantise1["NEQuantizationLayer q_src1: QASYMM8_SIGNED"]
    Input2["Input src2: fp32"]
    Quantise2["NEQuantizationLayer q_src2: QASYMM8_SIGNED"]
    MatMul["NEGEMMLowpMatrixMultiplyCore q_res: S8"]
    Input1-->Quantise1;
    Input2-->Quantise2;
    Quantise1-->MatMul;
    Quantise2-->MatMul;
    MatMul-->Result;
```

Can you confirm that `NEGEMMLowpMatrixMultiplyCore`...
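For reference, a sketch of the graph above in ACL runtime terms, with assumed shapes and quantization parameters and error handling omitted. Note one assumption: producing the S8 result shown in the diagram directly requires describing a fused output stage via `GEMMInfo`; this sketch instead lets the destination accumulate in S32:

```cpp
#include "arm_compute/core/QuantizationInfo.h"
#include "arm_compute/core/TensorInfo.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/functions/NEGEMMLowpMatrixMultiplyCore.h"
#include "arm_compute/runtime/NEON/functions/NEQuantizationLayer.h"
#include "arm_compute/runtime/Tensor.h"

#include <initializer_list>

int main() {
    using namespace arm_compute;
    const int M = 64, N = 64, K = 64;  // assumed GEMM shape

    // TensorShape is width-first: src1 is M x K, src2 is K x N.
    Tensor src1_f32, src2_f32, q_src1, q_src2, dst_s32;
    src1_f32.allocator()->init(TensorInfo(TensorShape(K, M), 1, DataType::F32));
    src2_f32.allocator()->init(TensorInfo(TensorShape(N, K), 1, DataType::F32));
    q_src1.allocator()->init(
        TensorInfo(TensorShape(K, M), 1, DataType::QASYMM8_SIGNED, QuantizationInfo(0.05f, 0)));
    q_src2.allocator()->init(
        TensorInfo(TensorShape(N, K), 1, DataType::QASYMM8_SIGNED, QuantizationInfo(0.05f, 0)));
    dst_s32.allocator()->init(TensorInfo(TensorShape(N, M), 1, DataType::S32));

    // NEQuantizationLayer nodes q_src1 / q_src2 from the diagram.
    NEQuantizationLayer quant1, quant2;
    quant1.configure(&src1_f32, &q_src1);
    quant2.configure(&src2_f32, &q_src2);

    // NEGEMMLowpMatrixMultiplyCore over the two quantized inputs (no bias).
    NEGEMMLowpMatrixMultiplyCore gemmlowp;
    gemmlowp.configure(&q_src1, &q_src2, /*c=*/nullptr, &dst_s32);

    for (Tensor* t : {&src1_f32, &src2_f32, &q_src1, &q_src2, &dst_s32})
        t->allocator()->allocate();

    // Fill src1_f32 / src2_f32 here, then execute the pipeline:
    quant1.run();
    quant2.run();
    gemmlowp.run();
    return 0;
}
```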
Hi guys, I'm extremely interested in speeding up int8 `MatMul` inference with an ARM Compute Library kernel. My model is:

```mermaid
graph TD;
    Input1["Input out: fp32"]
    Quantise1["NEQuantizationLayer out: signed int8"]
    Input2["Input...
```