[GPU] Recognize parameters as valid inputs for compressed weights
Details:
- The change allows parameters to be recognized alongside constants as valid weight inputs for transformations producing FullyConnectedCompressed nodes
Description of the issue:
At present, the FC_COMPRESSED_WEIGHT_PATTERN macro contains a pattern for dequantization of a constant integer weight. This pattern is used to recognize and fold cases where fused weight dequantization can be applied, replacing them with FullyConnectedCompressed nodes. Because the pattern expects a constant weight input, it fails to recognize quantized LoRA weights, which are provided as parameters:
With the changes in this patch, these weights can be recognized, and the transformations can proceed and produce nodes that would then leverage oneDNN fused QGEMM for execution:
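The essence of the change can be sketched with a toy predicate (plain Python, not the actual OpenVINO pattern API; the node kinds and function names here are hypothetical, purely for illustration): the weight-matching check is widened from "input is a Constant" to "input is a Constant or a Parameter".

```python
# Toy illustration only -- this is NOT the real FC_COMPRESSED_WEIGHT_PATTERN
# macro or the OpenVINO transformation API; names are hypothetical.

class Node:
    """Minimal stand-in for a graph node with an operation kind."""
    def __init__(self, kind):
        self.kind = kind  # e.g. "Constant" or "Parameter"

# Old behavior: only constant weights matched the dequantization pattern,
# so quantized LoRA weights (Parameters) were left unfused.
def is_valid_weight_old(node):
    return node.kind == "Constant"

# New behavior: parameters are accepted as well, so the transformation can
# still fold the dequantization into a FullyConnectedCompressed node.
def is_valid_weight_new(node):
    return node.kind in ("Constant", "Parameter")

lora_weight = Node("Parameter")
print(is_valid_weight_old(lora_weight))  # False -> pattern match fails
print(is_valid_weight_new(lora_weight))  # True  -> transformation proceeds
```

With the old predicate a Parameter weight fails the match and the dequantization stays unfused; with the widened one it matches, and the folded node can then be dispatched to the fused oneDNN QGEMM path described above.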
Tickets:
build_jenkins
@CuriousPanCake please review.
@Lyamin-Roman please take a look
Optimized kernels are called for the matrix multiplication itself, but there is a performance cost from the transpose nodes applied to the parameter weights, which cannot be optimized away.
Consider adding new tests.
Tests added.
@Lyamin-Roman please review.
build_jenkins