openvino.genai
Fix max token slicing in compression data transform function
Changes
Fix how max token slicing is applied inside the data-aware compression `transform_fn`.
Reason for change
`input_ids` and `attention_mask` have shape (B, N), but max token slicing was applied along the first (batch) dimension instead of the second (token) dimension.
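A minimal sketch of the difference, not the actual code from this PR: the helper names `truncate_wrong` and `truncate_fixed`, the variable `max_tokens`, and the use of NumPy arrays as stand-ins for the real tensors are all assumptions made for illustration.

```python
import numpy as np


def truncate_wrong(inputs, max_tokens):
    # Hypothetical "before": slices dimension 0 (batch), so the
    # sequence length N is left untouched.
    return {name: value[:max_tokens] for name, value in inputs.items()}


def truncate_fixed(inputs, max_tokens):
    # Hypothetical "after": keeps every batch row and slices
    # dimension 1 (tokens) down to max_tokens.
    return {name: value[:, :max_tokens] for name, value in inputs.items()}


B, N, max_tokens = 2, 8, 4
inputs = {
    "input_ids": np.arange(B * N).reshape(B, N),
    "attention_mask": np.ones((B, N), dtype=np.int64),
}

wrong = truncate_wrong(inputs, max_tokens)
fixed = truncate_fixed(inputs, max_tokens)

print(wrong["input_ids"].shape)  # (2, 8): sequence is not truncated
print(fixed["input_ids"].shape)  # (2, 4): sequence is truncated
```

With a batch smaller than `max_tokens`, the first form is a silent no-op on sequence length; with a larger batch it would drop samples instead of tokens.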