XNNPACK Introduce flags for qb4 scale format in xnn_define_blockwise_quantized_tensor

Introduce flags for qb4 scale format in xnn_define_blockwise_quantized_tensor_value

Open GregoryComer opened this issue 6 months ago • 0 comments

This change extends the xnn_define_blockwise_quantized_tensor_value API to accept flags that select the block scale format. Only bf16 is currently supported. The intent is to allow other block scale formats (fp16 or fp32) to be added in the future without breaking API backward compatibility.
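As a rough illustration of the idea (not the code in this change), the sketch below shows how a flags argument can select the scale format while leaving room for more formats later. Every name here (`EXAMPLE_SCALE_FORMAT_BF16`, `EXAMPLE_SCALE_FORMAT_MASK`, `example_read_block_scale`, `example_status`) is hypothetical, and the flag values are assumptions, not XNNPACK constants. The only fact relied on is that a bf16 value holds the upper 16 bits of an IEEE 754 fp32 value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag layout, for illustration only. These are NOT the
 * constants defined by XNNPACK. */
enum {
  EXAMPLE_SCALE_FORMAT_BF16 = 0x0, /* assumed default format */
  EXAMPLE_SCALE_FORMAT_MASK = 0x3  /* bits reserved for the scale format */
};

enum example_status {
  example_status_success = 0,
  example_status_unsupported = 1
};

/* bf16 is the upper 16 bits of an fp32 value, so widening is a shift. */
static float example_bf16_to_f32(uint16_t bits) {
  uint32_t word = (uint32_t) bits << 16;
  float value;
  memcpy(&value, &word, sizeof value);
  return value;
}

/* Reads the scale of block `index` as fp32, dispatching on the format
 * encoded in `flags`. Formats that are not implemented are rejected
 * rather than misread, so new formats can be added as new cases. */
enum example_status example_read_block_scale(
    const void* scales, size_t index, uint32_t flags, float* out) {
  switch (flags & EXAMPLE_SCALE_FORMAT_MASK) {
    case EXAMPLE_SCALE_FORMAT_BF16: {
      uint16_t bits;
      memcpy(&bits, (const uint8_t*) scales + index * sizeof bits, sizeof bits);
      *out = example_bf16_to_f32(bits);
      return example_status_success;
    }
    default:
      return example_status_unsupported;
  }
}
```

With this shape, existing callers that pass the bf16 value keep working unchanged, and an fp16 or fp32 format would only need a new flag value and a new `case`.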

Aug 20 '24 06:08 GregoryComer
