tpp-mlir
Study type packing (VNNI/BFDOT/BFMMLA/etc) as a single operation
Today we're working on type packing for VNNI with the operation `tpp.vnni_pack`. But this isn't the only kind of type packing we may want, and they're all very similar, so we could probably come up with a single op (say `tpp.type_pack`) with generic arguments that converts a standard packed tensor into a type-packed tensor, given some shapes.
We would then map the existing flavours (VNNI/BFDOT/BFMMLA/etc.) onto this op and lower them as library calls when they match a recognised shape.
We shouldn't create this op unless we know the type packing makes sense (e.g. via #561 calls), so we shouldn't need to lower this op to loops unless requested by the pipeline (`tpp-to-loops`).
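As a rough sketch of the semantics such a generic op could have: these schemes all split the reduction (K) dimension by a small factor and move that factor innermost, so consecutive K elements sit next to each other in memory (factor 2 for bf16 VNNI/BFDOT-style layouts, 4 for int8 VNNI). The names and signature below are illustrative only, not the actual `tpp.type_pack` definition:

```python
def type_pack(matrix, factor):
    """Pack a K x N matrix into (K // factor) x N x factor layout,
    grouping `factor` consecutive K elements innermost (VNNI-style)."""
    k, n = len(matrix), len(matrix[0])
    assert k % factor == 0, "K must be divisible by the packing factor"
    return [
        [[matrix[ko * factor + v][j] for v in range(factor)]  # innermost: K sub-group
         for j in range(n)]
        for ko in range(k // factor)
    ]

# bf16-style packing with factor 2 on a 4x2 "B" matrix:
b = [[1, 2],
     [3, 4],
     [5, 6],
     [7, 8]]
print(type_pack(b, 2))  # → [[[1, 3], [2, 4]], [[5, 7], [6, 8]]]
```

The same function with `factor=4` would model the int8 VNNI layout, which is why a single op with a shape argument could subsume the per-ISA variants.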
@KavithaTipturMadhu
References:
- VNNI: https://www.intel.com/content/www/us/en/developer/articles/guide/deep-learning-with-avx512-and-dl-boost.html
- BFDOT: https://developer.arm.com/documentation/ddi0602/2021-06/SVE-Instructions/BFDOT--vectors---BFloat16-floating-point-dot-product-
- BFMMLA: https://developer.arm.com/documentation/ddi0602/2023-03/SIMD-FP-Instructions/BFMMLA--BFloat16-floating-point-matrix-multiply-accumulate-into-2x2-matrix-?lang=en