XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Should we refer to one of the publications listed in README?
BF16 GEMM microkernels
Add kernel_elements as a parameter to depthwise convolution microkernels. This will allow us to change depthwise convolution microkernels to support kernel elements up to the primary tile size, instead of only...
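The idea above can be sketched as a scalar microkernel whose inner loop runs over a runtime kernel-element count rather than a hard-coded tile size. This is a minimal illustration; the function name, signature, and weight layout are assumptions, not the actual XNNPACK microkernel ABI.

```c
#include <stddef.h>

// Hypothetical sketch: a scalar depthwise-convolution microkernel that takes
// the number of kernel elements as a runtime parameter (kernel_elements),
// so any count up to the primary tile size can be handled by one kernel.
void dwconv_scalar(
    size_t channels,          // channels processed for this output pixel
    size_t kernel_elements,   // runtime kernel-element count (e.g. 9 for 3x3)
    const float** input,      // kernel_elements pointers to input pixels
    const float* weights,     // assumed layout: [kernel_elements][channels]
    float* output)            // [channels] results
{
  for (size_t c = 0; c < channels; c++) {
    float acc = 0.0f;
    // Accumulate only over the requested kernel elements instead of a
    // compile-time tile size.
    for (size_t k = 0; k < kernel_elements; k++) {
      acc += input[k][c] * weights[k * channels + c];
    }
    output[c] = acc;
  }
}
```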
Add fused operators support to convolution operators. A specified set of operators can be fused into convolution using the new function xnn_create_convolution2d_nhwc_f32_fused; this creates a convolution operator with a list...
WIP pipe fused params through subgraph and operators
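The fusion described above can be illustrated with a toy kernel where a follow-up operator (a min/max clamp, standing in for an activation) is applied in-register inside the convolution's output loop, so it costs no extra pass over memory. The xnn_create_convolution2d_nhwc_f32_fused entry point mentioned above is the WIP API; the helper below and its signature are purely illustrative.

```c
#include <stddef.h>

// Toy 1D valid convolution with a fused clamp epilogue. Fusing the clamp
// into the output loop avoids a second read-modify-write pass over the
// output tensor. Hypothetical sketch, not XNNPACK's actual operator code.
void conv1d_valid_fused_clamp(
    size_t input_len, size_t kernel_len,
    const float* input, const float* kernel,
    float output_min, float output_max,   // fused clamp parameters
    float* output)                        // length: input_len - kernel_len + 1
{
  for (size_t i = 0; i + kernel_len <= input_len; i++) {
    float acc = 0.0f;
    for (size_t k = 0; k < kernel_len; k++) {
      acc += input[i + k] * kernel[k];
    }
    // Fused epilogue: clamp in-register before the single store.
    if (acc < output_min) acc = output_min;
    if (acc > output_max) acc = output_max;
    output[i] = acc;
  }
}
```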
Add eager API for transpose.
Variable-size transpose ukernels are treated as 1D ukernels, but they are inherently 2D ukernels; the element size contains one dimension. When the input or output dimension is strided, then adding element...
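A minimal sketch of the 2D view described above: a rows-by-cols transpose of elements of `element_size` bytes, with independent input and output row strides so strided tensors are handled; the element size effectively adds a contiguous inner dimension. The signature is an assumption for illustration, not the XNNPACK ukernel ABI.

```c
#include <stddef.h>
#include <string.h>

// 2D transpose over opaque elements of element_size bytes.
// input_stride / output_stride are the byte distances between rows,
// which need not equal cols * element_size (i.e. strided tensors work).
void transpose_2d_memcpy(
    size_t rows, size_t cols, size_t element_size,
    const char* input, size_t input_stride,   // bytes between input rows
    char* output, size_t output_stride)       // bytes between output rows
{
  for (size_t i = 0; i < rows; i++) {
    for (size_t j = 0; j < cols; j++) {
      // Element (i, j) of the input lands at (j, i) of the output.
      memcpy(output + j * output_stride + i * element_size,
             input + i * input_stride + j * element_size,
             element_size);
    }
  }
}
```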
SpaceToDepth xnnpack delegate
Add SpaceToDepth to XNNPACK
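For reference, SpaceToDepth moves each block_size x block_size spatial block into the channel dimension, turning an NHWC image of shape [H, W, C] into [H/b, W/b, C*b*b]. The sketch below follows the common TFLite layout convention (sub-pixels appended in row-major (dy, dx) order); the helper name and signature are assumptions, not the actual delegate code.

```c
#include <stddef.h>

// Reference SpaceToDepth for a single NHWC image:
// [height, width, channels] -> [height/b, width/b, channels*b*b].
// Assumes height and width are divisible by block_size.
void space_to_depth_nhwc(
    size_t height, size_t width, size_t channels, size_t block_size,
    const float* input, float* output)
{
  const size_t out_width = width / block_size;
  const size_t out_channels = channels * block_size * block_size;
  for (size_t oy = 0; oy < height / block_size; oy++) {
    for (size_t ox = 0; ox < out_width; ox++) {
      for (size_t dy = 0; dy < block_size; dy++) {
        for (size_t dx = 0; dx < block_size; dx++) {
          for (size_t c = 0; c < channels; c++) {
            const size_t iy = oy * block_size + dy;
            const size_t ix = ox * block_size + dx;
            // Channels of the (dy, dx) sub-pixel are appended in order.
            output[(oy * out_width + ox) * out_channels
                   + (dy * block_size + dx) * channels + c] =
                input[(iy * width + ix) * channels + c];
          }
        }
      }
    }
  }
}
```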
Add AVX & AVX2 transpose microkernel generator