PaddleMIX
Rebuild the DiT network, fix some parameters, and simplify the model construction code
Latest optimization: rebuild the DiT network, simplifying the original dynamic-graph model into a high-performance network:
- For the most time-consuming core, the transformer part, we use `paddle.incubate.jit.inference` to convert the dynamic graph to a static graph, and we remove redundant work inside the loop;
- We also use some Triton operators for manual operator fusion;
- We also use horizontal fusion to merge parallel operators into a single computation;
- We use the CUTLASS library for optimization and acceleration.
facebook-DIT currently takes 219.936 ms.
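
To illustrate the horizontal fusion idea from the list above, here is a minimal sketch in plain Python. It is not PaddleMIX code: the helper names (`matmul`, `hconcat`, `hsplit`) and the tiny matrices are hypothetical, and using Q/K/V projections as the fused operators is an assumption made only for illustration. The point is that several matmuls sharing one input can be replaced by one matmul against column-concatenated weights, followed by a split.

```python
# Illustrative sketch of horizontal fusion (plain Python, not PaddleMIX code).
# All names and values below are hypothetical.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def hconcat(*mats):
    """Concatenate matrices with equal row counts along the column axis."""
    return [sum((list(m[i]) for m in mats), []) for i in range(len(mats[0]))]

def hsplit(mat, widths):
    """Split a matrix column-wise into consecutive blocks of the given widths."""
    blocks, start = [], 0
    for w in widths:
        blocks.append([row[start:start + w] for row in mat])
        start += w
    return blocks

# One shared input and three projection weights that all consume it
# (assumed here to be Q, K, V projections).
x = [[1, 2], [3, 4]]
w_q = [[1, 0], [0, 1]]
w_k = [[2, 0], [0, 2]]
w_v = [[0, 1], [1, 0]]

# Unfused: three separate matmuls over the same input.
q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

# Fused: concatenate the weights once, run a single matmul, then split.
w_qkv = hconcat(w_q, w_k, w_v)
fq, fk, fv = hsplit(matmul(x, w_qkv), [2, 2, 2])

print((fq, fk, fv) == (q, k, v))
```

On a GPU, the fused form typically launches one larger kernel instead of several small ones, which is where the benefit would come from; the sketch only demonstrates that the results are identical.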