
Rebuild the DiT network, fix some parameters, and simplify the model construction code

Open chang-wenbin opened this issue 6 months ago • 2 comments

Latest optimization: rebuild the DiT network, simplifying the original dynamic-graph model into a high-performance model network:

  1. For the most time-consuming core, the transformer, we use paddle.incubate.jit.inference for dynamic-to-static conversion and remove redundant parts in the loop;
  2. We use some Triton operators for manual operator fusion;
  3. We use horizontal fusion operators to merge operators at the same level of the graph into a single computation;
  4. We use the CUTLASS library for optimization and acceleration.
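
The issue does not show the fused kernels themselves, so here is only a toy sketch of the idea behind horizontal fusion (item 3). It assumes the operators being merged are independent linear projections that share one input, such as the Q/K/V projections in an attention block; that is an assumption for illustration, not something stated above. It is pure Python rather than Paddle, and every name in it (`matmul`, `hconcat`, `split_columns`, `w_q`, `w_k`, `w_v`) is hypothetical.

```python
# Toy illustration of horizontal fusion in pure Python (no Paddle required).
# Assumption: the fused operators are independent projections of the same input.
# All function and variable names are hypothetical, chosen for this sketch.

def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix, both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def hconcat(*mats):
    """Concatenate matrices with equal row counts along the column axis."""
    return [sum((list(m[i]) for m in mats), []) for i in range(len(mats[0]))]

def split_columns(mat, widths):
    """Split a matrix back into column blocks of the given widths."""
    out, start = [], 0
    for w in widths:
        out.append([row[start:start + w] for row in mat])
        start += w
    return out

x = [[1, 2], [3, 4]]        # shared input
w_q = [[1, 0], [0, 1]]      # three independent projection weights
w_k = [[2, 1], [0, 1]]
w_v = [[0, 1], [1, 0]]

# Unfused: three separate matmuls, i.e. three kernel launches on a GPU.
q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

# Horizontally fused: concatenate the weights once, run a single wider matmul,
# then slice the result back into the three outputs.
w_qkv = hconcat(w_q, w_k, w_v)
q_f, k_f, v_f = split_columns(matmul(x, w_qkv), [2, 2, 2])

print(q_f == q and k_f == k and v_f == v)
```

The fused path produces the same values as the three separate matmuls; on real hardware the benefit would come from launching one larger kernel instead of several small ones.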

Currently facebook-DIT takes: 219.936 ms

chang-wenbin, Jul 29 '24 04:07