feng_shuai

Results 5 issues of feng_shuai

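### PR types
New features

### PR changes
OPs

### Describe
Convert multihead_matmul to oss. This mainly targets the case where every batch in the attention input has the same length. In that case global variables such as mask and pos no longer need to be set, which improves performance and also removes an inconvenience for users.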

Latest infer cpp

paddle_infer_cpp

prune_utils.py

https://github.com/3dem/relion/blob/dfea290207df2ea63fee36f5edce34b8d5c7a1d0/src/ml_optimiser_mpi.cpp#L1205 cudaDeviceSynchronize() only synchronizes the host with the currently set GPU, so this call only synchronizes the last device that was set in MlDeviceBundle *b = new MlDeviceBundle(this);. Any other devices in use are not synchronized at this point.
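
The "current device" behavior described here can be illustrated with a small host-only model. This is a sketch under stated assumptions, not RELION code and not the CUDA runtime: `MockRuntime`, `syncCurrentOnly`, `syncAllDevices`, and `countPending` are hypothetical names introduced only for illustration. The two mock methods stand in for `cudaSetDevice` and `cudaDeviceSynchronize`, and the model only tracks which devices still have unfinished work.

```cpp
#include <vector>

// Host-only model of "current device" semantics.
// MockRuntime and its members are hypothetical stand-ins, NOT the CUDA runtime API.
struct MockRuntime {
    std::vector<bool> pending;  // pending[d] == true: device d has unfinished work
    int current = 0;            // device selected by the last setDevice() call

    explicit MockRuntime(int deviceCount) : pending(deviceCount, true) {}

    // Models cudaSetDevice(d): selects the device later calls act on.
    void setDevice(int d) { current = d; }

    // Models cudaDeviceSynchronize(): waits only for the current device.
    void deviceSynchronize() { pending[current] = false; }
};

// A single synchronize call: only the currently selected device is covered.
void syncCurrentOnly(MockRuntime& rt) {
    rt.deviceSynchronize();
}

// Select each device in turn and synchronize it, so every device is covered.
void syncAllDevices(MockRuntime& rt) {
    for (int d = 0; d < static_cast<int>(rt.pending.size()); ++d) {
        rt.setDevice(d);
        rt.deviceSynchronize();
    }
}

// Number of devices that still have unfinished work.
int countPending(const MockRuntime& rt) {
    int n = 0;
    for (bool p : rt.pending) {
        if (p) ++n;
    }
    return n;
}
```

In this model, with three devices and device 2 selected last, `syncCurrentOnly` leaves two devices pending, while `syncAllDevices` leaves none. A real fix would presumably follow the same shape, looping over the devices in use and calling `cudaSetDevice` before each `cudaDeviceSynchronize`, though which devices to visit depends on how the bundles are set up.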