Zhong Hui
@Narsil please help
```
sf save: 2.6092689037323
sf load: 1.7551813125610352
sf load2: 2.1516408920288086
pickle save: 2.2464730739593506
pickle load: 0.9010257720947266
```

The `load` API is even slower than `load_file`:

```python
import pickle
import...
```
@mishig25 can you give some help?
@LysandreJik can you give some help?
## For `load_file` API

The core problem is that memcpy from mmap-backed memory is very slow. See: https://stackoverflow.com/questions/52845387/improving-mmap-memcpy-file-read-performance

In my case, `open(filename); f.read()` reaches 2 GB/s, while `memcpy(mmap(filename))` reaches only 1.3 GB/s....
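The comparison above (plain file read versus copying out of an mmap) can be roughly reproduced with a sketch like the following. It is not the script used for the numbers in this thread: the helper names (`read_with_file_io`, `read_with_mmap_copy`, `throughput_gb_per_s`, `make_test_file`) and the 64 MiB scratch file are assumptions for illustration, and results depend heavily on page-cache state, file size, and storage.

```python
import mmap
import os
import tempfile
import time


def read_with_file_io(path):
    """Read the whole file with a plain buffered read."""
    with open(path, "rb") as f:
        return f.read()


def read_with_mmap_copy(path):
    """Map the file, then copy the mapped pages into a bytes object."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[:]  # slicing copies the data out of the mapping


def throughput_gb_per_s(reader, path):
    """Return (bytes_read, GB/s) for a single call of reader(path)."""
    start = time.perf_counter()
    data = reader(path)
    elapsed = time.perf_counter() - start
    return len(data), len(data) / elapsed / 1e9


def make_test_file(size_bytes):
    """Create a hypothetical scratch file filled with random bytes."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    path = make_test_file(64 * 1024 * 1024)
    try:
        readers = [("file read", read_with_file_io), ("mmap copy", read_with_mmap_copy)]
        for name, reader in readers:
            n, rate = throughput_gb_per_s(reader, path)
            print(f"{name}: {n} bytes at {rate:.2f} GB/s")
    finally:
        os.remove(path)
```

A freshly written file is usually still in the page cache, so for cold-read numbers the cache would need to be dropped between runs or a much larger file used.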
See https://github.com/PyO3/pyo3/issues/4058#issuecomment-2046471081 and https://stackoverflow.com/questions/52845387/improving-mmap-memcpy-file-read-performance

My OS is Ubuntu 18.04. You can test this yourself with the scripts above.

There are some suggestions to use `madvise(.., MADV_SEQUENTIAL);`: https://github.com/PyO3/pyo3/issues/4058#issuecomment-2048119528
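For anyone who wants to try the `MADV_SEQUENTIAL` suggestion from Python before touching the Rust side, here is a minimal sketch. It assumes Python 3.8+ on a platform where `mmap.madvise` and `mmap.MADV_SEQUENTIAL` are available (the code falls back to no hint otherwise); `read_with_sequential_hint` is a hypothetical helper name, and whether the hint actually helps is not something this thread has measured.

```python
import mmap
import os
import tempfile


def read_with_sequential_hint(path):
    """Copy a file out of an mmap after hinting sequential access.

    Returns (data, hint_applied). The hint is skipped where the platform
    or Python version does not expose madvise / MADV_SEQUENTIAL.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            hint_applied = False
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                # Tell the kernel the pages will be read front to back,
                # which may enable more aggressive read-ahead.
                mm.madvise(mmap.MADV_SEQUENTIAL)
                hint_applied = True
            return mm[:], hint_applied


if __name__ == "__main__":
    # Hypothetical scratch file, only used to exercise the helper.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(8 * 1024 * 1024))
    try:
        data, hinted = read_with_sequential_hint(path)
        print(f"read {len(data)} bytes, hint applied: {hinted}")
    finally:
        os.remove(path)
```

Timing this against an unhinted mmap copy on the same file would show whether the hint matters on a given kernel and storage setup.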
Still a big problem.
Warm-start the model to align precision. Check the precision at the first step, then look at the loss diff at the second step.
https://xly.bce.baidu.com/paddlepaddle/Paddle-NLP/newipipe/detail/10720664/job/26276076 The rope interface change in this PR seems to have broken the auto-parallel code.