timing(IPrefetch): add 1 cycle to s2_finish

Open ngc7331 opened this issue 1 year ago • 3 comments

Cut the critical path prefetchPipe s2 -> toMSHRArbiter.valid(i) -> toMSHR.paddr -> missUnit hit -> missUnit.req.ready -> prefetchPipe toMSHRArbiter.ready -> s2_finish -> s2_ready -> s1_ready -> toFtq.ready to improve timing.

This effectively adds 1 cycle to the prefetchPipe s2_finish. Only a minor performance change is expected: the first miss request is still issued at the same time as before, and the extra wait for subsequent miss requests can be hidden by the L2 cache access latency.
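The latency argument can be illustrated with a toy cycle model. This is only a sketch, not the XiangShan RTL: the function names, the back-to-back miss pattern, and the 20-cycle L2 latency are all assumptions made for illustration.

```python
# Toy model (hypothetical): s2 sends a miss request as soon as it holds one,
# then needs 1 cycle for the handshake plus `extra_finish_cycles` before
# s2_finish lets the next request in.
def issue_cycles(n_misses, extra_finish_cycles):
    """Cycles at which back-to-back miss requests leave s2."""
    cycles = []
    t = 0
    for _ in range(n_misses):
        cycles.append(t)               # request issued in this cycle
        t += 1 + extra_finish_cycles   # s2 becomes free for the next request
    return cycles

L2_LATENCY = 20  # assumed value, illustrative only

def refill_cycles(issues, l2_latency=L2_LATENCY):
    """Cycle at which each miss is refilled, assuming a fixed L2 latency."""
    return [t + l2_latency for t in issues]

before = refill_cycles(issue_cycles(3, 0))  # s2_finish without the extra cycle
after = refill_cycles(issue_cycles(3, 1))   # s2_finish delayed by 1 cycle

for i, (b, a) in enumerate(zip(before, after)):
    print(f"miss {i}: refill at cycle {b} -> {a} (+{a - b})")
```

Under these assumptions the first refill arrives at the same cycle in both cases, and later refills slip by a cycle each on top of a latency of tens of cycles. Whether that slip is fully hidden depends on when the fetch stage actually needs those lines, which this sketch does not model.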

ngc7331, Sep 11 '24 06:09

[Generated by IPC robot] commit: ca2c79f26008574c9a1e0f9aeb2363452ae7f8b8

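| commit | astar | copy_and_run | coremark | gcc | gromacs | lbm | linux | mcf | microbench | milc | namd | povray | wrf | xalancbmk |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ca2c79f | 1.888 | 0.460 | 2.709 | 1.198 | 2.829 | 2.487 | 2.395 | 0.924 | 1.384 | 1.377 | 3.349 | 2.755 | 2.423 | 3.195 |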

master branch:

| commit | astar | copy_and_run | coremark | gcc | gromacs | lbm | linux | mcf | microbench | milc | namd | povray | wrf | xalancbmk |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| b30cb8b | | 0.460 | 2.695 | | | | 2.401 | 0.919 | 1.379 | | | 2.751 | | |
| a53daa0 | | 0.460 | 2.695 | 1.186 | | | 2.401 | 0.919 | 1.379 | 1.454 | 3.362 | 2.751 | | 3.212 |
| 8b2f7ab | 1.865 | 0.460 | 2.695 | 1.186 | 2.822 | 2.490 | 2.401 | 0.919 | 1.379 | 1.454 | 3.362 | 2.751 | 2.418 | 3.212 |
| dd286b6 | | 0.460 | 2.695 | 1.186 | 2.822 | 2.490 | 2.401 | 0.919 | 1.379 | 1.454 | 3.362 | 2.751 | | 3.212 |
| e6f36bc | 1.855 | 0.460 | 2.695 | 1.186 | 2.822 | 2.490 | 2.401 | 0.919 | 1.379 | 1.454 | 3.362 | 2.751 | 2.418 | 3.212 |
| 3088616 | 1.855 | 0.460 | 2.695 | 1.186 | 2.822 | 2.490 | 2.401 | 0.919 | 1.379 | 1.454 | 3.362 | 2.751 | 2.418 | 3.212 |
| 497660c | 1.855 | 0.460 | 2.695 | 1.186 | 2.822 | 2.490 | 2.401 | 0.919 | 1.379 | 1.454 | 3.362 | 2.751 | 2.418 | 3.212 |
| 65e844f | 1.865 | 0.460 | 2.695 | 1.186 | 2.822 | 2.490 | 2.401 | 0.919 | 1.379 | 1.454 | 3.362 | 2.751 | 2.418 | 3.212 |
| 0d7009b | 1.855 | 0.460 | 2.695 | 1.186 | 2.822 | 2.490 | 2.401 | 0.919 | 1.379 | 1.454 | 3.362 | 2.751 | 2.418 | 3.212 |
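To read the two tables side by side, the per-benchmark relative IPC change can be computed directly from the numbers above. The sketch below uses the ca2c79f row and, as an arbitrary choice of baseline among the complete master rows, 8b2f7ab (astar reads 1.855 rather than 1.865 on most other master commits). The helper name `percent_change` is made up for this example.

```python
BENCHMARKS = [
    "astar", "copy_and_run", "coremark", "gcc", "gromacs", "lbm", "linux",
    "mcf", "microbench", "milc", "namd", "povray", "wrf", "xalancbmk",
]
# IPC values copied from the tables above.
PR_IPC = [1.888, 0.460, 2.709, 1.198, 2.829, 2.487, 2.395,
          0.924, 1.384, 1.377, 3.349, 2.755, 2.423, 3.195]      # ca2c79f
MASTER_IPC = [1.865, 0.460, 2.695, 1.186, 2.822, 2.490, 2.401,
              0.919, 1.379, 1.454, 3.362, 2.751, 2.418, 3.212]  # 8b2f7ab

def percent_change(new, old):
    """Relative change of `new` over `old`, in percent."""
    return (new - old) / old * 100.0

changes = {
    name: percent_change(pr, base)
    for name, pr, base in zip(BENCHMARKS, PR_IPC, MASTER_IPC)
}

# Print the benchmarks sorted by the size of the change, largest first.
for name, delta in sorted(changes.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>13}: {delta:+.2f}%")
```

With these inputs, milc shows the largest difference by magnitude. A single run per commit says little about run-to-run noise, so small deltas should be read with care.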

XiangShanRobot, Sep 11 '24 14:09

This PR might have some performance drawbacks. Since frontend timing currently satisfies the requirements, this PR is not necessary for now.

Yan-Muzi, Sep 20 '24 06:09

SPEC 06 (0.3 coverage) tests show overall performance essentially unchanged (~0.03% increase), with ±1% fluctuations at individual test points (GemsFDTD -1.11%; zeusmp +0.74%). I guess this is acceptable.

ngc7331, Sep 26 '24 04:09