
CUDA version of GPL

Open RuijiaoSun opened this issue 3 years ago • 15 comments

RuijiaoSun avatar Jul 20 '22 20:07 RuijiaoSun

@cdleary @hzeller would you mind reviewing Rachel's first PR?

QuantamHD avatar Jul 20 '22 21:07 QuantamHD

You should build your code on a machine without CUDA installed and make sure there are no issues.

maliberty avatar Jul 20 '22 21:07 maliberty

@maliberty You need to approve the scan workflow since Rachel's a first time contributor.

QuantamHD avatar Jul 25 '22 23:07 QuantamHD

> @maliberty You need to approve the scan workflow since Rachel's a first time contributor.

Unfortunately I have to approve it on every push as she is still a first time contributor. I just did so again.

maliberty avatar Jul 25 '22 23:07 maliberty

Do you have any measurements of the benefit of this change?

maliberty avatar Aug 02 '22 02:08 maliberty

RuijiaoSun avatar Aug 02 '22 15:08 RuijiaoSun

> Do you have any measurements of the benefit of this change?

I am working on it.

RuijiaoSun avatar Aug 02 '22 18:08 RuijiaoSun

Hi Matt, I tested the code on my work computer and on a cloudtop without a GPU. It works on both machines, but some checks are failing. Do you have any ideas about this? https://jenkins.openroad.tools/blue/organizations/jenkins/OpenROAD-Public/detail/PR-2076-head/14/pipeline [image]

RuijiaoSun avatar Aug 03 '22 22:08 RuijiaoSun

Have you looked at gpl incremental02? You can click the details link to see the Jenkins results.

maliberty avatar Aug 03 '22 23:08 maliberty

Hello Matt @maliberty, I found a library, CUSP, that solves linear systems on the GPU. It is very fast, and I have tested it successfully on GPL. Here is the performance: [image]

I think the current direct CUDA solver is done. It is slow but correct. Can you please check it once more and let me know if there is anything else I should change? If there is no problem, my thought is that we close this PR as soon as possible and open a new PR to upload the CUSP version.

RuijiaoSun avatar Aug 08 '22 22:08 RuijiaoSun

> Have you looked at gpl incremental02? You can click the details link to see the Jenkins results.

RuijiaoSun avatar Aug 08 '22 22:08 RuijiaoSun

re "my thought is that we close this PR as soon as possible and open a new PR to upload the CUSP version." Is there a reason not to integrate that change into this PR?

maliberty avatar Aug 08 '22 22:08 maliberty

> re "my thought is that we close this PR as soon as possible and open a new PR to upload the CUSP version." Is there a reason not to integrate that change into this PR?

Yes. I think CUSP gives a big performance improvement, and it is worth using as the final version of the GPU solver. But CUSP was last updated years ago, and I am not sure whether it will be maintained in the future. The direct CUDA solver does not depend on high-level libraries, so one day we could continue developing our own iterative GPU solver on top of it.

RuijiaoSun avatar Aug 08 '22 23:08 RuijiaoSun

> > re "my thought is that we close this PR as soon as possible and open a new PR to upload the CUSP version." Is there a reason not to integrate that change into this PR?
>
> Yes. I think CUSP gives a big performance improvement, and it is worth using as the final version of the GPU solver. But CUSP was last updated years ago, and I am not sure whether it will be maintained in the future. The direct CUDA solver does not depend on high-level libraries, so one day we could continue developing our own iterative GPU solver on top of it.

What is the performance of the current version if we don't integrate CUSP? You said "It is slow but correct."

maliberty avatar Aug 08 '22 23:08 maliberty

Much slower: 20 times slower in some experiments. It exists only to provide a framework and a record for the direct CUDA solver. It is not useful for users but can serve as a reference for developers.

RuijiaoSun avatar Aug 09 '22 01:08 RuijiaoSun

@maliberty I think this is ready for another review.

QuantamHD avatar Aug 12 '22 17:08 QuantamHD

Are the runtimes in the CUSP solver column now what this PR will produce?

maliberty avatar Aug 13 '22 16:08 maliberty

This is the updated one, for the CUSP solver with a preconditioner. The direct CUSP solver is very fast but does not converge well for big chips. The current version gives a correct result and a 30% speedup. [image]

RuijiaoSun avatar Aug 14 '22 16:08 RuijiaoSun

> Are the runtimes in the CUSP solver column now what this PR will produce?

Hi Matt, do you have more suggestions or questions for this PR?

RuijiaoSun avatar Aug 16 '22 15:08 RuijiaoSun