bdenhollander
Currently working on [P13426 R6015 C3 G5](https://apps.foldingathome.org/wu#project=13426&run=6015&clone=3&gen=5), which was already completed 3 days ago. Gen 0 is the worst I've seen for reassigns. The first successful OK was past the...
@ThWuensche Your trace has a significant amount of time in `findBlocksWithInteractions`. Can you try changing 32 to 64 in findInteractingBlocks.cl to check if performance is better or worse? `#if SIMD_WIDTH
> That makes it spend even more time there. And more of a question, also the time in computeNonbonded is increasing. How could that be related to the...
> But why does that have influence on computeNonbonded? I don't know enough about what's happening internally to know whether the kernels are stalling each other or the...
> The other question is why these calculations (`findBlocksWithInteractions` and `computeNonbonded`) take more time for lower atom counts than for higher atom counts. As a percentage of total...
> Think they got something wrong. My Radeon VII is gfx906, that's part of the output of clinfo:
>
> ```
> Local memory size: 65536
> ```
> ...
Found this tidbit about Vega LDS size:

> [C]urrently Vega has 64 KB local/shared memory enabled on Linux, but 32 KB on Windows.

https://community.amd.com/thread/246353
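One way to see which of the two sizes a given driver exposes is to pull the `Local memory size` line out of `clinfo` output. A minimal sketch in Python, assuming the output uses the `Local memory size: <bytes>` format quoted earlier; the function name `local_mem_sizes` and the sample text are hypothetical, and other `clinfo` builds may label or format the field differently:

```python
import re


def local_mem_sizes(clinfo_text: str) -> list[int]:
    """Return every 'Local memory size' value (in bytes) found in clinfo text."""
    # Assumed line format: "Local memory size: 65536" (whitespace may vary).
    pattern = re.compile(r"^\s*Local memory size:?\s+(\d+)", re.MULTILINE)
    return [int(m.group(1)) for m in pattern.finditer(clinfo_text)]


# Illustrative input only: the single line quoted from the clinfo output above.
sample = "Local memory size: 65536\n"

for size in local_mem_sizes(sample):
    print(f"{size} bytes = {size // 1024} KB")
```

Running the same function on `clinfo` output captured on Linux and on Windows would show whether the reported LDS size actually differs between the two drivers.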
Returned to check if any progress had been made on this issue and noticed that it was unintentionally closed via commit message. I checked the latest doc site and can...
I believe that file is installed by the graphics driver rather than the HIP SDK. Are you using a 23.x version of the AMD Adrenalin Edition driver package?
I profiled your code on Windows on gfx1032. The majority of the time was spent in memcpy rather than creating and destroying streams. This code may be more of host...