openmmexampleplugin

CPU platform

Open zonca opened this issue 7 years ago • 14 comments

Do you have any plans to add the CPU platform to this example plugin?

zonca avatar Nov 22 '17 21:11 zonca

CpuPlatform extends ReferencePlatform, so the same kernel works on both Reference and CPU. See http://docs.openmm.org/latest/developerguide/developer.html#the-cpu-plaform. If you're writing a kernel that you expect to be performance critical, you can include separate implementations for the two platforms. Otherwise, there's no need.
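
The inheritance relationship is what makes this work: a type check for ReferencePlatform also matches anything derived from it. Here is a minimal standalone sketch of that idea. It uses stand-in classes rather than the real OpenMM headers, and OtherPlatform and acceptsReferenceKernels are hypothetical names introduced only for illustration.

```cpp
#include <string>

// Stand-in classes that only mirror the inheritance relationship described
// above. They are NOT the real OpenMM types.
struct Platform {
    virtual ~Platform() = default;
    virtual std::string getName() const = 0;
};

struct ReferencePlatform : Platform {
    std::string getName() const override { return "Reference"; }
};

// Derived from ReferencePlatform, as the CPU platform is said to be.
struct CpuPlatform : ReferencePlatform {
    std::string getName() const override { return "CPU"; }
};

// Hypothetical unrelated platform, used as a negative case.
struct OtherPlatform : Platform {
    std::string getName() const override { return "Other"; }
};

// Hypothetical helper: true for ReferencePlatform and for any subclass of it.
bool acceptsReferenceKernels(Platform& platform) {
    return dynamic_cast<ReferencePlatform*>(&platform) != nullptr;
}
```

With these stand-ins, the check passes for both the Reference and CPU objects and fails for the unrelated one.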

peastman avatar Nov 22 '17 21:11 peastman

Thanks @peastman. It would be nice to have an example of how to set up a ThreadPool just for computing all the pairs of interactions in parallel. I'm trying to learn from CpuNonbondedForce, and it is tough.

zonca avatar Nov 22 '17 21:11 zonca

You might take a look at CpuCustomNonbondedForce, since it's a bit simpler. CpuNonbondedForce uses vector instructions to compute 4 or 8 interactions at once, which makes the code complicated. CpuCustomNonbondedForce computes interactions individually, so the code is simpler (but also slower).

peastman avatar Nov 22 '17 21:11 peastman

By the way, if you just want an example of using ThreadPool to parallelize a straightforward calculation, there are lots of those. For example, see https://github.com/pandegroup/openmm/blob/master/platforms/cpu/src/CpuBondForce.cpp#L172-L176.
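
For readers without the OpenMM sources at hand, the general shape of such a calculation can be sketched with plain std::thread. This is not OpenMM's ThreadPool and not the code in CpuBondForce.cpp; runOnThreads and computeEnergy are hypothetical helpers, and the energy formula is just a placeholder workload. Each thread handles a contiguous block of indices and writes to its own slot, so no locking is needed.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper (not part of OpenMM): runs task(threadIndex) on
// numThreads std::threads and waits for all of them to finish.
void runOnThreads(int numThreads, const std::function<void(int)>& task) {
    std::vector<std::thread> workers;
    for (int i = 0; i < numThreads; i++)
        workers.emplace_back(task, i);
    for (std::thread& worker : workers)
        worker.join();
}

// Placeholder workload: sums 0.5*k*x^2 over all displacements.
double computeEnergy(const std::vector<double>& displacements, double k, int numThreads) {
    const std::size_t n = displacements.size();
    const std::size_t count = static_cast<std::size_t>(numThreads);
    std::vector<double> partial(count, 0.0);  // one slot per thread
    runOnThreads(numThreads, [&](int threadIndex) {
        // Contiguous block of the index range owned by this thread.
        const std::size_t t = static_cast<std::size_t>(threadIndex);
        const std::size_t start = t * n / count;
        const std::size_t end = (t + 1) * n / count;
        double sum = 0.0;
        for (std::size_t i = start; i < end; i++)
            sum += 0.5 * k * displacements[i] * displacements[i];
        partial[t] = sum;  // each thread writes only its own slot
    });
    // Reduce the per-thread results on the calling thread.
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}
```

Creating threads on every call, as this sketch does, is exactly the overhead a long-lived pool avoids.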

peastman avatar Nov 22 '17 21:11 peastman

That is exactly what I want to do, but I don't understand where to create, and if necessary initialize, the threads variable.

zonca avatar Nov 22 '17 21:11 zonca

You can just create a thread pool as

ThreadPool threads;

To avoid the overhead of constantly creating and destroying threads, though, the CPU platform has a shared thread pool for use by kernels. You can access it as data.threads, where data is the CpuPlatform::PlatformData. You can get it from the ContextImpl by calling CpuPlatform::getPlatformData(context), though in practice the kernel factory just passes it to the constructors of all the kernels (see https://github.com/pandegroup/openmm/blob/master/platforms/cpu/src/CpuKernelFactory.cpp).
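
The "factory passes it to the constructors" arrangement can be sketched as follows. Everything here is a stand-in: SharedThreads, PlatformData, Kernel, and createKernel are hypothetical simplifications, not the real OpenMM classes or signatures. The point is only that every kernel holds a reference to the same shared object instead of building its own.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for a thread pool; only carries a thread count for illustration.
struct SharedThreads {
    int numThreads = 4;
};

// Stand-in for per-context platform data that owns the shared pool.
struct PlatformData {
    SharedThreads threads;
};

// Hypothetical kernel that keeps a reference to the shared data it was given.
class Kernel {
public:
    Kernel(std::string name, PlatformData& data) : name(std::move(name)), data(data) {}
    SharedThreads& pool() { return data.threads; }
    const std::string& getName() const { return name; }
private:
    std::string name;
    PlatformData& data;
};

// Hypothetical factory function: hands the same PlatformData to every kernel.
std::unique_ptr<Kernel> createKernel(const std::string& name, PlatformData& data) {
    return std::make_unique<Kernel>(name, data);
}
```

Two kernels created from the same PlatformData then see the same pool object, so no threads are created or destroyed per kernel.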

peastman avatar Nov 22 '17 22:11 peastman

thanks, I think I got it, I'll try to implement it. Though it would be nice to have an example of this in the example plugin.

zonca avatar Nov 22 '17 22:11 zonca

Thanks. My plugin has 5 kernels, but I'd like to parallelize only 2 of them:

  • Should I still use the registerKernelFactories function mentioned in the guide at http://docs.openmm.org/latest/developerguide/developer.html#the-cpu-plaform ?
  • That goes in the Reference platform kernel factory, right?
  • Do you have an example of a CpuKernelFactory that only implements a subset of kernels?

zonca avatar Dec 13 '17 21:12 zonca

Just to be clear: there's nothing stopping you from using multiple threads in your reference implementation. We generally don't, because we want to keep the reference implementation as simple as possible, but you can do whatever you want.

If you want to have a different implementation for the CPU platform, just register your kernels with that platform.

Regarding "do you have an example of CPUKernelFactory that only implements a subset of kernels?", see:

https://github.com/pandegroup/openmm/blob/master/platforms/cpu/src/CpuKernelFactory.cpp

peastman avatar Dec 13 '17 23:12 peastman

I'd like to follow OpenMM's conventions and have a separate CPU platform implementation, if I can figure out how to implement it.

From the guide, I can't tell whether the registerKernelFactories function below is needed:

  • only if I want to use the Reference code for both the Reference and CPU platforms
  • or also when I want to reimplement some kernels

extern "C" void registerKernelFactories() {
    for (int i = 0; i < Platform::getNumPlatforms(); i++) {
        Platform& platform = Platform::getPlatform(i);
        if (dynamic_cast<ReferencePlatform*>(&platform) != NULL) {
            // Create and register your KernelFactory.
        }
    }
}

Also, does this go in the Reference kernel factory or the CPU kernel factory?

Thanks!

zonca avatar Dec 14 '17 19:12 zonca

That code works because CpuPlatform is a subclass of ReferencePlatform, so the cast succeeds for either one. If you want to use different implementations for the two platforms, you'll register different kernel factories for them. Here's an example from CudaRpmdKernelFactory.cpp:

extern "C" OPENMM_EXPORT void registerKernelFactories() {
    try {
        Platform& platform = Platform::getPlatformByName("CUDA");
        CudaRpmdKernelFactory* factory = new CudaRpmdKernelFactory();
        platform.registerKernelFactory(IntegrateRPMDStepKernel::Name(), factory);
    }
    catch (std::exception ex) {
        // Ignore
    }
}

It looks up the specific platform it wants by name, then registers a new kernel factory with it. You'll want to do the same thing.

This also assumes the reference and CPU platforms are contained in separate libraries. (That's what the example plugin does. Each platform is a separate library.) Each library can only have a single registerKernelFactories() function. It doesn't matter what file it's defined in, but if you have two of them in the same library the build will fail (typically at link time) due to multiple definitions of the same symbol.
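
The same lookup-by-name pattern can cover the "only 2 of 5 kernels" case: register the factory for just the kernel names that have a CPU-specific version. The following is a toy model, not OpenMM code. The registry, MyCpuKernelFactory, registerCpuKernels, and the kernel names "CalcFirstForce" and "CalcSecondForce" are all hypothetical, and getPlatformByName here only imitates the throw-if-missing behaviour that the try/catch above relies on.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Stand-ins for illustration only; the real OpenMM classes are richer.
struct KernelFactory {
    virtual ~KernelFactory() = default;
};
struct MyCpuKernelFactory : KernelFactory {};  // hypothetical plugin factory

struct Platform {
    std::map<std::string, KernelFactory*> factories;
    void registerKernelFactory(const std::string& kernelName, KernelFactory* factory) {
        factories[kernelName] = factory;
    }
};

// Toy registry of loaded platforms, keyed by name.
std::map<std::string, Platform>& loadedPlatforms() {
    static std::map<std::string, Platform> platforms;
    return platforms;
}

// Imitates lookup by name: throws if the platform is not loaded.
Platform& getPlatformByName(const std::string& name) {
    auto it = loadedPlatforms().find(name);
    if (it == loadedPlatforms().end())
        throw std::runtime_error("No platform named " + name);
    return it->second;
}

// Registers the factory only for the kernels that have a CPU-specific
// implementation. Returns the number of kernel names registered.
int registerCpuKernels(KernelFactory* factory) {
    try {
        Platform& cpu = getPlatformByName("CPU");
        cpu.registerKernelFactory("CalcFirstForce", factory);   // hypothetical name
        cpu.registerKernelFactory("CalcSecondForce", factory);  // hypothetical name
        return 2;
    }
    catch (const std::exception&) {
        return 0;  // CPU platform not available: nothing to register
    }
}
```

In this sketch the remaining kernel names are simply left untouched on the CPU platform entry.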

peastman avatar Dec 14 '17 20:12 peastman

Thanks, I got the kernel registration working. Now I have a problem with ThreadPool not being properly initialized.

In the Kernel Factory, I get the platform data with:

KernelImpl* MBPolCpuKernelFactory::createKernelImpl(std::string name, const Platform& platform, ContextImpl& context) const {       
    CpuPlatform::PlatformData& data = *static_cast<CpuPlatform::PlatformData*>(context.getPlatformData());          

However, when I print data.threads.getNumThreads(), the value is different every time, generally around 20000. Is there any initialization I need to perform?

zonca avatar Feb 01 '18 10:02 zonca

You need to call this instead:

CpuPlatform::PlatformData& data = CpuPlatform::getPlatformData(context);

The CPU platform is a bit different from others because it's a subclass of ReferencePlatform. That means the "platform data" returned by context.getPlatformData() is actually a ReferencePlatform::PlatformData, not a CpuPlatform::PlatformData. The CPU platform then creates a second "platform data" object for the extra data it needs to store (beyond what it inherits from ReferencePlatform), and it needs to create a separate interface for accessing that.
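
One way a "separate interface" like this could be built is a side table keyed by context, so the extra data never has to be recovered by casting the generic pointer. The sketch below is a standalone model under that assumption. Context, CpuData, and CpuPlatformModel are stand-ins, contextCreated is a hypothetical hook, and nothing here is taken from the OpenMM sources. It also shows why the earlier static_cast printed garbage: the pointer being cast referred to an object of a different type.

```cpp
#include <map>

// Stand-in for ContextImpl.
struct Context {};

// Stand-in for the extra, CPU-only data (for example, thread pool settings).
struct CpuData {
    int numThreads = 0;
};

// Hypothetical model of a platform that keeps its extra data in a side table
// instead of in the generic platform-data slot of the context.
class CpuPlatformModel {
public:
    // Hypothetical hook, called once when a context is created.
    static void contextCreated(const Context& context, int numThreads) {
        contextData()[&context].numThreads = numThreads;
    }
    // Typed accessor: looks the context up in the side table.
    // Throws std::out_of_range if the context was never registered.
    static CpuData& getPlatformData(const Context& context) {
        return contextData().at(&context);
    }
private:
    static std::map<const Context*, CpuData>& contextData() {
        static std::map<const Context*, CpuData> data;
        return data;
    }
};
```

A typed accessor like this returns the right object for each context, which is the role CpuPlatform::getPlatformData(context) plays in the fix above.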

peastman avatar Feb 01 '18 19:02 peastman

This fixed it, thanks!

zonca avatar Feb 01 '18 20:02 zonca