parenchyma
Advanced configuration (e.g., multiple devices)
OpenCL
Contexts
The current implementation allows a single context to encapsulate only a single device.
What's possible:
- A single context dedicated to a single device (many contexts -> many devices)
- A single context encapsulating multiple devices
- Many contexts associated with the same device
All of the setups above require advanced scheduling, and cross-device execution is quite rare. Multiple platforms can exist on a single machine; targeting multiple platforms is fine as long as contexts do not cross platform boundaries, i.e., one context per platform is required. In other words, an OpenCL context can only encapsulate devices from a single platform.
Queues
At least one command queue per device is required.
What's possible:
- multiple command queues per context, and even per device (the specification allows this)
OpenCL objects such as memory, program and kernel objects are created using a context. Operations on these objects are performed using a command-queue. The command-queue can be used to queue a set of operations (referred to as commands) in order. Having multiple command-queues allows applications to queue multiple independent commands without requiring synchronization. Note that this should work as long as these objects are not being shared. Sharing of objects across multiple command-queues will require the application to perform appropriate synchronization. This is described in Appendix A of the specification.
CUDA
The current implementation allows a single context to encapsulate only a single device.
...
Once a system has multiple devices, there are two main complications: deciding which device to place the computation for each node in the graph, and then managing the required communication of data across device boundaries implied by these placement decisions.