parenchyma issues

CUDA maintainer

7

I don't have a CUDA-enabled GPU yet, so I can only work on and test the OpenCL and Native components. Would anyone with CUDA running on their machine be interested...

jonysy

help wanted

discussion

array-interop: request for feedback

Please see anowell/are-we-learning-yet#14. I would like the major Rust array crates to interoperate without friction. I would love your feedback

vitiral

Just-in-time (JIT) compilation

1

Add a graph along with a JIT compiler. **Design**: ..

jonysy

enhancement

discussion

Simplify package extension/build system

Consider using [`lazy_static`] for CUDA and OpenCL kernels.

[`lazy_static`]: https://crates.io/crates/lazy_static

jonysy

enhancement
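A minimal sketch of the lazy-initialization idea behind the issue above, so that a kernel program is built once on first use and shared afterwards. It uses the standard library's `std::sync::OnceLock` rather than the `lazy_static` crate so that it compiles without dependencies. `Program`, `build_program`, and `activation_program` are hypothetical names for illustration and are not part of parenchyma's API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the (hypothetical) build step runs.
static BUILDS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a compiled kernel program.
#[derive(Debug)]
struct Program {
    source: &'static str,
}

// Hypothetical build step; a real one would call into the OpenCL or CUDA driver.
fn build_program(source: &'static str) -> Program {
    BUILDS.fetch_add(1, Ordering::SeqCst);
    Program { source }
}

// Returns the shared program, building it on the first call only.
fn activation_program() -> &'static Program {
    static PROGRAM: OnceLock<Program> = OnceLock::new();
    PROGRAM.get_or_init(|| build_program("__kernel void noop() {}"))
}

fn main() {
    let first = activation_program();
    let second = activation_program();
    // Both calls hand back the same instance.
    assert!(std::ptr::eq(first, second));
    println!(
        "built {} time(s), source: {}",
        BUILDS.load(Ordering::SeqCst),
        first.source
    );
}
```

With `lazy_static`, the same shape would be a `lazy_static! { static ref ... }` declaration; the trade-off in either case is that build errors surface at first use rather than at startup.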

Async + Mitigate host-device memory transfer bottlenecks

3

An application is only as fast as its slowest part. Taken from the SO question [mitigate host + device memory transfer bottlenecks in OpenCL/CUDA](http://stackoverflow.com/q/3972260/2561805):

> There are a couple things...

jonysy

enhancement

help wanted

discussion

Fallback mechanism

2

Parenchyma should gracefully fall back to a compatible framework. From the original [README](https://github.com/autumnai/collenchyma/blob/master/README.md):

> Collenchyma does not require OpenCL or Cuda on the machine and automatically falls back to the native...

jonysy

enhancement

help wanted

discussion
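One way the fallback described above could look, as a sketch only: walk a preference list and settle on the native framework when nothing else is usable. The `Framework` enum, `select_framework`, and the availability closure are assumptions made for illustration; how parenchyma actually probes for CUDA or OpenCL is not shown in the issue.

```rust
// Hypothetical set of frameworks; the names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Framework {
    Cuda,
    OpenCl,
    Native,
}

// Picks the first available framework from `preferred`, falling back to
// `Native`, which is assumed to always be usable on the host.
fn select_framework(
    preferred: &[Framework],
    is_available: impl Fn(Framework) -> bool,
) -> Framework {
    preferred
        .iter()
        .copied()
        .find(|&framework| is_available(framework))
        .unwrap_or(Framework::Native)
}

fn main() {
    let order = [Framework::Cuda, Framework::OpenCl];

    // Pretend that only OpenCL is installed on this machine.
    let chosen = select_framework(&order, |f| f == Framework::OpenCl);
    println!("selected: {:?}", chosen);

    // Pretend that no accelerator framework is installed at all.
    let chosen = select_framework(&order, |_| false);
    println!("selected: {:?}", chosen);
}
```

Passing the availability check as a closure keeps the selection logic testable without a GPU; a real probe would query the installed drivers instead.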

Consider dropping the length check (OpenCL 2.0)

@drahnr's point via Gitter:

> I think you can safely drop the [length check][1] in opencl 2.0

> OpenCL 2.0 has something called remainder workgroups.

[1]: https://github.com/jonysy/parenchyma-dnn/blob/master/src/frameworks/cl/source/activation.cl

jonysy

enhancement
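For context, a CPU-side sketch of why such a length check usually exists. When the host rounds the global work size up to a multiple of the work-group size, the extra work-items must skip their memory access. With remainder (non-uniform) work-groups the global size can equal the buffer length, so no padding work-items are launched. The functions below are hypothetical simulations written for illustration; they are not the contents of the linked `activation.cl`.

```rust
// Rounds `len` up to the next multiple of `local_size`, as a host might do
// when the global work size has to be a multiple of the work-group size.
fn padded_global_size(len: usize, local_size: usize) -> usize {
    ((len + local_size - 1) / local_size) * local_size
}

// Simulates the body of a ReLU kernel for a single work-item.
fn relu_work_item(global_id: usize, data: &mut [f32]) {
    // The length check: padding work-items must not touch the buffer.
    if global_id < data.len() {
        data[global_id] = data[global_id].max(0.0);
    }
}

fn main() {
    let mut data: Vec<f32> = vec![-1.0, 2.0, -3.0, 4.0, -5.0];

    // Five elements with a work-group size of four gives a padded size of eight,
    // so three work-items fall outside the buffer and rely on the check.
    let global = padded_global_size(data.len(), 4);
    for id in 0..global {
        relu_work_item(id, &mut data);
    }
    println!("global size {}, result {:?}", global, data);
}
```

Dropping the check is only safe if every launch uses a global size equal to the buffer length, which is an assumption the host code would have to guarantee.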

RFCs to consider

### Extending coherence with workspaces proposal

If merged, the "extending coherence with workspaces" proposal would allow authors to implement traits for types defined within the workspace it's associated with. Under...

jonysy

enhancement

language limit

Transfer Matrix

1

There is a need to handle transfers between devices more easily. The current approach of syncing from one backend to another is insufficient and does not scale with more backends. There are...

drahnr

discussion

Advanced configuration (e.g., multiple devices)

1

[OpenCL reference card](http://www.slideshare.net/piyushmittalin/opencl-12quickreferencecard)

[Porting CUDA Applications to OpenCL](http://developer.amd.com/tools-and-sdks/opencl-zone/opencl-resources/programming-in-opencl/porting-cuda-applications-to-opencl/)

## OpenCL

### Contexts

Current implementation allows for a single context to encapsulate a single device only.

**What's possible**:

- A single...

jonysy

enhancement

help wanted

information

discussion
