Ralph Kube
POWER9 support would be very helpful. Many HPC clusters are POWER9-based, and it would be great to use VS Code to edit files on these systems.
I've ported this pullback to CUDA.jl: https://gist.github.com/rkube/b17ef683409d76a3f01bcc590b85de6e Where would be a good place for that code?
Cool, that works. Here is a gist that incorporates the code from https://github.com/JuliaDiff/ChainRules.jl/pull/469 into a minimal working example: https://gist.github.com/rkube/b965267944115af7d13b3f00e7533572 This code gives the same results as comparable PyTorch code. But...
Those four reduction operators are all dispatched in a similar manner (see https://github.com/JuliaLang/julia/blob/bb2d8630f3aeb99c38c659a35ee9aa57bb71012a/base/reducedim.jl#L885), so I thought it made sense to handle them all the same way. Feel free to close this if it's not needed.
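For illustration only (the names below are hypothetical and not the actual Base internals), this is the kind of shared-dispatch pattern that link points at: a single loop generates all four reductions on top of the same `mapreduce` machinery, which is why treating them uniformly seemed natural.

```julia
# Illustrative sketch of the shared-dispatch pattern: one @eval loop
# defines all four reductions via the same mapreduce call, each with its
# own operator and neutral element. Names are made up for this example.
for (fname, op, init) in ((:mysum, :+, 0.0), (:myprod, :*, 1.0),
                          (:mymaximum, :max, -Inf), (:myminimum, :min, Inf))
    @eval $(fname)(x::AbstractArray; dims=:) =
        mapreduce(identity, $(op), x; dims=dims, init=$(init))
end

mysum([1.0 2.0; 3.0 4.0]; dims=1)   # 1×2 Matrix: [4.0 6.0]
mymaximum([1.0 2.0; 3.0 4.0])       # 4.0
```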
Thanks @niklasschmitz for providing the updated code. The runtime of the gradient is now about 100x that of gmres, whereas @antoine-levitt reported 20x. Do you know what could explain the difference? Also the number...
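For reference, here is a rough sketch of how one might measure that ratio. To keep the snippet self-contained it uses the dense `\` solve as a stand-in for gmres (Zygote differentiates `\` out of the box); with the gmres rule from the gist the same pattern applies, and the matrix size and names here are arbitrary, so the actual numbers will differ.

```julia
# Hedged sketch: time the forward solve and the reverse-mode gradient
# through it, then report their ratio. `\` stands in for gmres here.
using BenchmarkTools, LinearAlgebra, Zygote

n = 512
A = I + 0.1 * randn(n, n)        # well-conditioned test matrix
b = randn(n)

loss(b) = sum(abs2, A \ b)       # scalar loss through the linear solve

t_solve = @belapsed $A \ $b
t_grad  = @belapsed Zygote.gradient($loss, $b)
println("gradient / solve runtime ratio ≈ ", round(t_grad / t_solve, digits=1))
```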
I've replaced pathos with [multiprocessing](https://docs.python.org/3/library/multiprocessing.html) and I'm getting the same errors.
To preprocess the dataset on traverse I need to limit the number of threads used for preprocessing; see https://github.com/PPPLDeepLearning/plasma-python/issues/82
Each traverse node has 2 processors, 16 cores per processor, and 4 threads per core, i.e. 128 hardware threads in total. When I run pre-processing with 126 threads it starts off well but throws errors after...
Summit and traverse are very similar, but not 100% identical.
```
(frnn) [rkube@traverse examples]$ lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    ...
```
They can be merged.