Valentin Churavy comments

Results: 1411 comments of Valentin Churavy

Add waitfirst/fetchfirst operator

I don't think we need to recurse into that just yet. @jpsamaroo are these tasks that are both local and remote?

SGE: can't even add processors

I am not sure I understand this issue. You are running on an SGE cluster and you are trying to add processes with `addprocs_qrsh`? That won't work. > For some...

SGE: can't even add processors

Yeah, so either you get stuck allocating forever or ~the time-out doesn't trigger.~ Looks like qsub doesn't have a time-out. You may want to instrument this code: https://github.com/JuliaParallel/ClusterManagers.jl/blob/e375f50f2c4eab3d3f4cefcea3465c82734cfb71/src/qsub.jl#L83

[WIP/DNM]make similar produce the right chunktype

This has become a pile of various changes I needed for a project. I will separate them out and clean them up.

[WIP/DNM]make similar produce the right chunktype

Not that old :) and I would keep it open as a reminder to myself that this work needs to get cleaned up and finished or migrated over to Dagger.

Comprehensive tests

Oh, that is rather interesting! I see you are running it on Travis.

Basic vector functionality (push, append, etc)

Yeah, just came across this: the whole vector test suite is not being run. We wanted to do `push!(CuArray([1]), 1)`.

Broadcast uses wrong abstract interpreter, breaks device-only functionality

That is an excellent find! I think the short-term solution is for GPUArrays to use a custom `combine_eltypes` that uses the device interpreter. I don't think something like `with_interpeter` (besides...

Add a hostcall interface

> I had expected this when running with a single thread, because the main task isn't preemptible, but even with multiple threads the main task getting stuck apparently blocks the...

Performance hit of using cudadevrt

I discussed this with @trws yesterday and he saw good or even better performance using LTO with OpenMP offloading and cudadevrt. x-ref: https://github.com/JuliaGPU/CUDA.jl/blob/de004245e51e4f27b24d6952cc6dba989bc1ba98/src/compiler/execution.jl#L437-L438 Do we need to use nvlink or...