
VAE on GPU

Open · xukai92 opened this issue 5 years ago • 5 comments

My attempt to make the VAE example work on GPU.

Currently it fails with the error below:

WARNING: using Distributions.params in module Main conflicts with an existing identifier.
[ Info: Epoch 1
┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called exp(x::T) where T<:Union{Float32, Float64} in Base.Math at special/exp.jl:75, maybe you intended to call exp(x::Float32) in CUDAnative at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/device/cuda/math.jl:90 instead?
│    Stacktrace:
│     [1] exp at special/exp.jl:75
│     [2] #23 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:49
└ @ CUDAnative ~/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:77
┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called log(x::Float64) in Base.Math at special/log.jl:254, maybe you intended to call log(x::Float64) in CUDAnative at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/device/cuda/math.jl:65 instead?
│    Stacktrace:
│     [1] log at special/log.jl:254
│     [2] randn_unlikely at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:49
│     [3] randn at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:165
│     [4] #23 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:49
└ @ CUDAnative ~/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:77
┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called exp(x::T) where T<:Union{Float32, Float64} in Base.Math at special/exp.jl:75, maybe you intended to call exp(x::Float64) in CUDAnative at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/device/cuda/math.jl:89 instead?
│    Stacktrace:
│     [1] exp at special/exp.jl:75
│     [2] randn_unlikely at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:49
│     [3] randn at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:165
│     [4] #23 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:49
└ @ CUDAnative ~/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:77
ERROR: LoadError: GPU compilation of #23(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},typeof(z),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}},Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}}}}) failed
KernelError: recursion is currently not supported

Try inspecting the generated code with any of the @device_code_... macros.

Stacktrace:
 [1] randn_unlikely at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:49
 [2] randn at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:37
 [3] randn_unlikely at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:49
 [4] randn at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:165
 [5] #23 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:49
Stacktrace:
 [1] (::getfield(CUDAnative, Symbol("#hook_emit_function#66")){CUDAnative.CompilerJob,Array{Core.MethodInstance,1}})(::Core.MethodInstance, ::Core.CodeInfo, ::UInt64) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:59
 [2] compile_method_instance at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:97 [inlined]
 [3] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216 [inlined]
 [4] irgen(::CUDAnative.CompilerJob, ::Core.MethodInstance, ::UInt64) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:110
 [5] #codegen#116(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Function, ::Symbol, ::CUDAnative.CompilerJob) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216
 [6] #codegen at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/driver.jl:0 [inlined]
 [7] #compile#115(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Function, ::Symbol, ::CUDAnative.CompilerJob) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/driver.jl:47
 [8] #compile#114 at ./none:0 [inlined]
 [9] compile at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/driver.jl:28 [inlined] (repeats 2 times)
 [10] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:378 [inlined]
 [11] #cufunction#156(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.cufunction), ::getfield(GPUArrays, Symbol("##23#24")), ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},typeof(z),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}},Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}}}}}}) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:347
 [12] cufunction(::Function, ::Type) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:347
 [13] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:174 [inlined]
 [14] macro expansion at ./gcutils.jl:87 [inlined]
 [15] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:171 [inlined]
 [16] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Float32,2}, ::Tuple{CuArray{Float32,2},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},typeof(z),Tuple{Base.Broadcast.Extruded{CuArray{Float32,2},Tuple{Bool,Bool},Tuple{Int64,Int64}},Base.Broadcast.Extruded{CuArray{Float32,2},Tuple{Bool,Bool},Tuple{Int64,Int64}}}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CuArrays/PwSdF/src/gpuarray_interface.jl:59
 [17] gpu_call at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/abstract_gpu_interface.jl:151 [inlined]
 [18] gpu_call at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/abstract_gpu_interface.jl:128 [inlined]
 [19] copyto! at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:48 [inlined]
 [20] copyto! at ./broadcast.jl:797 [inlined]
 [21] copy at ./broadcast.jl:773 [inlined]
 [22] materialize at ./broadcast.jl:753 [inlined]
 [23] broadcast(::typeof(z), ::CuArray{Float32,2}, ::CuArray{Float32,2}) at ./broadcast.jl:707
 [24] ∇broadcast at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/lib/array.jl:475 [inlined]
 [25] materialize(::Base.Broadcast.Broadcasted{Tracker.TrackedStyle,Nothing,typeof(z),Tuple{TrackedArray{…,CuArray{Float32,2}},TrackedArray{…,CuArray{Float32,2}}}}) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/lib/array.jl:506
 [26] L̄(::CuArray{Float32,2}) at /afs/inf.ed.ac.uk/user/s16/s1672897/tmp/playground2.jl:40
 [27] loss(::CuArray{Float32,2}) at /afs/inf.ed.ac.uk/user/s16/s1672897/tmp/playground2.jl:42
 [28] #15 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Flux/U6ueY/src/optimise/train.jl:72 [inlined]
 [29] gradient_(::getfield(Flux.Optimise, Symbol("##15#21")){typeof(loss),Tuple{CuArray{Float32,2}}}, ::Tracker.Params) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/back.jl:97
 [30] #gradient#24(::Bool, ::Function, ::Function, ::Tracker.Params) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/back.jl:164
 [31] gradient at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/back.jl:164 [inlined]
 [32] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Flux/U6ueY/src/optimise/train.jl:71 [inlined]
 [33] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Juno/TfNYn/src/progress.jl:133 [inlined]
 [34] #train!#12(::getfield(Flux, Symbol("#throttled#18")){getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##7#8")),Int64}}, ::Function, ::Function, ::Tracker.Params, ::Base.Iterators.Zip{Tuple{Array{CuArray{Float32,2},1}}}, ::ADAM) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Flux/U6ueY/src/optimise/train.jl:69
 [35] (::getfield(Flux.Optimise, Symbol("#kw##train!")))(::NamedTuple{(:cb,),Tuple{getfield(Flux, Symbol("#throttled#18")){getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##7#8")),Int64}}}}, ::typeof(Flux.Optimise.train!), ::Function, ::Tracker.Params, ::Base.Iterators.Zip{Tuple{Array{CuArray{Float32,2},1}}}, ::ADAM) at ./none:0
 [36] top-level scope at /afs/inf.ed.ac.uk/user/s16/s1672897/tmp/playground2.jl:56
 [37] include at ./boot.jl:326 [inlined]
 [38] include_relative(::Module, ::String) at ./loading.jl:1038
 [39] include(::Module, ::String) at ./sysimg.jl:29
 [40] exec_options(::Base.JLOptions) at ./client.jl:267
 [41] _start() at ./client.jl:436
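For context: every warning above points at scalar exp/log/randn being compiled into the broadcast kernel, and frame [23] of the stacktrace shows the offending broadcast is z over two CuArrays. A minimal sketch of that failing pattern, reconstructed from the stacktrace rather than copied from playground2.jl:

    # Reparameterisation written as a scalar function, then broadcast.
    z(μ, logσ) = μ + exp(logσ) * randn(Float32)

    # Fine on CPU arrays. With CuArrays, the fused kernel contains the
    # per-element randn() (plus Base's exp/log), and Base's normal
    # sampler cannot be compiled for the GPU -- hence the
    # "KernelError: recursion is currently not supported" above.
    # zs = z.(μ, logσ)   # μ, logσ being CuArrays triggers the error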

xukai92 commented Apr 30 '19 23:04

@MikeInnes Here is the VAE example I was talking about. It was originally mentioned here: https://github.com/FluxML/model-zoo/issues/20

Not sure if it has anything to do with the rand() function, which usually needs some care to be sent to the GPU explicitly in other frameworks (e.g. Knet.jl, PyTorch).
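What that explicit care would look like here, roughly: draw the noise as a whole array and move it to the device yourself (sizes below are placeholders):

    using CuArrays

    # randn runs on the CPU for the full array; only the finished
    # array is shipped to the GPU, so no RNG runs inside a kernel.
    ε = cu(randn(Float32, 5, 100))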

xukai92 commented Apr 30 '19 23:04

Thanks for the patch!

Could you possibly comment out the using CuArrays bit so it's easier to run out of the box on CPUs?
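i.e. something like this at the top of the script:

    # Uncomment (with CuArrays installed) to train on the GPU; leaving
    # it commented keeps the example runnable on CPU-only machines.
    # using CuArrays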

DhairyaLGandhi commented May 13 '19 09:05

I could do that, but I guess we need to figure out why the current version isn't working on the GPU first.

xukai92 commented May 20 '19 22:05

I still haven't figured out why this thing doesn't run on GPUs :(

xukai92 commented Jul 07 '19 21:07

OK, it works on GPU now. I made the following changes (see the sketch below):

  • fixed a bug in sending data to the GPU
  • removed the broadcast ops in logp_x_z and kl_q_p
    • I tested removing each one on its own, and neither alone is enough

@dhairyagandhi96 I also commented out using CuArrays, following your suggestion.
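For reference, the stacktrace earlier pins the failure on randn inside the broadcast of z; a minimal sketch of the kind of rewrite that avoids that class of error (illustrative only, not the exact diff; sample_z is a hypothetical name):

    using Flux
    # using CuArrays   # commented out so the script runs on CPU by default

    # Draw the noise as one array outside any kernel, then stick to
    # array-level ops that CuArrays/Tracker know how to handle.
    # gpu(...) is a no-op when CuArrays is not loaded.
    function sample_z(μ, logσ)
        ε = gpu(randn(Float32, size(μ)))
        return μ .+ exp.(logσ) .* ε
    end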

xukai92 commented Jul 07 '19 21:07