VAE on GPU
My attempt to make the VAE work on the GPU. It currently fails with the error below:
WARNING: using Distributions.params in module Main conflicts with an existing identifier.
[ Info: Epoch 1
┌ Warning: calls to Base intrinsics might be GPU incompatible
│ exception =
│ You called exp(x::T) where T<:Union{Float32, Float64} in Base.Math at special/exp.jl:75, maybe you intended to call exp(x::Float32) in CUDAnative at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/device/cuda/math.jl:90 instead?
│ Stacktrace:
│ [1] exp at special/exp.jl:75
│ [2] #23 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:49
└ @ CUDAnative ~/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:77
┌ Warning: calls to Base intrinsics might be GPU incompatible
│ exception =
│ You called log(x::Float64) in Base.Math at special/log.jl:254, maybe you intended to call log(x::Float64) in CUDAnative at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/device/cuda/math.jl:65 instead?
│ Stacktrace:
│ [1] log at special/log.jl:254
│ [2] randn_unlikely at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:49
│ [3] randn at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:165
│ [4] #23 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:49
└ @ CUDAnative ~/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:77
┌ Warning: calls to Base intrinsics might be GPU incompatible
│ exception =
│ You called exp(x::T) where T<:Union{Float32, Float64} in Base.Math at special/exp.jl:75, maybe you intended to call exp(x::Float64) in CUDAnative at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/device/cuda/math.jl:89 instead?
│ Stacktrace:
│ [1] exp at special/exp.jl:75
│ [2] randn_unlikely at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:49
│ [3] randn at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:165
│ [4] #23 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:49
└ @ CUDAnative ~/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:77
ERROR: LoadError: GPU compilation of #23(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},typeof(z),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}},Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}}}}) failed
KernelError: recursion is currently not supported
Try inspecting the generated code with any of the @device_code_... macros.
Stacktrace:
[1] randn_unlikely at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:49
[2] randn at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:37
[3] randn_unlikely at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:49
[4] randn at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Random/src/normal.jl:165
[5] #23 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:49
Stacktrace:
[1] (::getfield(CUDAnative, Symbol("#hook_emit_function#66")){CUDAnative.CompilerJob,Array{Core.MethodInstance,1}})(::Core.MethodInstance, ::Core.CodeInfo, ::UInt64) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:59
[2] compile_method_instance at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:97 [inlined]
[3] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216 [inlined]
[4] irgen(::CUDAnative.CompilerJob, ::Core.MethodInstance, ::UInt64) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/irgen.jl:110
[5] #codegen#116(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Function, ::Symbol, ::CUDAnative.CompilerJob) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216
[6] #codegen at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/driver.jl:0 [inlined]
[7] #compile#115(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Function, ::Symbol, ::CUDAnative.CompilerJob) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/driver.jl:47
[8] #compile#114 at ./none:0 [inlined]
[9] compile at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/compiler/driver.jl:28 [inlined] (repeats 2 times)
[10] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:378 [inlined]
[11] #cufunction#156(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.cufunction), ::getfield(GPUArrays, Symbol("##23#24")), ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},typeof(z),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}},Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Float32,2,CUDAnative.AS.Global},Tuple{Bool,Bool},Tuple{Int64,Int64}}}}}}) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:347
[12] cufunction(::Function, ::Type) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:347
[13] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:174 [inlined]
[14] macro expansion at ./gcutils.jl:87 [inlined]
[15] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CUDAnative/wU0tS/src/execution.jl:171 [inlined]
[16] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Float32,2}, ::Tuple{CuArray{Float32,2},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64},Base.OneTo{Int64}},typeof(z),Tuple{Base.Broadcast.Extruded{CuArray{Float32,2},Tuple{Bool,Bool},Tuple{Int64,Int64}},Base.Broadcast.Extruded{CuArray{Float32,2},Tuple{Bool,Bool},Tuple{Int64,Int64}}}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CuArrays/PwSdF/src/gpuarray_interface.jl:59
[17] gpu_call at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/abstract_gpu_interface.jl:151 [inlined]
[18] gpu_call at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/abstract_gpu_interface.jl:128 [inlined]
[19] copyto! at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/GPUArrays/CjRPU/src/broadcast.jl:48 [inlined]
[20] copyto! at ./broadcast.jl:797 [inlined]
[21] copy at ./broadcast.jl:773 [inlined]
[22] materialize at ./broadcast.jl:753 [inlined]
[23] broadcast(::typeof(z), ::CuArray{Float32,2}, ::CuArray{Float32,2}) at ./broadcast.jl:707
[24] ∇broadcast at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/lib/array.jl:475 [inlined]
[25] materialize(::Base.Broadcast.Broadcasted{Tracker.TrackedStyle,Nothing,typeof(z),Tuple{TrackedArray{…,CuArray{Float32,2}},TrackedArray{…,CuArray{Float32,2}}}}) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/lib/array.jl:506
[26] L̄(::CuArray{Float32,2}) at /afs/inf.ed.ac.uk/user/s16/s1672897/tmp/playground2.jl:40
[27] loss(::CuArray{Float32,2}) at /afs/inf.ed.ac.uk/user/s16/s1672897/tmp/playground2.jl:42
[28] #15 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Flux/U6ueY/src/optimise/train.jl:72 [inlined]
[29] gradient_(::getfield(Flux.Optimise, Symbol("##15#21")){typeof(loss),Tuple{CuArray{Float32,2}}}, ::Tracker.Params) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/back.jl:97
[30] #gradient#24(::Bool, ::Function, ::Function, ::Tracker.Params) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/back.jl:164
[31] gradient at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Tracker/rQ0eB/src/back.jl:164 [inlined]
[32] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Flux/U6ueY/src/optimise/train.jl:71 [inlined]
[33] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Juno/TfNYn/src/progress.jl:133 [inlined]
[34] #train!#12(::getfield(Flux, Symbol("#throttled#18")){getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##7#8")),Int64}}, ::Function, ::Function, ::Tracker.Params, ::Base.Iterators.Zip{Tuple{Array{CuArray{Float32,2},1}}}, ::ADAM) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Flux/U6ueY/src/optimise/train.jl:69
[35] (::getfield(Flux.Optimise, Symbol("#kw##train!")))(::NamedTuple{(:cb,),Tuple{getfield(Flux, Symbol("#throttled#18")){getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##7#8")),Int64}}}}, ::typeof(Flux.Optimise.train!), ::Function, ::Tracker.Params, ::Base.Iterators.Zip{Tuple{Array{CuArray{Float32,2},1}}}, ::ADAM) at ./none:0
[36] top-level scope at /afs/inf.ed.ac.uk/user/s16/s1672897/tmp/playground2.jl:56
[37] include at ./boot.jl:326 [inlined]
[38] include_relative(::Module, ::String) at ./loading.jl:1038
[39] include(::Module, ::String) at ./sysimg.jl:29
[40] exec_options(::Base.JLOptions) at ./client.jl:267
[41] _start() at ./client.jl:436
@MikeInnes Here is the VAE example I was talking about. It was originally mentioned here: https://github.com/FluxML/model-zoo/issues/20
Not sure if it has anything to do with the rand() function, which usually needs some care to be sent to the GPU explicitly in other frameworks (e.g. Knet.jl, PyTorch).
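For what it's worth, the KernelError above points at randn() being called per element inside the broadcasted z, so the CUDA kernel tries to compile Base's CPU sampler. A minimal sketch of the usual workaround, drawing the whole noise array outside the broadcast (z, μ, and logσ follow the names in the zoo's vae.jl; gpu is Flux's device mover):

```julia
using Flux  # for gpu(), which is the identity unless CuArrays is loaded

# Instead of z(μ, logσ) = μ + exp(logσ) * randn() broadcast over every
# element (which pulls Base's randn into the GPU kernel), draw the
# noise as one array up front and use plain elementwise ops:
function z(μ, logσ)
    ε = gpu(randn(Float32, size(μ)))   # sampled outside the kernel
    return μ .+ exp.(logσ) .* ε
end
```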
Thanks for the patch! Could you possibly comment out the using CuArrays bit so it's easier to run out of the box on CPUs?
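A minimal sketch of what that could look like (assuming Flux's gpu helper, which falls back to the identity when CuArrays isn't loaded; the data batch here is hypothetical):

```julia
using Flux
# using CuArrays  # uncomment to run on a GPU

# gpu(x) moves x to the GPU when CuArrays is loaded and is a no-op
# otherwise, so the same script runs out of the box on CPUs.
X = gpu(rand(Float32, 28^2, 100))  # hypothetical MNIST-sized batch
```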
I could do that, but I guess we need to figure out why the current version is not working on the GPU first.
I still haven't figured out why this thing doesn't run on GPUs :(
OK, it works on the GPU now. I made the following changes:
- fix a bug in sending data to the GPU
- remove the broadcast ops in logp_x_z and kl_q_p (see the sketch below)
- I tested removing each change, and neither alone works
@dhairyagandhi96 I also commented out using CuArrays following your suggestion.
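For reference, a minimal sketch of de-broadcasted losses, assuming the original versions broadcast Distributions' scalar logpdf over Bernoulli.(f(z)) as in the zoo's vae.jl (f, μ, and logσ follow the names there; the exact rewrite in my script may differ):

```julia
# Bernoulli log-likelihood of the data given latents, written with
# array-level ops instead of logpdf.(Bernoulli.(f(z)), x):
function logp_x_z(x, z)
    x̂ = f(z)  # decoder output, elementwise in (0, 1)
    return sum(x .* log.(x̂) .+ (1 .- x) .* log.(1 .- x̂))
end

# Closed-form KL(q(z|x) ‖ N(0, I)) for a diagonal Gaussian posterior:
kl_q_p(μ, logσ) = 0.5f0 * sum(exp.(2f0 .* logσ) .+ μ .^ 2 .- 1f0 .- 2f0 .* logσ)
```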