FFTW.jl
Error when using pmap and plan_fft!
Version info:
julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i5-10400 CPU @ 2.90GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, skylake)
Environment:
JULIA_HOME = C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\bin
JULIA_NUM_THREADS = 3
JULIA_PKG_SERVER = https://mirrors.tuna.tsinghua.edu.cn/julia
julia> using FFTW
julia> FFTW.get_provider()
"fftw"
Code to reproduce:
using Distributed
using ClusterManagers
addprocs(2)
@everywhere begin
    using FFTW
    FFTW.set_num_threads(1)
    ##
    N = round(Int, 2048)
    ##
    Tpx = plan_fft(rand(ComplexF64, N))
    ##
    function mapfunc(Tpx)
        test2 -> Tpx * test2
    end
    f = mapfunc(Tpx)
end
##
test2=[rand(ComplexF64,N), rand(ComplexF64,N)]
F=pmap(f,test2)
print(F)
##
Error message:
julia debug.jl
From worker 2:
From worker 2: Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
From worker 2: Exception: EXCEPTION_ACCESS_VIOLATION at 0x54496564 -- .text at C:\Users\dell\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3-3.dll (unknown line)
From worker 2: in expression starting at none:0
From worker 2: .text at C:\Users\dell\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3-3.dll (unknown line)
From worker 2: unsafe_execute! at C:\Users\dell\.julia\packages\FFTW\SDUwi\src\fft.jl:466 [inlined]
From worker 2: * at C:\Users\dell\.julia\packages\FFTW\SDUwi\src\fft.jl:721 [inlined]
From worker 2: #3 at F:\Simulation\Trap simulation\2layers trap\BEM2Layers\ElectronTrapPost\DC_1D\th2\debug.jl:13
From worker 2: unknown function (ip: 0000000052d93c66)
From worker 2: jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
From worker 2: do_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\builtins.c:713
From worker 2: #106 at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:278
From worker 2: run_work_thunk at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:63
From worker 2: macro expansion at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:278 [inlined]
From worker 2: #105 at .\task.jl:423
From worker 2: unknown function (ip: 0000000052d858a3)
From worker 2: jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
From worker 2: start_task at /cygdrive/c/buildbot/worker/package_win64/build/src\task.c:877
From worker 2: Allocations: 7967560 (Pool: 7964384; Big: 3176); GC: 10
From worker 3:
From worker 3: Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
From worker 3: Exception: EXCEPTION_ACCESS_VIOLATION at 0x544e6564 -- .text at C:\Users\dell\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3-3.dll (unknown line)
From worker 3: in expression starting at none:0
From worker 3: .text at C:\Users\dell\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3-3.dll (unknown line)
From worker 3: unsafe_execute! at C:\Users\dell\.julia\packages\FFTW\SDUwi\src\fft.jl:466 [inlined]
From worker 3: * at C:\Users\dell\.julia\packages\FFTW\SDUwi\src\fft.jl:721 [inlined]
From worker 3: #3 at F:\Simulation\Trap simulation\2layers trap\BEM2Layers\ElectronTrapPost\DC_1D\th2\debug.jl:13
From worker 3: unknown function (ip: 0000000052df4566)
From worker 3: jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
From worker 3: do_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\builtins.c:713
From worker 3: #106 at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:278
From worker 3: run_work_thunk at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:63
From worker 3: macro expansion at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:278 [inlined]
From worker 3: #105 at .\task.jl:423
From worker 3: unknown function (ip: 0000000052de61a3)
From worker 3: jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
From worker 3: start_task at /cygdrive/c/buildbot/worker/package_win64/build/src\task.c:877
From worker 3: Allocations: 7968181 (Pool: 7965007; Big: 3174); GC: 10
Worker 2 terminated.
ERROR: LoadError: Worker 3 terminated.ProcessExitedException
(2)
Stacktrace:
[1] (::Base.var"#892#894")(x::Task)
@ Base .\asyncmap.jl:177
[2] foreach(f::Base.var"#892#894", itr::Vector{Any})
@ Base .\abstractarray.jl:2694
[3] maptwice(wrapped_f::Function, chnl::Channel{Any}, worker_tasks::Vector{Any}, c::Vector{Vector{ComplexF64}})
@ Base .\asyncmap.jl:177
[4] wrap_n_exec_twice
@ .\asyncmap.jl:153 [inlined]
[5] #async_usemap#877
@ .\asyncmap.jl:103 [inlined]
[6] #asyncmap#876
@ .\asyncmap.jl:81 [inlined]
[7] pmap(f::Function, p::WorkerPool, c::Vector{Vector{ComplexF64}}; distributed::Bool, batch_size::Int64, on_error::Nothing, retry_delays::Vector{Any}, retry_check::Nothing)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\pmap.jl:126
[8] pmap(f::Function, p::WorkerPool, c::Vector{Vector{ComplexF64}})
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\pmap.jl:101
[9] pmap(f::Function, c::Vector{Vector{ComplexF64}}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\pmap.jl:156
[10] pmap(f::Function, c::Vector{Vector{ComplexF64}})
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\pmap.jl:156
[11] top-level scope
@ F:\Simulation\Trap simulation\2layers trap\BEM2Layers\ElectronTrapPost\DC_1D\th2\debug.jl:19
in expression starting at F:\Simulation\Trap simulation\2layers trap\BEM2Layers\ElectronTrapPost\DC_1D\th2\debug.jl:19
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
[1] (::Base.var"#wait_locked#645")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
@ Base .\stream.jl:892
[2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
@ Base .\stream.jl:900
[3] unsafe_read
@ .\io.jl:724 [inlined]
[4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
@ Base .\io.jl:723
[5] read!
@ .\io.jl:725 [inlined]
[6] deserialize_hdr_raw
@ C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\messages.jl:167 [inlined]
[7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:165
[8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:126
[9] (::Distributed.var"#99#100"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
@ Distributed .\task.jl:423
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
[1] (::Base.var"#wait_locked#645")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
@ Base .\stream.jl:892
[2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
@ Base .\stream.jl:900
[3] unsafe_read
@ .\io.jl:724 [inlined]
[4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
@ Base .\io.jl:723
[5] read!
@ .\io.jl:725 [inlined]
[6] deserialize_hdr_raw
@ C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\messages.jl:167 [inlined]
[7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:165
[8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:126
[9] (::Distributed.var"#99#100"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
@ Distributed .\task.jl:423
Without FFTW but with an anonymous function, it works fine:
using Distributed
using ClusterManagers
addprocs(2)
@everywhere begin
    ##
    N = 2048
    Tpx = 1
    ##
    function mapfunc(Tpx)
        test2 -> Tpx * test2
    end
    f = mapfunc(Tpx)
end
##
test2=[rand(ComplexF64,N), rand(ComplexF64,N)]
F=pmap(f,test2)
print("Completed")
##
julia debug.jl
Completed
With FFTW but without an anonymous function, it works fine too:
using Distributed
using ClusterManagers
addprocs(2)
@everywhere begin
    using FFTW
    FFTW.set_num_threads(1)
    ##
    N = round(Int, 2048)
    ##
    Tpx = plan_fft(rand(ComplexF64, N))
    ##
    # function mapfunc(Tpx)
    #     test2 -> Tpx * test2
    # end
    f(test2) = Tpx * test2
end
##
test2=[rand(ComplexF64,N), rand(ComplexF64,N)]
F=pmap(f,test2)
print("Completed")
##
julia debug.jl
Completed
With FFTW and an anonymous function, the same error appears again:
using Distributed
using ClusterManagers
addprocs(2)
@everywhere begin
    using FFTW
    FFTW.set_num_threads(1)
    ##
    N = round(Int, 2048)
    ##
    Tpx = plan_fft(rand(ComplexF64, N))
    ##
    function mapfunc(Tpx)
        test2 -> Tpx * test2
    end
    f = mapfunc(Tpx)
end
##
test2=[rand(ComplexF64,N), rand(ComplexF64,N)]
F=pmap(f,test2)
print("Completed")
##
julia debug.jl
From worker 3:
From worker 3: Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
From worker 2:
From worker 2: Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
From worker 3: Exception: EXCEPTION_ACCESS_VIOLATION at 0x54486564 -- .text at C:\Users\dell\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3-3.dll (unknown line)
From worker 3: in expression starting at none:0
From worker 3: .text at C:\Users\dell\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3-3.dll (unknown line)
From worker 2: Exception: EXCEPTION_ACCESS_VIOLATION at 0x54496564 -- .text at C:\Users\dell\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3-3.dll (unknown line)
From worker 2: in expression starting at none:0
From worker 2: .text at C:\Users\dell\.julia\artifacts\b7dd1809d0626eac3bf6f97ba8ccfbb6cc63c509\bin\libfftw3-3.dll (unknown line)
From worker 2: unsafe_execute! at C:\Users\dell\.julia\packages\FFTW\SDUwi\src\fft.jl:466 [inlined]
From worker 2: * at C:\Users\dell\.julia\packages\FFTW\SDUwi\src\fft.jl:721 [inlined]
From worker 2: #3 at F:\Simulation\Trap simulation\2layers trap\BEM2Layers\ElectronTrapPost\DC_1D\th2\debug.jl:13
From worker 2: unknown function (ip: 0000000052d93c66)
From worker 3: unsafe_execute! at C:\Users\dell\.julia\packages\FFTW\SDUwi\src\fft.jl:466 [inlined]
From worker 3: * at C:\Users\dell\.julia\packages\FFTW\SDUwi\src\fft.jl:721 [inlined]
From worker 3: #3 at F:\Simulation\Trap simulation\2layers trap\BEM2Layers\ElectronTrapPost\DC_1D\th2\debug.jl:13
From worker 3: unknown function (ip: 0000000052d94566)
From worker 3: jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
From worker 3: do_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\builtins.c:713
From worker 3: #106 at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:278
From worker 3: run_work_thunk at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:63
From worker 3: macro expansion at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:278 [inlined]
From worker 3: #105 at .\task.jl:423
From worker 3: unknown function (ip: 0000000052d861a3)
From worker 2: jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
From worker 2: do_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\builtins.c:713
From worker 3: jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
From worker 3: start_task at /cygdrive/c/buildbot/worker/package_win64/build/src\task.c:877
From worker 3: Allocations: 7966979 (Pool: 7963802; Big: 3177); GC: 10
From worker 2: #106 at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:278
From worker 2: run_work_thunk at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:63
From worker 2: macro expansion at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:278 [inlined]
From worker 2: #105 at .\task.jl:423
From worker 2: unknown function (ip: 0000000052d858a3)
From worker 2: jl_apply at /cygdrive/c/buildbot/worker/package_win64/build/src\julia.h:1788 [inlined]
From worker 2: start_task at /cygdrive/c/buildbot/worker/package_win64/build/src\task.c:877
From worker 2: Allocations: 7967560 (Pool: 7964384; Big: 3176); GC: 10
Worker 3 terminated.
ERROR: LoadError: Worker 2 terminated.ProcessExitedException
(2)
Stacktrace:
[1] (::Base.var"#892#894")(x::Task)
@ Base .\asyncmap.jl:177
[2] foreach(f::Base.var"#892#894", itr::Vector{Any})
@ Base .\abstractarray.jl:2694
[3] maptwice(wrapped_f::Function, chnl::Channel{Any}, worker_tasks::Vector{Any}, c::Vector{Vector{ComplexF64}})
@ Base .\asyncmap.jl:177
[4] wrap_n_exec_twice
@ .\asyncmap.jl:153 [inlined]
[5] #async_usemap#877
@ .\asyncmap.jl:103 [inlined]
[6] #asyncmap#876
@ .\asyncmap.jl:81 [inlined]
[7] pmap(f::Function, p::WorkerPool, c::Vector{Vector{ComplexF64}}; distributed::Bool, batch_size::Int64, on_error::Nothing, retry_delays::Vector{Any}, retry_check::Nothing)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\pmap.jl:126
[8] pmap(f::Function, p::WorkerPool, c::Vector{Vector{ComplexF64}})
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\pmap.jl:101
[9] pmap(f::Function, c::Vector{Vector{ComplexF64}}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\pmap.jl:156
[10] pmap(f::Function, c::Vector{Vector{ComplexF64}})
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\pmap.jl:156
[11] top-level scope
@ Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
[1] (::Base.var"#wait_locked#645")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
@ Base .\stream.jl:892
[2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
@ Base .\stream.jl:900
[3] unsafe_read
@ .\io.jl:724 [inlined]
[4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
@ Base .\io.jl:723
[5] read!
@ .\io.jl:725 [inlined]
[6] deserialize_hdr_raw
@ C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\messages.jl:167 [inlined]
[7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:165
[8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:126
[9] (::Distributed.var"#99#100"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
@ Distributed .\task.jl:423
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
[1] (::Base.var"#wait_locked#645")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
@ Base .\stream.jl:892
[2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
@ Base .\stream.jl:900
[3] unsafe_read
@ .\io.jl:724 [inlined]
[4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
@ Base .\io.jl:723
[5] read!
@ .\io.jl:725 [inlined]
[6] deserialize_hdr_raw
@ C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\messages.jl:167 [inlined]
[7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:165
[8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\dell\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Distributed\src\process_messages.jl:126
[9] (::Distributed.var"#99#100"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
@ Distributed .\task.jl:423
F:\Simulation\Trap simulation\2layers trap\BEM2Layers\ElectronTrapPost\DC_1D\th2\debug.jl:19
in expression starting at F:\Simulation\Trap simulation\2layers trap\BEM2Layers\ElectronTrapPost\DC_1D\th2\debug.jl:19
For the record, this issue has been cross-posted at https://github.com/JuliaLang/julia/issues/45406
Sorry about that. I thought it might be related to both the Distributed and FFTW packages, so I posted it in two places.
It's always good practice to leave a reference when you cross-post issues, so that people don't waste their time working on it independently instead of together in a single place.
(copying from Slack)
I think it's because you're communicating plans between workers, which you're not allowed to do (https://discourse.julialang.org/t/fft-plan-cant-be-sent-between-processes/877). The function f that you pass to pmap uses the Tpx defined on the root processor, not the workers, so you'd wanna fix that.
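A minimal sketch of that fix, assuming the goal is for every worker to own its plan: build the plan inside `@everywhere` and reach it through a named function, so only the input vectors travel between processes. The names `LOCAL_PLAN` and `apply_plan` are made up for illustration. This mirrors the named-function variant reported as working above; it is not an official FFTW.jl recipe.

```julia
using Distributed
addprocs(2)

@everywhere begin
    using FFTW
    FFTW.set_num_threads(1)
    const N = 2048
    # Each process builds its own plan; the plan itself is never sent anywhere.
    const LOCAL_PLAN = plan_fft(zeros(ComplexF64, N))
    # A named function is resolved on the worker, so it sees that worker's LOCAL_PLAN.
    apply_plan(x) = LOCAL_PLAN * x
end

inputs = [rand(ComplexF64, N), rand(ComplexF64, N)]
results = pmap(apply_plan, inputs)

# Sanity check against an unplanned FFT computed on the calling process.
println(all(results[i] ≈ fft(inputs[i]) for i in eachindex(inputs)))
```

The trade-off is that every worker pays the planning cost once, which is usually small compared with repeated transforms of the same size.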
When I ran into this issue I just hacked up this to make my life easier, although it's not necessarily the best option if you want more control over your plans: https://github.com/gaurav-arya/WavePropagation.jl/blob/main/src/planned_fft.jl
I see your point, but what makes the two definitions of f below different as far as pmap is concerned?
@everywhere begin
    function mapfunc(Tpx)
        test2 -> Tpx * test2
    end
    f = mapfunc(Tpx)
end
@everywhere f(test2)=Tpx* test2
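A hedged sketch of the difference, with no FFTW involved: as far as I understand Distributed, a closure is shipped to the worker together with the values it captured on the calling process, while a named function defined with `@everywhere` is looked up by name on the worker and reads that worker's globals. The names `tag`, `make_closure`, `g` and `h` are hypothetical; `tag` stands in for `Tpx`.

```julia
using Distributed
addprocs(2)

@everywhere begin
    tag = myid()                    # a different value on every process
    make_closure(t) = x -> (t, x)   # the closure stores t as a captured field
    g = make_closure(tag)           # on process 1 this captures tag == 1
    h(x) = (tag, x)                 # named function: tag is read where h runs
end

# pmap serializes the caller's g, captured value included.
closure_ids = first.(pmap(g, 1:4))
# pmap sends h by name; each worker runs its own h with its own tag.
named_ids = first.(pmap(h, 1:4))

println(closure_ids)
println(named_ids)
```

With a plan in place of `tag`, the first form would send process 1's plan object to the workers, and the second form would use the plan each worker built for itself.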
Seems like this package should add some checks to make sure the plan isn't a C_NULL pointer
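For context on why the pointer would be null after the trip to a worker: to my knowledge, the Serialization stdlib writes any raw `Ptr` as a null pointer, since an address is meaningless in another process. The sketch below only demonstrates that behavior on a plain array pointer. That an FFTW plan's internal handle goes through the same path is an assumption here, not something verified against the FFTW.jl source.

```julia
using Serialization

data = [1, 2, 3]
p = pointer(data)          # a valid address in this process

buf = IOBuffer()
serialize(buf, p)          # assumption under test: the address is not preserved
seekstart(buf)
q = deserialize(buf)

println(p == C_NULL)       # the original pointer is not null
println(q == C_NULL)       # the round-tripped pointer is expected to be null
```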