XLSX.jl
EXCEPTION_ACCESS_VIOLATION
I am trying to write a large Excel file (.xlsx). It should be about 4.3 MB. The data size is within Excel's column and row limits.
Smaller files (around 2 MB) write fine, but the largest one triggers the error.
I am using Windows 10 x64 (fully updated) with Julia 1.4.1 (64-bit); everything is the latest version and cleanly installed.
I hope you can help. Below is the terminal output from the run.
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ffa27464740 -- memcpy at C:\WINDOWS\System32\msvcrt.dll (unknown line)
in expression starting at D:\pglib\data_transfer.jl:28
memcpy at C:\WINDOWS\System32\msvcrt.dll (unknown line)
read_buf at C:\Users\karlo.julia\artifacts\12dda53f058e2ad8360473e1df8d31d709724a38\bin\libz.dll (unknown line)
fill_window at C:\Users\karlo.julia\artifacts\12dda53f058e2ad8360473e1df8d31d709724a38\bin\libz.dll (unknown line)
deflate_slow at C:\Users\karlo.julia\artifacts\12dda53f058e2ad8360473e1df8d31d709724a38\bin\libz.dll (unknown line)
deflate at C:\Users\karlo.julia\artifacts\12dda53f058e2ad8360473e1df8d31d709724a38\bin\libz.dll (unknown line)
write at C:\Users\karlo.julia\packages\ZipFile\tuh8v\src\Zlib.jl:134
write at C:\Users\karlo.julia\packages\ZipFile\tuh8v\src\Zlib.jl:152 [inlined]
write at C:\Users\karlo.julia\packages\ZipFile\tuh8v\src\Zlib.jl:190 [inlined]
unsafe_write at .\io.jl:208
unknown function (ip: 0000000032DDA488)
unsafe_write at C:\Users\karlo.julia\packages\ZipFile\tuh8v\src\ZipFile.jl:561
write at .\strings\io.jl:183 [inlined]
print at .\strings\io.jl:185 [inlined]
dump_node at C:\Users\karlo.julia\packages\EzXML\ZNwhK\src\node.jl:340
print at C:\Users\karlo.julia\packages\EzXML\ZNwhK\src\node.jl:310 [inlined]
print at C:\Users\karlo.julia\packages\EzXML\ZNwhK\src\document.jl:55 [inlined]
#writexlsx#35 at C:\Users\karlo.julia\packages\XLSX\ezOOQ\src\write.jl:67
writexlsx##kw at C:\Users\karlo.julia\packages\XLSX\ezOOQ\src\write.jl:48 [inlined]
#openxlsx#14 at C:\Users\karlo.julia\packages\XLSX\ezOOQ\src\read.jl:133
unknown function (ip: 0000000032DA03F7)
openxlsx##kw at C:\Users\karlo.julia\packages\XLSX\ezOOQ\src\read.jl:119
jl_apply at /cygdrive/d/buildbot/worker/package_win64/build/src\julia.h:1700 [inlined]
do_call at /cygdrive/d/buildbot/worker/package_win64/build/src\interpreter.c:369
eval_value at /cygdrive/d/buildbot/worker/package_win64/build/src\interpreter.c:458
eval_stmt_value at /cygdrive/d/buildbot/worker/package_win64/build/src\interpreter.c:409 [inlined]
eval_body at /cygdrive/d/buildbot/worker/package_win64/build/src\interpreter.c:799
jl_interpret_toplevel_thunk at /cygdrive/d/buildbot/worker/package_win64/build/src\interpreter.c:911
jl_toplevel_eval_flex at /cygdrive/d/buildbot/worker/package_win64/build/src\toplevel.c:814
jl_parse_eval_all at /cygdrive/d/buildbot/worker/package_win64/build/src\ast.c:872
include_string at .\loading.jl:1080
#200 at C:\Users\karlo.julia\packages\Atom\wlPiw\src\eval.jl:164
withpath at C:\Users\karlo.julia\packages\CodeTools\kosGY\src\utils.jl:30
withpath at C:\Users\karlo.julia\packages\Atom\wlPiw\src\eval.jl:9
#199 at C:\Users\karlo.julia\packages\Atom\wlPiw\src\eval.jl:161 [inlined]
with_logstate at .\logging.jl:398
with_logger at .\logging.jl:505 [inlined]
#198 at C:\Users\karlo.julia\packages\Atom\wlPiw\src\eval.jl:160 [inlined]
hideprompt at C:\Users\karlo.julia\packages\Atom\wlPiw\src\repl.jl:140
macro expansion at C:\Users\karlo.julia\packages\Media\ItEPc\src\dynamic.jl:24 [inlined]
evalall at C:\Users\karlo.julia\packages\Atom\wlPiw\src\eval.jl:150
jl_apply at /cygdrive/d/buildbot/worker/package_win64/build/src\julia.h:1700 [inlined]
do_apply at /cygdrive/d/buildbot/worker/package_win64/build/src\builtins.c:643
macro expansion at C:\Users\karlo.julia\packages\Atom\wlPiw\src\eval.jl:39 [inlined]
#172 at .\task.jl:358
unknown function (ip: 00000000198E4E53)
jl_apply at /cygdrive/d/buildbot/worker/package_win64/build/src\julia.h:1700 [inlined]
start_task at /cygdrive/d/buildbot/worker/package_win64/build/src\task.c:687
Allocations: 177090907 (Pool: 176573560; Big: 517347); GC: 569
Julia has exited. Press Enter to start a new session.
Can you post a code example that triggers the error?
I am supplying just part of the code. The real code has many more similar entries. It uses only the PowerModels and XLSX packages.
It parses a network with PowerModels, then puts the data into vectors with code similar to this:
angmin = Vector()
push!(angmin, ["n$(n)" for (n,m) in keys(data[:buspairs])])
push!(angmin, ["n$(m)" for (n,m) in keys(data[:buspairs])])
push!(angmin, [data[:buspairs][(n,m)]["angmin"] for (n,m) in keys(data[:buspairs])])
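For readers without PowerModels, the same column-building pattern can be sketched against a small stand-in dictionary. This is only an illustration: `buspairs` and its values are made up here to mimic the shape of `data[:buspairs]` (keys are `(from, to)` bus tuples), and sorting the keys is an added choice so the row order is reproducible, not something the original code does.

```julia
# Hypothetical stand-in for data[:buspairs]; the values are invented for illustration.
buspairs = Dict(
    (1, 2) => Dict("angmin" => -0.5236),
    (2, 3) => Dict("angmin" => -0.5236),
    (1, 3) => Dict("angmin" => -1.0472),
)

# Fix the key order once so the three columns line up row by row
# and the output is the same on every run.
pairs_sorted = sort(collect(keys(buspairs)))

# One entry per column, mirroring the Vector() of columns used above.
angmin = Vector{Any}()
push!(angmin, ["n$(n)" for (n, m) in pairs_sorted])
push!(angmin, ["n$(m)" for (n, m) in pairs_sorted])
push!(angmin, [buspairs[(n, m)]["angmin"] for (n, m) in pairs_sorted])
```

The result is a vector of three equal-length columns, which is the shape passed to `XLSX.writetable!` in this thread.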
After that, the only remaining step is writing to the Excel file.
XLSX.openxlsx(destination_file, mode="w") do xf
    XLSX.rename!(xf["Sheet1"], "Name")
    sheet = xf["Name"]
    sheet["A1"] = "Name"
    sheet["A2"] = String(data[:name])
    XLSX.addsheet!(xf, "Set1")
    sheet = xf["Set1"]
    sheet["A1"] = "arcs_from"
    XLSX.writetable!(sheet, arcs_from, ["l", "n", "n"], anchor_cell=XLSX.CellRef("A2"))
    XLSX.addsheet!(xf, "Set2")
    sheet = xf["Set2"]
    sheet["A1"] = "load"
    XLSX.writetable!(sheet, load, ["lo"], anchor_cell=XLSX.CellRef("A2"))
end
The error occurs in the part that writes the Excel file. (PowerModels works fine; I checked.)
I have found something useful in the ZipFile release notes: "Upgrade to Julia 1.3+, use JLL packages to provide binary dependencies (#59) (staticfloat)".
Once I downgraded Julia to 1.2, everything worked. Julia 1.3 and 1.4 both give me the error.
@KSepetanc thanks for the info! I'm sorry if I was not clear. Is it possible for you to provide a self-contained example that triggers the error? Ideally some code that depends only on XLSX.
A file that reproduces the error is attached. Rename ".txt" to ".jl", and avoid C:\ as the Excel destination because of Windows (10 x64) write restrictions. The error occurs at the end of the write, after roughly 4 minutes at 4 GHz. If you don't get an error (you should get one), increase the data size a bit, e.g. by 20 to 25%.
Writing times are long, and I have a feeling they do not increase linearly with data size. Something could be done about that as well, but that is a separate issue.
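One way to check the suspicion about nonlinear growth is to time the write at several data sizes and fit a slope on a log-log scale. The sketch below is an assumption-laden illustration, not part of XLSX.jl: `scaling_exponent` is a hypothetical helper, and the timings it is run on are synthetic (generated from an exact n^2 law), not measurements from this issue.

```julia
# Least-squares slope of log(time) against log(size).
# A slope near 1 suggests linear scaling; near 2 suggests quadratic scaling.
function scaling_exponent(sizes::AbstractVector{<:Real}, times::AbstractVector{<:Real})
    length(sizes) == length(times) ||
        throw(ArgumentError("sizes and times must have equal length"))
    length(sizes) >= 2 ||
        throw(ArgumentError("need at least two measurements"))
    x = log.(sizes)
    y = log.(times)
    xm = sum(x) / length(x)
    ym = sum(y) / length(y)
    return sum((x .- xm) .* (y .- ym)) / sum((x .- xm) .^ 2)
end

# Synthetic check only: times that grow exactly as n^2 should give a slope of 2.
sizes = [1_000, 2_000, 4_000, 8_000]
quadratic_times = [1e-6 * n^2 for n in sizes]
exponent = scaling_exponent(sizes, quadratic_times)
```

For real measurements, the times could be collected with `@elapsed` around the write call for each row count and passed in place of `quadratic_times`. A slope that keeps rising as the sizes grow would point to growth faster than any fixed polynomial.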
@felipenoris can you replicate the bug?
@chris-b1, @fhs this might well be a ZipFile bug. Would it be worth tagging it there as well?
@KSepetanc I've created a PR for ZipFile v0.9.2: https://github.com/JuliaRegistries/General/pull/14950
@KSepetanc I can reproduce this, and it looks like a Windows issue. It works fine on macOS.
Windows env:
D:\src\XLSX.jl (master -> origin)
(XLSX) pkg> st
Project XLSX v0.7.0-dev
Status `D:\src\XLSX.jl\Project.toml`
[8f5d6c58] EzXML v1.1.0
[bd369af6] Tables v1.0.4
[a5390f91] ZipFile v0.9.2
[ade2ca70] Dates
[de0858da] Printf
julia> versioninfo()
Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
JULIA_NUM_THREADS = 8
I don't think the bug is in this package. But maybe https://github.com/felipenoris/XLSX.jl/issues/61 can get this fixed.
FYI, the above bug_report.txt file works fine for me on Windows. It took 420 seconds to write the file.
[......]
sheet["I1"] = "data"
XLSX.writetable!(sheet, data,["l","n","n"], anchor_cell=XLSX.CellRef("I2"))
sheet["M1"] = "data"
XLSX.writetable!(sheet, data,["l","n","n"], anchor_cell=XLSX.CellRef("M2"))
end
417.810974 seconds (122.59 M allocations: 108.247 GiB, 6.52% gc time)
julia> versioninfo()
Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
JULIA_BINDIR = C:\Julia-1.4.X\bin\
JULIA_HOME = C:\Julia-1.4.X\bin\
JULIA_EDITOR = "C:\Program Files\Microsoft VS Code\Code.exe"
JULIA_NUM_THREADS =
julia>
@kafisatz try increasing the data size, or simply try again. bug_report.txt is a borderline case for this error, and occasionally the write succeeds. The error is related to data size; with small data sets there is no error.
As you have noticed, the writing time is very long, but the actual data set is not really that big: a 2 MB Excel file in the end, I think.
@KSepetanc I will try larger sizes. Indeed, this is VERY slow for such a tiny Excel file. I have used a Python library (with PyCall) in the past, which seems much faster.
On my computer, for 20k rows in data, I get these numbers. Allocations are huge, but the GC time does not seem very large at 6.5%.
417.810974 seconds (122.59 M allocations: 108.247 GiB, 6.52% gc time)
EDIT: I can reproduce the above error with n=40000
@KSepetanc I assume you have meanwhile found a workaround for your use case. If not, here is a link to a very simple python wrapper I have used in the past. The runtests.jl contains an example similar to the one of bugreport.txt above. It runs in a few seconds (although the functionality is VERY limited) https://github.com/kafisatz/ExcelWriter.jl
@kafisatz yes, this package is known to be very slow at writing files. Its original intent was only to read Excel files. At some point I noticed that I could make it write Excel files in a non-destructive way (meaning that it preserves any data or plot already present in the Excel file), but this makes writing very slow, as the time most likely grows exponentially with the number of cells written to the file.
#61 is to address this point. Unfortunately I'm lacking the resources (aka "a spare weekend") to work on this.
@kafisatz thank you for the link. I will try it out. :)
My current workaround is to run it on Julia 1.2. It is still slow, but at least there is no bug. I think the computation time grows much more rapidly than linearly with data size. In its current state, it is impossible to write much larger files.