openvdb
[BUG] double free or corruption when compressing VDBs
Environment
Operating System: Ubuntu 20.04
Version / Commit SHA: master
Other: Using the Python interface
Describe the bug
I'm not quite sure how to describe this bug; I'm still hunting it and trying to develop a small test that always fails.
The problem is the following: I'm trying to write .vdb files from the Python interface, and SOMETIMES it crashes with a double-free error. After chasing it a bit, I found that it is related to the c-blosc library, but I couldn't get a debug backtrace so far. It looks like something is being freed there twice.
The last call inside the OpenVDB library is: https://github.com/AcademySoftwareFoundation/openvdb/blob/ad209a385f13cad88315b114db1c728f1bbcd2dd/openvdb/openvdb/io/Compression.cc#L207
Since a raw pointer is passed through a C interface there, I guess this kind of error could indeed happen.
I have a gut feeling that this might be related to the Python GIL, but I'm not quite sure. I'm opening this issue to keep track of the problems and hopefully help others out there that are also struggling with the same.
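For anyone trying to narrow down where a crash like this is triggered from the Python side, one option is the standard-library `faulthandler` module. The snippet below is only a sketch: it dumps the Python-level stack of every thread when the process receives a fatal signal such as SIGABRT. It does not show the C++ frames inside OpenVDB or c-blosc, so a debugger is still needed for those.

```python
import faulthandler
import sys

# Ask the interpreter to print the Python traceback of all threads to the
# real stderr when a fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL)
# arrives. This only covers Python frames, not native library frames.
faulthandler.enable(file=sys.__stderr__, all_threads=True)
```

Placing this at the top of the script that calls the VDB writer should at least show which Python call was active when glibc aborted.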
Typical error output
One case:
double free or corruption (out)
The other case:
free(): invalid size
To Reproduce
WIP ....
Additional context
Here is the full backtrace
>>> bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f83e61cd859 in __GI_abort () at abort.c:79
#2 0x00007f83e62383ee in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f83e6362285 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007f83e624047c in malloc_printerr (str=str@entry=0x7f83e6364670 "double free or corruption (out)") at malloc.c:5347
#4 0x00007f83e6242120 in _int_free (av=0x7f83e6393b80 <main_arena>, p=0x7f82100cbe10, have_lock=<optimized out>) at malloc.c:4314
#5 0x00007f823c69a1b0 in do_job () from /usr/local/lib/libblosc.so.1
#6 0x00007f823c69a45d in blosc_compress_context () from /usr/local/lib/libblosc.so.1
#7 0x00007f823c69a766 in blosc_compress_ctx () from /usr/local/lib/libblosc.so.1
#8 0x00007f8236f96684 in openvdb::v9_0::io::bloscToStreamSize(char const*, unsigned long, unsigned long) () from /usr/local/lib/libopenvdb.so.9.0
#9 0x00007f8236f85d79 in openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafTransformer<openvdb::v9_0::io::(anonymous namespace)::PopulateDelayedLoadMetadataOp::operator()<openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > >(openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > const&) const::{lambda(openvdb::v9_0::tree::LeafNode<float, 3u> const&, unsigned long)#1}>::operator()(openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange const&) const () from /usr/local/lib/libopenvdb.so.9.0
#10 0x00007f8236f86c88 in void tbb::interface9::internal::dynamic_grainsize_mode<tbb::interface9::internal::adaptive_mode<tbb::interface9::internal::auto_partition_type> >::work_balance<tbb::interface9::internal::start_for<openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafTransformer<openvdb::v9_0::io::(anonymous namespace)::PopulateDelayedLoadMetadataOp::operator()<openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > >(openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > const&) const::{lambda(openvdb::v9_0::tree::LeafNode<float, 3u> const&, unsigned long)#1}>, tbb::auto_partitioner const>, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange>(tbb::interface9::internal::start_for<openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafTransformer<openvdb::v9_0::io::(anonymous namespace)::PopulateDelayedLoadMetadataOp::operator()<openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > >(openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > const&) const::{lambda(openvdb::v9_0::tree::LeafNode<float, 3u> const&, unsigned long)#1}>, tbb::auto_partitioner const>&, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange&) () from /usr/local/lib/libopenvdb.so.9.0
#11 0x00007f8236f86f2e in tbb::interface9::internal::start_for<openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafTransformer<openvdb::v9_0::io::(anonymous namespace)::PopulateDelayedLoadMetadataOp::operator()<openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > >(openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > const&) const::{lambda(openvdb::v9_0::tree::LeafNode<float, 3u> const&, unsigned long)#1}>, tbb::auto_partitioner const>::execute() () from /usr/local/lib/libopenvdb.so.9.0
#12 0x00007f82281fb545 in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#13 0x00007f82281fb80f in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#14 0x00007f82281f4bd7 in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#15 0x00007f82281f3498 in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#16 0x00007f82281ef880 in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#17 0x00007f82281efa8d in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#18 0x00007f83e618e609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#19 0x00007f83e62ca293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
I found a way to reproduce this bug. I have a problematic "grid" that I serialized with pickle; loading it and writing it out reproduces the error:
import pickle
import pyopenvdb as vdb
grid = pickle.load(open("./grid.pkl", "rb"))
vdb.write("test.vdb", grid)
[grid.zip](https://github.com/AcademySoftwareFoundation/openvdb/files/8102779/grid.zip)
This gives:
double free or corruption (out)
[1] 4177624 abort (core dumped) ipython3
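Because the abort takes the whole interpreter down (here, the `ipython3` session), it can help to run the reproduction in a child process and inspect how it died. The following is a minimal sketch using only the standard library; `run_isolated` and `describe` are hypothetical helper names, and the embedded snippet is the reproduction from above, which assumes `pyopenvdb` is installed and `./grid.pkl` is present.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child interpreter so a crash cannot kill the caller."""
    return subprocess.run(
        [sys.executable, "-c", snippet], capture_output=True, text=True
    )


def describe(result: subprocess.CompletedProcess) -> str:
    """Summarize the child's fate; on POSIX a negative return code means a signal."""
    if result.returncode < 0:
        return f"killed by {signal.Signals(-result.returncode).name}"
    return f"exited with code {result.returncode}"


# The reproduction from this issue, run as a separate process.
REPRO = """
import pickle
import pyopenvdb as vdb
with open("./grid.pkl", "rb") as f:
    grid = pickle.load(f)
vdb.write("test.vdb", grid)
"""

if __name__ == "__main__":
    result = run_isolated(REPRO)
    print(describe(result))
    print(result.stderr)
```

Running the snippet in a loop this way would also give a rough idea of how often the crash occurs, since the issue is reported as intermittent in other setups.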
Please note that the grid is working "fine" once it's deserialized with the pickle interface:
import pickle
import pyopenvdb as vdb
grid = pickle.load(open("./grid.pkl", "rb"))
print(grid.evalActiveVoxelBoundingBox())
print(grid.metadata)
print(grid.evalMinMax())
This prints:
((625, 548, -17), (1704, 1791, 61))
{'class': 'level set', 'name': '000349'}
(-0.29852765798568726, 0.2999999225139618)
Hi @nachovizzo - I suspect this is due to you passing a delay-loaded, byte-streamed VDB into a vdb.write call. If you try to write out a delay-loaded grid as a byte stream (which is what I imagine happened with your pkl file originally), you will get into all sorts of weirdness. I would try your test again, but this time with the environment variable OPENVDB_DISABLE_DELAYED_LOAD=1 exported. With that set, regenerate your pkl file and try your read/write test.
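As a rough sketch of that suggestion: the variable can be exported in the shell before starting Python, or, as assumed here, set from within the script before OpenVDB is imported or reads any file. Whether setting it in-process is fully equivalent to exporting it in the shell is an assumption. `write_pickled_grid` is a hypothetical helper that wraps the reproduction posted earlier; the pkl file itself would still need to be regenerated with the variable set.

```python
import os

# Assumption: setting the variable in-process, before pyopenvdb is imported,
# has the same effect as `export OPENVDB_DISABLE_DELAYED_LOAD=1` in the shell.
os.environ["OPENVDB_DISABLE_DELAYED_LOAD"] = "1"


def write_pickled_grid(pkl_path: str, vdb_path: str) -> None:
    """Load a pickled grid and write it out as a .vdb file."""
    # Imported here so the environment variable is already in place.
    import pickle
    import pyopenvdb as vdb

    with open(pkl_path, "rb") as f:
        grid = pickle.load(f)
    vdb.write(vdb_path, grid)
```

With the pkl file regenerated, calling `write_pickled_grid("./grid.pkl", "test.vdb")` repeats the original read/write test.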
> I suspect this due to you trying to use a delay loaded byte streamed VDB into a vdb.wri
I'm not intentionally doing this :) This happens randomly when using the VDBFusion pipeline: sometimes, when the Python client tries to write a VDB, it just crashes with the message above. But it looks like it's not a problem in OpenVDB.
What solved your problem, @nachovizzo? Do you still get the errors?
Hello @lthiet I haven't :)