[BUG] double free or corruption when compressing VDBs

Open nachovizzo opened this issue 3 years ago • 2 comments

Environment

Operating System: Ubuntu 20.04
Version / Commit SHA: master
Other: Using the Python interface

Describe the bug

I'm not quite sure how to describe this bug; I'm still hunting it and trying to develop a small test that always fails. The problem is the following: I'm trying to write a .vdb file from the Python interface, and SOMETIMES it crashes with a double-free error. After chasing it a bit, I found that it is related to the Blosc library, but I couldn't get a debug backtrace so far. It looks like something is being freed there twice.

The last point reached inside the OpenVDB library is: https://github.com/AcademySoftwareFoundation/openvdb/blob/ad209a385f13cad88315b114db1c728f1bbcd2dd/openvdb/openvdb/io/Compression.cc#L207

Since a raw pointer is used there (C interface), I guess this kind of error is indeed possible.

I have a gut feeling that this might be related to the Python GIL, but I'm not quite sure. I'm opening this issue to keep track of the problem and hopefully help others out there who are struggling with the same issue.

Typical error output

One case:

double free or corruption (out)

The other case:

free(): invalid size

To Reproduce

WIP ....

Additional context

Here is the full backtrace

>>> bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f83e61cd859 in __GI_abort () at abort.c:79
#2  0x00007f83e62383ee in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f83e6362285 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007f83e624047c in malloc_printerr (str=str@entry=0x7f83e6364670 "double free or corruption (out)") at malloc.c:5347
#4  0x00007f83e6242120 in _int_free (av=0x7f83e6393b80 <main_arena>, p=0x7f82100cbe10, have_lock=<optimized out>) at malloc.c:4314
#5  0x00007f823c69a1b0 in do_job () from /usr/local/lib/libblosc.so.1
#6  0x00007f823c69a45d in blosc_compress_context () from /usr/local/lib/libblosc.so.1
#7  0x00007f823c69a766 in blosc_compress_ctx () from /usr/local/lib/libblosc.so.1
#8  0x00007f8236f96684 in openvdb::v9_0::io::bloscToStreamSize(char const*, unsigned long, unsigned long) () from /usr/local/lib/libopenvdb.so.9.0
#9  0x00007f8236f85d79 in openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafTransformer<openvdb::v9_0::io::(anonymous namespace)::PopulateDelayedLoadMetadataOp::operator()<openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > >(openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > const&) const::{lambda(openvdb::v9_0::tree::LeafNode<float, 3u> const&, unsigned long)#1}>::operator()(openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange const&) const () from /usr/local/lib/libopenvdb.so.9.0
#10 0x00007f8236f86c88 in void tbb::interface9::internal::dynamic_grainsize_mode<tbb::interface9::internal::adaptive_mode<tbb::interface9::internal::auto_partition_type> >::work_balance<tbb::interface9::internal::start_for<openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafTransformer<openvdb::v9_0::io::(anonymous namespace)::PopulateDelayedLoadMetadataOp::operator()<openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > >(openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > const&) const::{lambda(openvdb::v9_0::tree::LeafNode<float, 3u> const&, unsigned long)#1}>, tbb::auto_partitioner const>, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange>(tbb::interface9::internal::start_for<openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 
4u>, 5u> > > const>::LeafTransformer<openvdb::v9_0::io::(anonymous namespace)::PopulateDelayedLoadMetadataOp::operator()<openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > >(openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > const&) const::{lambda(openvdb::v9_0::tree::LeafNode<float, 3u> const&, unsigned long)#1}>, tbb::auto_partitioner const>&, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange&) () from /usr/local/lib/libopenvdb.so.9.0
#11 0x00007f8236f86f2e in tbb::interface9::internal::start_for<openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafRange, openvdb::v9_0::tree::LeafManager<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > const>::LeafTransformer<openvdb::v9_0::io::(anonymous namespace)::PopulateDelayedLoadMetadataOp::operator()<openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > >(openvdb::v9_0::Grid<openvdb::v9_0::tree::Tree<openvdb::v9_0::tree::RootNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::InternalNode<openvdb::v9_0::tree::LeafNode<float, 3u>, 4u>, 5u> > > > const&) const::{lambda(openvdb::v9_0::tree::LeafNode<float, 3u> const&, unsigned long)#1}>, tbb::auto_partitioner const>::execute() () from /usr/local/lib/libopenvdb.so.9.0
#12 0x00007f82281fb545 in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#13 0x00007f82281fb80f in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#14 0x00007f82281f4bd7 in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#15 0x00007f82281f3498 in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#16 0x00007f82281ef880 in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#17 0x00007f82281efa8d in ?? () from /lib/x86_64-linux-gnu/libtbb.so.2
#18 0x00007f83e618e609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#19 0x00007f83e62ca293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

nachovizzo • Feb 19 '22 18:02

I found a way to reproduce this bug: I have a problematic grid that I serialized with pickle, so it can be loaded to reproduce the error.

import pickle
import pyopenvdb as vdb

grid = pickle.load(open("./grid.pkl", "rb"))
vdb.write("test.vdb", grid)
[grid.zip](https://github.com/AcademySoftwareFoundation/openvdb/files/8102779/grid.zip)

This gives:

double free or corruption (out)
[1]    4177624 abort (core dumped)  ipython3

Please note that the grid is working "fine" once it's deserialized with the pickle interface:

import pickle
import pyopenvdb as vdb

grid = pickle.load(open("./grid.pkl", "rb"))

print(grid.evalActiveVoxelBoundingBox())
print(grid.metadata)
print(grid.evalMinMax())

Spits:

((625, 548, -17), (1704, 1791, 61))
{'class': 'level set', 'name': '000349'}
(-0.29852765798568726, 0.2999999225139618)

nachovizzo • Feb 19 '22 18:02

Hi @nachovizzo - I suspect this is due to you passing a delay-loaded, byte-streamed VDB into a vdb.write call. If you try to write out a delay-loaded grid as a byte stream (which is what I imagine happened with your pkl file originally), you will get into all sorts of weirdness. I would try your test again, but this time with the environment variable OPENVDB_DISABLE_DELAYED_LOAD=1 exported. With that set, regenerate your pkl file and try your read/write test.

Idclip • Aug 03 '22 18:08
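
The suggestion to export OPENVDB_DISABLE_DELAYED_LOAD=1 can be tried without touching the shell by running the test in a child interpreter, which also keeps an abort from taking down the calling session. The following is only a minimal sketch under stated assumptions: the helper names `run_with_delayed_load_disabled` and `aborted` and the stand-in `CHECK_SCRIPT` are hypothetical and not part of OpenVDB or pyopenvdb; only the environment variable name comes from this thread. The stand-in script just echoes the variable so the sketch runs without pyopenvdb installed.

```python
import os
import signal
import subprocess
import sys


def run_with_delayed_load_disabled(script: str) -> subprocess.CompletedProcess:
    """Run `script` in a fresh interpreter with OPENVDB_DISABLE_DELAYED_LOAD=1 set."""
    env = dict(os.environ)
    # Variable name taken from the thread; set before the child imports anything.
    env["OPENVDB_DISABLE_DELAYED_LOAD"] = "1"
    return subprocess.run(
        [sys.executable, "-c", script],
        env=env,
        capture_output=True,
        text=True,
    )


def aborted(result: subprocess.CompletedProcess) -> bool:
    """True if the child was killed by SIGABRT (POSIX reports this as -SIGABRT)."""
    return result.returncode == -signal.SIGABRT


# Hypothetical stand-in for the real test: it only echoes the variable.
CHECK_SCRIPT = "import os; print(os.environ.get('OPENVDB_DISABLE_DELAYED_LOAD'))"

if __name__ == "__main__":
    result = run_with_delayed_load_disabled(CHECK_SCRIPT)
    print("variable seen by child:", result.stdout.strip())
    print("child aborted:", aborted(result))
```

To use it for the actual test, `CHECK_SCRIPT` would be replaced by a script that regenerates grid.pkl and then calls vdb.write, as in the snippets earlier in this thread. The `aborted` check relies on POSIX signal semantics, so it is only meaningful on Linux-like systems such as the Ubuntu setup reported here.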

I suspect this due to you trying to use a delay loaded byte streamed VDB into a vdb.wri

I'm not intentionally doing this :) This happens randomly when using the VDBFusion pipeline: sometimes, when the Python client tries to write a VDB, it just crashes with the above message. But it looks like it's not an OpenVDB problem.

nachovizzo • Oct 04 '22 07:10

What solved your problem, @nachovizzo? Do you still get the errors?

lthiet • Feb 05 '23 14:02

Hello @lthiet I haven't :)

nachovizzo • Feb 06 '23 08:02