[C++] Importing an extension type without `ARROW:extension:metadata` that was registered with metadata crashes

Open paleolimbot opened this issue 2 months ago • 0 comments

Describe the bug, including details regarding any error messages, version, and platform.

I ran into this when running tests in geoarrow-c, which includes an implementation of the GeoArrow extension types for Arrow C++ (https://github.com/geoarrow/geoarrow-c/pull/94). The sequence of events that triggers the crash is:

  • An extension type is registered whose serialized form contains both ARROW:extension:name and ARROW:extension:metadata.
  • ImportType() is called from the C data interface on a schema for that extension type that contains only ARROW:extension:name.
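For illustration, the two metadata layouts in the steps above can be written out as plain Python dicts. This is only a sketch: the dicts stand in for schema key/value metadata, and `is_affected_layout` is a hypothetical helper name, not part of pyarrow or nanoarrow.

```python
EXT_NAME_KEY = b"ARROW:extension:name"
EXT_METADATA_KEY = b"ARROW:extension:metadata"


def is_affected_layout(metadata):
    """Return True when the metadata names an extension type
    but carries no ARROW:extension:metadata entry."""
    return EXT_NAME_KEY in metadata and EXT_METADATA_KEY not in metadata


# Layout produced by the registered type (both keys present).
registered_layout = {EXT_NAME_KEY: b"arrow.test", EXT_METADATA_KEY: b"{}"}

# Layout passed to ImportType() in the failing case (name only).
imported_layout = {EXT_NAME_KEY: b"arrow.test"}

print(is_affected_layout(registered_layout))
print(is_affected_layout(imported_layout))
```

A check like this could be run on a schema's metadata before handing it to an importer, assuming the metadata can be read as a mapping of bytes to bytes.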

The C++ backtrace at the point where the exception is thrown is:

libc++abi.dylib!__cxa_throw (Unknown Source:0)
libarrow.1500.2.0.dylib!std::__1::__throw_length_error[abi:ue170006](char const*) (Unknown Source:0)
libarrow.1500.2.0.dylib!std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__throw_length_error[abi:ue170006]() const (Unknown Source:0)
libarrow.1500.2.0.dylib!std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__append(unsigned long) (Unknown Source:0)
libarrow.1500.2.0.dylib!arrow::KeyValueMetadata::DeleteMany(std::__1::vector<long long, std::__1::allocator<long long>>) (Unknown Source:0)
libarrow.1500.2.0.dylib!arrow::(anonymous namespace)::SchemaImporter::DoImport() (Unknown Source:0)
libarrow.1500.2.0.dylib!arrow::ImportType(ArrowSchema*) (Unknown Source:0)
geoarrow_arrow_test!ArrowTest_ArrowTestExtensionTypeRegister_Test::TestBody() (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/src/geoarrow/geoarrow_arrow_test.cc:116)
geoarrow_arrow_test!void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:2607)
geoarrow_arrow_test!void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:2643)
geoarrow_arrow_test!testing::Test::Run() (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:2682)
geoarrow_arrow_test!testing::TestInfo::Run() (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:2861)
geoarrow_arrow_test!testing::TestSuite::Run() (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:3015)
geoarrow_arrow_test!testing::internal::UnitTestImpl::RunAllTests() (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:5855)
geoarrow_arrow_test!bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:2607)
geoarrow_arrow_test!bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:2643)
geoarrow_arrow_test!testing::UnitTest::Run() (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest.cc:5438)
geoarrow_arrow_test!RUN_ALL_TESTS() (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/include/gtest/gtest.h:2490)
geoarrow_arrow_test!main (/Users/deweydunnington/Desktop/rscratch/geoarrow-c/build/_deps/googletest-src/googletest/src/gtest_main.cc:52)
start (Unknown Source:0)

Reproducer in Python:

import pyarrow as pa

# pip install --extra-index-url https://pypi.fury.io/arrow-nightlies/ \
#     --prefer-binary --pre nanoarrow
import nanoarrow as na

class DummyExtType(pa.ExtensionType):
    def __init__(self):
        super().__init__(pa.null(), "arrow.test")

    @classmethod
    def __arrow_ext_deserialize__(cls, storage_type, serialized):
        return DummyExtType()

    def __arrow_ext_serialize__(self):
        return b"{}"

pa.register_extension_type(DummyExtType())

# Works!
na_schema = na.extension_type(na.null(), "arrow.test", b"{}")
print(na_schema.metadata)
#> <nanoarrow._lib.SchemaMetadata>
#> - b'ARROW:extension:name': b'arrow.test'
#> - b'ARROW:extension:metadata': b'{}'
na_array = na.c_array([], na_schema)
pa.array(na_array)
#> <pyarrow.lib.ExtensionArray object at 0x105aee080>
#> 0 nulls

# Crashes
na_schema = na.extension_type(na.null(), "arrow.test")
print(na_schema.metadata)
#> <nanoarrow._lib.SchemaMetadata>
#> - b'ARROW:extension:name': b'arrow.test'
na_array = na.c_array([], na_schema)
# pa.array(na_array)  # commented out: uncommenting this line triggers the crash
#> The Kernel crashed while executing code in the current cell or a previous cell.
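
Because the failing call takes the whole kernel down, it can be convenient to run the reproducer in a child interpreter and look at its exit code instead. The following is a minimal sketch using only the standard library; `run_isolated` is a hypothetical helper name, and the snippet passed in the demo is a harmless stand-in rather than the reproducer itself.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run a Python snippet in a separate interpreter and return its exit code.

    A crash in the child (for example an uncaught C++ exception) shows up
    as a nonzero exit code instead of killing the calling process.
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        timeout=timeout,
    )
    return result.returncode


# Stand-in snippet: exits with code 3 to show how a failure is reported.
print(run_isolated("raise SystemExit(3)"))
```

To use it with the reproducer above, pass the full script text (including the uncommented `pa.array(na_array)` call) as `snippet`; a nonzero return value then indicates that the child process failed.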

Component(s)

C++

paleolimbot, May 20 '24 16:05