torch-mlir osx delocate doesn't seem to handle dynamically loaded libs

The latest delocate'd lib fails with:

libc++abi: terminating with uncaught exception of type c10::Error: Type c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > could not be converted to any of the known types.
Exception raised from operator() at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/jit_type.h:1735 (most recent call first):                                                                              
frame #0: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 92 (0x1144d5e40 in libc10.dylib)           
frame #1: c10::detail::getTypePtr_<c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > >::call()::'lambda'()::operator()() const + 304 (0x15befa85c in libtorch_cpu.dylib)
frame #2: c10::Type::SingletonOrSharedTypePtr<c10::Type> c10::getTypePtrCopy<c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > >() + 32 (0x15befa5b4 in libtorch_cpu.dylib)
frame #3: c10::detail::infer_schema::(anonymous namespace)::createArgumentVector(c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 188 (0x15b93f7d4 in libtorch_cpu.dylib)                                        
frame #4: c10::detail::infer_schema::make_function_schema(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 96 (0x15b93f584 in libtorch_cpu.dylib)
frame #5: c10::detail::infer_schema::make_function_schema(c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>, c10::ArrayRef<c10::detail::infer_schema::ArgumentDef>) + 60 (0x15b93fadc in libtorch_cpu.dylib)         
frame #6: std::__1::unique_ptr<c10::FunctionSchema, std::__1::default_delete<c10::FunctionSchema> > c10::detail::inferFunctionSchemaFromFunctor<at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long long)>() + 76 (0x15bf697d0 in libtorch_cpu.dylib)
frame #7: torch::CppFunction::CppFunction<at::Tensor (at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long long)>(at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long long), std::__1::enable_if<c10::guts::is_function_type<at::Tensor (at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long long)>::value, std::nullptr_t>::type) + 132 (0x15bf696f4 in libtorch_cpu.dylib)
frame #8: at::native::(anonymous namespace)::TORCH_LIBRARY_IMPL_init_quantized_QuantizedCPU_4(torch::Library&) + 40 (0x15bf67a2c in libtorch_cpu.dylib)                                                                  
frame #9: torch::detail::TorchLibraryInit::TorchLibraryInit(torch::Library::Kind, void (*)(torch::Library&), char const*, c10::optional<c10::DispatchKey>, char const*, unsigned int) + 208 (0x15b7cd0e4 in libtorch_cpu.dylib)
frame #10: _GLOBAL__sub_I_qconv.cpp + 88 (0x15bf6d3b8 in libtorch_cpu.dylib)                                                                                                                                             
frame #11: invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 164 (0x100e19d7c in dyld)                                                                             
frame #12: invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 168 (0x100e42f40 in dyld)
frame #13: invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 528 (0x100e39bc0 in dyld)                                   
frame #14: dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 168 (0x100e05f98 in dyld)                                                                         
frame #15: dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 192 (0x100e39968 in dyld)                                                                    
frame #16: dyld3::MachOAnalyzer::forEachInitializerPointerSection(Diagnostics&, void (unsigned int, unsigned int, unsigned char const*, bool&) block_pointer) const + 148 (0x100e42870 in dyld)                          
frame #17: dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 432 (0x100e42b70 in dyld)                        
frame #18: dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 172 (0x100e19cbc in dyld)                                                                                                              
frame #19: dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 216 (0x100e19e68 in dyld)                                                                           
frame #20: dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 180 (0x100e19e44 in dyld)                                                                           
frame #21: dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 180 (0x100e19e44 in dyld)                                                                           
frame #22: dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 180 (0x100e19e44 in dyld)                                                                           
frame #23: dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 180 (0x100e19e44 in dyld)                                                                           
frame #24: dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124 (0x100e19f34 in dyld)                                                                                                 
frame #25: dyld4::APIs::dlopen_from(char const*, int, void*) + 520 (0x100e299e4 in dyld)                    
frame #26: _imp_create_dynamic + 1852 (0x1007ff7dc in python3.9)                                                                                                                                                         
frame #27: cfunction_vectorcall_FASTCALL + 208 (0x10071427c in python3.9)                                                                                                                                                
frame #28: _PyEval_EvalFrameDefault + 30088 (0x1007cc078 in python3.9)                                                                                                                                                   
frame #29: _PyEval_EvalCode + 2968 (0x1007c44a8 in python3.9)                                               
frame #30: _PyFunction_Vectorcall + 240 (0x1006bfe64 in python3.9)

May 13 '22 06:05 powderluv

Is this a dylib issue?

edit: sorry, I parsed the error message wrong -- still, it seems like an issue in PyTorch itself, rather than in Torch-MLIR.

May 13 '22 14:05 silvasean

Yeah, I think delocate can't find the quant library that is opened at runtime.

Anyway this is now moot since the builder builds a perfectly installable universal binary on both M1 and Intel macOS. Closing this for now.

May 14 '22 23:05 powderluv

going to leave it open since I may have just seen it on my Intel macOS

May 14 '22 23:05 powderluv

This seems to be because of the weak linking of torch symbols. https://github.com/pytorch/pytorch/issues/48452

In our package we have:

site-packages % find . -name '*.dylib' | grep torch   
./torchvision/.dylibs/libz.1.2.11.dylib
./torchvision/.dylibs/libpng16.16.dylib
./torchvision/.dylibs/libc++.1.0.dylib
./torchvision/.dylibs/libjpeg.9.dylib
./torch/lib/libtorch_python.dylib
./torch/lib/libtorch.dylib
./torch/lib/libtorch_global_deps.dylib
./torch/lib/libtorch_cpu.dylib
./torch/lib/libc10.dylib
./torch/lib/libshm.dylib
./torch_mlir/.dylibs/libtorch_python.dylib
./torch_mlir/.dylibs/libtorch.dylib
./torch_mlir/.dylibs/libomp.dylib
./torch_mlir/.dylibs/libtorch_cpu.dylib
./torch_mlir/.dylibs/libc10.dylib
./torch_mlir/.dylibs/libshm.dylib
./torch_mlir/_mlir_libs/libTorchMLIRAggregateCAPI.dylib

Since we build on an x86_64 system, the files in .dylibs are a no-op on Apple silicon, and everything resolves and works fine there. On Intel, they cause the double-init issue.

We probably have to link against a static version of libtorch.
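To see which libraries are bundled twice, as in the listing above, something like the following could be used. This is only a rough sketch: `find_duplicate_dylibs` is a hypothetical helper name, not part of torch-mlir or delocate, and it only compares file names, so it cannot tell whether two copies actually differ.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_dylibs(site_packages):
    """Return {file name: [relative paths]} for .dylib files found in more than one place.

    Hypothetical helper for illustration; it matches on file name only.
    """
    root = Path(site_packages)
    locations = defaultdict(list)
    # pathlib's rglob also descends into hidden directories such as ".dylibs".
    for path in sorted(root.rglob("*.dylib")):
        locations[path.name].append(path.relative_to(root))
    return {name: paths for name, paths in locations.items() if len(paths) > 1}
```

Calling it on the site-packages directory of the affected venv should, for the layout above, report names such as libc10.dylib and libtorch_cpu.dylib under both torch/lib and torch_mlir/.dylibs.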

Jun 07 '22 05:06 powderluv

Workaround

# Replace mlir_venv with whatever your venv is
cd mlir_venv/lib/python3.10/site-packages/torch_mlir/.dylibs
rm *.dylib
ln -s ../../torch/lib/libc10.dylib
ln -s ../../torch/lib/libshm.dylib
ln -s ../../torch/lib/libtorch.dylib
ln -s ../../torch/lib/libtorch_cpu.dylib
ln -s ../../torch/lib/libtorch_python.dylib

Jun 07 '22 06:06 powderluv

@powderluv did we fix this with the static build?

Oct 07 '22 14:10 silvasean

This is still an issue on x86 macOS builds. But we don't care about those builds right now so ok to leave it closed.

Oct 07 '22 15:10 powderluv

I am experiencing this issue in an environment where two .plugins try to load the same dylibs. The first plugin loads successfully, but the second fails with the errors mentioned above.

Oct 26 '22 16:10 spiegelball

The issue is still present on Intel Macs, and the workaround still works. Just don't delete all the .dylib files; instead, use -f to replace only the ones that need to be linked.

ln -s -f ../../torch/lib/libc10.dylib
ln -s -f ../../torch/lib/libshm.dylib
ln -s -f ../../torch/lib/libtorch.dylib
ln -s -f ../../torch/lib/libtorch_cpu.dylib
ln -s -f ../../torch/lib/libtorch_python.dylib
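The same selective relinking could be scripted in Python. This is a minimal sketch, assuming the site-packages layout shown earlier in the thread (torch/lib next to torch_mlir/.dylibs); `relink_bundled_torch_libs` is a hypothetical name, and the rule "relink only files that also exist in torch/lib" is an assumption about which libraries need to be linked.

```python
import os
from pathlib import Path


def relink_bundled_torch_libs(site_packages):
    """Replace copies in torch_mlir/.dylibs with relative symlinks into torch/lib.

    Hypothetical helper: files without a counterpart in torch/lib
    (for example libomp.dylib) are left untouched.
    """
    site_packages = Path(site_packages)
    bundled_dir = site_packages / "torch_mlir" / ".dylibs"
    torch_lib = site_packages / "torch" / "lib"
    relinked = []
    for bundled in sorted(bundled_dir.glob("*.dylib")):
        if not (torch_lib / bundled.name).is_file():
            continue  # nothing to link against, keep the bundled file
        bundled.unlink()  # same effect as the -f flag of ln
        # Relative target, equivalent to: ln -s ../../torch/lib/<name>
        os.symlink(os.path.join("..", "..", "torch", "lib", bundled.name), bundled)
        relinked.append(bundled.name)
    return relinked
```

The function returns the names it relinked, so running it once and printing the result is a quick way to confirm which files were replaced. Running it a second time just recreates the same links.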

Jul 16 '23 13:07 trulyspinach

I have done this but am still getting the same error. After replacing the files, how do I recompile?

Aug 24 '23 07:08 Striker770
