
Unable to load models after the Vulkan backend (I think)

Open mvenditto opened this issue 1 year ago • 2 comments

I'm trying to build from the latest main so that I can later integrate the Vulkan backend changes into the C# binding, but I'm running into a weird problem.

The build succeeds, but when I try a minimal sample that only creates a model (with buildVariant set to default or auto, NOT gpu), the program hangs after a few Vulkan exceptions (which do not surface).

What I did:

  1. built the backend
    • installed the Vulkan SDK
    • added the cmake arg: -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON
    • built successfully both from VS and manually
  2. Tested the following code:
#include "llmodel_c.h"

int main()
{
    auto err = new llmodel_error();

    // Backslashes in Windows paths must be escaped inside C++ string literals.
    llmodel_set_implementation_search_path("C:\\Users\\dev\\Desktop\\gpt4all\\b6e38d6\\gpt4all\\gpt4all-backend\\out\\build\\x64-Debug\\bin");

    auto model = llmodel_model_create2(
        "C:\\Users\\dev\\AppData\\Local\\nomic.ai\\GPT4All\\orca-mini-3b.ggmlv3.q4_0.bin", "auto",  err);
}
  3. Observed the program hang indefinitely

Some debugging info

After a bit of debugging, it seems that for some reason roughly the following happens (see the call stack below for more details):

  1. llmodel_model_create2() is invoked
  2. LLModel::Implementation::implementationList gets called to find the proper implementation for the model
  3. The first implementation dll to be tested (in this case libbert_avxonly) is loaded with Dlhandle::Dlhandle()
  4. The execution crashes when some Vulkan initialization code is triggered at gpt4all-backend\llama.cpp-mainline\kompute\src\Manager.cpp#L51
    I tried commenting out that line as a quick and dirty test, and the model then gets created successfully.

A few exceptions can be observed:

Exception thrown at 0x00007FFC4D234C3C in llmodel_test.exe: Microsoft C++ exception: VK::Exception at memory location 0x0000005DC21093F0.
Exception thrown at 0x00007FFC4D234C3C in llmodel_test.exe: Microsoft C++ exception: VK::Exception at memory location 0x0000005DC21093F0.
Exception thrown at 0x00007FFC4D234C3C in llmodel_test.exe: Microsoft C++ exception: VK::Exception at memory location 0x0000005DC21093F0.

Full Call Stack

Stack trace
 	nvoglv64.dll!00007ffb851a9d06()	Unknown
 	nvoglv64.dll!00007ffb851a760b()	Unknown
 	nvoglv64.dll!00007ffb851a59da()	Unknown
 	nvoglv64.dll!00007ffb8515610d()	Unknown
 	nvoglv64.dll!00007ffb85156ece()	Unknown
 	nvoglv64.dll!00007ffb855a9fb7()	Unknown
 	nvoglv64.dll!00007ffb855a8f69()	Unknown
 	vulkan-1.dll!00007ffbddbbec9a()	Unknown
 	nvoglv64.dll!00007ffb85507102()	Unknown
 	vulkan-1.dll!00007ffbddba8e25()	Unknown
 	vulkan-1.dll!00007ffbddbc6a3d()	Unknown
	kompute.dll!vk::DispatchLoaderStatic::vkCreateInstance(const VkInstanceCreateInfo * pCreateInfo, const VkAllocationCallbacks * pAllocator, VkInstance_T * * pInstance) Line 1322	C++
 	kompute.dll!vk::createInstance<vk::DispatchLoaderStatic>(const vk::InstanceCreateInfo * pCreateInfo, const vk::AllocationCallbacks * pAllocator, vk::Instance * pInstance, const vk::DispatchLoaderStatic & d) Line 26	C++
 	kompute.dll!kp::Manager::createInstance() Line 234	C++
 	kompute.dll!kp::Manager::Manager() Line 51	C++
 	bert-avxonly.dll!`dynamic initializer for 'mgr''() Line 70	C++
 	ucrtbased.dll!00007ffbe01d22a3()	Unknown
 	bert-avxonly.dll!dllmain_crt_process_attach(HINSTANCE__ * const instance, void * const reserved) Line 66	C++
 	bert-avxonly.dll!dllmain_crt_dispatch(HINSTANCE__ * const instance, const unsigned long reason, void * const reserved) Line 219	C++
 	bert-avxonly.dll!dllmain_dispatch(HINSTANCE__ * const instance, const unsigned long reason, void * const reserved) Line 276	C++
 	bert-avxonly.dll!_DllMainCRTStartup(HINSTANCE__ * const instance, const unsigned long reason, void * const reserved) Line 335	C++
 	ntdll.dll!00007ffc4fe1868f()	Unknown
 	ntdll.dll!00007ffc4fe5d05d()	Unknown
 	ntdll.dll!00007ffc4fe5ce0e()	Unknown
 	ntdll.dll!00007ffc4fe1d61d()	Unknown
 	ntdll.dll!00007ffc4fe18930()	Unknown
 	ntdll.dll!00007ffc4fe08c9c()	Unknown
 	ntdll.dll!00007ffc4fe1a24a()	Unknown
 	KernelBase.dll!00007ffc4d1f5ef2()	Unknown
 	KernelBase.dll!00007ffc4d1f1f91()	Unknown
 	llmodel_test.exe!Dlhandle::Dlhandle(const std::string & fpath) Line 78	C++
 	llmodel_test.exe!LLModel::Implementation::implementationList::__l2::<lambda>::()::__l2::<lambda>::operator()(const std::string & paths) Line 97	C++
 	llmodel_test.exe!LLModel::Implementation::implementationList::__l2::<lambda>::operator()() Line 115	C++
 	llmodel_test.exe!LLModel::Implementation::implementationList() Line 81	C++
 	llmodel_test.exe!LLModel::Implementation::implementation(std::basic_ifstream<char,std::char_traits<char>> & f, const std::string & buildVariant) Line 122	C++
 	llmodel_test.exe!LLModel::Implementation::construct(const std::string & modelPath, std::string buildVariant) Line 171	C++
 	llmodel_test.exe!llmodel_model_create2(const char * model_path, const char * build_variant, llmodel_error * error) Line 31	C++
 	llmodel_test.exe!main() Line 13	C++
 	llmodel_test.exe!invoke_main() Line 79	C++
 	llmodel_test.exe!__scrt_common_main_seh() Line 288	C++
 	llmodel_test.exe!__scrt_common_main() Line 331	C++
 	llmodel_test.exe!mainCRTStartup(void * __formal) Line 17	C++
 	kernel32.dll!00007ffc4e0826ad()	Unknown
 	ntdll.dll!00007ffc4fe4aa68()	Unknown

Am I doing something wrong or missing some build bits after the latest changes?

Thanks

mvenditto, Sep 03 '23 20:09