
[Bug] Access Violation (0xC0000005) in `VLM.from_()` when running via PyInstaller packaged EXE on Windows

Open dalnel opened this issue 1 month ago • 4 comments

Description

I am encountering a critical issue where the application crashes immediately when initializing the VLM model. This occurs only when running the application packaged with PyInstaller on Windows 10/11.

The code runs perfectly in the development environment (Anaconda), but fails in the deployed EXE environment, specifically at the VLM.from_() call.

Environment

  • OS: Windows 10 / Windows 11
  • Environment: PyInstaller Packaged EXE (One-dir mode)
  • Hardware: Target machine is CPU-only (No GPU drivers installed).
  • Library Version: (Please insert your nexaai version here, e.g., 0.0.x)

Symptoms

  • Importing nexaai modules works fine.
  • Checking model file paths works fine.
  • Calling VLM.from_() causes an immediate Access Violation (Exit code 0xC0000005).
  • The crash prevents any downstream functionality relying on NexaAI.

Investigation & Debugging Attempts

I have performed extensive troubleshooting to isolate the issue:

  1. DLL Path Management:

    • In the entry script, I verified that the _internal, nexaai, and numpy.libs subdirectories are correctly added to PATH and registered via os.add_dll_directory() (see the sketch after this list).
    • Verified that nexa_bridge.dll and other dependencies are present in the package.
  2. Forcing CPU/Hardware Modes:

    • Tried setting CUDA_VISIBLE_DEVICES=-1, NEXA_FORCE_CPU_ONLY=1, device_id='cpu', and plugin_id="cpu_gpu".
    • The crash persists regardless of these settings.
    • Confirmed this is not related to PDF processing (fitz) or subprocess logic; it is a pure NexaAI initialization failure.
  3. WinDbg Crash Dump Analysis: I analyzed the crash dump (LAUNCH.exe.7764.dmp) using WinDbg. The call stack indicates the failure happens inside the native bridge:

    nexa_bridge!ml_tts_synthesize
    -> nexa_bridge!ml_vlm_create
    -> vlm_bind_cp310_win_amd64!PyInit_vlm_bind
    -> msvcp140!Thrd_yield+0x140  <-- Access Violation
    

    This confirms VLM.from_() enters the NexaAI native bridge and crashes during an internal thread yield/lock operation.

  4. Environment Variables:

    • Attempted setting NEXA_PLUGIN_PATH and KMP_DUPLICATE_LIB_OK, but neither resolved the crash.
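
For reference, the relevant part of my entry script looks like the following simplified sketch. It assumes PyInstaller's one-dir layout with an _internal folder next to the EXE; NEXA_FORCE_CPU_ONLY is the variable name I tried, and I have not confirmed the SDK actually reads it:

import os
import sys

# Only applies when running as a frozen (PyInstaller) executable.
if getattr(sys, "frozen", False):
    base = os.path.join(os.path.dirname(sys.executable), "_internal")
    for sub in ("", "nexaai", "numpy.libs"):
        d = os.path.join(base, sub)
        if os.path.isdir(d):
            os.add_dll_directory(d)  # extend the Windows DLL search path
            os.environ["PATH"] = d + os.pathsep + os.environ.get("PATH", "")

# Force CPU mode before nexaai is imported (see item 2 above).
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
os.environ["NEXA_FORCE_CPU_ONLY"] = "1"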

Steps to Reproduce

  1. Package the project using pyinstaller LAUNCH.spec (a minimal spec sketch follows these steps).
  2. Deploy the folder to a Windows machine (specifically one without GPU drivers/CUDA).
  3. Run a simple test script via the EXE:
    from nexaai.vlm import VLM
    # ... setup paths ...
    # This line causes the crash:
    model = VLM.from_(...)
    
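For reference, a minimal spec along these lines (a sketch, not my actual LAUNCH.spec; collect_dynamic_libs is PyInstaller's standard hook helper, and Analysis/PYZ/EXE/COLLECT are globals injected when PyInstaller executes a spec file):

from PyInstaller.utils.hooks import collect_dynamic_libs

a = Analysis(
    ["LAUNCH.py"],
    # Bundle nexaai's native libraries (nexa_bridge.dll, the *_bind* .pyd, ...).
    binaries=collect_dynamic_libs("nexaai"),
    hiddenimports=["nexaai.vlm", "nexaai.common"],
)
pyz = PYZ(a.pure)
exe = EXE(pyz, a.scripts, exclude_binaries=True, name="LAUNCH", console=True)
coll = COLLECT(exe, a.binaries, a.datas, name="LAUNCH")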

Questions & Help Needed

  1. Runtime Dependencies: Does nexa_bridge.dll require specific runtimes (e.g., Vulkan SDK, specific CUDA libraries, or MSVC redistributables) to be present on the host machine, even when running in CPU-only mode? PyInstaller might be missing a hidden dependency (a small load-probe sketch follows these questions).
  2. Native Binding Issue: Is there a known issue with vlm_bind when running outside a standard Python site-packages structure?
  3. Crash Dump: I have the .dmp file available. If needed, I can provide it or further logs to help debug this native crash.
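
To help narrow down question 1 on my side, I can run a probe like this inside the frozen EXE before importing nexaai (a sketch; the _internal\nexaai location of nexa_bridge.dll is an assumption based on my package layout):

import ctypes
import os
import sys

# Try the MSVC runtimes implicated by the crash stack, then the bridge itself.
# An OSError here points to a missing redistributable on the target machine;
# a clean load followed by the 0xC0000005 crash points elsewhere.
for name in ("vcruntime140.dll", "vcruntime140_1.dll", "msvcp140.dll"):
    try:
        ctypes.WinDLL(name)
        print("loaded:", name)
    except OSError as exc:
        print("MISSING:", name, "->", exc)

base = os.path.join(os.path.dirname(sys.executable), "_internal")
ctypes.WinDLL(os.path.join(base, "nexaai", "nexa_bridge.dll"))
print("nexa_bridge.dll loaded OK")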

Any advice on how to stabilize the initialization in a frozen/packaged environment would be greatly appreciated.

dalnel avatar Nov 21 '25 03:11 dalnel

Thanks for the detailed report!

To help us reproduce and investigate the crash, could you please provide:

  1. The nexaai version you’re using:
pip show nexaai
  2. The exact model you are trying to load with VLM.from_().

  3. A minimal Python script that reproduces the crash.

In the meantime, you can try running the model using the repository’s test script to isolate whether it’s a PyInstaller issue:

cd bindings/python
python .\vlm.py --model ggml-org/gemma-3-4b-it-GGUF/gemma-3-4b-it-Q4_K_M.gguf --plugin-id cpu_gpu --device cpu

This should run the VLM in a standard Python environment. If it works here but fails in the frozen EXE, it likely points to missing runtime dependencies or DLL issues in the packaged build.
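
If it works there but still crashes in the frozen EXE, one way to narrow it down is to attempt to load every bundled DLL from inside the packaged app and see which one fails first (a sketch, assuming the one-dir layout with an _internal folder next to the EXE):

import ctypes
import os
import pathlib
import sys

base = pathlib.Path(os.path.dirname(sys.executable)) / "_internal"
for dll in sorted(base.rglob("*.dll")):
    try:
        ctypes.WinDLL(str(dll))
        print("OK  ", dll.name)
    except OSError as exc:
        print("FAIL", dll, "->", exc)  # names a concrete missing dependency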

mengshengwu avatar Nov 21 '25 07:11 mengshengwu


nexaai version: (shown in the attached screenshot)

pyinstaller version: (shown in the attached screenshot)

I tried this vlm.py and it runs. I also tried my own code, and it runs as well:

import os
from nexaai.vlm import VLM
from nexaai.common import GenerationConfig, ModelConfig, MultiModalMessage, MultiModalMessageContent

# Initialize the model
model_name = r"C:\Users\user\Desktop\Qwen3-VL-4B-Instruct-GGUF\Qwen3-VL-4B-Instruct.Q4_K.gguf"
mmproj_path = r"C:\Users\user\Desktop\Qwen3-VL-4B-Instruct-GGUF\mmproj.F16.gguf"
m_cfg = ModelConfig() 
vlm = VLM.from_(name_or_path=model_name, m_cfg=m_cfg, plugin_id="cpu_gpu", mmproj_path=mmproj_path, device_id="cpu")

# Prepare the conversation (including an image)
conversation = [
    MultiModalMessage(role="user", content=[
        MultiModalMessageContent(type="text", text="幫我列出,ProductName(品名)、Model(型號)、UnitPrice(單價)、Quantity(數量),並以JSON的格式輸出給我"),
        MultiModalMessageContent(type="image", path=r'C:\Users\lidaniel\Desktop\123.png')
    ]),
    MultiModalMessage(role="system",
                        content=[MultiModalMessageContent(type="text", text="You are a helpful assistant that can understand images and text.")])
]

# Generate the response
prompt = vlm.apply_chat_template(conversation)
image_paths = [r'C:\Users\user\Desktop\123.png']
print("res")
for token in vlm.generate_stream(prompt, g_cfg=GenerationConfig(max_tokens=1024, image_paths=image_paths)):
    print(token, end="", flush=True)
print("\n[finish]")

dalnel avatar Nov 21 '25 07:11 dalnel

We haven’t officially tested or supported packaging the SDK with PyInstaller yet. The crash you’re seeing at VLM.from_() is likely related to PyInstaller configuration or missing runtime dependencies in the packaged EXE.

mengshengwu avatar Nov 21 '25 14:11 mengshengwu

@zhycheng614 Do you have any ideas or suggestions?

mengshengwu avatar Nov 21 '25 14:11 mengshengwu