fastllm
How to resolve a segmentation fault?
$ ftllm run Qwen/Qwen3-0.6B
Load libnuma.so.1
CPU Instruction Info: [AVX512F: ON] [AVX512_VNNI: ON] [AVX512_BF16: ON]
Load libfastllm_tools-cpu.so
Segmentation fault (core dumped)
There are generally two causes: 1. A conflict with your existing environment; try creating a new, empty virtual environment and reinstalling. 2. The model was not downloaded completely.
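For reference, the two fixes look roughly like this. The cache path is inferred from the "Model dir" line in the download log further below and may differ on your system, so treat it as an assumption rather than the documented layout.

# 1. Rule out environment conflicts: install ftllm into a clean virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip install ftllm

# 2. Rule out an incomplete download: delete the cached model so it is fetched again
#    (path assumed from the ~/.cache/fastllm layout shown in the logs below)
rm -rf ~/.cache/fastllm/Qwen/Qwen3-0.6B
ftllm run Qwen/Qwen3-0.6B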
I'm new to fastllm, and was just testing this, but can't get past the segmentation fault issue.
Reproduction
Using a fresh pytorch/pytorch:2.7.1-cuda12.8-cudnn9-devel Docker container on 8x H100 SXM (rented on Runpod):
pip install ftllm==0.1.2.0
ftllm server fastllm/DeepSeek-R1-0528-INT4 --port 3000 --host 0.0.0.0
2025-06-07 02:37:17,128 1770 server.py[line:106] INFO: Namespace(command='server', version=False, model='fastllm/DeepSeek-R1-0528-INT4', path='', threads=-1, low=False, dtype='auto', moe_dtype='', atype='auto', cuda_embedding=False, kv_cache_limit='auto', max_batch=-1, device=None, moe_device='', moe_experts=-1, cache_history='', cache_fast='', enable_thinking='', custom='', lora='', cache_dir='', model_name='', host='0.0.0.0', port=3000, api_key='', think='false', hide_input=False)
Model dir: /root/.cache/fastllm/fastllm/DeepSeek-R1-0528-INT4
Fetching repository metadata...
Generating download file list...
Starting download with aria2c...
...
...
...
Status Legend:
(OK):download completed.
Download completed successfully.
Load libnuma.so.1
CPU Instruction Info: [AVX512F: ON] [AVX512_VNNI: ON] [AVX512_BF16: ON]
Load libfastllm_tools.so
Segmentation fault (core dumped)
Ah 🤦 reinstalling in a new virtual environment, as instructed above, solved the issue:
python3 -m venv .venv
source .venv/bin/activate
pip install ftllm==0.1.2.0
ftllm server fastllm/DeepSeek-R1-0528-INT4 --port 3000 --host 0.0.0.0
2025-06-07 03:53:15,020 2310 server.py[line:106] INFO: Namespace(command='server', version=False, model='fastllm/DeepSeek-R1-0528-INT4', path='', threads=-1, low=False, dtype='auto', moe_dtype='', atype='auto', cuda_embedding=False, kv_cache_limit='auto', max_batch=-1, device=None, moe_device='', moe_experts=-1, cache_history='', cache_fast='', enable_thinking='', custom='', lora='', cache_dir='', model_name='', host='0.0.0.0', port=3000, api_key='', think='false', hide_input=False)
Model dir: /root/.cache/fastllm/fastllm/DeepSeek-R1-0528-INT4
Load libnuma.so.1
CPU Instruction Info: [AVX512F: ON] [AVX512_VNNI: ON] [AVX512_BF16: ON]
Load libfastllm_tools.so
Loading 100
Warmup...
finish.
INFO: Started server process [2310]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:3000 (Press CTRL+C to quit)
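With the server running, a quick way to check that it responds is a chat completion request from another shell. This sketch assumes the ftllm server exposes an OpenAI-compatible /v1/chat/completions endpoint on the port passed above; adjust the path and model name if your version differs.

curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "fastllm/DeepSeek-R1-0528-INT4", "messages": [{"role": "user", "content": "Hello"}]}'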
Apologies - I originally tried reinstalling, but didn't do the virtual environment part.