[Segmentation fault] python3 torchchat.py export stories15M --dtype fp32 --quantize '{"embedding": {"bitwidth": 4, "groupsize":32}, "linear:a8w4dq": {"groupsize" : 256}}' --output-pte-path stories15M.pte
https://github.com/pytorch/torchchat/actions/runs/9047866134/job/24860312456?pr=751
This is a launch blocker for torchchat because it causes failures for users following the example commands in our docs.
+ python3 torchchat.py export stories15M --dtype fp32 --quantize '{"embedding": {"bitwidth": 4, "groupsize":32}, "linear:a8w4dq": {"groupsize" : 256}}' --output-pte-path stories15M.pte
/opt/homebrew/Caskroom/miniconda/base/envs/test-quantization-mps-macos/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py:1474: UserWarning: Mutation on a buffer in the model is detected. ExecuTorch assumes buffers that are mutated in the graph have a meaningless initial state, only the shape and dtype will be serialized.
warnings.warn(
Using device=cpu
Loading model...
Time to load model: 0.01 seconds
Quantizing the model with: {'embedding': {'bitwidth': 4, 'groupsize': 32}, 'linear:a8w4dq': {'groupsize': 256}}
Time to quantize model: 7.83 seconds
Exporting model using ExecuTorch to /Users/ec2-user/runner/_work/torchchat/torchchat/pytorch/torchchat/stories15M.pte
The methods are: {'forward'}
+ python3 generate.py stories15M --pte-path stories15M.pte --prompt 'Hello my name is'
[program.cpp:130] InternalConsistency verification requested but not available
[method.cpp:939] Overriding output data pointer allocated by memory plan is not allowed.
./run-quantization.sh: line 27: 18269 Segmentation fault: 11 python3 generate.py stories15M --pte-path stories15M.pte --prompt "Hello my name is"
Error: Process completed with exit code 1.
Also https://github.com/pytorch/torchchat/actions/runs/9054732211/job/24874908070?pr=768
Thanks for reporting.
@mikekgfb I tried reproducing locally but can't so far. Is it reproducible for you consistently, or did it happen randomly?
Consistently reproducible both in ci and locally
I wonder if this is caused by the CI flow exporting to the same file name and there being some collision with multiple threads exporting to the same named .pte file. And when running the model, there was some corruption with the file causing segfault.
do you mind sharing the model artifact causing seg fault? Can help with jumpstarting the debug for this.
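If the collision hypothesis is right, one way to rule it out would be to have each CI job export to its own unique path instead of a shared `stories15M.pte`. A minimal sketch (the `mktemp` wrapper is a hypothetical change to the CI script, not part of torchchat):

```shell
# Hypothetical mitigation: export each job's .pte into a fresh temp
# directory so concurrent jobs never write to the same file.
PTE_PATH="$(mktemp -d)/stories15M.pte"
python3 torchchat.py export stories15M --dtype fp32 \
  --quantize '{"embedding": {"bitwidth": 4, "groupsize": 32}, "linear:a8w4dq": {"groupsize": 256}}' \
  --output-pte-path "$PTE_PATH"
python3 generate.py stories15M --pte-path "$PTE_PATH" --prompt "Hello my name is"
```

If the segfault disappears with unique paths, that would point at file corruption from the collision rather than a bug in the exported program itself.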
I don't think we use multithreading? That being said, this works now.