localGPT
xformers can't load C++/CUDA extensions
I'm running this on Apple Silicon (M2) with the CPU flag on. After ingesting, it asks me for a query, then gives me this error and just hangs:
(my_env) ➜ localGPT git:(main) ✗ python3 run_localGPT.py --device_type cpu
Running on: cpu
load INSTRUCTOR_Transformer
max_seq_length 512
Using embedded DuckDB with persistence: data will be stored in: /Users/matthewberman/Desktop/localGPT/DB
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████| 2/2 [00:51<00:00, 25.91s/it]
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.0 with CUDA None (you have 2.0.0)
Python 3.11.3 (you have 3.11.3)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
Enter a query: what is this document about?
/opt/homebrew/lib/python3.11/site-packages/transformers/generation/utils.py:1255: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
warnings.warn(
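For context, one possible workaround (a sketch, not verified on this exact setup): the xFormers warning comes from its prebuilt wheel targeting CUDA, which Apple Silicon does not have, and the warning itself says the affected features simply "won't be available". Since the script is being run with --device_type cpu, removing or rebuilding xformers should at least silence the warning:

```shell
# Assumption: xFormers is optional for CPU-only inference here,
# since its CUDA extensions can never load on an M2 anyway.
pip uninstall -y xformers

# Alternatively, rebuild it from source against the installed
# PyTorch 2.0.0 instead of using the CUDA-built wheel:
pip install -v --no-binary xformers xformers
```

Note this only addresses the warning; whether it also fixes the hang after entering a query is a separate question.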
Can you try PR #43 to see if it solves your issue? I don't have an M2, so this will be a good test.