localGPT
xformers can't load C++/CUDA extensions
I'm running this on Apple Silicon (M2) with the CPU flag on. After ingesting, it asks me for a query, then gives me this error and just hangs:
(my_env) ➜ localGPT git:(main) ✗ python3 run_localGPT.py --device_type cpu
Running on: cpu
load INSTRUCTOR_Transformer
max_seq_length 512
Using embedded DuckDB with persistence: data will be stored in: /Users/matthewberman/Desktop/localGPT/DB
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████| 2/2 [00:51<00:00, 25.91s/it]
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.0 with CUDA None (you have 2.0.0)
Python 3.11.3 (you have 3.11.3)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
Enter a query: what is this document about?
/opt/homebrew/lib/python3.11/site-packages/transformers/generation/utils.py:1255: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
warnings.warn(
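For context, one possible workaround (a sketch, not verified on this exact setup): the xFormers warning comes from its prebuilt wheel targeting CUDA, which Apple Silicon does not have, and the warning itself says the affected features simply "won't be available". Since the script is being run with --device_type cpu, removing or rebuilding xformers should at least silence the warning:

```shell
# Assumption: xFormers is optional for CPU-only inference here,
# since its CUDA extensions can never load on an M2 anyway.
pip uninstall -y xformers

# Alternatively, rebuild it from source against the installed
# PyTorch 2.0.0 instead of using the CUDA-built wheel:
pip install -v --no-binary xformers xformers
```

Note this only addresses the warning; whether it also fixes the hang after entering a query is a separate question.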
Can you try PR #43 to see if it solves your issue? I don't have an M2, so this will be a good test.