Problem with CUDA 11.4: how to pip install Faiss for CUDA 11?

Open fanguu opened this issue 2 years ago • 10 comments

Summary

Faiss assertion 'err == CUBLAS_STATUS_SUCCESS' failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<float, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<IndexType, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with AT = float; BT = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at /__w/faiss-wheels/faiss-wheels/faiss/faiss/gpu/utils/MatrixMult-inl.cuh:265; details: cublas failed (13): (512, 128) x (60000, 128)' = (512, 60000) gemm params m 60000 n 512 k 128 trA T trB N lda 128 ldb 128 ldc 60000

Platform

OS: ubuntu 20.04, RTX 3090

Faiss version: faiss-gpu-1.7.1.post2

Installed from: pip, Python 3.7

Running on:

[ ] CPU
[x] GPU

Interface:

[ ] C++
[x] Python

Reproduction instructions

Aug 30 '21 13:08 fanguu
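The number in `cublas failed (13)` is a `cublasStatus_t` value. As a small sketch for reading such messages, the lookup below maps the code to its enum name; the helper name `decode_cublas_failure` is made up for illustration, and the table is copied from the cuBLAS status enum as I understand it. Code 13 corresponds to `CUBLAS_STATUS_EXECUTION_FAILED`, which is a different status from `CUBLAS_STATUS_ALLOC_FAILED` (3).

```python
import re

# Values of the cublasStatus_t enum (assumed to match cublas_api.h).
CUBLAS_STATUS = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
    7: "CUBLAS_STATUS_INVALID_VALUE",
    8: "CUBLAS_STATUS_ARCH_MISMATCH",
    11: "CUBLAS_STATUS_MAPPING_ERROR",
    13: "CUBLAS_STATUS_EXECUTION_FAILED",
    14: "CUBLAS_STATUS_INTERNAL_ERROR",
    15: "CUBLAS_STATUS_NOT_SUPPORTED",
    16: "CUBLAS_STATUS_LICENSE_ERROR",
}


def decode_cublas_failure(message: str) -> str:
    """Hypothetical helper: extract N from 'cublas failed (N)' and name it."""
    match = re.search(r"cublas failed \((\d+)\)", message)
    if match is None:
        raise ValueError("no 'cublas failed (N)' fragment in message")
    code = int(match.group(1))
    return CUBLAS_STATUS.get(code, f"unknown cuBLAS status {code}")


status = decode_cublas_failure(
    "details: cublas failed (13): (512, 128) x (60000, 128)' = (512, 60000)"
)
print(status)
```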

What are you trying to do when it fails?

Sep 01 '21 16:09 mdouze

@mdouze I ran into the same problem. It seems to appear when I try to use clustering on GPU. I followed example 3 from https://www.programcreek.com/python/example/112284/faiss.Clustering. Could you pls take a look? Thanks.

Dec 23 '21 06:12 wetliu

@mdouze I ran into the same problem. It seems to appear when I try to use clustering on GPU. I followed example 3 from https://www.programcreek.com/python/example/112284/faiss.Clustering. Could you pls take a look? Thanks.

I have met the same problem and look forward to your reply!!!

Dec 28 '21 02:12 mqwfrog

Also with an RTX 3090?

Jan 19 '22 11:01 mdouze

RTX A6000

Jan 19 '22 16:01 wetliu

I ran into the same problem when running on an NVIDIA A100. I am using faiss-1.7.1 installed by pip.

Jan 21 '22 02:01 Saltychtao

When I was training the index on 2x RTX 3090 GPUs using around 10M vectors, as train_gpu_script suggests, I ran into the same error:

Faiss assertion 'err == CUBLAS_STATUS_SUCCESS' failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<float, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<IndexType, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with AT = float; BT = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at /project/faiss/faiss/gpu/utils/MatrixMult-inl.cuh:265; details: cublas failed (13): (512, 256) x (262144, 256)' = (512, 262144) gemm params m 262144 n 512 k 256 trA T trB N lda 256 ldb 256 ldc 262144

Jan 25 '22 07:01 xuzhangda-patsnap

Any fixes that work?

Feb 08 '22 19:02 griff4692

Facing the same problem on an A100. Reducing the batch size doesn't seem to help, since there are many documents in the index. It looks like an OOM. There is no problem if I disable --faiss-use-gpu, but then it runs super slowly.

Faiss assertion 'err == CUBLAS_STATUS_SUCCESS' failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<float, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<IndexType, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with AT = __half; BT = __half; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at /project/faiss/faiss/gpu/utils/MatrixMult-inl.cuh:265; details: cublas failed (13): (2, 768) x (524288, 768)' = (2, 524288) gemm params m 524288 n 2 k 768 trA T trB N lda 768 ldb 768 ldc 524288

My installations:

torch 1.7.1+cu110
faiss-cpu 1.7.2
faiss-gpu 1.7.2

Any workaround to this?

Mar 25 '22 22:03 memray
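One way to sanity-check the OOM guess is to turn the shapes printed in the message into buffer sizes. The sketch below is only arithmetic on the logged `(rows, cols)` pairs; the helper name `gemm_operand_bytes` is made up, and it assumes the two inputs are `__half` (2 bytes each, as in the `AT = __half; BT = __half` report) and the output is `float` (4 bytes). It says nothing about what else was allocated on the GPU at the time.

```python
import re


def gemm_operand_bytes(details: str, in_bytes: int, out_bytes: int = 4) -> dict:
    """Hypothetical helper: byte sizes of A, B and C from a Faiss gemm message.

    Expects the fragment "(r, c) x (r, c)' = (r, c)" as printed in the log.
    """
    match = re.search(
        r"\((\d+), (\d+)\) x \((\d+), (\d+)\)'? = \((\d+), (\d+)\)", details
    )
    if match is None:
        raise ValueError("no gemm shape fragment in message")
    a_r, a_c, b_r, b_c, c_r, c_c = map(int, match.groups())
    return {
        "A": a_r * a_c * in_bytes,   # left input
        "B": b_r * b_c * in_bytes,   # right input (transposed in the product)
        "C": c_r * c_c * out_bytes,  # output, assumed float32
    }


details = "cublas failed (13): (2, 768) x (524288, 768)' = (2, 524288)"
sizes = gemm_operand_bytes(details, in_bytes=2)
for name, nbytes in sizes.items():
    print(f"{name}: {nbytes / 2**20:.3f} MiB")
```

Under those assumptions the three operands of this one call add up to roughly 772 MiB, so the matrices named in the message are not by themselves large relative to an A100's memory.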

Did anyone find a solution for it?

Jul 20 '22 07:07 athithya-raj
