Windows parallelization

Open verakumova opened this issue 2 years ago • 7 comments

Hi! Thank you for the great project! Unfortunately I'm experiencing some issues, which could be caused by Windows (10 Pro) and I'm not sure how to solve them.

I installed autofaiss with conda into a new env with Python 3.6. First, I had problems with import: ImportError: DLL load failed while importing _swigfaiss: The specified module could not be found.

I solved that by first installing openblas, numpy and faiss from conda-forge:

    conda create --name faiss_env python=3.6
    conda activate faiss_env
    conda install conda-forge::blas=*=openblas
    conda install -c conda-forge numpy
    conda install -c conda-forge faiss
    pip install autofaiss

Then I tried to run the example from README, but I have encountered an error in embedding_reader:

~\.conda\envs\faiss_env\lib\site-packages\embedding_reader\get_file_list.py in _get_file_list(path, file_format, sort_result)
     42     path = make_path_absolute(path)
     43     fs, path_in_fs = fsspec.core.url_to_fs(path)
---> 44     prefix = path[: path.index(path_in_fs)]
ValueError: substring not found

I found out that the problem is in fsspec.core.url_to_fs, namely in the call to the private method _strip_protocol on line 402 of fsspec\core.py: urlpath = fs._strip_protocol(url). This line changes backslashes to forward slashes, so path_in_fs is no longer a substring of path.
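To make the mismatch concrete, here is a minimal sketch using plain strings only (no fsspec call). The two values are illustrative stand-ins for what is described above, and split_prefix is a hypothetical helper, not the fix that embedding_reader ships. It assumes the only difference between the two strings is the separator style.

```python
def split_prefix(path: str, path_in_fs: str) -> str:
    """Return the part of `path` that precedes `path_in_fs`.

    Hypothetical helper: both strings are compared with forward slashes,
    so a Windows-style `path` still matches the normalized `path_in_fs`.
    """
    normalized = path.replace("\\", "/")
    return normalized[: normalized.index(path_in_fs)]


# Illustrative values shaped like the ones in the report (not captured logs).
path = "C:\\Users\\USER\\embeddings"
path_in_fs = "C:/Users/USER/embeddings"

try:
    # Mirrors the failing expression: the backslash form never contains
    # the forward-slash form, so str.index raises ValueError.
    path.index(path_in_fs)
except ValueError as err:
    print("original lookup fails:", err)

# After normalizing separators the lookup succeeds.
print(repr(split_prefix(path, path_in_fs)))
```

Normalizing before the lookup avoids touching fsspec internals, which seems safer than renaming a private method in an installed package.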

Now comes the incomprehensible part: when I changed the private method _strip_protocol to the public method strip_protocol (I only deleted the leading underscore), the ValueError disappeared and the function preserved the backslashes in the path... but then another error appeared:

RuntimeError: Error in __cdecl faiss::FileIOWriter::FileIOWriter(const char *) at D:\a\faiss-wheels\faiss-wheels\faiss\faiss\impl\io.cpp:98: Error: 'f' failed: could not open C:\Users\USER\AppData\Local\Temp\tmp2jqscc1t for writing: Permission denied

This looks to me like a problem with parallelization, and I don't know how to solve it. I suppose that my fix for the ValueError was not the correct one and that there is still some problem with the Windows implementation.

Can you give me some advice on how to find a solution to this?

Thanks!

verakumova avatar Mar 16 '22 12:03 verakumova

Faiss is not supported on Windows, so you probably need to use Linux. You could consider WSL2, for example.

rom1504 avatar Mar 16 '22 12:03 rom1504

Thank you for the quick reply. Actually, I'm able to run faiss on Windows; I have the problem only with autofaiss. But from your reply I conclude that autofaiss is not supported on Windows. And would you happen to know why the change from the private to the public method 'helped'?

verakumova avatar Mar 17 '22 12:03 verakumova

Can you try using a virtual env instead of conda? The build that is tested and that works with autofaiss is the one on PyPI, which is a different build from the conda one.

rom1504 avatar Mar 17 '22 13:03 rom1504

That's because NamedTemporaryFile will delete the temporary file before faiss accesses it. See the Python documentation for more information. To fix this, simply pass delete=False to NamedTemporaryFile in the get_index_size function of autofaiss.
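A rough sketch of that pattern, for illustration only. size_of_written_file and write_dummy are hypothetical names, not autofaiss code; write_fn stands in for a writer such as faiss.write_index(index, path), and the assumption is that the helper only needs the on-disk size of what gets written.

```python
import os
import tempfile


def size_of_written_file(write_fn) -> int:
    """Return the size in bytes of whatever `write_fn(path)` writes.

    Hypothetical stand-in for a get_index_size-style helper.
    """
    # delete=False keeps the file on disk after close().
    tmp = tempfile.NamedTemporaryFile(delete=False)
    tmp.close()  # release our handle so the writer can open the path on Windows
    try:
        write_fn(tmp.name)
        return os.path.getsize(tmp.name)
    finally:
        os.remove(tmp.name)  # with delete=False, cleanup is manual


def write_dummy(path: str) -> None:
    """Dummy writer used here in place of a real index writer."""
    with open(path, "wb") as f:
        f.write(b"\x00" * 1024)


print(size_of_written_file(write_dummy))
```

Closing the handle before the writer runs is the important part; delete=False only makes sure the file still exists at that point.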

tientr avatar Sep 14 '22 16:09 tientr

Ah, the specific problem with the temporary dir is that autofaiss writes there and apparently you don't have permission to write there. Can you check why?

You're probably the first one to try autofaiss on Windows.

rom1504 avatar Oct 11 '22 07:10 rom1504

@rom1504 that's because of how Python's NamedTemporaryFile works (read more here):

Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows). If delete is true (the default), the file is deleted as soon as it is closed.

Autofaiss runs just fine on Windows. Tested with Anaconda.
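For anyone who would rather avoid reopening a named temporary file at all, one possible alternative is to write into a temporary directory, where no handle is held on the target path. This is only a sketch under that assumption: size_via_temp_dir, demo_writer and the file name index.bin are hypothetical, and write_fn again stands in for a writer such as faiss.write_index(index, path).

```python
import os
import tempfile


def size_via_temp_dir(write_fn) -> int:
    """Return the size in bytes of the file that `write_fn(path)` creates."""
    # The directory and its contents are removed when the block exits.
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = os.path.join(tmp_dir, "index.bin")  # hypothetical file name
        write_fn(target)  # the path is not open anywhere else at this point
        return os.path.getsize(target)


def demo_writer(path: str) -> None:
    """Dummy writer used here in place of a real index writer."""
    with open(path, "wb") as f:
        f.write(b"\x01" * 2048)


print(size_via_temp_dir(demo_writer))
```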

tientr avatar Oct 11 '22 10:10 tientr