tokenizers
Error: ThreadPoolBuildError
Hi,
I am using sentence-transformers, which works well on my local machine, but on the cloud (Compute Canada) it throws a ThreadPoolBuildError: "The global thread pool has not been initialized." How do I initialize the thread pool? Is it related to the tokenizer?
Environment: Compute Canada, Python 3.9.6
The error occurs on Compute Canada with 20000 MB of memory.
Package versions:
huggingface-hub 0.4.0
nltk 3.6.7
numpy 1.21.2+computecanada
pandas 1.3.0+computecanada
scikit-learn 1.0.1+computecanada
scipy 1.7.3+computecanada
sentence-transformers 2.1.0+computecanada
sentencepiece 0.1.96+computecanada
tokenizers 0.10.3+computecanada
torch 1.10.0+computecanada
torchvision 0.11.1+computecanada
transformers 4.16.1
Code:
from sentence_transformers import SentenceTransformer, util
import torch
model = SentenceTransformer('all-MiniLM-L6-v2')
text = ['The cat sits outside']
text_embeddings = model.encode(text)
Error:
Ignored unknown kwarg option direction
thread 'RUST_BACKTRACE=1
environment variable to display a backtrace
Traceback (most recent call last):
File "
Can you try running your script with TOKENIZERS_PARALLELISM=false set?
This should deactivate parallelism within tokenizers and remove the error.
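For reference, here is a minimal sketch of one way to set that variable from inside the script rather than from the shell. The helper name `encode_without_tokenizer_threads` is made up for illustration; the model name is the one from the original report. Setting the variable before the import is simply the safe ordering, not something the thread confirms is required.

```python
import os

# Set the variable before tokenizers is imported, so it is in place
# for every later encode call.
os.environ["TOKENIZERS_PARALLELISM"] = "false"


def encode_without_tokenizer_threads(texts, model_name="all-MiniLM-L6-v2"):
    """Encode texts with tokenizers' internal parallelism switched off.

    Hypothetical helper, not part of sentence-transformers.
    """
    # Imported lazily so the environment variable above is already set.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(texts)
```

Calling `encode_without_tokenizer_threads(['The cat sits outside'])` would then reproduce the original snippet with parallelism disabled. The shell equivalent is to prefix the command with `TOKENIZERS_PARALLELISM=false`.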
If this works, you can write your code single-threaded first, then move to multi-threading or multi-processing directly in Python, which should make it easier to understand what is going on.
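A rough sketch of that progression, assuming the work can be split into independent batches: the same function runs as a plain loop with `workers=1` and uses a Python thread pool otherwise. `encode_in_batches`, `chunked`, and `fake_encode` are hypothetical names; `fake_encode` only stands in for something like `model.encode` so the example runs without any model.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def encode_in_batches(texts, encode_fn, batch_size=2, workers=1):
    """Apply `encode_fn` to batches of `texts`, serially or with threads."""
    batches = list(chunked(texts, batch_size))
    if workers <= 1:
        # Single-threaded path: easiest to debug, so start here.
        results = [encode_fn(batch) for batch in batches]
    else:
        # Parallelism handled in Python; pool.map preserves batch order.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(encode_fn, batches))
    # Flatten the per-batch results back into one list.
    return [row for batch_result in results for row in batch_result]


def fake_encode(batch):
    """Stand-in for a real encoder: returns the length of each text."""
    return [len(text) for text in batch]


texts = ["The cat sits outside", "A dog", "Hi"]
single = encode_in_batches(texts, fake_encode, workers=1)
multi = encode_in_batches(texts, fake_encode, workers=2)
assert single == multi
```

Keeping both paths behind one function makes it easy to check that the parallel version returns the same results as the serial one before relying on it.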
I don't really know the Compute Canada environment. Is there more information available? Is it Linux-based, POSIX, that sort of thing? Maybe parallelism is actually disabled?
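To help answer those questions, a small standard-library script like the following could be run on the cluster node. It is only a sketch: `describe_environment` is a made-up helper, and whether the CPU affinity or the process limit is actually related to this error is an assumption, not something established in this thread.

```python
import os
import platform


def describe_environment():
    """Collect basic facts about the OS and the CPUs this process may use."""
    info = {
        "system": platform.system(),   # e.g. "Linux"
        "os_name": os.name,            # "posix" on POSIX systems
        "cpu_count": os.cpu_count(),   # CPUs on the machine
        "usable_cpus": None,           # CPUs this process is allowed to use
        "max_processes": None,         # soft RLIMIT_NPROC, if available
    }
    # sched_getaffinity exists on Linux but not on every platform.
    if hasattr(os, "sched_getaffinity"):
        info["usable_cpus"] = len(os.sched_getaffinity(0))
    try:
        import resource  # Unix only

        if hasattr(resource, "RLIMIT_NPROC"):
            # On Linux, threads count toward this per-user limit.
            info["max_processes"] = resource.getrlimit(resource.RLIMIT_NPROC)[0]
    except ImportError:
        pass
    return info


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

On a scheduler-managed node, `usable_cpus` can be much smaller than `cpu_count`, which is worth including in a report.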
I got the same problem, but setting TOKENIZERS_PARALLELISM=false didn't solve it for me. Has anybody been able to solve it differently, please?
@sarrahbbh Please re-open an issue with the appropriate details to reproduce it
@Narsil Actually, using fewer GPUs than the number available in my VM solved the issue for me (I guess this is due to how the framework I used is implemented?), but thank you anyway!
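The comment above does not say how the number of GPUs was reduced. One common way, shown here purely as an assumption, is to restrict which devices CUDA exposes through the `CUDA_VISIBLE_DEVICES` environment variable before any CUDA-using library initializes. `limit_visible_gpus` is a hypothetical helper.

```python
import os


def limit_visible_gpus(gpu_indices):
    """Expose only the given GPU indices to CUDA-based libraries.

    Must run before CUDA is initialized (for example, before the first
    GPU call in torch), otherwise it has no effect on this process.
    """
    value = ",".join(str(index) for index in gpu_indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Example: use only the first GPU even if the VM has several.
visible = limit_visible_gpus([0])
```

Many training frameworks also have their own setting for the number of devices, which may be the more direct option depending on the framework in use.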