infinity
Error in offline mode with `trust_remote_code`: SFR-Embedding-Mistral and nomic do not work without `einops`
Model description
You have mentioned that the SFR-Embedding model is supported, along with all other Hugging Face embedding models (ref. nomic). However, neither is working:
infinity | ERROR 2024-03-21 14:35:59,554 infinity_emb ERROR: acceleration.py:21
infinity | BetterTransformer is not available for model. The
infinity | model type mistral is not yet supported to be used
infinity | with BetterTransformer. Feel free to open an issue
infinity | at https://github.com/huggingface/optimum/issues if
infinity | you would like this model type to be supported.
infinity | Currently supported models are: dict_keys(['albert',
infinity | 'bark', 'bart', 'bert', 'bert-generation',
infinity | 'blenderbot', 'bloom', 'camembert', 'blip-2',
infinity | 'clip', 'codegen', 'data2vec-text', 'deit',
infinity | 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2',
infinity | 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm',
infinity | 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt',
infinity | 'pegasus', 'rembert', 'prophetnet', 'roberta',
infinity | 'roc_bert', 'roformer', 'splinter', 'tapas', 't5',
infinity | 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2',
infinity | 'xlm-roberta', 'yolos']).. Continue without
infinity | bettertransformer modeling code.
infinity | Traceback (most recent call last):
infinity | File
infinity | "/app/infinity_emb/transformer/acceleration.py",
infinity | line 19, in to_bettertransformer
infinity | model = BetterTransformer.transform(model)
infinity | File "/usr/lib/python3.10/contextlib.py", line 79,
infinity | in inner
infinity | return func(*args, **kwds)
infinity | File
infinity | "/app/.venv/lib/python3.10/site-packages/optimum/bet
infinity | tertransformer/transformation.py", line 234, in
infinity | transform
infinity | raise NotImplementedError(
infinity | NotImplementedError: The model type mistral is not
infinity | yet supported to be used with BetterTransformer.
Open source status
- [ ] The model implementation is available on transformers
- [ ] The model weights are available on huggingface-hub
- [ ] I verified that the model is currently not running in infinity
Provide useful links for the implementation
No response
Thanks for opening the issue. Did you really try to get nomic running?
I would not be concerned about the stacktrace of
infinity | NotImplementedError: The model type mistral is not
It's just an informational warning: it says that mistral already uses a good attention implementation, and that optimum's BetterTransformer has no better one to offer. Infinity logs the exception and continues with the default modeling code.
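To illustrate why the server keeps running despite the ERROR line, here is a minimal sketch of the try/fallback pattern that `acceleration.py` uses around `BetterTransformer.transform`. The function names and the stand-in `unsupported_transform` below are illustrative, not the exact infinity code:

```python
import logging

logger = logging.getLogger("infinity_emb")

def try_to_bettertransformer(model, transform):
    """Attempt an optional speedup; return the unmodified model on failure."""
    try:
        return transform(model)
    except NotImplementedError as ex:
        # This is the branch that produces the ERROR log line above:
        # the exception is logged and the server continues as normal.
        logger.warning("Continue without bettertransformer modeling code: %s", ex)
        return model

def unsupported_transform(model):
    # Stands in for BetterTransformer.transform raising on an
    # unsupported architecture such as mistral.
    raise NotImplementedError("The model type mistral is not yet supported")

model = object()
optimized = try_to_bettertransformer(model, unsupported_transform)
```

So the stacktrace in the log is cosmetic: the original model object is used unchanged when the optimization is unavailable.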
nomic
python3 -m venv venv
source ./venv/bin/activate
pip install infinity_emb[all]
pip install einops  # einops is required only by nomic's custom modeling code
infinity_emb --model-name-or-path nomic-ai/nomic-embed-text-v1.5
(.venv) (base) michael@michael-laptop:~/infinity/libs/infinity_emb$ infinity_emb --model-name-or-path nomic-ai/nomic-embed-text-v1.5
INFO: Started server process [426215]
INFO: Waiting for application startup.
INFO 2024-03-30 09:31:45,673 infinity_emb INFO: model=`nomic-ai/nomic-embed-text-v1.5` selected, using engine=`torch` and device=`None` select_model.py:54
INFO 2024-03-30 09:31:46,118 sentence_transformers.SentenceTransformer INFO: Load pretrained SentenceTransformer: SentenceTransformer.py:107
nomic-ai/nomic-embed-text-v1.5
model.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 547M/547M [00:25<00:00, 21.9MB/s]
WARNING 2024-03-30 09:32:14,036 modeling_hf_nomic_bert.py:357
transformers_modules.nomic-ai.nomic-embed-text-v1-unsupervised.3916676c856f1e25a4cc7a4e0ac740ea6ca9723a.modeling_hf_nomic_bert
WARNING: <All keys matched successfully>
tokenizer_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 1.19k/1.19k [00:00<00:00, 8.19MB/s]
vocab.txt: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 232k/232k [00:00<00:00, 1.87MB/s]
tokenizer.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 711k/711k [00:00<00:00, 2.94MB/s]
special_tokens_map.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 695/695 [00:00<00:00, 5.14MB/s]
1_Pooling/config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 286/286 [00:00<00:00, 1.99MB/s]
INFO 2024-03-30 09:32:16,061 sentence_transformers.SentenceTransformer INFO: Use pytorch device_name: cuda SentenceTransformer.py:213
INFO 2024-03-30 09:32:16,502 infinity_emb INFO: Adding optimizations via Huggingface optimum. acceleration.py:17
ERROR 2024-03-30 09:32:16,503 infinity_emb ERROR: BetterTransformer is not available for model. The model type nomic_bert is not yet supported acceleration.py:21
to be used with BetterTransformer. Feel free to open an issue at https://github.com/huggingface/optimum/issues if you would like this
model type to be supported. Currently supported models are: dict_keys(['albert', 'bark', 'bart', 'bert', 'bert-generation', 'blenderbot',
'bloom', 'camembert', 'blip-2', 'clip', 'codegen', 'data2vec-text', 'deit', 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2', 'gptj',
'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm', 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt', 'pegasus', 'rembert', 'prophetnet',
'roberta', 'roc_bert', 'roformer', 'splinter', 'tapas', 't5', 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2', 'xlm-roberta', 'yolos'])..
Continue without bettertransformer modeling code.
Traceback (most recent call last):
File "/home/michael/infinity/libs/infinity_emb/infinity_emb/transformer/acceleration.py", line 19, in to_bettertransformer
model = BetterTransformer.transform(model)
File "/usr/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/home/michael/infinity/libs/infinity_emb/.venv/lib/python3.10/site-packages/optimum/bettertransformer/transformation.py", line
234, in transform
raise NotImplementedError(
NotImplementedError: The model type nomic_bert is not yet supported to be used with BetterTransformer. Feel free to open an issue at
https://github.com/huggingface/optimum/issues if you would like this model type to be supported. Currently supported models are:
dict_keys(['albert', 'bark', 'bart', 'bert', 'bert-generation', 'blenderbot', 'bloom', 'camembert', 'blip-2', 'clip', 'codegen',
'data2vec-text', 'deit', 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2', 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm',
'm2m_100', 'marian', 'markuplm', 'mbart', 'opt', 'pegasus', 'rembert', 'prophetnet', 'roberta', 'roc_bert', 'roformer', 'splinter',
'tapas', 't5', 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2', 'xlm-roberta', 'yolos']).
INFO 2024-03-30 09:32:16,510 infinity_emb INFO: Switching to half() precision (cuda: fp16). sentence_transformer.py:73
INFO 2024-03-30 09:32:17,047 infinity_emb INFO: Getting timings for batch_size=32 and avg tokens per sentence=1 select_model.py:77
5.65 ms tokenization
13.25 ms inference
0.26 ms post-processing
19.16 ms total
embeddings/sec: 1670.14
INFO 2024-03-30 09:32:18,570 infinity_emb INFO: Getting timings for batch_size=32 and avg tokens per sentence=512 select_model.py:83
14.14 ms tokenization
13.47 ms inference
726.95 ms post-processing
754.57 ms total
embeddings/sec: 42.41
INFO 2024-03-30 09:32:18,572 infinity_emb INFO: model warmed up, between 42.41-1670.14 embeddings/sec at batch_size=32 select_model.py:84
INFO 2024-03-30 09:32:18,574 infinity_emb INFO: creating batching engine batch_handler.py:392
INFO 2024-03-30 09:32:18,575 infinity_emb INFO: ready to batch requests. batch_handler.py:249
INFO 2024-03-30 09:32:18,577 infinity_emb INFO: infinity_server.py:64
♾️ Infinity - Embedding Inference Server
MIT License; Copyright (c) 2023 Michael Feil
Version 0.0.31
Open the Docs via Swagger UI:
http://0.0.0.0:7997/docs
Access model via 'GET':
curl http://0.0.0.0:7997/models
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:7997 (Press CTRL+C to quit)
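Once the server is up, it can be queried over HTTP. This is a minimal client sketch; the `/embeddings` route and the `{"model": ..., "input": [...]}` payload shape follow infinity's OpenAI-compatible API, but adjust the host, port, and model name to your deployment:

```python
import json
from urllib import request

def embeddings_payload(model: str, texts: list[str]) -> dict:
    """Build the JSON body for POST /embeddings."""
    return {"model": model, "input": texts}

def embed(texts, model="nomic-ai/nomic-embed-text-v1.5",
          url="http://0.0.0.0:7997/embeddings"):
    """Send texts to a running infinity server and return the parsed response."""
    body = json.dumps(embeddings_payload(model, texts)).encode()
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)
```

Usage would be `embed(["hello world"])` against the server started above; the `GET /models` route shown in the log can be used first to confirm which model is loaded.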
Mistral
@prasannakrish97 Can you try running the above commands and post the output here?
Hello. We are using Docker image 0.0.31. We install our models (nomic-embed-text-v1 and nomic-embed-text-v1.5) locally under /data, with no internet access; einops is 0.7.0. We get an error after the aforementioned warning (the same error for both models). SFR-Embedding-Mistral works as intended once past the warning, which can be ignored.
However, we're encountering the following problem for nomic (nota bene: the same nomic model works well locally with Text Embedding Inference, but not with infinity):
infinity-nomic_1 | INFO: Started server process [1]
infinity-nomic_1 | INFO: Waiting for application startup.
infinity-nomic_1 | INFO 2024-04-05 08:52:36,666 infinity_emb INFO: select_model.py:54
infinity-nomic_1 | model=`/data` selected, using engine=`torch` and
infinity-nomic_1 | device=`None`
infinity-nomic_1 | INFO 2024-04-05 08:52:36,678 SentenceTransformer.py:107
infinity-nomic_1 | sentence_transformers.SentenceTransformer
infinity-nomic_1 | INFO: Load pretrained SentenceTransformer:
infinity-nomic_1 | /data
infinity-nomic_1 | WARNING 2024-04-05 08:52:42,469 modeling_hf_nomic_bert.py:357
infinity-nomic_1 | transformers_modules.data.modeling_hf_nom
infinity-nomic_1 | ic_bert WARNING: <All keys matched
infinity-nomic_1 | successfully>
infinity-nomic_1 | INFO 2024-04-05 08:52:42,536 SentenceTransformer.py:213
infinity-nomic_1 | sentence_transformers.SentenceTransformer
infinity-nomic_1 | INFO: Use pytorch device_name: cpu
infinity-nomic_1 | INFO 2024-04-05 08:52:42,560 infinity_emb INFO: Adding acceleration.py:17
infinity-nomic_1 | optimizations via Huggingface optimum.
infinity-nomic_1 | ERROR 2024-04-05 08:52:42,562 infinity_emb ERROR: acceleration.py:21
infinity-nomic_1 | BetterTransformer is not available for model. The
infinity-nomic_1 | model type nomic_bert is not yet supported to be
infinity-nomic_1 | used with BetterTransformer. Feel free to open an
infinity-nomic_1 | issue at
infinity-nomic_1 | https://github.com/huggingface/optimum/issues if you
infinity-nomic_1 | would like this model type to be supported.
infinity-nomic_1 | Currently supported models are: dict_keys(['albert',
infinity-nomic_1 | 'bark', 'bart', 'bert', 'bert-generation',
infinity-nomic_1 | 'blenderbot', 'bloom', 'camembert', 'blip-2',
infinity-nomic_1 | 'clip', 'codegen', 'data2vec-text', 'deit',
infinity-nomic_1 | 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2',
infinity-nomic_1 | 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm',
infinity-nomic_1 | 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt',
infinity-nomic_1 | 'pegasus', 'rembert', 'prophetnet', 'roberta',
infinity-nomic_1 | 'roc_bert', 'roformer', 'splinter', 'tapas', 't5',
infinity-nomic_1 | 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2',
infinity-nomic_1 | 'xlm-roberta', 'yolos']).. Continue without
infinity-nomic_1 | bettertransformer modeling code.
infinity-nomic_1 | Traceback (most recent call last):
infinity-nomic_1 | File
infinity-nomic_1 | "/app/infinity_emb/transformer/acceleration.py",
infinity-nomic_1 | line 19, in to_bettertransformer
infinity-nomic_1 | model = BetterTransformer.transform(model)
infinity-nomic_1 | File "/usr/lib/python3.10/contextlib.py", line 79,
infinity-nomic_1 | in inner
infinity-nomic_1 | return func(*args, **kwds)
infinity-nomic_1 | File
infinity-nomic_1 | "/app/.venv/lib/python3.10/site-packages/optimum/bet
infinity-nomic_1 | tertransformer/transformation.py", line 234, in
infinity-nomic_1 | transform
infinity-nomic_1 | raise NotImplementedError(
infinity-nomic_1 | NotImplementedError: The model type nomic_bert is
infinity-nomic_1 | not yet supported to be used with BetterTransformer.
infinity-nomic_1 | Feel free to open an issue at
infinity-nomic_1 | https://github.com/huggingface/optimum/issues if you
infinity-nomic_1 | would like this model type to be supported.
infinity-nomic_1 | Currently supported models are: dict_keys(['albert',
infinity-nomic_1 | 'bark', 'bart', 'bert', 'bert-generation',
infinity-nomic_1 | 'blenderbot', 'bloom', 'camembert', 'blip-2',
infinity-nomic_1 | 'clip', 'codegen', 'data2vec-text', 'deit',
infinity-nomic_1 | 'distilbert', 'electra', 'ernie', 'fsmt', 'gpt2',
infinity-nomic_1 | 'gptj', 'gpt_neo', 'gpt_neox', 'hubert', 'layoutlm',
infinity-nomic_1 | 'm2m_100', 'marian', 'markuplm', 'mbart', 'opt',
infinity-nomic_1 | 'pegasus', 'rembert', 'prophetnet', 'roberta',
infinity-nomic_1 | 'roc_bert', 'roformer', 'splinter', 'tapas', 't5',
infinity-nomic_1 | 'vilt', 'vit', 'vit_mae', 'vit_msn', 'wav2vec2',
infinity-nomic_1 | 'xlm-roberta', 'yolos']).
infinity-nomic_1 | ERROR: Traceback (most recent call last):
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/starlette/routing.py", line 677, in lifespan
infinity-nomic_1 | async with self.lifespan_context(app) as maybe_state:
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/starlette/routing.py", line 566, in __aenter__
infinity-nomic_1 | await self._router.startup()
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/starlette/routing.py", line 654, in startup
infinity-nomic_1 | await handler()
infinity-nomic_1 | File "/app/infinity_emb/infinity_server.py", line 62, in _startup
infinity-nomic_1 | app.model = AsyncEmbeddingEngine.from_args(engine_args)
infinity-nomic_1 | File "/app/infinity_emb/engine.py", line 49, in from_args
infinity-nomic_1 | engine = cls(**asdict(engine_args), _show_deprecation_warning=False)
infinity-nomic_1 | File "/app/infinity_emb/engine.py", line 40, in __init__
infinity-nomic_1 | self._model, self._min_inference_t, self._max_inference_t = select_model(
infinity-nomic_1 | File "/app/infinity_emb/inference/select_model.py", line 68, in select_model
infinity-nomic_1 | loaded_engine.warmup(batch_size=engine_args.batch_size, n_tokens=1)
infinity-nomic_1 | File "/app/infinity_emb/transformer/abstract.py", line 55, in warmup
infinity-nomic_1 | return run_warmup(self, inp)
infinity-nomic_1 | File "/app/infinity_emb/transformer/abstract.py", line 105, in run_warmup
infinity-nomic_1 | embed = model.encode_core(feat)
infinity-nomic_1 | File "/app/infinity_emb/transformer/embedder/sentence_transformer.py", line 97, in encode_core
infinity-nomic_1 | out_features = self.forward(features)["sentence_embedding"]
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
infinity-nomic_1 | input = module(input)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
infinity-nomic_1 | return self._call_impl(*args, **kwargs)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
infinity-nomic_1 | return forward_call(*args, **kwargs)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/sentence_transformers/models/Transformer.py", line 98, in forward
infinity-nomic_1 | output_states = self.auto_model(**trans_features, return_dict=False)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
infinity-nomic_1 | return self._call_impl(*args, **kwargs)
infinity-nomic_1 | File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
infinity-nomic_1 | return forward_call(*args, **kwargs)
infinity-nomic_1 | TypeError: NomicBertModel.forward() got an unexpected keyword argument 'return_dict'
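The final `TypeError` happens because `sentence_transformers` calls `self.auto_model(**features, return_dict=False)`, while the version of the custom `NomicBertModel.forward()` loaded via `trust_remote_code` does not accept that keyword. A purely illustrative way to see the mismatch is to check the forward signature before passing the argument; the `DummyNomicBertModel` below is a stand-in for the remote-code class, not nomic's actual implementation:

```python
import inspect

class DummyNomicBertModel:
    # Mimics a forward() signature that lacks return_dict (and **kwargs),
    # like the older remote modeling code in the traceback above.
    def forward(self, input_ids, attention_mask=None):
        return (input_ids,)

def accepts_kwarg(fn, name: str) -> bool:
    """True if fn takes `name` explicitly or swallows it via **kwargs."""
    params = inspect.signature(fn).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD
               for p in params.values())

model = DummyNomicBertModel()
extra = {"return_dict": False} if accepts_kwarg(model.forward, "return_dict") else {}
out = model.forward([1, 2, 3], **extra)
```

Since the remote modeling code for nomic has changed across revisions, which forward() signature you actually get depends on the cached commit, which is one reason pinning an explicit revision matters here.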
Okay, I have shown above that it is possible to run infinity with nomic. Therefore I'll suggest the following:
Try running again with these commands. Also delete all of your preexisting huggingface_hub caches and set an explicit commit. nomic runs with custom modeling code, so be aware that not pinning a specific revision means you will execute whatever code they publish in any future version.
python3 -m venv venv
source ./venv/bin/activate
pip install infinity_emb[all]
pip install einops  # einops is required only by nomic's custom modeling code
infinity_emb --model-name-or-path nomic-ai/nomic-embed-text-v1.5 --revision some_specific_revision
#195 I plan to make it easier to "bake a model into a Dockerfile" - too many people have had issues with that, and it requires too much knowledge of compatible huggingface_hub / sentence_transformers versions, cache paths, etc. Perhaps give it a try once it's merged.
https://huggingface.co/nomic-ai/nomic-embed-text-v1.5/discussions/16#6616ca28401ac37f878f4701