pyannote-audio
Offline loading of pipeline ('NoneType' object has no attribute 'eval')
I am not sure if I am missing something. I followed the documentation on how to load a speaker diarization pipeline offline.
I followed this tutorial: https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/applying_a_pipeline.ipynb
But I used the config.yaml from https://huggingface.co/pyannote/speaker-diarization instead of the VAD config used in the offline-use section, since I want speaker diarization and not voice activity detection. (That is a bit confusing, since the top of that tutorial uses speaker diarization.)
I try to load it like this:
from pathlib import Path
from pyannote.audio import Pipeline as PyannotePipeline

PyannotePipeline.from_pretrained(str(Path(cache_pyannote_path / "speaker-diarization" / "pipeline_config.yaml").resolve()))
but i get the following error:
Traceback (most recent call last):
  File "E:\AI\xyz\xyz.py", line 566, in <module>
    main()
  File "E:\Python\Python310\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "E:\Python\Python310\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "E:\Python\Python310\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "E:\Python\Python310\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "E:\Python\Python310\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "E:\AI\xyz\xyz.py", line 373, in main
    diarization_pipeline = PyannotePipeline.from_pretrained(str(P...
  File "E:\Python\Python310\lib\site-packages\pyannote\audio\core\pipeline.py", line 126, in from_pretrained
    pipeline = Klass(**params)
  File "E:\Python\Python310\lib\site-packages\pyannote\audio\pipelines\speaker_diarization.py", line 125, in __init__
    model: Model = get_model(segmentation, use_auth_token=use_aut...
  File "E:\Python\Python310\lib\site-packages\pyannote\audio\pipelines\utils\getter.py", line 89, in get_model
    model.eval()
AttributeError: 'NoneType' object has no attribute 'eval'
Thank you for your issue. Give us a little time to review it.
PS. You might want to check the FAQ if you haven't done so already.
This is an automated reply, generated by FAQtory
- Visit hf.co/pyannote/speaker-diarization and hf.co/pyannote/segmentation
- Accept the user conditions of both models
- Pass your user token when loading the pretrained pipeline:

from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="TOKEN HERE")
The whole reason to do offline loading is that you do not need a token for every user of the software. I can't expect every user to create a Hugging Face account, etc.
- Edit your/path/to/pyannote/speaker-diarization/config.yaml:
pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: your/path/to/speechbrain/spkrec-ecapa-voxceleb  # folder; the path must contain the `speechbrain` keyword
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: your/path/to/pyannote/segmentation/[email protected]  # file
    segmentation_batch_size: 32

params:
  clustering:
    method: centroid
    min_cluster_size: 15
    threshold: 0.7153814381597874
  segmentation:
    min_duration_off: 0.5817029604921046
    threshold: 0.4442333667381752
- Edit pyannote/audio/pipelines/speaker_verification.py (version 2.1.1):
self.classifier_ = SpeechBrain_EncoderClassifier.from_hparams(
source=self.embedding,
savedir=self.embedding if Path(self.embedding).exists() else f"{CACHE_DIR}/speechbrain",
run_opts={"device": self.device},
use_auth_token=use_auth_token,
)
- Load the speaker diarization pipeline:
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("your/path/to/pyannote/speaker-diarization/config.yaml")
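To get a clearer failure than 'NoneType' object has no attribute 'eval' when a path in the config is wrong, the offline load can be wrapped with an up-front existence check. A minimal sketch, assuming pyannote.audio 2.1.x; `load_offline_pipeline` is a hypothetical wrapper, not part of pyannote:

```python
from pathlib import Path


def load_offline_pipeline(config_path):
    """Hypothetical wrapper: verify the config file exists before handing
    it to pyannote, so a bad path fails with a clear FileNotFoundError
    instead of the opaque 'NoneType' object has no attribute 'eval'."""
    config_path = Path(config_path)
    if not config_path.is_file():
        raise FileNotFoundError(f"pipeline config not found: {config_path}")
    # imported lazily so the path check above runs even in environments
    # where pyannote is not installed
    from pyannote.audio import Pipeline
    return Pipeline.from_pretrained(str(config_path))
```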
Thanks, but still no luck. I placed the spkrec-ecapa-voxceleb files beside the pipeline config and changed the config accordingly:
pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: speechbrain/spkrec-ecapa-voxceleb
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: pytorch_model.bin
    segmentation_batch_size: 32

params:
  clustering:
    method: centroid
    min_cluster_size: 15
    threshold: 0.7153814381597874
  segmentation:
    min_duration_off: 0.5817029604921046
    threshold: 0.4442333667381752
and changed the speaker_verification.py as you mentioned.
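One thing worth ruling out here: a relative value like `segmentation: pytorch_model.bin` may resolve against the current working directory rather than the config's folder, so absolute paths are safer. Below is a small sketch that naively extracts the embedding/segmentation values from a config like the one above and reports any that do not exist on disk; `check_offline_config` is a hypothetical helper, not part of pyannote, and it scans lines rather than using a YAML parser:

```python
from pathlib import Path


def check_offline_config(config_path):
    """Naively pull the embedding/segmentation values out of a pipeline
    config.yaml and report any that do not exist on disk (hypothetical
    helper). A Hugging Face repo id left in place is also reported,
    which is what you want for fully offline use."""
    problems = []
    for line in Path(config_path).read_text().splitlines():
        stripped = line.strip()
        for key in ("embedding:", "segmentation:"):
            if stripped.startswith(key):
                # drop the key, any trailing "# comment", and whitespace;
                # bare section headers like `segmentation:` yield "" and
                # are skipped
                value = stripped[len(key):].split("#")[0].strip()
                if value and not Path(value).exists():
                    problems.append(f"{key} {value} does not exist")
    return problems
```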
You need to download the files from speechbrain/spkrec-ecapa-voxceleb into the local speechbrain/spkrec-ecapa-voxceleb directory:
classifier.ckpt, embedding_model.ckpt, hyperparams.yaml, label_encoder.ckpt, mean_var_norm_emb.ckpt
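The presence of those checkpoints can be verified programmatically; if any are missing, SpeechBrain's loader can fail in ways that surface as the None model above. A sketch, where the file list is taken from the comment above and `missing_speechbrain_files` is a hypothetical helper:

```python
from pathlib import Path

# files expected in the local speechbrain/spkrec-ecapa-voxceleb folder
# (list taken from this thread)
REQUIRED = [
    "classifier.ckpt",
    "embedding_model.ckpt",
    "hyperparams.yaml",
    "label_encoder.ckpt",
    "mean_var_norm_emb.ckpt",
]


def missing_speechbrain_files(folder):
    """Return the names of required checkpoint files absent from `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED if not (folder / name).is_file()]
```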
Hello @hbredin ,
Would a PR containing the following point be accepted for offline model loading?
2. Edit `pyannote/audio/pipelines/speaker_verification.py` (version 2.1.1):

self.classifier_ = SpeechBrain_EncoderClassifier.from_hparams(
    source=self.embedding,
    savedir=self.embedding if Path(self.embedding).exists() else f"{CACHE_DIR}/speechbrain",
    run_opts={"device": self.device},
    use_auth_token=use_auth_token,
)

The modified line is savedir=self.embedding if Path(self.embedding).exists() else f"{CACHE_DIR}/speechbrain",
at https://github.com/pyannote/pyannote-audio/blob/11b56a137a578db9335efc00298f6ec1932e6317/pyannote/audio/pipelines/speaker_verification.py#L260
I'd gladly have a look at a PR facilitating the offline use of pyannote. Would be nice to also update the related part of the documentation.
I will take a look and submit a PR.
The tutorial doesn't work? I'm getting the same error when running it:
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained('pyannote/speaker-diarization', use_auth_token=True)
Error:
model.eval()
return model
AttributeError: 'NoneType' object has no attribute 'eval'
Hi @hbredin
I am also facing a similar issue with offline usage of a VAD config.
pyannote.audio version == 2.1.1
Config:

pipeline:
  name: pyannote.audio.pipelines.VoiceActivityDetection
  params:
    segmentation: pytorch_model.bin

params:
  min_duration_off: 0.09791355693027545
  min_duration_on: 0.05537587440407595
  offset: 0.4806866463041527
  onset: 0.8104268538848918

Code:
pipeline = Pipeline.from_pretrained("vad_config.yaml")
Error: AttributeError: 'NoneType' object has no attribute 'eval'
Hello. Are there plans to make offline use of the speaker-diarization-3.0 pipeline work? I tried the above suggestions to no avail.
pyannote models and pipelines have always been usable offline.
The documentation is just... missing.
- download the segmentation model
- download the embedding model
- copy the parameters of the pipeline configuration file
- edit it to point to the local models
Also, feel free to make a PR improving the documentation!
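The "copy the pipeline configuration file and edit it to point to the local models" step can also be scripted. A sketch using only the standard library (no YAML parser), assuming the simple config layout shown earlier in this thread; `point_config_at_local_models` is a hypothetical helper:

```python
from pathlib import Path


def point_config_at_local_models(config_path, replacements):
    """Rewrite `key: value` entries in a pipeline config so they point at
    local files, e.g. {"segmentation": "/models/pytorch_model.bin"}.
    Hypothetical helper; assumes the flat layout shown in this thread."""
    lines = Path(config_path).read_text().splitlines()
    out = []
    for line in lines:
        stripped = line.strip()
        indent = line[: len(line) - len(stripped)]
        for key, new_value in replacements.items():
            # only rewrite entries that already carry a value
            # (e.g. `segmentation: pyannote/segmentation`), not bare
            # section headers like `segmentation:`
            if stripped.startswith(f"{key}:") and stripped != f"{key}:":
                line = f"{indent}{key}: {new_value}"
        out.append(line)
    Path(config_path).write_text("\n".join(out) + "\n")
```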
Except that this is exactly what I tried some time ago, without success.
I would have to try it again to see whether it works now, or whether I was just missing some detail.
So yes, updated documentation would help a lot, if someone gets this working and updates it.
The issue is still there as of today: the model is not found for some reason and None is returned. Can someone look into this, please?
https://github.com/pyannote/pyannote-audio/pull/1682
I did not see this issue when proposing the respective PR: https://github.com/pyannote/pyannote-audio/pull/1682
Please check if the new tutorial addresses these issues: https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/community/offline_usage_speaker_diarization.ipynb
I solved this by accepting both user conditions, as mentioned in the README.
Following your method, I still get an error: huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name'