pyannote-database
`LABLoader` raises ValueError("`path` must contain the {uri} placeholder.") even when the placeholder is configured correctly
Part of my configuration:
Databases:
  # tell pyannote.database where to find AMI wav files.
  # {uri} is a placeholder for the session name (eg. ES2004c).
  # you might need to update this line to fit your own setup.
  AMI: amicorpus/{uri}/audio/{uri}.Mix-Headset.wav
  AMI-SDM: amicorpus/{uri}/audio/{uri}.Array1-01.wav
Protocols:
  AMI-SDM:
    SpeakerDiarization:
      only_words:
        train:
            uri: ../lists/train.meetings.txt
            annotation: ../only_words/rttms/train/{uri}.rttm
            annotated: ../uems/train/{uri}.uem
            lab: ../only_words/labs/train/{uri}.lab
        development:
            uri: ../lists/dev.meetings.txt
            annotation: ../only_words/rttms/dev/{uri}.rttm
            annotated: ../uems/dev/{uri}.uem
            lab: ../only_words/labs/dev/{uri}.lab
        test:
            uri: ../lists/test.meetings.txt
            annotation: ../only_words/rttms/test/{uri}.rttm
            annotated: ../uems/test/{uri}.uem
            lab: ../only_words/labs/test/{uri}.lab
When I comment out the two lines linked below, the program runs fine and file['lab'] returns an Annotation object, as expected:
https://github.com/pyannote/pyannote-database/blob/da5794b4bef2e95e93659817799aff6a770366a9/pyannote/database/loader.py#L260-L261
This sanity check does not seem to work as expected. Also, other loaders (e.g. RTTMLoader) do not have this check, although I would expect their logic to be similar.
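To make the expectation concrete, here is a minimal standard-library sketch of a placeholder check of this kind. It is not the code at the linked lines: `placeholders` and `check_uri_placeholder` are hypothetical names, and I have not confirmed which string `LABLoader` actually receives. It only shows that such a check accepts the template path from my configuration and would fail on a path in which {uri} has already been substituted.

```python
import string


def placeholders(path: str) -> set:
    """Return the named placeholders (e.g. {"uri"}) found in a path template."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text.
    return {name for _, name, _, _ in string.Formatter().parse(path) if name}


def check_uri_placeholder(path: str) -> None:
    """Hypothetical sanity check: raise unless the path contains {uri}."""
    if "uri" not in placeholders(path):
        raise ValueError("`path` must contain the {uri} placeholder.")


# Template path from the configuration above: the check passes silently.
check_uri_placeholder("../only_words/labs/train/{uri}.lab")

# A path where {uri} was already replaced by a session name would fail.
try:
    check_uri_placeholder("../only_words/labs/train/ES2004c.lab")
    already_resolved_fails = False
except ValueError:
    already_resolved_fails = True
```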
Another observation:
load_rttm() returns a dict of the form {uri: annotation}, while load_lab() simply returns an Annotation object. I wonder whether this is a deliberate design, as I see no reason for two similar functions to behave differently.
The difference between `rttm` and `lab` lies in the following:
The `lab` format has no `filename` field, so one `lab` file can contain annotations for only one audio file. The `uri` must therefore be inferred from the `lab` file name.
The `rttm` format has a `filename` field, so one `rttm` file can contain annotations for multiple audio files.
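The difference can be sketched with two small stand-alone parsers. These are not the pyannote.database implementations: the function names, the plain tuple return types, and the assumed line layouts (RTTM: `SPEAKER <uri> 1 <start> <duration> <NA> <NA> <speaker> ...`, LAB: `<start> <end> <label>`) are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


def parse_rttm(text: str) -> dict:
    """Group (start, end, speaker) segments by the uri stored on each line."""
    by_uri = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue  # skip blank or non-speaker lines
        uri, speaker = fields[1], fields[7]
        start, duration = float(fields[3]), float(fields[4])
        by_uri[uri].append((start, start + duration, speaker))
    return dict(by_uri)


def parse_lab(text: str) -> list:
    """Return (start, end, label) segments; the content carries no uri."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        segments.append((float(fields[0]), float(fields[1]), fields[2]))
    return segments


def uri_from_lab_path(path: str) -> str:
    """Infer the uri from the file name, since the lab content has none."""
    return Path(path).stem


rttm_text = (
    "SPEAKER ES2004c 1 0.5 1.5 <NA> <NA> A <NA> <NA>\n"
    "SPEAKER ES2004d 1 2.0 1.0 <NA> <NA> B <NA> <NA>\n"
)
lab_text = "0.5 2.0 speech\n"

rttm_result = parse_rttm(rttm_text)  # one entry per uri found in the file
lab_result = parse_lab(lab_text)     # a single list of segments
lab_uri = uri_from_lab_path("../only_words/labs/train/ES2004c.lab")
```

In this sketch the RTTM parser naturally returns a mapping keyed by uri, while the LAB parser can only return one set of segments, and the uri has to come from the path.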
Then what's the proper way to configure lab? Could you give me an example?