snapshot model mismatch
#5, opened by robtaylor-chipflow
When I run the test code below, I get this error:
RuntimeError: Error(s) in loading state_dict for Model:
size mismatch for classifier.weight: copying a param with shape torch.Size([16, 256]) from checkpoint, the shape in current model is torch.Size([11, 256]).
size mismatch for classifier.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([11]).
Test code:
from diarizen.pipelines.inference import DiariZenPipeline

# load pre-trained model
diar_pipeline = DiariZenPipeline.from_pretrained("BUT-FIT/diarizen-wavlm-large-s80-md")
# apply diarization pipeline
diar_results = diar_pipeline('audio.wav')
# print results
for turn, _, speaker in diar_results.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")
# load pre-trained model and save RTTM result
diar_pipeline = DiariZenPipeline.from_pretrained(
    "BUT-FIT/diarizen-wavlm-large-s80-md",
    rttm_out_dir='.'
)
# apply diarization pipeline
diar_results = diar_pipeline('audio.wav', sess_name='session_name')
Hi, could you try to re-pull the code?
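For anyone hitting the same error: the message says the checkpoint's classifier has 16 output rows while the locally built model expects 11, which would be consistent with the installed code and the downloaded snapshot being out of step. Below is a minimal sketch of how one might list such mismatches before loading. `find_shape_mismatches` is a hypothetical helper, not part of DiariZen or PyTorch, and the shapes are copied from the traceback rather than read from a real checkpoint.

```python
from typing import List, Mapping, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Mapping[str, Shape],
    model_shapes: Mapping[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every parameter
    that exists in both mappings but has a different shape."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Parameters missing from the model are skipped; only shape
        # disagreements are reported here.
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes copied from the error message in this thread. With PyTorch, the same
# mappings could be built as {k: tuple(v.shape) for k, v in state_dict.items()}.
checkpoint_shapes = {"classifier.weight": (16, 256), "classifier.bias": (16,)}
model_shapes = {"classifier.weight": (11, 256), "classifier.bias": (11,)}

for name, ckpt, model in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

If only the classifier entries show up, as here, the backbone weights agree and the difference is confined to the size of the output layer.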