Pikauba comments

Results: 9 comments of Pikauba

Issues with SVD regularization

I actually observed the same issue, @black-puppydog. I was wondering why the regularization term eye() was so big. I believe that mathematically it's a mistake to leave it how it...

No activation between residual blocks

The addition acts like a non-linearity already. It's a design choice; you can see the same in CycleGAN and pix2pixHD. Leaving out the activation actually allows the block to output negative values.
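To make the design choice concrete, here is a minimal sketch of a residual block with no activation after the skip addition, in the style of CycleGAN-like generators. The class name `ResidualBlock`, the channel count, and the use of zero padding are assumptions for illustration and are not taken from the repository under discussion.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Hypothetical residual block: conv -> norm -> ReLU -> conv -> norm, then skip add."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        # No activation after the addition, so the output is not clipped at zero.
        return x + self.body(x)


torch.manual_seed(0)
block = ResidualBlock(channels=4)
x = torch.randn(2, 4, 8, 8)
y = block(x)
```

If a ReLU were applied after the addition, every block would output only non-negative values, which is the restriction the comment refers to.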

Removing torchaudio.set_audio_backend("soundfile")

As described in your link, "In 2.2, the legacy global backend mechanism will be removed. Utility functions [get_audio_backend()](https://pytorch.org/audio/stable/generated/torchaudio.get_audio_backend.html#torchaudio.get_audio_backend) and [set_audio_backend()](https://pytorch.org/audio/stable/generated/torchaudio.set_audio_backend.html#torchaudio.set_audio_backend) become no-op." Considering that for now pyannote-audio has a requirement...
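One way to stay compatible on both sides of that change is to call the setter only when the installed module still exposes it. The helper below is a sketch under that assumption; `set_backend_if_supported` is a hypothetical name, not part of torchaudio or pyannote-audio, and it takes the module as an argument so it can be exercised without torchaudio installed.

```python
def set_backend_if_supported(audio_module, backend="soundfile"):
    """Call audio_module.set_audio_backend(backend) only if the function exists.

    Returns True if the setter was called, False if the module no longer has it.
    """
    setter = getattr(audio_module, "set_audio_backend", None)
    if setter is None:
        return False
    setter(backend)
    return True
```

With torchaudio installed, this would be used as `set_backend_if_supported(torchaudio)` after `import torchaudio`. On versions where the function is a no-op, the call is harmless.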

Better Diarization pipeline

Be careful about this so-called [Fix](https://github.com/Vaibhavs10/insanely-fast-whisper/blob/355275fe7c05578a1c948452ff063f60a9670cc6/src/insanely_fast_whisper/utils/diarize.py#L147C16-L147C30). As it is the same exact code used in speechbox (I wonder why the speechbox library is not directly integrated in this repo...

Extend to batch SVD

I adapted the PowerIteration method to fit with batch matrix and eigenvectors if you are interested.

```python
class PowerIteration(torch.autograd.Function):
    @staticmethod
    def forward(ctx, M, v, n_iter=19):
        ctx.n_iter = n_iter
        ctx.save_for_backward(M, v)
        ...
```
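For readers unfamiliar with the idea, batched power iteration can be sketched as follows. This is not the truncated `PowerIteration` class above: it is a plain function that relies on ordinary autograd rather than a custom backward pass, and the function name, tensor shapes, and Rayleigh-quotient step are assumptions. Only the default `n_iter=19` is taken from the snippet.

```python
import torch


def batch_power_iteration(M, v, n_iter=19):
    """Approximate the dominant eigenpair of each matrix in a batch.

    M: (B, N, N) batch of symmetric matrices with a dominant positive eigenvalue.
    v: (B, N, 1) batch of starting vectors.
    Returns the unit eigenvector estimates (B, N, 1) and eigenvalue estimates (B,).
    """
    for _ in range(n_iter):
        v = torch.bmm(M, v)                   # one multiplication per matrix in the batch
        v = v / v.norm(dim=1, keepdim=True)   # renormalize each vector to unit length
    # Rayleigh quotient v^T M v, valid because v has unit norm.
    eigvals = torch.bmm(v.transpose(1, 2), torch.bmm(M, v)).reshape(-1)
    return v, eigvals


# Two diagonal matrices whose dominant eigenvalues are 3 and 5.
M = torch.diag_embed(torch.tensor([[3.0, 1.0], [1.0, 5.0]]))
v0 = torch.ones(2, 2, 1)
vecs, vals = batch_power_iteration(M, v0)
```

Using `torch.bmm` keeps the whole batch in a single kernel call per step instead of looping over matrices in Python.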

Add per token confidence to each segment.

FIX: fix VAD for no voice activity less than min_duration_off

Is it applicable to time series data?

[BUG] semantic error in learning objective loss function