Justin Salamon
We should add this table to the README file in the section where we explain the different models available, ideally also including the performance drop with respect to the full...
The CREPE model uses a sampling rate of 16 kHz and takes 1024 samples as input, i.e. 64 milliseconds. You could pass every 1024 samples from your audio buffer into...
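As a rough sketch of the buffering step, here's one way to split an audio buffer into consecutive 1024-sample (64 ms at 16 kHz) frames; the `frame_audio` helper and non-overlapping hop are illustrative assumptions, not part of CREPE's API:

```python
import numpy as np

SR = 16000          # CREPE's expected sampling rate
FRAME_LEN = 1024    # 1024 samples = 64 ms at 16 kHz

def frame_audio(audio, frame_len=FRAME_LEN, hop=FRAME_LEN):
    """Split a 1-D buffer into consecutive frame_len-sample frames (no padding)."""
    n_frames = 1 + (len(audio) - frame_len) // hop
    return np.stack([audio[i * hop : i * hop + frame_len] for i in range(n_frames)])

# One second of audio -> 15 non-overlapping 64 ms frames (trailing partial frame dropped)
audio = np.zeros(SR, dtype=np.float32)
frames = frame_audio(audio)
# Each row could then be passed to crepe.predict(frame, SR) for a pitch estimate.
```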
@sannawag @0b01 CREPE already supports viterbi decoding: `crepe.predict(audio, sr, viterbi=True)`. For voicing activation we've found that a simple threshold on the returned voicing confidence values works well (where the confidence...
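A minimal sketch of the thresholding idea, assuming `confidence` is the per-frame confidence array returned by `crepe.predict`; the `voiced_mask` helper and the 0.5 threshold are illustrative choices, not part of the library:

```python
import numpy as np

def voiced_mask(confidence, threshold=0.5):
    """Mark frames as voiced where the confidence exceeds a threshold.

    The 0.5 threshold is only illustrative; tune it on your own data.
    """
    return np.asarray(confidence) >= threshold

# Simulated per-frame confidence values in [0, 1]
confidence = np.array([0.1, 0.7, 0.9, 0.3, 0.6])
mask = voiced_mask(confidence)  # -> [False, True, True, False, True]
```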
Thanks @sannawag, I'll have to give this a closer look, so it might take some time before I can give more feedback. As a general note, it's helpful to first...
@maxrmorrison @jongwook Delighted to see this moving forward! One quick question: librosa is a pretty heavy dependency. Could we carve out just the viterbi decoding code, so as...
It's hard to provide help without further details (OS version, Python environment, command you ran, full error message, etc.). From the little information provided, I can only venture a guess...
We don't provide the fusion layers as of now. This could be something we could add in the future potentially, but right now our focus is on adding support for...
Sounds reasonable. Perhaps I'd move the `hop_size` and `embedding_size` params into the `get_embedding` function call, as these are independent of the specific model file loaded?
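A hypothetical sketch of what that proposed signature might look like; the stub below is not the library's actual implementation, and the default values are made up for illustration:

```python
import numpy as np

# Hypothetical sketch of the proposed API: the model is loaded once elsewhere,
# while hop_size and embedding_size become per-call arguments to get_embedding.
def get_embedding(audio, sr, model=None, hop_size=0.1, embedding_size=512):
    """Illustrative stub: returns one dummy embedding per hop (not a real model)."""
    n_frames = int(len(audio) / (sr * hop_size))
    return np.zeros((n_frames, embedding_size), dtype=np.float32)

# One second of audio at a 0.1 s hop -> 10 embeddings of the requested size
emb = get_embedding(np.zeros(48000, dtype=np.float32), sr=48000)
```

The point of the design is that the same loaded model can serve calls with different hop and embedding sizes, since neither is tied to the model file.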
Do we still want to do this now that you can pass a model to `get_embedding`?
Yeah, it could still be handy. Let's keep things stable for now, and we can revisit this question down the line if/when we get more user feedback.