Kinyugo
Hi @mthrok & @carolineechen, I apologize for taking too long to get back to this. I have made an implementation of the MDCT algorithm in PyTorch [here](https://github.com/Kinyugo/torch_mdct). Kindly let me...
I have been using an implementation where I compute the pseudo-inverse of the filterbanks. It seems to work nicely and gives results similar to `lstsq`. An upside of the...
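To make the comparison concrete, here is a minimal NumPy sketch of the two approaches. It is not the implementation referred to above: the random matrix standing in for a mel filterbank, the sizes, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 64 linear-frequency bins, 16 mel bands, 10 frames.
n_freqs, n_mels, n_frames = 64, 16, 10

# Stand-in for a mel filterbank (assumption: a random non-negative matrix).
fbanks = rng.random((n_mels, n_freqs))
spec = rng.random((n_freqs, n_frames))  # "true" linear spectrogram
mel = fbanks @ spec                     # forward mel transform

# Approach 1: compute the pseudo-inverse once and reuse it for every input.
fbanks_pinv = np.linalg.pinv(fbanks)    # shape (n_freqs, n_mels)
spec_pinv = fbanks_pinv @ mel

# Approach 2: solve a least-squares problem for this particular input.
spec_lstsq, *_ = np.linalg.lstsq(fbanks, mel, rcond=None)

# Both return the minimum-norm solution, so they should agree closely.
max_diff = np.abs(spec_pinv - spec_lstsq).max()
print(f"max abs difference: {max_diff:.2e}")
```

In this sketch the pseudo-inverse depends only on the filterbank, so it can be computed once and applied to any number of spectrograms with a single matrix multiply, whereas the least-squares solve is repeated per input.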
Hi @nateanl, using `L-BFGS-B` will be a great solution. I am not sure if there is a specific implementation of it in PyTorch. It could be useful to run tests...
Thanks @Natooz. You are right, miditok v3.0 is much faster; I have seen about a 10x improvement. I'll consider tokenizing on the fly. That would use PyTorch dataset multiprocessing, right?...
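For reference, a rough sketch of what tokenizing on the fly could look like as a map-style dataset. Everything here is hypothetical: `OnTheFlyTokenDataset`, `tokenize_fn`, and `fake_tokenize` are illustrative names, and the placeholder tokenizer does not use the miditok API.

```python
from typing import Callable, List, Sequence


class OnTheFlyTokenDataset:
    """Map-style dataset that tokenizes a file only when it is requested."""

    def __init__(
        self,
        paths: Sequence[str],
        tokenize_fn: Callable[[str], List[int]],  # hypothetical tokenizer hook
        max_len: int,
    ) -> None:
        self.paths = list(paths)
        self.tokenize_fn = tokenize_fn
        self.max_len = max_len

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, idx: int) -> List[int]:
        # Tokenization happens here, at access time, not ahead of time.
        tokens = self.tokenize_fn(self.paths[idx])
        return tokens[: self.max_len]


def fake_tokenize(path: str) -> List[int]:
    # Placeholder standing in for a real MIDI tokenizer call.
    return [ord(c) for c in path]


dataset = OnTheFlyTokenDataset(["a.mid", "bb.mid"], fake_tokenize, max_len=4)
print(len(dataset), dataset[0])
```

Because the class only implements `__len__` and `__getitem__`, an object like this could be passed to `torch.utils.data.DataLoader` with `num_workers > 0`, in which case the `__getitem__` calls (and so the tokenization) would run in the worker processes.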
Interesting. If I get some time, I'll test this and share the results with you. That will help reduce complexity a lot. But I assume for BPE you will want...
Hello. Thanks for your apt replies. I have a few questions though.

> Yes indeed. Note that BPE training is also done by tokenizing MIDIs on the fly. The big interest...
> No I just meant that when training the tokenizer, the training data (MIDIs) is tokenized on the fly, there is no need to pre-tokenise it. :)

Nice. I had...
> About full samples: I am currently experimenting with a TSD tokenizer, trained with BPE on the MetaMIDI Dataset.

Nice! Looking forward to your findings.

> The resulting tokens wouldn't...
No problem. If I get some time too, maybe I can contribute that. I am currently held up as well.
Hello @Natooz. Thanks for looking into the issue. I am currently running tests for this. However, I do notice an `IndexError: list index out of range` error with some files...