asteroid
Cleanly separating model and dataset code from egs/
Refs #180
So I just wanted to work through the similarities and differences of the models, datasets and egs. The first thing I noticed is that for some egs, the model and dataset code lives in the eg folder, and for others it lives in the asteroid package. Is there a reason for this (other than historical :-)? If not, what do you think about moving all the model code and dataset code out of the egs?
I recall that we decided to put some generic, multi-purpose architectures (e.g. ConvTasNet, DPRNN) into the toolkit, but to keep more specific ones in the egs only (e.g. the WHAMR stacked Bi-LSTM TasNet). I think it is a reasonable compromise; otherwise we would have to add every new thing to the toolkit, and it would be hard to maintain.
For the datasets, if we have a unified way to preprocess the data, it should live inside Asteroid, I agree.
For models, it's not as clear IMO. For example, we have the audio-visual recipes, where the architectures are really specific, and SEGAN (merged soon) doesn't really share structure with other architectures, etc. But you're right that there is also a historical reason. At first, we aimed at providing (hopefully) well-abstracted building blocks rather than architectures. We later decided to include architectures as well, and I'm very happy with that now!
I'd be OK with moving some things back from egs, but probably not all. How to decide is the next question.