river
Toy datasets like stream-learn
- stream-learn has a nice `StreamGenerator` class that can generate different kinds of concept drift, see here. Also see their paper, which describes each type of drift. It would be nice to have the same. We reached out to them but they were not open to cooperation.
- I think we should have multiple classes rather than a single one. One class per type of concept drift sounds good. Making them composable would be ideal for two reasons:
  - We can add drift to any dataset
  - We can mix different kinds of drifts
- I suggest creating a new `datasets.drift` submodule
- It would be cool to see if we could remove some of our existing dataset classes in favor of these new ones, to avoid repetition (`FriedmanDrift`, `RandomRBFDrift`, `LEDDrift`, `ConceptDriftStream`)
- Ideally we would like to have the same documentation page with descriptive plots of each dataset
- In fact, we also need a nice page that describes each dataset, like what scikit-learn does here
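
To make the "one class per drift type, composable" idea concrete, here is a rough sketch of what it could look like, assuming datasets stay plain iterables of `(x, y)` pairs. The names `SuddenDrift`, `GradualDrift` and `constant_stream` are made up for illustration; they are not existing river or stream-learn APIs, and the sigmoid-shaped transition is just one possible choice for gradual drift.

```python
import itertools
import math
import random


def constant_stream(label):
    """Toy infinite stream (hypothetical helper): always yields the same label."""
    for i in itertools.count():
        yield {"x": i}, label


class SuddenDrift:
    """Hypothetical wrapper: emit `before` for `position` samples, then `after`."""

    def __init__(self, before, after, position):
        self.before = before
        self.after = after
        self.position = position

    def __iter__(self):
        # Take exactly `position` samples from the first concept...
        yield from itertools.islice(self.before, self.position)
        # ...then switch to the second concept for good.
        yield from self.after


class GradualDrift:
    """Hypothetical wrapper: mix two streams, shifting smoothly towards `after`.

    The probability of drawing from `after` follows a sigmoid centred on
    `position`; `width` controls how long the transition lasts.
    """

    def __init__(self, before, after, position, width, seed=None):
        self.before = before
        self.after = after
        self.position = position
        self.width = width
        self.seed = seed

    def __iter__(self):
        rng = random.Random(self.seed)
        before, after = iter(self.before), iter(self.after)
        for t in itertools.count():
            # sigmoid(4 * (t - position) / width), written with tanh so that
            # it cannot overflow far away from the drift point.
            p_after = 0.5 * (1.0 + math.tanh(2.0 * (t - self.position) / self.width))
            source = after if rng.random() < p_after else before
            try:
                yield next(source)
            except StopIteration:
                # Stop as soon as the selected underlying stream is exhausted.
                return


# Because each wrapper is itself an iterable of (x, y) pairs, wrappers nest:
# a gradual drift from concept 0 to 1, followed by a sudden drift to concept 2.
mixed = SuddenDrift(
    before=GradualDrift(constant_stream(0), constant_stream(1), position=50, width=20, seed=42),
    after=constant_stream(2),
    position=100,
)
labels = [y for _, y in itertools.islice(mixed, 200)]
```

Since the wrappers only rely on iteration, the same classes could in principle be applied to any existing dataset, which covers both reasons listed above.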