snape
Snape is a convenient artificial dataset generator that wraps sklearn's make_classification and make_regression and then adds in 'realism' features such as complex formatting, varying scales, categoric...
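As a rough illustration of the idea (not Snape's actual API), the sketch below wraps sklearn's make_classification and layers two of the 'realism' features named above on top: varying scales and awkward formatting. The function name make_messy_classification, the column names, and the currency-string formatting are all assumptions made for this example.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification


def make_messy_classification(n_samples=200, n_features=4, seed=0):
    """Hypothetical helper: clean sklearn data plus simple 'realism' noise."""
    rng = np.random.default_rng(seed)
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=2,
        n_redundant=0,
        random_state=seed,
    )
    df = pd.DataFrame(X, columns=[f"x{i}" for i in range(n_features)])

    # Varying scales: multiply each column by a random power of ten.
    scales = 10.0 ** rng.integers(0, 5, size=n_features)
    df = df * scales

    # Complex formatting: render one numeric column as a currency string.
    df["x0"] = df["x0"].map(lambda v: f"${v:,.2f}")

    df["y"] = y
    return df


if __name__ == "__main__":
    print(make_messy_classification(n_samples=5))
```

A consumer of this dataset has to parse x0 back into numbers and rescale the other columns before modelling, which is the sort of cleanup work that raw make_classification output never requires.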
Are categorical features dependent? For example, is feature1 dependent on feature2?
Create a script that uses something like Google Image Search to query images of N things, so that the resulting dataset can be used for computer vision classification tasks.
For categoricals like [jan, feb, mar...dec], we should shuffle before applying the binning. Binning this ordinal categorical onto the Gaussian column in its natural order is 'too easy.'
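One way to read this request is sketched below, assuming that "shuffle" means permuting the category labels before they are assigned to quantile bins of the Gaussian column. The function bin_to_categories and its parameters are hypothetical and are not taken from Snape's code.

```python
import bisect
import random
import statistics


def bin_to_categories(values, labels, shuffle=True, seed=0):
    """Map numeric values to category labels through equal-frequency bins.

    With shuffle=False, bin i gets labels[i], so the label order mirrors the
    numeric order. With shuffle=True, the labels are permuted first, which
    breaks that monotonic link.
    """
    labels = list(labels)
    if shuffle:
        random.Random(seed).shuffle(labels)
    # len(labels) - 1 cut points split the data into len(labels) bins.
    cuts = statistics.quantiles(values, n=len(labels))
    return [labels[bisect.bisect_right(cuts, v)] for v in values]


months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

rng = random.Random(42)
gaussian_column = [rng.gauss(0.0, 1.0) for _ in range(1200)]

ordered = bin_to_categories(gaussian_column, months, shuffle=False)
shuffled = bin_to_categories(gaussian_column, months, shuffle=True, seed=1)
```

Without the shuffle, the smallest values always land in jan and the largest in dec, so a model can recover the underlying number from the calendar order alone. After the shuffle, each month still marks a contiguous slice of the Gaussian column, but the slices no longer follow the calendar.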