
Classification: generating synthetic unbalanced data

Open Sandy4321 opened this issue 4 years ago • 0 comments

Could you share some links on generating synthetic unbalanced data for classification? Your code generates synthetic data for recommender systems: https://maciejkula.github.io/spotlight/datasets/synthetic.html

I mean data that is close to real data, with a mix of categorical and continuous feature values, in addition to the well-known simple generator https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html and its weights parameter, which the docs describe as:

weights : array-like of shape (n_classes,) or (n_classes - 1,), (default=None) The proportions of samples assigned to each class. If None, then classes are balanced. Note that if len(weights) == n_classes - 1, then the last class weight is automatically inferred. More than n_samples samples may be returned if the sum of weights exceeds 1.
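For reference, here is a minimal sketch of the simple case described in the quoted documentation: `make_classification` with a `weights` argument that sets the class proportions. The sample count, feature counts, and the 90/10 split are arbitrary values chosen for illustration, not anything from Spotlight.

```python
import numpy as np
from sklearn.datasets import make_classification

# Illustrative settings only: roughly 90% of samples in class 0.
# Only the first class weight is passed; per the docs, the last one is inferred.
X_simple, y_simple = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    n_redundant=2,
    n_classes=2,
    weights=[0.9],
    flip_y=0.0,        # disable random label flipping so the proportions hold
    random_state=0,
)

# Count how many samples landed in each class.
class_counts = np.bincount(y_simple)
print("samples per class:", class_counts)
```

All features produced this way are continuous, which is the limitation the question is about.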

Or could your code be used for binary classification with a mix of categorical and continuous feature values, where different groups of features have complicated dependencies?
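One possible workaround, sketched here as an assumption rather than a feature of Spotlight or scikit-learn, is to generate continuous features with `make_classification` and then bin some columns by quantile to turn them into categorical codes. The column split (4 continuous, 4 categorical), the number of categories, and the 95/5 class ratio are all hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification

# Unbalanced binary problem; all parameter values are illustrative.
X_raw, y_mixed = make_classification(
    n_samples=2000,
    n_features=8,
    n_informative=6,
    n_redundant=0,
    weights=[0.95, 0.05],
    flip_y=0.0,
    random_state=42,
)

# Keep the first 4 columns continuous. Columns are shuffled by default,
# so which of them are informative is not known in advance.
continuous_part = X_raw[:, :4]

# Bin the remaining 4 columns at their quartiles to get codes 0..3.
categorical_part = np.empty((X_raw.shape[0], 4), dtype=int)
for j in range(4):
    column = X_raw[:, 4 + j]
    edges = np.quantile(column, [0.25, 0.5, 0.75])
    categorical_part[:, j] = np.digitize(column, edges)

print("continuous block:", continuous_part.shape)
print("categorical codes:", np.unique(categorical_part))
print("minority fraction:", y_mixed.mean())
```

Because the binned columns come from the same generator as the continuous ones, they keep whatever relationship to the label the original columns had, but this is still far simpler than the dependencies found in real data.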

Sandy4321, Jun 09 '20 15:06