scikit-learn_bench
adding parameters for device context and patching of Scikit-Learn
This is an initial suggestion intended to start the discussion. It adds parameters for the device context and for Scikit-Learn patching, and changes some benchmarks to use the new parameters.
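A minimal sketch of how the two new parameters might be applied around a benchmark run. The helper names `parse_option`, `device_context` and `maybe_patch` are hypothetical and not taken from this PR. The sketch assumes the string values used in the config ("None", "True", "False") and assumes that `daal4py.oneapi.sycl_context` and `daal4py.sklearn.patch_sklearn` are available in the installed daal4py version; newer releases may expose this functionality differently.

```python
from contextlib import nullcontext


def parse_option(value):
    """Map the string flags used in the config onto Python values."""
    return {"None": None, "True": True, "False": False}.get(value, value)


def device_context(device):
    """Return a context manager for the requested device (hypothetical helper)."""
    device = parse_option(device)
    if device is None:
        # No device requested: run the benchmark without any context.
        return nullcontext()
    # Assumption: this daal4py version provides sycl_context.
    from daal4py.oneapi import sycl_context
    return sycl_context(device)


def maybe_patch(flag):
    """Patch Scikit-Learn only when the flag is "True" (hypothetical helper)."""
    if parse_option(flag) is not True:
        return False
    # Assumption: this daal4py version provides patch_sklearn.
    from daal4py.sklearn import patch_sklearn
    patch_sklearn()
    return True


# Example: no device context and no patching, i.e. stock Scikit-Learn.
with device_context("None"):
    patched = maybe_patch("False")
```

The daal4py imports are done lazily so that the unpatched, context-free combination still runs on a machine without daal4py installed.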
Example of the config file:
{
    "common": {
        "lib": ["sklearn"],
        "data-format": ["numpy"],
        "data-order": ["C"],
        "device": ["None", "host", "cpu", "gpu"],
        "patch_sklearn": ["False", "True"],
        "dtype": ["float64"]
    },
    "cases": [
        {
            "algorithm": "kmeans",
            "dataset": [
                {
                    "source": "synthetic",
                    "type": "kmeans",
                    "n_clusters": 10,
                    "n_features": 50,
                    "training": {
                        "n_samples": 1000000
                    }
                }
            ],
            "n-clusters": [10]
        },
        {
            "algorithm": "dbscan",
            "dataset": [
                {
                    "source": "synthetic",
                    "type": "blobs",
                    "n_clusters": 10,
                    "n_features": 50,
                    "training": {
                        "n_samples": 10000
                    }
                }
            ],
            "min-samples": [5000],
            "eps": [1]
        },
        {
            "algorithm": "linear",
            "dataset": [
                {
                    "source": "synthetic",
                    "type": "regression",
                    "n_features": 50,
                    "training": {
                        "n_samples": 1000000
                    }
                }
            ]
        },
        {
            "algorithm": "log_reg",
            "solver": ["lbfgs", "newton-cg"],
            "dataset": [
                {
                    "source": "synthetic",
                    "type": "classification",
                    "n_classes": 2,
                    "n_features": 100,
                    "training": {
                        "n_samples": 100000
                    }
                },
                {
                    "source": "synthetic",
                    "type": "classification",
                    "n_classes": 5,
                    "n_features": 100,
                    "training": {
                        "n_samples": 100000
                    }
                }
            ]
        }
    ]
}
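To illustrate what the list-valued parameters imply, here is a rough sketch that expands one case into individual runs by taking the Cartesian product of the "common" and per-case lists for each dataset. The function `expand_case` is hypothetical, and whether the benchmark runner expands the config in exactly this way is an assumption; the values are copied from the config above, with the dataset entries abbreviated.

```python
import itertools

# "common" section from the config above.
common = {
    "lib": ["sklearn"],
    "data-format": ["numpy"],
    "data-order": ["C"],
    "device": ["None", "host", "cpu", "gpu"],
    "patch_sklearn": ["False", "True"],
    "dtype": ["float64"],
}

# Two of the cases; dataset entries are shortened for brevity.
kmeans_case = {
    "algorithm": "kmeans",
    "dataset": [{"source": "synthetic", "type": "kmeans"}],
    "n-clusters": [10],
}
log_reg_case = {
    "algorithm": "log_reg",
    "solver": ["lbfgs", "newton-cg"],
    "dataset": [
        {"source": "synthetic", "type": "classification", "n_classes": 2},
        {"source": "synthetic", "type": "classification", "n_classes": 5},
    ],
}


def as_list(value):
    """Wrap scalar values so every parameter can be iterated."""
    return value if isinstance(value, list) else [value]


def expand_case(common, case):
    """Yield one parameter dict per dataset and parameter combination."""
    params = {**common, **case}  # case values override common ones
    datasets = params.pop("dataset")
    keys = sorted(params)
    value_lists = [as_list(params[key]) for key in keys]
    for dataset in datasets:
        for combo in itertools.product(*value_lists):
            yield {"dataset": dataset, **dict(zip(keys, combo))}


kmeans_runs = list(expand_case(common, kmeans_case))
log_reg_runs = list(expand_case(common, log_reg_case))
print(len(kmeans_runs), len(log_reg_runs))
```

Under this reading, the four "device" values and two "patch_sklearn" values multiply the number of runs per case by eight.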
Why isn't the config added to this repository?
@PetrovKP At the moment we have only one example config in the repository. I propose creating a directory with various configs. Please let me know what you think.
We wanted to store all the configs here. This is correct and convenient. I don’t know why we still haven’t moved the configs ...
Implemented in https://github.com/IntelPython/scikit-learn_bench/pull/133