adding parameters for device context and patching of Scikit-Learn

Open Alexander-Makaryev opened this issue 4 years ago • 4 comments

This is an initial suggestion intended to start the discussion. It adds parameters for the device context and for Scikit-Learn patching, and changes some benchmarks to use the new parameters.
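For readers following the discussion, here is a minimal sketch of how two such parameters might be wired into a benchmark script. It is not taken from the change itself. The option names `--device` and `--patch_sklearn` mirror the config keys used in this thread, and the helpers `parse_bool`, `build_parser`, `device_context`, and `maybe_patch` are hypothetical. The use of `daal4py.oneapi.sycl_context` and `daal4py.sklearn.patch_sklearn` is an assumption about which backend the parameters target; both are imported lazily so the sketch runs without daal4py when no device is requested and patching is off.

```python
import argparse
from contextlib import nullcontext


def parse_bool(text):
    """Interpret the string values "True"/"False" as booleans."""
    if text not in ("True", "False"):
        raise argparse.ArgumentTypeError(f"expected True or False, got {text!r}")
    return text == "True"


def build_parser():
    """Hypothetical parser exposing the two new benchmark parameters."""
    parser = argparse.ArgumentParser(description="benchmark options (sketch)")
    parser.add_argument("--device", default="None",
                        choices=("None", "host", "cpu", "gpu"))
    parser.add_argument("--patch_sklearn", type=parse_bool, default=False)
    return parser


def device_context(device):
    """Return a context manager for the requested device."""
    if device == "None":
        # No device requested: run the benchmark without any context.
        return nullcontext()
    # Assumption: the device context is daal4py's SYCL context.
    from daal4py.oneapi import sycl_context
    return sycl_context(device)


def maybe_patch(enabled):
    """Patch Scikit-Learn only when asked to."""
    if enabled:
        # Assumption: patching goes through daal4py.
        from daal4py.sklearn import patch_sklearn
        patch_sklearn()


args = build_parser().parse_args(["--device", "None", "--patch_sklearn", "False"])
maybe_patch(args.patch_sklearn)
with device_context(args.device):
    pass  # the timed fit/predict calls would go here
```

Keeping the imports inside the helpers means an unpatched, device-less run has no dependency on the optional backend.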

Alexander-Makaryev avatar May 17 '20 15:05 Alexander-Makaryev

Example of the config file:

{
    "common": {
        "lib": ["sklearn"],
        "data-format": ["numpy"],
        "data-order": ["C"],
        "device": ["None", "host", "cpu", "gpu"],
        "patch_sklearn": ["False", "True"],
        "dtype": ["float64"]
    },
    "cases": [
        {
            "algorithm": "kmeans",
            "dataset": [
                {
                    "source": "synthetic",
                    "type": "kmeans",
                    "n_clusters": 10,
                    "n_features": 50,
                    "training": {
                        "n_samples": 1000000
                    }
                }
            ],
            "n-clusters": [10]
        },
        {
            "algorithm": "dbscan",
            "dataset": [
                {
                    "source": "synthetic",
                    "type": "blobs",
                    "n_clusters": 10,
                    "n_features": 50,
                    "training": {
                        "n_samples": 10000
                    }
                }
            ],
            "min-samples": [5000],
            "eps": [1]
        },
        {
            "algorithm": "linear",
            "dataset": [
                {
                    "source": "synthetic",
                    "type": "regression",
                    "n_features": 50,
                    "training": {
                        "n_samples": 1000000
                    }
                }
            ]
        },
        {
            "algorithm": "log_reg",
            "solver": ["lbfgs", "newton-cg"],
            "dataset": [
                {
                    "source": "synthetic",
                    "type": "classification",
                    "n_classes": 2,
                    "n_features": 100,
                    "training": {
                        "n_samples": 100000
                    }
                },
                {
                    "source": "synthetic",
                    "type": "classification",
                    "n_classes": 5,
                    "n_features": 100,
                    "training": {
                        "n_samples": 100000
                    }
                }
            ]
        }
    ]
}
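As a rough illustration of what this config implies, the sketch below expands list-valued options into individual runs. It assumes the runner takes the Cartesian product of every list in "common" and in each case; that behavior is an assumption, not something stated in this thread. The helper `expand_case` is hypothetical, and the embedded config is a trimmed copy of the example above with the "dataset" entries left out.

```python
import itertools
import json

# Trimmed copy of the example config; datasets are omitted for brevity.
CONFIG_TEXT = """
{
    "common": {
        "lib": ["sklearn"],
        "device": ["None", "host", "cpu", "gpu"],
        "patch_sklearn": ["False", "True"],
        "dtype": ["float64"]
    },
    "cases": [
        {"algorithm": "kmeans", "n-clusters": [10]},
        {"algorithm": "log_reg", "solver": ["lbfgs", "newton-cg"]}
    ]
}
"""


def expand_case(common, case):
    """Yield one flat parameter dict per combination of list-valued options."""
    params = {**common, **case}
    params.pop("dataset", None)  # datasets would be handled separately
    algorithm = params.pop("algorithm")
    keys = sorted(params)
    value_lists = [
        params[key] if isinstance(params[key], list) else [params[key]]
        for key in keys
    ]
    for combo in itertools.product(*value_lists):
        yield {"algorithm": algorithm, **dict(zip(keys, combo))}


config = json.loads(CONFIG_TEXT)
runs = [
    run
    for case in config["cases"]
    for run in expand_case(config["common"], case)
]
print(len(runs))
```

Under that assumption, four devices times two patching modes gives 8 runs for kmeans, and the two solvers double that to 16 for log_reg, so the trimmed config yields 24 runs in total.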

Alexander-Makaryev avatar May 17 '20 23:05 Alexander-Makaryev

Why isn't this config added to the repository?

PetrovKP avatar May 18 '20 14:05 PetrovKP

@PetrovKP At the moment we have only one example in the repository. I propose creating a directory with various configs. Please let me know what you think.

Alexander-Makaryev avatar May 19 '20 21:05 Alexander-Makaryev

We wanted to store all the configs here. This is correct and convenient. I don’t know why we still haven’t moved the configs ...

PetrovKP avatar May 19 '20 22:05 PetrovKP

Implemented in https://github.com/IntelPython/scikit-learn_bench/pull/133

Alexsandruss avatar May 17 '23 12:05 Alexsandruss