spark-sklearn

Clarify RandomizedSearchCV documentation for sampling with replacement

Open • shaunswanson opened this issue 6 years ago • 2 comments

At this line, the documentation could state explicitly which parameters are sampled with replacement when at least one of them is given as a distribution:

https://github.com/databricks/spark-sklearn/blob/master/python/spark_sklearn/random_search.py#L27

Are all parameters (those given as distributions and those given as lists of values) sampled with replacement in this scenario, or only the parameters given as distributions?

If it is all of them, this documentation would be better stated as:

sampling with replacement is used for all parameters.
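
To make the question concrete, here is a minimal standard-library sketch of the two regimes the docstring describes. It is not the spark-sklearn code: `sample_params` is a hypothetical helper, distributions are modelled as callables that take a random generator (a stand-in for a scipy distribution's `rvs`), and the mixed branch reflects my understanding of how scikit-learn's `ParameterSampler` behaves, which is an assumption as far as spark-sklearn is concerned.

```python
import itertools
import random


def sample_params(param_space, n_iter, seed=0):
    """Hypothetical sketch of list-only vs. mixed parameter sampling."""
    rng = random.Random(seed)
    names = sorted(param_space)
    all_lists = all(not callable(param_space[name]) for name in names)

    if all_lists:
        # Every parameter is a list: enumerate the grid and draw distinct
        # points, i.e. sampling WITHOUT replacement.
        grid = [
            dict(zip(names, combo))
            for combo in itertools.product(*(param_space[name] for name in names))
        ]
        # Simplification: cap n_iter at the grid size instead of raising.
        return rng.sample(grid, min(n_iter, len(grid)))

    # At least one distribution: every parameter is drawn independently on
    # each iteration, so list-valued parameters can repeat as well
    # (sampling WITH replacement for all parameters).
    samples = []
    for _ in range(n_iter):
        point = {}
        for name in names:
            spec = param_space[name]
            point[name] = spec(rng) if callable(spec) else rng.choice(spec)
        samples.append(point)
    return samples


# Lists only: the four grid points come back without duplicates.
only_lists = sample_params({"a": [1, 2], "b": [10, 20]}, n_iter=4)

# One distribution present: "a" is redrawn each iteration and repeats.
mixed = sample_params({"a": [1, 2], "c": lambda rng: rng.random()}, n_iter=50)
```

If this matches what spark-sklearn does, then the list-valued parameter `a` is sampled with replacement as soon as any other parameter is a distribution, which is what the proposed wording says.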

shaunswanson • Feb 05 '19 21:02

Let me know if you'd like a pull request. :)

shaunswanson • Feb 05 '19 21:02

Sure, yeah, I'm sure this could use some clarification. Feel free to open a PR.

srowen • Feb 05 '19 21:02