Thomas J. Fan

Results 255 comments of Thomas J. Fan

scikit-learn has a similar `_assert_all_finite` where it does a sum check first (for speed). If the sum check fails then it does a `xp.any(xp.isnan(...))` to see if there was an...
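As a rough illustration of the sum-first idea described in the comment above, here is a minimal NumPy sketch. The function name `assert_all_finite_sketch` is hypothetical and this is not scikit-learn's actual `_assert_all_finite`; it only assumes that a finite sum implies every element is finite, and falls back to elementwise checks otherwise.

```python
import numpy as np


def assert_all_finite_sketch(X):
    """Hypothetical helper: cheap sum check first, elementwise checks only on failure."""
    X = np.asarray(X)
    if X.dtype.kind not in "fc":
        # Integer and boolean arrays cannot hold NaN or infinity.
        return
    # Fast path: one reduction and no temporary array of the same shape as X.
    with np.errstate(over="ignore", invalid="ignore"):
        if np.isfinite(np.sum(X)):
            return
    # Slow path: the sum was not finite, so find out why.
    if np.isnan(X).any():
        raise ValueError("Input contains NaN.")
    if np.isinf(X).any():
        raise ValueError("Input contains infinity.")
    # Otherwise the sum overflowed although every element is finite: accept.
```

The fast path is what keeps valid input cheap: the boolean temporaries from `np.isnan` and `np.isinf` are only allocated when the sum already looks suspicious.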

In scikit-learn, we benchmarked our inf/nan check for 2d arrays and found that the memory pressure was much lower for valid input with no nans and infs. For SciPy, if...

I was unclear about Case 2. In https://github.com/scikit-learn/scikit-learn/issues/23394#issuecomment-1132997413, I was thinking of normalizing the variance itself with the "Coefficient of variation", which is a feature request https://github.com/scikit-learn/scikit-learn/issues/15118 or the "quartile...

With version 1.2, as long as estimators output dataframes, the dtypes are preserved. Here is a small snippet inspired by the opening comment: ```python from sklearn.compose import ColumnTransformer from sklearn.preprocessing...

I do not think the caller (`_check_targets`) can raise a more informative error message because the caller cannot distinguish between different `"unknown"`s without checking the target too. For example,...

I think the only issue I would have with `type_of_target` is that it calls `np.any(y != y.astype(int))` and `np.unique`, which adds some overhead. https://github.com/scikit-learn/scikit-learn/blob/fd237278e895b42abe8d8d09105cbb82dc2cbba7/sklearn/utils/multiclass.py#L285-L290
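To make the source of that overhead concrete, the two operations named above can be isolated as below. This is only a sketch: the helper names `has_non_integer_floats` and `n_distinct` are hypothetical and are not part of `sklearn.utils.multiclass`.

```python
import numpy as np


def has_non_integer_floats(y):
    """Hypothetical helper: True if a float array holds values such as 0.5."""
    y = np.asarray(y)
    # y.astype(int) allocates a full copy and the comparison allocates a
    # boolean array, so this costs two extra passes over y.
    return y.dtype.kind == "f" and bool(np.any(y != y.astype(int)))


def n_distinct(y):
    """Hypothetical helper: number of distinct values in y."""
    # np.unique sorts a flattened copy of its input, which is O(n log n).
    return np.unique(np.asarray(y)).shape[0]
```

Both helpers touch every element of `y`, which is why the cost becomes noticeable when the check runs on every metric call.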

I think it would not be possible for joblib to tell "where the configuration are stored and should be passed into the spawn processes" unless the library somehow registers the...

The less magical way to do it would be to require it to be passed to the `Parallel` object. However, that would require the user to constantly pass the getter...
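The explicit approach can be sketched with the standard library alone. Everything below is a stand-in, not joblib's or scikit-learn's API: `get_config`, `set_config`, and `with_config` are hypothetical names, and a thread-local dictionary plays the role of a library's global configuration. The point is only that workers do not see the caller's configuration unless it is captured and passed along explicitly.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a library's thread-local global configuration (an assumption).
_local = threading.local()


def get_config():
    """Return a copy of the configuration seen by the current thread."""
    return dict(getattr(_local, "config", {"assume_finite": False}))


def set_config(**kwargs):
    """Update the configuration of the current thread only."""
    config = get_config()
    config.update(kwargs)
    _local.config = config


def with_config(func, config):
    """Wrap func so the worker restores the caller's configuration first."""
    def wrapper(*args, **kwargs):
        set_config(**config)
        return func(*args, **kwargs)
    return wrapper


def read_flag(_):
    return get_config()["assume_finite"]


set_config(assume_finite=True)  # set in the calling thread only
with ThreadPoolExecutor(max_workers=2) as pool:
    # Workers start with the default configuration.
    without = list(pool.map(read_flag, range(2)))
    # Capturing the configuration in the caller and passing it explicitly.
    with_cfg = list(pool.map(with_config(read_flag, get_config()), range(2)))
```

The cost the comment points at is visible here: every call site has to remember to capture the configuration and wrap the task.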

> I would be interested to know your opinion on the different use case that are expected for the global options. For the moment, the use case for scikit-learn is...

@zhiruiwang This should work for your use case: ```python from functools import partial import mlcrate as mlc pool = mlc.SuperPool() def f(x, y): return x**(2 / y) res = pool.map(partial(f,...