Sonoff Zigbee 3.0 USB Dongle-P won't setup ZHA integration
The problem
The Sonoff dongle won't set up the ZHA integration at all. After I press "create a network" I get a "Loading next step for zha" window, followed by an "error" message that doesn't specify what the error is actually about.
What version of Home Assistant Core has the issue?
2024.8.3
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
ZHA
Link to integration documentation on our website
https://www.home-assistant.io/integrations/zha/
Diagnostics information
I enabled ZHA logging; here is the relevant part of the log after rebooting the system and trying to set up the integration.
home-assistant_2024-08-27T19-35-25.304Z.log
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response
While LDA was a bit hard to port over to dask, PCA worked perfectly out of the box! (which I think is a pretty big win for the array API)
The other preprocessing/metrics/tools also worked out of the box, I think (at least judging by the tests).
Re: performance
Testing out of core
For dask arrays generated by dask-ml with parameters
n_samples=100_000
n_classes=2
n_informative=5
on a Gitpod machine with 2 cores and 8 GB RAM, I get
- 47s for 10,000,000 × 20 LDA (1.49 GB)
- 6m 3.5s for 50,000,000 × 20 LDA (7.45 GB)
- 14m 51s for 100,000,000 × 20 LDA (14.90 GB)
which I think is pretty decent scaling
Distributed computation
For
n_samples=20,000,000
n_classes=2
n_informative=5
and chunksize=100,000
I am measuring a 45s runtime with 4 workers (2 CPU, 8 GB RAM), and a 4min 29s runtime with a single worker (note: there was a bit of spilling on this one).
cc @betatim @ogrisel
FYI, array-api-compat fixes are ongoing here https://github.com/data-apis/array-api-compat/pull/110
Since array-api-compat 1.5.1 came out and CI is green here, I'm going to mark this PR as ready for review.
The only other change I'm planning right now is splitting out the LDA changes, since that requires a patch to dask itself.
The correct way to handle laziness is also something that might be good to think about. (It might be good to loop in more scikit-learn devs about this).
Thanks @lithomas1! Could you please add an entry referencing this PR in doc/whats_new/v1.5.rst? There is a dedicated section for Array API changes. This should fix the "Check Changelog" CI failure.
The only other change I'm planning right now is splitting out the LDA changes, since that requires a patch to dask itself.
Sounds good, however in the last CI run with dask I see 3 failures:
FAILED ::__init__.py::__init__.py::test_common.py::test_array_api_compliance[r2_score-check_array_api_regression_metric-dask.array-None-None] - ValueError: Boolean index assignment in Dask expects equally shaped arrays.
FAILED ::__init__.py::test_common.py::test_estimators[LinearDiscriminantAnalysis()-check_array_api_input(array_namespace=dask.array,dtype_name=None,device=None,skip_methods={})] - ValueError: An error occurred while calling the wrap_func_shape_as_first_ar...
FAILED ::__init__.py::__init__.py::test_multiclass.py::test_is_multilabel_array_api_compliance[dask.array-None-None] - AssertionError: is_multilabel(dask.array<array, shape=(10, 10), dtype=int64...
only one of them seems directly related to the lazy shape needed for LinearDiscriminantAnalysis.
The failures for check_array_api_regression_metric with r2_score and is_multilabel seem to be of a different nature, but I would need to investigate with a debugger to confirm.
The correct way to handle laziness is also something that might be good to think about. (It might be good to loop in more scikit-learn devs about this).
I am personally fine with some kind of "trigger compute now if lazy" helper in array-api-compat as a short-term solution. It would let us experiment with lazy Array API libraries in scikit-learn and see empirically how much we have to rely on it, which would in turn inform the decision, for a future version of the spec itself, to handle laziness explicitly with an official API.
Here is the info about the r2_score metric. It's indeed a problem with lazy shapes:
> /Users/ogrisel/code/scikit-learn/sklearn/metrics/_regression.py(884)_assemble_r2_explained_variance()
-> output_scores[valid_score] = 1 - (
(Pdb) l
879 # (note: even if denominator is zero, thus avoiding NaN scores)
880 output_scores = xp.ones([n_outputs], device=device, dtype=dtype)
881 # Non-zero Numerator and Non-zero Denominator: use the formula
882 valid_score = nonzero_denominator & nonzero_numerator
883
884 -> output_scores[valid_score] = 1 - (
885 numerator[valid_score] / denominator[valid_score]
886 )
887
888 # Non-zero Numerator and Zero Denominator:
889 # arbitrary set to 0.0 to avoid -inf scores
(Pdb) p valid_score
dask.array<and_, shape=(2,), dtype=bool, chunksize=(2,), chunktype=numpy.ndarray>
(Pdb) p numerator
dask.array<sum-aggregate, shape=(2,), dtype=float64, chunksize=(2,), chunktype=numpy.ndarray>
(Pdb) p denominator
dask.array<sum-aggregate, shape=(2,), dtype=float64, chunksize=(2,), chunktype=numpy.ndarray>
(Pdb) p numerator[valid_score]
dask.array<getitem, shape=(nan,), dtype=float64, chunksize=(nan,), chunktype=numpy.ndarray>
(Pdb) p 1 - (numerator[valid_score] / denominator[valid_score])
dask.array<sub, shape=(nan,), dtype=float64, chunksize=(nan,), chunktype=numpy.ndarray>
(Pdb) p output_scores
dask.array<ones_like, shape=(2,), dtype=float64, chunksize=(2,), chunktype=numpy.ndarray>
This causes:
E ValueError: Boolean index assignment in Dask expects equally shaped arrays.
E Example: da1[da2] = da3 where da1.shape == (4,), da2.shape == (4,) and da3.shape == (4,).
E Alternatively, you can use the extended API that supports indexing with tuples.
E Example: da1[(da2,)] = da3.
I wonder what should be done in this case. Trigger an explicit computation prior to assignment? In this particular case, dask might be able to perform some shape inference from the lazy info alone to see that the masked boolean assignment has compatible shapes, even if they are all nans: they are compatible nans.
For information, the tuple indexing syntax suggested in the error message would not work for this particular case:
diff --git a/sklearn/metrics/_regression.py b/sklearn/metrics/_regression.py
index ad5c76810f..f0931045ab 100644
--- a/sklearn/metrics/_regression.py
+++ b/sklearn/metrics/_regression.py
@@ -881,7 +881,7 @@ def _assemble_r2_explained_variance(
# Non-zero Numerator and Non-zero Denominator: use the formula
valid_score = nonzero_denominator & nonzero_numerator
- output_scores[valid_score] = 1 - (
+ output_scores[(valid_score,)] = 1 - (
numerator[valid_score] / denominator[valid_score]
)
would lead to:
E ValueError: Arrays chunk sizes are unknown: (nan,)
E
E A possible solution: https://docs.dask.org/en/latest/array-chunks.html#unknown-chunks
E Summary: to compute chunks sizes, use
E
E x.compute_chunk_sizes() # for Dask Array `x`
E ddf.to_dask_array(lengths=True) # for Dask DataFrame `ddf`
The is_multilabel failure is indeed also related to nan-valued lazy shape items in the expression labels.shape[0] < 3, because labels' shape is value-dependent: labels = xp.unique_values(y):
https://github.com/scikit-learn/scikit-learn/blob/0e19a4822ff49951d2a7606444a1a6085c32b56b/sklearn/utils/multiclass.py#L193-L198
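A quick demonstration of that value-dependent lazy shape (illustrative only):
import dask.array as da

labels = da.unique(da.from_array([0, 1, 1, 0], chunks=2))
print(labels.shape)     # (nan,): the number of unique values is unknown before compute
print(labels.shape[0] < 3)  # comparing against nan is meaningless, hence the failure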
So to conclude, we have identified only two types of problems related to lazy evaluation in dask:
- decisions in the code based on lazy shape item values;
- masked assignments with the same lazy boolean array on each side.
Shall we create two issues upstream and link to this PR?
In the meantime we could either:
- introduce a temporary, dask-specific eager-evaluation private helper in scikit-learn and use it sparingly whenever we need value-dependent branching in our code (see the sketch after this list);
- introduce such a public helper in array-api-compat and use it in scikit-learn to unblock this PR.
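Something as simple as this could do, as a very rough sketch (the name and the duck-typed check are assumptions, not an agreed design):
def _eager_if_lazy(array):
    """Trigger computation on a lazy (e.g. dask) array so that
    value-dependent branching can proceed; pass eager arrays through."""
    if hasattr(array, "compute"):  # duck-typing for dask-like lazy arrays
        return array.compute()
    return array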
Sorry, I intentionally pulled out all my changes related to triggering compute on lazy shapes last night (and went to bed before leaving a comment ...), since I realized both of them depend on changing dask behavior/fixing bugs in dask.
For LDA, there's an annoying race condition; if I were able to patch it in dask, we would be able to use the shape helper I mentioned above.
r2_score/is_multilabel hit the issue where, when dask does boolean assignment with arrays that have a lazy shape, it doesn't automatically call compute on the operands and errors out with a shape mismatch. So, we would have to insert shape helper calls in arbitrary places, which would go against the idea of being able to write code for both lazy and eager array implementations without having to worry about laziness.
In this case, we might have to go for the heavier approach that @betatim mentioned of wrapping the dask array class itself and calling compute on lazy arrays in __setitem__ (and other methods if necessary) before handing the __setitem__ off to dask.
(I haven't worked out the details yet, but my first thought is that we could return wrapped arrays from the array_namespace call. A rough sketch follows.)
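Back-of-the-envelope sketch of that idea (the class is hypothetical; nothing like it is implemented in this PR):
import dask.array as da

class _ComputingDaskArray:
    """Proxy that materializes lazy operands before delegating
    __setitem__ to the underlying dask array."""

    def __init__(self, array):
        self._array = array

    def __setitem__(self, key, value):
        # Compute lazy boolean masks/values so dask sees concrete shapes.
        if isinstance(key, da.Array):
            key = key.compute()
        if isinstance(value, da.Array):
            value = value.compute()
        self._array[key] = value

    def __getattr__(self, name):
        # Delegate everything else to the wrapped dask array.
        return getattr(self._array, name)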
IMO, right now, having these two things not work is not a big deal for dask support in scikit-learn (especially since the estimators/methods in progress right now, e.g. #27800 for RidgeCV and #28106 for mean_tweedie_deviance, look like they should work out of the box without any changes, at first glance).
So, @betatim @ogrisel, what do you think about just adding skips where necessary for any methods/estimators that dask doesn't support out of the box, and collecting the data in a doc/GitHub issue somewhere, so that we can have a more complete picture of what's needed to support laziness?
(I'm happy to transcribe our conversations on this topic (and planned solutions) so that we don't lose them.)
I am fine with this if you find a concise way to add the necessary skips with links to one or more dedicated issues in either scikit-learn or upstream issue trackers.
This should be ready for another pass now.
The primary mechanism for skips is the _array_api_skips dictionary in sklearn/utils/_array_api.py. I also added a skip_dask fixture/decorator for individual tests that check array API compliance.
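As a rough illustration of the shape such a registry could take (the entries below are made up for illustration, not the PR's actual contents):
_array_api_skips = {
    "dask.array": {
        "LinearDiscriminantAnalysis": "lazy shapes, see dask/dask#11042",
        "r2_score": "boolean mask assignment needs known chunk sizes",
    },
}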
BTW, I launched the tests on hosts with cuda and MPS devices to confirm that this PR did not introduce any regression for those.
Here is another pass of feedback, possibly final on my end.
Would it be possible to open issues upstream, either in the dask or array-api-compat repos or in the Array API spec repo, to link to from the comments of the skipped tests?
Yep, I'm tracking failures in https://github.com/scikit-learn/scikit-learn/issues/26724#issuecomment-2038674794
Failures are listed as checkboxes (with accompanying upstream issues if necessary).
Thanks for the reviews. I've updated the PR following the feedback.
Are you comfortable with merging partial dask support with skipped tests? Or do you prefer to implement a more complete support first?
Unsure. I started looking at the diff just now. I think if some estimators work and some don't, that is one thing. But if only some of an estimator works, that is a bit weird. From a user's perspective it would be weird, I think, because you can't really do much if only part of an estimator works.
Can we use array-api-compat to add slogdet to dask arrays? Then we'd avoid the problem that an estimator semi works.
For the two other skips (LinearDiscriminantAnalysis and r2_score), what is the error/exception a user will see when they try to use them with a dask array? Should we have a dedicated exception message for this case or is "random error", like you get when you use an estimator that doesn't yet have Array API support, good enough?
Should we have a dedicated exception message for this case or is "random error", like you get when you use an estimator that doesn't yet have Array API support, good enough?
This is going to be hard (impossible) to do in a library-agnostic way. Note that we already get low-level error messages with dask arrays on main:
>>> import sklearn
>>> sklearn.set_config(array_api_dispatch=True)
>>> import dask.array as da
>>> X = da.random.normal(size=(1000, 10))
>>> y = da.random.randint(0, 2, size=1000)
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> LinearDiscriminantAnalysis().fit(X, y)
Traceback (most recent call last):
File ~/miniforge3/envs/dev/lib/python3.11/site-packages/dask/backends.py:140 in wrapper
return func(*args, **kwargs)
File ~/miniforge3/envs/dev/lib/python3.11/site-packages/dask/array/wrap.py:63 in wrap_func_shape_as_first_arg
parsed = _parse_wrap_args(func, args, kwargs, shape)
File ~/miniforge3/envs/dev/lib/python3.11/site-packages/dask/array/wrap.py:33 in _parse_wrap_args
chunks = normalize_chunks(chunks, shape, dtype=dtype)
File ~/miniforge3/envs/dev/lib/python3.11/site-packages/dask/array/core.py:3095 in normalize_chunks
chunks = auto_chunks(chunks, shape, limit, dtype, previous_chunks)
File ~/miniforge3/envs/dev/lib/python3.11/site-packages/dask/array/core.py:3210 in auto_chunks
raise ValueError(
ValueError: Can not perform automatic rechunking with unknown (nan) chunk sizes.
A possible solution: https://docs.dask.org/en/latest/array-chunks.html#unknown-chunks
Summary: to compute chunks sizes, use
x.compute_chunk_sizes() # for Dask Array `x`
ddf.to_dask_array(lengths=True) # for Dask DataFrame `ddf`
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
Cell In[11], line 1
LinearDiscriminantAnalysis().fit(X, y)
File ~/code/scikit-learn/sklearn/base.py:1474 in wrapper
return fit_method(estimator, *args, **kwargs)
File ~/code/scikit-learn/sklearn/discriminant_analysis.py:638 in fit
self._solve_svd(X, y)
File ~/code/scikit-learn/sklearn/discriminant_analysis.py:510 in _solve_svd
self.means_ = _class_means(X, y)
File ~/code/scikit-learn/sklearn/discriminant_analysis.py:116 in _class_means
means = xp.zeros((classes.shape[0], X.shape[1]), device=device(X), dtype=X.dtype)
File ~/miniforge3/envs/dev/lib/python3.11/site-packages/array_api_compat/_internal.py:28 in wrapped_f
return f(*args, xp=xp, **kwargs)
File ~/miniforge3/envs/dev/lib/python3.11/site-packages/array_api_compat/common/_aliases.py:134 in zeros
return xp.zeros(shape, dtype=dtype, **kwargs)
File ~/miniforge3/envs/dev/lib/python3.11/site-packages/dask/backends.py:142 in wrapper
raise type(e)(
ValueError: An error occurred while calling the wrap_func_shape_as_first_arg method registered to the numpy backend.
Original Message: Can not perform automatic rechunking with unknown (nan) chunk sizes.
A possible solution: https://docs.dask.org/en/latest/array-chunks.html#unknown-chunks
Summary: to compute chunks sizes, use
x.compute_chunk_sizes() # for Dask Array `x`
ddf.to_dask_array(lengths=True) # for Dask DataFrame `ddf`
So from this point of view, this PR is not making things worse than they currently are. At least we document that dask support is not fully functional yet.
Can we use array-api-compat to add slogdet to dask arrays? Then we'd avoid the problem that an estimator semi works.
This is not possible yet.
Dask doesn't expose a way to compute a determinant yet, and there doesn't seem to be an easy workaround, since dask's LU/QR decompositions don't provide enough information to do it. I've opened an issue on the dask side, but there doesn't seem to be a response yet: https://github.com/dask/dask/issues/11042
I agree that it's not ideal that dask can't use score, but this is something that I think is reasonable to have users work around for now, e.g. by using _estimator_with_converted_arrays to convert the estimator from dask to numpy arrays (similar to how we transfer arrays from GPU to CPU with cupy).
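For example, something like this should work: a rough sketch patterned on the cupy-to-numpy recipe, assuming the dask support from this PR is in place:
import dask.array as da
import sklearn
from sklearn.decomposition import PCA
from sklearn.utils._array_api import _estimator_with_converted_arrays

sklearn.set_config(array_api_dispatch=True)

X = da.random.normal(size=(1000, 10), chunks=(100, 10))
pca = PCA(n_components=2).fit(X)

# Materialize the fitted dask attributes into numpy arrays so that score
# (which needs slogdet) can run on numpy inputs.
to_numpy = lambda a: a.compute() if isinstance(a, da.Array) else a
pca_numpy = _estimator_with_converted_arrays(pca, to_numpy)
print(pca_numpy.score(X.compute()))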
I'm not sure how important score is though. There seem to be some usages of it, but curiously there seem to be no examples in scikit-learn itself on using score.
https://github.com/search?q=repo%3Ascikit-learn%2Fscikit-learn+pca.score&type=code
My feeling overall is that adding dask support leaves us with something that is only half finished for the (very few) existing estimators. My first reaction to that is to wait a bit until Array API support in dask has progressed a bit more. My assumption is that there is no need to be in a hurry here.
I'm not sure how important score is though. There seem to be some usages of it, but curiously there seem to be no examples in scikit-learn itself on using score.
The score method is implicitly used by tools such as cross_val_score or GridSearchCV. However, it is true that it is very rarely used for PCA alone in practice. It's mostly used for supervised learning pipelines.
I agree that it's not ideal that dask can't use score, but this is something that I think is reasonable to have users work around for now, e.g. by using _estimator_with_converted_arrays to convert the estimator from dask to numpy arrays (similar to how we transfer arrays from GPU to CPU with cupy).
It might also be the case that the estimator's score method can itself convert arrays to numpy if the namespace does not provide the necessary xp.linalg.slogdet function. For truncated PCA, we can expect the fit method (and maybe get_precision) to be slow and worth running on an accelerated namespace, while the final xp.linalg.slogdet call should not be a performance-critical operation (and the result is a scalar).
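Something like this hypothetical helper (illustrative only, not actual scikit-learn code):
import numpy as np

def _logdet_with_fallback(xp, precision):
    """Use xp.linalg.slogdet when the namespace provides it; otherwise pull
    the (small) precision matrix back to numpy for the final scalar."""
    linalg = getattr(xp, "linalg", None)
    if linalg is not None and hasattr(linalg, "slogdet"):
        return linalg.slogdet(precision)[1]
    # np.asarray materializes dask arrays; cheap since precision is small.
    return np.linalg.slogdet(np.asarray(precision))[1]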
Let's convert to draft for the time being then.
Thanks for updating.
I'm back to working on this again now that my summer is over.
From what I remember, I think I had ridge regression working locally (after wrangling some issues with data-dependent output shapes, which I think I worked around with where).
I'm going to try to fix some more issues upstream in dask, like the lack of sorting, which seems to account for a majority of the failures, and also linalg.
Re: dynamic shapes: Have you tried running sklearn functions through jax with JIT on? I believe dynamic shapes aren't an issue in jax without the JIT.
Have you tried running sklearn functions through jax with JIT on?
No I have not tried.
For the record, #30340 was recently merged to main, so this might unblock this PR: we can now leverage the dask compat features implemented in array-api-extra.
Thanks, I'll rebase this PR this week and let you know how it goes.
Made a bit of progress. Some notes I made along the way are:
TODOs
- [ ] Upstream fill_diagonal to array-api-extra (5 uses in sklearn)
- [ ] Upstream isin/in1d to array-api-extra (used in encoders)
Changes
- had to disable the shape check in _average for lazy backends
Blockers
- [ ] setdiff1d in array-api-extra only works on eager arrays
  - Probably the next major thing to work on (this blocks LabelEncoder from working)
  - The remaining 13 metrics tests also depend on this
Progress
- Fixed PCA (was broken due to fill_diagonal)
- Fixed the input validators (e.g. is_multilabel)
- nunique in array-api-extra is very helpful so far (see the sketch after this list)
  - Able to do int(nunique(...)) to get n_classes
    - This forces a compute (but shouldn't be so bad if we're only doing this on y)
  - Going to try to apply this to LDA later on
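A tiny illustration of that pattern (assumes array-api-extra and a dask-backed y; the int() call is what forces the compute):
import array_api_extra as xpx
import dask.array as da

y = da.from_array([0, 1, 1, 0, 2], chunks=2)
n_classes = int(xpx.nunique(y))  # materializes the lazy count
print(n_classes)  # 3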
Thoughts
- nunique works, but it'd be better if it were part of a unique function
  - Right now, we have to do the unique computation twice: once to find the unique values and once for nunique
- If we want to support lazy backends, there needs to be some thought as to how error checking should work
  - Right now, I disable the checks, since for dask we don't know the shape until we compute. The alternative would be to defer them so they raise at compute time (see the sketch below).
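One way to defer a check to compute time, as a sketch (hypothetical helper using dask's map_blocks; not part of this PR):
import dask.array as da
import numpy as np

def _check_non_negative_lazy(x):
    """Embed the validation in the task graph so it raises at compute time
    instead of requiring an eager check up front."""
    def _check(block):
        if (block < 0).any():
            raise ValueError("Negative values are not allowed.")
        return block
    return x.map_blocks(_check, dtype=x.dtype)

x = _check_non_negative_lazy(da.from_array(np.array([1.0, 2.0, -3.0]), chunks=2))
x.sum().compute()  # raises ValueError only now, when the graph runs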
I haven't touched this PR in a while (besides rebasing). I've opened an issue at array-api-extra (https://github.com/data-apis/array-api-extra/issues/268) for the missing functions, and I think they're in favor of adding them.
Hopefully I can make the PRs into array-api-extra and/or finish debugging LDA next week.