tsfresh
RayDistributor for using Ray to distribute the calculations in tsfresh
Ray is becoming popular for building distributed applications, and a RayDistributor makes it easy to fit into tsfresh.
Distributed tsfresh on Ray
This repo adds a new RayDistributor for tsfresh, which uses Ray to distribute the calculations. RayDistributor is a subclass of IterableDistributorBaseClass in tsfresh and follows the development instructions at https://tsfresh.readthedocs.io/en/latest/text/tsfresh_on_a_cluster.html.
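To illustrate the pattern such a distributor follows (split the work into chunks, hand each chunk to a worker, flatten the results), here is a minimal standard-library sketch. It is not the RayDistributor code from this PR: the class `SketchDistributor` and the helper `scaled_sum` are hypothetical, the method names `partition`, `distribute`, `map_reduce` and `close` are only assumed to mirror the hooks of tsfresh's iterable distributor base class, and a thread pool stands in for Ray workers so that the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


class SketchDistributor:
    """Hypothetical illustration of a chunk-based distributor; not part of tsfresh."""

    def __init__(self, n_workers):
        self.n_workers = n_workers
        # Assumption: a thread pool plays the role that Ray workers would play.
        self._pool = ThreadPoolExecutor(max_workers=n_workers)

    def partition(self, data, chunk_size):
        """Yield consecutive lists of at most chunk_size items from data."""
        iterator = iter(data)
        while True:
            chunk = list(islice(iterator, chunk_size))
            if not chunk:
                return
            yield chunk

    def distribute(self, func, partitioned_chunks, kwargs):
        """Apply func to every item of every chunk, one chunk per worker task."""

        def apply_to_chunk(chunk):
            return [func(item, **kwargs) for item in chunk]

        # pool.map returns the per-chunk results in input order.
        return list(self._pool.map(apply_to_chunk, partitioned_chunks))

    def map_reduce(self, map_function, data, function_kwargs=None, chunk_size=2):
        """Partition the data, distribute the chunks, and flatten the results."""
        kwargs = function_kwargs or {}
        chunks = self.partition(data, chunk_size)
        per_chunk = self.distribute(map_function, chunks, kwargs)
        return [result for chunk in per_chunk for result in chunk]

    def close(self):
        """Release the worker pool."""
        self._pool.shutdown()


def scaled_sum(values, factor=1):
    """Hypothetical stand-in for a per-series feature calculation."""
    return factor * sum(values)


distributor = SketchDistributor(n_workers=4)
results = distributor.map_reduce(
    scaled_sum, [[1, 2], [3, 4], [5, 6]], function_kwargs={"factor": 10}
)
distributor.close()
print(results)  # [30, 70, 110]
```

In a Ray-backed variant, the body of `distribute` is the part that would presumably submit each chunk to a Ray worker instead of a local thread, while partitioning and flattening stay the same.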
Quick Start
Use RayDistributor the same way as MultiprocessingDistributor, ClusterDaskDistributor or LocalDaskDistributor.
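from tsfresh import extract_features
from tsfresh.utilities.distribution import RayDistributor

# Create a distributor that spreads the calculations over 4 workers
distributor = RayDistributor(n_workers=4)
# ...
extracted_features = extract_features(..., distributor=distributor)
# ...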
Code change summary
- Add the RayDistributor definition in tsfresh.utilities.distribution
- Add RayDistributor documentation in docs/text/tsfresh_on_a_cluster.rst
- Update pre-commit-config to enable future development
- Update test-requirements.txt for the unit tests
- Manually ran the unit tests and the documentation generation locally
@nils-braun It would be great to have some suggestions on how to avoid changing the pre-commit-config version, and on the PR itself.