O(n²) Memory Requirements for _stamp_parallel()

Open JaKasb opened this issue 5 years ago • 0 comments

The function _stamp_parallel() computes the distance-profiles via a pool of workers. Each worker returns a list of distance-profiles.

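The reduction of the distance-matrix to the matrix-profile occurs only after all workers have returned their lists-of-distance-profiles, so every distance-profile is held in memory at the same time. The memory requirement for all distance-profiles is O( (sampling_rate*n) * n ), i.e. O(n²) for a fixed sampling_rate.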

reduce(map(mass_distance_profile_parallel(indices)))

https://github.com/target/matrixprofile-ts/blob/207aa94cfff143dee824c46c5bad7444a806088c/matrixprofile/matrixProfile.py#L128

https://github.com/target/matrixprofile-ts/blob/bcba7dc741d254435a72a60d6e014bf563de7d5a/matrixprofile/distanceProfile.py#L74

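To reduce the memory requirements, each worker should reduce its list-of-distance-profiles to an intermediate matrix-profile (the element-wise minimum over its distance-profiles) before returning. Afterwards the pool-spawner reduces the intermediate matrix-profiles to the final matrix-profile. Each worker then returns O(n) data instead of O(chunk_size * n).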

reduce(map(reduce(mass_distance_profile_parallel(indices))))
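A minimal sketch of the proposed two-level reduction, not the library's actual code. `toy_distance_profile`, `merge`, `worker_partial_profile` and `matrix_profile` are hypothetical names, and the toy distance is a plain Euclidean distance standing in for `mass_distance_profile_parallel`. The `map_fn` parameter defaults to the built-in `map`; the assumption is that a pool's `map` method could be passed instead.

```python
import math
from functools import partial, reduce


def toy_distance_profile(ts, m, idx):
    """Hypothetical stand-in for a MASS distance profile.

    Returns the Euclidean distance from the subsequence starting at idx
    to every subsequence of length m, with a small exclusion zone
    around idx set to infinity to skip trivial matches.
    """
    query = ts[idx:idx + m]
    n_sub = len(ts) - m + 1
    profile = []
    for j in range(n_sub):
        if abs(j - idx) <= m // 2:
            profile.append(math.inf)
        else:
            window = ts[j:j + m]
            profile.append(math.sqrt(sum((q - w) ** 2 for q, w in zip(query, window))))
    return profile


def merge(a, b):
    """Element-wise minimum of two (distances, indices) profiles."""
    dist, idx = [], []
    for da, ia, db, ib in zip(a[0], a[1], b[0], b[1]):
        if db < da:
            dist.append(db)
            idx.append(ib)
        else:
            dist.append(da)
            idx.append(ia)
    return dist, idx


def worker_partial_profile(ts, m, chunk):
    """Inner reduce: runs inside a worker and keeps only one running profile."""
    n_sub = len(ts) - m + 1
    best = ([math.inf] * n_sub, [-1] * n_sub)
    for idx in chunk:
        dp = toy_distance_profile(ts, m, idx)
        # Fold the distance profile into the running minimum right away,
        # so it can be discarded before the next one is computed.
        best = merge(best, (dp, [idx] * n_sub))
    return best


def matrix_profile(ts, m, indices, n_chunks=2, map_fn=map):
    """Outer reduce: merges one intermediate profile per chunk of indices."""
    chunks = [indices[i::n_chunks] for i in range(n_chunks)]
    partials = map_fn(partial(worker_partial_profile, ts, m), chunks)
    return reduce(merge, partials)
```

Calling `matrix_profile(ts, m, indices, n_chunks=4)` returns a `(distances, indices)` pair. Only `n_chunks` intermediate profiles ever reach the caller, so the memory held at the reduction step grows with the number of chunks rather than with the number of sampled indices.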

JaKasb, Oct 17 '19 16:10