aeon icon indicating copy to clipboard operation
aeon copied to clipboard

[BUG] Fixes sbd distances for multivariate case

Open tanishy7777 opened this issue 7 months ago • 23 comments

Reference Issues/PRs

Fixes #2674 Closes #2715 As mentioned in https://github.com/aeon-toolkit/aeon/issues/2661#issuecomment-2745266428 and https://github.com/aeon-toolkit/aeon/issues/2661#issuecomment-2779108840

fixes sbd distance for multivariate case

What does this implement/fix? Explain your changes.

Currently the implementation of sbd distance in aeon handles the multivariate case slightly differently than the official version(https://github.com/TheDatumOrg/kshape-python) and tslearn's implementation as well.

Also, I set the unequal_length=False (need input from reviewers regarding this as SBD distance(official & tslearn version) supports unequal length time series but doesn't support unequal channels, but the tests in aeon in test_distances check for unequal channels also, more details in https://github.com/aeon-toolkit/aeon/issues/2661#issuecomment-2799941937 ).

Differences: sbd_distance: finds the distance for each channel independently and then takes its average normalized_cc(from tslearn): finds the correlations for each of the channels and then sums them and finds the max from this sum and then normalizes them using the norm of the the entire multivariate series.

Does your contribution introduce a new dependency? If yes, which one?

Any other comments?

PR checklist

For all contributions
  • [x] I've added myself to the list of contributors. Alternatively, you can use the @all-contributors bot to do this for you after the PR has been merged.
  • [x] The PR title starts with either [ENH], [MNT], [DOC], [BUG], [REF], [DEP] or [GOV] indicating whether the PR topic is related to enhancement, maintenance, documentation, bugs, refactoring, deprecation or governance.
For new estimators and functions
  • [ ] I've added the estimator/function to the online API documentation.
  • [ ] (OPTIONAL) I've added myself as a __maintainer__ at the top of relevant files and want to be contacted regarding its maintenance. Unmaintained files may be removed. This is for the full file, and you should not add yourself if you are just making minor changes or do not want to help maintain its contents.
For developers with write access
  • [ ] (OPTIONAL) I've updated aeon's CODEOWNERS to receive notifications about future changes to these files.

tanishy7777 avatar Apr 13 '25 13:04 tanishy7777

Thank you for contributing to aeon

I have added the following labels to this PR based on the title: [ $\color{#d73a4a}{\textsf{bug}}$ ]. I have added the following labels to this PR based on the changes made: [ $\color{#5209C9}{\textsf{distances}}$, $\color{#2C2F20}{\textsf{testing}}$ ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • [ ] Run pre-commit checks for all files
  • [ ] Run mypy typecheck tests
  • [ ] Run all pytest tests and configurations
  • [ ] Run all notebook example tests
  • [ ] Run numba-disabled codecov tests
  • [ ] Stop automatic pre-commit fixes (always disabled for drafts)
  • [ ] Disable numba cache loading
  • [ ] Push an empty commit to re-run CI checks

aeon-actions-bot[bot] avatar Apr 13 '25 13:04 aeon-actions-bot[bot]

Equivalence between original implementation, tslearn's implementation and fixed implementation for multivariate and uni variate case

Image

tanishy7777 avatar Apr 13 '25 15:04 tanishy7777

Reply to https://github.com/aeon-toolkit/aeon/issues/2661#issuecomment-2799962773

Distance measures dependent on the _dtw_distance function like ddtw_distance, dtw_gi_distance and also adtw, erp, edr distances ignore the other channels without warning.

For example in the dtw_distance, for a given time step all the channels are passed to the _univariate_squared_distance which takes min of the channels and computes the distance between them.

Image

Image Image

Image where

@njit(cache=True, fastmath=True)
def _univariate_euclidean_distance(x: np.ndarray, y: np.ndarray) -> float:
    return np.sqrt(_univariate_squared_distance(x, y))

tanishy7777 avatar Apr 13 '25 16:04 tanishy7777

Even though currently we are simply ignoring the other channels in all the distance measures(this is the case for atleast the ones mentioned above). Would it be better to add warnings for this?

tanishy7777 avatar Apr 13 '25 16:04 tanishy7777

@SebastianSchmidl

I have added the code for benchmarks here https://github.com/tanishy7777/aeon/tree/main/benchmarks

Important note for comparison, I removed numba acceleration for the aeon functions, i.e. sbd_distance_fix and sbd_distance, as the main idea of these benchmarks is to compare the algorithm and the original implementation, and tslearn implementation dont use numba.

The benchmarks test the various cases for the following functions:

  • sbd_distance (from aeon main branch)
  • sbd_distance_fix (from this PR)
  • sbd_original (from the original implementation)
  • sbd_tslearn (tslearn's implementation)

I tested the following cases:

  1. For univariate time series, varying the no of timesteps.
  2. For multivariate time series (2 channels), varying the no of timesteps.
  3. For a constant no of timesteps=1000, varying the no of channels.

The red line denotes the sbd_distance_fix function, it performs better than the original implementation in all cases. (If I am not making any obvious mistake in the benchmark code itself that is causing this)

Figure_1_small Figure_2_small Figure_3_small

tanishy7777 avatar Apr 14 '25 11:04 tanishy7777

As the implementations will be used with the Numba JIT turned on in most cases, I would be more interested in the runtimes and scaling behavior of the aeon implementations with Numba. You do not need to include sbd_original and sbd_tslearn.

Could you also include the implementation idea in #2715 in the benchmark?

Especially the different usage of with objmode(cc="float64[:, :]"): with Numba could make a big difference in certain configurations.

SebastianSchmidl avatar Apr 15 '25 16:04 SebastianSchmidl

As the implementations will be used with the Numba JIT turned on in most cases, I would be more interested in the runtimes and scaling behavior of the aeon implementations with Numba. You do not need to include sbd_original and sbd_tslearn.

Could you also include the implementation idea in #2715 in the benchmark?

Especially the different usage of with objmode(cc="float64[:, :]"): with Numba could make a big difference in certain configurations.

Thanks for the feedback, will enable numba and compare with #2715. Couple of assignments and project presentations lined up this weekend in college, but will add it soon.

tanishy7777 avatar Apr 17 '25 14:04 tanishy7777

@SebastianSchmidl These are the new benchmarks sorry for the delay, I have semester exams going on. I have updated the code for them https://github.com/tanishy7777/aeon/tree/main/benchmarks

Changes:

  • Comparing numba enabled benchmarks.
  • Also comparing with the implementation in #2715

Observations:

  • For univariate case: sbd_distance_fix is better than the original but worse than sbd_distance_pr2715 at higher time series lengths.
  • For 2 channels case: sbd_distance_fix is the best for all time series lengths among all the implementations.
  • For fixed time series length(1000 timepoints): sbd_distance_fix is equal to the sbd_distance across different no of channels and is also better than sbd_distance_pr2715.

Please let me know if this looks good? Or any changes are needed. Thank you! Figure_1_final Figure_2_final Figure_3_final

tanishy7777 avatar Apr 24 '25 17:04 tanishy7777

Sorry to ask again for more, but could you scale the experiment further (larger time series)? According to the plots, all runs finish in under 1s.

Interesting that #2715 is faster than your proposal for one channel, but scales worse with the number of channels 🤔

SebastianSchmidl avatar Apr 25 '25 12:04 SebastianSchmidl

Sorry to ask again for more, but could you scale the experiment further (larger time series)? According to the plots, all runs finish in under 1s.

Interesting that #2715 is faster than your proposal for two channels, but scales worse with the number of channels 🤔

Yea sure will add them. Actually for 2 channel my fix works better than #2715 For univariate case only my fix is worse than #2715 but it is still better than the current implementation in aeon.

Also another thing, for the comparision across channels I am using series length of 1000. That is why the aeon implementation is so close. But if you look at the 2nd plot, it diverges for 2 channels case for higher timer series. I am guessing this but I think the same will happen(divergent behaviour) if I keep lets say time series length =15000. Will have to check this though.

tanishy7777 avatar Apr 25 '25 12:04 tanishy7777

Yes, I meant the single channel experiment.

SebastianSchmidl avatar Apr 25 '25 12:04 SebastianSchmidl

Yes, I meant the single channel experiment.

Btw edited my message (added a para) in case you missed it

tanishy7777 avatar Apr 25 '25 12:04 tanishy7777

It is not obvious how to best benchmark this because we have two scaling dimensions (length of time series and number of channels), which are not completely independent. You could create a full matrix of tests (all combinations of lengths and channels in the search range), but this could take a long time and is harder to plot. Maybe, we just need to increase the number of experiments, e.g. 5 length-scaling experiments with fixed #channels = [1,5,10,100,1000] and 5 channel-scaling experiments with fixed lengths = [1000,5000,10000,50000,100000] or similar ... still a lot to test

SebastianSchmidl avatar Apr 25 '25 12:04 SebastianSchmidl

It is not obvious how to best benchmark this because we have two scaling dimensions (length of time series and number of channels), which are not completely independent. You could create a full matrix of tests (all combinations of lengths and channels in the search range), but this could take a long time and is harder to plot. Maybe, we just need to increase the number of experiments, e.g. 5 length-scaling experiments with fixed #channels = [1,5,10,100,1000] and 5 channel-scaling experiments with fixed lengths = [1000,5000,10000,50000,100000] or similar ... still a lot to test

Yep that makes sense, will do this

tanishy7777 avatar Apr 25 '25 13:04 tanishy7777

Just catching up really. Is the main issue here scalability or? @TonyBagnall @chrisholder possibly some issues re: the unequal stuff and results comparison

The main issue was that the definition of the SBD distance included a multivariate version that was not implemented in aeon. aeon just computed the SBD for each channel independently and then combined the results (we do this for other distances as well).

However, multiple PRs (this one and #2715) were opened to address this issue. I suggested comparing their performance and integrating the better performing implementation into aeon.

SebastianSchmidl avatar May 06 '25 16:05 SebastianSchmidl

There seems to be no replies for the other PR. I suggest just closing it unless the contents itself are better.

MatthewMiddlehurst avatar May 08 '25 12:05 MatthewMiddlehurst

This version also seems to perform better than #2715, but let's wait for the new numbers from @tanishy7777. Once, your mentioned issues are addressed, I would also merge this one in favor of #2715 (I already added a dependency to the issue description for this).

SebastianSchmidl avatar May 08 '25 13:05 SebastianSchmidl

catching up a bit, I agree its important to have consistency with the other implementations and this solution seems good to me

TonyBagnall avatar May 18 '25 11:05 TonyBagnall

I have closed the other PR, is this still active @tanishy7777?

MatthewMiddlehurst avatar May 23 '25 19:05 MatthewMiddlehurst

I have closed the other PR, is this still active @tanishy7777?

Yea will work on this. I think only the benchmarks are remaining

tanishy7777 avatar May 25 '25 07:05 tanishy7777

I still have a few comments above. More concerned with correctness than speed currently but if you can benchmark it thats good.

MatthewMiddlehurst avatar May 25 '25 17:05 MatthewMiddlehurst

I still have a few comments above. More concerned with correctness that speed currently but if you can benchmark it thats good.

Got it, will address the comments first.

tanishy7777 avatar May 27 '25 17:05 tanishy7777

I agree. The other implementation is stale, and the preliminary performance looked good. So, you can go ahead with resolving the issues and checking the correctness of your implementation.

SebastianSchmidl avatar May 28 '25 09:05 SebastianSchmidl