aeon
[ENH] Implement Proximity Forest 2.0 classifier using aeon distances
Reference Issues/PRs
Closes #428. Incorporates changes suggested in #1874 (maybe we can close PR #1876).
What does this implement/fix? Explain your changes.
- Created a private distance file to parameterise DTW and ADTW distances.
- Created a function to calculate the first_order_derivative of a time series.
- Implemented the ProximityTree2 and ProximityForest2 classes, as per the paper.
- Added the classes to the API.
- Wrote unit tests.
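For readers unfamiliar with the derivative transform, here is a minimal sketch of a first-order derivative of a univariate series. It assumes the slope-based definition from Keogh and Pazzani's derivative DTW and copies the end values from their neighbours. The function in this PR may differ in name, signature, edge handling, or array type, so treat this as an illustration only.

```python
def first_order_derivative(series):
    """Slope-based first-order derivative of a 1D series (list of floats).

    Assumed definition (Keogh and Pazzani's derivative DTW), for interior points:
        d[i] = ((x[i] - x[i-1]) + (x[i+1] - x[i-1]) / 2) / 2
    The two end points copy the derivative of their nearest neighbour,
    so the output has the same length as the input.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 points to estimate a derivative")
    deriv = [0.0] * n
    for i in range(1, n - 1):
        left_slope = series[i] - series[i - 1]
        wide_slope = (series[i + 1] - series[i - 1]) / 2
        deriv[i] = (left_slope + wide_slope) / 2
    deriv[0] = deriv[1]
    deriv[-1] = deriv[-2]
    return deriv


# A straight line with slope 1 has a constant derivative of 1.
line_derivative = first_order_derivative([0.0, 1.0, 2.0, 3.0])
```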
To do:
- [ ] Improve the computational efficiency by implementing the Early Abandoning and Pruning (EAP) algorithm for elastic distance measures.
Thank you for contributing to aeon
I have added the following labels to this PR based on the title: [ $\color{#FEF1BE}{\textsf{enhancement}}$ ]. I have added the following labels to this PR based on the changes made: [ $\color{#BCAE15}{\textsf{classification}}$ ]. Feel free to change these if they do not properly represent the PR.
The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.
If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.
Don't hesitate to ask questions on the aeon Slack channel if you have any.
PR CI actions
These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.
- [ ] Run `pre-commit` checks for all files
- [ ] Run all `pytest` tests and configurations
- [ ] Run all notebook example tests
- [ ] Run numba-disabled `codecov` tests
- [ ] Stop automatic `pre-commit` fixes (always disabled for drafts)
- [ ] Push an empty commit to re-run CI checks
have we compared results vs published?
Not yet, I need to put it on the cluster. I have not got results for the original yet either.
Actually, the algorithm isn't complete yet. We still need to work on computational efficiency, particularly integrating the EAP technique. I'll resume the work in a couple of days.
Want to run this soon. This is not blocking that, but some component seems to be non-deterministic.
What were the remaining things to do here? In terms of testing, it seems that something non-deterministic is happening, which, given the algorithm, shouldn't be the case if I remember correctly.
Need to compare to past results, and yes, something is failing the non-deterministic test.
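As an illustration of what such a determinism check looks for, here is a small sketch. `ToyRandomSplitter` and `is_deterministic` are hypothetical names invented for this example and are not aeon code. The sketch only shows one typical pattern, seeding a private generator from `random_state` inside `fit`, and is not a diagnosis of the failure in this PR.

```python
import random


class ToyRandomSplitter:
    """Hypothetical stand-in for a randomised estimator (not aeon code)."""

    def __init__(self, random_state=None):
        self.random_state = random_state

    def fit(self, X, y):
        # A private generator seeded from random_state keeps fit reproducible.
        # Drawing from the module-level random.* functions here would not be.
        rng = random.Random(self.random_state)
        self.exemplar_ = rng.randrange(len(X))
        self.label_ = y[self.exemplar_]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def is_deterministic(make_estimator, X, y, n_runs=3):
    """Fit fresh estimators repeatedly and compare their predictions."""
    first = make_estimator().fit(X, y).predict(X)
    return all(
        make_estimator().fit(X, y).predict(X) == first
        for _ in range(n_runs - 1)
    )


X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 1, 0, 1]
seeded_ok = is_deterministic(lambda: ToyRandomSplitter(random_state=0), X, y)
```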
Hi, we still need to integrate EAP for the distance measures to complete the algorithm. In the current implementation, I've used private distance functions, which use pruning to stop a distance calculation once it exceeds a threshold.
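To make the threshold idea concrete, here is a minimal pure-Python sketch of Amerced DTW (ADTW) with a simple cutoff. It is not the EAP algorithm and not the private functions in this PR. The function name, the `omega` and `cutoff` parameters, and the squared point cost are assumptions made for the example. The computation is abandoned as soon as every cell in a row of the cost matrix exceeds the cutoff. This is safe because all costs and penalties are non-negative, so the final distance cannot be smaller than the minimum of any row.

```python
import math


def adtw_with_cutoff(x, y, omega=0.0, cutoff=math.inf):
    """ADTW between two 1D sequences, abandoning early above a cutoff.

    omega is the additive penalty on non-diagonal (warping) steps;
    omega=0 gives unconstrained DTW with squared point costs.
    Returns math.inf if the distance is certain to exceed cutoff.
    """
    n, m = len(x), len(y)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix against empty prefix costs nothing
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            best = min(
                prev[j - 1],          # diagonal step, no penalty
                prev[j] + omega,      # vertical step, penalised
                curr[j - 1] + omega,  # horizontal step, penalised
            )
            curr[j] = cost + best
        # Every warping path crosses this row, so its minimum is a lower bound.
        if min(curr[1:]) > cutoff:
            return math.inf
        prev = curr
    return prev[m]


a = [0.0, 0.0, 1.0, 2.0]
b = [0.0, 1.0, 2.0]
plain_dtw = adtw_with_cutoff(a, b)                          # warping is free
penalised = adtw_with_cutoff(a, b, omega=1.0)               # one warping step
abandoned = adtw_with_cutoff(a, b, omega=1.0, cutoff=0.5)   # exceeds cutoff
```

In a nearest-exemplar search, the cutoff would typically be the best distance found so far, so that candidates which cannot win are dropped early.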
Hi, @MatthewMiddlehurst, I've corrected the code. I think we are good to compare the results now.
That was not the issue; it will be internal to the fit tree function. I would revert the last change, as the previous version was how we do that elsewhere.
Does not stop me evaluating, just have to find the time to get it working with our setup and run it 🙂
Hi, I ran this on 90+ UCR datasets with 5 resamples, and it appears to perform worse than the results published in the paper.
5 resamples
aeon average acc: 0.827681 paper average acc: 0.861343 average acc diff: -0.03366
| dataset | aeon | paper | diff |
|---|---|---|---|
| SonyAIBORobotSurface1 | 0.941764 | 0.878869 | 0.062895 |
| Wine | 0.822222 | 0.781481 | 0.040741 |
| Lightning7 | 0.775342 | 0.742466 | 0.032877 |
| Lightning2 | 0.809836 | 0.783607 | 0.02623 |
| ECG200 | 0.904 | 0.89 | 0.014 |
| RefrigerationDevices | 0.7008 | 0.693333 | 0.007467 |
| DiatomSizeReduction | 0.944444 | 0.937255 | 0.00719 |
| HouseTwenty | 0.941176 | 0.934454 | 0.006723 |
| Ham | 0.737143 | 0.733333 | 0.00381 |
| DistalPhalanxOutlineCorrect | 0.806522 | 0.803623 | 0.002899 |
| FaceFour | 0.956818 | 0.954545 | 0.002273 |
| ItalyPowerDemand | 0.957629 | 0.955879 | 0.001749 |
| ECG5000 | 0.944267 | 0.943111 | 0.001156 |
| Mallat | 0.97177 | 0.971087 | 0.000682 |
| Coffee | 1 | 1 | 0 |
| GunPointOldVersusYoung | 1 | 1 | 0 |
| InsectEPGRegularTrain | 1 | 1 | 0 |
| InsectEPGSmallTrain | 1 | 1 | 0 |
| SmoothSubspace | 0.998667 | 0.998667 | 0 |
| Trace | 1 | 1 | 0 |
| TwoPatterns | 0.9997 | 0.99995 | -0.00025 |
| GunPointMaleVersusFemale | 0.998101 | 0.999367 | -0.00127 |
| Wafer | 0.99682 | 0.998799 | -0.00198 |
| SyntheticControl | 0.992667 | 0.995333 | -0.00267 |
| Earthquakes | 0.748201 | 0.751079 | -0.00288 |
| FaceAll | 0.947456 | 0.951361 | -0.00391 |
| Plane | 0.994286 | 1 | -0.00571 |
| FacesUCR | 0.959024 | 0.964878 | -0.00585 |
| OliveOil | 0.88 | 0.886667 | -0.00667 |
| DistalPhalanxOutlineAgeGroup | 0.794245 | 0.801439 | -0.00719 |
| MixedShapesSmallTrain | 0.92701 | 0.93468 | -0.00767 |
| MixedShapesRegularTrain | 0.962392 | 0.971546 | -0.00915 |
| PhalangesOutlinesCorrect | 0.812121 | 0.821678 | -0.00956 |
| Chinatown | 0.956851 | 0.96793 | -0.01108 |
| MoteStrain | 0.916294 | 0.927476 | -0.01118 |
| CBF | 0.975778 | 0.988222 | -0.01244 |
| Meat | 0.963333 | 0.976667 | -0.01333 |
| CricketZ | 0.804103 | 0.818974 | -0.01487 |
| CinCECGTorso | 0.973768 | 0.988696 | -0.01493 |
| InsectWingbeatSound | 0.615253 | 0.630202 | -0.01495 |
| Worms | 0.693506 | 0.709091 | -0.01558 |
| Crop | 0.745595 | 0.762512 | -0.01692 |
| GunPointAgeSpan | 0.978481 | 0.99557 | -0.01709 |
| CricketX | 0.781538 | 0.798974 | -0.01744 |
| PowerCons | 0.97 | 0.987778 | -0.01778 |
| ShapeletSim | 0.89 | 0.907778 | -0.01778 |
| Strawberry | 0.942162 | 0.963784 | -0.02162 |
| Computers | 0.768 | 0.7904 | -0.0224 |
| BME | 0.973333 | 1 | -0.02667 |
| Herring | 0.56875 | 0.596875 | -0.02813 |
| ACSF1 | 0.712 | 0.742 | -0.03 |
| SwedishLeaf | 0.93312 | 0.96384 | -0.03072 |
| ToeSegmentation2 | 0.887692 | 0.918462 | -0.03077 |
| Symbols | 0.934271 | 0.96804 | -0.03377 |
| CricketY | 0.767692 | 0.802051 | -0.03436 |
| Haptics | 0.446104 | 0.480519 | -0.03442 |
| UWaveGestureLibraryY | 0.75321 | 0.78794 | -0.03473 |
| UWaveGestureLibraryX | 0.815131 | 0.851591 | -0.03646 |
| ArrowHead | 0.861714 | 0.898286 | -0.03657 |
| MiddlePhalanxOutlineCorrect | 0.798625 | 0.837113 | -0.03849 |
| UWaveGestureLibraryZ | 0.750307 | 0.789336 | -0.03903 |
| MedicalImages | 0.749211 | 0.791316 | -0.04211 |
| WordSynonyms | 0.748903 | 0.79185 | -0.04295 |
| ECGFiveDays | 0.926365 | 0.973519 | -0.04715 |
| Rock | 0.764 | 0.812 | -0.048 |
| SonyAIBORobotSurface2 | 0.843861 | 0.89192 | -0.04806 |
| ShapesAll | 0.85 | 0.898667 | -0.04867 |
| WormsTwoClass | 0.737662 | 0.787013 | -0.04935 |
| BirdChicken | 0.88 | 0.93 | -0.05 |
| GunPoint | 0.949333 | 1 | -0.05067 |
| Yoga | 0.8544 | 0.9054 | -0.051 |
| ProximalPhalanxTW | 0.747317 | 0.8 | -0.05268 |
| FiftyWords | 0.793846 | 0.847473 | -0.05363 |
| ProximalPhalanxOutlineAgeGroup | 0.794146 | 0.847805 | -0.05366 |
| ProximalPhalanxOutlineCorrect | 0.817182 | 0.876976 | -0.05979 |
| Fish | 0.908571 | 0.969143 | -0.06057 |
| MiddlePhalanxTW | 0.497403 | 0.558442 | -0.06104 |
| DistalPhalanxTW | 0.647482 | 0.709353 | -0.06187 |
| Adiac | 0.690537 | 0.75601 | -0.06547 |
| MiddlePhalanxOutlineAgeGroup | 0.607792 | 0.675325 | -0.06753 |
| ChlorineConcentration | 0.584635 | 0.652344 | -0.06771 |
| ScreenType | 0.5696 | 0.6416 | -0.072 |
| OSULeaf | 0.800826 | 0.88843 | -0.0876 |
| UMD | 0.884722 | 0.975 | -0.09028 |
| SmallKitchenAppliances | 0.685333 | 0.780267 | -0.09493 |
| BeetleFly | 0.76 | 0.86 | -0.1 |
| Beef | 0.52 | 0.62 | -0.1 |
| EOGHorizontalSignal | 0.668508 | 0.768508 | -0.1 |
| InlineSkate | 0.402182 | 0.505455 | -0.10327 |
| LargeKitchenAppliances | 0.716267 | 0.819733 | -0.10347 |
| EOGVerticalSignal | 0.656354 | 0.760221 | -0.10387 |
| FreezerRegularTrain | 0.882456 | 0.999228 | -0.11677 |
| Car | 0.763333 | 0.883333 | -0.12 |
| ToeSegmentation1 | 0.769298 | 0.89386 | -0.12456 |
| FreezerSmallTrain | 0.754386 | 0.892912 | -0.13853 |
| TwoLeadECG | 0.830378 | 0.997191 | -0.16681 |
train/test
aeon average acc: 0.815098 paper average acc: 0.848603 average acc diff: -0.0335
| dataset | aeon | paper | diff |
|---|---|---|---|
| SonyAIBORobotSurface1 | 0.908486 | 0.816972 | 0.091514 |
| Wine | 0.574074 | 0.518519 | 0.055556 |
| Ham | 0.685714 | 0.657143 | 0.028571 |
| ArrowHead | 0.897143 | 0.874286 | 0.022857 |
| DiatomSizeReduction | 0.973856 | 0.954248 | 0.019608 |
| HouseTwenty | 0.94958 | 0.932773 | 0.016807 |
| Lightning2 | 0.836066 | 0.819672 | 0.016393 |
| RefrigerationDevices | 0.557333 | 0.546667 | 0.010667 |
| DistalPhalanxOutlineCorrect | 0.789855 | 0.786232 | 0.003623 |
| Coffee | 1 | 1 | 0 |
| DistalPhalanxOutlineAgeGroup | 0.726619 | 0.726619 | 0 |
| FaceFour | 0.965909 | 0.965909 | 0 |
| GunPointMaleVersusFemale | 1 | 1 | 0 |
| GunPointOldVersusYoung | 1 | 1 | 0 |
| Haptics | 0.457792 | 0.457792 | 0 |
| InsectEPGRegularTrain | 1 | 1 | 0 |
| InsectEPGSmallTrain | 1 | 1 | 0 |
| OliveOil | 0.9 | 0.9 | 0 |
| SmoothSubspace | 1 | 1 | 0 |
| SyntheticControl | 0.99 | 0.99 | 0 |
| Trace | 1 | 1 | 0 |
| FacesUCR | 0.957073 | 0.957561 | -0.00049 |
| TwoPatterns | 0.999 | 0.99975 | -0.00075 |
| CricketY | 0.807692 | 0.810256 | -0.00256 |
| CricketZ | 0.802564 | 0.805128 | -0.00256 |
| Wafer | 0.99562 | 0.998378 | -0.00276 |
| Chinatown | 0.973761 | 0.976676 | -0.00292 |
| ECG5000 | 0.941556 | 0.945778 | -0.00422 |
| ItalyPowerDemand | 0.962099 | 0.96793 | -0.00583 |
| FaceAll | 0.800592 | 0.808284 | -0.00769 |
| MixedShapesRegularTrain | 0.960825 | 0.969485 | -0.00866 |
| CinCECGTorso | 0.960145 | 0.968841 | -0.0087 |
| Plane | 0.990476 | 1 | -0.00952 |
| MiddlePhalanxTW | 0.519481 | 0.532468 | -0.01299 |
| GunPoint | 0.986667 | 1 | -0.01333 |
| Lightning7 | 0.780822 | 0.794521 | -0.0137 |
| Earthquakes | 0.748201 | 0.76259 | -0.01439 |
| MixedShapesSmallTrain | 0.922887 | 0.938144 | -0.01526 |
| Meat | 0.916667 | 0.933333 | -0.01667 |
| GunPointAgeSpan | 0.981013 | 1 | -0.01899 |
| SwedishLeaf | 0.9312 | 0.9504 | -0.0192 |
| BME | 0.98 | 1 | -0.02 |
| CricketX | 0.776923 | 0.797436 | -0.02051 |
| Crop | 0.746786 | 0.767321 | -0.02054 |
| ProximalPhalanxOutlineCorrect | 0.859107 | 0.879725 | -0.02062 |
| MoteStrain | 0.904153 | 0.92492 | -0.02077 |
| DistalPhalanxTW | 0.640288 | 0.661871 | -0.02158 |
| Mallat | 0.955224 | 0.978252 | -0.02303 |
| ToeSegmentation2 | 0.884615 | 0.907692 | -0.02308 |
| MedicalImages | 0.759211 | 0.782895 | -0.02368 |
| EOGHorizontalSignal | 0.546961 | 0.571823 | -0.02486 |
| PowerCons | 0.961111 | 0.988889 | -0.02778 |
| Symbols | 0.948744 | 0.976884 | -0.02814 |
| UWaveGestureLibraryY | 0.754886 | 0.784478 | -0.02959 |
| Strawberry | 0.935135 | 0.964865 | -0.02973 |
| Herring | 0.59375 | 0.625 | -0.03125 |
| Computers | 0.712 | 0.744 | -0.032 |
| InsectWingbeatSound | 0.611111 | 0.643434 | -0.03232 |
| Beef | 0.7 | 0.733333 | -0.03333 |
| UMD | 0.951389 | 0.986111 | -0.03472 |
| PhalangesOutlinesCorrect | 0.794872 | 0.829837 | -0.03497 |
| Yoga | 0.854333 | 0.889333 | -0.035 |
| FiftyWords | 0.804396 | 0.841758 | -0.03736 |
| CBF | 0.957778 | 0.996667 | -0.03889 |
| Worms | 0.662338 | 0.701299 | -0.03896 |
| UWaveGestureLibraryX | 0.807929 | 0.847292 | -0.03936 |
| ACSF1 | 0.79 | 0.83 | -0.04 |
| ECG200 | 0.9 | 0.94 | -0.04 |
| UWaveGestureLibraryZ | 0.745114 | 0.785874 | -0.04076 |
| ShapesAll | 0.848333 | 0.891667 | -0.04333 |
| ProximalPhalanxTW | 0.746341 | 0.790244 | -0.0439 |
| MiddlePhalanxOutlineAgeGroup | 0.551948 | 0.597403 | -0.04545 |
| WordSynonyms | 0.731975 | 0.782132 | -0.05016 |
| WormsTwoClass | 0.727273 | 0.779221 | -0.05195 |
| ProximalPhalanxOutlineAgeGroup | 0.795122 | 0.84878 | -0.05366 |
| Fish | 0.925714 | 0.982857 | -0.05714 |
| SonyAIBORobotSurface2 | 0.828961 | 0.886674 | -0.05771 |
| ChlorineConcentration | 0.580469 | 0.642969 | -0.0625 |
| FreezerSmallTrain | 0.682456 | 0.74807 | -0.06561 |
| ShapeletSim | 0.883333 | 0.95 | -0.06667 |
| ECGFiveDays | 0.894309 | 0.962834 | -0.06852 |
| ScreenType | 0.466667 | 0.536 | -0.06933 |
| MiddlePhalanxOutlineCorrect | 0.776632 | 0.85567 | -0.07904 |
| Rock | 0.72 | 0.8 | -0.08 |
| FreezerRegularTrain | 0.910175 | 0.998246 | -0.08807 |
| EOGVerticalSignal | 0.475138 | 0.569061 | -0.09392 |
| Adiac | 0.680307 | 0.780051 | -0.09974 |
| InlineSkate | 0.374545 | 0.494545 | -0.12 |
| LargeKitchenAppliances | 0.645333 | 0.765333 | -0.12 |
| ToeSegmentation1 | 0.846491 | 0.969298 | -0.12281 |
| SmallKitchenAppliances | 0.658667 | 0.784 | -0.12533 |
| OSULeaf | 0.743802 | 0.871901 | -0.1281 |
| BirdChicken | 0.8 | 0.95 | -0.15 |
| Car | 0.783333 | 0.933333 | -0.15 |
| TwoLeadECG | 0.833187 | 0.998244 | -0.16506 |
| BeetleFly | 0.65 | 0.85 | -0.2 |
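The summary figures above the tables are plain averages over datasets. As a sketch of how such a comparison can be summarised, the snippet below computes the mean accuracies, the mean difference, and a win/draw/loss count. Only four rows from the train/test table are included to keep it short, so its numbers do not match the full-table averages. The `summarise` helper and its `tol` parameter are hypothetical.

```python
# (dataset, aeon accuracy, paper accuracy) copied from the train/test table.
rows = [
    ("SonyAIBORobotSurface1", 0.908486, 0.816972),
    ("Coffee", 1.0, 1.0),
    ("GunPoint", 0.986667, 1.0),
    ("BeetleFly", 0.65, 0.85),
]


def summarise(rows, tol=1e-9):
    """Mean accuracies, mean difference and win/draw/loss counts for aeon."""
    diffs = [aeon - paper for _, aeon, paper in rows]
    n = len(rows)
    return {
        "aeon_mean": sum(aeon for _, aeon, _ in rows) / n,
        "paper_mean": sum(paper for _, _, paper in rows) / n,
        "mean_diff": sum(diffs) / n,
        "wins": sum(d > tol for d in diffs),
        "draws": sum(abs(d) <= tol for d in diffs),
        "losses": sum(d < -tol for d in diffs),
    }


summary = summarise(rows)
```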
The results are not what I expected :( Apart from EAP, everything else was adopted per the paper, so accuracy should have been the same. Will look into the code once more.
I have updated to also include just the default train/test split. The paper seems to have changed a bit from the arXiv version. We can try asking the authors if necessary.
Yes, you are right. In the published version, they've added HYDRA as well. It wasn't there in the arXiv version.
Hi @MatthewMiddlehurst, could you rerun this on the UCR datasets? I have added Minkowski+HYDRA, so it may be slow compared to ProximityForest.