
[DOC] More prominent examples of using aeon with sklearn

Open TonyBagnall opened this issue 1 year ago • 9 comments

Describe the issue linked to the documentation

It may just be me, but when I looked around our docs for examples of how to use aeon with sklearn cross-validation and similar tools, all I found was this:

https://www.aeon-toolkit.org/en/v0.10.0/examples/distances/sklearn_distances.html

I know there is more there, but I think a "getting started if you are familiar with sklearn" page with plenty of examples for clustering, classification, and regression would be helpful.

Suggest a potential alternative/fix

No response

TonyBagnall avatar Aug 07 '24 15:08 TonyBagnall

@aeon-actions-bot assign @aryan0931

aryan0931 avatar Nov 27 '24 18:11 aryan0931

Hey @MatthewMiddlehurst, I am keen to work on this documentation issue. Could I please be assigned to it?

NiyatiBisht08 avatar Mar 22 '25 04:03 NiyatiBisht08

What do you plan to add?

MatthewMiddlehurst avatar Mar 22 '25 09:03 MatthewMiddlehurst

Hi @MatthewMiddlehurst, I have experience in scientific computing and AI-driven projects, and I'm familiar with both sklearn and documentation structuring. I can enhance the examples to make it easier for sklearn users to get started with aeon. Let me know if I can take this up.

an04shu avatar Mar 28 '25 11:03 an04shu

What do you plan on adding?

MatthewMiddlehurst avatar Mar 28 '25 11:03 MatthewMiddlehurst

Thanks for the clarification! After reviewing the documentation, I noticed that while it covers classification (using KNeighborsClassifier) and clustering (DBSCAN), there are still some areas where more clarity could be helpful. I’d like to contribute by adding:

Regression Examples: Right now, there aren’t any concrete examples of using aeon distances with scikit-learn regression models. I plan to add a practical example demonstrating how to integrate aeon distances in a regression workflow.

Better Data Formatting Guidance: The docs mention the difference between aeon’s 3D format (n_cases, n_channels, n_timepoints) and sklearn’s 2D format but don’t provide step-by-step instructions on converting between the two. I’d like to add clear examples of how to reshape data correctly.

Preprocessing Tips: Since sklearn expects 2D arrays, preprocessing is crucial when working with time-series data. I’ll include guidance on structuring datasets properly, ensuring compatibility, and avoiding common errors.
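
To illustrate the data formatting point, here is a minimal sketch of the conversion between the 3D layout (n_cases, n_channels, n_timepoints) and the 2D layout sklearn expects. It uses only NumPy; the helper names `to_sklearn_2d` and `to_aeon_3d` are hypothetical and are not part of aeon or sklearn, and the data is synthetic.

```python
import numpy as np


def to_sklearn_2d(X):
    """Flatten (n_cases, n_channels, n_timepoints) to (n_cases, n_channels * n_timepoints).

    Hypothetical helper for illustration; not an aeon or sklearn function.
    """
    X = np.asarray(X)
    if X.ndim != 3:
        raise ValueError(f"expected a 3D array, got shape {X.shape}")
    # Channels are concatenated one after another along the feature axis.
    return X.reshape(X.shape[0], -1)


def to_aeon_3d(X, n_channels=1):
    """Restore (n_cases, n_channels, n_timepoints) from a flattened 2D array.

    Hypothetical helper for illustration; assumes equal-length channels.
    """
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError(f"expected a 2D array, got shape {X.shape}")
    n_cases, n_features = X.shape
    if n_features % n_channels != 0:
        raise ValueError("n_features must be divisible by n_channels")
    return X.reshape(n_cases, n_channels, n_features // n_channels)


# Synthetic data: 10 cases, 2 channels, 50 time points.
rng = np.random.default_rng(0)
X_3d = rng.normal(size=(10, 2, 50))

X_2d = to_sklearn_2d(X_3d)               # shape (10, 100)
X_back = to_aeon_3d(X_2d, n_channels=2)  # shape (10, 2, 50)
```

Flattening this way lets a standard sklearn estimator accept the data, but that estimator then treats every time point as an independent feature. The sketch also assumes equal-length series; unequal-length collections would need padding or truncation first.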

an04shu avatar Mar 28 '25 12:03 an04shu

OK, feel free to open a PR and we can see how it looks. I wouldn't close this issue in it, though.

MatthewMiddlehurst avatar Mar 28 '25 12:03 MatthewMiddlehurst

I've opened a PR for this. Looking forward to your feedback.

an04shu avatar Mar 30 '25 19:03 an04shu

Hi @MatthewMiddlehurst, this is a great point. I’ve also found myself wishing for more sklearn-style examples when getting started with aeon. I’ve got some experience with sklearn and would love to help out with this!

I’m happy to put together a “getting started for sklearn users” section with examples for clustering, classification, regression, and cross-validation. Let me know if that sounds good to you!
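
As a rough idea of what a cross-validation example in such a section might look like, here is a minimal sketch using only NumPy and sklearn. The data is synthetic, and a plain sklearn `KNeighborsClassifier` on flattened series stands in for an aeon estimator, so this shows the workflow rather than any aeon-specific API.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic collection in the (n_cases, n_channels, n_timepoints) layout.
rng = np.random.default_rng(0)
n_cases, n_channels, n_timepoints = 40, 1, 30
t = np.linspace(0, 2 * np.pi, n_timepoints)
y = np.repeat([0, 1], n_cases // 2)

# Class 0 is a noisy sine wave, class 1 a noisy cosine wave.
X = np.where(y[:, None, None] == 0, np.sin(t), np.cos(t))
X = X + 0.1 * rng.normal(size=(n_cases, n_channels, n_timepoints))

# Flatten to 2D so a standard sklearn estimator accepts the data.
X_2d = X.reshape(n_cases, -1)

# Stand-in estimator: 1-NN with Euclidean distance on the flattened series.
clf = KNeighborsClassifier(n_neighbors=1)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X_2d, y, cv=cv)

print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())
```

Because aeon estimators are designed to follow the sklearn estimator interface, the same `cross_val_score` call should also work with an aeon classifier in place of `clf`, fed the 3D array directly. That substitution is the thing the proposed section would need to demonstrate and verify.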

hiteshkarvil avatar Apr 10 '25 09:04 hiteshkarvil