aeon [MNT] Add codespell support (config, workflow to detect/not fix) and make it fix a "few" typos

More about codespell: https://github.com/codespell-project/codespell .

I have personally introduced it to dozens, if not hundreds, of projects already, and so far I have received only positive feedback.

The CI workflow has 'permissions' set only to 'read', so it should also be safe.

This is just a draft. Because of the large number of typos detected, I decided to check first, before tuning the config and skips, whether you would be interested in and willing to eventually review/accept such a PR. If you spot a false positive, flag it or add it to the codespell config in pyproject.toml, and let's work together to make aeon "typos free"!

TODOs

...
[ ] run a round of interactive fixes for ambiguous typos -- I haven't done this yet

Mar 19 '25 12:03 yarikoptic

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Mar 19 '25 12:03 review-notebook-app[bot]

Thank you for contributing to `aeon`

I did not find any labels to add based on the title. Please add the [ENH], [MNT], [BUG], [DOC], [REF], [DEP] and/or [GOV] tags to your pull request titles. For now you can add the labels manually. This PR changes too many different packages (>3) for automatic addition of labels; please manually add package labels if relevant.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

[ ] Run pre-commit checks for all files
[ ] Run mypy typecheck tests
[ ] Run all pytest tests and configurations
[ ] Run all notebook example tests
[ ] Run numba-disabled codecov tests
[ ] Stop automatic pre-commit fixes (always disabled for drafts)
[ ] Disable numba cache loading
[ ] Push an empty commit to re-run CI checks

Mar 19 '25 12:03 aeon-actions-bot[bot]

@yarikoptic anything you need for this? No issue if you're just busy.

Apr 03 '25 17:04 MatthewMiddlehurst

@MatthewMiddlehurst thanks for the review!

> There are a few concerns that would mean I would not want it editing my code/docs on its own, but making PRs or suggestions is fine, and I could see it being helpful.

This would not make any changes. It would just detect and annotate, as you can now see at https://github.com/aeon-toolkit/aeon/pull/2653/files#diff-55dd3e1949bcb682be343ce038cc8afa221db3fd5d3096eee878870ca6b34543 , which you would not get with pre-commit alone. It is up to you whether to keep a dedicated CI workflow as well or to rely on pre-commit only. I personally prefer a dedicated one, for this and other reasons, for example that it immediately becomes clear that the likely culprit is just a typo ;)

Apr 03 '25 19:04 yarikoptic

> @yarikoptic anything you need for this? No issue if you're just busy.

Ideally:

1. Someone goes through those ambiguous ones to see whether anything should be skipped at the file level, or just for specific words, or ...

❯ codespell
./aeon/benchmarking/results_loaders.py:254: ot ==> to, of, or, not, it
./aeon/classification/deep_learning/_disjoint_cnn.py:57: ot ==> to, of, or, not, it
./aeon/classification/deep_learning/_fcn.py:39: ot ==> to, of, or, not, it
./aeon/classification/dictionary_based/_tde.py:619: propotion ==> proportion, promotion
./aeon/classification/ordinal_classification/_ordinal_tde.py:593: propotion ==> proportion, promotion
./aeon/clustering/deep_learning/_ae_dcnn.py:49: ot ==> to, of, or, not, it
./aeon/clustering/deep_learning/_ae_fcn.py:47: ot ==> to, of, or, not, it
./aeon/networks/_ae_fcn.py:37: ot ==> to, of, or, not, it
./aeon/networks/_disjoint_cnn.py:51: ot ==> to, of, or, not, it
./aeon/networks/_fcn.py:32: ot ==> to, of, or, not, it
./aeon/regression/deep_learning/_disjoint_cnn.py:57: ot ==> to, of, or, not, it
./aeon/regression/deep_learning/_fcn.py:39: ot ==> to, of, or, not, it
./aeon/segmentation/_ggs.py:66: interative ==> interactive, iterative
./aeon/segmentation/_ggs.py:129: fo ==> of, for, to, do, go
./aeon/segmentation/_ggs.py:390: interative ==> interactive, iterative
./aeon/segmentation/_hidalgo.py:540: lik ==> like, lick, link
./aeon/segmentation/_hidalgo.py:541: lik ==> like, lick, link
./aeon/segmentation/_hidalgo.py:625: lik ==> like, lick, link
./aeon/segmentation/_hidalgo.py:627: lik ==> like, lick, link
./aeon/segmentation/_hidalgo.py:629: lik ==> like, lick, link
./aeon/segmentation/_hmm.py:246: preceeding ==> preceding, proceeding
./aeon/similarity_search/base.py:199: interable ==> iterable, interactable, integrable, intolerable
./aeon/similarity_search/query_search.py:163: Interable ==> Iterable, Interactable, Integrable, Intolerable
./aeon/transformations/collection/channel_selection/_channel_scorer.py:25: fro ==> for, from
./aeon/transformations/collection/convolution_based/_minirocket.py:142: dependend ==> dependent, depended, depend
./aeon/transformations/collection/convolution_based/_multirocket.py:164: dependend ==> dependent, depended, depend
./aeon/transformations/collection/tests/test_pad.py:69: padd ==> pad, padded
./aeon/utils/tags/_tags.py:3: contrained ==> constrained, contained
./aeon/visualisation/estimator/_shapelets.py:262: macth ==> match, math, mach
./aeon/visualisation/estimator/_shapelets.py:515: macth ==> match, math, mach
./aeon/visualisation/estimator/_shapelets.py:1117: macth ==> match, math, mach
./aeon/visualisation/results/_mcm.py:141: ot ==> to, of, or, not, it
./aeon/visualisation/results/_scatter.py:291: wil ==> will, well
./aeon/visualisation/results/_scatter.py:296: wil ==> will, well
./docs/conf.py:591: buttom ==> button, bottom
./docs/changelogs/v0/v0.1.md:54: coverting ==> converting, covering, coveting
./docs/changelogs/v0/v0.6.md:126: ot ==> to, of, or, not, it
./examples/clustering/partitional_clustering.ipynb:142: proably ==> probably, provably
./examples/datasets/provided_data.ipynb:496: flourescent ==> fluorescent, florescent
./examples/distances/distances.ipynb:418: constained ==> constrained, contained
./examples/networks/deep_learning.ipynb:257: throught ==> thought, through, throughout
./examples/transformations/sast.ipynb:2152: considere ==> consider, considered
./examples/transformations/tsfresh.ipynb:9: extacting ==> extracting, exacting

Add those and any other false positives to the ignore settings in the pyproject.toml config: https://github.com/aeon-toolkit/aeon/pull/2653/files#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711 .
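To decide which words belong in the ignore settings, it can help to group the report by flagged word rather than by file. The following is only a minimal sketch: the helper name `group_by_word` and the regular expression are illustrative assumptions (not part of codespell), and the sample lines are copied from the report above.

```python
import re
from collections import defaultdict

# One report line looks like: "./path/file.py:254: ot ==> to, of, or, not, it"
# (format assumed from the report quoted in this thread).
LINE_RE = re.compile(r"^(?P<path>.+?):(?P<line>\d+): (?P<word>\S+) ==> (?P<fixes>.+)$")


def group_by_word(report: str) -> dict[str, list[str]]:
    """Map each flagged word (lowercased) to the locations where it was reported."""
    hits = defaultdict(list)
    for raw in report.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            hits[match["word"].lower()].append(f'{match["path"]}:{match["line"]}')
    return dict(hits)


sample = """\
./aeon/networks/_fcn.py:32: ot ==> to, of, or, not, it
./aeon/segmentation/_hidalgo.py:540: lik ==> like, lick, link
./aeon/segmentation/_hidalgo.py:541: lik ==> like, lick, link
./aeon/similarity_search/base.py:199: interable ==> iterable, interactable, integrable, intolerable
./aeon/similarity_search/query_search.py:163: Interable ==> Iterable, Interactable, Integrable, Intolerable
"""

grouped = group_by_word(sample)
# Most frequently flagged words first; these are the first ones worth checking by hand.
for word, places in sorted(grouped.items(), key=lambda kv: -len(kv[1])):
    print(f"{word}: {len(places)} hit(s)")
```

Words that recur across many files are probably the quickest to triage: each one is either a real typo to fix everywhere or a candidate for the ignore list.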

Then we could rerun that commit with `codespell -w` and follow it up with `codespell -w -i 3 -C 2` to address the outstanding "ambiguous" ones, and be done and make aeon typos free!
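The two passes mentioned above could be scripted roughly as follows. This is a sketch under assumptions: the wrapper `run_fix_passes` and its `dry_run` flag are hypothetical, and only the codespell flags named in this comment (`-w`, `-i 3`, `-C 2`) are used.

```python
import subprocess

# Pass 1: let codespell write the fixes it can apply on its own.
AUTO_FIX = ["codespell", "-w"]
# Pass 2: interactive mode (-i 3) with 2 lines of context (-C 2) for the ambiguous ones.
INTERACTIVE_FIX = ["codespell", "-w", "-i", "3", "-C", "2"]


def run_fix_passes(dry_run: bool = True) -> list[list[str]]:
    """Return the planned commands; execute them in order unless dry_run is True."""
    plan = [AUTO_FIX, INTERACTIVE_FIX]
    if not dry_run:
        for cmd in plan:
            # codespell may exit non-zero while typos remain, so check=True is not used.
            subprocess.run(cmd)
    return plan


for cmd in run_fix_passes(dry_run=True):
    print(" ".join(cmd))
```

With `dry_run=True` the script only prints the two command lines, which is a safe way to confirm the plan before letting anything rewrite files.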

Apr 03 '25 19:04 yarikoptic

Hi @yarikoptic, I went ahead and did the above. I think it would fit better into our current setup going through pre-commit. Can you just do the annotations without the other action? It would be helpful if you could take a look at the setup for any issues. You don't have to go through all the typos 🙂

Jun 26 '25 18:06 MatthewMiddlehurst

Also, I am a fan of skipping .git, but I don't think we want to skip .github; .git* will do both, I believe. I tried a few ways to skip one but not the other, but none worked. Not sure if you could help with that, @yarikoptic.

Jun 26 '25 18:06 MatthewMiddlehurst

> Hi @yarikoptic, I went ahead and did the above. I think it would fit better into our current setup going through pre-commit. Can you just do the annotations without the other action? It would be helpful if you could take a look at the setup for any issues. You don't have to go through all the typos 🙂

I am not sure that the annotation action alone, without the codespell one, would work. Could we push a TEMP commit with a typo and see whether it gets properly annotated (and then drop that commit and force-push; let me know if you need help with that)? I think there is little to no harm overall in having an additional codespell step.

> Also, I am a fan of skipping .git, but I don't think we want to skip .github; .git* will do both, I believe. I tried a few ways to skip one but not the other, but none worked. Not sure if you could help with that, @yarikoptic.

Replace `.git*` with just `.git` in `skip`, or am I missing something?
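The difference between the two patterns can be illustrated with Python's `fnmatch`. This is a sketch that assumes `skip` entries behave like shell-style globs; the directory names in `entries` and the helper `skipped` are hypothetical, and codespell's actual matching (for example against full paths versus base names) may differ in detail.

```python
from fnmatch import fnmatch

# Hypothetical top-level entries in a repository checkout.
entries = [".git", ".github", ".gitignore", "aeon"]


def skipped(pattern: str) -> list[str]:
    """Return the entries that a single glob pattern would match."""
    return [name for name in entries if fnmatch(name, pattern)]


print(skipped(".git*"))  # the wildcard also catches .github and .gitignore
print(skipped(".git"))   # matches only the .git entry itself
```

Under that assumption, dropping the trailing `*` is enough to keep `.github` (and hence the workflow files) within the spell check.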

Jun 26 '25 20:06 yarikoptic

No, not missing anything. I must have missed the simplest option while trying to fix that 🙂. Agreed, no harm in having an additional step/failing check for the annotations.

I am fine with this now but will let other developers comment if they want. Thank you for putting this up and for the help @yarikoptic.

Jun 30 '25 20:06 MatthewMiddlehurst
