matrixprofile-ts icon indicating copy to clipboard operation
matrixprofile-ts copied to clipboard

Code Style Guidelines

Open tylerwmarrs opened this issue 6 years ago • 6 comments

There seems to be a lot of inconsistency in coding style. For example, the time series variable is as tsA instead of snake cased as ts_a. Python conventions are not being met within the code. This is fairly low priority, but something to consider; especially if many people contribute. I am not saying that this code base should follow Python standards - just a standard.

tylerwmarrs avatar Jan 18 '19 05:01 tylerwmarrs

@tylerwmarrs that's a fair point. Do you have any specific proposals, or issues that really stand out?

vanbenschoten avatar Jan 18 '19 15:01 vanbenschoten

It probably makes the most sense to follow PEP-8 standards - https://www.python.org/dev/peps/pep-0008/

I believe that you can add a PEP-8 linter to your pull request process.

https://gist.github.com/MichaelCurrie/802ce28c993ff2dd632c

I really like how numpy documents their code. Here is an example and documentation. https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard

tylerwmarrs avatar Jan 18 '19 16:01 tylerwmarrs

I agree - the code for SCRIMP++ is especially hard to follow.

ameya98 avatar Jun 18 '19 05:06 ameya98

@ameya98 SCRIMP++ is written with the goal to have almost every piece of the code unit tested. The fact of the matter is that SCRIMP++ is more complex than the original algorithms. Do you have any suggestions on how to improve the module so that it is easier to follow?

tylerwmarrs avatar Jun 18 '19 05:06 tylerwmarrs

Obviously, I do not intend to criticize this library and your work, because I think it is a great contribution for everyone to use. However, if you do want constructive feedback:

  • Why are there so many functions that seem to do extremely simple operations? Unit-testing does not require breaking extremely simple operations into multiple functions. You can check whatever properties you want to check inside your tests - there's no need for the source code to be so convoluted.
  • Variable names such as tmp_a, tmp_b and so on - this makes the code absolutely unmaintainable.
  • Most of these functions have 5+ parameters - this is not recommended at all as it makes the function too verbose and harder to test.

Use a static analysis tool (eg. Codebeat) that will give your code a rating and tell you where/how to improve the code by refactoring.

ameya98 avatar Jun 18 '19 06:06 ameya98

@ameya98 I very much so appreciate your feedback. All criticism is taken in the right context. I am probably a little bias towards the structure of the module as I wrote it. I think that it is still maintainable as is, but for others to follow, it definitely could use some refactoring.

Extremely simple operations - this shows the slow approach to how I implemented the algorithm. Literally one major computation at a time. This allowed for me to leave the algorithm and come back to it easily.

The variable names are not descriptive enough in some cases for sure.

I hope you can find some time to give this module the appropriate refactoring it needs. Unfortunately, I have other pressing issues at hand that will prevent me from doing this for a while.

As an aside, have you looked at the Matlab implementations of these algorithms? I would argue that those are difficult to follow 😁. Instead of hijacking this thread for SCRIMP++, please create a new issue if you are able to refactor.

tylerwmarrs avatar Jun 18 '19 06:06 tylerwmarrs