gitinspector icon indicating copy to clipboard operation
gitinspector copied to clipboard

Add MATLAB/Octave extension and comment recognition

Open bonanza123 opened this issue 8 years ago • 9 comments

Adds MATLAB/Octave extension and comment recognition

bonanza123 avatar Sep 20 '16 16:09 bonanza123

Hi. Thank you. Would you also be willing to add support for metrics part (metrics.py) ? We have support for a lot of languages, but most of them lack metrics support. I once was pretty proficient in matlab (we had it in a few math courses), but haven't really used it since.

To explain how it works;

metric_eloc = Specifies the recommended maximum number of lines for a certain file type. __metric_cc_tokens __ = For each language, the first list is a list of regexp entry tokens (something that increases the depth of logic), the second list is a regexp list of exit tokens (something that decreases the depth).

adam-waldenberg avatar Sep 23 '16 14:09 adam-waldenberg

@adam-waldenberg I will have a look at it, but I guess this will take some time as I'm currently a bit busy.

One more question regarding the metrics: what does the r e.g. in

["else", r"for\s+\(.*\)", r"foreach\s+\(.*\)", r"goto\s+\w+:", r"if\s+\(.*\)", r"case\s+\w+:",
"default:", r"while\s+\(.*\)"],

mean? And how to handle cases like:

if ( 5 > 1 ) // comment

? Or are the comments stripped before the metrics are calculated?

bonanza123 avatar Sep 23 '16 14:09 bonanza123

@bonanza123 That is the notation for "raw strings" in Python. The string becomes a raw string instead of a Python string. So you don't have to do for example "" to do a single backslash.

r"if\s+(.*) should pick up you example (it ignores the rest).

You can use something like https://regex101.com/ to test.

Have fun ! :)

adam-waldenberg avatar Sep 23 '16 14:09 adam-waldenberg

@adam-waldenberg One more question: Does r"for\s+\(.*\)" also match r"parfor\s+\(.*\)" ?

It would also be nice if you could check my modifications, in particular the if and elseif part, as I'm not sure how the matching is done in the scripts.

bonanza123 avatar Sep 24 '16 17:09 bonanza123

@bonanza123 Hi. Yes, it should see a parfor with that as well. They only immediate problem I see is that "elseif\s+(.*)" is defined as both an entry and exit token.

Otherwise it looks good.

adam-waldenberg avatar Oct 03 '16 00:10 adam-waldenberg

@adam-waldenberg thanks for your feedback. What do you recommend regarding the elseif ? In MATLAB it could look like:

if (x == 5 )
y=5;
elseif (x == 4)
y=34;
elseif ( x==3)
y=5;
else
y=0;
end

I assumed that if we just count from if to end, then we wouldn't consider how many ugly elseifs he/she added.

Alternatively, he/she could also write a lot of if, else, end which would maybe even worse.

bonanza123 avatar Oct 03 '16 10:10 bonanza123

@bonanza123 The elseif's are still counted if they are included in the token list. You should probably also add "else". So in the above case gitinspector would/should count it the following way;

if (x == 5 ) % +2
    y=5;              
elseif (x == 4) % +2
    y=34;
elseif ( x==3) % +2
    y=5;
else % +2
    y=0;
end % +1

Strictly speaking, each point in the if/elseif/else list is of course an exit point, but we don't really care about that. The gitinspector way of counting complexity isn't "exact". However, it doesn't have to be, as it's mainly a way to just help everybody find badly designed files, not as an exact measurement of a metric.

adam-waldenberg avatar Oct 04 '16 13:10 adam-waldenberg

After some thought, I actually think you can remove "end" from the token list.

adam-waldenberg avatar Oct 04 '16 13:10 adam-waldenberg

@bonanza123

Looked a bit more at this... It doesn't seem complete. Does matlab require paranthesis around for loops and statements? I Seem to remember you can write similar to;

for a=1:1:10

Quite a few years since I did anything in Matlab.

adam-waldenberg avatar May 12 '17 23:05 adam-waldenberg