Pylint alerts corrections as part of intervention experiment
I'd like to conduct a software engineering experiment regarding the benefit of Pylint alerts removal. The experiment is described here. In the experiments, Pylint is used with some specific alerts, files are selected for intervention and control. After the interventions are done, one can wait and examine the results.
I'm asking your approval for conducting the interventions in your repositories.The interventions are expected to benefit the project and will submitted in PR for approval.
See examples of interventions in stanford-oval/storm, gabfl/vault, and coreruleset/coreruleset.
See the planed internetions.
@brechtm, Is it OK if I'll fix the alerts?
Of course, code improvement are appreciated! However, it's not clear to me how much effort this requires on my part. Since I don't have much free time nowadays, I cannot promise I can respond in a timely manner.
I'll try not to require too much from you. I'll do the interventions. In case that I'll have questions, I'll be happy to consult.
After that, you'll ask you to review the PR. Most interventions are simple and I'll document and justify them so I hope it will not take you too much time.
I fixed the pylint alert and created a PR. Note that this is still work in progress yet I wanted to consult you.
I notice that the setup for development doe not use requirements file. Is setting the development environment done using pyenv_setup.py? Can you guide me?
The alert in src\rinoh\hyphenator.py seems to identify a bug and not a refactoring opportunity.all = ("Hyphenator") while it should be a list (e.g. all = ["Hyphenator"]). https://stackoverflow.com/questions/44834/what-does-all-mean-in-python
In src\rinoh\highlight.py and src\rinoh\frontend\sphinx\nodes.py there are alerts on using-constant-test. In both cases the function "warn" appears in the if statement but not as a call (e.g., warn()). Later the function is called. What is the intention? elif warn: warn('Could not lex literal_block as "%s". ' 'Highlighting skipped.' % language)
In src\rinoh\frontend\rst\nodes.py there is an alert on pointless statment. Are properties being created on the fly and the self.title aimed to check the existence? class Admonition(DocutilsGroupingNode): grouped_flowables_class = rt.Admonition
def children_flowables(self, skip_first=0): try: self.title return super().children_flowables(skip_first=1) except AttributeError: return super().children_flowables()
In src\rinoh\element.py method build_document of class DocumentElement has an empty implementation of pass. It is implemented in subclasses. Might be better to enforce it by raising NotImplementedError.
In src\rinoh\stylesheets\matcher.py there is a wildcard import. Code uses from rinoh.styleds import * This hides the imported objects and might lead to collisions in the future if objects of the same name are imported with wildcards. styleds is a collection of imports, aimed to ease related imports. I copied the used imports from there and deleted the unused ones. This adds many imports and does not benefit from styleds, so this is a downside of fixing this alert.
@brechtm, can you guide me regadding these cases?
My apologies for not getting back to you earlier. I'm trying to resume work on rinohtype after a long hiatus. I rebased your banch onto current master here: https://github.com/brechtm/rinohtype/tree/evidencebp
For now I have reviewed only a few of the changed files:
src\rinoh\style.py: Looks good 👍src\rinoh\font\type1.py: This breaks the tests, and I don't think the refactoring improves the code. The new method pulls in too much state (file, section, section_names, sections, values) from where it's being called.src\rinoh\reference.py: Again, I don't think the refactored code is better, as it duplicates the list of types being checked. I personally don't think too-many-branches is a real issue in these cases (very short body per branch). Would it be better if a match-case were to be used instead?src\rinoh\font\opentype\required.py: Looks good 👍src\rinoh\__main__.py: Looking at the code before refactoring, I can say it definitely would benefit from more structuring! (I haven't looked at the refactored code yet)
If you are still interested after all this time, I'm happy to review your other changes as well (in a timely fashion this time).
I notice that the setup for development doe not use requirements file. Is setting the development environment done using pyenv_setup.py? Can you guide me?
See https://github.com/brechtm/rinohtype/blob/master/DEVELOPING.rst
The alert in src\rinoh\hyphenator.py seems to identify a bug and not a refactoring opportunity.all = ("Hyphenator") while it should be a list (e.g. all = ["Hyphenator"]). https://stackoverflow.com/questions/44834/what-does-all-mean-in-python
Yes, that should be a list.
In src\rinoh\highlight.py and src\rinoh\frontend\sphinx\nodes.py there are alerts on using-constant-test. In both cases the function "warn" appears in the if statement but not as a call (e.g., warn()). Later the function is called. What is the intention?
That must be a bug, indeed! A comment says it's been copied from Sphinx, so perhaps the bug was also copied from there?
In src\rinoh\frontend\rst\nodes.py there is an alert on pointless statment. Are properties being created on the fly and the self.title aimed to check the existence?
self.title will be present if the node has a child named 'title' (see TreeNode.__getattr__). Adding a comment might be useful.
In src\rinoh\element.py method build_document of class DocumentElement has an empty implementation of pass. It is implemented in subclasses. Might be better to enforce it by raising NotImplementedError.
Most subclasses do not need to set document metadata. Raising NotImplementedError would all of them to override this method with a pass implementation.
In src\rinoh\stylesheets\matcher.py there is a wildcard import. Code uses from rinoh.styleds import * This hides the imported objects and might lead to collisions in the future if objects of the same name are imported with wildcards. styleds is a collection of imports, aimed to ease related imports. I copied the used imports from there and deleted the unused ones. This adds many imports and does not benefit from styleds, so this is a downside of fixing this alert.
Yes, indeed, this duplicates code for something that is already being handled in the styleds module.
As you can probably tell, I never really used code linters much. I guess my view on them is that their output is not black and white; in many cases it can make more sense to violate the generic linter rules simply because the code is still easy to read. Only humans can judge that (and several humans will likely disagree 😄). That said, I am sure that linters can help detect bugs! However, with the time I can spend on rinohtype, I'm now trying to more careful how I spend this time, making sure it pays off. I don't think adding a linter check will save more time than it will cost me.
On a related note, I recently realized that I overengineered the CI setup, automatically testing against a whole bunch of older docutils and Sphinx versions. I have to conclude that this is not worth the effort, since I spend a lot of time fixing issues with the many possible combinations of dependency versions (and the CI setup itself), but most people probably run (near) the latest versions. But I digress...
Thank you very much for your detailed feedback!
The main guideline in this experiment is that the repository should benefit from them. If you have time to review the other changes, that will be great. Otherwise, merging the interventions that you like and fixing the bug will be great.
As for the benefit of linters, this is part of what the experiment tries to evaluate. In general, it seems that complexity reducing alerts (like too-many-branches) do help to reduce tendency to bugs. However, there are indeed many cases in which it desn't help (e.g., there were no bugs to begin with), so indeed it should not be applied blindly.
Some more comments:
src/rinoh/attribute.py: OK, but it seems arbitrary to extract part of the code to a method but not the code above it.src/rinoh/image.py: Looks good 👍src/rinoh/font/opentype/cff.py: I prefer the longer lines.src/rinoh/backend/pdf/filter.py: I wouldn't refactor the code like this. The_handle_bytefunction encapsulates only part of the code that forms a logical unit. For example, it would make more sense to still do thebyte != last_bytecheck in run_length_encoder, but move the body of that branch to a function. This is similar to the refactoring insrc\rinoh\font\type1.py, where a seemingly arbitrary part of the code is moved into a function just for the sake of fixing the linting error but ignoring any logical coherence.src/rinoh/table.py: Looks good 👍src/rinoh/backend/pdf/xobject/purepng.py: I assume this simply copied in the updated version of PurePNG? Note to self: I should include the PurePNG version number (at least in the commit message)src/rinoh/flowable.py: Looks good, except that I don't agree with putting comma's at the start of the line 😄src/rinoh/hyphenator.py: Looks good 👍src/rinoh/paragraph.pyandsrc/rinoh/backend/pdf/reader.py: still to do...
In general, the handle*_ function names could be more descriptive.
src/rinoh/attribute.py: OK, but it seems arbitrary to extract part of the code to a method but not the code above it. -- Yes, there can be other extraction. Please remember that I'm not familiar with the project.
src/rinoh/flowable.py: Looks good, except that I don't agree with putting comma's at the start of the line -- A bit of a personal style, but I can change that.
src/rinoh/paragraph.py and src/rinoh/backend/pdf/reader.py: still to do... -- Ok, go over them and then we will continue
src/rinoh/paragraph.py: I agree that this method should be split up. However, the _process_groups method takes a lot of arguments and then returns some of them (updated), which I'm not sure is an improvement. It would take me quite a bit of time to grok this code and propertly refactor it.src/rinoh/backend/pdf/reader.py: Looks good 👍
I'll cherry-pick the commits that look good to me and adjust others to my liking (if it doesn't take too much time).
Great, thanks!