
Added quoting mechanism in utils to prevent crashes when analyzing ipynb files.

Open ani-hovhannisyan opened this issue 4 years ago • 6 comments

Notebook files (.ipynb) contain Python code, but they may also contain lines that are not valid Python, in particular lines starting with "%" or "!". These make the AST module throw an exception, which PyCG catches, and it then crashes without reporting anything. To solve this, I thought it better not to touch the core codebase of PyCG and instead to add a quoting mechanism in utils/common.py. However, maybe PyCG should also print out that some lines were treated as comments, or that the file does not contain pure Python code. The current commit works as follows: a regexp checks whether a line of code starts with a "!" or "%" symbol (leading whitespace is allowed). If the regexp matches, we insert a "#" right before the unwanted symbol, so the whole line is treated as an ordinary comment line.
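As a rough illustration of the mechanism described above (not the code from the commit itself), the quoting step could look like the sketch below. The names `MAGIC_LINE` and `comment_out_magics` are hypothetical and are not taken from utils/common.py.

```python
import ast
import re

# Hypothetical pattern: optional leading whitespace, then "!" or "%"
# as the first non-blank character of the line.
MAGIC_LINE = re.compile(r"^(\s*)([!%])")


def comment_out_magics(source: str) -> str:
    """Insert "#" before a leading "!" or "%" so the line becomes a comment."""
    quoted = []
    for line in source.splitlines():
        # Keep the indentation (group 1) and put "#" before the symbol (group 2).
        quoted.append(MAGIC_LINE.sub(r"\1#\2", line))
    return "\n".join(quoted)


cell = "%matplotlib inline\n  !pip install x\nprint('50 %')"
cleaned = comment_out_magics(cell)
# The cleaned text parses, whereas the raw cell would raise SyntaxError.
tree = ast.parse(cleaned)
```

Only lines whose first non-blank character is "!" or "%" are changed; a "%" later in a line, for example inside a string, is left alone.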

ani-hovhannisyan avatar Jul 17 '21 15:07 ani-hovhannisyan

Thanks for the PR! Some minor things, just to be consistent with the styling of the rest of the codebase:

  • Please add an extra space at the beginning of each comment
  • Remove the empty line on line 51

vitsalis avatar Jul 19 '21 09:07 vitsalis

Sure, corrected.

ani-hovhannisyan avatar Jul 20 '21 16:07 ani-hovhannisyan

Unfortunately, the current change on the "output-line-number" branch introduced a bug. Please revert it. I tested with different file types, and it seems the regexp doesn't work properly. It may also quote lines that look like this: print("This is ",100*mem_usg/start_mem_usg,"% of the initial size") I will fix it, add tests, and ask for a pull again later.
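The actual pattern on the branch is not shown in this thread, so the following is only a guess at how such a false positive can arise: a pattern that is searched anywhere in the line, rather than anchored to its start, also matches a "%" inside a string literal. Both helper names are hypothetical.

```python
import re

line = 'print("This is ",100*mem_usg/start_mem_usg,"% of the initial size")'


def matches_unanchored(text: str) -> bool:
    # Assumed faulty variant: re.search finds "!" or "%" anywhere in the line.
    return re.search(r"\s*[!%]", text) is not None


def matches_anchored(text: str) -> bool:
    # re.match only matches at the start of the string, so the symbol
    # must be the first non-blank character.
    return re.match(r"\s*[!%]", text) is not None
```

With this example line, the unanchored check reports a match and the anchored one does not, while a real magic line such as `%matplotlib inline` is matched by both.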

ani-hovhannisyan avatar Jul 20 '21 17:07 ani-hovhannisyan

Sure thing! Do you want me to close this PR or will you continue your work here? Still haven't merged anything on master.

vitsalis avatar Jul 21 '21 19:07 vitsalis

Yes, I will continue to work on this branch. Once I finish the row numbers output properly, I will ask again.

ani-hovhannisyan avatar Jul 23 '21 12:07 ani-hovhannisyan

Dear @vitsalis, the current branch was intended to implement outputting line numbers of function calls. But I also committed the unusual-symbol quoting, directory analysis, and a directory cleaning script. We will probably need other changes, and I'm not sure whether they should be merged into the master branch or not. Also, there are still no commits with tests.

ani-hovhannisyan avatar Aug 23 '21 03:08 ani-hovhannisyan