graph4code

Issue in tasks 3-4 "collecting docstrings documentation" for scripts that contain built-in modules

Open itsallgoodman7 opened this issue 1 year ago • 2 comments

I am trying to create/collect a series of Git repositories containing multiple Python scripts.

However, one of your scripts fails when I collect the docstring documentation (the step that is run in the directory containing all previously collected analysis output). If one of the chosen Python scripts imports a built-in module such as __future__, functools, or unittest, the script /src/inspect_docstrings_per_module.py (called inside scripts/inspect_modules_for_docstrings.sh) returns errors (as shown below), and the docstring output for those modules (which are among the top modules to inspect) is not created correctly.

The problem therefore seems to be related to the script run at step 3 (src/generate_top_modules.py), which should not select these built-in modules as packages to be installed (with pip install) when the conda environment is created at step 4. (I am referring to the steps on your GitHub page: https://github.com/wala/graph4code)
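To illustrate the kind of filtering I have in mind, here is a minimal sketch. It is not code from graph4code: the helper names is_stdlib and drop_stdlib are hypothetical, and I am assuming the candidate modules are available as a plain list of names (I have not checked the actual layout of top_modules.json). It relies on sys.stdlib_module_names, which only exists on Python 3.10 and later.

```python
import sys

# sys.stdlib_module_names exists on Python 3.10+. On older interpreters this
# falls back to the much smaller set of compiled-in modules, so the filter
# would be incomplete there.
STDLIB = frozenset(getattr(sys, "stdlib_module_names", ())) | frozenset(
    sys.builtin_module_names
)


def is_stdlib(module_name):
    """Return True if the top-level package of module_name ships with Python."""
    top_level = module_name.split(".")[0]
    return top_level in STDLIB


def drop_stdlib(module_names):
    """Keep only the modules that would actually need a pip install."""
    return [name for name in module_names if not is_stdlib(name)]


# Hypothetical list of candidate top modules, for illustration only.
candidates = ["__future__", "functools", "unittest", "sklearn", "numpy"]
print(drop_stdlib(candidates))
```

Something along these lines, applied before the pip install at step 4, would keep the standard-library modules out of the install list while leaving third-party packages untouched.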

Right now I cannot proceed with analysing custom GitHub repositories, because they always contain Python scripts that import some built-in modules, which results in this error at step 4 (after those modules are selected as top modules in /output/top_modules.json at step 3).

[Screenshot 2022-08-10 at 12 37 02] [Screenshot 2022-08-10 at 12 37 32]

itsallgoodman7, Sep 07 '22 15:09

Linked to this problem (the packages in "analysis_modules" in generate_top_modules.py), there is also the issue that not every package works with the forum graph creation. For example, when the Python scripts we want to analyse use matplotlib or numpy as "top_modules", it doesn't work (0 triples found), while scikit-learn does.

[Screenshot 2022-09-26 at 15 45 16] [Screenshot 2022-09-26 at 13 19 45] [Screenshot 2022-09-26 at 18 13 04]

itsallgoodman7, Oct 05 '22 09:10

We did notice that, depending on the OS, Python version, etc., some modules could be inspected by Python successfully and others couldn't. Our approach was to switch machines/Pythons etc. to do it. It is annoying for sure, but it is unclear how to get around the problem; this is a Python problem.
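One way to see which modules a given machine/Python combination can handle is to probe them up front. The following is only a rough sketch, not part of graph4code: the probe function is a hypothetical helper, and it assumes that "inspectable" means the module imports and its members can be listed with the standard inspect module.

```python
import importlib
import inspect


def probe(module_name):
    """Try to import a module and count its documented members.

    Returns (ok, detail), where ok is False if the import or the
    inspection raised an exception in this interpreter.
    """
    try:
        module = importlib.import_module(module_name)
        documented = [
            name
            for name, obj in inspect.getmembers(module)
            if inspect.getdoc(obj)
        ]
    except Exception as exc:  # failures differ by OS and Python version
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"{len(documented)} documented members"


# The last name is deliberately bogus to show the failure path.
for name in ["functools", "unittest", "a_module_that_does_not_exist"]:
    print(name, probe(name))
```

Running this over the names in the top modules list on each candidate machine would at least show which modules fail there before the full docstring collection is attempted.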

ksrinivs64, Oct 05 '22 22:10