
Creating Forums graph

Open shahendahatem opened this issue 1 year ago • 5 comments

shahendahatem avatar Aug 01 '23 10:08 shahendahatem

I tried to extract the graph as (head, relation, tail) triples, as shown in Fig. 4 of the paper "A Toolkit for Generating Code Knowledge Graphs", but I could not get the relation from the docstring. I followed all the steps, but I only have the head and tail in the graph. What should I do to extract the graph in the same form as Fig. 4?

shahendahatem avatar Aug 01 '23 10:08 shahendahatem

Sorry, I think I need a bit more background on what you mean. What code are you running, and what is the actual graph you get?

ksrinivs64 avatar Aug 03 '23 01:08 ksrinivs64

I mean, I tried to run the code example provided in the paper. I only get the nodes, but I could not get the relations as shown in the figures. Also, when I run the code I only get the JSON file for the provided code. How can I get the .nq file for it?

shahendahatem avatar Aug 03 '23 08:08 shahendahatem

Hi Shahenda, Please try to follow the sequence of steps in the README: https://github.com/wala/graph4code/tree/master#create-your-own-graph. The steps are supposed to create a graph for this example script: https://github.com/wala/graph4code/blob/master/example_scripts/test1.py.

ibrahimabdelaziz avatar Aug 04 '23 02:08 ibrahimabdelaziz

Hi Dr., thanks for your reply. I tried to follow the steps and got the following results:

1- Code Analysis Graph:

java -DoutputDir=./output/static_analysis/ -cp jars/codebreaker3.jar util.RunTurtleSingleAnalysis ./example_scripts/test1.py null null

output : test 1 has 21 turtles

2- Collecting documentation (docstrings) for your scripts

  • First step

cd src
python generate_top_modules.py '../output/static_analysis/0x630x200xba0x940x550x7a0x7e0xbe0x8e0x5b0x6a0x9b0xe90x180x910x61.json.bz2' ../output/top_modules.json 1

output : top_modules.json sklearn

  • Second step

cd scripts
sh inspect_modules_for_docstrings.sh ../output/top_modules.json ../output/modules_out/ ~/anaconda3/

output: Number of documents stored in index:docstrings_index {'count': 146996, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}}

3- Creating docstrings graph

cd src
python create_docstrings_graph.py --docstring_dir ../output/modules_out/ --class_map_file ../resources/classes.map --out_dir ../output/docstrings_graph/

output:
Total number of triples = 728817
writing files to ../output/docstrings_graph//classes_found.txt
skipped triples from lambda expressions: 0
skipped triples due to space in URI: 0

I am confused about some points:

  • The output from the first step is a JSON file that contains the nodes and the Code Analysis Schema. I tried to draw the graph from the JSON file, but I could only draw connected nodes without any relations. So far the graph does not contain any information from the Python knowledge graph.

  • From the second step I get the top modules and the number of documents stored in the index. I could not understand the result: total and successful are both only 1, while count is 146996. What does this number represent?

  • The output from the third step is the number of triples extracted from the analysis of test1.py. How can I draw the output as nodes and relations (as shown in Figure 4 of the paper)?

  • I could not understand why we get this huge number of functions and methods when the code in test1 is simple.
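
On the second point, the printed dictionary has the shape of an Elasticsearch count response: `count` is the number of documents stored in the index, and `_shards` reports how many index shards served the request, not how many documents there are. A minimal sketch that reads those fields from the text copied from the step 2 output (the variable names are only illustrative):

```python
import ast

# The dictionary printed by the second step, copied from the output above.
printed = "{'count': 146996, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}}"

# The text is a Python dict literal, so ast.literal_eval can parse it safely.
response = ast.literal_eval(printed)

num_documents = response["count"]  # documents stored in docstrings_index
shards = response["_shards"]       # shards that handled the count request

print(f"documents in index: {num_documents}")
print(
    f"shards queried: {shards['total']}, "
    f"succeeded: {shards['successful']}, failed: {shards['failed']}"
)
```

Read this way, a total and successful value of 1 only says that the index lives on a single shard and that the request to it succeeded.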

shahendahatem avatar Aug 13 '23 07:08 shahendahatem
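
For viewing the result as (head, relation, tail) triples, one option is to read the graph file line by line. The sketch below assumes the graph is serialized as N-Quads (the .nq file mentioned earlier in the thread) and uses a deliberately simple regular expression instead of a full RDF parser, so it may skip unusual lines. The two sample lines, the example.org URIs, and the `parse_nquads` helper are made up for illustration and do not come from graph4code.

```python
import re

# One RDF term: an IRI in angle brackets, a blank node, or a quoted literal
# with an optional datatype or language tag.
TERM = r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'

# An N-Quads statement: subject, predicate, object, optional graph label, then "."
QUAD = re.compile(rf'^\s*{TERM}\s+{TERM}\s+{TERM}(?:\s+{TERM})?\s*\.\s*$')


def parse_nquads(lines):
    """Return (head, relation, tail) tuples from an iterable of N-Quads lines."""
    triples = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = QUAD.match(line)
        if match is None:
            continue  # skip lines this simple pattern cannot handle
        head, relation, tail, _graph = match.groups()
        triples.append((head, relation, tail))
    return triples


# Hypothetical sample data, only to show the shape of the result.
SAMPLE = """\
<http://example.org/sklearn.svm.SVC> <http://example.org/hasMethod> <http://example.org/sklearn.svm.SVC.fit> <http://example.org/graph> .
<http://example.org/sklearn.svm.SVC.fit> <http://www.w3.org/2000/01/rdf-schema#label> "fit" <http://example.org/graph> .
"""

triples = parse_nquads(SAMPLE.splitlines())
for head, relation, tail in triples:
    print(head, relation, tail)
```

Because `parse_nquads` accepts any iterable of strings, an open text file can be passed to it directly. Each returned tuple keeps the predicate in the middle position, which is the relation label an edge would need when the triples are drawn as a graph.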