Code or tool for pre-processing source code

Open betterenvi opened this issue 6 years ago • 20 comments

Can you also provide the code or tool for pre-processing source code (parsing source code, extracting API sequences, etc.)? Thanks!

betterenvi avatar Nov 19 '18 13:11 betterenvi

Dear Author, we evaluated the deep code search model from your paper on our own search codebase. We want to improve the current search performance, starting with API sequence extraction, because we found a gap when comparing our results with yours.

We currently parse the AST of each extracted function and then extract the API sequence, so dependencies on other classes are missing. According to Section 4.1.1 of the DeepAPI paper, your extraction works on the whole project and uses an extraction algorithm to analyze the dependencies across the project. Could you please share the code for the extraction algorithm and API sequence extraction, or give us some pointers?
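
To make the gap concrete, here is a minimal Python sketch. It is not the DeepAPI extractor. It assumes the calls have already been pulled from the AST as (receiver, method) pairs, and the names `resolve_calls`, `local_types`, and `project_fields` are hypothetical. With only a function's local declarations, a receiver declared elsewhere (for example as a class field) cannot be typed and its call is lost; a project-level table of declarations recovers it.

```python
def resolve_calls(calls, local_types, project_fields=None):
    """Map (receiver, method) pairs to 'Type.method' strings.

    calls          -- (receiver variable, method name) pairs from one function
    local_types    -- variable -> type, from declarations inside that function
    project_fields -- variable -> type, from declarations elsewhere in the
                      project (a hypothetical project-level table); optional
    """
    sequence = []
    for receiver, method in calls:
        receiver_type = local_types.get(receiver)
        if receiver_type is None and project_fields is not None:
            receiver_type = project_fields.get(receiver)
        if receiver_type is None:
            continue  # type unknown: the call cannot be named, so it is dropped
        sequence.append(f"{receiver_type}.{method}")
    return sequence


# Illustrative data only: 'log' is declared outside the function being parsed.
calls = [("reader", "readLine"), ("log", "info"), ("reader", "close")]
local_types = {"reader": "BufferedReader"}
project_fields = {"log": "Logger"}

function_only = resolve_calls(calls, local_types)
with_project = resolve_calls(calls, local_types, project_fields)
print(function_only)
print(with_project)
```

In this sketch the function-only run silently skips the `log.info` call, which is the kind of gap a whole-project analysis is meant to close.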

Many thanks.

ttbuffey avatar Aug 13 '19 02:08 ttbuffey

@ttbuffey please provide your email address.

guxd avatar Aug 19 '19 08:08 guxd

@guxd [email protected]

ttbuffey avatar Aug 19 '19 08:08 ttbuffey

@guxd I want to confirm: in the Code2APIseq.zip project, the entry file is Code2APIseq.java, and the project takes a function body as input and returns its API sequence, right?

It doesn't contain the code for analyzing the dependencies of the whole project, right?

ttbuffey avatar Aug 20 '19 06:08 ttbuffey

Yes, that part is omitted.

guxd avatar Aug 20 '19 07:08 guxd

@guxd Thanks again for sharing the project with me. I think project dependency analysis is a key part of API sequence extraction, right? Could you please also share that part of the code?

We have really hit a bottleneck in improving search performance. Judging from the code you shared, implementing it myself would be difficult and take a tremendous amount of time, especially since I am not proficient in Java.

I have implemented per-function API extraction in Python using the javalang library (https://github.com/c2nes/javalang) that you shared with us before, but javalang doesn't support project dependency analysis.

I sincerely hope you can help again.

ttbuffey avatar Aug 20 '19 07:08 ttbuffey

The code for dependency analysis is included in the package (see line 73 of ObjSeqBuilder.java). We just modified the main function.

guxd avatar Aug 20 '19 07:08 guxd

thanks for your confirmation

ttbuffey avatar Aug 20 '19 08:08 ttbuffey

@guxd I have read the code and have a few questions, listed below. I would appreciate any guidance.

  • How is the whole-project dependency analysis implemented, and what does the main function look like? Could you please describe the main steps? As far as I understand, line 73 of ObjSeqBuilder.java parses the project's dependencies, but I don't know how to pass the whole project's information to this function.

  • After the project dependencies are analyzed, they are kept in the callGraph and Datagraph, but how are they applied to the AST parsing of a single function?

  • Regarding the current main function: I replaced the function body defined by the parameter code with our own code, and the result is empty. From the code related to JDKAPI.java, it seems that calls not listed in this file are filtered out. Can you explain the purpose of this file?

Thanks very much. I have tried to understand the code with my limited knowledge.🤦‍♀️

ttbuffey avatar Aug 22 '19 10:08 ttbuffey

@guxd Dear author, could you please give me some pointers on the above questions? Sincerely wish you the best.

ttbuffey avatar Aug 26 '19 02:08 ttbuffey

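@guxd Could you briefly describe the key steps in the main function for parsing the whole project's dependencies? How is the project information passed into the rest of the implementation?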

ttbuffey avatar Sep 10 '19 09:09 ttbuffey

Hello, did you figure out how to do the dependency analysis?

JiyangZhang avatar Jan 02 '20 23:01 JiyangZhang

@JiyangZhang We followed GrouMiner and its code for the project dependency analysis. The code I provided to @ttbuffey was a simplified version of GrouMiner for a quick demo.

guxd avatar Jan 03 '20 03:01 guxd

Thank you very much for the reply! The link you gave does not seem to work. Is all the code for extracting API sequences in Code2APISeq.zip at this Google Drive link? https://drive.google.com/drive/folders/1jBKMWZr5ZEyLaLgH34M7AjJ2v52Cq5vv

JiyangZhang avatar Jan 03 '20 03:01 JiyangZhang

The link works on my network. Yes, it contains all the code for extracting API sequences.

guxd avatar Jan 03 '20 04:01 guxd

Great, thanks. But I am not sure I am using it correctly. I ran into the same problem as @ttbuffey: I substituted the code with the example in Fig. 4 of your paper, but got 'Empty' as the result, and I have no idea why. Thanks, and sorry for disturbing you.

JiyangZhang avatar Jan 03 '20 04:01 JiyangZhang

Could you check whether the APIs in the example are included in JDKAPI.java? @ttbuffey This file is used to filter out non-JDK APIs. The 'Empty' result could be due to this filtering. Besides, DeepAPI used a more complicated version of GrouMiner than the provided demo extractor.
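
The filtering effect can be illustrated with a small Python sketch. It is an assumption about the behavior, not the code in JDKAPI.java: the allowlist contents and the names `JDK_TYPES` and `filter_jdk_apis` are hypothetical. If every call in a function targets a project-specific class, nothing survives the filter and the result is empty.

```python
# Hypothetical, deliberately tiny allowlist; the real file is not reproduced here.
JDK_TYPES = {"BufferedReader", "FileReader", "String"}


def filter_jdk_apis(api_sequence, jdk_types=JDK_TYPES):
    """Keep only 'Type.method' entries whose Type is in the allowlist."""
    kept = []
    for api in api_sequence:
        type_name = api.split(".", 1)[0]  # text before the first dot
        if type_name in jdk_types:
            kept.append(api)
    return kept


# Illustrative sequences only; the class names in own_code are made up.
jdk_code = ["FileReader.new", "BufferedReader.new", "BufferedReader.readLine"]
own_code = ["OrderService.load", "OrderRepo.findById"]

jdk_result = filter_jdk_apis(jdk_code)
own_result = filter_jdk_apis(own_code)
print(jdk_result)
print(own_result)
```

Under these assumptions, the second sequence comes back empty even though the extractor found calls, so an empty result does not by itself mean the parsing failed.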

guxd avatar Jan 03 '20 04:01 guxd

Hi, could you share the code you used to extract the API sequences? I am working on a project to create a dataset of methods' API sequences.

JiyangZhang avatar Jan 03 '20 22:01 JiyangZhang

I don't have the raw code at hand now. The demo code is very close to the code that DeepAPI used.

guxd avatar Jan 08 '20 09:01 guxd

Hello! I am a researcher as well. Could you please send me a copy of Code2API.zip too? I need to generate API sequences from new training samples. My email is [email protected]. Thanks!

Moshiii avatar Oct 12 '20 14:10 Moshiii