Are the models in the Implementations section reimplemented by yourselves?

Open skye95git opened this issue 2 years ago • 4 comments

Thanks for your great work! I have a few questions:

  1. Are the models in the Implementations section reimplemented by yourselves, or are they collected from the official open-source implementations?
  2. The DeepCS link is broken.
  3. In the Code Retrieval (Search) section, is there a pre-training implementation for CodeBERT or GraphCodeBERT?
  4. Does the dataset preprocessing produce the data-flow graph and control-flow graph corresponding to the code?

skye95git avatar May 27 '22 03:05 skye95git

Hi skye95git, thanks for your interest in our work. I will ask our team members to answer your questions. For the CFG and DFG part, we currently recommend our team members' tool SVF (https://github.com/SVF-tools/SVF).

wanyao1992 avatar May 27 '22 03:05 wanyao1992

Answers:

  1. Some of the models are open-source but are implemented on different platforms (such as Torch7 or TF). We either ported them to NaturalCC or re-implemented them based on the papers or GitHub repos.
  2. We will check it out.
  3. The authors of CodeBERT and GraphCodeBERT have not released their pre-training scripts, and we do not have sufficient resources to re-implement them. We (MSRA) don't plan to release the pre-training code in the near future. :(
  4. Not yet. You can refer to Data-flow and control-flow graphs for Java.

whatsmyname avatar May 27 '22 03:05 whatsmyname

Hi @skye95git, I noticed that you also asked a question in the deepcs repo's issue "Evaluation Benchmark on the trained model" (#16). Have you tried re-training DeepCS on the CodeSearchNet dataset? Maybe we can discuss it.

isHuangXin avatar Jun 01 '22 05:06 isHuangXin

Unfortunately, I only retrained the model on the data set mentioned in the paper.

skye95git avatar Jun 01 '22 07:06 skye95git