code-docstring-corpus
Question about creating a dataset format for NeuralCodeSum
Hello @Avmb, I have a question about the dataset format of NeuralCodeSum. When I checked https://github.com/wasiahmad/NeuralCodeSum/tree/master/data, the dataset there appears to come from this repository, as you provided the dataset to NeuralCodeSum.
Could you tell me how you produced the dataset in the NeuralCodeSum format? It looks like a list of word tokens with underscores and other characters removed. If there is a script, or some other way, to parse code into this dataset format, I would like to know about it.
Thank you:)
I'm not sure what processing NeuralCodeSum uses. The dataset was created using these scripts: https://github.com/EdinburghNLP/code-docstring-corpus/tree/master/scripts
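For illustration only, here is a minimal Python sketch of the kind of step described in the question: turning code into a flat list of word tokens with underscores removed. It is an assumption about what such preprocessing could look like, not the confirmed NeuralCodeSum pipeline and not one of the scripts linked above. The function names `split_identifier` and `code_to_subtokens` are hypothetical, and the camelCase splitting and lowercasing are guesses.

```python
import re


def split_identifier(token):
    """Split one identifier into lowercase subtokens (hypothetical helper).

    Splits on underscores first, then on camelCase boundaries and digits.
    """
    parts = []
    for chunk in token.split("_"):
        # Acronym runs, capitalized or lowercase words, and digit runs.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return [p.lower() for p in parts]


def code_to_subtokens(code):
    """Turn a code string into a flat token list (hypothetical helper).

    Identifiers are split into subtokens; numbers and punctuation are kept
    as single tokens. Whitespace is discarded.
    """
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[^\sA-Za-z0-9_]", code)
    out = []
    for tok in tokens:
        if re.match(r"[A-Za-z_]", tok):
            out.extend(split_identifier(tok))
        else:
            out.append(tok)
    return out


if __name__ == "__main__":
    print(" ".join(code_to_subtokens("def get_file_name(self):")))
```

Whether punctuation should be kept, and whether tokens should be lowercased, depends on the format the target model expects, so compare the output against a few lines of the existing data files before relying on it.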