code2vec

Source code parsing pipeline

Open dhas opened this issue 4 years ago • 4 comments

Hi @sonoisa,

I couldn't work out how you produced the dataset in your code2vec/data directory. Could you clarify your source code parsing pipeline? If I understand correctly, you started with the parsed tokens serialized as JSON from http://groups.inf.ed.ac.uk/cup/codeattention/ and converted them into the *.txt files in code2vec/data. Am I right?

Would you be able to add the code for this conversion to the repo? I need to parse sources written in C, which is why I'm looking for a clearer picture of the parsing step.

Thanks

dhas, Jun 08 '20 05:06