Hoppity dataset generation
Hi,
I'm trying to run the data generation script using the cooked graphs from my own dataset. However, I see that your code uses a file called hoppity_cg.tar.gz (https://github.com/google-research/plur/blob/main/plur/stage_1/hoppity_single_ast_diff_dataset.py#L61) to get some JSON files. What is this file used for? We can't find it in the hoppity repository. Is it the result of some pre-processing done on your end?
Could you please help us understand how you generated the hoppity_cg.tar.gz file?
'hoppity_cg.tar.gz': {
'url': 'https://drive.google.com/u/0/uc?id=1JdXaehWO4UocjXqIXzWtUmVpRWWBtqmE&export=download',
'sha1sum': '9f4a635408f86974a8e9739769d3ed2a52c2b907',
}
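To make the question concrete, here is a minimal sketch of how a downloaded copy of the archive could be checked against the sha1sum listed above and how the JSON files inside it could be listed. The local path and both helper names (`sha1_of_file`, `list_json_members`) are assumptions for illustration and are not part of PLUR.

```python
import hashlib
import os
import tarfile

# Checksum copied from the dictionary entry quoted above.
EXPECTED_SHA1 = '9f4a635408f86974a8e9739769d3ed2a52c2b907'


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def list_json_members(path):
    """Return the names of regular .json files inside a gzipped tar archive."""
    with tarfile.open(path, 'r:gz') as tar:
        return [m.name for m in tar.getmembers()
                if m.isfile() and m.name.endswith('.json')]


if __name__ == '__main__':
    # Hypothetical local path; adjust to wherever the archive was saved.
    local_path = 'hoppity_cg.tar.gz'
    if os.path.exists(local_path):
        print('checksum matches:', sha1_of_file(local_path) == EXPECTED_SHA1)
        for name in list_json_members(local_path)[:10]:
            print(name)
```

This only inspects the archive. It does not show how the JSON files were produced, which is what we are asking about.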
How is this file generated? Could you please provide the script that generates these intermediate files as part of the artefact?