methods2test
Dataset Generation for Project: Missing Information
Firstly, thanks for sharing this artefact. I've been browsing through the repository and I've found your implementation and documentation to be quite valuable.
However, one piece I am particularly interested in seems to be missing from the repo: the data preprocessing and dataset creation code. I wasn't able to find the code you used for mining GitHub repositories or for mapping each test case to its focal method and focal class. This is essential for anyone looking to replicate your process or understand it more thoroughly.
I understand that this code can be complex and might still be a work in progress, but I believe even incomplete scripts would be of great benefit to me and others following this project. I would appreciate it if you could share these parts of your work. Please let me know if you need any help completing or cleaning them up.
I am more than willing to contribute.
Hi @smith-co, have you checked out the corresponding paper (https://arxiv.org/pdf/2009.05617.pdf)? It describes the data collection in more detail than this repository does and could be a better starting point for replicating or understanding the process.