[feature request] PGA: download n most starred
Feedback obtained from this doc.
PGA wouldn't let me grab the 100 most-starred repositories in language X, so I used multitool to generate that list instead.
This seems easy to add if PGA kept the star count for each repository.
I imagine this working in conjunction with the existing filters, so downloading the top 100 Java projects would be something like pga get -l java --top 100.
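To make the idea concrete, here is a rough sketch of the selection logic in Python. It is only an illustration: the `Repo` record, its `stars` field, and the `top_starred` helper are hypothetical names, the sample entries are made up, and the PGA index does not carry a star count today. It just shows a language filter combined with a top-N cut.

```python
import heapq
from typing import Iterable, List, NamedTuple, Tuple


class Repo(NamedTuple):
    """Hypothetical index record; the `stars` field is an assumption."""
    url: str
    langs: Tuple[str, ...]
    stars: int


def top_starred(repos: Iterable[Repo], language: str, n: int) -> List[Repo]:
    """Return the n most-starred repos that contain the given language."""
    wanted = language.lower()
    # Apply the language filter first, as the existing filters would.
    matching = (r for r in repos if wanted in (l.lower() for l in r.langs))
    # nlargest returns the results ordered from most to fewest stars.
    return heapq.nlargest(n, matching, key=lambda r: r.stars)


# Made-up sample data, for illustration only.
SAMPLE = [
    Repo("https://example.com/a", ("Java",), 500),
    Repo("https://example.com/b", ("Go",), 900),
    Repo("https://example.com/c", ("Java", "Kotlin"), 1200),
    Repo("https://example.com/d", ("Java",), 50),
]

if __name__ == "__main__":
    for repo in top_starred(SAMPLE, "java", 2):
        print(repo.stars, repo.url)
```

`heapq.nlargest` avoids sorting the whole index when n is small compared to the number of repositories.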
👍 For this to happen, https://github.com/src-d/datasets/issues/43, the ⭐️ issue that has been open here for a while, must be implemented first.
Participants of the ml-on-code workshop logged similar feedback there.