Create Project table
Create GitHub Project table to track the progress of the datasets being added.
Features:
- [x] Self-assign issues: set assignee; `#self-assign`, `#self-unassign`
- [x] Ask for help: label with "help-wanted"; `#help-wanted`
- [x] Finished: notify the dataset has been added; `#ready-for-review`
- [x] To be confirmed
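
As a rough illustration of the comment commands listed above, here is a minimal sketch of how a bot might pick them out of an issue comment. This is not the actual automation: the function name `parse_commands`, the `COMMANDS` table, and the exact command spellings are assumptions made for the example, and applying the actions (assigning, labelling, notifying) is left out.

```python
import re
from typing import List

# Hypothetical table: command text -> description of the intended action.
# The descriptions paraphrase the feature list; the spellings are assumed.
COMMANDS = {
    "#self-assign": "set the commenter as assignee",
    "#self-unassign": "remove the commenter as assignee",
    "#help-wanted": 'add the "help-wanted" label',
    "#ready-for-review": "notify that the dataset has been added",
}

# A candidate command is '#' plus lowercase letters/hyphens,
# with whitespace (or the start/end of the comment) on both sides.
_COMMAND_RE = re.compile(r"(?<!\S)#[a-z][a-z-]*(?!\S)")


def parse_commands(comment_body: str) -> List[str]:
    """Return known commands found in a comment, in order, without duplicates."""
    found: List[str] = []
    for token in _COMMAND_RE.findall(comment_body):
        if token in COMMANDS and token not in found:
            found.append(token)
    return found


if __name__ == "__main__":
    for command in parse_commands("Done with the loader. #ready-for-review"):
        print(command, "->", COMMANDS[command])
```

Requiring whitespace around each command keeps ordinary issue references such as `#12` from being treated as commands.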
Fields:
- [x] Priority: we can prioritize datasets according to their relevance/importance/interest
- [ ] Others?
Project table draft: https://github.com/orgs/bigscience-workshop/projects/8
CC: @davanstrien
This looks great - I think that it's nice to keep it fairly simple to avoid too much scrolling across the screen.
The only other possible field we could add is whether the dataset will be moved to a different organization once it is ready. My own feeling is that it's better to discuss that in the issue itself (or using the community tab on the dataset repository if it's already created).
@albertvillanova the only other field I thought of is a flag for whether a dataset has been documented or not.
I am keen to allow people to contribute to the hackathon by contributing dataset cards/documentation. This could be done as part of the process of adding data, but it could also be a separate activity. Maybe one way to track this is to have labels for documented and in-need-of-documentation. This would complicate the closing of issues slightly though. One option would be that:
- you close the data upload part of the ticket with `#data-uploaded`
- at the same time you could also add the label `documented` via `#data-documented`, or not if you didn't do this part.
I think this is probably overcomplicated though, so it might be best to suggest that people open a new issue once they have completed a data upload if they feel that the dataset could use more documentation?
@davanstrien
- As discussed, it may be better to require at least some minimal documentation.
- In relation to moving the dataset to a target org once ready:
  - Do you think it could be part of the Issue template? Or leave it for later discussion, once the dataset is created? I don't have a strong opinion on this.
Also note that the Project board fields (Priority, others...) must be filled in manually by the maintainer who labels the issue as "dataset".
> * As discussed, maybe better requiring some minimal documentation at least.

Agreed, I've updated the guidance to reflect this (and will add it to the reviewer guidance too).
> * In relation to moving the dataset to a target org once ready:
>   * Do you think it could be part of the Issue template? Or leave it for later discussion, once the dataset is created? I don't have a strong opinion on this

My feeling is that it makes sense to either discuss this in the issue (if it becomes obvious early on) or after the dataset has been created. At this point, it might be easier to also move the discussion to the dataset discussion tab.
> Also note that the Project board fields (Priority, others...) must be manually filled by the maintainer who labels the issue as "dataset".

That sounds good, I'll make sure to document this.
I wonder if it would also be nice to add a column for the 'difficulty' of adding a dataset? I think this won't always be possible to predict, but some datasets will be easier than others (a single CSV file vs. nested XML directories). My suggestion would be to have three options: easy, medium, hard.
We could also add this at a later stage if we think it will be useful for participants.