lam Create Project table

Create GitHub Project table to track the progress of the datasets being added.

Features:

- [x] Self-assign issues: set assignee; #self-assign, #self-unassign
- [x] Ask for help: label with "help-wanted"; #help-wanted
- [x] Finished: notify that the dataset has been added; #ready-for-review
  - [x] To be confirmed
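
As a rough illustration of how the comment commands above could map to actions, here is a minimal sketch. The command spellings `#self-assign` and `#self-unassign` are assumed from the feature list, and the `COMMANDS` table, the action names, and `parse_commands` are hypothetical; this is not the automation actually used in the repository.

```python
# Hypothetical mapping from comment commands to (action, argument) pairs.
# Command names follow the feature list; the actions are illustrative only.
COMMANDS = {
    "#self-assign": ("assign", None),
    "#self-unassign": ("unassign", None),
    "#help-wanted": ("add_label", "help-wanted"),
    "#ready-for-review": ("notify", "ready-for-review"),
}


def parse_commands(comment_body):
    """Return the actions triggered by an issue comment, in order of appearance."""
    actions = []
    # Commands are matched as whole whitespace-separated tokens, case-insensitively.
    for token in comment_body.split():
        action = COMMANDS.get(token.lower())
        if action is not None:
            actions.append(action)
    return actions
```

A comment such as "I can take this one. #self-assign" would yield a single assign action; a workflow could then apply it through whatever issue API it has access to.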

Fields:

- [x] Priority: we can prioritize datasets according to their relevance/importance/interest
- [ ] Others?

Jun 29 '22 15:06 albertvillanova

Project table draft: https://github.com/orgs/bigscience-workshop/projects/8

CC: @davanstrien

Jul 05 '22 08:07 albertvillanova

This looks great - I think that it's nice to keep it fairly simple to avoid too much scrolling across the screen.

The only other possible field we could add is whether the dataset will be moved to a different organization once it is ready. My own feeling is that it's better to discuss that in the issue itself (or using the community tab on the dataset repository if it's already created).

Jul 05 '22 09:07 davanstrien

@albertvillanova the only other field I thought of would be a flag for whether a dataset has been documented or not.

I am keen to allow people to contribute to the hackathon by contributing dataset cards/documentation. I can see this either being done as part of the process of adding data, or as a separate activity. Maybe one way to track this is to have labels for documented and in-need-of-documentation. This would complicate the closing of issues slightly, though. One option would be that:

- you close the data upload part of the ticket with #data-uploaded
- at the same time, you add the label documented via #data-documented, or leave it off if you didn't do this part.

I think this is probably overcomplicated, though, so it might be best to suggest that people open a new issue once they have completed a data upload if they feel the dataset could use more documentation.

Jul 05 '22 10:07 davanstrien

@davanstrien

As discussed, it may be better to require at least some minimal documentation.
In relation to moving the dataset to a target org once ready:
- Do you think it could be part of the Issue template? Or leave it for later discussion, once the dataset is created? I don't have a strong opinion on this

Also note that the Project board fields (Priority, others...) must be manually filled by the maintainer who labels the issue as "dataset".

Jul 05 '22 14:07 albertvillanova

> As discussed, it may be better to require at least some minimal documentation.

Agree, I've updated the guidance to reflect this (and will add it to the reviewer guidance too).

> In relation to moving the dataset to a target org once ready:
> - Do you think it could be part of the Issue template? Or leave it for later discussion, once the dataset is created? I don't have a strong opinion on this

My feeling is that it makes sense to either discuss this in the issue (if it becomes obvious early on) or after the dataset has been created. At that point, it might be easier to also move the discussion to the dataset discussion tab.

> Also note that the Project board fields (Priority, others...) must be manually filled by the maintainer who labels the issue as "dataset".

That sounds good, I'll make sure to document this.

Jul 05 '22 15:07 davanstrien

I wonder whether it would also be useful to add a column for the 'difficulty' of adding a dataset. This won't always be possible to predict, but some datasets will be easier than others (a single CSV file vs. nested XML directories). My suggestion would be to have three options: easy, medium, hard.

We could also add this at a later stage if we think it will be useful for participants.

Jul 08 '22 09:07 davanstrien