datasets
Clean up documentation for PublicGitArchive
- [ ] Remove pga.sourced.tech (it is a nightmare to maintain yet another site)
- [ ] Remove web folder
- [ ] Remove examples folder (examples can be part of docs, `docs/examples.md`)
- [ ] Move `borges-indexer`, `pga`, and `multitool` into a new folder `src`
- [ ] Convert PublicGitArchive/README.md into an introduction to PublicGitArchive with a quickstart, including the details of the dataset that are currently only on web (essentially an abridged version of the readme style we use for engine)
- [ ] Move the current content of the README related to `borges-indexer` & `multitool` into `docs/reproducing-pga.md`
@smola @campoy @vmarkovtsev please review this proposal
@eiso Will we still serve the files from pga.sourced.tech?
pga.sourced.tech needs to continue serving the files so that we keep compatibility with released tools.
I have no issue with serving from pga.sourced.tech; I just want to get rid of the website. Since @campoy is on holidays, if @smola and @vmarkovtsev agree, I can start working on this.
I +1
Interesting feedback in this issue: src-d/datasets#77
What's the current status of this? If I were to work on better documenting PGA, how would I start?
@smola can you update on the tasks that Eiso mentioned above, please?
@campoy None is done. I think consolidating into a standard gitbook would be a good first step.
See https://docs.sourced.tech/datasets/publicgitarchive
This is relevant to the conversation we had this morning about website content responsibilities with @marnovo and @vcoisne
This was one year ago, but it feels like a whole epoch has passed. I am finally working on this.