PyDataset
Provide namespaces and an index
This is a fantastic idea, kudos.
As the number of datasets your tool supports grows, you will quickly run out of names, and searching for a particular dataset will become hard.
I'd recommend:
- to require every dataset to have a namespace, e.g. by source ("tld.domain.titanic") or by taxonomy ("history.titanic.victims.#timestamp#").
- to publish a web page with an index of all datasets, listing their namespace and content.
- to define a procedure for adding a dataset to the repo, or to allow datasets to be added through plugins.
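To make the suggestions above concrete, here is a minimal sketch of what a namespaced registry with a browsable index could look like. It is not PyDataset's actual API: `DatasetEntry`, `DatasetRegistry`, and their methods are hypothetical names invented for illustration, and the titles are placeholder text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical metadata record for one dataset."""
    namespace: str  # dotted namespace, e.g. "history.titanic"
    name: str       # short name within the namespace, e.g. "victims"
    title: str      # one-line description shown in the index

    @property
    def qualified_name(self) -> str:
        # Full dotted name used as the unique lookup key.
        return f"{self.namespace}.{self.name}"


class DatasetRegistry:
    """Hypothetical registry keyed by fully qualified dataset name."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse duplicates so two sources cannot silently share a name.
        key = entry.qualified_name
        if key in self._entries:
            raise ValueError(f"dataset already registered: {key}")
        self._entries[key] = entry

    def search(self, prefix):
        # Return qualified names equal to the prefix or nested under it.
        return sorted(
            key for key in self._entries
            if key == prefix or key.startswith(prefix + ".")
        )

    def index(self):
        # Sorted (qualified name, title) pairs, e.g. to render an index page.
        return [(key, self._entries[key].title) for key in sorted(self._entries)]


registry = DatasetRegistry()
registry.register(DatasetEntry("history.titanic", "victims", "Placeholder title A"))
registry.register(DatasetEntry("tld.domain", "titanic", "Placeholder title B"))

for qualified_name, title in registry.index():
    print(qualified_name, "-", title)
```

With something like this, two datasets that are both called "titanic" can coexist under different namespaces, and the same `index()` output could feed the web page. A plugin mechanism would then only need to call `register()`.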
Thanks for the recommendations, I will keep them in mind! For now the repo relies on a single source, RDatasets. But surely there will be plenty of things to implement as it gets larger.