pldb
another possible data source
https://glosario.carpentries.org/ https://twitter.com/gvwilson/status/1566904419857440768
Hello! Could you please add a little more detail about what is needed in this issue?
Thank you very much!
Great question, @adriantintpilver!
My general approach to adding a data source is like this:
- Stumble upon an interesting data source, perhaps a website called https://worlds-best-best-programming-books.xyz
- Start manually adding lines to `*.pldb` files, like `worldsBestProgrammingBooks php 432 books`, to get a "feel" for how best to extract the most useful data from that source and add it to our database. Usually I start small and add complexity as we go.
- Once I have a "feel" for the new data source, and have "linked" about 5-10 files, I will create a single grammar file for the new data source (here is a real one from pldb.com, for example: https://github.com/breck7/pldb/blob/main/database/grammar/helloWorldCollection.grammar)
- I will commit that grammar file and those initial manual entries
- Then I will either just make some tea and add the rest of the data manually, or write a crawler script (https://github.com/breck7/pldb/tree/main/code/crawlers) to import that data source programmatically and keep it updated.
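To make the last step a bit more concrete, here is a rough sketch of what the "linking" part of an import script could look like. It is not how the real crawlers in the repo are written (I'm using Python just for illustration). The keyword and line shape are copied from the made-up example above, and `format_line`, `link_records`, the `<id>.pldb` file naming, and the placeholder file content are all assumptions for this sketch.

```python
from pathlib import Path
import tempfile

# Hypothetical keyword, copied from the made-up example line in this thread.
KEYWORD = "worldsBestProgrammingBooks"


def format_line(language_id: str, book_count: int) -> str:
    """Build one line in the shape of the example: keyword, id, count, 'books'."""
    return f"{KEYWORD} {language_id} {book_count} books"


def link_records(database_dir: Path, records: dict[str, int]) -> list[str]:
    """Append one line per record to <language_id>.pldb and return the ids updated.

    Files that do not exist are skipped rather than created, and a file that
    already has a line for this source is left alone, so reruns are safe.
    """
    updated = []
    for language_id, book_count in sorted(records.items()):
        path = database_dir / f"{language_id}.pldb"  # assumed file naming
        if not path.exists():
            continue
        text = path.read_text(encoding="utf-8")
        if any(line.startswith(KEYWORD + " ") for line in text.splitlines()):
            continue
        # Make sure the new line starts on its own line.
        separator = "" if text.endswith("\n") or not text else "\n"
        new_line = format_line(language_id, book_count)
        path.write_text(text + separator + new_line + "\n", encoding="utf-8")
        updated.append(language_id)
    return updated


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real database folder.
    with tempfile.TemporaryDirectory() as tmp:
        db = Path(tmp)
        (db / "php.pldb").write_text("title PHP\n", encoding="utf-8")  # placeholder content
        print(link_records(db, {"php": 432, "notInDatabase": 7}))
        print((db / "php.pldb").read_text(encoding="utf-8"))
```

The scraping itself is left out on purpose, since it depends entirely on the site. The point is only that the script skips files it doesn't know about and doesn't add a second line when it is run again.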
That's pretty much it!
I know the docs are still pretty sparse, especially around the grammar language, but hopefully that helps a little bit.