Expand and organize examples, and find new sample data sources
The current set of examples can be improved. They need to be triaged by experience level (such as "beginner", "intermediate", "advanced") and/or by document type (such as how-tos, tutorials, or API usage examples).
The examples can also be refined with more comments and clearer context, ideally linking to the corresponding API documentation.
Lastly, it would be interesting to find new sample data sources to use in the examples.
We can start collecting some sources. Streaming would be great, but static is OK too for examples:
- https://github.com/bytewax/awesome-public-real-time-datasets
- https://github.com/ColinEberhardt/awesome-public-streaming-datasets?tab=readme-ov-file
- https://github.com/awesomedata/awesome-public-datasets
- https://www.nyc.gov/html/dot/html/about/datafeeds.shtml
I've also been looking at:
- https://www.usgs.gov/programs/earthquake-hazards/science/earthquake-data
- https://github.com/synthetichealth/synthea or https://github.com/Google-Health/healthcare-streaming-simulator
We now have quite a comprehensive set of examples, broken down by experience level, so I'm going to close this out.