Contributing by adding content - how about outlining potential topics?
Hi everyone, I've read through the wiki and I think it's great, but it's missing information on several concepts and tools that I personally think are relevant for data engineers. What do you think about gathering a list of such topics first, as a kind of to-do list? I think it could encourage more contributions, since potential contributors would know exactly which topics they could focus on and whether their ideas are within the scope of a data engineering wiki.
Some examples of things I think could be added, even if some of them are basic and/or straightforward:
- entries on message brokers/queues: what they are, plus some sample tools such as Apache Kafka
- infrastructure topics: virtualization, containerization, infrastructure as code, etc., with tools like Docker, Kubernetes, Terraform
- data visualization, including some sample industry-standard tools (for instance Power BI) and open-source ones (such as Superset)
- maybe some more programming topics (more languages, for example R or Rust; some OOP concepts; Big O notation; etc.)
We could create an issue for each topic and label the ones that are good for beginners to contribute to. We've done that a few times in the past, and I could revamp the contributing guidelines to make it easier/clearer.
Out of the topics you listed I'd vote for/prioritize the first 2.
I agree that the 2nd seems the most important (and the 1st after that). I could personally contribute to these once I find some spare time, but let's see if somebody beats me to it :)