data-engineer-roadmap
What about modeling / simulation?
Right now this is more an invitation to discussion than a request.
What modeling techniques does a data engineer need and for what use cases? Does anybody do simulation before actually designing a system / solution? If yes: what are the tools / approaches?
The following potential use cases came to mind:
- logical planning: if one wants to build an app that solves real-world problems, one needs to understand those problems. To avoid overlooking some aspects of the problems, or to discover non-obvious ones, one can run a real-world simulation. In Python there is a library called simpy. Does anybody have experience using it? Also, diagrams (e.g. UML) can be used for logical modeling of almost everything: state diagrams, data flow, components etc.
- behaviour of distributed clusters (databases): I saw the following tool, https://github.com/domclick/tuchanka, which imitates the failure of a cluster node, waits for recovery, fixes the failed node, and then continues testing in a cycle. Is anybody doing something similar?
- communication, networking, latency: I don't have much experience working with real-time environments. What are the typical techniques for simulating / modeling real-time connectivity issues? Does one measure performance on a small dataset and then extrapolate the results, or are there other approaches?
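To make the first use case a bit more concrete, here is a minimal sketch of the kind of simulation I have in mind: jobs arrive at a single worker, queue up, and we measure how long they wait. It uses only the Python standard library, not simpy itself (simpy expresses the same idea with processes and resources). The function name `simulate_queue`, its parameters, and the exponentially distributed arrival and service times are all assumptions made for illustration.

```python
import random


def simulate_queue(n_jobs, mean_interarrival, mean_service, seed=0):
    """Simulate a single-worker FIFO queue and return the average wait time.

    Hypothetical helper for illustration: arrival gaps and service times
    are drawn from exponential distributions (an assumption, not a rule).
    """
    rng = random.Random(seed)

    # Generate arrival times first so they do not depend on the service rate.
    arrivals = []
    now = 0.0
    for _ in range(n_jobs):
        now += rng.expovariate(1.0 / mean_interarrival)
        arrivals.append(now)

    worker_free_at = 0.0
    total_wait = 0.0
    for arrival in arrivals:
        # A job starts when it has arrived AND the worker is free.
        start = max(arrival, worker_free_at)
        total_wait += start - arrival
        worker_free_at = start + rng.expovariate(1.0 / mean_service)

    return total_wait / n_jobs


if __name__ == "__main__":
    # Compare a lightly loaded worker with one close to saturation.
    for mean_service in (0.5, 0.9):
        wait = simulate_queue(10_000, mean_interarrival=1.0, mean_service=mean_service)
        print(f"mean service {mean_service}: average wait {wait:.2f}")
```

Even a toy like this shows how waiting time grows as the worker approaches saturation, which is the sort of non-obvious behaviour I would hope a simulation reveals before the system is built.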
Hey there, this is a really interesting topic. I haven't used any of the tools you mentioned. I'm curious what others say 👀
This could be moved to GitHub Discussions