Data Engineering

Open mickeypash opened this issue 4 years ago • 11 comments

I would like to create a roadmap for Data Engineering. I am a backend engineer with some recent experience in Data Engineering. I'm currently making my way through Designing Data-Intensive Applications and watching Andy Pavlo's CMU course on databases. I think I could put together a rough skeleton of the roadmap.

What roadmap is this issue about?

  • [ ] Frontend Roadmap
  • [ ] Backend Roadmap
  • [ ] DevOps Roadmap
  • [ ] All Roadmaps
  • [x] Data Engineering

What is this issue about?

  • [x] Discussion for a pull request I would want to open.
  • [x] Addition of a new item
  • [ ] Removal of some existing item
  • [ ] Changing in arrangement
  • [ ] General suggestion
  • [ ] Sharing an Idea
  • [ ] Something else

Please acknowledge the below listed

  • [x] This is not a duplicate issue. I have searched and there is no existing issue for this.
  • [x] I understand that these roadmaps are highly opinionated. The purpose is not to include everything out there in these roadmaps but to have everything that is most relevant today compared to the other options listed.
  • [x] I have read the contribution docs before opening this issue.

First pass of technologies to add

  • [ ] Kafka
  • [ ] Airflow
  • [ ] dbt
  • [ ] Redis
  • [ ] AWS Redshift
  • [ ] Snowflake
  • [ ] MongoDB
  • [ ] HDFS
  • [ ] HBase
  • [ ] Postgres
  • [ ] MySQL
  • [ ] Apache Avro
  • [ ] AWS Kinesis

mickeypash avatar Sep 09 '20 15:09 mickeypash

Good topic! I'm interested in and thinking about this, too.

I think those technologies could be grouped into some categories:

  • Data Storage (various storage types, data models, and query features)
  • Data Processing (batch & streaming processing)
  • Data Presenting (reporting & data visualization)

Each category (and its sub-categories) includes some tools, technologies, and best practices.

flniu avatar Sep 15 '20 15:09 flniu

Nice one! I'll incorporate that! I've noticed that the DevOps and Backend Roadmaps have some overlap.

mickeypash avatar Sep 15 '20 20:09 mickeypash

I just made a rough sketch to see how the items could be laid out on the page. See what you think: https://excalidraw.com/#json=6203816361852928,FiB-e-dIlUV86gWTJo77uQ

mickeypash avatar Sep 15 '20 20:09 mickeypash

[Image: Screenshot 2020-09-16 at 10.29.48]

mickeypash avatar Sep 16 '20 10:09 mickeypash

Good idea, I'm interested in it too. I'm all for your idea. I mean, you could create an application that covers all these technologies, but people mainly care about storage, so I think it would be better to add MySQL and let people design their own database through your app without coding a database at all. You could create an application with an algorithm that does it.

sain06533 avatar Sep 16 '20 10:09 sain06533

I just discovered this project: https://github.com/datastacktv/data-engineer-roadmap which looks remarkably similar.

mickeypash avatar Sep 16 '20 22:09 mickeypash

Yeah, you are right, it looks similar. I guess you could build a firewall for your project, but for that you need a PC with incredibly high processing power. You could buy something like a Raspberry Pi or Banana Pi, a credit-card-sized computer, which can act as a high-security firewall. You have to protect your app from hackers out there.

sain06533 avatar Sep 17 '20 03:09 sain06533

@mickeypash That's a great idea! Go ahead.

dA505819 avatar Oct 30 '20 21:10 dA505819

I've been a bit slow recently! I'll try to find the motivation to do it one of these days!

mickeypash avatar Oct 31 '20 15:10 mickeypash

I would also like to see data engineering on your roadmap. The one above looks good, though quite big.

deknos avatar Nov 21 '20 09:11 deknos

Hey, I'm interested in contributing to the project. I think Spark with R and Python is an important topic. If there is a Telegram group for the project, please share the link.

RodrigoAB93 avatar Mar 23 '22 12:03 RodrigoAB93

We are now closing any issues requesting new roadmaps to clean up the issues.

However, this doesn't mean that the requested roadmap won't be added to the website. We are putting it in the backlog and will add it to the website depending on how it ranks in the backlog or sooner if we find someone interested in adding that roadmap. Thank you for the request though! 🙏

kamranahmedse avatar Dec 21 '22 13:12 kamranahmedse

Thanks @kamranahmedse! Apologies, I lost motivation for this because I found out that someone had already taken the idea and run with it. They have actually done a better job than I would have, and their roadmap has gained a lot of popularity.

mickeypash avatar Dec 22 '22 11:12 mickeypash