distributed-dataset
Spark & Hadoop compatibility.
It makes sense to work on playing nicely with the Apache Spark & Hadoop ecosystem, so that people can start using distributed-dataset alongside their existing data pipelines.
This is an umbrella issue to track the things we can do to achieve this:
- YARN Backend: https://github.com/utdemir/distributed-dataset/issues/15
- HDFS Support: https://github.com/utdemir/distributed-dataset/issues/16
- Parquet Support: https://github.com/utdemir/distributed-dataset/issues/17