raja sekar

Results 13 comments of raja sekar

We also need good support for multiple schedulers such as Yarn and Kubernetes. Presently, only a basic standalone mode is finished. I am adding this as a feature/enhancement.

I will add the sample data today in both txt and Parquet formats.

Hello @AmbitionXiang, hope you are doing well. Thanks for checking it and pointing out the issue. Yeah, due to data duplication in serialization, it can go out of memory very...

Hello, sorry for the very late reply. I took a break from maintaining the public branch of this library for some time, hence the delay. 240s doesn't seem correct....

I am heavily refactoring the networking layer and the scheduler. It is coming along very nicely. I removed the dependency on capnp. The deployment process is extremely easy now. I feel that this...

For standalone mode, this is fine. But people can always extend it to Yarn, Mesos, or Kubernetes schedulers depending on their environment. In the future, maybe we can provide the alternatives ourselves....

We do need to think about developing a UI for monitoring jobs and tasks, which is essential in production. It is not a priority now, but it is something we will eventually need.

I plan to integrate with Python and possibly other languages (JVM and Go) using Arrow. However, regarding DataFusion, the underlying architecture of this framework closely follows that of Spark and...

That is the plan here. HDFS support is not yet implemented. Once it is done, we will release it as a separate crate.