Dan Harris

49 comments by Dan Harris

Interesting, sounds like some promising improvements :)

> We can just leverage the current DataFusion Metrics system and TaskStatus update rpc and add necessary throttling/checking/aborting logic when we handle the Task finish event in the Ballista Scheduler....

> I think we need a way to distinguish the shuffle files generated by the intermediate stages from those generated by the final result stage. I think this should be straightforward in principle...

Hi @smallzhongfeng. Do you want to deploy Ballista with multiple schedulers outside of Kubernetes? Standalone mode does not currently support multiple schedulers, as all it does is spin up the...

> I just want to be able to deploy multiple schedulers to ensure high availability of the scheduler. @thinkharderdev

So you have two options: 1. Out of the box support...

> If we use the `push` policy, will the task fail when the scheduler switches? Or will the running tasks before the switch fail? @thinkharderdev

Yes, currently the active job...

> ```
> local:
> INFO tokio-runtime-worker ThreadId(02) ballista_scheduler::state::executor_manager: Reserved 0 executor slots in 588.435µs
> etcd:
> INFO tokio-runtime-worker ThreadId(02) ballista_scheduler::state::executor_manager: Reserved 0 executor slots in 77.631762ms
> ```

...

@r4ntix What would you need from `state`? I'm not so sure exposing that as a public interface is a great idea. For the most part the gRPC interface has evolved...

I can take this. We've done this on our fork and can upstream it.

We (Coralogix) built our own binary jsonb format (we call it jsona, for "json arrow") that we are planning on open-sourcing in the next couple of months (hopefully Jan/Feb time frame,...