Results: 403 comments by roryqi

> We also hit this error, but in fact it's an RPC timeout. When we are in compose client mode, the one-time read exception will be caught, then try to read...

How do we judge whether to get enough shuffle servers?

> > How do we judge whether to get enough shuffle servers?
>
> Wait until reaching the shuffle server heartbeat interval, default 10s

Maybe one heartbeat interval is not...

> Got your thought.
>
> > How do the yarn resourcemanager to process this problem?
>
> In HA resourcemanagers, there is no such problems due to the mechanism...

> Is the single coordinator process now maintained in a single Pod or a shared StatefulSet? If using a single Pod, it's not a problem. If using the shared StatefulSet, maybe we should...

I think this design is not friendly for automation deployment. It's my doubt.

> > I think this design is not friendly for automation deployment. It's my doubt.
>
> The coordinator deployment could be controlled to start by operator, this is not...

> Any ideas on this? @jerqi

It's ok for me if we disable this mechanism by default. Is it a safe mode for coordinator?

What's your company's Hadoop version?

I think it's ok for me if we need it in our production environment.