roryqi comments

Results: 403 comments of roryqi

[Problem] Inconsistent blocks when reading shuffle data

> We also hit this error, but it was actually an RPC timeout. In compose client mode, a one-time read exception is caught, and the client then tries to read...

Introduce rejection mechanism when coordinator server is starting

How do we judge whether we have enough shuffle servers?

Introduce rejection mechanism when coordinator server is starting

> > How do we judge whether we have enough shuffle servers?
>
> Wait until the shuffle server heartbeat interval has elapsed (default 10s).

Maybe one heartbeat interval is not...

Introduce rejection mechanism when coordinator server is starting

> Got your thought.
>
> > How does the YARN ResourceManager handle this problem?
>
> With HA ResourceManagers, there is no such problem due to the mechanism...

Introduce rejection mechanism when coordinator server is starting

> Is the single coordinator process currently maintained in a single Pod or a shared StatefulSet? With a single Pod, it's not a problem. With a shared StatefulSet, maybe we should...

Introduce rejection mechanism when coordinator server is starting

I think this design is not friendly to automated deployment. That is my concern.

Introduce rejection mechanism when coordinator server is starting

> > I think this design is not friendly to automated deployment. That is my concern.
>
> The coordinator deployment's startup could be controlled by the operator; this is not...

Introduce rejection mechanism when coordinator server is starting

> Any ideas on this? @jerqi

It's OK with me if we disable this mechanism by default. Is it a safe mode for the coordinator?

Support lower Hadoop versions in client-mr

What's your company's Hadoop version?

Support lower Hadoop versions in client-mr

I think it's OK with me if we need it in our production environment.