Woo-Yeon Lee

Results 20 issues of Woo-Yeon Lee

We now have a Dolphin-specific job server. We can extend it to support other frameworks as well (e.g., Pregel).

jobserver

In Pregel, we currently have a PageRank app, but we have not seriously validated whether it yields the correct result. We need to check it by comparing it with...

Pregel

#1191 introduces JobServer, and its main part is the client code. This client communicates with the Driver through REEF's client message channel, and with other client processes (they submit a...

jobserver

Similar to #1218, we also need a component that tracks resources in JobServer. It will maintain the following information. - The total amount of resources - Resource usage status, including...

jobserver

In JobServer, many jobs run simultaneously, and many other jobs wait to be dispatched by JobScheduler. So we need a component that tracks all these jobs and...

jobserver

The initial version of Pregel (#1170) does not support runtime reconfiguration. Ultimately, we will enable reconfiguration of the Pregel system using ET APIs. To reach this goal, we have to - collect...

Pregel

The current implementation of PlanCompiler is not straightforward because it was adapted from the old EM's plan builder code. We need to clean it up so that it is intuitive for its current role.

cruise-ps

We need to add integration tests for the multi-threaded trainers, which we recently enabled.

cruise-ps
test

The current `AsyncDolphinPlanExecutor` does not consider failures in any step of reconfiguration. So if a failure happens, `PlanExecutor`'s behavior is undefined (e.g., it may deadlock or completely hide the failure). To address...

cruise-ps

The server evaluators of the PS service do not require the data loading service. However, in the current implementation of dolphin-async, the driver submits a context with a data loading service,...

cruise-ps