Yunseong Lee

Results 16 comments of Yunseong Lee

The main reason for this failure is that our code depends on REEF's latest `SNAPSHOT`. Whenever we need new features in REEF, we have to manually build REEF and include...

IMHO, we should find a way to rebuild REEF automatically. I think we can add another project to Jenkins that tests and builds Cay periodically (e.g., daily), and there...

Thanks for the report! We are probably creating too many threads; I think it's time to look into the problem and come up with a better way to manage threads. I'll...

When we implement the multi-threaded Trainer, we can consider two ways for threads to write their gradient updates: 1) synchronized, or 2) Hogwild-style (lock-free). We should build both versions and compare...
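To make the two options concrete, here is a minimal Python sketch. It is not the project's Trainer code: `train`, the constant gradient, and all parameters are hypothetical names chosen for illustration. With `use_lock=True` each thread applies its update under a shared lock; with `use_lock=False` threads write without coordination, Hogwild-style, so some updates may be overwritten.

```python
import threading


def train(num_threads, steps, lr, use_lock):
    """Apply gradient updates to shared weights from several threads."""
    weights = [0.0] * 4          # shared model parameters
    lock = threading.Lock()

    def worker():
        for _ in range(steps):
            # Hypothetical constant gradient; a real trainer would compute
            # it from a mini-batch.
            grad = [1.0] * len(weights)
            if use_lock:
                # Synchronized: the whole update is applied under the lock.
                with lock:
                    for i, g in enumerate(grad):
                        weights[i] -= lr * g
            else:
                # Hogwild-style: no lock, so concurrent read-modify-write
                # cycles may overwrite each other's updates.
                for i, g in enumerate(grad):
                    weights[i] -= lr * g

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return weights
```

In this sketch the synchronized variant always applies every update, while the lock-free variant may lose some of them in exchange for avoiding lock contention. Whether that loss matters for convergence is exactly what a comparison of the two versions would need to measure.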

I'll start by sending a PR that enables multi-threading in MLR, using the consistent (i.e., non-Hogwild) version first.

@gyeongin Yes, both results were very similar.

Totally agreed! I'll prepare a draft and share it in this thread. Thanks for the great suggestion!

I'm sharing the draft: ![image](https://cloud.githubusercontent.com/assets/1748276/25069621/82ff090c-22c1-11e7-98f2-26a43fb0574a.png) You can find the original file [here](https://docs.google.com/presentation/d/1j3X9bWRjzarhjwlOI1S7mHkAhUQ1uhQCORKyioRYdfc/edit#slide=id.p), and I would appreciate any comments or feedback. Thanks!

We can first implement the simplest version of keeping all the updates until requested to aggregate them. Then we can improve it by aggregating beforehand; for example, when the number...
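As a rough illustration of the simplest version only, the sketch below buffers every update per key and aggregates when asked. The class name `UpdateBuffer`, its methods, and the choice of summation as the aggregation are all assumptions made for the example, not the project's API.

```python
class UpdateBuffer:
    """Keeps every update for a key until aggregation is requested."""

    def __init__(self):
        # key -> list of updates that have not been aggregated yet
        self._pending = {}

    def add(self, key, update):
        """Store an update without aggregating it."""
        self._pending.setdefault(key, []).append(update)

    def aggregate(self, key):
        """Sum and clear the pending updates for key.

        Summation is an assumption; any associative aggregation would fit.
        Returns 0 when nothing is pending.
        """
        updates = self._pending.pop(key, [])
        return sum(updates)
```

For example, after `buf.add("w", 1.5)` and `buf.add("w", 2.5)`, calling `buf.aggregate("w")` returns the combined update and leaves the buffer empty for that key. Memory grows with the number of buffered updates, which is the cost that aggregating earlier would reduce.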

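Two strategies are possible for checkpointing: 1) stop-the-world and 2) asynchronous. There is a clear trade-off between performance and correctness: stop-the-world pauses updates to take a consistent snapshot, while asynchronous checkpointing lets updates continue at the risk of an inconsistent snapshot.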
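The two strategies can be sketched as follows. This is a simplified, hypothetical model (the `Model` class and its method names are assumptions for illustration, not the project's code): the stop-the-world checkpoint holds the update lock while copying, whereas the asynchronous one copies without it.

```python
import copy
import threading


class Model:
    """Shared parameters that worker threads update concurrently."""

    def __init__(self, params):
        self.params = dict(params)
        self._lock = threading.Lock()

    def update(self, key, delta):
        with self._lock:
            self.params[key] += delta

    def checkpoint_stop_the_world(self):
        # Holds the lock for the whole copy: every update waits, but the
        # snapshot reflects a single consistent point in time.
        with self._lock:
            return copy.deepcopy(self.params)

    def checkpoint_async(self):
        # Copies entry by entry without the lock: updates are not blocked,
        # but entries may be read at different points in time.
        snapshot = {}
        for key in list(self.params):
            snapshot[key] = self.params[key]
        return snapshot
```

With no concurrent writers both methods return the same snapshot. Under concurrent updates, only the stop-the-world version guarantees that all entries belong to the same model state, at the cost of stalling the writers for the duration of the copy.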