Yunseong Lee
Hi @qubvel, thank you so much for sharing this awesome tool! While I was running `./scripts/convert_from_tf_to_keras.sh`, attempting to convert the TF model into Keras, I encountered the following error: ```...
We are introducing a worker-side cache in #1251 (raised in #803). However, we currently refresh the cached values in a naive way: we fetch new parameters periodically (e.g., every 10...
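To make the "naive" refresh concrete, here is a minimal sketch of a time-based cache, written in Python for brevity rather than in the project's own language. It is not the code from #1251: the class name `PeriodicRefreshCache`, the `fetch` callback, the `refresh_interval` parameter, and the injectable `clock` are all hypothetical, and staleness is checked lazily on read instead of by a background task.

```python
import time


class PeriodicRefreshCache:
    """Hypothetical worker-side cache: re-fetch a value once it is too old."""

    def __init__(self, fetch, refresh_interval, clock=time.monotonic):
        self._fetch = fetch                        # pulls a fresh value, e.g. from a server
        self._refresh_interval = refresh_interval  # age (in clock units) that triggers a re-fetch
        self._clock = clock                        # injectable so the behavior can be tested
        self._entries = {}                         # key -> (value, time it was fetched)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        # Naive policy: refresh purely on age, regardless of whether the
        # value actually changed on the server side.
        if entry is None or now - entry[1] >= self._refresh_interval:
            value = self._fetch(key)
            self._entries[key] = (value, now)
            return value
        return entry[0]
```

The weakness of this policy is visible in `get`: a value that never changes is still re-fetched every interval, while a value that changes often can be served stale for up to one full interval.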
The current Dolphin (based on EM) has a component, `MetricProcessor`, that handles the metrics for optimization. We need a similar component in Dolphin on ET as well.
The current cost model does not require the Dolphin-specific server metrics, but we may want to use those metrics later. In that case, we may have to...
This issue is for integrating Metric-related components. We need to make sure that 1) the metrics are collected from workers and servers, and 2) the Driver processes the metrics as we...
We need to collect metrics (both in servers and workers) in Dolphin on ET. We may be able to reuse the worker-side code and change the Driver-side message handler (for...
The current `DataStorer` service supports the local file system only. The service would be more powerful if it supported a distributed file system such as HDFS.
Sub-issue of #821. We've been working on Trainers that run with multiple threads. When accessing and updating model parameters, three types of synchronization are possible: 1. non-hogwild: shared model with...
So far, we have evaluated convergence by computing the training error only. However, the model can overfit (https://en.wikipedia.org/wiki/Overfitting), which leads to poor predictions on unseen data. Instead, to...
Part of #821. To enable multi-threaded execution in workers, we need to put aside the intermediate updates (e.g., gradient, topic changes) from threads and aggregate the values when we update...
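One way to picture the idea of setting intermediate updates aside and aggregating them is the sketch below, written in Python for brevity rather than in the project's own language. It is an assumption-laden illustration, not the design from #821: `UpdateAggregator`, `add_local`, `drain`, and `run_threads` are hypothetical names, and updates are modeled as plain per-key floats.

```python
import threading
from collections import defaultdict


class UpdateAggregator:
    """Hypothetical buffer that merges per-thread updates before they are applied."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = defaultdict(float)  # key -> summed delta

    def add_local(self, local_updates):
        # Called once per thread, so the lock is taken rarely
        # instead of on every single update.
        with self._lock:
            for key, delta in local_updates.items():
                self._pending[key] += delta

    def drain(self):
        # Hand back the merged updates and reset the buffer.
        with self._lock:
            merged = dict(self._pending)
            self._pending.clear()
        return merged


def run_threads(aggregator, per_thread_updates):
    def work(updates):
        local = defaultdict(float)  # thread-private, no locking needed
        for key, delta in updates:
            local[key] += delta
        aggregator.add_local(local)

    threads = [threading.Thread(target=work, args=(u,)) for u in per_thread_updates]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

After `run_threads` returns, a single call to `drain()` yields one merged update per key, which the caller could then apply to the model in one step.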