Bicheng Ying
Thoughts: 1. Use a barrier function every N iterations, which can help with unstable performance but does not help in heterogeneous situations. 2. Run for a very long time and relied...
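To make the first idea concrete, here is a minimal sketch of a barrier every N iterations. It uses Python's standard `threading.Barrier` as a stand-in for a distributed barrier call; the names `run_worker`, `run_with_periodic_barrier`, `sync_every`, and the sleep-based "training step" are all hypothetical and only illustrate the pattern, not how the library does it.

```python
import threading
import time


def run_worker(rank, barrier, num_iters, sync_every, delays, log):
    """Simulate one worker that synchronizes every `sync_every` iterations."""
    for it in range(1, num_iters + 1):
        time.sleep(delays[rank])  # stand-in for one training step
        if it % sync_every == 0:
            # Block until every worker has reached this iteration.
            barrier.wait()
            log.append((it, rank))


def run_with_periodic_barrier(num_workers=3, num_iters=6, sync_every=3):
    """Run workers of different speeds and record when each passes a barrier."""
    barrier = threading.Barrier(num_workers)
    # Each worker gets a different per-step delay to mimic uneven speed.
    delays = [0.001 * (rank + 1) for rank in range(num_workers)]
    log = []
    threads = [
        threading.Thread(
            target=run_worker,
            args=(rank, barrier, num_iters, sync_every, delays, log),
        )
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In this sketch the fast workers idle at each barrier until the slowest one arrives, so drift is bounded to at most `sync_every` iterations. That is also why the approach does not help when workers are persistently heterogeneous: the slowest worker still sets the pace.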
Hi yangxuanfei, it might be because of a typo: `DistributedPushSumOntimizer`. Note that you wrote Ontimizer instead of Optimizer. The definition is here: https://github.com/Bluefog-Lib/bluefog/blob/master/bluefog/torch/optimizers.py#L1180 If it is not the typo,...
Oh, we didn't expose this optimizer. See the source code here: https://github.com/Bluefog-Lib/bluefog/blob/master/bluefog/torch/__init__.py#L21 We left it out because we are not satisfied with the current implementation and the performance doesn't seem very good. However, if you...
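Both failure modes above (a misspelled name, and a name that is defined in a submodule but not exported from the package's `__init__.py`) can be told apart with a small check. The helper below is a hypothetical sketch using only the standard library; `find_export` is not part of any package mentioned here.

```python
import difflib
import importlib


def find_export(package_name, attr_name, submodules=()):
    """Return (location, suggestions) for `attr_name`.

    `location` is the dotted module name where the attribute was found,
    or None. `suggestions` lists similarly spelled top-level names and is
    only filled in when the attribute was not found anywhere.
    """
    package = importlib.import_module(package_name)
    if hasattr(package, attr_name):
        return package_name, []
    # Not exported at the top level: look in the given submodules.
    for sub in submodules:
        dotted = f"{package_name}.{sub}"
        module = importlib.import_module(dotted)
        if hasattr(module, attr_name):
            return dotted, []
    # Not found at all: it may be a typo, so suggest close spellings.
    suggestions = difflib.get_close_matches(attr_name, dir(package), n=3)
    return None, suggestions
```

Assuming the package is installed, a call such as `find_export("bluefog.torch", "DistributedPushSumOptimizer", submodules=("optimizers",))` would report whether the class is reachable from the top-level package or only from the `optimizers` module linked above.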
Hi, can you post the environment settings? That error probably means `at::Tensor::device` cannot be found among the linked symbols. `at` is the ATen library inside PyTorch. So I guess it...
1. That should not be related to OpenMPI, because it failed to link the symbol (that is on the C++ side, since our backend depends on PyTorch). 2. I don't...
The machine-id-based version of `neighbor_allreduce` is done.
Had discussions in the Cirq Cynque. This sounds reasonable, and it should behave similarly to `cirq.FrozenCircuit` with tags. Just a few points should be kept in mind: 1. Hash value...
Yeah, that is not a good example. Let's use the wait gate as an example: https://github.com/quantumlib/Cirq/blob/17c4e95dae4eedb8ea22bc8abee3e03c6fbef4ca/cirq-google/cirq_google/ops/wait_gate.py#L60-L66 Suppose we replace line 64 with `_duration = cirq.resolve_parameters(self._duration, resolver)` instead; running the test,...