justheuristic issues

Results: 66 issues of justheuristic

Roadmap

This is a global project roadmap that states our priorities for the near future. These priorities can and should be disputed here or elsewhere, after which we will update the...

enhancement

discussion

[Feature Request] quality-of-life changes to examples/albert

This is a collection of miscellaneous small updates that would make examples/albert more efficient or easier to understand. __Note 1:__ if you're looking for a more advanced example where many...

enhancement

help wanted

optimize load_state_from_peers

__problem:__ if many peers join at once, they will all pick one averager (the latest at the time) as the target for loading their initial state. This causes choke points as...

enhancement

help wanted

averaging

Retire prefetch_generator

We're using this dependency in one spot, where it can be replaced with ~5 lines of native code. It would be great to remove it.

enhancement

good first issue

server
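For the prefetch_generator issue above, here is a rough sketch of what a native replacement could look like. It assumes the dependency is used for `BackgroundGenerator`-style prefetching (running an iterator in a background thread and buffering a few items ahead); the issue snippet does not say which call site is involved, so the function name `prefetch` and its `max_prefetch` parameter are illustrative, not the project's actual code.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")
_SENTINEL = object()  # marks the end of the stream (or carries an error)


def prefetch(iterable: Iterable[T], max_prefetch: int = 1) -> Iterator[T]:
    """Yield items from `iterable`, computing up to `max_prefetch` items ahead in a background thread.

    Hypothetical helper: a sketch of a standard-library-only replacement, not hivemind's API.
    """
    buffer: queue.Queue = queue.Queue(maxsize=max_prefetch)

    def _producer() -> None:
        try:
            for item in iterable:
                buffer.put((item, None))  # blocks while the buffer is full
        except BaseException as exc:
            buffer.put((_SENTINEL, exc))  # forward the error to the consumer
        else:
            buffer.put((_SENTINEL, None))  # signal normal completion

    # The thread starts on the first next() call, since this is a generator function.
    threading.Thread(target=_producer, daemon=True).start()
    while True:
        item, exc = buffer.get()
        if item is _SENTINEL:
            if exc is not None:
                raise exc
            return
        yield item
```

Usage would be along the lines of `for batch in prefetch(make_batches(), max_prefetch=4): ...`, where `make_batches` stands in for whatever generator is being wrapped. One caveat of this sketch: if the consumer stops iterating early, the producer thread stays blocked on a full buffer until the process exits, which is why it is marked as a daemon thread.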

[Feature Request] fp16/bf16 gpu params with fp32 offloading in hivemind.Optimizer

It's something we played with a few times but did not end up merging to master. I'm creating this issue so that we don't forget it. It would be great if...

enhancement

help wanted

Let's add a tutorial for training ViT/ResNet50 with Decentralized SGD. The intent is to use the DecentralizedSGD optimizer with the [vissl](https://github.com/facebookresearch/vissl) library for SwAV. Here's a basic tutorial for training SimCLR in...

enhancement

help wanted

server

Tutorial: computer vision

TrainingStateAverager does not average extra_tensors if offload_optimizer == True

[BUG][MINOR] monitor does not recover from failing to load state

[BUG][MINOR] Downloading state during averaging (and vice versa)

Sample experts with virtual batching, learning rate schedule, etc