Ajay Saini issues

Results: 6 issues by Ajay Saini

Stochastic Depth Determinism

There are a few problems with StochasticDepth determinism right now:

- `use_same_gpu_seed` assumes each process has exactly the same seed when instead each process has `seed = user provided seed...

bug

research

`ghost_batchnorm` with a single normalization op

## 🚀 Feature Request

An implementation that made only one call to `F.batch_norm` or `F.layer_norm` would be more performant than the one we have now. Before implementing the change, some...

good first issue

research
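One way the single-call idea in the `ghost_batchnorm` request could look is sketched below. It folds the ghost-batch index into the channel dimension so that one `F.batch_norm` call normalizes every ghost batch independently. This is only an illustration, not the library's implementation: the function name `ghost_batchnorm_single_call` and its arguments are hypothetical, the batch size is assumed to be divisible by the ghost batch size, and running statistics are not tracked.

```python
import torch
import torch.nn.functional as F


def ghost_batchnorm_single_call(x, ghost_batch_size, weight=None, bias=None, eps=1e-5):
    """Hypothetical sketch: per-ghost-batch normalization with one F.batch_norm call.

    x has shape (N, C, *spatial); N is assumed divisible by ghost_batch_size.
    Running statistics are intentionally not handled here.
    """
    n, c = x.shape[0], x.shape[1]
    if n % ghost_batch_size != 0:
        raise ValueError("batch size must be divisible by ghost_batch_size")
    k = n // ghost_batch_size  # number of ghost batches
    spatial = x.shape[2:]

    # (N, C, ...) -> (k, g, C, ...) -> (g, k, C, ...) -> (g, k*C, ...)
    # Each (ghost batch, channel) pair becomes its own channel.
    folded = x.reshape(k, ghost_batch_size, c, *spatial).transpose(0, 1)
    folded = folded.reshape(ghost_batch_size, k * c, *spatial)

    # Tile the affine parameters so folded channel (i*C + j) uses weight[j], bias[j].
    w = weight.repeat(k) if weight is not None else None
    b = bias.repeat(k) if bias is not None else None

    # Single normalization op using batch statistics only.
    out = F.batch_norm(folded, None, None, weight=w, bias=b, training=True, eps=eps)

    # Undo the fold to recover the original (N, C, ...) layout.
    out = out.reshape(ghost_batch_size, k, c, *spatial).transpose(0, 1)
    return out.reshape(n, c, *spatial)
```

The reshapes add a copy in each direction, so whether this is actually faster than looping over chunks would need to be measured.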

`surgery.replace_module_classes` should copy initialization weights where possible

## 🚀 Feature Request

Right now, `surgery.replace_module_classes` does not preserve any initialized model weights for modules that are replaced. However, in cases where there is a 1-1 mapping of weights...

enhancement
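To make the weight-preservation request concrete, here is a minimal sketch of copying weights from a replaced module into its replacement when parameter names and shapes line up. The helper `copy_matching_weights` is hypothetical and is not part of `surgery`; it assumes that "1-1 mapping" can be approximated by matching `state_dict` keys with identical shapes.

```python
import torch
from torch import nn


def copy_matching_weights(old: nn.Module, new: nn.Module) -> list:
    """Hypothetical helper: copy tensors from `old` into `new` where both the
    state_dict key and the tensor shape agree. Returns the copied key names."""
    old_state = old.state_dict()
    new_state = new.state_dict()
    copied = []
    for name, tensor in old_state.items():
        if name in new_state and new_state[name].shape == tensor.shape:
            new_state[name] = tensor.clone()
            copied.append(name)
    # Entries that did not match keep the replacement's own initialization.
    new.load_state_dict(new_state)
    return copied
```

Under this assumption, a replacement with a different shape or parameter naming keeps its fresh initialization, which is the current behavior described in the request.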

Add max_new_tokens to MPT handler

Let's use `max_new_tokens` and mark `max_length` as deprecated; it's much clearer.
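A rough sketch of how such a deprecation could be handled is shown below. It assumes, as an illustration only, that `max_length` counts prompt tokens plus generated tokens while `max_new_tokens` counts generated tokens alone; the function `resolve_max_new_tokens`, its `prompt_len` argument, and the default of 256 are hypothetical and not taken from the handler.

```python
import warnings


def resolve_max_new_tokens(prompt_len, max_new_tokens=None, max_length=None, default=256):
    """Hypothetical helper: return how many new tokens to generate.

    Assumes max_length = prompt tokens + generated tokens, and that
    max_new_tokens takes precedence when both are supplied.
    """
    if max_new_tokens is not None:
        if max_length is not None:
            warnings.warn(
                "`max_length` is deprecated and ignored when `max_new_tokens` is set",
                DeprecationWarning,
                stacklevel=2,
            )
        return max_new_tokens
    if max_length is not None:
        warnings.warn(
            "`max_length` is deprecated; use `max_new_tokens` instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Convert the total budget into a budget for new tokens only.
        return max(max_length - prompt_len, 0)
    return default
```

With `max_new_tokens`, the caller does not need to know the prompt length to reason about how much text will be generated, which is one reading of "much clearer".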

Rename MOSAICML_API_TOKEN to MOSAICML_API_KEY

[Do not merge] Hosted handler for MPT

Opening this for review but do not merge