SwitchTransformers

Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"

3 SwitchTransformers issues

Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.8.11 to 1.9.0. Release notes, sourced from pypa/gh-action-pypi-publish's releases: v1.9.0. 💅 Cosmetic Output Improvements: @woodruffw💰 updated the tense on password nudge in #234; @shenxianpeng💰 helped us disable...

dependencies
github_actions
no-pr-activity

**Describe the bug** A shape mismatch occurs in the computation of the auxiliary loss at https://github.com/kyegomez/SwitchTransformers/blob/36a1ea01448e56242222b68201207a7219d72b4b/switch_transformers/model.py#L70-L74: `load` has shape `[num_experts, dim]`, while `importance` has shape `[batch_size, dim]`. Testing...

bug
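
The shape mismatch reported above can be illustrated with a small, framework-free sketch. It assumes the two tensors are combined elementwise under standard broadcasting rules, which the truncated issue text does not confirm. The sizes used (`num_experts=4`, `batch_size=2`, `dim=8`) and the helper names `broadcast_shape` and `switch_aux_loss` are hypothetical. The second function sketches the load-balancing loss as described in the Switch Transformers paper (`alpha * N * sum_i f_i * P_i`), where both terms are per-expert vectors of length `num_experts`. It is not the repository's code and not necessarily the fix the maintainers would choose.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or None if they are incompatible."""
    out = []
    # Compare dimensions from the trailing axis backwards, padding with 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            return None
        out.append(max(x, y))
    return tuple(reversed(out))


def switch_aux_loss(router_probs, alpha=0.01):
    """Load-balancing loss from the paper: alpha * N * sum_i f_i * P_i.

    router_probs: list of per-token probability lists,
                  shape [num_tokens, num_experts].
    """
    num_tokens = len(router_probs)
    num_experts = len(router_probs[0])

    # f_i: fraction of tokens whose top-1 expert is i.
    counts = [0] * num_experts
    for probs in router_probs:
        counts[probs.index(max(probs))] += 1
    f = [c / num_tokens for c in counts]

    # P_i: mean router probability assigned to expert i.
    p = [sum(tok[i] for tok in router_probs) / num_tokens
         for i in range(num_experts)]

    return alpha * num_experts * sum(fi * pi for fi, pi in zip(f, p))


# Illustrative sizes only (assumptions, not taken from the repository).
num_experts, batch_size, dim = 4, 2, 8
mismatch = broadcast_shape((num_experts, dim), (batch_size, dim))  # incompatible
balanced = switch_aux_loss([[0.9, 0.1], [0.1, 0.9]])
collapsed = switch_aux_loss([[0.9, 0.1], [0.9, 0.1]])
```

In this sketch, `[num_experts, dim]` and `[batch_size, dim]` broadcast only when the leading sizes match or one of them is 1. The paper's formulation avoids the problem by reducing both quantities to one value per expert before multiplying, and the loss is larger when routing collapses onto a single expert than when tokens are spread evenly.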

Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.8.11 to 1.10.3. Release notes, sourced from pypa/gh-action-pypi-publish's releases: v1.10.3. 💅 Cosmetic Output Improvements: In #270, @facutuesca💰 made a follow-up to their previous PR #250, making the...

dependencies
github_actions