Jake Tae
This PR resolves #149 by implementing a `ModelInspector` class similar to transformers' [`DebugUnderflowOverflow`](https://huggingface.co/transformers/debugging.html). Using PyTorch fwd/bwd hooks, it logs multiple things about each of the model's submodules and their args. 1. fwd/bwd...
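A minimal sketch of the hook-based approach described above, assuming hypothetical names (`attach_inspection_hooks` and the recorded tuple format are illustrative, not the PR's actual API):

```python
import torch
import torch.nn as nn

def attach_inspection_hooks(model: nn.Module):
    """Register fwd/bwd hooks on leaf submodules and record per-module stats."""
    records = []

    def fwd_hook(module, inputs, output):
        # Log the max absolute activation value seen in the forward pass.
        if isinstance(output, torch.Tensor):
            records.append((module.__class__.__name__, "fwd",
                            output.abs().max().item()))

    def bwd_hook(module, grad_input, grad_output):
        # Log the max absolute gradient flowing back through this module.
        for g in grad_output:
            if g is not None:
                records.append((module.__class__.__name__, "bwd",
                                g.abs().max().item()))

    for submodule in model.modules():
        if len(list(submodule.children())) == 0:  # leaf modules only
            submodule.register_forward_hook(fwd_hook)
            submodule.register_full_backward_hook(bwd_hook)
    return records

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
records = attach_inspection_hooks(model)
out = model(torch.randn(2, 4))
out.sum().backward()
```

After one fwd/bwd pass, `records` holds one `"fwd"` and one `"bwd"` entry per leaf submodule, which is enough to spot where activations or gradients explode or vanish.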
This PR fixes #203 by using the `args` global variable holder to save and access model parameter counts during gigaflops counting. This is sensible given that the number of model...
In `training.py`, we have https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/fd1e1da967c74e598acfc011031474663ef5845e/megatron/training.py#L818 However, this appears to be wasted compute since the model parameter count does not change. We can refactor the code so that `get_parameters_in_billions` is called...
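The refactor can be sketched as memoizing the count so the expensive walk over all parameters happens once. All names here are illustrative stand-ins, not the actual Megatron-DeepSpeed code; the pretend helper sums plain ints rather than real parameter tensors:

```python
# Hypothetical sketch: compute the parameter count once and reuse the cached
# value inside the training loop, instead of recomputing it every log interval.

_cached_params_in_billions = None

def get_parameters_in_billions(model_params):
    """Pretend helper: sums raw per-tensor parameter counts (plain ints here)."""
    return sum(model_params) / 1e9

def cached_parameters_in_billions(model_params):
    global _cached_params_in_billions
    if _cached_params_in_billions is None:
        _cached_params_in_billions = get_parameters_in_billions(model_params)
    return _cached_params_in_billions
```

Since the model's parameter count is fixed for the duration of training, the cached value stays valid for every subsequent call.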
This PR addresses #114 by checking that Megatron-DS to Hugging Face transformers conversion works as intended.
[WIP] Fixes: #189.
## Motivation #177 allows training iterations to be skipped. This functionality is achieved by using a separate internal counter to keep track of skipped iterations instead of tinkering with...
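The separate-counter idea can be sketched as follows (class and attribute names are hypothetical, not the repo's actual implementation):

```python
# Illustrative sketch: track skipped iterations with their own counter rather
# than mutating the main iteration count.

class IterationTracker:
    def __init__(self):
        self.iteration = 0           # completed optimizer steps
        self.skipped_iterations = 0  # iterations skipped (e.g. by a skip list)

    def step(self, skipped: bool):
        if skipped:
            self.skipped_iterations += 1
        else:
            self.iteration += 1

    @property
    def consumed_iterations(self):
        # Total batches drawn from the data loader, skipped or not.
        return self.iteration + self.skipped_iterations
```

Keeping the two counters apart means the training-progress counter stays meaningful while the total consumed-sample count remains recoverable.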
## Context The general API for TTS inference looks like this:

```python
from TTS.api import tts

model = tts("tts_models/en/ljspeech/glow-tts", gpu=True)
```

`gpu` is a `bool`, so users can only specify...
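One common way to generalize a boolean GPU flag is to accept an explicit device identifier while keeping the old flag for backward compatibility. This is a hypothetical sketch, not TTS's actual API; the function and parameter names are illustrative:

```python
# Hypothetical sketch: replace a `gpu: bool` flag with a `device` string so
# callers can target a specific accelerator (e.g. "cuda:1").

def resolve_device(gpu=False, device=None):
    """Prefer an explicit device string; fall back to the legacy bool flag."""
    if device is not None:
        return device
    return "cuda" if gpu else "cpu"
```

Example usage: `resolve_device(gpu=True)` yields `"cuda"`, while `resolve_device(device="cuda:1")` lets the caller pick a specific GPU, which a plain `bool` cannot express.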
## Issue In `trainer`, the `inspect` module is used to remove extraneous dataset columns. https://github.com/huggingface/transformers/blob/60d51ef5123d949fd8c59cd4d3254e711541d278/src/transformers/trainer.py#L722-L728 However, `torch.compile` modifies the signature of the original model's forward function, so `inspect.signature`...
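The failure mode can be illustrated without `torch` at all: a wrapper that forwards through `*args, **kwargs` hides the original parameter names from `inspect.signature`, so signature-based column filtering sees no usable argument names. The function names below are illustrative:

```python
import functools
import inspect

def forward(input_ids=None, attention_mask=None, labels=None):
    return input_ids

# A compile-style wrapper with a generic signature hides the real one:
def compiled_forward(*args, **kwargs):
    return forward(*args, **kwargs)

# inspect.signature now only sees the generic parameters:
print(list(inspect.signature(compiled_forward).parameters))  # ['args', 'kwargs']

# functools.wraps sets __wrapped__, which inspect.signature follows by
# default, recovering the original argument names:
@functools.wraps(forward)
def wrapped_forward(*args, **kwargs):
    return forward(*args, **kwargs)

print(list(inspect.signature(wrapped_forward).parameters))
# ['input_ids', 'attention_mask', 'labels']
```

In the `torch.compile` case the analogous fix is to inspect the underlying, unwrapped model's forward rather than the compiled wrapper's.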
# What does this PR do? This PR adds FastSpeech2, a transformer-based text-to-speech model. Fixes #15166. ## Motivation While HF transformers has great support for ASR and other audio processing...
#56 set up a basic unit test, but we have to consider what kinds of tests we want to run. This is especially important given that GitHub workflows do not...