# metaseq

Repo for external large-scale work

170 metaseq issues and pull requests, sorted by recently updated

**Patch Description**
Describe your changes

**Testing steps**
Describe how you tested your changes

cla signed

## Issue
- Training requires flattened models, any MP and FSDP.
- Inference requires unflattened models with FSDP 1.

We wanted AML jobs which train a model (which produces a flattened checkpoint as output), reshard...

cla signed
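As a rough illustration of the flattened/unflattened distinction the issue above turns on, here is a minimal sketch of splitting a flattened FSDP buffer back into named parameters. The key layout and the `unflatten` helper are hypothetical, not metaseq's actual checkpoint schema:

```python
import torch

def unflatten(flat_param: torch.Tensor, metadata: list) -> dict:
    """Split one flat 1-D buffer into named tensors using (name, shape) pairs."""
    state_dict, offset = {}, 0
    for name, shape in metadata:
        numel = int(torch.Size(shape).numel())
        # Assumed layout: parameters stored back to back in the flat buffer.
        state_dict[name] = flat_param[offset:offset + numel].view(shape).clone()
        offset += numel
    assert offset == flat_param.numel(), "metadata does not cover the flat buffer"
    return state_dict

# Toy data standing in for one real FSDP shard.
meta = [("decoder.embed_tokens.weight", (4, 2)),
        ("decoder.layers.0.fc1.weight", (3, 2))]
flat = torch.arange(14, dtype=torch.float32)
print({k: v.shape for k, v in unflatten(flat, meta).items()})
```

Going the other direction (flattening unflattened weights for training) has to reproduce exactly the parameter order FSDP expects, which is presumably why a dedicated reshard step sits between train and evaluate in the pipeline described here.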

⚠️ This PR likely won't work directly, but I wanted to share code from our fork that could be adapted for integration. ⚠️

## Issue
(This may not be 100%...

cla signed

⚠️ This PR is not intended to be merged directly, but to demonstrate documentation from our fork. ⚠️

## Issue
Current documentation in the Metaseq repo is very minimal.
- Given...

cla signed

# Issues

## 1. Inconsistent checkpoint filenames saved by trainer
In our pipeline we often have a sequence of steps such as (train, reshard/unflatten, evaluate). The output files of the training...

cla signed
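One way to paper over the filename inconsistency described above is a small normalization pass between pipeline steps. A sketch follows; both filename patterns here are assumptions for illustration, not metaseq's actual naming scheme:

```python
import re
from pathlib import Path

# Hypothetical canonical pattern that downstream reshard/evaluate steps glob for.
CANONICAL = "checkpoint_last-model_part-{part}-shard{shard}.pt"

def normalize(ckpt_dir: str) -> None:
    """Rename trainer outputs onto one canonical pattern (assumed names)."""
    pat = re.compile(r"checkpoint_last-model_part-(\d+)(?:-shard(\d+))?\.pt")
    for path in Path(ckpt_dir).glob("checkpoint_last*.pt"):
        m = pat.fullmatch(path.name)
        if m is None:
            continue
        part, shard = m.group(1), m.group(2) or "0"
        target = path.with_name(CANONICAL.format(part=part, shard=shard))
        if target != path:
            path.rename(target)

normalize(".")  # no-op unless matching files exist
```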

## ❓ Questions and Help

### Before asking:
- [x] search the issues.
- [x] search the docs.

#### What is your question?
The OPT-IML paper evaluates the models on...

question

**Patch Description**
Describe your changes

**Testing steps**
Describe how you tested your changes

cla signed

This addresses Issue 642. When the stop token is \n\n, generation should stop after two consecutive newlines have been generated. Check the previously generated token, and if it is...

cla signed
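A minimal sketch of the early-stopping logic the PR above describes, assuming a GPT-2-style vocabulary where "\n" is a single token; `NEWLINE_ID` and the `step_fn` interface are illustrative, not metaseq's actual generation API:

```python
NEWLINE_ID = 198  # GPT-2 BPE id for "\n"; verify against the actual vocab

def generate_with_double_newline_stop(step_fn, max_steps: int) -> list:
    """step_fn() returns the next token id; stop after two consecutive newlines."""
    tokens = []
    for _ in range(max_steps):
        tok = step_fn()
        # Check the previously generated token before appending the new one.
        if tokens and tokens[-1] == NEWLINE_ID and tok == NEWLINE_ID:
            tokens.append(tok)
            break
        tokens.append(tok)
    return tokens

# Toy usage: a scripted "model" that emits two newlines mid-stream.
stream = iter([10, 11, NEWLINE_ID, NEWLINE_ID, 12, 13])
print(generate_with_double_newline_stop(lambda: next(stream), max_steps=6))
# -> [10, 11, 198, 198]
```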

## 🐛 Bug
I use the script as follows:

```
CUDA_VISIBLE_DEVICES="0, 1, 2, 3" metaseq-train --task streaming_language_modeling \
    data/pile-test/ \
    --num-workers 4 \
    --reset-dataloader \
    --vocab-filename ./vocab/gpt2-vocab.json \
    --merges-filename ./vocab/gpt2-merges.txt \
    ...
```

bug

There are ways to reshard a trained model into an inference model, but how can one retrain the model from the consolidated checkpoint (as with LLaMA)?

question
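For the question above, a hedged sketch of the usual answer: load the consolidated single-file checkpoint into the model and start a fresh optimizer. The "model" key and the file name are assumptions; a consolidated file typically lacks the optimizer and scheduler state a true resume would need:

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stand-in for the real transformer
# Pretend this file came out of a reshard/consolidate step.
torch.save({"model": model.state_dict()}, "consolidated.pt")

resumed = nn.Linear(8, 8)
ckpt = torch.load("consolidated.pt", map_location="cpu")
resumed.load_state_dict(ckpt["model"])

# Optimizer state from the original run is gone; training restarts "fresh"
# from the consolidated weights, similar to fine-tuning a released LLaMA file.
optimizer = torch.optim.AdamW(resumed.parameters(), lr=1e-5)
```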