metaseq
How to finetune from a consolidated model?
There are ways to reshard a trained model into an inference model, but how do I resume training from a consolidated model (like LLaMA)?
You can convert the consolidated model offline into as many shards as you like using reshard_consolidated.py.
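For anyone unsure what resharding a consolidated checkpoint means conceptually, here is a minimal sketch in plain Python. It is not the code in reshard_consolidated.py and does not use any metaseq API; the function names `shard_flat_params` and `unshard_flat_params` are hypothetical, and it assumes the simplest possible scheme (one flat parameter list split into equal-length, zero-padded chunks).

```python
# Illustrative only: NOT metaseq's implementation. All names are hypothetical.

def shard_flat_params(flat, num_shards, pad_value=0.0):
    """Split a flat list of parameters into num_shards equal-length chunks.

    The last chunk is padded with pad_value so every shard has the same size.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be >= 1")
    shard_len = -(-len(flat) // num_shards)  # ceiling division
    padded = list(flat) + [pad_value] * (shard_len * num_shards - len(flat))
    return [padded[i * shard_len:(i + 1) * shard_len] for i in range(num_shards)]


def unshard_flat_params(shards, original_len):
    """Inverse of shard_flat_params: concatenate shards and drop the padding."""
    flat = [x for shard in shards for x in shard]
    return flat[:original_len]


# A toy "consolidated" parameter vector with 10 values, split 4 ways.
consolidated = [float(i) for i in range(10)]
shards = shard_flat_params(consolidated, 4)
restored = unshard_flat_params(shards, len(consolidated))
```

Each training worker would then load only its own shard. The real script handles actual checkpoint files and model structure, so check its arguments in the repo before running it.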