torchscale

Foundation Architecture for (M)LLMs

Results: 33 torchscale issues

Hi! torchscale 0.3.0 does not include LongNet. When will a new version with LongNet be released?

(torchscale) yehuicheng@bdp-gpu04:~/torchscale/examples/fairseq$ torchrun --nproc_per_node=8 --master_port 29501 --nnodes=1 train.py /home/data/dataset/yehuicheng/LongNet_example/DNA_example/longnet_example --num-workers 0 --activation-fn gelu --share-decoder-input-output-embed --validate-interval-updates 1000 --save-interval-updates 1000 --no-epoch-checkpoints --memory-efficient-fp16 --fp16-init-scale 4 --arch transformer --task language_modeling --sample-break-mode none --tokens-per-sample 4096...

I tried the LongNet model script under [torchscale](https://github.com/microsoft/torchscale/tree/main)/[examples](https://github.com/microsoft/torchscale/tree/main/examples)/fairseq/, but ran into an issue: (torchscale) :~/data/results/fairseq$ torchrun --nproc_per_node=8 --master_port 29501 --nnodes=1 train.py /home/data/dataset/yehuicheng/LongNet_example/DNA_example/longnet_example --num-workers 0 --activation-fn gelu --share-decoder-input-output-embed --validate-interval-updates 1000 --save-interval-updates 1000 --no-epoch-checkpoints --memory-efficient-fp16 --fp16-init-scale...