torchscale

Foundation Architecture for (M)LLMs

Results: 26 torchscale issues

I am currently working on a session-based recommendation system problem. This topic sits within NLP and is still under active development, and most of the techniques used to tackle the problem are Matrix...

![image](https://github.com/microsoft/torchscale/assets/29060320/f6dd3af7-ca26-4e4f-98f5-98a44a7c3322) ![image](https://github.com/microsoft/torchscale/assets/29060320/79969050-5c53-4eec-84bd-8e973ebef992) I have already run pip install setuptools, and installing the other packages works fine. My Python version is 3.9.

I created a new conda env and then ran:
```
pip install -r requirements.txt
pip install git+https://github.com/shumingma/fairseq.git@moe
pip install -v -U git+https://github.com/facebookresearch/[email protected]#egg=xformers
```
The packages above installed successfully. However, I can't use longvit: #https://github.com/microsoft/torchscale/blob/main/examples/longvit/longvit.py...

Hi there, I would like to implement LongNet for a project that is inputting numerical data into a transformer, to predict numerical data. However, for my data there are connections...

The longnet example in the readme expects LongNet.py to be named longnet.py. I also went ahead and updated all other files that import classes from longnet.py.

Hi, is there a pre-trained checkpoint available for RetNet, and if not - are there any plans to make it available? Thanks! Related to https://github.com/microsoft/unilm/issues/1474

Bumps [pillow](https://github.com/python-pillow/Pillow) from 10.0.0 to 10.2.0. Release notes Sourced from pillow's releases. 10.2.0 https://pillow.readthedocs.io/en/stable/releasenotes/10.2.0.html Changes Add keep_rgb option when saving JPEG to prevent conversion of RGB colorspace #7553 [@​bgilbert] Trim...

dependencies

torchrun --nproc_per_node=8 --nnodes=1 train.py ../../../fairseq/data-bin/wikitext-103/ --num-workers 0 --activation-fn gelu --share-decoder-input-output-embed --validate-interval-updates 1000 --save-interval-updates 1000 --no-epoch-checkpoints --memory-efficient-fp16 --fp16-init-scale 4 --arch lm_base --task language_modeling --sample-break-mode none --tokens-per-sample 4096 --optimizer adam --adam-betas "(0.9,...