SwissArmyTransformer

SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.

42 SwissArmyTransformer issues

Hello, I would like to ask two questions: 1. What determines the location of the module inserted by `add_mix`? If not specified, is it inserted at the...

Where exactly are the weights under the SAT_HOME path loaded, and by which function?
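
For readers unfamiliar with this kind of setup, the general pattern behind the question can be sketched as below: a weights directory is resolved from an environment variable, with a fallback when the variable is unset. This is only an illustration, not SwissArmyTransformer's actual loading code. The helper name `resolve_model_dir` and the fallback directory `~/.sat_models` are assumptions made for the example.

```python
import os
from pathlib import Path


def resolve_model_dir(name, env_var="SAT_HOME", default="~/.sat_models"):
    """Hypothetical helper: locate the directory holding a named model's weights.

    The fallback directory is an assumption for illustration only.
    """
    # Prefer the environment variable; otherwise fall back to the default root.
    root = Path(os.environ.get(env_var, default)).expanduser()
    # Each model is assumed to live in its own subdirectory under the root.
    return root / name
```

With `SAT_HOME=/data/models`, calling `resolve_model_dir("my-model")` in this sketch yields `/data/models/my-model`. The function in the library that actually performs the load is what the issue asks about and is not answered here.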

https://github.com/THUDM/SwissArmyTransformer/blob/7ed825c5eb07e98d3408c6ddbfcd6e37db1d51c7/examples/cogview/pretrain_gpt2.py#L105 I'm not sure whether the tokenizer of CogView is correctly configured. By the way, nice library! Thank you.

Thank you for your excellent work on model deployment. I have one question, though: it seems that a series of URLs, such as the one for cogagent, are not in the...

https://github.com/THUDM/SwissArmyTransformer/blob/13c8f12324cac92ef8f60aad9e3c17262eda531e/sat/ops/local_attention_function.py:5 How do I install localAttention?

I loaded a checkpoint to continue pretraining and got the following error.
```
[2023-09-25 15:53:27,994] [INFO] [RANK 0] Unable to load optimizer from checkpoint , exiting. Specify --no-load-rng or --finetune to...
```

```
Traceback (most recent call last):
  File "/ssd/ylying/CogVLM/basic_demo/infer_dataset.py", line 164, in <module>
    main()
  File "/ssd/ylying/CogVLM/basic_demo/infer_dataset.py", line 36, in main
    model, model_args = AutoModel.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/sat/model/base_model.py", line 367, in from_pretrained
    mp_split_model_receive(model, use_node_group=use_node_group)...
```

![image](https://github.com/THUDM/SwissArmyTransformer/assets/69197635/86f16f8a-5f06-4c62-bcd5-417c1b232b7e) Alternatively, is there any way (or which parts would need to be modified) to remove the dependency on deepspeed?

https://github.com/THUDM/SwissArmyTransformer/blob/main/sat/model/official/mixtral_model.py

  File "demo.py", line 44, in <module>
    model, model_args =AutoModel.from_pretrained('/code/CogVLM-main/vicuna-7b-v1.5', args=argparse.Namespace(
  File "/opt/conda/lib/python3.8/site-packages/sat/model/base_model.py", line 337, in from_pretrained
    return cls.from_pretrained_base(name, args=args, home_path=home_path, url=url, prefix=prefix, build_only=build_only, overwrite_args=overwrite_args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/sat/model/base_model.py", line 318, in from_pretrained_base...