### 🐛 Describe the bug

```python
with strategy.model_init_context():
    if args.model == 'gpt2':
        actor = GPTActor().cuda()
        critic = GPTCritic().cuda()
```

### Environment

_No response_
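For context, here is a minimal runnable sketch of the snippet above. It assumes the Coati-style RLHF API from ColossalAI, where `GPTActor`/`GPTCritic` live in `coati.models.gpt` and strategies expose `model_init_context()`; the strategy choice and argument parsing are assumptions, not taken from the report:

```python
# A minimal sketch, not the repo's exact script. Assumes the Coati-style API:
# GPTActor/GPTCritic in coati.models.gpt and a strategy exposing
# model_init_context(). Adjust imports to match your installed version.
import argparse

from coati.models.gpt import GPTActor, GPTCritic
from coati.trainer.strategies import NaiveStrategy

parser = argparse.ArgumentParser()
parser.add_argument('--model', default='gpt2')
args = parser.parse_args()

strategy = NaiveStrategy()

# Building the models inside the strategy's init context lets ZeRO/Gemini
# placement hooks intercept parameter allocation at creation time.
with strategy.model_init_context():
    if args.model == 'gpt2':
        actor = GPTActor().cuda()
        critic = GPTCritic().cuda()
```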
```
Traceback (most recent call last):
  File "chinese_abstract.py", line 27, in <module>
    model_revision='v1.0.1',
  File "/data/huap/software/miniconda3/envs/ms/lib/python3.7/site-packages/modelscope/pipelines/builder.py", line 141, in pipeline
    return build_pipeline(cfg, task_name=task)
  File "/data/huap/software/miniconda3/envs/ms/lib/python3.7/site-packages/modelscope/pipelines/builder.py", line 55, in build_pipeline
    cfg, PIPELINES, group_key=task_name, default_args=default_args)...
```
In version 1.3.0, the ROM semantic relevance (ROM语义相关性) model's predictions are random: the same input yields different results across runs.
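A hedged repro sketch for checking the nondeterminism: the task name, model id, and input schema below are all assumptions, not taken from the report; only running the same call twice and comparing scores is the point.

```python
# Hedged repro sketch: task name, model id, and input schema are assumptions.
# A deterministic model should return identical scores for identical calls;
# pin model_revision when comparing behavior across library versions.
from modelscope.pipelines import pipeline

ranker = pipeline(
    task='text-ranking',                                 # assumed task name
    model='damo/nlp_rom_passage-ranking_chinese-base',   # assumed model id
)
query = {'source_sentence': ['如何使用ROM模型'],
         'sentences_to_compare': ['ROM模型使用说明', '今天天气不错']}
print(ranker(query))
print(ranker(query))  # differing scores here would confirm nondeterminism
```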
Is the Chinese LLaMA (中文LLaMA) trained via LoRA?
Megatron-style training log excerpt reporting a NaN gradient norm:

```
... | grad norm: nan | actual seqlen: 2048 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 1.886 | TFLOPs: 78.46 | iteration 5426/...
```
### 🐛 Describe the bug

```
File "/data/llmodel/miniconda3/envs/colossal/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
  return self._call_impl(*args, **kwargs)
File "/data/llmodel/miniconda3/envs/colossal/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
  return forward_call(*args, **kwargs)
File "/data/llmodel/huap/ColossalAI/applications/Colossal-LLaMA-2/colossal_llama2/utils/flash_attention_patch.py", line 133, in attention_forward
  cos,...
```
### Describe the feature

Can somebody give an example of the pretraining data format?
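A hedged illustration of one common pretraining data layout (plain-text jsonl, one document per line); the field name `text` is an assumption rather than this repo's confirmed schema, so check the dataset loader for the exact field names:

```python
# Hedged example of a common pretraining data format: jsonl with one
# document per line. The "text" field name is an assumption, not the
# repo's confirmed schema.
import json

samples = [
    {"text": "First pretraining document ..."},
    {"text": "Second pretraining document ..."},
]
with open("pretrain_sample.jsonl", "w", encoding="utf-8") as f:
    for s in samples:
        f.write(json.dumps(s, ensure_ascii=False) + "\n")
```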
Some entries in instruct_chat_50k.json contain "继续" ("continue"); how should these be understood and used?
Is the pre-tokenized dataset openchat_v3.2_super.train.parquet tokenized with the Llama2 tokenizer or the Mistral tokenizer?
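One hedged way to answer this empirically: decode a few rows with each candidate tokenizer and see which produces coherent text. The column name and checkpoint ids below are assumptions (and the Llama-2 checkpoint is gated on the Hugging Face Hub):

```python
# Hedged diagnostic: decode the same token ids with each candidate tokenizer;
# the one that yields coherent text is the one the dataset was built with.
# The column name "tokens" and the checkpoint ids are assumptions.
import pyarrow.parquet as pq
from transformers import AutoTokenizer

table = pq.read_table("openchat_v3.2_super.train.parquet")
ids = table.column("tokens")[0].as_py()  # first row; assumed column name

for name in ("meta-llama/Llama-2-7b-hf", "mistralai/Mistral-7B-v0.1"):
    tok = AutoTokenizer.from_pretrained(name)
    print(name, "->", tok.decode(ids[:64]))
```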