mindnlp
The get_base_layer() method of the BaseTunerLayer class follows a circular reference, exceeding the maximum recursion depth while calling a Python object.
Describe the bug (Mandatory): While fine-tuning a model, a circular reference involving the get_base_layer() method of the BaseTunerLayer class causes Python to exceed the maximum recursion depth while calling a Python object.
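For illustration, below is a minimal plain-Python sketch of the suspected failure mode (FakeTunerLayer is a hypothetical stand-in, not the actual MindNLP class; this is an assumption about the code path, not a confirmed diagnosis): when get_base_layer() resolves to the adapter layer itself, reading .weight re-enters the weight property until the recursion limit is hit.

# Hypothetical stand-in for BaseTunerLayer (assumption: no MindSpore dependency is
# needed to show the mechanism). If get_base_layer() returns the adapter itself,
# reading base_layer.weight re-enters the weight property and never terminates.
class FakeTunerLayer:
    def get_base_layer(self):
        base_layer = self
        while hasattr(base_layer, "base_layer"):
            base_layer = base_layer.base_layer
        return base_layer

    @property
    def weight(self):
        # Delegates to the resolved base layer, which here is self again.
        return self.get_base_layer().weight


layer = FakeTunerLayer()  # no wrapped base_layer attribute, so get_base_layer() returns self
try:
    _ = layer.weight
except RecursionError as err:
    print("RecursionError:", err)  # e.g. "maximum recursion depth exceeded ..."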
- Hardware Environment (Ascend/GPU/CPU): GPU
- Software Environment (Mandatory):
-- MindSpore version: 2.2.11
-- Python version: 3.9
-- OS platform and distribution (e.g., Linux Ubuntu 16.04):
-- GCC/Compiler version (if compiled from source):
- Execute Mode (PyNative/Graph) (Mandatory):
Related testcase (Mandatory):
# The internal library code where the problem occurs
class BaseTunerLayer(ABC):
    def get_base_layer(self) -> nn.Cell:
        base_layer = self
        while hasattr(base_layer, "base_layer"):
            base_layer = base_layer.base_layer
        return base_layer

    @property
    def weight(self) -> Tensor:
        # This is required for some transformers code, e.g. for T5, weight is accessed as:
        #     self.wo.weight
        # where "wo" is the adapter layer.
        # https://github.com/huggingface/transformers/blob/78f6ed6c70b29c1560780e3869a7ad4c6b3d2710/src/transformers
        # /models/t5/modeling_t5.py#L292
        base_layer = self.get_base_layer()
        weight = base_layer.weight
        return weight
To Reproduce (Mandatory). Steps to reproduce the behavior:
1. Build and install the MindNLP package.
2. Copy one of the examples under mindnlp/llm/peft/ into a new directory.
3. In an environment that has the MindNLP package installed, run the example's train.py to start fine-tuning, e.g.:
$ ./run_peft.sh
Expected behavior (Mandatory):
Screenshots / Logs (Mandatory):
(yy_env) daiyuxin@mindspore:/data1/yy/train_gpt_bigcode$ ./run_peft.sh
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.614 seconds.
Prefix dict has been built successfully.
/home/daiyuxin/miniconda3/envs/yy_env/lib/python3.9/site-packages/datasets/load.py:2516: FutureWarning: 'use_auth_token' was deprecated in favor of 'token' in version 2.14.0 and will be removed in 3.0.0.
You can remove this warning by passing 'token=<use_auth_token>' instead.
warnings.warn(
Size of the train set: 5875. Size of the validation set: 30
0%| | 0/400 [00:00<?, ?it/s]Token indices sequence length is longer than the specified maximum sequence length for this model (7483 > 2048). Running this sequence through the model will result in indexing errors
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [00:03<00:00, 112.99it/s]
The character to token ratio of the dataset is: 3.31
FIM is not supported by tokenizer, disabling FIM
FIM is not supported by tokenizer, disabling FIM
Traceback (most recent call last):
File "/data1/yy/train_gpt_bigcode/train.py", line 187, in
Additional context (Optional):
Which specific example is this?
Both the falcon and gpt examples behave this way.
fixed