
[Question]: The mbart many-to-many model is about 5 GB after download. Shouldn't it run on an A40 with 44 GB of GPU memory? It reports out-of-memory at runtime.

Open Amy234543 opened this issue 1 year ago • 3 comments

Please describe your question

place = "gpu" paddle.set_device(place) model_name = "mbart-large-50-many-to-many-mmt" tokenizer = MBartTokenizer.from_pretrained(model_name) model = MBartForConditionalGeneration.from_pretrained(model_name, src_lang="en_XX") model.eval() def postprocess_response(seq, bos_idx, eos_idx): """Post-process the decoded sequence.""" eos_pos = len(seq) - 1 for i, idx in enumerate(seq): if idx == eos_idx: eos_pos = i break seq = [ idx for idx in seq[:eos_pos + 1] if idx != bos_idx and idx != eos_idx ] res = tokenizer.convert_ids_to_string(seq) return res bos_id = tokenizer.lang_code_to_id["zh_CN"] eos_id = model.mbart.config["eos_token_id"]

inputs = "PaddleNLP is a powerful NLP library with Awesome pre-trained models and easy-to-use interface," input_ids = tokenizer(inputs)["input_ids"] input_ids = paddle.to_tensor(input_ids, dtype='int64').unsqueeze(0) with paddle.no_grad(): outputs, _ = model.generate(input_ids=input_ids, forced_bos_token_id=bos_id,max_length=50,use_faster=False, use_fp16_decoding=False, )

result = postprocess_response(outputs[0].numpy().tolist(), bos_id, eos_id)

print("Model input:", inputs) print("Result:", result)

This is the code I tested, and it fails with an error (screenshot attached: 4787b13f7907af17dd63bc9b20da738).

Amy234543 · Sep 21 '22 05:09

Hi, it runs fine on my side on a V100-32G, with GPU memory usage of about 12 GB. Could another program on your GPU be occupying memory? My PaddlePaddle version is 2.3.1.post101.
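A quick way to check this, assuming Paddle 2.3+ where paddle.device.cuda.memory_allocated is available, is to look at nvidia-smi for other processes and at Paddle's own allocator statistics:

import subprocess
import paddle

# Driver-level view: lists every process currently holding memory on the GPU.
print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)

# Paddle-side allocator statistics (Paddle 2.3+); values are reported in bytes.
paddle.set_device("gpu")
print("allocated now :", paddle.device.cuda.memory_allocated())
print("peak allocated:", paddle.device.cuda.max_memory_allocated())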

gongel · Sep 22 '22 10:09

Indeed, an EB-scale (exabyte-level) GPU memory request is not reasonable. Our judgment is that some value is being fetched abnormally and ends up extremely large, so the computation of an individual op suddenly demands a huge amount of memory.

Could you check whether the error is raised right when execution reaches here? Or run the following directly:

import paddle
a = paddle.to_tensor([0], dtype="int32")
paddle.arange(a, a+1, dtype="int64")

and see whether it raises the same problem.
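For reference, on a healthy install that snippet should just return a one-element tensor; roughly the following, though the exact Tensor repr depends on the Paddle version and the current device:

import paddle

a = paddle.to_tensor([0], dtype="int32")
out = paddle.arange(a, a + 1, dtype="int64")
print(out)
# Expected: a single int64 element equal to 0, e.g.
# Tensor(shape=[1], dtype=int64, place=Place(gpu:0), stop_gradient=True, [0])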

If it does, this is likely an environment issue; please also provide your machine's environment configuration.

You can also try changing the code here to

decoder_inputs_embed_pos = self.decoder_embed_positions(
            decoder_input_ids.shape, past_key_values_length.cuda())

and see whether that fixes it.

FrostML · Sep 29 '22 11:09

Running the following code does raise an error:

import paddle
a = paddle.to_tensor([0], dtype="int32")
paddle.arange(a, a + 1, dtype="int64")

Environment:
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
CUDA 11.7
cuDNN 8.4
paddlepaddle-gpu 2.3.2
PaddleNLP: latest code from git clone

@FrostML

Paddle was installed with pip. Looking at the docs, is Docker the only supported way to install for CUDA 11.7?
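One way to confirm what the pip-installed wheel was actually built against (a minimal sketch using standard Paddle utilities; output details vary by version):

import paddle

print("paddle version :", paddle.__version__)
print("built with CUDA :", paddle.version.cuda())    # CUDA toolkit the wheel was compiled against
print("built with cuDNN:", paddle.version.cudnn())
paddle.utils.run_check()  # small end-to-end GPU check; reports whether PaddlePaddle works on this machine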

Amy234543 · Sep 30 '22 16:09

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] · Dec 07 '22 07:12

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] · Dec 22 '22 00:12