Amy
Running the following code raises an error:

```python
import paddle

a = paddle.to_tensor([0], dtype="int32")
paddle.arange(a, a + 1, dtype="int64")
```

Environment:
- gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
- CUDA 11.7
- cuDNN 8.4
- paddlepaddle-gpu 2.3.2
- paddlenlp: latest, installed via git clone

@FrostML Paddle was installed with pip. I checked the docs; for CUDA 11.7, is Docker the only supported installation method?
@chi2liu Hello, what quantization operations did you perform on m2m100 to speed up model inference?
After the seq2seq model is converted to ONNX, there are three files. When I load them onto the GPU, I get an error: `CUDA_ERROR_OUT_OF_MEMORY: out of memory`. How can I...
> What model are you trying to load @Amy234543? And can you provide the sizes of the generated onnx files?

m2m100_418M. After pruning, the size of the model is 1.4 GB, and...
> Seems like reasonable sizes. Can you provide a script to reproduce the issue? @Amy234543

@NouamaneTazi

```python
import torch
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
from optimum.onnxruntime import ORTModelForSeq2SeqLM

cuda_idx = 0
...
```
@weberrr Is 80 GB of A100 VRAM enough to fine-tune the non-quantized BELLE-7B-2M model? After fine-tuning, the quantized version of the model does not answer correctly.
Hardware (per nvidia-smi): A40 and A100
OS: Linux
Model: Paddle mBART
We need to deploy a translation service API based on the mBART model.