Model pruning fails with TypeError: reduce() of empty sequence with no initial value

Open smallwhi opened this issue 4 years ago • 8 comments

Issue type: Other

PaddleX version
paddlex==1.3.10

Problem description

========================

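The loaded model was Faster R-CNN, but the sensitivity analysis file passed to visualize was the YOLO one, so it could not run properly.

The model I loaded is faster_rcnn_r50_fpn, and the file I pass to visualize is also the faster_rcnn_r50_fpn sensitivity analysis file, yet I still get this error. The error message is as follows: image The code is as follows: image

Was the sensitivity analysis file generated incorrectly? Or can faster_rcnn_r50_fpn not be pruned this way? If this approach does not work, how should I prune my model? Any guidance would be appreciated.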
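For context, the message in the title is what Python's `functools.reduce` raises when it is given an empty sequence and no starting value. The sketch below only reproduces that general behaviour; `product` is a hypothetical helper and is not taken from the PaddleX source.

```python
from functools import reduce
import operator


def product(values, initial=None):
    """Multiply all values together; optionally start from `initial`."""
    if initial is None:
        # With no initial value, reduce() has nothing to return for an
        # empty sequence and raises TypeError.
        return reduce(operator.mul, values)
    # With an initial value, an empty sequence simply returns `initial`.
    return reduce(operator.mul, values, initial)


try:
    product([])
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

safe = product([], initial=1)
print("with initial value:", safe)
```

So whenever this error appears, some list that the code expected to be filled turned out to be empty at the point where it was reduced.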

smallwhi • Jul 18 '21 13:07

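Which versions of paddle and paddleslim are you using?

Please also post your model training script and your sensitivity analysis script so that we can reproduce the problem.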

will-jl944 • Jul 19 '21 07:07

paddleslim-1.1.1 paddlepaddle-2.1.0

Training:

import matplotlib
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import paddlex as pdx

from paddlex.cls import transforms  # overridden by the det import on the next line
from paddlex.det import transforms

train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.Normalize(),
    transforms.ResizeByShort(short_size=800, max_size=1333),
    transforms.Padding(coarsest_stride=32)
])

eval_transforms = transforms.Compose([
    transforms.Normalize(),
    transforms.ResizeByShort(short_size=800, max_size=1333),
    transforms.Padding(coarsest_stride=32),
])

train_dataset = pdx.datasets.VOCDetection(
    data_dir='/home/aistudio/work/train_data',
    file_list='/home/aistudio/work/train_data/train_list.txt',
    label_list='/home/aistudio/work/train_data/labels.txt',
    transforms=train_transforms,
    shuffle=True)
eval_dataset = pdx.datasets.VOCDetection(
    data_dir='/home/aistudio/work/train_data',
    file_list='/home/aistudio/work/train_data/val_list.txt',
    label_list='/home/aistudio/work/train_data/labels.txt',
    transforms=eval_transforms)

num_classes = len(train_dataset.labels) + 1

model = pdx.det.FasterRCNN(num_classes=num_classes, backbone='ResNet50')

model.train(
    num_epochs=200,
    save_interval_epochs=50,
    train_dataset=train_dataset,
    train_batch_size=8,
    eval_dataset=eval_dataset,
    learning_rate=0.01,
    lr_decay_epochs=[8, 11],
    save_dir='output/faster_rcnn_r50_fpn',
    use_vdl=True)

Parameter sensitivity analysis:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,4,5'
import paddlex as pdx

model = pdx.load_model('/home/aistudio/work/output/faster_rcnn_r50_fpn/best_model')

eval_dataset = pdx.datasets.VOCDetection(
    data_dir='/home/aistudio/work/train_data',
    file_list='/home/aistudio/work/train_data/val_list.txt',
    label_list='/home/aistudio/work/train_data/labels.txt',
    transforms=model.eval_transforms)

pdx.slim.prune.analysis(
    model,
    dataset=eval_dataset,
    batch_size=128,
    save_file='faster_rcnn_r50_fpn.sensi.data')
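Before visualizing, it can help to check what the analysis step actually wrote. The sketch below assumes that the `.sensi.data` file is a pickled dict mapping parameter names to `{pruned_ratio: metric_loss}` entries; that layout is an assumption and is not confirmed anywhere in this thread. `summarize_sensitivities` is a hypothetical helper, and the demo runs on a synthetic file rather than real PaddleX output.

```python
import os
import pickle
import tempfile


def summarize_sensitivities(path):
    """Report how many parameters a sensitivity file holds and which are empty."""
    # Assumption: the file is a pickled dict {param_name: {ratio: loss}}.
    # Only unpickle files you generated yourself; pickle can run arbitrary code.
    with open(path, "rb") as f:
        sensitivities = pickle.load(f)
    if not isinstance(sensitivities, dict):
        raise ValueError("expected a dict, got %s" % type(sensitivities).__name__)
    empty = sorted(name for name, losses in sensitivities.items() if not losses)
    return {"num_params": len(sensitivities), "empty_params": empty}


# Demo on a synthetic file (the parameter names below are made up).
fake = {
    "conv1_weights": {0.1: 0.002, 0.2: 0.010},
    "conv2_weights": {},
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.sensi.data")
    with open(demo_path, "wb") as f:
        pickle.dump(fake, f)
    summary = summarize_sensitivities(demo_path)

print(summary)
```

Pointing the helper at `faster_rcnn_r50_fpn.sensi.data` would show whether every parameter has at least one measurement, provided the assumed layout holds.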

smallwhi • Jul 19 '21 07:07

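Thanks for the report. We have reproduced this problem with pdx.slim.visualize in the 1.3.x releases, and it has now been fixed; see PR https://github.com/PaddlePaddle/PaddleX/pull/965 .

The quickest fix on your side is to patch the installed paddlex directly, for example by editing /usr/local/lib/python3.7/site-packages/paddlex/cv/models/slim/visualize.py. To find where paddlex is installed, run pip uninstall paddlex and answer n at the confirmation prompt.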
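The install location can also be found without starting an uninstall. The sketch below uses only the standard library; `find_package_dir` is a hypothetical helper name, and the relative path to `visualize.py` is copied from the reply above.

```python
import importlib.util
import os


def find_package_dir(package_name):
    """Return the directory a top-level package is installed in, or None."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


pkg_dir = find_package_dir("paddlex")
if pkg_dir is None:
    print("paddlex is not importable in this environment")
else:
    # Relative path taken from the maintainer's reply.
    print(os.path.join(pkg_dir, "cv", "models", "slim", "visualize.py"))
```

`pip show paddlex` prints the same information on its `Location:` line.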

FlyingQianMM • Jul 19 '21 08:07

Thank you very much! I can now visualize faster_rcnn_r50_fpn.sensi.data without errors. However, to continue with pruned training I found that I had to downgrade paddle from paddlepaddle==2.1.0 to paddlepaddle==1.8.5, and after the downgrade the pruned training runs on the CPU only and does not use the GPU, which is far too slow. Is there any way to fix this? With paddlepaddle==2.1.0 my training code used the GPU. After the downgrade the same code still requests the GPU but does not use it as expected. My AI Studio compute card still has hours left, and I am running in a GPU environment. A screenshot of the training run: image

The CPU-only message and the estimated training time are as follows:

!!! The CPU_NUM is not specified, you should set CPU_NUM in the environment variable list.
CPU_NUM indicates that how many CPUPlace are used in the current task.
And if this parameter are set as N (equal to the number of physical CPU core) the program may be faster.

export CPU_NUM=24 # for example, set CPU_NUM as number of physical CPU core which is 24.

!!! The default number of CPU_NUM=1.
2021-07-19 20:14:02 [INFO] [TRAIN] Epoch=1/100, Step=2/492, loss=0.269279, loss_cls=0.159569, loss_bbox=0.080821, loss_rpn_cls=0.016622, loss_rpn_bbox=0.012267, lr=0.000852, time_each_step=107.75s, eta=1480:54:55

My training code is as follows:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,3'

from paddlex.det import transforms
import paddlex as pdx


train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(), 
    transforms.Normalize(),
    transforms.ResizeByShort(short_size=800, max_size=1333), 
    transforms.Padding(coarsest_stride=32)
])

eval_transforms = transforms.Compose([
    transforms.Normalize(),
    transforms.ResizeByShort(short_size=800, max_size=1333),
    transforms.Padding(coarsest_stride=32),
])

train_dataset = pdx.datasets.VOCDetection(
                    data_dir='/home/aistudio/work/train_data',
                    file_list='/home/aistudio/work/train_data/train_list.txt',
                    label_list='/home/aistudio/work/train_data/labels.txt',
                    transforms=train_transforms,
                    shuffle=True)
eval_dataset = pdx.datasets.VOCDetection(
                    data_dir='/home/aistudio/work/train_data',
                    file_list='/home/aistudio/work/train_data/val_list.txt',
                    label_list='/home/aistudio/work/train_data/labels.txt',
                    transforms=eval_transforms)

num_classes = len(train_dataset.labels) + 1

model = pdx.det.FasterRCNN(num_classes=num_classes, backbone='ResNet50')

model.train(
    num_epochs=100,
    save_interval_epochs=50,
    train_dataset=train_dataset,
    train_batch_size=8,
    eval_dataset=eval_dataset,
    learning_rate=0.01,
    lr_decay_epochs=[8, 11],
    pretrain_weights='output/faster_rcnn_r50_fpn/best_model',
    save_dir='output/fast_rcnn_r50_fpn_prune',
    sensitivities_file='/home/aistudio/work/faster_rcnn_r50_fpn.sensi.data',
    eval_metric_loss=0.05,
    use_vdl=True)

smallwhi • Jul 19 '21 12:07

Is the paddlepaddle 1.8.5 you installed the GPU build? GPU builds are normally named paddlepaddle-gpu==xxx, for example:
python3 -m pip install paddlepaddle-gpu==1.8.5.post107 -f https://paddlepaddle.org.cn/whl/stable.html

Installation docs: https://www.paddlepaddle.org.cn/documentation/docs/zh/1.8/install/index_cn.html
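A quick way to tell which build is installed is sketched below, under a few assumptions: Python 3.8 or newer (for `importlib.metadata`), and a paddle build that exposes `is_compiled_with_cuda` either at the top level or under `paddle.fluid`. Both helper names are hypothetical.

```python
from importlib import metadata


def installed_paddle_builds():
    """Return {distribution name: version or None} for the CPU and GPU wheels."""
    builds = {}
    for dist in ("paddlepaddle", "paddlepaddle-gpu"):
        try:
            builds[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            builds[dist] = None
    return builds


def compiled_with_cuda():
    """Return True/False if paddle can be imported, or None if it cannot."""
    try:
        import paddle
    except ImportError:
        return None
    # Assumption: newer releases expose the check at the top level,
    # older ones under paddle.fluid.
    check = getattr(paddle, "is_compiled_with_cuda", None)
    if check is None:
        from paddle import fluid
        check = fluid.is_compiled_with_cuda
    return bool(check())


print(installed_paddle_builds())
print("compiled with CUDA:", compiled_with_cuda())
```

If only `paddlepaddle` shows a version and the CUDA check prints `False`, training will fall back to the CPU no matter what `CUDA_VISIBLE_DEVICES` is set to.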

heliqi • Jul 19 '21 12:07

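Thank you very much! After installing paddlepaddle-gpu==1.8.5.post107, training uses the GPU as expected. Because the prompt did not show the -gpu suffix, I had not realized that paddlepaddle and paddlepaddle-gpu are different packages.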

smallwhi • Jul 19 '21 12:07

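I ran pruned training with paddlex using eval_metric_loss=0.05. After pruning fast_rcnn_r50_fpn into fast_rcnn_r50_fpn_prune, the predictions show boxes of several classes with no NMS applied, and the boxes are poorly localized. paddlex==1.3.11. The prediction result on an image is as follows: image The training code is as follows: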

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from paddlex.det import transforms
import paddlex as pdx


train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(), 
    transforms.Normalize(),
    transforms.ResizeByShort(short_size=800, max_size=1333), 
    transforms.Padding(coarsest_stride=32)
])

eval_transforms = transforms.Compose([
    transforms.Normalize(),
    transforms.ResizeByShort(short_size=800, max_size=1333),
    transforms.Padding(coarsest_stride=32),
])

train_dataset = pdx.datasets.VOCDetection(
                    data_dir='/home/aistudio/work/train_data',
                    file_list='/home/aistudio/work/train_data/train_list.txt',
                    label_list='/home/aistudio/work/train_data/labels.txt',
                    transforms=train_transforms,
                    shuffle=True)
eval_dataset = pdx.datasets.VOCDetection(
                    data_dir='/home/aistudio/work/train_data',
                    file_list='/home/aistudio/work/train_data/val_list.txt',
                    label_list='/home/aistudio/work/train_data/labels.txt',
                    transforms=eval_transforms)

num_classes = len(train_dataset.labels) + 1

model = pdx.det.FasterRCNN(num_classes=num_classes, backbone='ResNet50')

model.train(
    num_epochs=150,
    save_interval_epochs=50,
    train_dataset=train_dataset,
    train_batch_size=8,
    eval_dataset=eval_dataset,
    learning_rate=0.01,
    lr_decay_epochs=[8, 11],
    pretrain_weights='output/faster_rcnn_r50_fpn/best_model',
    save_dir='output/fast_rcnn_r50_fpn_prune',
    sensitivities_file='/home/aistudio/work/faster_rcnn_r50_fpn.sensi.data',
    eval_metric_loss=0.05,
    use_vdl=True)

smallwhi • Jul 21 '21 01:07

Model accuracy drops after pruning. When you retrain, try adjusting the training parameters to recover as much accuracy as possible.
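One parameter that may be worth revisiting is the learning-rate schedule. The sketch below is not PaddleX's implementation; it is a plain step-decay model that assumes the rate is multiplied by a factor `gamma` (0.1 here, an assumed value) at each epoch in `lr_decay_epochs`, and it ignores warmup.

```python
import math


def step_decay_lr(base_lr, decay_epochs, epoch, gamma=0.1):
    """Learning rate at `epoch` under a simple step-decay schedule."""
    # Count how many decay boundaries have been reached so far.
    passed = sum(1 for boundary in decay_epochs if epoch >= boundary)
    return base_lr * gamma ** passed


# Values from the training script in this thread: learning_rate=0.01,
# lr_decay_epochs=[8, 11], num_epochs=150.
for epoch in (0, 8, 11, 149):
    print(epoch, step_decay_lr(0.01, [8, 11], epoch))
```

Under these assumptions the rate reaches its lowest value at epoch 11 and stays there for the rest of the 150 epochs, so moving the decay epochs later in the run is one adjustment to try.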

FlyingQianMM • Jul 21 '21 03:07