PaddleVideo
Exception when running prediction on a video
Describe the bug
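While testing the FootballAction project, I replaced the sample video with my own football match video. I then changed the TSN batch_size from 128 to 32, and running predict.py raised the exception below.
I don't know what is causing it. Could you help?
Here is the log: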
aistudio@jupyter-87105-1472898:~/lnwork/predict$ python predict.py
2021-01-23 18:27:50,091-INFO: load model ...
2021-01-23 18:27:50,101-INFO: ---------------- Infer Arguments ----------------
2021-01-23 18:27:50,102-INFO: COMMON:
2021-01-23 18:27:50,102-INFO: fps:5
2021-01-23 18:27:50,102-INFO: use_gpu:True
2021-01-23 18:27:50,102-INFO: imgs_path:data/test/frames/
2021-01-23 18:27:50,102-INFO: pcm_path:data/test/audio/11.pcm
2021-01-23 18:27:50,102-INFO: feature_path:data/test/feature/output_11.pkl
2021-01-23 18:27:50,102-INFO: label_dic:configs/index_label_football_8.json
2021-01-23 18:27:50,102-INFO: props_path:tmp/bmn.txt
2021-01-23 18:27:50,102-INFO: classify_path:tmp/result.json
2021-01-23 18:27:50,102-INFO: clssify_topk:1
2021-01-23 18:27:50,102-INFO: TSN:
2021-01-23 18:27:50,102-INFO: name:TSN
2021-01-23 18:27:50,102-INFO: weight_path:/home/aistudio/lnwork/checkpoints/models_tsn/TSN_epoch36.pdparams
2021-01-23 18:27:50,102-INFO: format:jpg
2021-01-23 18:27:50,102-INFO: num_classes:8
2021-01-23 18:27:50,102-INFO: seg_num:7
2021-01-23 18:27:50,102-INFO: seglen:1
2021-01-23 18:27:50,102-INFO: image_mean:[0.485, 0.456, 0.406]
2021-01-23 18:27:50,102-INFO: image_std:[0.229, 0.224, 0.225]
2021-01-23 18:27:50,102-INFO: num_layers:50
2021-01-23 18:27:50,102-INFO: short_size:256
2021-01-23 18:27:50,102-INFO: target_size:224
2021-01-23 18:27:50,102-INFO: num_reader_threads:12
2021-01-23 18:27:50,102-INFO: buf_size:1024
2021-01-23 18:27:50,102-INFO: batch_size:32
2021-01-23 18:27:50,102-INFO: image_scale:2048
2021-01-23 18:27:50,102-INFO: audio_scale:640
2021-01-23 18:27:50,102-INFO: AUDIO:
2021-01-23 18:27:50,102-INFO: name:AUDIO
2021-01-23 18:27:50,102-INFO: weight_path:/home/aistudio/lnwork/checkpoints/models_audio/audio.pdparams
2021-01-23 18:27:50,102-INFO: sample_rate:16000
2021-01-23 18:27:50,102-INFO: pcm_file:tmp/1.pcm
2021-01-23 18:27:50,102-INFO: feature_names:['audio']
2021-01-23 18:27:50,102-INFO: feature_dims:[[50, 64]]
2021-01-23 18:27:50,103-INFO: lstm_size_audio:1024
2021-01-23 18:27:50,103-INFO: batch_size:32
2021-01-23 18:27:50,103-INFO: BMN:
2021-01-23 18:27:50,103-INFO: name:BMN
2021-01-23 18:27:50,103-INFO: subset:test
2021-01-23 18:27:50,103-INFO: weight_path:/home/aistudio/lnwork/checkpoints/models_bmn/BMN_epoch19.pdparams
2021-01-23 18:27:50,103-INFO: window_step:200
2021-01-23 18:27:50,103-INFO: tscale:200
2021-01-23 18:27:50,103-INFO: dscale:200
2021-01-23 18:27:50,103-INFO: feat_dim:2048
2021-01-23 18:27:50,103-INFO: prop_boundary_ratio:0.5
2021-01-23 18:27:50,103-INFO: num_sample:32
2021-01-23 18:27:50,103-INFO: num_sample_perbin:3
2021-01-23 18:27:50,103-INFO: batch_size:1
2021-01-23 18:27:50,103-INFO: num_threads:8
2021-01-23 18:27:50,103-INFO: nms_thread:0.7
2021-01-23 18:27:50,103-INFO: score_thread:0.01
2021-01-23 18:27:50,103-INFO: output_path:logs/BMN_results
2021-01-23 18:27:50,103-INFO: result_path:logs/BMN_results
2021-01-23 18:27:50,103-INFO: ACTION:
2021-01-23 18:27:50,103-INFO: name:ActionNet
2021-01-23 18:27:50,103-INFO: dataset:lstmdata
2021-01-23 18:27:50,103-INFO: bone_nework:None
2021-01-23 18:27:50,103-INFO: subset:test
2021-01-23 18:27:50,103-INFO: drop_rate:0.5
2021-01-23 18:27:50,103-INFO: feature_num:2
2021-01-23 18:27:50,103-INFO: feature_names:['rgb', 'audio']
2021-01-23 18:27:50,103-INFO: feature_dims:[2048, 1024]
2021-01-23 18:27:50,103-INFO: embedding_size:512
2021-01-23 18:27:50,103-INFO: lstm_size_img:2048
2021-01-23 18:27:50,103-INFO: lstm_size_audio:1024
2021-01-23 18:27:50,103-INFO: num_classes:8
2021-01-23 18:27:50,103-INFO: topk:1
2021-01-23 18:27:50,103-INFO: with_bn:True
2021-01-23 18:27:50,103-INFO: weight_path:/home/aistudio/lnwork/checkpoints/models_lstm/ActionNet_epoch14_acc82.34873749404478.pdparams
2021-01-23 18:27:50,103-INFO: batch_size:1
2021-01-23 18:27:50,103-INFO: nms_thread:0.01
2021-01-23 18:27:50,103-INFO: nms_offset:10
2021-01-23 18:27:50,103-INFO: classify_score_thread:0.05
2021-01-23 18:27:50,103-INFO: iou_score_thread:0.1
2021-01-23 18:27:50,104-INFO: -------------------------------------------------
2021-01-23 18:27:50,104-INFO: TSN
2021-01-23 18:27:50,104-INFO: /home/aistudio/lnwork/checkpoints/models_tsn/TSN_epoch36.pdparams
2021-01-23 18:27:53,196-INFO: AUDIO
2021-01-23 18:27:53,196-INFO: /home/aistudio/lnwork/checkpoints/models_audio/audio.pdparams
2021-01-23 18:27:53,597-INFO: BMN
2021-01-23 18:27:53,597-INFO: /home/aistudio/lnwork/checkpoints/models_bmn/BMN_epoch19.pdparams
2021-01-23 18:29:10,982-INFO: ACTION
2021-01-23 18:29:10,982-INFO: /home/aistudio/lnwork/checkpoints/models_lstm/ActionNet_epoch14_acc82.34873749404478.pdparams
2021-01-23 18:29:18,874-INFO: step0: load model time: 1.4795087655385335 min
2021-01-23 18:29:18,875-INFO: predict ...
2021-01-23 18:29:18,875-INFO: /home/aistudio/lnwork/datasets/EuroCup2016/mp4/1111.mp4
2021-01-23 18:29:18,875-INFO: len of self.video_path 227
W0123 18:29:20.616609 184 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0
W0123 18:29:20.621582 184 device_context.cc:260] device: 0, cuDNN Version: 7.6.
2021-01-23 18:29:23,429-INFO: feature shape (224, 2048) (32, 1024)
2021-01-23 18:29:23,429-INFO: step1: feature extract time: 0.0759026845296224 min
2021-01-23 18:29:23,680-INFO: (26,)
2021-01-23 18:29:23,681-INFO: step2: proposal time: 0.004182108243306478 min
{'image_feature': array([[0.4803195 , 0.32037884, 0.53200346, ..., 0.31350306, 1.0510458 ,
1.4193231 ],
[0.45862836, 0.34426287, 0.50272864, ..., 0.28883642, 1.0431803 ,
1.4018294 ],
[0.338656 , 0.37180123, 0.4288231 , ..., 0.3153581 , 0.9265923 ,
1.5697651 ],
...,
[0.46327233, 0.53256756, 0.11387858, ..., 0.16173136, 0.23192742,
0.3625664 ],
[0.2868911 , 0.44077075, 0.14487827, ..., 0.03247827, 0.45835313,
0.9584287 ],
[0.37922782, 0.45163244, 0.4061401 , ..., 0.10330537, 0.24302031,
0.6173564 ]], dtype=float32), 'audio_feature': array([[0.02114511, 0.01836166, 0.00679044, ..., 0.01285091, 0. ,
0. ],
[0.02131153, 0.10744908, 0.00779211, ..., 0.00548674, 0.08177835,
0.07966843],
[0.08607582, 0. , 0.00150466, ..., 0.09599936, 0.02981716,
0.0011116 ],
...,
[0. , 0.46094784, 0.08641196, ..., 0.35587826, 0. ,
0. ],
[0. , 0.11275528, 0.01857419, ..., 0.23341149, 0. ,
0.0059924 ],
[0. , 0.2097485 , 0. , ..., 0.03459844, 0.07636683,
0. ]], dtype=float32)}
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "predict.py", line 116, in
C++ Call Stacks (More useful to developers):
0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::framework::Tensor::mutable_data(paddle::platform::Place const&, paddle::framework::proto::VarType_Type, unsigned long)
3 paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
4 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, int>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, long>, paddle::operators::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
5 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
6 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
7 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
8 paddle::framework::Executor::RunPartialPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, long, long, bool, bool, bool)
9 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
10 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool, bool)
Python Call Stacks (More useful to users):
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2610, in append_op
attrs=kwargs.get("attrs", None))
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 135, in append_bias_op
attrs={'axis': dim_start})
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/nn.py", line 1732, in fc
pre_activation = helper.append_bias_op(pre_bias, dim_start=num_flatten_dims)
File "/home/aistudio/work/PaddleVideo/FootballAction/predict/models/action/action.py", line 139, in build_model
initializer=fluid.initializer.NormalInitializer(scale=0.0)))
File "/home/aistudio/work/PaddleVideo/FootballAction/predict/prop_net.py", line 72, in __init__
self.infer_model.build_model()
File "predict.py", line 40, in load_model
classify_model = net_prop.ModelProp(infer_configs, "ACTION")
File "predict.py", line 107, in
Error Message Summary:
Error: When calling this method, the Tensor's numel must be equal or larger than zero. Please check Tensor::dims, or Tensor::Resize has been called first. The Tensor's shape is [-1, 4096] now [Hint: Expected numel() >= 0, but received numel():-4096 < 0:0.] at (/paddle/paddle/fluid/framework/tensor.cc:45) [operator < elementwise_add > error]
We have received your question. We will arrange RD to analyze and answer it as soon as possible.
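The log shows that the extracted image and audio features have shapes (224, 2048) and (32, 1024). There are two issues here.
First, the audio and video may be out of sync: the image length T is normally 5 times the audio T. This does not cause the error, though, because the two are aligned later.
Second, the extracted audio feature is 32x1024, which covers less than 40 s (the minimum BMN window is 200 steps, i.e. 40 s). As a result, the BMN proposal cannot fetch audio features of the matching length, and that triggers the error.
I suggest testing with a video longer than 40 s, so that the extracted image features are larger than 200 * 2048 and the audio features larger than 40 * 1024.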
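The length requirement in this reply can be checked with a little arithmetic. The following is a minimal sketch, not code from PaddleVideo: `min_lengths` and `check_features` are hypothetical helper names, the constants are copied from the Infer Arguments log (`fps:5`, `window_step:200`), and it assumes one audio feature per second, which is what "image T is 5 times the audio T" implies.

```python
# Values copied from the "Infer Arguments" log in this issue.
FPS = 5              # COMMON.fps: image features per second of video
WINDOW_STEP = 200    # BMN.window_step: image-feature steps per BMN window


def min_lengths(window_step=WINDOW_STEP, fps=FPS):
    """Return (seconds, image steps, audio steps) needed to fill one BMN window.

    Assumption: one audio feature per second of video.
    """
    seconds = window_step / fps
    return seconds, window_step, int(seconds)


def check_features(image_t, audio_t, window_step=WINDOW_STEP, fps=FPS):
    """Hypothetical helper: list the feature streams too short for one window."""
    _, need_image, need_audio = min_lengths(window_step, fps)
    problems = []
    if image_t < need_image:
        problems.append(f"image features: {image_t} < {need_image}")
    if audio_t < need_audio:
        problems.append(f"audio features: {audio_t} < {need_audio}")
    return problems


# Shapes from the log line "feature shape (224, 2048) (32, 1024)".
print(check_features(224, 32))  # ['audio features: 32 < 40']
```

With the shapes from the log, only the audio stream falls short of one window, which matches the diagnosis in the reply.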
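Could you share your video? I would like to test it.
Video URL: https://yunedit.bj.bcebos.com/football%2F1111.mp4
@wgh1989 With the video provided above, the model detects almost no actions. Are the model's requirements on the input video very strict?
Sorry for the late reply, I had not logged in for a while.
The main cause is in feature extraction: the final data that did not fill a whole batch was being discarded. This has been fixed in the latest version.
I just downloaded the video and ran it with the new code, and it works fine.
As shown in the screenshot, the data_reader of every model now also yields the final partial batch.
In addition, make sure the video is longer than 40 s. If that is too long for you, you can adjust this parameter when training BMN.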
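The partial-batch problem described in this reply can be reproduced in isolation. The sketch below is illustrative only and is not the actual PaddleVideo data_reader: `batch_reader` and its `drop_last` flag are hypothetical names. It uses the 227 frames and batch_size 32 reported in the log to show how dropping the last incomplete batch leaves 224 items, the image feature length seen in the log.

```python
def batch_reader(items, batch_size, drop_last=False):
    """Yield lists of up to batch_size items (hypothetical stand-in for a data_reader).

    With drop_last=True the trailing partial batch is discarded, which is the
    behaviour the reply says caused the problem.
    """
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the leftover items unless the caller asked to drop them.
    if batch and not drop_last:
        yield batch


frames = list(range(227))  # "len of self.video_path 227" in the log

dropped = sum(len(b) for b in batch_reader(frames, 32, drop_last=True))
kept = sum(len(b) for b in batch_reader(frames, 32))

print(dropped)  # 224
print(kept)     # 227
```

Keeping the last partial batch means every frame contributes a feature, so short clips lose less of their length before reaching BMN.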