decord._ffi.base.DECORDError from VideoReader

Open Lijun-Yu opened this issue 5 years ago • 10 comments

Hi, there seems to be some problem with videos containing corrupted frames.

Video, script, and logs available at Google drive.

The video is from the MEVA dataset. It can be loaded by cv2.VideoCapture, moviepy.editor.VideoFileClip, and avi_r.AVIReader, but it contains several corrupted frames as detailed here.

Environment: Mac OS 10.15.5, python 3.7, decord 0.4.0; also reproducible on Linux.

# Name                    Version                   Build  Channel
ca-certificates           2020.6.24                     0  
certifi                   2020.6.20                py37_0  
decord                    0.4.0                    pypi_0    pypi
libcxx                    10.0.0                        1  
libedit                   3.1.20191231         h1de35cc_1  
libffi                    3.3                  hb1e8313_2  
ncurses                   6.2                  h0a44026_1  
numpy                     1.19.0                   pypi_0    pypi
openssl                   1.1.1g               h1de35cc_0  
pip                       20.1.1                   py37_1  
python                    3.7.7                hf48f09d_4  
readline                  8.0                  h1de35cc_0  
setuptools                49.2.0                   py37_0  
sqlite                    3.32.3               hffcf06c_0  
tk                        8.6.10               hb0a8c7a_0  
wheel                     0.34.2                   py37_0  
xz                        5.2.5                h1de35cc_0  
zlib                      1.2.11               h1de35cc_3  

Script:

from decord import VideoReader
from decord import cpu

vr = VideoReader('2018-03-07.16-55-06.17-00-06.school.G336.avi', ctx=cpu(0))
for i in range(len(vr)):
    frame = vr[i]
    print(i, frame.shape)

Log:

[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
Invalid UE golomb code
[NULL @ 0x7fde3829b200] pps_id 3199971767 out of range
[NULL @ 0x7fde3829b200] sps_id 3 out of range
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] sps_id 3 out of range
[NULL @ 0x7fde3829b200] SEI type 0 size 568 truncated at 256
[NULL @ 0x7fde3829b200] non-existing PPS 176 referenced
[NULL @ 0x7fde3829b200] SEI type 0 size 568 truncated at 256
[NULL @ 0x7fde3829b200] non-existing PPS 176 referenced
Invalid UE golomb code
[NULL @ 0x7fde3829b200] pps_id 3199971767 out of range
[NULL @ 0x7fde3829b200] sps_id 32 out of range
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
Invalid UE golomb code
[NULL @ 0x7fde3829b200] pps_id 3199971767 out of range
[NULL @ 0x7fde3829b200] sps_id 3 out of range
[NULL @ 0x7fde3829b200] missing picture in access unit with size 56
[NULL @ 0x7fde3829b200] sps_id 3 out of range
[NULL @ 0x7fde3829b200] SEI type 0 size 568 truncated at 256
[NULL @ 0x7fde3829b200] non-existing PPS 176 referenced
[NULL @ 0x7fde3829b200] SEI type 0 size 568 truncated at 256
[NULL @ 0x7fde3829b200] non-existing PPS 176 referenced
Invalid UE golomb code
[NULL @ 0x7fde3829b200] pps_id 3199971767 out of range
[NULL @ 0x7fde3829b200] sps_id 32 out of range
[h264 @ 0x7fde1800b400] No start code is found.
[h264 @ 0x7fde1800b400] Error splitting the input into NAL units.
[h264 @ 0x7fde18048400] No start code is found.
[h264 @ 0x7fde18048400] Error splitting the input into NAL units.
[h264 @ 0x7fde18048a00] No start code is found.
[h264 @ 0x7fde18048a00] Error splitting the input into NAL units.
[h264 @ 0x7fde1808c200] No start code is found.
[h264 @ 0x7fde1808c200] Error splitting the input into NAL units.
[h264 @ 0x7fde1808c800] No start code is found.
[h264 @ 0x7fde1808c800] Error splitting the input into NAL units.
[h264 @ 0x7fde18016600] No start code is found.
[h264 @ 0x7fde18016600] Error splitting the input into NAL units.
0 (1080, 1920, 3)
... (Omitted)
103 (1080, 1920, 3)
Traceback (most recent call last):
  File "test.py", line 6, in <module>
    frame = vr[i]
  File "/Users/lijun/Applications/miniconda3/envs/decord/lib/python3.7/site-packages/decord/video_reader.py", line 92, in __getitem__
    return self.next()
  File "/Users/lijun/Applications/miniconda3/envs/decord/lib/python3.7/site-packages/decord/video_reader.py", line 104, in next
    arr = _CAPI_VideoReaderNextFrame(self._handle)
  File "/Users/lijun/Applications/miniconda3/envs/decord/lib/python3.7/site-packages/decord/_ffi/_ctypes/function.py", line 175, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/Users/lijun/Applications/miniconda3/envs/decord/lib/python3.7/site-packages/decord/_ffi/base.py", line 63, in check_call
    raise DECORDError(py_str(_LIB.DECORDGetLastError()))
decord._ffi.base.DECORDError: [16:53:10] /Users/travis/build/zhreshold/decord-distro/decord/src/video/ffmpeg/threaded_decoder.cc:288: [16:53:10] /Users/travis/build/zhreshold/decord-distro/decord/src/video/ffmpeg/threaded_decoder.cc:216: Check failed: avcodec_send_packet(dec_ctx_.get(), pkt.get()) >= 0 (-1094995529 vs. 0) Thread worker: Error sending packet.

Stack trace returned 7 entries:
[bt] (0) 0   libdecord.dylib                     0x0000000112db2d70 dmlc::StackTrace(unsigned long) + 464
[bt] (1) 1   libdecord.dylib                     0x0000000112db2a54 dmlc::LogMessageFatal::~LogMessageFatal() + 52
[bt] (2) 2   libdecord.dylib                     0x0000000112df2b16 decord::ffmpeg::FFMPEGThreadedDecoder::WorkerThreadImpl() + 390
[bt] (3) 3   libdecord.dylib                     0x0000000112df0d29 decord::ffmpeg::FFMPEGThreadedDecoder::WorkerThread() + 25
[bt] (4) 4   libdecord.dylib                     0x0000000112df41ee void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (decord::ffmpeg::FFMPEGThreadedDecoder::*)(), decord::ffmpeg::FFMPEGThreadedDecoder*> >(void*) + 62
[bt] (5) 5   libsystem_pthread.dylib             0x00007fff72534109 _pthread_start + 148
[bt] (6) 6   libsystem_pthread.dylib             0x00007fff7252fb8b thread_start + 15



Stack trace returned 10 entries:
[bt] (0) 0   libdecord.dylib                     0x0000000112db2d70 dmlc::StackTrace(unsigned long) + 464
[bt] (1) 1   libdecord.dylib                     0x0000000112db2a54 dmlc::LogMessageFatal::~LogMessageFatal() + 52
[bt] (2) 2   libdecord.dylib                     0x0000000112df0cae decord::ffmpeg::FFMPEGThreadedDecoder::CheckErrorStatus() + 142
[bt] (3) 3   libdecord.dylib                     0x0000000112df171a decord::ffmpeg::FFMPEGThreadedDecoder::Pop(decord::runtime::NDArray*) + 58
[bt] (4) 4   libdecord.dylib                     0x0000000112de812c decord::VideoReader::NextFrameImpl() + 108
[bt] (5) 5   libdecord.dylib                     0x0000000112de8318 decord::VideoReader::NextFrame() + 24
[bt] (6) 6   libdecord.dylib                     0x0000000112ddcb9d std::__1::__function::__func<decord::runtime::$_1, std::__1::allocator<decord::runtime::$_1>, void (decord::runtime::DECORDArgs, decord::runtime::DECORDRetValue*)>::operator()(decord::runtime::DECORDArgs&&, decord::runtime::DECORDRetValue*&&) + 77
[bt] (7) 7   libdecord.dylib                     0x0000000112daf5b6 DECORDFuncCall + 70
[bt] (8) 8   libffi.7.dylib                      0x000000010226bead ffi_call_unix64 + 85
[bt] (9) 9   ???                                 0x00007ffeede011b0 0x0 + 140732889305520
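
For reference, the numeric code in the "Check failed" line above (-1094995529) can be decoded by hand. FFmpeg builds many of its error codes by packing four ASCII characters into an integer and negating it. The helper below is a small sketch of that decoding; the function name is made up for illustration and is not part of decord or FFmpeg.

```python
def ffmpeg_error_tag(code: int) -> str:
    """Decode a negative, tag-style FFmpeg error code into its four characters."""
    value = -code
    # The four characters are packed little-endian: lowest byte first.
    return "".join(chr((value >> shift) & 0xFF) for shift in (0, 8, 16, 24))


print(ffmpeg_error_tag(-1094995529))  # prints "INDA"
```

"INDA" is the tag FFmpeg uses for AVERROR_INVALIDDATA, i.e. avcodec_send_packet rejected the packet as invalid data, which is consistent with the corrupted frames described in this report.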

Lijun-Yu avatar Jul 15 '20 21:07 Lijun-Yu

Which version of decord, and could you share a corrupted video (ideally a small-sized one)?

Edit: sorry, saw the video link above

innerlee avatar Jul 16 '20 02:07 innerlee

BTW how to deal with corrupted frames? Raise an error or a warning? Return a black frame?

innerlee avatar Jul 16 '20 02:07 innerlee

It depends on what the user cares about, e.g. object detection (warning and skipping would be fine), tracking (maybe reuse the previous frame), or optical flow (maybe an interpolated frame). I guess it would be better to offer options for users to choose from. Some packages I have tried:

  • OpenCV (cv2.VideoCapture): skips the frame without a warning in Python, but FFmpeg prints obscure warnings (like the ones above, such as [NULL @ 0x7fde3829b200]). This is undesirable when the frame index needs to stay aligned with the annotations.
  • Pims with PyAV: returns a black frame without a warning in Python, again with warnings from FFmpeg. This hurts detection and tracking performance.
  • MoviePy: returns the previously available frame without a warning.
  • AVI-R: option to skip or return the previously available frame; option for a warning.
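
The behaviors listed above could be exposed as a user-selectable policy. The following is only a sketch of that idea, not an existing decord API: `read_with_policy`, `get_frame`, and `make_blank` are hypothetical names, and it assumes the underlying reader can keep going after a failed frame, which may not hold for every library.

```python
import warnings
from typing import Any, Callable, Iterator, Optional, Tuple


def read_with_policy(
    get_frame: Callable[[int], Any],
    num_frames: int,
    policy: str = "skip",
    make_blank: Optional[Callable[[], Any]] = None,
    warn: bool = True,
) -> Iterator[Tuple[int, Any]]:
    """Yield (index, frame) pairs, handling frames that fail to decode.

    policy "skip" drops the frame, "previous" repeats the last good frame,
    and "blank" substitutes whatever make_blank() returns.
    """
    if policy not in ("skip", "previous", "blank"):
        raise ValueError(f"unknown policy: {policy!r}")
    previous = None
    for i in range(num_frames):
        try:
            frame = get_frame(i)
        except Exception as exc:  # a real wrapper would catch the library's own error type
            if warn:
                warnings.warn(f"frame {i} failed to decode: {exc}")
            if policy == "previous" and previous is not None:
                frame = previous
            elif policy == "blank" and make_blank is not None:
                frame = make_blank()
            else:
                continue  # "skip", or nothing available to substitute
        else:
            previous = frame  # remember the last frame that decoded cleanly
        yield i, frame
```

Because the original index is yielded alongside each frame, skipped frames leave a visible gap, so the output can still be aligned with per-frame annotations.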

Lijun-Yu avatar Jul 16 '20 04:07 Lijun-Yu

I tested the video. The pts are all -9223372036854775808, i.e., AV_NOPTS_VALUE. The current (master) logic of decord relies on pts. One improvement of the logic is to use dts when pts is not available.

However, even when the null pts is replaced with dts, decord still cannot decode this video.

That's what I know so far.
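
The pts-to-dts fallback described above can be sketched in a few lines. This is an illustration of the idea only, not decord's actual code; `effective_timestamp` is a hypothetical name, and the constant mirrors the value quoted in this comment.

```python
AV_NOPTS_VALUE = -(2 ** 63)  # FFmpeg's "no timestamp" sentinel, -9223372036854775808


def effective_timestamp(pts: int, dts: int) -> int:
    """Return pts when it is set, otherwise fall back to dts.

    dts may itself be AV_NOPTS_VALUE; callers still need to handle that case.
    """
    return pts if pts != AV_NOPTS_VALUE else dts
```

As noted above, this fallback alone was not enough to make the video in this issue decodable.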

innerlee avatar Jul 16 '20 04:07 innerlee

@Lijun-Yu corrupted packets can cause more complicated situations. For example, a single corrupted frame can affect the decoding of all subsequent frames until the next keyframe. Multiple corrupted frames across the entire video can make it completely useless during training.

decord is intended for training models, and the best handler for a corrupted frame is to throw the error and let the user decide whether to:

  • catch the DECORDError and throw away the entire video
  • use try/catch to locate the corrupted frame range and only access the valid part of the video (decord could improve this by returning the valid range, maybe?)
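
A minimal sketch of the second option, written against a generic frame getter so it does not depend on decord internals. `first_failing_index` and its parameters are hypothetical names; with decord, `get_frame` could be `vr.__getitem__` and `error_type` the DECORDError shown in the traceback above.

```python
from typing import Any, Callable, Type


def first_failing_index(
    get_frame: Callable[[int], Any],
    num_frames: int,
    error_type: Type[BaseException] = Exception,
) -> int:
    """Return the index of the first frame that raises error_type.

    Returns num_frames when every frame decodes, so the valid range is
    always [0, result).
    """
    for i in range(num_frames):
        try:
            get_frame(i)
        except error_type:
            return i
    return num_frames
```

A caller can then either drop the video when the result is smaller than `num_frames`, or restrict access to the frames before that index. Note that this decodes the video sequentially, so it costs a full pass over the valid part.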

Any thoughts?

zhreshold avatar Jul 17 '20 20:07 zhreshold

@zhreshold Agreed, corrupted frames should not be used in training. But there can still be situations where people want to decode a corrupted video for inference.

I guess for the VideoReader, it could be enough to just throw the error and let users decide whether to throw away the video/batch or try to recover it. Locating the corrupted frame range may require decoding the whole video, which does not seem efficient in this direct-access context. But it would be better to include documentation stating that errors can be raised under certain conditions.

For the VideoLoader, it may be better for users to control its behavior by choosing from some options, e.g. skip the batch, skip the video, or try to recover it first and skip if that fails. Otherwise, the errors raised from its internal loading procedure could cause a mess. Just some initial thoughts.

Lijun-Yu avatar Jul 19 '20 03:07 Lijun-Yu

I ran into the same problem.

THUSIGSICLAB avatar May 30 '22 06:05 THUSIGSICLAB

Try num_threads=1
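
For anyone trying this, the suggestion amounts to passing `num_threads=1` when constructing the reader. A small sketch follows; `open_single_threaded` is just an illustrative wrapper name, and the thread does not confirm whether this avoids the error for the video in this issue.

```python
def open_single_threaded(path: str):
    """Open a video with decord using a single decoding thread."""
    # Imported lazily so this helper can be defined without decord installed.
    from decord import VideoReader, cpu

    return VideoReader(path, ctx=cpu(0), num_threads=1)
```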

prismformore avatar May 07 '24 06:05 prismformore