depthai-python

Segfault running single stream mono at 5 fps

Open chris-piekarski opened this issue 1 year ago • 8 comments

I started seeing this segfault after upgrading to 2.29. Not every time, but a couple of times a day. Thoughts?

terminate called without an active exception
Stack trace (most recent call last):
#31 Object "python3", at 0x58a6b6e3daeb, in _PyObject_MakeTpCall
#30 Object "python3", at 0x58a6b6e50f63, in
#29 Object "python3", at 0x58a6b6e3cd13, in _PyObject_FastCallDictTstate
#28 Object "python3", at 0x58a6b6e31ae7, in _PyEval_EvalFrameDefault
#27 Object "python3", at 0x58a6b6e47aeb, in _PyFunction_Vectorcall
#26 Object "python3", at 0x58a6b6e32cf1, in _PyEval_EvalFrameDefault
#25 Object "python3", at 0x58a6b6e54be0, in
#24 Object "python3", at 0x58a6b6e31ae7, in _PyEval_EvalFrameDefault
#23 Object "python3", at 0x58a6b6e47aeb, in _PyFunction_Vectorcall
#22 Object "python3", at 0x58a6b6e33f58, in _PyEval_EvalFrameDefault
#21 Object "python3", at 0x58a6b6e47aeb, in _PyFunction_Vectorcall
#20 Object "python3", at 0x58a6b6e36b79, in _PyEval_EvalFrameDefault
#19 Object "python3", at 0x58a6b6e47aeb, in _PyFunction_Vectorcall
#18 Object "python3", at 0x58a6b6e374c7, in _PyEval_EvalFrameDefault
#17 Object "python3", at 0x58a6b6e3db4a, in _PyObject_MakeTpCall
#16 Object "/usr/local/lib/python3.10/dist-packages/depthai.cpython-310-x86_64-linux-gnu.so", at 0x702fcaea686a, in
#15 Object "python3", at 0x58a6b6e3deea, in
#14 Object "python3", at 0x58a6b6e5158a, in
#13 Object "python3", at 0x58a6b6e5500f, in
#12 Object "python3", at 0x58a6b6e3db4a, in _PyObject_MakeTpCall
#11 Object "python3", at 0x58a6b6e47281, in
#10 Object "/usr/local/lib/python3.10/dist-packages/depthai.cpython-310-x86_64-linux-gnu.so", at 0x702fcaeaa629, in
#9 Object "/usr/local/lib/python3.10/dist-packages/depthai.cpython-310-x86_64-linux-gnu.so", at 0x702fcaefd177, in
#8 Object "/usr/local/lib/python3.10/dist-packages/depthai.cpython-310-x86_64-linux-gnu.so", at 0x702fcb13b691, in dai::Device::Device<bool, true>(dai::Pipeline const&, dai::DeviceInfo const&, bool)
#7 Object "/usr/local/lib/python3.10/dist-packages/depthai.cpython-310-x86_64-linux-gnu.so", at 0x702fcb146301, in dai::DeviceBase::tryStartPipeline(dai::Pipeline const&)
#6 Object "/usr/local/lib/python3.10/dist-packages/depthai.cpython-310-x86_64-linux-gnu.so", at 0x702fcb1536bf, in dai::DeviceBase::startPipelineImpl(dai::Pipeline const&)
#5 Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x702fcdaae276, in std::terminate()
#4 Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x702fcdaae20b, in
#3 Object "/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x702fcdaa2b9d, in
#2 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x702fe46287f2, in abort
#1 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x702fe4642475, in raise
#0 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x702fe46969fc, in pthread_kill

chris-piekarski, Dec 13 '24 05:12

Thanks for the report @chris-piekarski !

Would you mind providing an MRE (https://stackoverflow.com/help/minimal-reproducible-example) and telling us which version you used before?

Did this start happening between 2.28.0.0 and 2.29.0.0?

moratom, Dec 13 '24 15:12
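
Editorial note: the trace above ends in std::terminate() and abort(), so the process is killed by SIGABRT, and the native frames alone do not show which Python line was executing. A minimal sketch, using only the Python standard library, of how the Python-level traceback could be captured for an MRE. The log file name and the `start_camera` placeholder are hypothetical and not part of depthai or of the reporter's code.

```python
import faulthandler

# Hypothetical log file name. It must stay open for the life of the process,
# because faulthandler writes to its file descriptor from the signal handler.
crash_log = open("crash_traceback.log", "a")

# On SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL, dump the Python traceback
# of every thread to the log before the process dies.
faulthandler.enable(file=crash_log, all_threads=True)


def start_camera():
    """Hypothetical placeholder for the code that builds the pipeline
    and opens the device; replace with the real application entry point."""


start_camera()
```

Running the unmodified script with `python3 -X faulthandler` enables the same handler, writing to stderr instead of a file.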

I just downgraded to 2.26.0.0 and the issue still occurs on this camera.

MRE - on Ubuntu 22, start a PoE gen 2 camera using a YOLOv7 tiny model and run at 5 fps. Wait a few seconds and see the segfault. It seems isolated to this one camera. Is there a self-test I can run on it?

chris-piekarski, Dec 13 '24 20:12

@moratom do you have any confirmed M12 connector failures? To get this camera to stop segfaulting all we did was unscrew the M12, wait 10 seconds and screw it back on. The camera has run fine for the last week since then. We just had a second camera start throwing the same segfault; we did the same thing and it's running again. I think it's possible the M12 connectors are experiencing physical hardware issues on the gen 2 PoE versions.

chris-piekarski, Dec 20 '24 22:12

@chris-piekarski we've had issues if the M12 connector is over-tightened, which can short connections, though this is generally user error.

Is it possible that the connector was over-tightened on the two cameras that exhibited issues?

moratom, Jan 22 '25 12:01

@moratom yes that is possible. Is there a recommended torque? Is there a recommended cable provider that helps prevent this issue?

chris-piekarski, Jan 24 '25 03:01

@chris-piekarski

To get this camera to stop segfaulting all we did was unscrew the M12, wait 10 seconds and screw it back on

It seems like power cycling the device did the trick; I doubt it has anything to do with the connector itself. Our webshop has this written:

Rotate the locking ring (not the connector body) clockwise (CW) with <0.8 Nm torque to securely connect the cable.

We also have a quick gif for it:

Erol444, Jan 24 '25 11:01

@Erol444 no, power cycling the device didn't do the trick. We tried that 30 times. After a power cycle, some cameras would run for an hour at most, but otherwise they never recovered. We had to throw the cameras in the trash and use new ones to get the systems running again. This occurred on at least seven cameras in the last month.

chris-piekarski, Jan 24 '25 14:01

@chris-piekarski we could process this as an RMA, and it would be helpful to get the device into our office for debugging, to determine whether it's a SW or HW issue. Could you submit this form? https://www.luxonis.com/rma

Erol444, Jan 24 '25 16:01