frigate
[Detector Support]: Fatal Python error: Segmentation fault
Describe the problem you are having
Launching v13 with the NVIDIA branch causes a boot loop with the above error and no other explanation. In a different support ticket I was told that my NVIDIA driver version was too new.
I have since downgraded to driver v535.129.03, which is supposedly stable according to the last ticket I opened (https://github.com/blakeblackshear/frigate/issues/9575).
The error is still present.
Version
v13
Frigate config file
mqtt:
  enabled: true
  host: 192.168.1.102
  user: frigate
  password: PASSWORD
# detectors:
#   cpu1:
#     type: cpu
#     num_threads: 2
# birdseye:
#   enabled: True
#   restream: false
#   mode: continuous
#   width: 1280
#   height: 720
#   quality: 8
go2rtc:
  streams:
    Rear_Deck:
      - rtsp://admin:[email protected]:554/h264Preview_01_main
    Rear_Deck_sub:
      - rtsp://admin:[email protected]:554/h264Preview_01_sub
    Garage_Camera:
      - rtsp://admin:[email protected]:554/cam/realmonitor?channel=1&subtype=0
    Garage_Camera_sub:
      - rtsp://admin:[email protected]:554/cam/realmonitor?channel=1&subtype=1
ffmpeg:
  hwaccel_args: preset-nvidia-h265
rtmp:
  enabled: False
cameras:
  ############## REAR DECK ##################
  Rear_Deck:
    ffmpeg:
      inputs:
        - path: rtsp://127.0.0.1:8554/Rear_Deck_sub
          input_args: preset-rtsp-restream
          roles:
            - detect
        - path: rtsp://127.0.0.1:8554/Rear_Deck
          input_args: preset-rtsp-restream
          roles:
            - record
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -c:a aac
    objects:
      track:
        - person
        - dog
        - bird
        - cat
    detect:
      width: 1280
      height: 720
      fps: 4
    record:
      enabled: True
      events:
        retain:
          default: 2
    snapshots:
      enabled: True
  Garage_Camera:
    ffmpeg:
      inputs:
        - path: rtsp://127.0.0.1:8554/Garage_Camera_sub
          input_args: preset-rtsp-restream
          roles:
            - detect
        - path: rtsp://127.0.0.1:8554/Garage_Camera
          input_args: preset-rtsp-restream
          roles:
            - record
      output_args:
        record: -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -c:a aac
        # record: preset-record-generic-audio-aac
    objects:
      track:
        - person
        - dog
        - cat
        - car
        - package
    detect:
      width: 1280
      height: 720
      fps: 4
    record:
      enabled: True
      events:
        retain:
          default: 2
    snapshots:
      enabled: True
docker-compose file or Docker CLI command
docker run
-d
--name='frigate'
--net='bridge'
-e TZ="America/New_York"
-e HOST_OS="Unraid"
-e HOST_HOSTNAME="CozsNAS"
-e HOST_CONTAINERNAME="frigate"
-e 'FRIGATE_RTSP_PASSWORD'='********!'
-e 'PLUS_API_KEY'='********'
-e 'NVIDIA_VISIBLE_DEVICES'='GPU-53a3b891-6d7b-8fe8-bd57-9467c8797875'
-e 'NVIDIA_DRIVER_CAPABILITIES'='compute,utility,video'
-e 'YOLO_MODELS'='yolov4-416,yolov4-tiny-416'
-e 'USE_FP16'='false'
-e 'TRT_MODEL_PREP_DEVICE'='0'
-l net.unraid.docker.managed=dockerman
-l net.unraid.docker.webui='http://[IP]:[PORT:5000]'
-l net.unraid.docker.icon='https://raw.githubusercontent.com/yayitazale/unraid-templates/main/frigate.png'
-p '5000:5000/tcp'
-p '8554:8554/tcp'
-p '8555:8555/tcp'
-p '8555:8555/udp'
-p '1984:1984/tcp'
-v '/mnt/user/appdata/frigate':'/config':'rw'
-v '/mnt/user/Frigate Recordings/':'/media/frigate':'rw'
-v '/etc/localtime':'/etc/localtime':'rw'
--shm-size=256mb
--mount type=tmpfs,target=/tmp/cache,tmpfs-size=1000000000
--restart unless-stopped
--gpus=all 'ghcr.io/blakeblackshear/frigate:stable-tensorrt'
d4d83afad657559c53468c5ebc065e6caf904150a1c06d31719cfa84768c6afa
Relevant log output
2024-02-11 11:51:47.953981276 Fatal Python error: Segmentation fault
2024-02-11 11:51:47.953989283
2024-02-11 11:51:47.953991367 Thread 0x00001506c59ee6c0 (most recent call first):
2024-02-11 11:51:47.954045897 File "/usr/lib/python3.9/threading.py", line 312 in wait
2024-02-11 11:51:47.954133590 File "/usr/lib/python3.9/multiprocessing/queues.py", line 233 in _feed
2024-02-11 11:51:47.954194783 File "/usr/lib/python3.9/threading.py", line 892 in run
2024-02-11 11:51:47.954276894 File "/usr/lib/python3.9/threading.py", line 954 in _bootstrap_inner
2024-02-11 11:51:47.954332450 File "/usr/lib/python3.9/threading.py", line 912 in _bootstrap
2024-02-11 11:51:47.954343287
2024-02-11 11:51:47.954345312 Current thread 0x00001506ea62f740 (most recent call first):
2024-02-11 11:51:47.954442917 File "/opt/frigate/frigate/detectors/plugins/tensorrt.py", line 168 in <listcomp>
2024-02-11 11:51:47.954540476 File "/opt/frigate/frigate/detectors/plugins/tensorrt.py", line 167 in _do_inference
2024-02-11 11:51:47.954632394 File "/opt/frigate/frigate/detectors/plugins/tensorrt.py", line 286 in detect_raw
2024-02-11 11:51:47.954722466 File "/opt/frigate/frigate/object_detection.py", line 75 in detect_raw
2024-02-11 11:51:47.954846892 File "/opt/frigate/frigate/object_detection.py", line 125 in run_detector
2024-02-11 11:51:47.954996257 File "/usr/lib/python3.9/multiprocessing/process.py", line 108 in run
2024-02-11 11:51:47.955127573 File "/usr/lib/python3.9/multiprocessing/process.py", line 315 in _bootstrap
2024-02-11 11:51:47.955272691 File "/usr/lib/python3.9/multiprocessing/popen_fork.py", line 71 in _launch
2024-02-11 11:51:47.955386735 File "/usr/lib/python3.9/multiprocessing/popen_fork.py", line 19 in __init__
2024-02-11 11:51:47.955491455 File "/usr/lib/python3.9/multiprocessing/context.py", line 277 in _Popen
2024-02-11 11:51:47.955586815 File "/usr/lib/python3.9/multiprocessing/context.py", line 224 in _Popen
2024-02-11 11:51:47.955687173 File "/usr/lib/python3.9/multiprocessing/process.py", line 121 in start
2024-02-11 11:51:47.955783568 File "/opt/frigate/frigate/object_detection.py", line 183 in start_or_restart
2024-02-11 11:51:47.955869434 File "/opt/frigate/frigate/object_detection.py", line 151 in __init__
2024-02-11 11:51:47.955992198 File "/opt/frigate/frigate/app.py", line 453 in start_detectors
2024-02-11 11:51:47.956082578 File "/opt/frigate/frigate/app.py", line 683 in start
2024-02-11 11:51:47.956155258 File "/opt/frigate/frigate/__main__.py", line 17 in <module>
2024-02-11 11:51:47.956251165 File "/usr/lib/python3.9/runpy.py", line 87 in _run_code
2024-02-11 11:51:47.956342911 File "/usr/lib/python3.9/runpy.py", line 197 in _run_module_as_main
2024-02-11 11:51:49.659338836 [INFO] Starting go2rtc healthcheck service...
2024-02-11 11:52:02.661642001 [2024-02-11 11:52:02] frigate.watchdog INFO : Detection appears to be stuck. Restarting detection process...
2024-02-11 11:52:02.685462534 [2024-02-11 11:52:02] detector.tensorrt INFO : Starting detection process: 1255
2024-02-11 11:52:02.984562929 [2024-02-11 11:52:02] frigate.detectors.plugins.tensorrt INFO : Loaded engine size: 39 MiB
2024-02-11 11:52:03.343516305 [2024-02-11 11:52:03] frigate.detectors.plugins.tensorrt INFO : [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +6, GPU +8, now: CPU 158, GPU 230 (MiB)
2024-02-11 11:52:03.356055956 [2024-02-11 11:52:03] frigate.detectors.plugins.tensorrt INFO : [MemUsageChange] Init cuDNN: CPU +2, GPU +10, now: CPU 160, GPU 240 (MiB)
2024-02-11 11:52:03.360742717 [2024-02-11 11:52:03] frigate.detectors.plugins.tensorrt INFO : [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +40, now: CPU 0, GPU 40 (MiB)
2024-02-11 11:52:03.369792099 [2024-02-11 11:52:03] frigate.detectors.plugins.tensorrt INFO : [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 120, GPU 232 (MiB)
2024-02-11 11:52:03.370187842 [2024-02-11 11:52:03] frigate.detectors.plugins.tensorrt INFO : [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 120, GPU 240 (MiB)
2024-02-11 11:52:03.370321313 [2024-02-11 11:52:03] frigate.detectors.plugins.tensorrt INFO : [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +13, now: CPU 0, GPU 53 (MiB)
Operating system
UNRAID
Install method
Docker Compose
Coral version
CPU (no coral)
Any other information that may be helpful
No response
What GPU do you have again?
Edited the post to remove the Frigate Plus API key and RTSP password that were visible in the Docker CLI command.
No one has ANY suggestions (besides buying a Coral device) on how to get my Frigate instance up and running??
Segfaults are difficult to debug because they are usually caused by something on the host or in the hardware, and the log gives no information about what is going wrong.
From the logs in your previous post we can see that the segfault occurs as soon as the model is initialized, which indicates some failure to communicate correctly. Many users run this type of setup on Unraid, so there seems to be nothing particular about that. You could try a memtest to see whether system memory is failing.
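For context on the trace in the log output: a "Fatal Python error: Segmentation fault" header followed by per-thread stacks is the format written by Python's standard `faulthandler` module. It lists Python frames only, which is why the log stops at a line in `tensorrt.py` and says nothing about what failed in native code. The sketch below reproduces that format in a throwaway child process. It is only an illustration, assuming a POSIX host; the `crash` function and the self-delivered signal are hypothetical stand-ins and are not what Frigate or TensorRT does.

```python
import subprocess
import sys
import textwrap

# Child script: enable faulthandler, then deliver SIGSEGV to itself.
# This only imitates a crash; the real fault in the thread happens in native code.
CHILD = textwrap.dedent(
    """
    import faulthandler, os, signal
    faulthandler.enable()  # dump Python tracebacks on fatal signals

    def crash():
        os.kill(os.getpid(), signal.SIGSEGV)

    crash()
    """
)


def run_crashing_child() -> subprocess.CompletedProcess:
    """Run the child in a separate interpreter and capture its stderr."""
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    result = run_crashing_child()
    print("exit status:", result.returncode)
    print(result.stderr)
```

The captured stderr names the Python function that was executing when the signal arrived, and nothing below it. For a fault inside a C extension, going further generally needs native tooling on the host.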
memtest complete. 0 errors.
Next suggestion please?
I have the same error and need help. I have a eufyCam 2 Pro, which only sends a stream when motion is detected. I suspect this could be a potential cause. Any thoughts?
I'm experiencing the exact same error on TrueNAS Scale w/ GTX 1060
@hvardhan20 and @jdgiddings - I hope you both get a response but, if my past experience holds true, it doesn't look good. CPU detection worked fine. GPU detection worked fine...... until they bundled it all into one container.
Hard to fix issues when you don't have any support from anyone here.
There are many TensorRT users, so this seems to be a very isolated problem. Like I said before, segfaults are difficult to debug, and without being able to reproduce it there really isn't any good way to move towards solving the problem, because it is not clear what is causing it other than something on the host.
The logic to compile the models is the same as before, just done automatically, so that is unlikely to be causing this. It could be due to the newer libraries / TensorRT version, but that change was made to support the latest NVIDIA GPUs and is also unrelated to Frigate building the models automatically.
Here's the output from nvidia-smi on the host. I believe these are all supported versions.
NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2
I'm experimenting with different models right now to see if any do not cause the error. I will report back.
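Two driver versions appear in this thread (535.129.03 in the original post and 535.54.03 in the nvidia-smi output above), and they are easy to misorder by eye. A minimal sketch for pulling the versions out of the nvidia-smi header line and comparing them numerically follows. The names `SmiHeader`, `parse_smi_header` and `version_tuple` are hypothetical helpers, and the regular expression assumes the header layout shown in the comment above.

```python
import re
from typing import NamedTuple, Optional, Tuple


class SmiHeader(NamedTuple):
    smi_version: str
    driver_version: str
    cuda_version: str


# Assumes the header layout quoted in the thread; other layouts return None.
_HEADER_RE = re.compile(
    r"NVIDIA-SMI\s+(?P<smi>[\d.]+)\s+"
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+"
    r"CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_header(line: str) -> Optional[SmiHeader]:
    """Extract the three version strings from an nvidia-smi header line."""
    match = _HEADER_RE.search(line)
    if match is None:
        return None
    return SmiHeader(match["smi"], match["driver"], match["cuda"])


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '535.54.03' into (535, 54, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    header = parse_smi_header(
        "NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2"
    )
    print(header)
    # Numeric comparison: 535.54.03 is older than 535.129.03.
    print(version_tuple("535.54.03") < version_tuple("535.129.03"))
```

Comparing the raw strings would put 535.54.03 after 535.129.03, so the numeric tuple matters here.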
yolov7-320 does not throw the segfault.
Which model did you use that did?
yolov7x-640 and yolov7x-320 were both throwing the error on my machine.
I did some more testing. Any model larger than yolov7-320 throws the same segfault error.
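For anyone repeating this kind of test, it may help to try the models in a fixed order, smallest input size first. The sketch below sorts a comma-separated list in the style of the `YOLO_MODELS` value from the docker command in the original post. It assumes model names end in `-<input size>`, as every name mentioned in this thread does; `split_model_name` and `smallest_first` are hypothetical helpers, and input size is only a rough proxy for how heavy a model is.

```python
from typing import List, Tuple


def split_model_name(name: str) -> Tuple[str, int]:
    """Split 'yolov7x-640' into ('yolov7x', 640)."""
    variant, _, size = name.strip().rpartition("-")
    if not variant or not size.isdigit():
        raise ValueError(f"unexpected model name: {name!r}")
    return variant, int(size)


def smallest_first(yolo_models: str) -> List[str]:
    """Order a comma-separated model list by input size, then by name."""
    names = [n.strip() for n in yolo_models.split(",") if n.strip()]
    return sorted(names, key=lambda n: (split_model_name(n)[1], n))


if __name__ == "__main__":
    # Models reported in this thread.
    for model in smallest_first("yolov7x-640,yolov7-320,yolov7x-320"):
        print(model)
```

Testing one model per container start, in this order, makes it easier to report the first one that segfaults.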
I just wanted to add another voice here -- I am able to run yolov7x-320, but if I attempt to run yolov7x-640, I get a segfault (the same as the OP). I'm on a GTX 1650 Super. My setup is a bit odd:
- My system is running TrueNAS SCALE.
- Inside TrueNAS, I have set up a VM (because the Docker implementation of SCALE sucks).
- The VM is running Ubuntu 22.04.
- I'm also running CasaOS (but I don't think that matters, because I loaded Frigate via docker compose CLI).
- My test camera is streaming via RTSP through go2rtc, with WebRTC and MSE working correctly.
Let me know if I can do anything to help debug this.
[edit] I previously said my 1650 is an LHR. This is incorrect. My 3060 is LHR, and I confused the two.