Object Detector Abstraction
This is an initial change to abstract the edgetpu module into a more generic Object Detector. The idea is that adding a new inference runtime only requires adding a module to the Detectors folder, adding a new DetectorTypeEnum, and adding an else-if branch in the LocalObjectDetector init function to set up the new detector type.
The object_detection module keeps the multi-process logic to run the detector asynchronously and apply the detection thresholds. All a detector module is required to do is set up the runtime, load the model, and process a tensor (detect_raw).
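For illustration, a new detector module under this abstraction might look like the minimal sketch below. Only `detect_raw` and the general responsibilities come from the description above; the base class, constructor signature, and raw output layout are assumptions for the example.

```python
import numpy as np


class DetectionApi:
    """Assumed base interface for detector plugins; the actual class in
    the Detectors folder may be named and shaped differently."""

    def detect_raw(self, tensor_input):
        raise NotImplementedError


class DummyDetector(DetectionApi):
    """Sketch of a new detector: set up the runtime and load the model in
    __init__, then implement detect_raw() to process one input tensor."""

    def __init__(self, model_path):
        # A real implementation would initialize its inference runtime and
        # load the model file here.
        self.model_path = model_path

    def detect_raw(self, tensor_input):
        # A real implementation runs inference here. The object_detection
        # module keeps the multi-process plumbing and threshold filtering,
        # so this only returns raw detections, e.g. one row per detection
        # of [class_id, score, y_min, x_min, y_max, x_max] (layout assumed).
        return np.zeros((20, 6), dtype=np.float32)
```

Wiring it in would then just be a new DetectorTypeEnum value plus the else-if branch in LocalObjectDetector that constructs the class.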
Future work to build on this:
- Add an OpenVino detector module
- Add a RKNPU/RKNN detector module
- Possibly add a config for a "plugin" folder that will be scanned at runtime for user-supplied detector implementations?
Thanks, this will be a good basis for #3626
It would be nice to add some tests for this new module. Might be difficult / not a good idea to test actual detection, but at least testing that the correct detector is used when sending config metadata would be good.
Shouldn't be too hard if we can mock the edgetpu detector inside a unit test. I'll take a look at that.
Yeah that's what I was thinking
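A rough sketch of what that test could look like, assuming the edgetpu detector class can be patched at its import path. The patch target, class name, and LocalObjectDetector arguments here are guesses for illustration, not the actual API:

```python
import unittest
from unittest.mock import patch


class TestDetectorSelection(unittest.TestCase):
    # The patch target and constructor arguments below are assumptions;
    # the real module layout may differ.
    @patch("frigate.detectors.edgetpu.EdgeTpuDetector")
    def test_edgetpu_detector_is_selected(self, mock_edgetpu):
        from frigate.object_detection import LocalObjectDetector

        LocalObjectDetector(det_type="edgetpu")

        # The config metadata should route construction to the mocked
        # edgetpu class without touching real hardware or a model file.
        mock_edgetpu.assert_called_once()
```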
As I am working on an OpenVino detector using this branch, I'm finding it would be helpful if the configuration for a model were more descriptive. Has there been any work to adjust the shape of the input/output tensors for different models?
I see some work in #1022, #3099, and #2548. Was there any preference on how any of these handled different models?
I would prefer that each model implementation defines its own function to prepare the input data in shared memory and then translates the outputs into a common format. Keep in mind that the detection process is extremely sensitive to small increases in processing time. When using a Coral, an extra copy of a frame in memory is enough to increase the total inference times by a significant percentage.
Looks like the input tensor shape can be transformed without a memory copy. This can be handled by the detector, so I changed the "model_shape" to be the whole model config in most places. The one place I needed to use the model config in video.py is to set the correct colorspace when we convert from YUV. We'll only want to do this conversion once, so we need to know the correct colorspace for the model input at that time.
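For context, the zero-copy transform works because numpy expresses a transpose as a view with different strides. A minimal demonstration (not Frigate code):

```python
import numpy as np

# An NHWC frame: (batch, height, width, channels)
frame = np.zeros((1, 320, 320, 3), dtype=np.uint8)

# Reordering the axes to NCHW only adjusts strides; no pixel data is copied.
nchw = frame.transpose(0, 3, 1, 2)
assert nchw.base is frame and nchw.shape == (1, 3, 320, 320)

# Caveat: a runtime that requires contiguous input would still force a
# copy, e.g. via np.ascontiguousarray(nchw).
```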
I haven't had time to review this in detail yet, but Frigate+ models will be trained on the YUV colorspace, so don't make any assumptions about colorspace here.
The default colorspace should remain RGB, as was used for the current default model, and there is an option for YUV that will only crop/resize and skip the colorspace transformation call. I was finding a mix of RGB and BGR in the Intel Open Model Zoo.
Please rebase this PR and point it to the dev branch.
I rebased and pushed one more commit for the model input config. I switched it to an enum specifying NCHW vs. NHWC tensor shapes instead of an arbitrarily formatted array. I thought other shapes might be out there, but I haven't come across any besides these two yet. Best to keep it easy to configure for now.
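The enum in question would be on the order of the sketch below; the name and members are illustrative, and the actual definition is in the PR diff.

```python
from enum import Enum


class InputTensorEnum(str, Enum):
    """Supported input tensor layouts; only the two shapes encountered so
    far, but easy to extend if another layout turns up."""

    nchw = "nchw"  # batch, channels, height, width
    nhwc = "nhwc"  # batch, height, width, channels
```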
@NateMeyer this looks great! May I ask if you have any example config/model where you use this new functionality?
Hey. The config changes should be included in the documentation updates. I implemented an OpenVino detector using this abstraction in #3768