Multiple ROI boxes or non-square shape?

Open geftactics opened this issue 4 years ago • 21 comments

Love the integration! Thanks for your work!

When my cameras detect motion, I run a snapshot image through this and generate a notification if we see a person.

Is there a way to apply multiple ROI boxes in order to build an area, or maybe an exclusion box? My path area is an odd shape, and I want to exclude the public path next to my house, as it causes people alerts that I don't want to worry about!

I'm guessing that you probably just pass these on to AWS API? :/

[Image: camera]

geftactics avatar Feb 04 '21 15:02 geftactics

I have a couple of related issues on deepstack:

  • https://github.com/robmarkcole/HASS-Deepstack-object/issues/184
  • https://github.com/robmarkcole/HASS-Deepstack-object/issues/180

The challenge is to make it not too complicated to configure. If you can make any suggestions about what you want and how it could be configured that would be very helpful

robmarkcole avatar Feb 05 '21 17:02 robmarkcole

I did think about polygons etc. to build a more complex zone, but as you suggested, that would make configuration complex and hard to set up, test, and debug. If we allowed multiple ROI boxes, it would become a bit of a pain to define complex/non-rectangular zones.

If the focus is on simplicity, how about having the option to supply an exclusion mask? This would be a transparent PNG of the same dimensions as the source image. In this file we black out the areas to exclude, then overlay it on top of the source image before passing it on to Rekognition. The image your addon saves out with the boxes etc. drawn on could still use the original source image, rather than the one we actually sent to Rekognition.

This method would allow very complex zones to be applied easily.

geftactics avatar Feb 07 '21 11:02 geftactics

Can you provide a working example, either a PR or just some working python code?

Your suggestion would provide simple config but then require users to use an extra tool to generate the mask. I am open to the idea of a Hass.io addon which could be used for that. But then again, an addon could also be used to generate the config, even if it was just some complicated text.

robmarkcole avatar Feb 07 '21 12:02 robmarkcole

I did try and do it against the actual codebase, but couldn't quite fathom it... Here's a basic example in isolation to show the concept working though...

from PIL import Image, UnidentifiedImageError

config_maskfile = 'mask.png'

camera = Image.open('camera.jpg')
image_for_rekognition = camera.copy()

if config_maskfile:
    try:
        # Convert to RGBA so the mask's alpha channel controls the overlay
        mask = Image.open(config_maskfile).convert('RGBA')
        image_for_rekognition.paste(mask, (0, 0), mask)
    except (FileNotFoundError, UnidentifiedImageError):
        print('ERROR: Could not open a valid mask file')

# Use this for saving locally and adding boxes
camera.show()

# Send this for processing by Rekognition
image_for_rekognition.show()

[Images: camera, mask]

geftactics avatar Feb 07 '21 12:02 geftactics

How is the mask created?

robmarkcole avatar Feb 07 '21 12:02 robmarkcole

Transparent PNG file - Can use any image editor, gimp, photoshop, etc... I'm sure most people with the level of technical ability to be running home assistant with custom addons, and AWS keys etc will have no problem doing this?

geftactics avatar Feb 07 '21 13:02 geftactics

Probably true, but I want to roll this back into HA at some point and simplify as much as possible. One issue with using a mask before processing is that any object on the border risks not being detected, since it will be cropped. Any comment on this? My current approach runs detection on the full frame, then uses the center of the object to decide whether it's inside the ROI or not.
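
A minimal sketch of the center-point check described above, not the integration's actual code. It assumes bounding boxes arrive as fractions of the image size with `Left`, `Top`, `Width` and `Height` keys (the shape Rekognition uses for its bounding boxes); the function names and the `x_min`/`x_max`/`y_min`/`y_max` ROI keys are hypothetical.

```python
def box_centre(box):
    """Return the (x, y) centre of a box given as fractions of the image size."""
    x = box["Left"] + box["Width"] / 2
    y = box["Top"] + box["Height"] / 2
    return x, y


def centre_in_roi(box, roi):
    """True if the centre of `box` falls inside the rectangular `roi`."""
    x, y = box_centre(box)
    return roi["x_min"] <= x <= roi["x_max"] and roi["y_min"] <= y <= roi["y_max"]


# Hypothetical ROI covering the middle half of the frame
roi = {"x_min": 0.25, "x_max": 0.75, "y_min": 0.0, "y_max": 1.0}

# A box that straddles the left ROI border but whose centre is inside
person_on_border = {"Left": 0.125, "Top": 0.25, "Width": 0.5, "Height": 0.5}
# A box that lies entirely left of the ROI
person_outside = {"Left": 0.0, "Top": 0.25, "Width": 0.125, "Height": 0.5}

print(centre_in_roi(person_on_border, roi))
print(centre_in_roi(person_outside, roi))
```

Because detection runs on the untouched frame, an object that straddles the border is still detected in full; only the decision about whether it counts depends on the ROI.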

robmarkcole avatar Feb 07 '21 13:02 robmarkcole

Ah, that is a good point - I might have a look at how we can specify polygon coordinates then

geftactics avatar Feb 07 '21 13:02 geftactics

Building a polygon and then using shapely to test whether a point lies within the polygon is fairly effortless... Here I build a simple 4 point polygon for my ROI (try using my camera image above as camera.jpg). I've also added two test points, one inside and one outside. We're loading a list of coordinates from YAML, then converting them to a list of tuples...

import yaml
from PIL import Image, ImageDraw
from shapely.geometry import Point, Polygon

# Load the ROI vertices and convert each [x, y] pair to a tuple.
# The result is stored in `config` so the yaml module is not shadowed.
with open('test.yaml') as file:
    config = yaml.safe_load(file)
roi_points = [tuple(p) for p in config['roi_points']]

# Draw the ROI outline on the camera image as a visual check
camera = Image.open('camera.jpg')
draw = ImageDraw.Draw(camera)
draw.polygon(roi_points, outline='LightGreen')
camera.show()

poly = Polygon(roi_points)

test_point_1 = Point(100, 100)  # Outside our polygon
test_point_2 = Point(400, 400)  # Inside our polygon

print(test_point_1.within(poly))  # False
print(test_point_2.within(poly))  # True

and then, this loads from a yaml file like so...

roi_points:
  - [300, 0]
  - [440, 0]
  - [1250, 1070]
  - [400, 1070]

geftactics avatar Feb 07 '21 15:02 geftactics

@geftactics I like this suggestion, will do. To assist with config, my plan is to create a Streamlit app that lets the user draw all the required ROIs (polygons require this feature), then generates the required YAML, which can be pasted into their config. This app can also be hosted for free.

robmarkcole avatar Feb 08 '21 06:02 robmarkcole

@geftactics do you know of any other integrations (preferably official) that are using config which involves an array?

robmarkcole avatar Feb 09 '21 05:02 robmarkcole

I think many do, but just not in YAML formatted this way... Not sure if it helps, but my code above will also work with YAML of the following format:

roi_points:
  - 
    - 300
    - 0
  -
    - 440
    - 0
  -
    - 1250
    - 1070
  -
    - 400
    - 1070

geftactics avatar Feb 09 '21 08:02 geftactics

Based on https://github.com/robmarkcole/HASS-amazon-rekognition/issues/92 I have chosen not to use Shapely. Also, given the convoluted config that appears to be necessary, and the fact that this is incompatible with config flow, this whole approach of using ROI needs to be considered carefully. I am beginning to think that what is required is a separate and dedicated integration/tool for monitoring the object_detected events.

robmarkcole avatar Feb 10 '21 03:02 robmarkcole

I thought about the options above a bit more, and have one more idea to throw into the mix...

  • We use a mask image as per the example above, but do not crop/mask the original image before sending it to AWS Rekognition.
  • The image is sent for detection as it is now, and for each object we look at the center point.
  • The pixel at the equivalent center point on the mask image is then read using PIL's Image.getpixel().
  • If the pixel is of a specific (configurable?) colour, we consider it within our ROI (so if the pixel was white in my example).

Essentially the user supplies a mask image with a coloured zone as the ROI; if the center point falls on a pixel of that colour, it's valid.

Simple config, very flexible/multiple ROI zones... Can be implemented with PIL easily.
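
The lookup described above could be sketched roughly like this. It is only an illustration: the `centre_in_mask` helper, the `ROI_COLOUR` setting and the 1920x1080 frame size are assumptions, and the mask is generated in memory as a stand-in for a file a user would draw in an image editor.

```python
from PIL import Image, ImageDraw

ROI_COLOUR = (255, 255, 255)  # hypothetical config value: colour that marks the ROI


def centre_in_mask(mask, centre, roi_colour=ROI_COLOUR):
    """True if the RGB mask pixel under `centre` (x, y in pixels) has the ROI colour."""
    x, y = int(centre[0]), int(centre[1])
    # Points outside the mask can never be inside the ROI
    if not (0 <= x < mask.width and 0 <= y < mask.height):
        return False
    return mask.getpixel((x, y))[:3] == tuple(roi_colour)


# Stand-in for a user-drawn mask: black background with a white ROI polygon.
# A real mask would be loaded with Image.open(path).convert("RGB").
mask = Image.new("RGB", (1920, 1080), (0, 0, 0))
ImageDraw.Draw(mask).polygon(
    [(300, 0), (440, 0), (1250, 1070), (400, 1070)], fill=ROI_COLOUR
)

print(centre_in_mask(mask, (400, 400)))  # centre inside the white zone
print(centre_in_mask(mask, (100, 100)))  # centre on the black background
```

Several disconnected zones need no extra config with this approach: any pixel painted in the ROI colour counts.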

geftactics avatar Feb 12 '21 11:02 geftactics

That is an interesting suggestion. A person could literally use a Paint application to draw whatever regions they would want, and we would not need to do any kind of complex calculation to determine if an object is inside the region - just look up a pixel value :-) As mentioned in above comment I think this justifies a standalone integration, which could then be used by any image processing integration that outputs object locations

robmarkcole avatar Feb 12 '21 11:02 robmarkcole

We do need to allow the user to reference the colour they have used in the YAML. For example, we can't always rely on black, as some users may naturally have black areas in their camera image. Ideally they would pick a contrasting colour that is not likely to appear in the frame.

Hopefully having it as a separate integration would still allow me to use it in my automations!

I do motion alert -> send to rekognition -> if person count>1 send alert

geftactics avatar Feb 12 '21 14:02 geftactics

I think for the time being I will do:

  • allow multiple ROI, named
  • add 4 point polygon

Combining these two a person can create a wide range of ROI
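
As a rough sketch of how named ROIs and 4 point polygons might combine without Shapely, the example below uses a plain ray-casting point-in-polygon test. The ROI names, the coordinates of the second zone and both function names are made up for illustration; this is not the integration's implementation.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside `polygon` (list of (x, y) vertices)."""
    x, y = point
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does the edge j -> i cross the horizontal ray running right from the point?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def matching_rois(point, rois):
    """Return the names of all ROIs that contain `point`."""
    return [name for name, polygon in rois.items() if point_in_polygon(point, polygon)]


# Hypothetical named ROIs, each a 4 point polygon in pixel coordinates
rois = {
    "path": [(300, 0), (440, 0), (1250, 1070), (400, 1070)],
    "driveway": [(0, 600), (250, 600), (250, 1070), (0, 1070)],
}

print(matching_rois((400, 400), rois))
print(matching_rois((100, 800), rois))
print(matching_rois((100, 100), rois))
```

Returning the ROI names rather than a single boolean would let an automation react differently depending on which zone an object's center falls in.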

robmarkcole avatar Feb 13 '21 09:02 robmarkcole

Works for me!! :)

geftactics avatar Feb 13 '21 09:02 geftactics

After more thought I think the best approach is to apply the mask to the image before processing the image

robmarkcole avatar Jul 23 '21 06:07 robmarkcole

If you are open to the masking idea - maybe the method posted above on 12th Feb will yield the most accurate results when objects are cropped by the mask

geftactics avatar Jul 23 '21 06:07 geftactics

The binary mask should be 0/1 pixel values only, this is then applied to the image before processing. The config will then simply be the path to the mask file
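
A minimal sketch of applying such a mask with PIL, assuming 1 means "keep" and 0 means "black out" (the thread does not say which way round it is). The `apply_binary_mask` name and the tiny in-memory images are stand-ins for illustration.

```python
from PIL import Image


def apply_binary_mask(image, mask):
    """Black out every pixel of `image` where the 0/1 `mask` is 0."""
    if mask.size != image.size:
        raise ValueError("mask and image must have the same dimensions")
    # Scale 0/1 values to 0/255 so PIL can use the mask as a stencil
    stencil = mask.convert("L").point(lambda value: 255 if value else 0)
    black = Image.new(image.mode, image.size)  # filled with zeros, i.e. black
    return Image.composite(image, black, stencil)


# Stand-ins for a camera frame and a mask file
frame = Image.new("RGB", (4, 2), (200, 150, 100))
mask = Image.new("L", (4, 2), 0)
mask.paste(1, (0, 0, 2, 2))  # keep the left half of the frame

masked = apply_binary_mask(frame, mask)
print(masked.getpixel((0, 0)), masked.getpixel((3, 1)))
```

As noted earlier in the thread, objects that straddle the mask edge are partly blacked out before detection, so they may be missed.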

robmarkcole avatar Jul 23 '21 07:07 robmarkcole