py-feat
Inconsistent order of detected faces
When detect_video() finds multiple faces, they do not appear to have a consistent ordering with respect to their position in the video.
For example, in a recorded video call between two people, where one speaker is in a box on the left and the other speaker in a box on the right, each frame index has two rows in the resulting dataframe from detect_video(), one for each face. But sometimes the left speaker is the first entry in that frame index, and sometimes the second. This is apparent from the FaceRectX value.
   frame   FaceRectX  FaceRectY
2     48  418.904871  43.552213
3     48   83.042467  91.174475
4     72   93.583826  92.987578
5     72  421.968295  43.727639
For a two-speaker video call, it's easy enough to group by index and order by X value (and multi-speaker calls could do the same thing using both X and Y); maybe consider putting a note in the docs stating that order isn't guaranteed?
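For anyone hitting the same thing, here is a minimal sketch of that workaround in plain pandas. It assumes the output has the frame, FaceRectX and FaceRectY columns shown above and that each speaker stays on their own side of the screen. The helper name order_faces_by_position and the face_rank column are made up for illustration; they are not part of py-feat.

```python
import pandas as pd


def order_faces_by_position(df, keys=("FaceRectX", "FaceRectY")):
    """Sort rows within each frame by position and number the faces.

    Hypothetical helper, not a py-feat function. Assumes `df` has a
    `frame` column plus the position columns named in `keys`.
    """
    # Sort by frame first, then left-to-right (and top-to-bottom on ties).
    ordered = df.sort_values(["frame", *keys]).copy()
    # 0 = leftmost face in the frame, 1 = next one to the right, etc.
    ordered["face_rank"] = ordered.groupby("frame").cumcount()
    return ordered


# The four rows from the example above, with their original index values.
faces = pd.DataFrame(
    {
        "frame": [48, 48, 72, 72],
        "FaceRectX": [418.904871, 83.042467, 93.583826, 421.968295],
        "FaceRectY": [43.552213, 91.174475, 92.987578, 43.727639],
    },
    index=[2, 3, 4, 5],
)

ordered = order_faces_by_position(faces)
print(ordered)
```

After this, face_rank 0 is always the left speaker and face_rank 1 the right speaker, so filtering on face_rank gives one consistent track per person. For a grid of several speakers, a plain sort on X then Y can be thrown off by small jitter in X between faces in the same column, so binning the X values first is probably safer.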
This issue is related to #198, although it's simpler for the video call use case, as heads aren't moving around much and so it doesn't require a latent representation to keep track.