yolov5face-toolkit
What do the output dimensions mean in the face-detection-only model?
The output is of type float32[1,25200,16].
What does the 16 mean here?
my blog about this model: https://zhuanlan.zhihu.com/p/461878005
arxiv paper: https://arxiv.org/abs/2105.12931
official repo: https://github.com/deepcam-cn/yolov5-face
16 = 4 (bbox offsets) + 1 (object prob, foreground or not) + 10 (5 landmarks, x and y for each) + 1 (face prob)
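To make the breakdown concrete, here is a minimal sketch that splits one 16-value prediction row into its parts. It assumes the channels are stored in the order listed above and that the landmarks are interleaved as x1, y1, x2, y2, ...; please check both against the official repo before relying on it. The names `split_row` and `num_predictions` are hypothetical helpers, not part of yolov5-face. The second helper shows where 25200 would come from, assuming a 640x640 input, strides 8/16/32 and 3 anchors per grid cell.

```python
def split_row(row):
    """Split one prediction row of 16 values into its parts.

    Assumed channel order (taken from the 4 + 1 + 10 + 1 breakdown):
    [0:4] bbox offsets, [4] object prob, [5:15] landmarks, [15] face prob.
    """
    if len(row) != 16:
        raise ValueError("expected 16 values per prediction")
    box = list(row[0:4])
    obj_prob = row[4]
    flat = list(row[5:15])
    # Assumption: landmarks are interleaved as (x, y) pairs.
    landmarks = [(flat[i], flat[i + 1]) for i in range(0, 10, 2)]
    face_prob = row[15]
    return box, obj_prob, landmarks, face_prob


def num_predictions(input_size=640, strides=(8, 16, 32), anchors_per_cell=3):
    """Count the prediction rows under the assumed input size and head layout."""
    return anchors_per_cell * sum((input_size // s) ** 2 for s in strides)


# Dummy row standing in for output[0][i]; the values are just the channel indices.
row = [float(i) for i in range(16)]
box, obj_prob, landmarks, face_prob = split_row(row)
print(box, obj_prob, landmarks, face_prob)
print(num_predictions())
```

The same slicing works on a row taken from the real output (for example `output[0][i]`), since it only relies on indexing. A common final score is object prob multiplied by face prob, but that is the usual YOLOv5 convention rather than something stated in this thread.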