yolov5-face
Looking for the correct way to export to ONNX
The ONNX exported by export.py is not the final result. Could someone share the correct conversion method? I'm a beginner.
What should the "correct" result look like, in your view?
What he probably meant is that the ONNX model only has the per-stride outputs instead of the detection layer output plus the strides. Sorry for answering in English, I don't know Chinese and used Google Translate. I'm also working out how to decode the stride outputs into detections so I can use ONNX Runtime or even OpenVINO as the inference engine; if I figure it out, I'll post it here.
You are right, I am not an expert in AI. Please, could any of you post the method and conversion script here? Thanks.
For the follow-up post-processing procedure, see https://github.com/deepcam-cn/yolov5-face/blob/0f695a0fad36a2d2299aa3afa4f05eceb344228b/torch2tensorrt/main.py#L94
Well, I also implemented the post-processing (Detection layer output) manually with NumPy. You can check the function below:
- pred: The inference result from the model
- stride: the stride array of your model, e.g. [8, 16, 32] for yolov5m
- image_size: a tuple (h,w) containing the input image shape
def process_anchors(pred, stride, image_size=(800,800)):
layers = 3
# anchors = len(ANCHORS[0])//2
grid = [np.zeros(1)]*layers
    a = np.array(ANCHORS).astype(float).reshape(layers, -1, 2) # shape (layers, anchors, 2)
anchor_grid = a.copy().reshape(layers,1,-1,1,1,2) # shape (layers, 1, anchors, 1, 1, 2)
z = []
for i in range(layers):
ny, nx = (dim // stride[i] for dim in image_size)
bs = pred[i].shape[0] # batch size
if grid[i].shape[2:4] != pred[i].shape[2:4]:
grid[i] = make_grid(nx, ny)
y = np.full_like(pred[i], 0)
class_range = list(range(5)) + list(range(15,15+1))
y[..., class_range] = sigmoid(pred[i][..., class_range])
y[..., 5:15] = pred[i][..., 5:15]
y[...,0:2] = (y[..., 0:2] * 2. - 0.5 + grid[i]) * stride[i] # xy (center)
y[...,2:4] = (y[..., 2:4] * 2) **2 * anchor_grid[i] # wh
y[..., 5:7] = y[..., 5:7] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
y[..., 7:9] = y[..., 7:9] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
        y[..., 9:11] = y[..., 9:11] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
y[..., 11:13] = y[..., 11:13] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
y[..., 13:15] = y[..., 13:15] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
z.append(y.reshape(bs, -1, 16))
return z
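For reference, a minimal sketch of how I feed the raw ONNX Runtime outputs through this function (it relies on the ANCHORS, sigmoid and make_grid helpers from the full script I post further down; the model path, image path and the stride_* output names are assumptions from my export setup):

# Minimal usage sketch (assumes a model exported with input 'data' and outputs
# named 'stride_8', 'stride_16', 'stride_32', as in the export script further down).
import cv2
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov5m-face.onnx")              # hypothetical model path
onames = [o.name for o in sess.get_outputs()]
stride = [int(n.split('_')[-1]) for n in onames]              # e.g. [8, 16, 32]

img = cv2.imread("face.jpg")                                  # hypothetical test image
img = cv2.resize(img, (800, 800))[:, :, ::-1]                 # BGR -> RGB (no letterbox here)
img = img.transpose(2, 0, 1)[np.newaxis].astype(np.float32) / 255.0

raw = sess.run(onames, {"data": img})                         # one tensor per stride level
decoded = process_anchors(raw, stride, img.shape[2:])         # list of (bs, -1, 16) arrays
pred = np.concatenate(decoded, axis=1)                        # (bs, N, 16), ready for NMS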
Also worth mentioning, I updated the export.py module and generated an ONNX model with dynamic input, so it accepts images with dimensions other than 800x800, since letterbox padding won't always produce that exact size.
The ONNX exported by export.py is not the final result. Could someone share the correct conversion method? I'm a beginner.
@beautyrank This is now supported; please see https://github.com/deepcam-cn/yolov5-face/pull/108#issue-1083085258
Hi @luisfmnunes I tried your conversion method and it seems to have a broadcasting issue. I am guessing it is because I am not setting ANCHORS properly. Could you share how you are setting those? Best, /M
@mlourencoeb you can find my full ONNX detection Python script below:
import os
import cv2
import sys
import time
import argparse
import numpy as np
import logging as log
import onnxruntime as ort
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
from .utils.datasets import letterbox
from .utils.general import check_img_size, xywh2xyxy, xyxy2xywh
ANCHORS = [[4,5, 8,10, 13,16], # P3/8
[23,29, 43,55, 73,105], # P4/16
[146,217, 231,300, 335,433]] # P5/32
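# Each row of ANCHORS holds the three (w, h) anchor pairs for one stride level (P3/8, P4/16, P5/32).
# In process_anchors these are reshaped to (layers, 1, anchors, 1, 1, 2), so anchor_grid[i]
# broadcasts against per-layer predictions of shape (bs, anchors, ny, nx, 16); a broadcasting
# error there usually means ANCHORS does not match the anchors the model was trained with.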
def iou(boxes, scores, iou_thres):
areas = (boxes[:,2] - boxes[:,0] + 1) * (boxes[:,3]-boxes[:,1] + 1) # (x2 - x1) * (y2 - y1)
order = scores.argsort()[::-1]
keep = []
while order.size > 0:
i = order[0]
keep.append(i)
xx1 = np.maximum(boxes[i,0], boxes[order[1:],0])
yy1 = np.maximum(boxes[i,1], boxes[order[1:],1])
xx2 = np.minimum(boxes[i,2], boxes[order[1:],2])
yy2 = np.minimum(boxes[i,3], boxes[order[1:],3])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= iou_thres)[0]
order = order[inds + 1]
return keep
def sigmoid(x):
return 1 / ( 1 + np.exp(-x) )
def make_grid(nx=20, ny=20):
xv, yv = np.meshgrid(np.arange(nx),np.arange(ny))
    return np.stack((xv, yv), 2).reshape(1, 1, ny, nx, 2).astype(float)
def process_anchors(pred, stride, image_size=(800,800)):
layers = 3
# anchors = len(ANCHORS[0])//2
grid = [np.zeros(1)]*layers
    a = np.array(ANCHORS).astype(float).reshape(layers, -1, 2) # shape (layers, anchors, 2)
anchor_grid = a.copy().reshape(layers,1,-1,1,1,2) # shape (layers, 1, anchors, 1, 1, 2)
z = []
for i in range(layers):
ny, nx = (dim // stride[i] for dim in image_size)
bs = pred[i].shape[0] # batch size
if grid[i].shape[2:4] != pred[i].shape[2:4]:
grid[i] = make_grid(nx, ny)
y = np.full_like(pred[i], 0)
class_range = list(range(5)) + list(range(15,15+1))
y[..., class_range] = sigmoid(pred[i][..., class_range])
y[..., 5:15] = pred[i][..., 5:15]
y[...,0:2] = (y[..., 0:2] * 2. - 0.5 + grid[i]) * stride[i] # xy (center)
y[...,2:4] = (y[..., 2:4] * 2) **2 * anchor_grid[i] # wh
y[..., 5:7] = y[..., 5:7] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
y[..., 7:9] = y[..., 7:9] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
y[..., 9:11] = y[..., 9:11] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
y[..., 11:13] = y[..., 11:13] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
y[..., 13:15] = y[..., 13:15] * anchor_grid[i] + grid[i] * stride[i] # All landmarks xy
z.append(y.reshape(bs, -1, 16))
return z
def nms_face(pred, conf_th = 0.3, iou_th = 0.5):
outputs = [None] * len(pred)
for xi,x in enumerate(pred):
x = x[np.where(x[:,4] > conf_th)]
# print(x)
min_wh, max_wh = 2, 4096
x[:, 15:] *= x[:, 4:5]
box = xywh2xyxy(x[:,:4])
conf = x[:, 15:].max(1, keepdims=True)
j = x[:, 15:].argmax(1)
# print(conf, j)
        x = np.concatenate((box, conf, x[:, 5:15], j.astype(float).reshape(-1, 1)), 1)[np.where(conf.flatten() > conf_th)]
if not x.shape[0]:
continue
c = x[:, 15:16] * max_wh
boxes, scores = x[:, :4] + c, x[:, 4]
i = iou(boxes, scores, iou_th)
# print(i)
outputs[xi] = x[i]
return [output for output in outputs if output is not None]
def clip_coords(boxes, img_shape):
    boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, img_shape[1])  # clip x1, x2 to image width
    boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, img_shape[0])  # clip y1, y2 to image height
def scale_coords(img1_shape, coords, img0_shape, ratio_pad=None):
if ratio_pad is None:
gain = min(img1_shape[0]/img0_shape[0], img1_shape[1]/img0_shape[1])
pad = (img1_shape[1] - img0_shape[1] * gain)/2, (img1_shape[0] - img0_shape[0] * gain) / 2
else:
gain = ratio_pad[0][0]
pad = ratio_pad[1]
coords[:, [0, 2]] -= pad[0] # x padding
coords[:, [1, 3]] -= pad[1] # y padding
coords[:, :4] /= gain # anchor coordinate to pixel coordinate
clip_coords(coords, img0_shape)
return coords
def scale_coords_landmarks(img1_shape, coords, img0_shape, ratio_pad=None):
if ratio_pad is None:
gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])
pad = (img1_shape[1] - img0_shape[1] * gain)/2, (img1_shape[0] - img0_shape[0] * gain) / 2
else:
gain = ratio_pad[0][0]
pad = ratio_pad[1]
coords[:, [0, 2, 4, 6, 8]] -= pad[0]
coords[:, [1, 3, 5, 7, 9]] -= pad[1]
    coords[:, :10] /= gain
    coords[:, [0, 2, 4, 6, 8]] = coords[:, [0, 2, 4, 6, 8]].clip(0, img0_shape[1])  # clip landmark x to image width
    coords[:, [1, 3, 5, 7, 9]] = coords[:, [1, 3, 5, 7, 9]].clip(0, img0_shape[0])  # clip landmark y to image height
return coords
def post_process(det, orgimg, img):
# print(det)
gn = np.array(orgimg.shape)[[1,0,1,0]]
gn_lks = np.array(orgimg.shape)[[1,0]*5]
results = []
for i, d in enumerate(det):
results += [[]]
if len(d):
d[:, :4] = scale_coords(img.shape[2:], d[:, :4], orgimg.shape).round()
d[:, 5:15] = scale_coords_landmarks(img.shape[2:], d[:, 5:15], orgimg.shape).round()
for j in range(d.shape[0]):
xywh = (xyxy2xywh(d[j, :4].reshape(1,4)) / gn).reshape(-1).tolist()
conf = d[j, 4]
landmarks = (d[j, 5:15]/gn_lks).reshape(-1).tolist()
class_num = d[j, 15]
orgimg = show_results(orgimg, xywh, conf, landmarks, class_num)
                results[i] += [[d[j, :4].reshape(-1).astype(int).tolist(), conf, d[j, 5:15].reshape(-1).astype(int).tolist()]]
return orgimg, results
def normcenter2point(img_shape, xywh):
h,w,c = img_shape
x1 = int(xywh[0] * w - 0.5 * xywh[2] * w)
y1 = int(xywh[1] * h - 0.5 * xywh[3] * h)
x2 = int(xywh[0] * w + 0.5 * xywh[2] * w)
y2 = int(xywh[1] * h + 0.5 * xywh[3] * h)
return x1,y1,x2,y2
def show_results(img, xywh, conf, landmarks, class_num):
h,w,c = img.shape
tl = 1 or round(0.002 * (h + w) / 2) + 1 # line/font thickness
x1, y1, x2, y2 = normcenter2point(img.shape, xywh)
cv2.rectangle(img, (x1,y1), (x2, y2), (0,255,0), thickness=tl, lineType=cv2.LINE_AA)
clors = [(255,0,0),(0,255,0),(0,0,255),(255,255,0),(0,255,255)]
for i in range(5):
point_x = int(landmarks[2 * i] * w)
point_y = int(landmarks[2 * i + 1] * h)
cv2.circle(img, (point_x, point_y), tl+1, clors[i], -1)
tf = max(tl - 1, 1) # font thickness
label = str(conf)[:5]
cv2.putText(img, label, (x1, y1 - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
return img
def detection(im_path, img_size = 800, conf_th = 0.3, iou_th = 0.5, model_path = '/home/luisnunes/Documents/yolov5m-face.onnx', threads = 1):
if isinstance(img_size,tuple):
img_size = img_size[0]
start = time.time()
sess_opt = ort.SessionOptions()
sess_opt.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_opt.inter_op_num_threads = threads
sess_opt.intra_op_num_threads = threads
model = ort.InferenceSession(model_path, sess_options=sess_opt)
inputs = model.get_inputs()
for i, name in enumerate([input.name for input in inputs]):
input_name = name
outputs = model.get_outputs()
onames = [o.name for o in outputs]
stride = []
for i, name in enumerate(onames):
stride += [int(name.split('_')[-1])]
max_stride = max(*stride)
origimg = cv2.imread(im_path)
if origimg is None:
        return None, None, None, None  # FIX THIS: it is throwing an error here!
img0 = origimg.copy()
h0, w0 = img0.shape[:2]
r = img_size / max(h0,w0)
if r != 1:
interpolation = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interpolation)
sz = check_img_size(img_size, s=max_stride)
img = letterbox(img0, new_shape=sz)[0]
img = img[:, :, ::-1].transpose(2,0,1).copy() #BGR to RGB and HWC to CHW
img = img.astype(np.float32)
img /= 255
img = img[np.newaxis, ...]
inf_start = time.time()
pred = model.run(onames, {input_name: img})
pred = process_anchors(pred, stride, img.shape[2:])
inf_end = time.time() - inf_start
# for i,p in enumerate(pred): # Transforms predictions into (bs, GridSize, 16)
# p = p.reshape(p.shape[0],-1,16)
# pred[i] = p
concat = np.concatenate(pred, axis=1) # Concatenates all 3 stride layers results for NMS
det = nms_face(concat, conf_th, iou_th)
out_img, results = post_process(det, origimg, img)
end = time.time() - start
return out_img, results, inf_end, end
def detect(im_file, img_size = 800, conf_th = 0.3, iou_th = 0.5, model_path = '/home/luisnunes/Documents/yolov5m-face.onnx', threads = 1, **kwargs):
if isinstance(img_size,tuple):
img_size = img_size[0]
start = time.time()
sess_opt = ort.SessionOptions()
sess_opt.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_opt.inter_op_num_threads = threads
sess_opt.intra_op_num_threads = threads
model = ort.InferenceSession(model_path, sess_options=sess_opt)
inputs = model.get_inputs()
for i, name in enumerate([input.name for input in inputs]):
input_name = name
outputs = model.get_outputs()
onames = [o.name for o in outputs]
stride = []
for i, name in enumerate(onames):
stride += [int(name.split('_')[-1])]
max_stride = max(*stride)
origimg = cv2.imread(im_file)
if origimg is None:
return None, None
img0 = origimg.copy()
h0, w0 = img0.shape[:2]
r = img_size / max(h0,w0)
if r != 1:
interpolation = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interpolation)
sz = check_img_size(img_size, s=max_stride)
img = letterbox(img0, new_shape=sz)[0]
img = img[:, :, ::-1].transpose(2,0,1).copy() #BGR to RGB and HWC to CHW
img = img.astype(np.float32)
img /= 255
img = img[np.newaxis, ...]
inf_start = time.time()
pred = model.run(onames, {input_name: img})
pred = process_anchors(pred, stride, img.shape[2:])
inf_end = time.time() - inf_start
# for i,p in enumerate(pred): # Transforms predictions into (bs, GridSize, 16)
# p = p.reshape(p.shape[0],-1,16)
# pred[i] = p
concat = np.concatenate(pred, axis=1) # Concatenates all 3 stride layers results for NMS
det = nms_face(concat, conf_th, iou_th)
out_img, results = post_process(det, origimg, img)
results = results[0]
end = time.time() - start
det = []
landmarks = []
for result in results:
det += [result[0]+[result[1]]] # [x1 y1 x2 y2 confidence]
landmarks += [result[2]]
return det, landmarks
def main(args):
stride = []
log.basicConfig(format="[ %(levelname)s ] %(message)s", level= log.INFO, stream = sys.stdout)
start = time.time()
log.info("Creating ONNX session from model {}".format(args.onnx_model))
sess_opt = ort.SessionOptions()
sess_opt.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_opt.inter_op_num_threads = 1
sess_opt.intra_op_num_threads = 1
model = ort.InferenceSession(args.onnx_model, sess_options=sess_opt)
inputs = model.get_inputs()
log.info("Model Inputs:")
for i, name in enumerate([input.name for input in inputs]):
input_name = name
print("\t[{}] - {}".format(i,name))
outputs = model.get_outputs()
onames = [o.name for o in outputs]
log.info("Model Outputs:")
for i, name in enumerate(onames):
print("\t[{}] - {}".format(i, name))
stride += [int(name.split('_')[-1])]
max_stride = max(*stride)
log.info("Maximum Stride = {}".format(max_stride))
origimg = cv2.imread(args.image)
if origimg is None:
log.error("Cannot load an image from file {}".format(args.image))
exit(-1)
img0 = origimg.copy()
h0, w0 = img0.shape[:2]
r = args.img_size / max(h0,w0)
if r != 1:
interpolation = cv2.INTER_AREA if r < 1 else cv2.INTER_LINEAR
img0 = cv2.resize(img0, (int(w0 * r), int(h0 * r)), interpolation=interpolation)
sz = check_img_size(args.img_size, s=max_stride)
img = letterbox(img0, new_shape=sz)[0]
# cv2.imshow('padded img', img)
# cv2.waitKey(10000)
img = img[:, :, ::-1].transpose(2,0,1).copy() #BGR to RGB and HWC to CHW
img = img.astype(np.float32)
img /= 255
img = img[np.newaxis, ...]
pred = model.run(onames, {input_name: img})
pred = process_anchors(pred, stride, img.shape[2:])
for i,p in enumerate(pred):
p = p.reshape(p.shape[0],-1,16)
print('[{}]: {}'.format(i, p.shape))
pred[i] = p
concat = np.concatenate(pred, axis=1)
print("Concat Shape:", concat.shape)
det = nms_face(concat, args.conf_th, args.iou_th)
out_img = post_process(det, origimg, img)[0]
log.info("Total elapsed time {:.4f}s".format(time.time()-start))
cv2.imshow("result",out_img)
cv2.waitKey(10000)
def parseArguments(argv):
parser = argparse.ArgumentParser()
parser.add_argument("onnx_model", type=str, help="Path to the onnx_model")
parser.add_argument("--image", type=str, help="Path to load an image for post inference")
parser.add_argument("--img_size", type=int, help="The size of image resizing", default=800)
parser.add_argument("--conf_th", type=float, help = "Threshold for confidence filtering", default=0.3)
parser.add_argument("--iou_th", type=float, help="Threshold for NMS intersection over union", default = 0.5)
return parser.parse_args(argv)
if __name__ == "__main__":
main(parseArguments(sys.argv[1:]))
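To try it out, a minimal way to use this would be (the detect_onnx.py file name, the model path and the image path are just placeholders for wherever you save the script and model):

# Hypothetical usage; file and model names are placeholders.
# From the command line, main() runs detection on one image and shows the result:
#   python detect_onnx.py yolov5m-face.onnx --image test.jpg --img_size 800
# Or import the detect() helper from another module:
from detect_onnx import detect

boxes, landmarks = detect("test.jpg", img_size=800, conf_th=0.3, iou_th=0.5,
                          model_path="yolov5m-face.onnx")
for (x1, y1, x2, y2, conf), lmk in zip(boxes, landmarks):
    print(x1, y1, x2, y2, conf, lmk)  # lmk holds the five landmark (x, y) pairs flattened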
Thanks for sharing @luisfmnunes. Still facing some issues: my model does not seem to have the stride-suffixed output names in its output layer. I just exported with export.py, changing only the model name. Any idea?
This is on me: my version is a bit behind the current commit of this repo, I didn't organize it as a fork (I mixed it into another repo of mine and removed the git elements to avoid a submodule dependency), and I made some changes to the inputs and outputs to match the conventions of my other application's workflow.
I made some tweaks myself to the export script, so basically this is my export.py
"""Exports a YOLOv5 *.pt model to ONNX and TorchScript formats
Usage:
$ export PYTHONPATH="$PWD" && python models/export.py --weights ./weights/yolov5s.pt --img 640 --batch 1
"""
import argparse
import sys
import time
sys.path.append('./') # to run '$ python *.py' files in subdirectories
import torch
import torch.nn as nn
import models
from models.experimental import attempt_load
from utils.activations import Hardswish, SiLU
from utils.general import set_logging, check_img_size
from onnxsim import simplify
import onnx
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='./yolov5s.pt', help='weights path') # from yolov5/models/
parser.add_argument('--img_size', nargs='+', type=int, default=[640, 640], help='image size') # height, width
parser.add_argument('--batch_size', type=int, default=1, help='batch size')
parser.add_argument('--onnx2pb', action='store_true', default=False, help='export onnx to pb')
opt = parser.parse_args()
opt.img_size *= 2 if len(opt.img_size) == 1 else 1 # expand
print(opt)
set_logging()
t = time.time()
# Load PyTorch model
model = attempt_load(opt.weights, map_location=torch.device('cpu')) # load FP32 model
model.eval()
labels = model.names
# Checks
gs = int(max(model.stride)) # grid size (max stride)
opt.img_size = [check_img_size(x, gs) for x in opt.img_size] # verify img_size are gs-multiples
# Input
img = torch.zeros(opt.batch_size, 3, *opt.img_size) # image size(1,3,320,192) iDetection
# Update model
for k, m in model.named_modules():
m._non_persistent_buffers_set = set() # pytorch 1.6.0 compatibility
if isinstance(m, models.common.Conv): # assign export-friendly activations
if isinstance(m.act, nn.Hardswish):
m.act = Hardswish()
elif isinstance(m.act, nn.SiLU):
m.act = SiLU()
# elif isinstance(m, models.yolo.Detect):
# m.forward = m.forward_export # assign forward (optional)
if isinstance(m, models.common.ShuffleV2Block):#shufflenet block nn.SiLU
for i in range(len(m.branch1)):
if isinstance(m.branch1[i], nn.SiLU):
m.branch1[i] = SiLU()
for i in range(len(m.branch2)):
if isinstance(m.branch2[i], nn.SiLU):
m.branch2[i] = SiLU()
model.model[-1].export = True # set Detect() layer export=True
model.model[-1].export_cat = False
y = model(img) # dry run
# ONNX export
print('\nStarting ONNX export with onnx %s...' % onnx.__version__)
f = opt.weights.replace('.pt', '.onnx') # filename
model.fuse() # only for ONNX
input_names=['data']
output_names=['stride_' + str(int(x)) for x in model.stride]
# output_names = ['outputs_{}'.format(int(model.stride.max()))]
dynamic_axes = {out: {0: '?', 2: '?', 3: '?'} for out in output_names}
dynamic_axes[input_names[0]] = {0: '?', 2: '?', 3: '?'}
torch.onnx.export(model, img, f, verbose=False, opset_version=12, input_names=input_names,
output_names=output_names, dynamic_axes=dynamic_axes)
#ONNX Simplifier
# Checks
onnx_model = onnx.load(f) # load onnx model
onnx.checker.check_model(onnx_model) # check onnx model
# print(onnx.helper.printable_graph(onnx_model.graph)) # print a human readable model
print('ONNX export success, saved as %s' % f)
# Finish
print('\nExport complete (%.2fs). Visualize with https://github.com/lutzroeder/netron.' % (time.time() - t))
print('Simplifying ONNX model')
model_simp, check = simplify(onnx_model, dynamic_input_shape=True, input_shapes={"data":(1,3,800,800)})
assert check, "Simplified ONNX model could not be validated"
onnx.save(model_simp, f)
print("\nExport simplified model complete.")
# PB export
if opt.onnx2pb:
print('download the newest onnx_tf by https://github.com/onnx/onnx-tensorflow/tree/master/onnx_tf')
from onnx_tf.backend import prepare
import tensorflow as tf
outpb = f.replace('.onnx', '.pb') # filename
# strict=True maybe leads to KeyError: 'pyfunc_0', check: https://github.com/onnx/onnx-tensorflow/issues/167
tf_rep = prepare(onnx_model, strict=False) # prepare tf representation
tf_rep.export_graph(outpb) # export the model
out_onnx = tf_rep.run(img) # onnx output
# check pb
with tf.Graph().as_default():
graph_def = tf.GraphDef()
with open(outpb, "rb") as f:
graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name="")
with tf.Session() as sess:
init = tf.global_variables_initializer()
input_x = sess.graph.get_tensor_by_name(input_names[0]+':0') # input
outputs = []
for i in output_names:
outputs.append(sess.graph.get_tensor_by_name(i+':0'))
out_pb = sess.run(outputs, feed_dict={input_x: img})
print(f'out_pytorch {y}')
print(f'out_onnx {out_onnx}')
print(f'out_pb {out_pb}')
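After exporting, a quick sanity check (just a sketch; point it at the .onnx file export.py wrote next to your weights) is to list the input and output names with onnxruntime; they should come out as data plus stride_8, stride_16 and stride_32:

# Sketch: confirm the exported graph exposes the expected input/output names.
import onnxruntime as ort

sess = ort.InferenceSession("yolov5m-face.onnx")   # path written by export.py
print([i.name for i in sess.get_inputs()])         # expected: ['data']
print([o.name for o in sess.get_outputs()])        # expected: ['stride_8', 'stride_16', 'stride_32']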
Hi @luisfmnunes
I converted to ONNX. I used the yolov5face ONNX model from https://github.com/DefTruth/lite.ai.toolkit/blob/main/docs/hub/lite.ai.toolkit.hub.onnx.md, but it always returns a face score of 0.98....
How can I use the ONNX model converted by your Python code above?
The input and output names look different:
data and stride_32? I am trying to run inference in C++.
The crucial part is the landmark detection. It feeds the recognition model, so better landmarks give better results.
By the way, do you know any quick way to judge face quality? I used https://github.com/deepcam-cn/FaceQuality and it looks very promising, but I can't convert it to ONNX, especially the backbone model output.
Since I am mainly using C++, I need ONNX.
Best
It's been a while since I used YOLOv5 for this purpose, but the script above simply takes command-line args to locate a pretrained PyTorch model (hence the weights option), builds a YOLOv5 nn.Module, loads the state dict onto it, and then exports an ONNX model from it. I don't remember whether the model has built-in pre- and post-processing, but I believe it is unlikely, so there are three options: implement the equivalent pre- and post-processing in C++, implement nn.Modules using PyTorch functions for the pre- and post-processing and export them as part of the ONNX model, or use any other ONNX compiler to build your own graph with the pre- and post-processing operations.
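As a rough illustration of the second option (only a sketch under my assumptions, not code from this repo: DecodeWrapper is a hypothetical module, and the decoding below mirrors the NumPy process_anchors math in torch), you could wrap the model so the exported graph already returns one decoded (bs, N, 16) tensor and the C++ side only has to run NMS:

# Hypothetical wrapper: bake the grid/anchor decoding into the exported graph so the
# ONNX model returns a single (bs, N, 16) tensor; only NMS remains on the C++ side.
import torch
import torch.nn as nn

class DecodeWrapper(nn.Module):
    def __init__(self, model, anchors, strides):
        super().__init__()
        self.model = model  # yolov5-face model with the Detect layer in export mode (assumption)
        self.register_buffer("strides", torch.tensor(strides, dtype=torch.float32))
        a = torch.tensor(anchors, dtype=torch.float32).view(len(strides), 1, -1, 1, 1, 2)
        self.register_buffer("anchor_grid", a)

    def forward(self, x):
        outs = self.model(x)  # per-stride tensors of shape (bs, na, ny, nx, 16)
        z = []
        for i, p in enumerate(outs):
            bs, na, ny, nx, no = p.shape
            yv, xv = torch.meshgrid(torch.arange(ny), torch.arange(nx))
            grid = torch.stack((xv, yv), 2).view(1, 1, ny, nx, 2).to(p.dtype)
            xy = (p[..., 0:2].sigmoid() * 2. - 0.5 + grid) * self.strides[i]   # box center
            wh = (p[..., 2:4].sigmoid() * 2) ** 2 * self.anchor_grid[i]        # box size
            obj = p[..., 4:5].sigmoid()                                        # objectness
            lmk = p[..., 5:15] * self.anchor_grid[i].repeat(1, 1, 1, 1, 5) \
                  + (grid * self.strides[i]).repeat(1, 1, 1, 1, 5)             # 5 landmark (x, y) pairs
            cls = p[..., 15:16].sigmoid()                                      # class score
            z.append(torch.cat((xy, wh, obj, lmk, cls), dim=-1).view(bs, -1, no))
        return torch.cat(z, 1)

# wrapped = DecodeWrapper(model, ANCHORS, [8, 16, 32])
# torch.onnx.export(wrapped, dummy_input, "yolov5m-face-decoded.onnx", opset_version=12, ...)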