Craig Hsin
Thank you author, and thanks @haoranD. I have reviewed setting.py/config.py and found that the only differences are the validation batch size and the number of training epochs. But from "12-28_09-18_SHHB_Res50_1e-05.txt"...
Dear all, this is caused by a normalization issue; the flow_norm fix can solve the problem: https://github.com/tomrunia/OpticalFlow_Visualization/pull/7
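As a rough illustration of what such a normalization does (a minimal sketch of the idea, not the actual PR code; `flow_to_color` is the library's existing entry point, while the pre-scaling step is my assumption about the fix's intent):

```python
import numpy as np
import flow_vis  # the packaged module from this repository

def visualize_normalized(flow_uv):
    # flow_uv: HxWx2 array of (u, v) flow vectors.
    # Scale so the largest motion magnitude maps to 1; without this,
    # small flows render as washed-out colors and large flows saturate.
    rad = np.sqrt(flow_uv[..., 0] ** 2 + flow_uv[..., 1] ** 2)
    max_rad = max(rad.max(), 1e-8)  # guard against division by zero
    return flow_vis.flow_to_color(flow_uv / max_rad)
```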
Let me answer question (1) myself: since FRCNN/SDP/DPM come from different detectors, choosing one set is enough. But I am still curious about question (2).
Another possible reason I can think of is that the person class, rather than the vehicle class, is used in autoanchor. I found that yolov5 also ships with some horizontal anchors (see the sketch below): https://github.com/ultralytics/yolov5/blob/master/models/yolov5s.yaml#L7 -...
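For reference, the default anchors at that line of yolov5s.yaml (values as commonly shipped; they may differ across releases) include several wider-than-tall boxes, which is easy to check:

```python
# Default YOLOv5s anchors (w, h) in pixels, per detection scale.
# Copied from models/yolov5s.yaml; exact values may vary by release.
anchors = {
    "P3/8":  [(10, 13), (16, 30), (33, 23)],
    "P4/16": [(30, 61), (62, 45), (59, 119)],
    "P5/32": [(116, 90), (156, 198), (373, 326)],
}

for scale, boxes in anchors.items():
    for w, h in boxes:
        tag = "horizontal" if w > h else "vertical"
        print(f"{scale}: {w}x{h} aspect={w / h:.2f} ({tag})")
```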
Dear author: I just found the bug. If you print out https://github.com/hustvl/YOLOP/blob/main/lib/utils/autoanchor.py#L87, `shapes = img_size * dataset.shapes / dataset.shapes.max()`, you will find that shapes is [360, 640], but you...
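To see where [360, 640] comes from, here is a minimal reproduction of that line (the 720x1280 frame size and the layout of `dataset.shapes` are my assumptions; img_size=640 is the usual training size):

```python
import numpy as np

img_size = 640
# One BDD100K-style frame, stored as (height, width); this layout is an assumption.
shapes = np.array([[720, 1280]], dtype=float)

scaled = img_size * shapes / shapes.max()
print(scaled)  # -> [[360. 640.]], matching the value reported above
```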
@SoftwareGift Hello author, thanks for sharing. Two questions: 1. Is it OK to use RGB as the input to FeatherNetB? I see your input has three channels, or does the depth map also have three channels? 2. Following your description above, wouldn't only indices 0/1 of the 1024-dim feature vector be considered?
@lihuikenny Hello, someone suggested doing the following with FeatherNetB (I am using RGB input):

```python
output1 = net(image1)
soft_output = torch.softmax(output1, dim=-1)
preds = soft_output.to('cpu').detach().numpy()
_, predicted = torch.max(soft_output.data, 1)
predicted = predicted.to('cpu').detach().numpy()
print(predicted)
```

This feels odd to me: the feature vector has 1024 dimensions, yet the result only looks at index 0 or 1. So I don't quite understand how the author attaches BCELoss afterwards...
@lihuikenny Understood. Looking at the loss, the author probably applies BCELoss to the first two dimensions during training, and the author's reply suggests the same. But then the other dimensions produced by the streaming module are simply discarded, which seems a pity, so I had originally assumed otherwise. Adding an FC layer (1024->2) would be a reasonable approach, but the paper explicitly avoids FC layers to save cost, so I could not figure it out. In any case, the current input is a depth map and I probably only need to handle RGB, so I may look for other work to study. Thank you.
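For what it's worth, here is a sketch of that reading of the training setup (my interpretation, not the author's confirmed code; slicing off the first two dimensions is exactly the assumption under discussion):

```python
import torch
import torch.nn as nn

net_output = torch.randn(8, 1024)           # FeatherNetB's 1024-dim output (dummy batch)
labels = torch.randint(0, 2, (8,)).float()  # e.g. 0 = spoof, 1 = live

# Only the first two dimensions enter the loss; the remaining 1022
# produced by the streaming module receive no direct supervision.
probs = torch.softmax(net_output[:, :2], dim=-1)
loss = nn.BCELoss()(probs[:, 1], labels)
print(loss.item())
```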