
No OpKernel was registered to support Op 'NMS' with these attrs

Open liumusicforever opened this issue 7 years ago • 14 comments

Hello, I'm trying to run your test.py in my environment, but I hit the problem below. Did I make some mistake? (I can import /lib/lib_kernel/lib_fast_nms/nms_op.py but can't use it.)

Caused by op 'resnet_v1_101_5/NMS', defined at:
  File "test.py", line 242, in <module>
    eval_all(args)
  File "test.py", line 162, in eval_all
    proc.start()
  File "/usr/lib/python3.4/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.4/multiprocessing/context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.4/multiprocessing/context.py", line 267, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.4/multiprocessing/popen_fork.py", line 21, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.4/multiprocessing/popen_fork.py", line 77, in _launch
    code = process_obj._bootstrap()
  File "/usr/lib/python3.4/multiprocessing/process.py", line 254, in _bootstrap
    self.run()
  File "/usr/lib/python3.4/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "test.py", line 107, in worker
    func, inputs = load_model(model_file, dev)
  File "test.py", line 38, in load_model
    net.inference('TEST', inputs)
  File "/home/aipr/dennis_codebase/light_head_rcnn/experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign/network_desp.py", line 164, in inference
    anchors, num_anchors, is_tfchannel=True, is_tfnms=False)
  File "/home/aipr/dennis_codebase/light_head_rcnn/lib/detection_opr/rpn_batched/proposal_opr.py", line 95, in proposal_opr
    cur_proposals, nms_thresh, post_nms_topN)
  File "<string>", line 43, in nms
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/op_def_library.py", line 328, in apply_op
    op_type_name, name, **keywords)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'NMS' with these attrs.  Registered devices: [CPU], Registered kernels:
  device='GPU'; T in [DT_FLOAT]

         [[Node: resnet_v1_101_5/NMS = NMS[T=DT_FLOAT, max_out=1000, nms_overlap_thresh=0.7](resnet_v1_101_5/Gather)]]

liumusicforever avatar Apr 22 '18 08:04 liumusicforever

@liumusicforever Please compile nms first. Ensure the .so is generated in /lib/lib_kernel/lib_fast_nms/. If not, just run make.sh.
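
As a quick sanity check (a sketch only; the library name fast_nms.so and its path are taken from this thread, so adjust them to whatever make.sh actually produces), you can verify that the compiled kernel can be loaded:

import os
import tensorflow as tf

# Path/name of the compiled NMS kernel as mentioned in this thread (adjust if different).
so_path = 'lib/lib_kernel/lib_fast_nms/fast_nms.so'

assert os.path.exists(so_path), 'fast_nms.so not found; run make.sh first'
nms_module = tf.load_op_library(so_path)  # raises NotFoundError if the .so cannot be loaded
print('Loaded custom NMS op library:', nms_module)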

zengarden avatar Apr 22 '18 09:04 zengarden

@zengarden I already ran make.sh and have a fast_nms.so. Why does it still print the error above? Is it a path-linking problem?

liumusicforever avatar Apr 22 '18 11:04 liumusicforever

Update: python3 nms_op.py runs fine, but I get the same error message when running python3 nms_test.py:

    nms_out = nms_op.nms(rois, 0.5, 200)
  File "<string>", line 43, in nms
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/op_def_library.py", line 328, in apply_op
    op_type_name, name, **keywords)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'NMS' with these attrs.  Registered devices: [CPU], Registered kernels:
  device='GPU'; T in [DT_FLOAT]

         [[Node: NMS = NMS[T=DT_FLOAT, max_out=200, nms_overlap_thresh=0.5](Const)]]

Something may be going wrong in the call to the nms_op.nms operation. Thanks!

liumusicforever avatar Apr 22 '18 13:04 liumusicforever

NMS works on my machine. If you still have the problem, you can change is_tfnms to True to use tf_nms directly in network_desp.py:

            rois, roi_scores = proposal_opr(
                rpn_cls_prob, rpn_bbox_pred, im_info, mode, cfg.stride,
                anchors, num_anchors, is_tfchannel=True, is_tfnms=False)
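
For reference, a sketch of the same call with only the suggested flag flipped (nothing else changed):

            rois, roi_scores = proposal_opr(
                rpn_cls_prob, rpn_bbox_pred, im_info, mode, cfg.stride,
                anchors, num_anchors, is_tfchannel=True, is_tfnms=True)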

zengarden avatar Apr 23 '18 09:04 zengarden

Thank you very much for your help! I found some problems with my process:

  1. I didn't change the CUDA_VISIBLE_DEVICES variable to zero (I have only one GPU).
  2. I couldn't run this repo with cudnn==7.1.3, but fixed it by reinstalling 7.0.4. Here was the error message:
2018-04-24 08:57:14.806152: E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7103 (compatibility version 7100) but source was compiled with 7004 (compatibility version 7000).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2018-04-24 08:57:14.806720: F tensorflow/core/kernels/conv_ops.cc:717] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
Aborted (core dumped)

Thanks again!

liumusicforever avatar Apr 24 '18 04:04 liumusicforever

Hello, I got very similar errors. I'm using tensorflow 1.8, cuda-9.0 and cudnn 7.0.5. I tried changing is_tfnms to True to use tf_nms directly in network_desp.py, but I got a different error instead.

     [[Node: resnet_v1_101_6/PSAlignPool = PSAlignPool[T=DT_FLOAT, group_size=7, sample_height=2, sample_width=2, spatial_scale=0.0625](resnet_v1_101_6/Relu, resnet_v1_101_5/concat_3)]]

Caused by op 'resnet_v1_101_6/PSAlignPool', defined at:
  File "test.py", line 241, in <module>
    eval_all(args)
  File "test.py", line 131, in eval_all
    func, inputs = load_model(model_file, devs[0])
  File "test.py", line 38, in load_model
    net.inference('TEST', inputs)
  File "/home/yanjun/projects/light_head_rcnn/experiments/lizeming/light_head_rcnn.ori_res101.coco.ps_roialign/network_desp.py", line 199, in inference
    sample_height=2, sample_width=2, spatial_scale=1.0/16.0)
  File "<string>", line 48, in ps_align_pool
  File "/home/yanjun/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 328, in apply_op
    op_type_name, name, **keywords)
  File "/home/yanjun/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/yanjun/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/yanjun/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'PSAlignPool' with these attrs.  Registered devices: [CPU], Registered kernels:
  device='GPU'; T in [DT_FLOAT]

     [[Node: resnet_v1_101_6/PSAlignPool = PSAlignPool[T=DT_FLOAT, group_size=7, sample_height=2, sample_width=2, spatial_scale=0.0625](resnet_v1_101_6/Relu, resnet_v1_101_5/concat_3)]]

yanjunxu avatar May 11 '18 21:05 yanjunxu

You can check your tensorflow env (whether it is the GPU version or not). The problem may be that the op kernel is only registered for GPU, while your environment only has a CPU device registered (i.e., the CPU build of tensorflow is installed).
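
A minimal sketch (assuming TensorFlow 1.x, as used elsewhere in this thread) to check whether the install is the GPU build and which devices are visible:

import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.test.is_built_with_cuda())   # True only for the GPU build of TensorFlow
print(tf.test.is_gpu_available())     # True if a usable GPU device is registered
# List every device ops can be placed on, e.g. /device:CPU:0, /device:GPU:0
print([d.name for d in device_lib.list_local_devices()])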

liumusicforever avatar May 29 '18 08:05 liumusicforever

@zengarden Hi, your work is excellent! In my case I cannot use your fast_nms because there is no CUDA 9 on my machine. If I choose tf_nms, apart from modifying this call:

            rois, roi_scores = proposal_opr(
                rpn_cls_prob, rpn_bbox_pred, im_info, mode, cfg.stride,
                anchors, num_anchors, is_tfchannel=True, is_tfnms=True)

is there anywhere else I should modify?

liangzimei avatar Jun 11 '18 07:06 liangzimei

If you only have one GPU, set the variable os.environ["CUDA_VISIBLE_DEVICES"] = '0' rather than '1'.
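
A sketch of where this goes (placing it at the top of test.py, before any TensorFlow/CUDA initialization, is an assumption about your setup):

import os

# Must be set before TensorFlow initializes CUDA, e.g. at the top of test.py.
os.environ["CUDA_VISIBLE_DEVICES"] = '0'   # expose only GPU 0 to the process

import tensorflow as tf  # import (or create the session) only after the variable is set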

esdgdhln avatar Jun 13 '18 08:06 esdgdhln

Hi @liumusicforever, when you run test.py, have you met the error shown in the attached screenshot? I really can't solve the problem, please help me. Thanks.

wsycl avatar Jun 18 '18 02:06 wsycl

@wsycl Can you pass nms_test.py in light_head_rcnn/lib/lib_kernel/lib_fast_nms/?

Also make sure the .so is in your lib_fast_nms folder.

liumusicforever avatar Jun 19 '18 04:06 liumusicforever

@liumusicforever Thanks, I have solved the problems. Can I ask you one more question? How do I change the .json file to the .odgt format?

wsycl avatar Jun 22 '18 02:06 wsycl

@wsycl You can convert your JSON metadata to the .odgt format. Here is a simple example that converts a Pascal VOC dataset to odgt (in your case, you can focus on the main() function):

import os
import json
import xml.etree.cElementTree as ET

import cv2

def get_img_list(work_path):
    """Collect (image_path, annotation_path) pairs under a VOC-style tree."""
    image_list = []
    for root, subdirs, files in os.walk(work_path):
        for img_filename in files:
            if '.jpg' in img_filename:
                image_path = os.path.join(root, img_filename)
                annot_filename = img_filename.replace('.jpg', '.xml')
                annot_path = os.path.join(root, '..', 'Annotations', annot_filename)
                if os.path.exists(image_path) and os.path.exists(annot_path):
                    image_list.append([image_path, annot_path])
    return image_list

def xmlreader(xml_file_path):
    """Parse a VOC annotation; return image width, height and normalized boxes."""
    boxlist = []
    if not os.path.exists(xml_file_path):
        print('No xml file')
        return 0, 0, boxlist

    content = open(xml_file_path).read()
    if len(content) < 10:  # empty or truncated annotation
        return 0, 0, boxlist

    root = ET.fromstring(content)
    size = root.find('size')[:]
    w = float(size[0].text)   # <width>
    h = float(size[1].text)   # <height>

    for ob in root.findall('object'):
        xmin, ymin, xmax, ymax = ob.find('bndbox')[:]
        # Normalize corner coordinates to [0, 1] so they can be rescaled later.
        xmin = float(xmin.text) / w
        ymin = float(ymin.text) / h
        xmax = float(xmax.text) / w
        ymax = float(ymax.text) / h
        boxlist.append([xmin, ymin, xmax, ymax])

    return w, h, boxlist

def voc_parser(root, limit=1000000000):
    """Return [image_path, normalized_boxes] for every annotated image."""
    image_list = []
    for i, (img_path, annot_path) in enumerate(get_img_list(root)):
        w, h, bboxes = xmlreader(annot_path)
        image_list.append([img_path, bboxes])
        if i % 1000 == 0:
            print('get training items : {}'.format(len(image_list)))
        if i > limit:
            return image_list
    return image_list

def main():
    voc_root = '<path to pascal dataset>'
    output = 'out.odgt'
    output_list = []
    for i, (img_path, bboxes) in enumerate(voc_parser(voc_root)):
        print('processing {} images : {}'.format(i, img_path))
        img = cv2.imread(img_path)
        h, w = img.shape[:2]
        meta_dict = {
            "gtboxes": [],
            "fpath": img_path,
            "dbName": "COCO",
            "dbInfo": {
                "vID": "COCO_trainval2014_womini",
                "frameID": -1
            },
            "width": w,
            "height": h,
            "ID": img_path.split('/')[-1]
        }
        for bbox in bboxes:
            xmin, ymin, xmax, ymax = bbox[:]
            # Convert normalized corners back to pixels: box center and half extents.
            xcen = int((xmin + xmax) / 2 * w)
            ycen = int((ymin + ymax) / 2 * h)
            x_w = int((xmax - xmin) / 2 * w)
            y_w = int((ymax - ymin) / 2 * h)
            box_info_dict = {
                "box": [xcen, ycen, x_w, y_w],
                "occ": 0,
                "tag": "person",
                "extra": {
                    "ignore": 0
                }
            }
            meta_dict['gtboxes'].append(box_info_dict)
        # One JSON record per line, which is the .odgt convention.
        output_list.append(json.dumps(meta_dict) + "\n")
    with open(output, "w") as f:
        f.writelines(output_list)

if __name__ == '__main__':
    main()
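
To use it, point voc_root at your Pascal VOC dataset root (kept as a placeholder above) and run the script; each line of out.odgt is one JSON record describing an image and its boxes.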


liumusicforever avatar Jun 27 '18 06:06 liumusicforever

@liumusicforever, thanks a lot.

wsycl avatar Jun 27 '18 07:06 wsycl