
[Question] How to debug custom operators?

Open taras-sereda opened this issue 7 years ago • 3 comments

Hi. I would like to debug the C++/CUDA part of custom operators. What steps should I take to accomplish this?

What I've tried:

  1. Built MXNet with debug symbols.
  2. Ran gdb --args python ssd_debug.py
  3. Unfortunately, I was not able to step into the actual code.
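
A variation on step 2 that was not tried in this thread is to start the script normally and attach the debugger to the running process once the native library has been loaded. The sketch below only pauses the script and prints its process ID; `wait_for_debugger` is a hypothetical helper, not part of MXNet, and whether breakpoints then resolve still depends on the build and platform.

```python
import os
import sys


def wait_for_debugger(reader=input, stream=sys.stderr):
    """Print this process's PID and block until the user presses Enter.

    Hypothetical helper: call it after `import mxnet` so the shared
    library is already loaded when the debugger attaches.
    """
    pid = os.getpid()
    print(
        f"PID {pid}: attach with `gdb -p {pid}`, set breakpoints, "
        "continue, then press Enter here",
        file=stream,
    )
    reader()  # blocks until a line is entered on stdin
    return pid
```

Calling `wait_for_debugger()` just before `mod.forward(...)` would give a window in which to attach and set breakpoints.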

This is what I got from GDB:

[New Thread 0x1403 of process 2958]
warning: unhandled dyld version (15)
[New Thread 0x1503 of process 2958]

Thread 3 received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x1503 of process 2958]
0x0000000100005000 in ?? ()

The script:

import mxnet as mx


class SimpleData(object):

    def __init__(self, data):
        self.data = data


data_shape = (1,3,5,5)

data_node = mx.sym.Variable('data')
anchors = mx.sym.MultiBoxPrior(data_node, sizes='(0.1)', ratios='(1)', clip=True, name='multibox_layers')

mod = mx.mod.Module(anchors)
mod.bind(data_shapes=[('data', data_shape)])
mod.init_params()


input_data = mx.nd.ones(data_shape)
mod.forward(data_batch=SimpleData([input_data]))
bbox_priors = mod.get_outputs()[0].asnumpy()
print(bbox_priors.shape)

taras-sereda avatar Jan 27 '17 15:01 taras-sereda

First, use export MXNET_ENGINE_TYPE=NaiveEngine to disable multi-threading.
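
The same setting can also be applied from inside the script. This is a minimal sketch that assumes the variable has to be set before mxnet is imported; the helper name `use_naive_engine` is hypothetical.

```python
import os


def use_naive_engine(env=os.environ):
    """Request MXNet's synchronous NaiveEngine for this process.

    Hypothetical helper; it only sets the MXNET_ENGINE_TYPE variable.
    """
    env["MXNET_ENGINE_TYPE"] = "NaiveEngine"
    return env["MXNET_ENGINE_TYPE"]


use_naive_engine()
# Assumption: import mxnet only after the variable is set, e.g.
# import mxnet as mx
```

If it takes effect, the log should contain a line such as "MXNet start using engine: NaiveEngine".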

zhreshold avatar Jan 29 '17 21:01 zhreshold

Thanks for the suggestion. I tried MXNET_ENGINE_TYPE=NaiveEngine, and MXNet starts with the NaiveEngine, as stated in the logs. Still, I can't stop at the specified breakpoint.

How do you check the validity of the code while developing new operators? I was not able to find test cases for the core MXNet operators either, which seems quite strange to me.

(gdb) b MultiBoxPrior::Forward
Function "MultiBoxPrior::Forward" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (MultiBoxPrior::Forward) pending.
(gdb) run
Starting program: /usr/local/bin/python ssd_debug.py
[New Thread 0x1403 of process 45323]
warning: unhandled dyld version (15)
[New Thread 0x1503 of process 45323]

Thread 3 received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0x1503 of process 45323]
0x0000000100005000 in ?? ()
(gdb) c
Continuing.
[14:37:34] src/engine/engine.cc:36: MXNet start using engine: NaiveEngine
[[[ 0.05        0.05        0.15000001  0.15000001]
  [ 0.25        0.05        0.35000002  0.15000001]
  [ 0.44999999  0.05        0.55000001  0.15000001]
  [ 0.64999998  0.05        0.75        0.15000001]
  [ 0.85000002  0.05        0.95000005  0.15000001]
  [ 0.05        0.25        0.15000001  0.35000002]
  [ 0.25        0.25        0.35000002  0.35000002]
  [ 0.44999999  0.25        0.55000001  0.35000002]
  [ 0.64999998  0.25        0.75        0.35000002]
  [ 0.85000002  0.25        0.95000005  0.35000002]
  [ 0.05        0.44999999  0.15000001  0.55000001]
  [ 0.25        0.44999999  0.35000002  0.55000001]
  [ 0.44999999  0.44999999  0.55000001  0.55000001]
  [ 0.64999998  0.44999999  0.75        0.55000001]
  [ 0.85000002  0.44999999  0.95000005  0.55000001]
  [ 0.05        0.64999998  0.15000001  0.75      ]
  [ 0.25        0.64999998  0.35000002  0.75      ]
  [ 0.44999999  0.64999998  0.55000001  0.75      ]
  [ 0.64999998  0.64999998  0.75        0.75      ]
  [ 0.85000002  0.64999998  0.95000005  0.75      ]
  [ 0.05        0.85000002  0.15000001  0.95000005]
  [ 0.25        0.85000002  0.35000002  0.95000005]
  [ 0.44999999  0.85000002  0.55000001  0.95000005]
  [ 0.64999998  0.85000002  0.75        0.95000005]
  [ 0.85000002  0.85000002  0.95000005  0.95000005]]]
[Inferior 1 (process 45323) exited normally]
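
Even without a working breakpoint, the printed priors can be checked against an independent reference. The sketch below is a pure-Python reimplementation under stated assumptions: a square feature map, a single size, aspect ratio 1, and box centers at the middle of each cell. The names `reference_priors` and `max_abs_diff` are hypothetical, and this is not the operator's actual source.

```python
def reference_priors(grid, size, clip=True):
    """Reference anchor boxes [xmin, ymin, xmax, ymax] on a grid x grid map.

    Assumes one size, aspect ratio 1, and centers at cell midpoints.
    """
    half = size / 2.0
    boxes = []
    for row in range(grid):
        cy = (row + 0.5) / grid
        for col in range(grid):  # x varies fastest, as in the output above
            cx = (col + 0.5) / grid
            box = [cx - half, cy - half, cx + half, cy + half]
            if clip:
                box = [min(max(v, 0.0), 1.0) for v in box]
            boxes.append(box)
    return boxes


def max_abs_diff(actual, expected):
    """Largest element-wise absolute difference between two box lists."""
    return max(
        abs(a - e)
        for row_a, row_e in zip(actual, expected)
        for a, e in zip(row_a, row_e)
    )


expected = reference_priors(grid=5, size=0.1)

# First two rows copied from the operator output printed above.
printed = [
    [0.05, 0.05, 0.15000001, 0.15000001],
    [0.25, 0.05, 0.35000002, 0.15000001],
]
assert max_abs_diff(printed, expected[:2]) < 1e-6
```

With MXNet available, the same comparison could be run on the live result, for example `max_abs_diff(bbox_priors[0].tolist(), expected)`.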

taras-sereda avatar Feb 04 '17 12:02 taras-sereda

I guess that, because of the MXNet execution engine, you may not be able to stop anywhere you want. I would like to know an elegant way to debug as well. For me, the current solution is to insert code blocks into the operator and inspect the state manually.

zhreshold avatar Feb 04 '17 18:02 zhreshold