MXNet.jl

Call to mx.get_updater(optimizer) gives UndefRefError

Open ngphuoc opened this issue 7 years ago • 3 comments

I am trying to adapt the cnn_text_classification example from Python, https://github.com/dmlc/mxnet/tree/master/example/cnn_text_classification, to Julia. I get an error in the following part:

  optimizer = mx.SGD(lr=0.1, momentum=0.9, weight_decay=0.00001)
  updater = mx.get_updater(optimizer)
  for batch in mx.eachbatch(train_provider)
    data = mx.get_data(train_provider, batch)
    label = mx.get_label(train_provider, batch)

    num_correct = 0
    num_total = 0

    m.data[:] = data[1]
    m.label[:] = label[1]

    # forward backward
    mx.forward(m.cnn_exec, is_train=true)
    mx.backward(m.cnn_exec)

    # eval on training data
    a = copy(m.cnn_exec.outputs[1])
    num_correct += sum(label .== mapslices(indmax, a, 1)[:])
    num_total += length(label)

    # update weights
    sum_norm = 0
    for (idx, weight, grad, name) in m.param_blocks
      grad /= N
      l2_norm = copy(norm(grad))[1]
      sum_norm += l2_norm * l2_norm
    end

    MAX_GRAD_NORM = 5.0
    sum_norm = sqrt(sum_norm)
    for (idx, weight, grad, name) in m.param_blocks
      if sum_norm > MAX_GRAD_NORM
        grad *= (MAX_GRAD_NORM / sum_norm)
      end

      println("idx ", idx)
      println("grad ", grad)
      println("weight ", weight)
      updater(idx, grad, weight)  # <-- this line gives error

      # reset gradient to zero
      grad[:] = 0.0
    end

    train_acc = num_correct * 100.0 / num_total
    println("train_acc ", train_acc)
  end
idx 1
grad mx.NDArray{Float32}(784,100)
weight mx.NDArray{Float32}(784,100)
ERROR: LoadError: UndefRefError: access to undefined reference
 in update(::MXNet.mx.SGD, ::Int64, ::MXNet.mx.NDArray, ::MXNet.mx.NDArray, ::MXNet.mx.NDArray) at /home/phuoc/.julia/v0.5/MXNet/src/optimizers/sgd.jl:53
 in (::MXNet.mx.#updater#903{MXNet.mx.SGD,Dict{Int64,Any}})(::Int64, ::MXNet.mx.NDArray, ::MXNet.mx.NDArray) at /home/phuoc/.julia/v0.5/MXNet/src/optimizer.jl:188
 in train() at cnn.jl:126
 in include_from_node1(::String) at ./loading.jl:488
while loading cnn.jl, in expression starting on line 167

Is the error from MXNet.jl or from the mxnet core library? The Python version of cnn_text_classification runs normally for me.

ngphuoc, Dec 05 '16 13:12

This error is weird. It seems to come from the Julia side rather than from libmxnet. I'm not really sure what an UndefRefError means here. An index out of bounds or an undefined method would each lead to a different error message.
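For comparison, here is a minimal sketch in plain Julia (no MXNet involved) of the three error types mentioned above. The helper `exception_type` and the variable `slots` are hypothetical names introduced only for this illustration, and the syntax assumes Julia 1.x rather than the 0.5 release used in this thread.

```julia
# Hypothetical helper: return the type of the exception thrown by f(), or nothing.
exception_type(f) = try
    f(); nothing
catch e
    typeof(e)
end

# A vector of a non-bits type allocated with `undef` holds undefined references.
slots = Vector{String}(undef, 2)

println(exception_type(() -> [1, 2, 3][5]))      # BoundsError: index out of bounds
println(exception_type(() -> length(1, 2, 3)))   # MethodError: no matching method
println(exception_type(() -> slots[1]))          # UndefRefError: slot never assigned
```

So an UndefRefError points to a slot or field that exists but was never assigned, which is a different condition from a bad index or a missing method.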

pluskid, Dec 09 '16 09:12

@ngphuoc The UndefRefError occurs because the optimizer.state field is accessed before it has been initialized. You need to initialize it before calling update. These lines from the gist should do the trick: https://gist.github.com/facundoq/93a9d90c52c94aa9b329c47a4150d288#file-mnist-lenet-without-fit-jl-L83-L86

optimizer = mx.SGD(lr=0.05, momentum=0.9, weight_decay=0.00001)
op_state = mx.OptimizationState(batch_size)
optimizer.state = op_state
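To illustrate the mechanism, here is a minimal sketch in plain Julia that does not depend on MXNet. `ToyState` and `ToyOptimizer` are hypothetical stand-ins, not MXNet.jl types, and the code uses Julia 1.x syntax (`mutable struct`; in the Julia 0.5 used in this thread the keyword was `type`).

```julia
# Hypothetical stand-ins for illustration; these are not MXNet.jl types.
mutable struct ToyState
    batch_size::Int
end

mutable struct ToyOptimizer
    lr::Float64
    state::ToyState
    # Incomplete constructor: `state` is left as an undefined reference.
    ToyOptimizer(lr) = new(lr)
end

opt = ToyOptimizer(0.05)

# Reading the unassigned field throws UndefRefError.
err = try
    opt.state
    nothing
catch e
    e
end
println(err isa UndefRefError)   # true

# Assigning the field first makes the access valid.
opt.state = ToyState(100)
println(opt.state.batch_size)    # 100
```

The three lines from the gist play the role of the last assignment here: they give the optimizer a state object before the first update reads it.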

@pluskid Maybe optimizers should be initialized automatically when calling mx.get_updater? If a different initial state is needed, it could be overwritten afterwards.
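A rough sketch of what such automatic initialization could look like, written against toy types rather than the real MXNet.jl internals. `LazyState`, `LazyOptimizer`, `ensure_state!` and `toy_get_updater` are all hypothetical names, and the update rule is a plain SGD step chosen only to keep the example short (Julia 1.x syntax).

```julia
# Hypothetical toy types; the real MXNet.jl optimizer types differ.
mutable struct LazyState
    batch_size::Int
end

mutable struct LazyOptimizer
    lr::Float64
    state::LazyState
    LazyOptimizer(lr) = new(lr)   # leaves `state` undefined
end

# Assign a default state only if none has been set yet,
# so a state supplied by the caller is never overwritten.
function ensure_state!(opt::LazyOptimizer, batch_size::Int)
    if !isdefined(opt, :state)
        opt.state = LazyState(batch_size)
    end
    return opt
end

# Hypothetical updater factory that initializes the state up front.
function toy_get_updater(opt::LazyOptimizer; batch_size::Int = 1)
    ensure_state!(opt, batch_size)
    # Plain SGD step, applied in place: weight .-= lr .* grad
    return (weight, grad) -> (weight .-= opt.lr .* grad; weight)
end

opt = LazyOptimizer(0.1)
updater = toy_get_updater(opt; batch_size = 32)

w = [1.0, 2.0]
updater(w, [10.0, 10.0])
println(w)                      # [0.0, 1.0]
println(opt.state.batch_size)   # 32
```

Checking `isdefined` before assigning keeps the "overwrite afterwards" option open: a caller who wants a different initial state can still set `opt.state` explicitly, before or after requesting the updater.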

@pluskid The examples folder has no Julia example of training a net without the fit function. Online I found only this: http://www.laketide.com/custom-networks-in-mxnet-on-julia-part-1/, but it doesn't use the default optimizers and the code is shown as images. Maybe that gist would be useful as an example in the repo?

facundoq, Apr 19 '17 14:04

@facundoq Unfortunately, yes. The new module-based API in Python has not been fully ported to Julia yet, so the intermediate-level API for training is still not very convenient on the Julia side.

pluskid, Apr 30 '17 18:04