cxxnet

unrolled LSTM layer with batch BPTT

Open · xiw9 opened this issue 10 years ago • 1 comment

A fast unrolled LSTM layer, similar to https://github.com/BVLC/caffe/pull/1873. It needs two inputs: the input sequence nodes_in[0] and the corresponding sequence label nodes_in[1].

nodes_in[0] size: [batch_size][1][1][input_width]
nodes_in[1] size: [batch_size][1][1][1]

Example config file:

data = train
iter = csv
  filename = "...."
  has_header = 0
iter = attachtxt
  txtfilename = "...."
iter = threadbuffer
  buffer_size = 4
iter = end

eval = val
iter = csv
  filename = "...."
  has_header = 0
iter = attachtxt
  txtfilename = "...."
iter = threadbuffer
  buffer_size = 4
iter = end

extra_data_shape[0] = 1,1,1
extra_data_num = 1

netconfig=start
layer[in,in_1->2] = lstm:lstm1
  nhidden = 1024
  parallel_size = 8
layer[2,in_1->3] = lstm:lstm2
  nhidden = 512
  parallel_size = 8
layer[3->4] = fullc:fc1
  nhidden = 51
layer[4->4] = softmax:softmax1
netconfig=end

# evaluation metric
metric = error

max_round = 40
num_round = 40

# input shape not including batch
input_shape = 1,1,4096

batch_size = 512
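
As a rough sanity check on the sizes in the config above, the layer widths and weight counts can be traced by hand. The sketch below is only an illustration: it assumes each lstm layer is a standard LSTM with four gate blocks, one bias per gate unit, and no peephole connections, which may not match how this layer actually stores its weights. The helper names (lstm_params, fullc_params, trace) are made up for this example and are not part of cxxnet.

```python
# Rough size check for the example netconfig above.
# Assumption: each lstm layer is a standard LSTM with four gate blocks
# (input, forget, output, candidate), one bias per gate unit, no peepholes.
# The layer in this PR may lay out or count its weights differently.

def lstm_params(n_in, n_hidden):
    """Standard LSTM weight count: 4 gates x (input + recurrent + bias)."""
    return 4 * n_hidden * (n_in + n_hidden + 1)


def fullc_params(n_in, n_out):
    """Weight matrix plus bias vector of a fully connected layer."""
    return n_in * n_out + n_out


def trace(input_width, layers):
    """Return one (name, in_width, out_width, n_params) tuple per layer."""
    rows = []
    width = input_width
    for name, kind, nhidden in layers:
        if kind == "lstm":
            count = lstm_params(width, nhidden)
        else:
            count = fullc_params(width, nhidden)
        rows.append((name, width, nhidden, count))
        width = nhidden  # the next layer consumes this layer's output
    return rows


input_width = 4096  # from input_shape = 1,1,4096
layers = [
    ("lstm1", "lstm", 1024),
    ("lstm2", "lstm", 512),
    ("fc1", "fullc", 51),
]

rows = trace(input_width, layers)
for name, n_in, n_out, count in rows:
    print(f"{name}: {n_in} -> {n_out}, {count} parameters")
```

Under these assumptions lstm1 dominates the model size, since its count scales with both the 4096-wide input and its own 1024 hidden units. softmax1 is applied in place on node 4 and adds no weights, so it is left out of the trace.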

xiw9, May 21 '15 02:05

Thanks for your PR. I am still working on the general case of LSTM, which can run CNN + LSTM together. That branch has special text IO for sequences. However, I still have some bugs to fix. Would you be interested in working together on that branch?

antinucleon, May 21 '15 02:05