clstm
How to use a pretrained model in a Python script
I am using a pretrained clstm language model and loading it. This is my code:
def PredictWords(self,image):
    noutput = 3877
    if self.lang == "EN":
        pass
    elif self.lang == "JP":
        net = clstm.load_net("../lang/jp3877.clstm")
    net.inputs.aset(image)
    net.forward()
    prod = net.outputs.array()
    seq = clstm.Sequence()
    seq.aset(target.reshape(-1,noutput,1))
    aligned = clstm.Sequence()
    aligned = aligned.array()
    return clstm.Codec.decode(prod)
But this error occurs:
[libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "clstm.NetworkProto" because it is missing required fields: kind, ninput, noutput
I think I am either loading the neural net model incorrectly or need to pass more specific parameters when loading it. How can I solve this problem?
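One thing that may be worth ruling out first: protobuf reports missing required fields when it is handed empty or unreadable data, so a wrong relative path or an empty file could plausibly produce this message. The helper below is only a sketch for checking the file before calling clstm.load_net; check_model_file is a hypothetical name, not part of clstm, and it cannot tell whether the file was written in a format the installed clstm version understands.

```python
import os

def check_model_file(path):
    """Return the file size in bytes; raise if the file is missing or empty.

    Hypothetical helper, not part of clstm. It only checks that the path
    resolves to a non-empty regular file.
    """
    if not os.path.isfile(path):
        # Show the absolute path so a wrong working directory is easy to spot.
        raise FileNotFoundError("model file not found: " + os.path.abspath(path))
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("model file is empty: " + path)
    return size
```

Calling check_model_file("../lang/jp3877.clstm") right before the load would show whether the relative path resolves from the script's working directory.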
OK, I looked at the ocropus-lpred code, which uses the clstm module, and added the following code. Hopefully it works.
# set up the codec
with open("./lang/charset_jp.txt", "r") as charset_file:
    self.charset = charset_file.read()
self.charset = sorted(list(self.charset))
# reserved entries come first, followed by the remaining characters
self.charset = ["", " ", "~"] + [c for c in self.charset if c not in [" ", "~"]]
self.codec = lstm.Codec().init(self.charset)
# set up normalization
lnorm = lineest.CenterNormalizer(48)
# set up the neural network
self.network = clstm.make_BIDILSTM()
self.network.init(self.codec.size(), 100, lnorm.target_height)
self.network = clstm.CNetwork(self.network)
if self.lang == "EN":
    pass
elif self.lang == "JP":
    net = clstm.load_net("../lang/jp3877.clstm")
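The charset ordering used above can be tried out in plain Python without clstm or ocrolib. This is only a sketch: build_charset and make_lookup are hypothetical names, and removing duplicates and newlines is an assumption about what the charset file should contribute, not something the original code does.

```python
def build_charset(text):
    """Return class labels: reserved entries first, then sorted unique chars."""
    reserved = ["", " ", "~"]
    # Assumption: duplicates and newlines in the file are not separate classes.
    chars = sorted(set(text) - {"\n"})
    return reserved + [c for c in chars if c not in reserved]

def make_lookup(charset):
    """Map each character to its class index."""
    return {c: i for i, c in enumerate(charset)}

charset = build_charset("ba~ a\n")
lookup = make_lookup(charset)
print(len(charset), lookup["a"])
```

The length of this list corresponds to the value the code above passes to network.init through codec.size(), so it may be worth comparing it with the number of outputs the model is expected to have.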
As far as I know, the code in ocropus-lpred uses a radically different clstm API from the one that is currently implemented. If you like, you can take a look at my model handling code in kraken. If you don't want to wade through everything, you should be able to use the model out of the box, although I have to admit I haven't tested clstm models since the big breakening began (separate-derivs works, though).
Thank you for the advice!
I succeeded in loading the neural network model and running inference! However, the Japanese model I found on the Internet seems to be corrupt (it reports input dim = -1 and output dim = 0), and calling clstm.network.decode() causes a segmentation fault under all conditions.
Finally, it worked using the original decode() method described in the IPython notebook in the clstm docs, together with a neural network model that I trained myself on the uw3-500 dataset.
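For anyone who wants to see the general idea behind decoding, here is a minimal sketch of greedy CTC-style decoding in plain Python. It is not the decode() from the clstm notebook. It assumes the network output is a sequence of per-frame class scores and that class 0 is the blank label; greedy_decode and the tiny charset are made up for illustration.

```python
def greedy_decode(outputs, charset, blank=0):
    """Collapse per-frame argmax labels into a string (greedy CTC-style).

    outputs: sequence of frames, each a sequence of per-class scores.
    charset: list mapping class index to character.
    blank:   index of the blank class (assumed to be 0 here).
    """
    text = []
    previous = blank
    for frame in outputs:
        # Pick the highest-scoring class for this frame.
        label = max(range(len(frame)), key=frame.__getitem__)
        # Emit a character only when the label changes and is not blank.
        if label != blank and label != previous:
            text.append(charset[label])
        previous = label
    return "".join(text)

charset = ["", " ", "~", "a", "b"]
frames = [
    [0.1, 0.0, 0.0, 0.9, 0.0],  # a
    [0.1, 0.0, 0.0, 0.9, 0.0],  # a again, merged with the previous frame
    [0.9, 0.0, 0.0, 0.1, 0.0],  # blank separates repeated characters
    [0.0, 0.0, 0.0, 1.0, 0.0],  # a
    [0.0, 0.0, 0.0, 0.0, 1.0],  # b
]
print(greedy_decode(frames, charset))
```

A blank frame between two identical labels is what allows a doubled character to survive the merging step.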