clstm
Sort and Sed commands are causing the model not to train (ERROR 1/OUT is empty)
Peace be upon you. I am training an Arabic model from scratch. After about 270,000 iterations over 32 hours, the ERROR is still 1 and the OUT is empty. The training data is artificial and 100% Arabic, contains no diacritics, and is rendered at 300 dpi in Times New Roman regular, size 18. I am sure that the transcription is 100% correct. Why can I not get anything recognized?
Attached: the transcribed HTML file, the extracted png/gt.txt files, the training script train.sh, the produced clstm models, and the complete terminal log.
My training script:
#!/bin/bash
set -x
set -a
sort -R manifest.txt > /tmp/manifest2.txt
sed 1,100d /tmp/manifest2.txt > train.txt
sed 100q /tmp/manifest2.txt > test.txt
report_every=1000
save_every=1000
maxtrain=2000000
target_height=48
dewarp=center
display_every=1000
test_every=1000
hidden=100
lrate=1e-4
save_name=arabic
'/home/bmwmy/Desktop/kra/clstm/clstmocrtrain' train.txt test.txt
*** charsep
got 1998 files, 100 tests
got 38 classes
.stacked: 0.0001 0.9 in 0 48 out 0 38
.stacked.parallel: 0.0001 0.9 in 0 48 out 0 200
.stacked.parallel.lstm: 0.0001 0.9 in 0 48 out 0 100
.stacked.parallel.reversed: 0.0001 0.9 in 0 48 out 0 100
.stacked.parallel.reversed.lstm: 0.0001 0.9 in 0 48 out 0 100
.stacked.softmax: 0.0001 0.9 in 0 200 out 0 38
.
.
.
ERROR 8000 1 6321 6321
8000
TRU نونمٔوي
ALN
OUT
ERROR 9000 1 6321 6321
.
.
.
ERROR 268000 1 6321 6321
268000
TRU تثبل لاق تثبل مك لاق هثعب مث ماع ةٔيام هللا هتامٔاف
ةٔيام تثبل لب لاق موي ضعب ؤا اموي
ALN
OUT
ERROR 269000 1 6321 6321
269000
TRU رفكي نٕاف ةوبنلاو مكحلاو باتكلا مهانيتٓا نيذلا كٔيلؤا
اموق اهب انلكو دقف ءالٔوه اهب
ALN
OUT
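A quick way to confirm from the log that the error never moves is to list the distinct values in the third column of the ERROR lines. This is only a sketch: it assumes the third field of each ERROR line is the error rate, the helper name distinct_errors is hypothetical, and the sample log is just the ERROR lines quoted above written to a temporary file.

```shell
#!/bin/bash
set -eu

# Sample log built from the ERROR lines quoted in this post (temporary file).
log=$(mktemp)
cat > "$log" <<'EOF'
ERROR 8000 1 6321 6321
8000
ERROR 9000 1 6321 6321
ERROR 268000 1 6321 6321
ERROR 269000 1 6321 6321
EOF

# Hypothetical helper: print each distinct value of the third field
# on lines whose first field is ERROR (assumed to be the error rate).
distinct_errors() {
    awk '$1 == "ERROR" { print $3 }' "$1" | sort -u
}

distinct=$(distinct_errors "$log")
echo "distinct error values: $distinct"
```

If only a single value comes back after hundreds of thousands of iterations, the model is not learning at all, which points to a data or setup problem rather than slow convergence.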
I found the solution. It seems to be a strange problem: I removed these lines from the training script, and the problem was solved:
sort -R manifest.txt > /tmp/manifest2.txt
sed 1,100d /tmp/manifest2.txt > train.txt
sed 100q /tmp/manifest2.txt > test.txt
Instead, run these commands by hand in a terminal. After they finish, close that terminal, open a new one, and run the training script there.
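One way to keep the split out of the training script is a small standalone script that is run once beforehand. The sketch below is an assumption about how that could look, not part of clstm: the function name split_manifest is hypothetical, and the demo uses a throwaway manifest of 2098 made-up file names (matching the 1998 + 100 counts in the log) so it can run anywhere. It relies on `sort -R`, as in the original script.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: shuffle a manifest and split it into test and train lists.
# Usage: split_manifest MANIFEST NTEST TRAIN_OUT TEST_OUT
split_manifest() {
    local manifest=$1 ntest=$2 train=$3 test=$4
    local shuffled
    shuffled=$(mktemp)
    sort -R "$manifest" > "$shuffled"          # random order
    sed "${ntest}q" "$shuffled" > "$test"      # first NTEST lines become the test set
    sed "1,${ntest}d" "$shuffled" > "$train"   # everything after them is the training set
    rm -f "$shuffled"
}

# Demo on a throwaway manifest (made-up file names, for illustration only).
workdir=$(mktemp -d)
seq -f "line-%04g.png" 1 2098 > "$workdir/manifest.txt"
split_manifest "$workdir/manifest.txt" 100 "$workdir/train.txt" "$workdir/test.txt"

echo "train: $(wc -l < "$workdir/train.txt") lines"
echo "test:  $(wc -l < "$workdir/test.txt") lines"
```

With a real dataset, the call would point at manifest.txt and write train.txt and test.txt next to it, and the training script would then only read those two files.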
After removing these lines from the training script and running it again, the ERROR value started changing, and ALN and OUT contained data.
Somehow sed and sort were preventing the model from training. What a weird problem.