TableBank
Error when running the recognition model
I downloaded the recognition model to a local file, model.pt, and ran this command:
python translate.py -model model.pt --src_dir recognition.jpg -output pred.txt
usage: translate.py [-h] [-config CONFIG] [-save_config SAVE_CONFIG] --model
MODEL [MODEL ...] [--fp32] [--avg_raw_probs]
[--data_type DATA_TYPE] --src SRC [--src_dir SRC_DIR]
[--tgt TGT] [--shard_size SHARD_SIZE] [--output OUTPUT]
[--report_bleu] [--report_rouge] [--report_time]
[--dynamic_dict] [--share_vocab]
[--random_sampling_topk RANDOM_SAMPLING_TOPK]
[--random_sampling_temp RANDOM_SAMPLING_TEMP]
[--seed SEED] [--beam_size BEAM_SIZE]
[--min_length MIN_LENGTH] [--max_length MAX_LENGTH]
[--max_sent_length] [--stepwise_penalty]
[--length_penalty {none,wu,avg}] [--ratio RATIO]
[--coverage_penalty {none,wu,summary}] [--alpha ALPHA]
[--beta BETA] [--block_ngram_repeat BLOCK_NGRAM_REPEAT]
[--ignore_when_blocking IGNORE_WHEN_BLOCKING [IGNORE_WHEN_BLOCKING ...]]
[--replace_unk] [--phrase_table PHRASE_TABLE] [--verbose]
[--log_file LOG_FILE]
[--log_file_level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET,50,40,30,20,10,0}]
[--attn_debug] [--dump_beam DUMP_BEAM] [--n_best N_BEST]
[--batch_size BATCH_SIZE] [--gpu GPU]
[--sample_rate SAMPLE_RATE] [--window_size WINDOW_SIZE]
[--window_stride WINDOW_STRIDE] [--window WINDOW]
[--image_channel_size {3,1}]
translate.py: error: the following arguments are required: --src/-src
pytorch 0.4.1
py36_cuda9.2.148_cudnn7.1.4_1
Can someone tell me what the error could be?
I don't know what to pass as an argument to -src.
I see here that -src
means "Source sequence to decode (one line per sequence)",
which is what I don't understand.
This, mentioned in this issue, doesn't help either.
Has anyone tried running the pretrained table recognition model?
Hi Minghao Li, can you please tell me how to use your pre-trained model for table structure recognition? Currently, I am passing in an image that contains a table and trying to extract the table from it, but the model gives me an empty table.
Script to run code: OpenNMT-py admin$ python translate.py -data_type img -model model.pt -src_dir data/im2text/images -src data/im2text/src-test.txt -output pred.txt -max_length 150 -beam_size 5 -verbose
output:
Can you please help me with this? I am stuck.
References: http://opennmt.net/OpenNMT-py/im2text.html https://github.com/OpenNMT/OpenNMT-py https://conversationhub.blob.core.windows.net/tablebank/model_zoo/Recognition_all_without_copyright/model.pt
Hi @rahulsinghpatel
Can you tell me what's in data/im2text/src-test.txt?
@ankur7721 Follow the instructions at this link and see the help text for -train_src: http://opennmt.net/OpenNMT-py/im2text.html
Hi @rahulsinghpatel, what do you mean by an empty table? Can you show me the output in pred.txt? This model requires the input image to be a whole table with nothing else in it.
@ankur7721, to answer your question: -src data/im2text/src-test.txt is the text file that lists the locations of the input images.
See https://github.com/OpenNMT/OpenNMT-py/issues/1533#issue-482436956
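As an illustration of the answer above, here is a minimal sketch of generating the file passed to -src. It assumes, as the im2text guide linked in this thread describes, that the file holds one image path per line, relative to the directory given to -src_dir. The helper name write_src_list and the set of extensions are hypothetical and not part of OpenNMT-py or TableBank.

```python
from pathlib import Path

# Hypothetical choice of extensions; adjust to the images you actually have.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def write_src_list(image_dir, src_file):
    """Write one image filename per line (relative to image_dir) to src_file.

    Hypothetical helper: assumes the -src file for im2text is a plain list
    of image paths relative to -src_dir.
    """
    names = sorted(
        p.name
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # One filename per line, with a trailing newline after the last entry.
    Path(src_file).write_text(
        "".join(name + "\n" for name in names), encoding="utf-8"
    )
    return names
```

For example, write_src_list("data/im2text/images", "data/im2text/src-test.txt") would produce the file used in the command quoted earlier in this thread, which also passes -data_type img and -src_dir.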
Hi @rahulsinghpatel, I am also getting poor results: the model does not extract the text from the source image.
My output is here:
[2019-08-19 17:50:44,610 INFO] Translating shard 0.
/usr/local/lib/python3.6/dist-packages/torchtext/data/field.py:359: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
var = torch.tensor(arr, dtype=self.dtype, device=device)
SENT 1: None
PRED 1: <tabular> <tbody> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> </tbody> </tabular>
PRED SCORE: -1.2848
PRED AVG SCORE: -0.0111, PRED PPL: 1.0111
To clarify, here is sample output. The model generates only table markup, with no cell data, like this:
<tabular> <tbody> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> </tbody> </tabular>
tbody - table body. tr - table row. tdy - cell with data. tdn - cell with no data.
Hope this is helpful.
I am getting this error while running translate.py: "AssertionError: Cannot use _dir with TextDataReader."
Do you know how I can render the predicted tags (<tabular>, <tr>, <tdy>, <tdn>) as an image, so that I can easily check them for errors?
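One possible way to inspect a prediction, offered only as a rough sketch: parse the tag sequence into rows of filled and empty cells, then emit an ordinary HTML table that a browser can display. It relies on the tag meanings given in this thread (tr is a row, tdy a cell with data, tdn a cell with no data) and ignores every other token. The function names parse_table_tags and rows_to_html are hypothetical; this is not a TableBank or OpenNMT-py utility.

```python
def parse_table_tags(pred):
    """Turn a predicted tag sequence into rows of booleans.

    True stands for <tdy> (cell with data), False for <tdn> (cell with no
    data). Tokens other than <tr>, </tr>, <tdy>, <tdn> are ignored.
    """
    rows, current = [], None
    for token in pred.split():
        if token == "<tr>":
            current = []
        elif token == "</tr>":
            if current is not None:
                rows.append(current)
            current = None
        elif token in ("<tdy>", "<tdn>") and current is not None:
            current.append(token == "<tdy>")
    return rows


def rows_to_html(rows):
    """Render the parsed rows as an HTML table; 'X' marks a cell with data."""
    lines = ['<table border="1">']
    for row in rows:
        cells = "".join(
            "<td>{}</td>".format("X" if filled else "&nbsp;") for filled in row
        )
        lines.append("  <tr>" + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Writing the result of rows_to_html(parse_table_tags(pred)) to a .html file and opening it in a browser shows the predicted grid next to the source image for comparison. Producing an actual image file would need an extra rendering step that this sketch does not cover.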
@liminghao1630 @MuruganR96 Hey, I am also getting empty cells. Has anyone figured it out? This is my Colab notebook. I cropped only the table area, but the model is still not able to get the text. https://colab.research.google.com/drive/1xeOQ5IUpwjDmCU6orwHd2hP3coOaau1t?usp=sharing
I am getting this error when running translate.py: "AssertionError: Cannot use _dir with TextDataReader."
Has this been solved? I'm stuck here, too