Training on colab

Open vpvsankar opened this issue 4 years ago • 15 comments

Is it possible to train this model on colab? I have a small dataset.

vpvsankar avatar Oct 01 '20 10:10 vpvsankar

I believe Colab doesn't provide distributed training. Currently, the code in this repo is written to run on a distributed server, so you would need to modify the distributed-training parts of the code to train on Colab.

dipesh-commits avatar Oct 02 '20 15:10 dipesh-commits

I found a blog post, https://medium.com/analytics-vidhya/extracting-structured-data-from-invoice-96cf5e548e40, whose authors uploaded a Colab notebook. It shows how to preprocess the SROIE dataset for this repo and train on Colab.

kbrajwani avatar Oct 04 '20 03:10 kbrajwani

@kbrajwani Thank you so much, brother :). Were you able to get the results mentioned in the paper?

vpvsankar avatar Oct 04 '20 14:10 vpvsankar

@vpvsankar I have not trained for more than 30 epochs because it takes too much time, so I didn't compare the results.

kbrajwani avatar Oct 04 '20 14:10 kbrajwani

I am getting this error, but the path I gave is correct:

  File "train.py", line 162, in <module>
    entry_point(config)
  File "train.py", line 126, in entry_point
    main(config, local_master, logger if local_master else None)
  File "train.py", line 34, in main
    train_dataset = config.init_obj('train_dataset', pick_dataset_module)
  File "/content/drive/My Drive/PICK-pytorch/parse_config.py", line 105, in init_obj
    return getattr(module, module_name)(*args, **module_args)
  File "/content/drive/My Drive/PICK-pytorch/data_utils/pick_dataset.py", line 64, in __init__
    raise FileNotFoundError('Entity folder is not exist!')
FileNotFoundError: Entity folder is not exist!
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/usr/local/lib/python3.6/dist-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-u', 'train.py', '--local_rank=0', '-c', 'config.json', '-d', '0', '--local_world_size', '1']' returned non-zero exit status 1.

vpvsankar avatar Oct 04 '20 14:10 vpvsankar

It worked thanks

vpvsankar avatar Oct 04 '20 14:10 vpvsankar

@kbrajwani Hi, great job! Would you like to merge your code for processing the SROIE dataset into this repository? We can't do this ourselves at the moment for some reasons.

tengerye avatar Oct 05 '20 10:10 tengerye

@tengerye I would love to, but first let me tell you that it's far from perfect. If you are okay with my logic, I will merge the code. Here is my logic:

    for key, value in sorted(entities.items()):
        # df[9] is the transcript column, read from the CSV files in the box/ folder.
        # If a transcript contains the entity value, tag that row with the entity key.
        idx = df[df[9].str.contains('|'.join(map(str.strip, value.split(','))))].index
        # df.loc[idx, 10] is the NER tag for that transcript, as required by the
        # PICK-pytorch boxes_and_transcripts folder. Each row is laid out as:
        # index, x1_1, y1_1, x2_1, y2_1, x3_1, y3_1, x4_1, y4_1, transcript, NER tag
        df.loc[idx, 10] = key

Because of the indentation, I can't explain it any better here. It is explained well in the blog.
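
For readers who want to try the matching rule without the SROIE files, here is a minimal sketch of the same idea using only the standard library instead of pandas. It is not the blog's code: the function name `tag_rows`, the sample rows, the `"other"` default tag and the entity values are all made up for illustration. It also escapes each piece before building the regex, which the snippet above does not do.

```python
import re


def tag_rows(rows, entities):
    """Set the NER tag of every row whose transcript contains an entity value.

    rows: list of lists laid out as
          [index, x1, y1, x2, y2, x3, y3, x4, y4, transcript, tag]
    entities: dict mapping an entity key to a comma-separated value string
    """
    for key, value in sorted(entities.items()):
        # Split on commas as the snippet above does, but escape each piece so
        # that characters such as "." or "$" are matched literally. This is a
        # deviation from the snippet, which passes the pieces unescaped.
        parts = [re.escape(p.strip()) for p in value.split(",") if p.strip()]
        if not parts:
            continue
        pattern = re.compile("|".join(parts))
        for row in rows:
            if pattern.search(row[9]):  # column 9 holds the transcript
                row[10] = key           # column 10 holds the NER tag
    return rows


# Made-up rows; "other" is an assumed default tag.
rows = [
    [1, 0, 0, 10, 0, 10, 5, 0, 5, "ACME STORE SDN BHD", "other"],
    [2, 0, 6, 10, 6, 10, 9, 0, 9, "TOTAL 12.50", "other"],
    [3, 0, 10, 10, 10, 10, 14, 0, 14, "THANK YOU", "other"],
]
entities = {"company": "ACME STORE SDN BHD", "total": "12.50"}
tagged = tag_rows(rows, entities)
print([row[10] for row in tagged])
```

One thing to keep in mind with this rule: because values are split on commas, an entity such as an address is matched piece by piece, and a row is tagged as soon as any single piece appears in its transcript.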

kbrajwani avatar Oct 05 '20 11:10 kbrajwani

@kbrajwani Thanks for your kind reply. I will take a look at your blog.

tengerye avatar Oct 06 '20 05:10 tengerye

I'm getting this error when running the training while trying the distributed and non-distributed way:

Traceback (most recent call last):
  File "train.py", line 166, in <module>
    entry_point(config)
  File "train.py", line 130, in entry_point
    main(config, local_master, logger if local_master else None)
  File "train.py", line 35, in main
    set_vocab(config['train_dataset']['args']['entities_list'])
KeyError: 'entities_list'

Any clue about how to fix it?

tomaschild avatar Oct 14 '20 17:10 tomaschild

@tomaschild If you are using the blog's notebook, please clone their fork of the repo, https://github.com/dlmade/Pick.Pytorch.Sroie, which they used for training. It already contains the preprocessed SROIE dataset.
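
Going by the traceback, train.py reads config['train_dataset']['args']['entities_list'], so a config.json without that key fails with the KeyError above. A small sketch for checking a config before launching training follows; the helper name `missing_keys` and the sample config are hypothetical, and `files_name` is only an illustrative key, not something confirmed in this thread.

```python
def missing_keys(config, required_paths):
    """Return the dotted key paths from required_paths that config lacks."""
    missing = []
    for dotted in required_paths:
        node = config
        for part in dotted.split("."):
            # Stop at the first level where the key is absent.
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing


# Made-up config that lacks 'entities_list' ('files_name' is illustrative).
config = {"train_dataset": {"args": {"files_name": "train_samples_list.csv"}}}
required = [
    "train_dataset.args.files_name",
    "train_dataset.args.entities_list",
]
print(missing_keys(config, required))
```

If the list is non-empty, the config and the train.py being run probably come from different versions of the code.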

kbrajwani avatar Oct 14 '20 17:10 kbrajwani

@vpvsankar How did you solve it? I am also getting the same error. Please help.

pranavstha11 avatar Dec 08 '20 06:12 pranavstha11

@kbrajwani @pranavstha11 Did you find a solution for the error "Entity folder is not exist!"?

keshav-qubitrics avatar Feb 26 '21 13:02 keshav-qubitrics

Hey @keshav-qubitrics, there is an error in the https://github.com/dlmade/Pick.Pytorch.Sroie/blob/master/config.json file. You have to change the data paths to match your working directory. Check lines 61 to 64 and 73 to 76.
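
To see which configured paths are wrong before starting a run, something like the sketch below may help. It walks a loaded config and reports path-like entries that do not exist on disk. The helper name `missing_paths`, the key suffixes it looks for (`_folder`, `files_name`) and the demo config are assumptions for illustration, not taken from the repository; `base_dir` must be whatever directory relative paths are resolved against in your setup.

```python
import json
import tempfile
from pathlib import Path


def missing_paths(config, base_dir, suffixes=("_folder", "files_name")):
    """List (key, resolved path) pairs whose path does not exist on disk.

    The suffixes used to recognise path-like keys are an assumption about
    the config layout, not something taken from the repository.
    """
    problems = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if isinstance(value, str) and key.endswith(suffixes):
                    path = Path(value)
                    if not path.is_absolute():
                        path = Path(base_dir) / path
                    if not path.exists():
                        problems.append((key, str(path)))
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return problems


# Demo on a throwaway directory: 'images' exists, 'entities' does not.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "images").mkdir()
    demo = json.loads(
        '{"train_dataset": {"args": {"images_folder": "images",'
        ' "entities_folder": "entities"}}}'
    )
    report = missing_paths(demo, tmp)
    print([key for key, _ in report])
```

On a real checkout you would load config.json with json.load and pass the result in place of `demo`.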

dlmade avatar Feb 26 '21 14:02 dlmade

Thank you for your response @dlmade

keshav-qubitrics avatar Feb 26 '21 17:02 keshav-qubitrics