mmocr
json.decoder.JSONDecodeError: Extra data: line 1 column 5 (char 4)
Can someone help me please?
Does your data format match the parser in the config? Please provide a short sample of your data annotations and the full config file.
Also, next time please follow the template to post your issue. It's designed to help everyone understand the situation thoroughly.
Also these chapters may be helpful: https://mmocr.readthedocs.io/en/latest/tutorials/dataset_types.html https://mmocr.readthedocs.io/en/latest/tutorials/blank_recog.html
I'm trying to supply my own dataset following the SegOCR example config (https://github.com/open-mmlab/mmocr/blob/main/configs/base/recog_datasets/seg_toy_data.py). My dataset has the same layout as the toy dataset (https://github.com/open-mmlab/mmocr/tree/main/tests/data/ocr_char_ann_toy_dataset):

    |_ images
    |  |_ train
    |  |  |_ 1.jpg
    |  |  |_ …
    |  |_ val
    |  |_ test
    |  |  |_ 1019.jpg
    |  |  |_ …
    |_ label_seg_test.txt
    |  |_ 1019.jpg ผข9104
    |  |_ …
    |_ label_seg_train.txt
    |  |_ {'file_name': '1.jpg', 'annotations': [{'char_text': '82-1279', 'char_box': [732.8115942028985, 1733.3333333333333, 993.6811594202898, 1727.5362318840578, 993.6811594202898, 1799.9999999999998, 728.463768115942, 1799.9999999999998]}], 'text': '82-1279'}
    |  |_ …
@gaotongxiao This is my error
RuntimeError: Caught JSONDecodeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/klabs/anaconda3/envs/car3-env/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/home/klabs/anaconda3/envs/car3-env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/klabs/anaconda3/envs/car3-env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in
The tricky part here is that the training set and test set are in different formats. Therefore, you need to make sure the annotation parser matches the dataset format.
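As a rough illustration of why the format matters, the sketch below feeds three sample lines to Python's `json.loads`. The lines are modelled on the excerpts quoted earlier in this thread (shortened), and the helper name `try_json` is made up for this example. It assumes the failing parser decodes each annotation line as JSON. Under that assumption, a plain `filename text` line that starts with a four-digit number reproduces the exact message from the issue title, because `1019` is read as a complete JSON number and the rest of the line is reported as extra data.

```python
import json

# Sample lines modelled on the annotation excerpts quoted in this thread.
plain_line = "1019.jpg ผข9104"                            # "filename text" style
repr_line = "{'file_name': '1.jpg', 'text': '82-1279'}"   # single-quoted dict
json_line = '{"file_name": "1.jpg", "text": "82-1279"}'   # valid JSON


def try_json(line):
    """Return the parsed object, or the JSONDecodeError message as a string."""
    try:
        return json.loads(line)
    except json.JSONDecodeError as exc:
        return str(exc)


for line in (plain_line, repr_line, json_line):
    print(repr(line), "->", try_json(line))
```

The first line yields `Extra data: line 1 column 5 (char 4)`. The second line, with single quotes, is also rejected as invalid JSON, though with a different message. Only the third line parses.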
For example, the parser in
https://github.com/open-mmlab/mmocr/blob/67ebc6c876bd4bd79d122cd6d525edfc08f6e37d/configs/base/recog_datasets/seg_toy_data.py#L11-L12
corresponds to
https://github.com/open-mmlab/mmocr/blob/main/tests/data/ocr_char_ann_toy_dataset/instances_train.txt
And
https://github.com/open-mmlab/mmocr/blob/67ebc6c876bd4bd79d122cd6d525edfc08f6e37d/configs/base/recog_datasets/seg_toy_data.py#L24-L28
corresponds to
https://github.com/open-mmlab/mmocr/blob/main/tests/data/ocr_char_ann_toy_dataset/instances_test.txt
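For orientation, the two parser entries look roughly like the fragment below. This is written from memory of the MMOCR 0.x config style, so treat the linked `seg_toy_data.py` as authoritative. The variable names `json_line_parser` and `plain_line_parser` are made up for this sketch. In the real config each dict sits under the `parser` key of a dataset's `loader`.

```python
# Parser for annotation files with one JSON object per line
# (the training annotations in the toy dataset).
json_line_parser = dict(
    type='LineJsonParser',
    keys=['file_name', 'annotations', 'text'])

# Parser for plain "filename text" lines split on a single space
# (the test annotations in the toy dataset).
plain_line_parser = dict(
    type='LineStrParser',
    keys=['filename', 'text'],
    keys_idx=[0, 1],
    separator=' ')
```

If a file of plain `filename text` lines is paired with the JSON-line parser, every line fails to decode, which matches the kind of error reported here.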
@gaotongxiao I am sure that the annotation parser matches the dataset format. Now I am trying to use the SAR model, with everything set up as in this example (https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/seg/seg_r31_1by16_fpnocr_toy_dataset.py). When I run this training cell:
`import os.path as osp
import mmcv
from mmocr.apis import train_detector
from mmocr.datasets import build_dataset
from mmocr.models import build_detector
# cfg is the config object loaded in an earlier cell
datasets = [build_dataset(cfg.data.train)]
model = build_detector(
    cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
model.CLASSES = datasets[0].CLASSES
# Create the work directory, then start training with validation enabled
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True)`
I get this error on the last line:
prepare index 2502 with error Extra data: line 1 column 5 (char 4)
load index 2502 with error Extra data: line 1 column 5 (char 4)
Can you advise me on how to solve this error?
A dataset in this format (https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/seg/seg_r31_1by16_fpnocr_toy_dataset.py) works only with the SegOCR model. It is not applicable to other models such as SAR.
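Whichever model is used, it can help to check the annotation file itself before training. The sketch below is a minimal standalone checker, not part of MMOCR. It assumes the parser expects one JSON object per line. The function names `find_bad_lines` and `repr_to_json` and the sample lines are made up for illustration. The second helper covers the case where a file really does contain single-quoted Python dicts, as the training sample quoted in this thread appears to.

```python
import ast
import json


def find_bad_lines(lines):
    """Return (line_number, error_message) for every line that is not valid JSON."""
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped, not reported
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            bad.append((number, str(exc)))
    return bad


def repr_to_json(line):
    """Convert a Python-literal dict line (single quotes) into a JSON line."""
    record = ast.literal_eval(line)  # evaluates literals only, never runs code
    return json.dumps(record, ensure_ascii=False)


# Illustrative sample: one valid JSON line and two lines that are not JSON.
sample = [
    '{"file_name": "1.jpg", "text": "82-1279"}',
    "1019.jpg ผข9104",
    "{'file_name': '2.jpg', 'text': 'abc'}",
]

for number, message in find_bad_lines(sample):
    print(f"line {number}: {message}")

print(repr_to_json(sample[2]))
```

To check a real file, pass an open file handle instead of the sample list, for example `find_bad_lines(open('label_seg_train.txt', encoding='utf-8'))`.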