
Converting the MARCO dataset to DuReader style

Open pengwei-iie opened this issue 4 years ago • 0 comments

When I use the script marcov2_to_dureader.py to convert MARCOv2 to the DuReader format, the conversion fails with ValueError: Trailing data.

The command: sh run_marco2dureader_preprocess.sh ../Marco/train_v2.1.json ../Marco/train_v2.1_dureaderformat.json

It fails with ValueError: Trailing data. The full traceback follows:

    Traceback (most recent call last):
      File "marcov1_to_dureader.py", line 33, in <module>
        df = pd.read_json(sys.argv[1])
      File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 366, in read_json
        return json_reader.read()
      File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 467, in read
        obj = self._get_object_parser(self.data)
      File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 484, in _get_object_parser
        obj = FrameParser(json, **kwargs).parse()
      File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 576, in parse
        self._parse_no_numpy()
      File "/home/user/anaconda3/lib/python3.6/site-packages/pandas/io/json/json.py", line 793, in _parse_no_numpy
        loads(json, precise_float=self.precise_float), dtype=None)
    ValueError: Trailing data
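One frequent cause of "Trailing data" is an input file that holds more than one JSON document, for example one object per line, while the reader expects a single document. Whether that is the case for train_v2.1.json is not confirmed here, so the following is only a diagnostic sketch using the standard library. The helper name `load_records` and the sample records are hypothetical.

```python
import json


def load_records(text):
    """Parse text as one JSON document, falling back to one document per line.

    Returns (data, layout), where layout is "single" or "lines".
    """
    try:
        return json.loads(text), "single"
    except json.JSONDecodeError:
        # json.loads rejects content after the first document ("Extra data"),
        # the stdlib counterpart of the "Trailing data" error in the traceback.
        records = [json.loads(line) for line in text.splitlines() if line.strip()]
        return records, "lines"


# Made-up sample: two JSON objects, one per line (not real MARCO data).
sample = '{"query_id": 1, "query": "a"}\n{"query_id": 2, "query": "b"}\n'
records, layout = load_records(sample)
print(layout, len(records))
```

If the real file turns out to use the one-object-per-line layout, `pd.read_json(path, lines=True)` is the pandas option for reading it. If it is a single document that still fails, the file may be truncated or have extra content appended, which this sketch would surface as a parse error on a specific line.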

pengwei-iie • Apr 09 '20 03:04