I used the format of the C-RLFT example in the README (which has 2 JSON examples) to create 6 JSON examples. I then used generate_dataset to pretokenize the data, and ochat.training_deepspeed.train to train the model on it.
I got the following error:
File "openchat/ochat/training_deepspeed/train.py", line 279, in <module>
train()
File "openchat/ochat/training_deepspeed/train.py", line 209, in train
model_engine, optimizer = create_model(args)
^^^^^^^^^^^^^^^^^^
File "openchat/ochat/training_deepspeed/train.py", line 106, in create_model
model_engine, optimizer, _, _ = deepspeed.initialize(args=args,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/deepspeed/__init__.py", line 157, in initialize
config_class = DeepSpeedConfig(config, mpu)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/deepspeed/runtime/config.py", line 703, in __init__
self._param_dict = hjson.load(open(config, "r"), object_pairs_hook=dict_raise_error_on_duplicate_keys)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/hjson/__init__.py", line 117, in load
return loads(fp.read(),
^^^^^^^^^^^^^^^^
File "python3.11/site-packages/hjson/__init__.py", line 190, in loads
return cls(encoding=encoding, **kw).decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/hjson/decoder.py", line 520, in decode
obj, end = self.raw_decode(s)
^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/hjson/decoder.py", line 559, in raw_decode
return self.scan_once(s, idx)
^^^^^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/hjson/decoder.py", line 329, in scan_once
return _scan_once(string, idx)
^^^^^^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/hjson/decoder.py", line 319, in _scan_once
return parse_object((string, idx + 1), encoding, strict,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/hjson/decoder.py", line 366, in JSONObject
key, end = scanKeyName(s, end, encoding, strict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "python3.11/site-packages/hjson/decoder.py", line 280, in scanKeyName
raise HjsonDecodeError("Bad key name (eof)", s, end);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
hjson.scanner.HjsonDecodeError: Bad key name (eof): line 35 column 2 (char 796)
Are you running the command in the root directory of the repo? The working directory should be the root directory.
Yes. I also gave the absolute path to the data file, and train.py reported that it loaded the data file fine.
Have you tried the format in the README yourself? Or could the small size of the dataset be the problem? I only had 6 JSON examples in the file.
@imoneoi I think my deepspeed_config.json was corrupted. After replacing it with a fresh copy, I can train the model now. Sorry for my mistake.
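For anyone who hits the same HjsonDecodeError: the traceback shows DeepSpeed failing while parsing its config file, not the dataset, and "Bad key name (eof)" is consistent with a file that ends unexpectedly. Below is a minimal sketch of a check you could run before training. It assumes the config is plain JSON and uses Python's standard json module as a stricter stand-in for the hjson parser that DeepSpeed calls in the traceback; check_json_config is a hypothetical helper name, not part of OpenChat or DeepSpeed.

```python
import json
import sys


def check_json_config(path):
    """Return (ok, message) describing whether `path` parses as JSON.

    Hypothetical helper for illustration only. It uses the stdlib json
    module, which is stricter than the hjson parser DeepSpeed uses.
    """
    try:
        with open(path, "r", encoding="utf-8") as f:
            json.load(f)
    except FileNotFoundError:
        return False, f"{path}: file not found"
    except json.JSONDecodeError as e:
        # lineno/colno point at the position where parsing failed
        return False, f"{path}: invalid JSON at line {e.lineno} column {e.colno}: {e.msg}"
    return True, f"{path}: parses cleanly"


if __name__ == "__main__":
    # Check every path given on the command line
    for config_path in sys.argv[1:]:
        ok, message = check_json_config(config_path)
        print(message)
```

Saved under a name of your choice and run with the path to deepspeed_config.json as its argument, it prints either a confirmation or the position where parsing failed. A failure near the end of the file would suggest a truncated or partially overwritten config.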