OU Wei

Results: 6 comments of OU Wei

Here is the stack trace:

```
~/~/llm/transformers/src/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1631     self._inner_training_loop, self._train_batch_size, args.auto_find_batch_size
   1632 )
-> 1633 return inner_training_loop(
   1634     args=args,
   1635     resume_from_checkpoint=resume_from_checkpoint,

~/~/llm/transformers/src/transformers/trainer.py in _inner_training_loop(self, batch_size,...
```

Thank you very much! I will try it later. I installed peft from its source on GitHub (main branch).

> You could fix this by commenting out these lines:
>
> ```python
> old_state_dict = model.state_dict
> model.state_dict = (
>     lambda self, *_, **__: get_peft_model_state_dict(self,...
> ```
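For context, the quoted snippet is truncated; the override being discussed typically looks like the sketch below. This is an assumption based on the common alpaca-lora-style pattern (the base model name and LoRA config here are illustrative, not taken from the thread):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, get_peft_model_state_dict

# Hypothetical setup: wrap any causal LM with a LoRA adapter.
base = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")
model = get_peft_model(base, LoraConfig(task_type="CAUSAL_LM"))

# The override in question: patch model.state_dict so that checkpoints
# serialize only the LoRA adapter weights instead of the full model.
old_state_dict = model.state_dict
model.state_dict = (
    lambda self, *_, **__: get_peft_model_state_dict(self, old_state_dict())
).__get__(model, type(model))
```

Commenting these lines out restores the default `state_dict` behavior, which is what the quoted advice suggests.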

Hello, I already solved the problem by setting a lower OOM kill score for the process: `echo -17 > /proc/<pid>/oom_score_adj`. The CPU usage is 400% when loading the model and it...
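A minimal Python sketch of the same idea, assuming it runs inside the model-loading process and has the privileges needed to write a negative score (negative values require root or CAP_SYS_RESOURCE):

```python
import os

def lower_oom_score(pid: int | None = None, score: int = -17) -> None:
    """Lower oom_score_adj so the kernel OOM killer prefers other processes.

    Valid range is -1000 (never kill) to 1000; the default is 0.
    """
    pid = pid or os.getpid()
    with open(f"/proc/{pid}/oom_score_adj", "w") as f:
        f.write(str(score))

lower_oom_score()  # protect the current process while it loads the model
```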

Hi, did you set torch_dtype = torch.float16 or bfloat16 when you were loading the model? If so, then I think there is no problem with your code. Bloom may take...
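For reference, this is the usual way to load a model in half precision with transformers; the checkpoint name below is only an example, not the one from the thread:

```python
import torch
from transformers import AutoModelForCausalLM

# Loading in float16 roughly halves the memory footprint compared to float32.
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",        # illustrative checkpoint name
    torch_dtype=torch.float16,     # or torch.bfloat16 on hardware that supports it
    device_map="auto",             # optional; requires the accelerate package
)
```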

I implemented Baichuan13b-cpp. Try it if you are interested. https://github.com/ouwei2013/baichuan13b.cpp