Zeguan Xiao
@00krishna Can you share the code?
> Hello @ZeguanXiao which code would you like me to share?

Hi, I mean your RL code adapted from this template. I thought you had finished an RL flavor of...
@thpun Did you resolve this bug? I encounter the same error when fine-tuning mBART with the translation_from_pretrained_bart task. When I try to train a model from scratch, FSDP is...
Oh, that's unfortunate.
Also, it seems `EncoderDecoderModelAdaptersMixin.iter_layers` should count decoder layer IDs starting from `len(self.encoder.layers)`, like this?
```
def iter_layers(self) -> Iterable[Tuple[int, nn.Module]]:
    for i, layer in self.encoder.iter_layers():
        yield i, layer
    encoder_layer_n = len(self.encoder.encoder.layer)...
```
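For concreteness, here is one plausible completion of the snippet above. It is only a sketch: the attribute paths (`self.encoder.iter_layers()`, `self.encoder.encoder.layer`, `self.decoder.iter_layers()`) are assumptions carried over from the fragment, not verified library API.
```
from typing import Iterable, Tuple
import torch.nn as nn

def iter_layers(self) -> Iterable[Tuple[int, nn.Module]]:
    # Yield encoder layers with their original ids.
    for i, layer in self.encoder.iter_layers():
        yield i, layer
    # Offset decoder layer ids so they continue after the encoder's layers
    # instead of colliding with them (attribute path assumed from the fragment).
    encoder_layer_n = len(self.encoder.encoder.layer)
    for i, layer in self.decoder.iter_layers():
        yield i + encoder_layer_n, layer
```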
@hSterz My current workaround is setting `model.decoder.base_model.config.adapters = model.encoder.base_model.config.adapters` and changing `iter_layers`. It seems to work fine.
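In context, the workaround looks roughly like this. It is a sketch: the checkpoint names and the `EncoderDecoderModel` setup are illustrative assumptions, not taken from the thread.
```
from transformers import EncoderDecoderModel

# Illustrative encoder/decoder checkpoints (assumption, not from the thread).
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# Share one adapter config between encoder and decoder so both sides see the
# same adapters (the workaround described above).
model.decoder.base_model.config.adapters = model.encoder.base_model.config.adapters
```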
@haileyschoelkopf The ToxiGen dataset contains responses with labels, plus the prompts used to generate those responses. As @laphang pointed out, generation-based toxicity evaluation was used in the Llama 2 paper. Also, it...
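As a rough illustration of what generation-based toxicity evaluation looks like, here is a minimal sketch. The generator model, the `tomh/toxigen_roberta` classifier, and the example prompt are all assumptions for illustration, not details from this thread.
```
from transformers import pipeline

# Hypothetical ToxiGen-style prompts; in practice these come from the dataset.
prompts = ["if you have ever been to that city, you have probably noticed that"]

generator = pipeline("text-generation", model="gpt2")  # stand-in language model
toxicity = pipeline("text-classification", model="tomh/toxigen_roberta")  # assumed classifier

for p in prompts:
    # Generate a continuation and score only the newly generated text.
    full = generator(p, max_new_tokens=30)[0]["generated_text"]
    continuation = full[len(p):].strip()
    score = toxicity(continuation)[0]
    print(f"{score['label']} ({score['score']:.3f}): {continuation}")
```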
> Hi,
>
> According to the error message, one possible reason is that the fine-tuning of the model crashed. Can you check the training loss when you are fine-tuning...
I use A100 and L40 GPUs. The issue occurs randomly: under the same settings, it sometimes happens and sometimes doesn't.
@HZQ950419 Can you share your Python environment configuration? The issue may be related to certain versions of transformers, tokenizers, etc.
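For reference, a quick way to dump the relevant versions (`transformers-cli env` also prints a fuller report; the package list below is just a suggestion):
```
import torch
import tokenizers
import transformers

# Print the library versions most likely to matter for tokenizer/model issues.
for mod in (torch, transformers, tokenizers):
    print(f"{mod.__name__}: {mod.__version__}")
```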