transformers GenerationConfig argument for Seq2SeqTrainer / Seq2SeqTrainingArgument

Feature request

👋

The request is for a way to pass a GenerationConfig to a Seq2SeqTrainer (through Seq2SeqTrainingArguments).

Motivation

ATOW, Seq2SeqTrainer only supports a few arguments for generation: max_length / max_new_tokens, num_beams.

Being able to pass a GenerationConfig configuration to generate would allow users to have much more control over the prediction step.

I noticed that this is already possible as in generate, if no GenerationConfig arg is given, it is retrieved from self.generation_config, itself deduced from model.config at model init (but gen args in PretrainedConfig is legacy / will be removed right ?).

Currently, overriding model.generation_config model attribute would conduct to the desired result, however this does not seem to be documented.

Your contribution

I don't know if this have been discussed. Do you think this should be added ? If not, maybe edit the documentation to clarify the model.generation_config way ?

I can help in both cases.

cc @sgugger @gante

Mar 16 '23 12:03 Natooz

I'm not sure we can have a generation_config directly in the Seq2SeqTrainingArguments as it wouldn't work with the CLI. But maybe we can have a generation_config_file argument instead? Also yes to the model.generation_config way being better documented!

Mar 16 '23 13:03 sgugger

Good point (CLI)! In that can a json file could work, and alternatively the argument could maybe accept both paths to this file and a GenerationConfig object ?

Mar 20 '23 10:03 Natooz

Yes, that works for me!

Mar 20 '23 13:03 sgugger

@Natooz that sounds great! Would you like to have a go at it?

Mar 21 '23 12:03 gante

Hey @gante, yep, just clearing my backlog, it should be done by the week-end

Mar 21 '23 13:03 Natooz

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Apr 15 '23 15:04 github-actions[bot]

(completed)

Apr 15 '23 15:04 gante

Hello! It's a great idea to be able to pass GenerationConfig to the Seq2SeqTrainer. However, it would be great to have a matching GenerationArguments class that allows parsing.

Right now I see that the documentation of GenerationConfig describes the types of the attributes in words but these are not defined in the code. Am I missing where these are defined?

Apr 15 '23 22:04 artidoro

Hey @artidoro 👋

I'm not sure if I got your question right -- were you asking for support to pass generation arguments directly to the trainer (e.g. --top-k 50), as opposed to the solution added as a result of this issue (passing a whole generation config file)?

Apr 17 '23 10:04 gante

Hi, I agree with @artidoro and would also love a GenerationArguments class that can be passed along with Seq2SeqTrainingArgument to HfArgumentParser. @gante that is also how I interpret this request.

Sep 15 '23 02:09 AADeLucia

I agree that it may make things easier for the users :) However, due to our limited bandwidth, an important question must be asked: is there any form of parameterization that you can't do through the generation_config argument?

Oct 11 '23 17:10 gante

Actually I ended up having issues with GenerationConfig* so I just pass the arguments directly to generate(**config). Only reason it would be nice to pass generation parameters directly to the command line is if I want to sweep easily over parameters like beam size/temperature. But no, there is no functionality loss that I can see.

*Issues with default arguments vs set arguments cancelling each other out

Oct 12 '23 11:10 AADeLucia

transformers transformers copied to clipboard

GenerationConfig argument for Seq2SeqTrainer / Seq2SeqTrainingArgument

Feature request

Motivation

Your contribution

transformers
transformers copied to clipboard