transformers
transformers copied to clipboard
GenerationConfig argument for Seq2SeqTrainer / Seq2SeqTrainingArgument
Feature request
👋
The request is for a way to pass a GenerationConfig
to a Seq2SeqTrainer
(through Seq2SeqTrainingArguments
).
Motivation
ATOW, Seq2SeqTrainer
only supports a few arguments for generation: max_length
/ max_new_tokens
, num_beams
.
Being able to pass a GenerationConfig
configuration to generate would allow users to have much more control over the prediction step.
I noticed that this is already possible as in generate
, if no GenerationConfig
arg is given, it is retrieved from self.generation_config
, itself deduced from model.config
at model init (but gen args in PretrainedConfig
is legacy / will be removed right ?).
Currently, overriding model.generation_config
model attribute would conduct to the desired result, however this does not seem to be documented.
Your contribution
I don't know if this have been discussed.
Do you think this should be added ? If not, maybe edit the documentation to clarify the model.generation_config
way ?
I can help in both cases.
cc @sgugger @gante
I'm not sure we can have a generation_config
directly in the Seq2SeqTrainingArguments
as it wouldn't work with the CLI. But maybe we can have a generation_config_file
argument instead? Also yes to the model.generation_config
way being better documented!
Good point (CLI)!
In that can a json file could work, and alternatively the argument could maybe accept both paths to this file and a GenerationConfig
object ?
Yes, that works for me!
@Natooz that sounds great! Would you like to have a go at it?
Hey @gante, yep, just clearing my backlog, it should be done by the week-end
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
(completed)
Hello! It's a great idea to be able to pass GenerationConfig to the Seq2SeqTrainer. However, it would be great to have a matching GenerationArguments
class that allows parsing.
Right now I see that the documentation of GenerationConfig
describes the types of the attributes in words but these are not defined in the code. Am I missing where these are defined?
Hey @artidoro 👋
I'm not sure if I got your question right -- were you asking for support to pass generation arguments directly to the trainer (e.g. --top-k 50
), as opposed to the solution added as a result of this issue (passing a whole generation config file)?
Hi, I agree with @artidoro and would also love a GenerationArguments
class that can be passed along with Seq2SeqTrainingArgument
to HfArgumentParser
. @gante that is also how I interpret this request.
I agree that it may make things easier for the users :) However, due to our limited bandwidth, an important question must be asked: is there any form of parameterization that you can't do through the generation_config
argument?
Actually I ended up having issues with GenerationConfig
* so I just pass the arguments directly to generate(**config)
. Only reason it would be nice to pass generation parameters directly to the command line is if I want to sweep easily over parameters like beam size/temperature. But no, there is no functionality loss that I can see.
*Issues with default arguments vs set arguments cancelling each other out