transformers icon indicating copy to clipboard operation
transformers copied to clipboard

GenerationConfig argument for Seq2SeqTrainer / Seq2SeqTrainingArgument

Open Natooz opened this issue 1 year ago • 5 comments

Feature request

👋

The request is for a way to pass a GenerationConfig to a Seq2SeqTrainer (through Seq2SeqTrainingArguments).

Motivation

ATOW, Seq2SeqTrainer only supports a few arguments for generation: max_length / max_new_tokens, num_beams.

Being able to pass a GenerationConfig configuration to generate would allow users to have much more control over the prediction step.

I noticed that this is already possible as in generate, if no GenerationConfig arg is given, it is retrieved from self.generation_config, itself deduced from model.config at model init (but gen args in PretrainedConfig is legacy / will be removed right ?).

Currently, overriding model.generation_config model attribute would conduct to the desired result, however this does not seem to be documented.

Your contribution

I don't know if this have been discussed. Do you think this should be added ? If not, maybe edit the documentation to clarify the model.generation_config way ?

I can help in both cases.

cc @sgugger @gante

Natooz avatar Mar 16 '23 12:03 Natooz

I'm not sure we can have a generation_config directly in the Seq2SeqTrainingArguments as it wouldn't work with the CLI. But maybe we can have a generation_config_file argument instead? Also yes to the model.generation_config way being better documented!

sgugger avatar Mar 16 '23 13:03 sgugger

Good point (CLI)! In that can a json file could work, and alternatively the argument could maybe accept both paths to this file and a GenerationConfig object ?

Natooz avatar Mar 20 '23 10:03 Natooz

Yes, that works for me!

sgugger avatar Mar 20 '23 13:03 sgugger

@Natooz that sounds great! Would you like to have a go at it?

gante avatar Mar 21 '23 12:03 gante

Hey @gante, yep, just clearing my backlog, it should be done by the week-end

Natooz avatar Mar 21 '23 13:03 Natooz

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 15 '23 15:04 github-actions[bot]

(completed)

gante avatar Apr 15 '23 15:04 gante

Hello! It's a great idea to be able to pass GenerationConfig to the Seq2SeqTrainer. However, it would be great to have a matching GenerationArguments class that allows parsing.

Right now I see that the documentation of GenerationConfig describes the types of the attributes in words but these are not defined in the code. Am I missing where these are defined?

artidoro avatar Apr 15 '23 22:04 artidoro

Hey @artidoro 👋

I'm not sure if I got your question right -- were you asking for support to pass generation arguments directly to the trainer (e.g. --top-k 50), as opposed to the solution added as a result of this issue (passing a whole generation config file)?

gante avatar Apr 17 '23 10:04 gante

Hi, I agree with @artidoro and would also love a GenerationArguments class that can be passed along with Seq2SeqTrainingArgument to HfArgumentParser. @gante that is also how I interpret this request.

AADeLucia avatar Sep 15 '23 02:09 AADeLucia

I agree that it may make things easier for the users :) However, due to our limited bandwidth, an important question must be asked: is there any form of parameterization that you can't do through the generation_config argument?

gante avatar Oct 11 '23 17:10 gante

Actually I ended up having issues with GenerationConfig* so I just pass the arguments directly to generate(**config). Only reason it would be nice to pass generation parameters directly to the command line is if I want to sweep easily over parameters like beam size/temperature. But no, there is no functionality loss that I can see.

*Issues with default arguments vs set arguments cancelling each other out

AADeLucia avatar Oct 12 '23 11:10 AADeLucia