
SAMPLE

WYejian opened this issue Oct 16 '21 · 3 comments

Hello, thank you for sharing the code. How can I sample from the latent space to generate text?

WYejian avatar Oct 16 '21 11:10 WYejian

Hi @WYejian.

T5VAE, defined in model_t5.py, initializes an internal T5 model (T5ForConditionalGeneration) defined in vendor_t5.py.

I've modified T5ForConditionalGeneration in vendor_t5.py so it takes a sampled_z parameter: https://github.com/seongminp/transformers-into-vaes/blob/16205c8da8731b0097d80eeca219a878e0397beb/vendor_t5.py#L46

Since we don't call T5ForConditionalGeneration directly (and instead interface with its wrapper, T5VAE), you can pass the sampled z as one of the "kwargs" in T5VAE's forward: https://github.com/seongminp/transformers-into-vaes/blob/16205c8da8731b0097d80eeca219a878e0397beb/model_t5.py#L70

I haven't tested generation extensively with this code, though. It should be the same as generating with any other encoder-decoder network.
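For concreteness, here is a minimal, untested sketch of what sampling from the prior could look like. Everything other than the `sampled_z` kwarg itself (the latent size, the base tokenizer checkpoint, and the assumption that T5VAE's forward returns Hugging Face-style output with `.logits`) is an assumption, not something taken from this repo:

```python
import torch
from transformers import T5Tokenizer


def sample_from_prior(model, latent_dim=32, max_len=30):
    """Greedy-decode one sentence from a z drawn from the N(0, I) prior.

    Assumptions (not verified against this repo): `model` is a trained T5VAE
    from model_t5.py, `latent_dim` matches the latent size used in training,
    T5VAE.forward returns Hugging Face-style output with `.logits`, and it
    forwards the extra `sampled_z` kwarg down to the patched
    T5ForConditionalGeneration in vendor_t5.py.
    """
    tokenizer = T5Tokenizer.from_pretrained("t5-base")  # assumed base checkpoint

    z = torch.randn(1, latent_dim)  # one draw from the standard-normal prior

    # T5 decoding conventionally starts from the pad token.
    decoder_input_ids = torch.tensor([[tokenizer.pad_token_id]])

    with torch.no_grad():
        for _ in range(max_len):
            out = model(decoder_input_ids=decoder_input_ids, sampled_z=z)
            # Greedy step: take the most likely next token.
            next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
            decoder_input_ids = torch.cat([decoder_input_ids, next_id], dim=-1)
            if next_id.item() == tokenizer.eos_token_id:
                break

    return tokenizer.decode(decoder_input_ids[0], skip_special_tokens=True)
```

With a trained model in hand, `print(sample_from_prior(model))` would then print one decoded sample; drawing several z vectors (or interpolating between two of them) gives the usual VAE-style exploration of the latent space.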

Hope that helps!

seongminp avatar Oct 16 '21 15:10 seongminp

Thank you for your reply. When I run the model, it raises "No module named 'generate'". What should I do? Also, can the pretraining data and the fine-tuning data be different?

WYejian avatar Oct 18 '21 13:10 WYejian

Just uploaded generate.py! Thanks for pointing that out.

Yes. I think it would work better if the same data were used for pretraining and fine-tuning, but I wanted to work with the datasets used in previous research. The training data is just raw text, so ideally the choice of fine-tuning dataset should not affect performance, but in practice the domain shift between corpora degrades benchmark performance.

seongminp avatar Oct 18 '21 23:10 seongminp