Yao Fu
Oh, that is for testing the model on the WikiBio data-to-text generation task, and it is not included in the paper. If you need this part, could you send me...
Yep, that's right. With the current code I think you can get it to run. But during my tests, the training was unstable and may collapse in the second epoch (loss becomes...
@TobiasLee Thanks for helping to answer! Training it longer is indeed a quick answer, but the model may still suffer from repetition even after proper convergence. A quick solution would be...
For further discussion of architectures that prevent repetition, and their influence on sentence quality, see: https://www.aclweb.org/anthology/N18-1017/
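One common decoding-time mitigation for repetition (not necessarily the solution the truncated reply had in mind) is n-gram blocking: at each step, ban any next token that would complete an n-gram already present in the generated sequence. A minimal, model-agnostic sketch of the banned-token computation:

```python
def blocked_tokens(generated, n=3):
    """Return the set of next tokens that would repeat an n-gram
    already present in `generated` (a list of token ids or strings).

    At decoding time, the caller would set the logits of these
    tokens to -inf before sampling or taking the argmax.
    """
    if len(generated) < n - 1:
        return set()
    prefix = tuple(generated[-(n - 1):])  # the last n-1 tokens
    banned = set()
    # Scan for earlier occurrences of the same (n-1)-gram prefix;
    # the token that followed each occurrence becomes banned.
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned


# Example: with trigram blocking, after "a b c a b" the token "c"
# is banned, since "a b c" has already been generated once.
print(blocked_tokens(["a", "b", "c", "a", "b"], n=3))  # {'c'}
```

Libraries such as Hugging Face Transformers expose the same idea as a `no_repeat_ngram_size` argument to `generate()`, so in practice you rarely need to implement this by hand.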
Oh, I think this model may not currently fit your general paraphrasing task because it is trained on MSCOCO and Quora, both quite domain-specific. The quickest way I think would...
Hi kasra-pak, sorry for the late reply. You could use a pre-trained translation model locally, like the ones in OpenNMT: https://opennmt.net/Models-py/
I'm so sorry that you are encountering these problems. I have received a few issues over the past weeks, but I'm stuck in China with a visa issue while my...
Hi, thank you for pointing this out! This is indeed a very important clarification. It is a bit hard to tell exactly how RLHF influences GPT-4's performance on GSM8K, because the...
Sure, that's on the TODO list. Yet this hope is unlikely to be realized -- generally, a model's reasoning ability is very well correlated with its scale (given other things are done correctly)....
Hi, we have updated a list of models, including Vicuna, FlanT5, InstructCodeT5, and so on; their numbers on a subset of the datasets are shown in the updated table. Will...