Xiaomeng Hu

7 comments by Xiaomeng Hu

> I use model.generate() to generate sentences, just like with other transformer models. I find it takes about 2 seconds for dolly-v2-3b to generate a sentence when the max...
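For context, a minimal sketch of the setup being described, assuming the standard `transformers` API and the `databricks/dolly-v2-3b` hub id; the prompt and generation settings are placeholder choices, not the exact ones from the report:

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer once, up front.
# device_map="auto" assumes the accelerate package is installed.
tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b")
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dolly-v2-3b", torch_dtype=torch.float16, device_map="auto"
)

# Generate a sentence the same way as with any other transformer model.
inputs = tokenizer("Explain what a language model is.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```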

> Hm, shouldn't be any real difference there. Are you sure the settings are fairly equivalent and output length is the same (not just max)?

Yeah, I'm sure. I use...

> Hm, shouldn't be any real difference there. Are you sure the settings are fairly equivalent and output length is the same (not just max)?

What do you mean when...

> Hm, shouldn't be any real difference there. Are you sure the settings are fairly equivalent and output length is the same (not just max)?

```
outputs = generator.generate(
    input_ids = ...
```

> I just mean, how much output are you getting from each? The run time is proportional to the output size. You can't directly control it, but it affects the comparison...
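Since run time scales with output length, one way to make the comparison fair is to normalize by generated tokens; a sketch, reusing the hypothetical `model`/`inputs` names from the snippet above:

```
import time

start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=64)
elapsed = time.perf_counter() - start

# Count only the newly generated tokens, excluding the prompt, and
# compare models on tokens per second rather than raw seconds.
new_tokens = outputs.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens} tokens in {elapsed:.2f}s ({new_tokens / elapsed:.1f} tokens/s)")
```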

I have resolved the issue. The difference is in the config file on Hugging Face: you set `use_cache=False` for dolly.
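With `use_cache=False`, every decoding step recomputes attention over the whole sequence instead of reusing cached key/value states, which slows generation down. A sketch of overriding it, again with the hypothetical names from above; `use_cache` is a standard generation argument in `transformers`:

```
# dolly's config file ships with use_cache=False; re-enable the
# key/value cache on the config, or per generate() call, or both.
model.config.use_cache = True
outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
```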

> Oh yeah, you don't want to measure time to download or load the model here. Make sure it's already loaded, then time the generation.

Yeah, I think you'd better...
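A sketch of the timing pattern being suggested, assuming the same hypothetical `model`/`inputs`: load everything first, optionally do a warm-up call, and only then time `generate()`:

```
import time

# Model and tokenizer are already loaded; download/load time is excluded.
_ = model.generate(**inputs, max_new_tokens=8)  # warm-up call

start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=64)
print(f"generation took {time.perf_counter() - start:.2f}s")
```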