Xiaomeng Hu

7 comments by Xiaomeng Hu

> I use model.generate() to generate sentences, just like with other transformer models. I find it takes about 2 seconds for dolly-v2-3b to generate a sentence when the max...
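For context, a minimal sketch of the setup being described, assuming the standard `transformers` API and the `databricks/dolly-v2-3b` hub id; the prompt and generation settings are placeholder choices, not the exact ones from the report:

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer once, up front.
# device_map="auto" assumes the accelerate package is installed.
tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b")
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dolly-v2-3b", torch_dtype=torch.float16, device_map="auto"
)

# Generate a sentence the same way as with any other transformer model.
inputs = tokenizer("Explain what a language model is.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```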

> Hm, shouldn't be any real difference there. Are you sure the settings are fairly equivalent and output length is the same (not just max)?

Yeah, I'm sure. I use...

> Hm, shouldn't be any real difference there. Are you sure the settings are fairly equivalent and output length is the same (not just max)?

What do you mean when...

> Hm, shouldn't be any real difference there. Are you sure the settings are fairly equivalent and output length is the same (not just max)?

```
outputs = generator.generate(
    input_ids = ...
```

> I just mean, how much output are you getting from each? The run time is proportional to the output size. You can't directly control it, but it affects the comparison...
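Since run time scales with output length, one way to make the comparison fair is to normalize by generated tokens; a sketch, reusing the hypothetical `model`/`inputs` names from the snippet above:

```
import time

start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=64)
elapsed = time.perf_counter() - start

# Count only the newly generated tokens, excluding the prompt, and
# compare models on tokens per second rather than raw seconds.
new_tokens = outputs.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens} tokens in {elapsed:.2f}s ({new_tokens / elapsed:.1f} tokens/s)")
```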

I have resolved the issue. The difference is in the config file on Hugging Face: you set `use_cache=False` for dolly.
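With `use_cache=False`, every decoding step recomputes attention over the whole sequence instead of reusing cached key/value states, which slows generation down. A sketch of overriding it, again with the hypothetical names from above; `use_cache` is a standard generation argument in `transformers`:

```
# dolly's config file ships with use_cache=False; re-enable the
# key/value cache on the config, or per generate() call, or both.
model.config.use_cache = True
outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
```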

> Oh yeah, you don't want to measure time to download or load the model here. Make sure it's already loaded, then time the generation.

Yeah, I think you'd better...
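A sketch of the timing pattern being suggested, assuming the same hypothetical `model`/`inputs`: load everything first, optionally do a warm-up call, and only then time `generate()`:

```
import time

# Model and tokenizer are already loaded; download/load time is excluded.
_ = model.generate(**inputs, max_new_tokens=8)  # warm-up call

start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=64)
print(f"generation took {time.perf_counter() - start:.2f}s")
```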