Sean Owen

Results: 245 comments by Sean Owen

You can't use bf16 on the V100. Did you make the change in the README? https://github.com/databrickslabs/dolly#v100-gpus
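The hardware constraint here is that bf16 needs an Ampere-or-newer GPU, while the V100 is compute capability 7.0. A minimal sketch of the idea (`pick_dtype` is a hypothetical helper, not part of the dolly repo):

```python
def pick_dtype(compute_capability):
    # bf16 requires Ampere or newer (compute capability 8.0+);
    # the V100 is (7, 0), so it has to fall back to fp16.
    # Hypothetical helper for illustration, not part of the dolly repo.
    return "bfloat16" if compute_capability >= (8, 0) else "float16"

pick_dtype((7, 0))  # V100 -> "float16"
pick_dtype((8, 0))  # A100 -> "bfloat16"
```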

You're saying downgrading didn't help? If not, does 0.8.0 work? If it does, then I should update the requirements.txt for now.

OK, if deepspeed 0.8.3 seems to resolve this, then that's done: https://github.com/databrickslabs/dolly/pull/130
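If 0.8.3 is indeed the fix, the pin in requirements.txt would just be a version spec along these lines (the exact change is in the PR above and may differ):

```
deepspeed==0.8.3
```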

Should take a few seconds. How are you generating? Did you see https://github.com/databrickslabs/dolly#generating-on-other-instances for example?

Hm, there shouldn't be any real difference there. Are you sure the settings are fairly equivalent and the output length is the same (not just the max)?

I just mean: how much output are you getting from each? The run time is proportional to the output size. You can't directly control it, but it affects the comparison. I'm...
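Since run time scales with output size, the fair comparison is throughput rather than wall-clock time. A trivial hypothetical helper (not in the dolly repo) to make that concrete:

```python
def tokens_per_second(new_tokens, seconds):
    # Normalize by generated token count so runs with different
    # output lengths are comparable. Hypothetical helper.
    return new_tokens / seconds

# 256 tokens in 8s and 64 tokens in 2s are the same speed (32 tok/s),
# even though one run took four times as long.
```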

Oh yeah, you don't want to measure the time to download or load the model here. Make sure it's already loaded, then time the generation.
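One way to keep the load out of the measurement is to time only the call itself. A small sketch, with a hypothetical already-loaded pipeline named `generate_text`:

```python
import time

def timed(fn, *args, **kwargs):
    # Time only the call itself; the model download and load should
    # happen before this point, not inside the timed region.
    start = time.perf_counter()
    out = fn(*args, **kwargs)
    return out, time.perf_counter() - start

# Hypothetical usage with an already-constructed pipeline `generate_text`:
# result, seconds = timed(generate_text, "What is Apache Spark?")
```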

@matthayes I think this is a good point - the pythia models have `use_cache=True`. https://huggingface.co/databricks/dolly-v2-3b/blob/main/config.json#L29 I don't know a lot about this, but it seems like we would want to do...
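For context, the linked line in the dolly-v2-3b config is just this flag; when it's true, `generate()` reuses cached key/value tensors for the prefix instead of recomputing attention over the whole sequence at every decoding step:

```json
"use_cache": true
```

In transformers it can also be overridden per call, e.g. `model.generate(..., use_cache=True)`.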