Sean Owen comments

Results: 245 comments by Sean Owen

Use use_cache=True config?

Matt notes that this was probably set to false because gradient checkpointing requires it to be off during training. But we can just edit the resulting model config for now...
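A minimal sketch of that config edit, assuming the saved model directory contains a Hugging Face style `config.json` with a `use_cache` key. The helper name `enable_use_cache` and the path in the usage line are hypothetical, not part of the repo:

```python
import json
from pathlib import Path


def enable_use_cache(config_path):
    """Set "use_cache" to true in a saved model's config.json and return the config.

    Assumes the file is a plain JSON object, as written by save_pretrained().
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Caching was presumably disabled only because gradient checkpointing
    # needs it off during training; it can be re-enabled for inference.
    config["use_cache"] = True
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Hypothetical usage: enable_use_cache("/path/to/model_dir/config.json")
```

Only this one key is changed; every other setting in the file is written back unmodified.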

Issue running on A100

Do you have cublas installed, and at a matching version for your CUDA drivers?

Issue running on A100

Not sure, it's running for me on CUDA 11.3 and 11.7, according to the code in the repo. I suspect it's something in the environment, but not sure what it...

Inference Time on your site?

What site are you referring to here? You should use the 'v2' models. With no particular tuning, on an A10 for example, you might expect 3-5 secs for the 3B...

Inference Time on your site?

How many GPUs for what?

Inference Time on your site?

1 GPU like an A10 on the 3B model, maybe; I haven't measured it closely. A more optimized deployment of the 12B model can hit more like 10ms / token...
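To put a per-token figure like that in terms of whole responses, here is a rough back-of-the-envelope sketch. The 256-token response length is an assumption chosen for illustration, and the calculation ignores prompt processing and any fixed overhead:

```python
def generation_seconds(n_tokens, ms_per_token):
    """Approximate wall-clock time to generate n_tokens at a fixed per-token latency."""
    return n_tokens * ms_per_token / 1000.0


def tokens_per_second(ms_per_token):
    """Convert a per-token latency in milliseconds to throughput."""
    return 1000.0 / ms_per_token


# Assumed example: a 256-token response at roughly 10 ms per token.
print(generation_seconds(256, 10))
print(tokens_per_second(10))
```

Measured numbers will depend on the hardware, batch size, and generation settings.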

Large loss jump in the beginning of second epoch in training

@matthayes I think y'all observed that during training too, and it was mysterious. I think the answer was to turn down the learning rate a bit? But the final value in...

How can I re-train dolly in a clean A100 Linux machine?

You haven't said anything about the problem. You don't need Databricks, but you would have to change the references to dbutils and the %pip install command.

Getting weird responses with my fine-tuned model

Are you OOM? Maybe restarting the kernel wasn't enough somehow, or something else is still attached. From this, I'm not sure what else could be the issue.

Getting weird responses with my fine-tuned model

I was thinking swapping or something. Check all your VM stats to see what might be going on, like whether it is even busy on the CPU.
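One quick way to check for swapping and CPU load, as a sketch using only the Python standard library. It assumes a Linux VM where `/proc/meminfo` is available; the helper names `read_meminfo` and `vm_snapshot` are made up for this example:

```python
import os


def read_meminfo(path="/proc/meminfo"):
    """Parse a /proc/meminfo style file (Linux) into a dict of integer values."""
    stats = {}
    with open(path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            parts = rest.split()
            if parts and parts[0].isdigit():
                stats[key.strip()] = int(parts[0])  # most entries are in kB
    return stats


def vm_snapshot(meminfo_path="/proc/meminfo"):
    """Collect a few numbers that hint at swapping or an idle CPU."""
    mem = read_meminfo(meminfo_path)
    try:
        load_1min = os.getloadavg()[0]
    except (OSError, AttributeError):
        load_1min = None  # load average is not available on this platform
    return {
        "load_1min": load_1min,
        "cpu_count": os.cpu_count(),
        "mem_available_kb": mem.get("MemAvailable"),
        "swap_used_kb": mem.get("SwapTotal", 0) - mem.get("SwapFree", 0),
    }


# Hypothetical usage on the training VM: print(vm_snapshot())
```

A large `swap_used_kb`, together with a 1-minute load far below `cpu_count`, would be consistent with the process waiting on memory rather than computing. Tools such as `top` or `vmstat` report the same information.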