Matthew Hayes
I've done some investigation on this. Our pipeline takes the instruction and formats it into a prompt (the same prompt used for training). The prompt is about 23 tokens when...
I'll have to think more about whether we should make any changes to the pipeline or model config. We could compute the max length by doing the math as I...
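The math in question can be sketched as follows. This is only an illustration: the context window (2048) is an assumed figure, and the 23-token prompt overhead is the approximate number mentioned above, not an exact constant.

```python
# Sketch of the max-length arithmetic (assumed numbers; the real context
# window and prompt overhead depend on the model and the prompt template).
MODEL_MAX_TOKENS = 2048   # assumed context window
PROMPT_OVERHEAD = 23      # approximate prompt-template tokens, per above

def max_new_tokens(instruction_tokens: int) -> int:
    """Tokens left for generation after the template and instruction."""
    return max(0, MODEL_MAX_TOKENS - PROMPT_OVERHEAD - instruction_tokens)

print(max_new_tokens(100))  # 1925
```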
Are you sure that path exists and contains the model? Usually I’ve seen that when the model does not exist at that path or some expected files are missing.
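One quick way to rule this out is to list the directory and compare against the files you expect. A minimal sketch, assuming a local checkpoint directory; the expected file names here are placeholders and vary by model format:

```python
import os

# Hypothetical sanity check for a local model directory.
# EXPECTED is an assumption; the actual required files depend on the
# model format (e.g. safetensors vs. PyTorch bin shards).
EXPECTED = ["config.json"]

def find_missing(model_dir: str) -> list:
    """Return expected files not present in model_dir."""
    if not os.path.isdir(model_dir):
        return ["<directory itself>"]
    present = set(os.listdir(model_dir))
    return [f for f in EXPECTED if f not in present]
```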
Rather than checking translations of the dataset into this repo, I think it's preferable to upload the translation to Hugging Face. We can add a note in the README...
I don't have a good explanation for why it jumps on your own checkpoint but not the pretrained GPT-J-6B. I have noticed this type of behavior before when training and...
> it always contain my original prompt which I don't need

The standard `text-generation` pipeline will output the original prompt with the completion appended. From your code above it appears this...
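If you only want the completion, you can strip the echoed prompt off the front of the returned text. A minimal sketch (pure string handling, independent of any library):

```python
def strip_prompt(generated: str, prompt: str) -> str:
    """Remove the echoed prompt prefix from a generation's output text."""
    if generated.startswith(prompt):
        return generated[len(prompt):].lstrip()
    return generated
```

If I recall the `transformers` API correctly, the `text-generation` pipeline also accepts `return_full_text=False`, which makes it return only the completion in the first place.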
> sometimes it can have many repeated sentences/phrases

It's hard to know what could be causing this. It may have to do with the custom dataset. You can also try using...
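One way to quantify the repetition before tweaking anything is a rough n-gram check over the output text. A sketch (the choice of word 4-grams is arbitrary):

```python
from collections import Counter

def repeated_ngrams(text: str, n: int = 4) -> list:
    """Return word n-grams that occur more than once (rough repetition check)."""
    words = text.split()
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return [g for g, c in Counter(grams).items() if c > 1]
```

If repetition is heavy, generation parameters such as `repetition_penalty` or `no_repeat_ngram_size` (both supported by `transformers` generation, to my knowledge) are worth experimenting with.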
> takes about 30s to generate a response on p3dn.24xlarge, which seems too long and not normal?

I haven't used that particular machine so I'm not sure what would be normal...
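To separate steady-state latency from one-time warm-up cost (the first call can include things like CUDA initialization), it helps to time several calls individually. A small helper sketch:

```python
import time

def time_call(fn, *args, **kwargs):
    """Time a single call; run it a few times to separate warm-up from steady state."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Timing the pipeline call two or three times in a row should show whether the 30s figure is a consistent per-request cost or mostly first-call overhead.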
Have you tried restarting the machine? Maybe restarting the kernel isn’t enough. Since it worked the first time before restarting the kernel, it seems there isn’t an issue with the...
I've fixed the dataset issue.