Nathan Cooper comments

Results: 23 comments by Nathan Cooper

Open-Dialog Chatbots for Learning New Languages [Part 1] | IAmANerd

@TheHmmka I trained the larger model on one of my school's machines that had 4 1080ti's. I'm sure you could train it on a cloud service relatively easily though, but...

Open-Dialog Chatbots for Learning New Languages [Part 1] | IAmANerd

Hey @etrigger, could you show me the error you are getting when trying to download or generate the data? I tried to reproduce this, but it was working for me

Open-Dialog Chatbots for Learning New Languages [Part 1] | IAmANerd

@etrigger what an interesting error. I did a bit of digging and it seems to be an issue with Colab in certain situations. Here is an issue about it: https://github.com/googlecolab/colabtools/issues/1771,...

Open-Dialog Chatbots for Learning New Languages [Part 1] | IAmANerd

@etrigger I have the format that DialoGPT requires in the data section of my blog: https://nathancooper.io/i-am-a-nerd/chatbot/deep-learning/gpt2/2020/05/12/chatbot-part-1.html#The-Data!. I recommend first trying to get it into a format that my code expects...

What I Learned (WIL) Neuroscience Month [Part 1] | IAmANerd

> > (don’t tell my Ph.D. advisor I said that). > > _laughs in spanish_ > > My brain got me at the ball and bat though. > > One...

Awesome Things I Learned Creating My Own Website | IAmANerd

Testing comment feature!

DeepSpeed Investigation: What I Learned | IAmANerd

@samyam thanks for the comment and for discussing the use case for a single GPU, it clears up a lot of my confusion. I will test that out and update with...

DeepSpeed Investigation: What I Learned | IAmANerd

@samyam I did what you recommended and got much better results using larger batch sizes (doubling the batch size of the t5-large model compared to not using DeepSpeed). One...

DeepSpeed Investigation: What I Learned | IAmANerd

@thakkarparth007 that is a good point that I didn't think of. To me it does seem like gradient accumulation would be better for most cases except for the one you...

Make into proper lib add additional dependencies and CLI

@thiswillbeyourgithub I think I addressed all your comments. Lemme know what you think.