len

Results: 120 comments of len

The ChatGPT training process is publicly available, but the results are highly dependent on the fine-tuning datasets that OpenAI used, which are hard to reproduce. I think we can try WebGPT, which...

I saw a project about ChatGPT and felt compelled to share it: https://github.com/hpcaitech/ColossalAI They have made a very impressive optimization of the training process. It is said that ChatGPT (small) can be...

I know this is an old issue, but it seems no one has found a solution yet. I have found a solution that works well for me, so I want...

It looks like encoder => gpt => decoder. Maybe each part can be trained alone? nanoBERT + nanoGPT

I tried this U-Net architecture, and it doesn't seem to work well for text generation; the trained model outputs a lot of noise. According to my tests, this noise (garbled...

> > That is to say, the U-Net model tends to require input of the same size as the training data; otherwise it will force-complete the insufficient parts use...

> Sorry for the noise, but I have one last question. > > > But I still didn't get good results, only word-level learning (based on tiktoken's training data); it can output...

> Are you padding the input? If you are not padding the input to be a multiple of 32 (5 compression blocks, so: 2^5 = 32), it will have issues....
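The padding requirement described above (5 compression blocks, each halving the sequence length, so lengths must be a multiple of 2^5 = 32) can be sketched as follows. This is a minimal illustration; the function name and pad token are hypothetical, not from the project being discussed.

```python
# Hypothetical sketch: pad a token sequence so its length is a multiple of 32,
# since 5 compression blocks each halve the length (2^5 = 32).
def pad_to_multiple(tokens, multiple=32, pad_id=0):
    remainder = len(tokens) % multiple
    if remainder == 0:
        return list(tokens)
    # Append pad tokens up to the next multiple.
    return list(tokens) + [pad_id] * (multiple - remainder)

padded = pad_to_multiple(list(range(70)))
print(len(padded))  # 96, the next multiple of 32 above 70
```

Without this padding, the innermost block would receive a sequence that cannot be evenly halved five times, which matches the issues described above.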

Thank you for the suggestions. The solution proposed by @shahradelahi to create an async function inside the event callback is essentially another form of the quick_message middleware that I have implemented, as...
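The pattern being compared above, spawning an async function from inside an event callback so the handler itself returns immediately, can be sketched like this. This is an assumption-laden illustration: `on_message` and `send_reply` are hypothetical stand-ins, not the actual library's or middleware's API.

```python
# Hedged sketch: creating an async task inside a synchronous event callback.
import asyncio

async def send_reply(text):
    await asyncio.sleep(0)          # stand-in for network I/O
    return f"sent: {text}"

def on_message(loop, text, results):
    # Define an async function inside the event callback and schedule it;
    # the callback returns immediately while the task runs on the loop.
    async def handle():
        results.append(await send_reply(text))
    loop.create_task(handle())

async def main():
    results = []
    on_message(asyncio.get_running_loop(), "hello", results)
    await asyncio.sleep(0.01)       # give the scheduled task time to finish
    return results

print(asyncio.run(main()))  # ['sent: hello']
```

A dedicated middleware layer centralizes the same fire-and-forget behavior instead of repeating it in every callback, which appears to be the comparison being made.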