len

Results: 120 comments of len

The ChatGPT training process is publicly available, but the results are highly dependent on the fine-tuning datasets that OpenAI used, which are hard to reproduce. I think we can try WebGPT, which...

I saw a project about ChatGPT and felt compelled to share it: https://github.com/hpcaitech/ColossalAI They have made a very impressive optimization of the training process. It is said that ChatGPT (small) can be...

I know this is an old issue, but it seems no one has found a solution yet. I have found a solution that works well for me, so I want...

It looks like encoder => gpt => decoder. Maybe each part can be trained alone? nanoBERT + nanoGPT

I tried this U-Net architecture, and it doesn't seem to work well for text generation; the trained model outputs a lot of noise. According to my tests, this noise (garbled...

> > That is to say, the U-Net model tends to require input of the same size as the training data; otherwise it will force-complete the insufficient parts use...

> Sorry for the noise, but I have one last question. > > > But I still didn't get good results, only word-level learning (based on tiktoken's training data); it can output...

> Are you padding the input? If you are not padding the input to be a multiple of 32 (5 compression blocks, so: 2^5 = 32), it will have issues....
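The padding requirement described above (5 compression blocks, each halving the sequence length, so lengths must be a multiple of 2^5 = 32) can be sketched as follows. This is a minimal illustration; the function name and pad token are hypothetical, not from the project being discussed.

```python
# Hypothetical sketch: pad a token sequence so its length is a multiple of 32,
# since 5 compression blocks each halve the length (2^5 = 32).
def pad_to_multiple(tokens, multiple=32, pad_id=0):
    remainder = len(tokens) % multiple
    if remainder == 0:
        return list(tokens)
    # Append pad tokens up to the next multiple.
    return list(tokens) + [pad_id] * (multiple - remainder)

padded = pad_to_multiple(list(range(70)))
print(len(padded))  # 96, the next multiple of 32 above 70
```

Without this padding, the innermost block would receive a sequence that cannot be evenly halved five times, which matches the issues described above.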

Thank you for the suggestions. The solution proposed by @shahradelahi to create an async function inside the event callback is essentially another form of the quick_message middleware that I have implemented, as...
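The pattern being compared above, spawning an async function from inside an event callback so the handler itself returns immediately, can be sketched like this. This is an assumption-laden illustration: `on_message` and `send_reply` are hypothetical stand-ins, not the actual library's or middleware's API.

```python
# Hedged sketch: creating an async task inside a synchronous event callback.
import asyncio

async def send_reply(text):
    await asyncio.sleep(0)          # stand-in for network I/O
    return f"sent: {text}"

def on_message(loop, text, results):
    # Define an async function inside the event callback and schedule it;
    # the callback returns immediately while the task runs on the loop.
    async def handle():
        results.append(await send_reply(text))
    loop.create_task(handle())

async def main():
    results = []
    on_message(asyncio.get_running_loop(), "hello", results)
    await asyncio.sleep(0.01)       # give the scheduled task time to finish
    return results

print(asyncio.run(main()))  # ['sent: hello']
```

A dedicated middleware layer centralizes the same fire-and-forget behavior instead of repeating it in every callback, which appears to be the comparison being made.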