wallon-ai
> Is everyone still having this issue? I keep getting the same interruption, "Stream interrupted (client disconnected)." I'm hoping that it's just the stream that is interrupted and not the...
+1
> @better629 My inference is also slow, though I only use a single RTX-8000 GPU. I even load the model using `load_in_8bit=True`. The inference takes around 12s for `max_new_tokens=64` and...
Same issue here. I'm on Ubuntu.
> You ran out of GPU memory. Please give more detail on your setup: what you are using and what command you ran to resolve it.

`batch_size=4`
I'm running into the same problem. Has a solution been found yet?
Hi, when the document gets a bit longer, the model doesn't seem to pick up the content near the end of the document. Do you know how to solve this?
But GPT-4 still has a length limit, so I'm especially curious how ChatPDF solves this problem. (Replying by email to [fierceX/Document_QA] Issue #1, "Why not use the Embedding recall results directly?")

> @wallon-ai The best option is probably GPT-4 or targeted fine-tuning, but fine-tuning would be very costly (OpenAI has not released a fine-tuning API yet). GPT-4 has a longer context, which would solve the long-document problem.
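The exchange above is about working around the model's context limit by retrieving only the relevant pieces of a long document. Here is a minimal sketch of that retrieve-then-read pattern, assuming a chunk-and-rank design; the `embed` function below is a toy word-frequency vector purely for illustration, a real system like the one discussed would use a proper embedding model:

```python
# Sketch of retrieve-then-read for long-document QA (illustrative only).
from collections import Counter
import math

def chunk(text: str, size: int = 200) -> list[str]:
    """Split a long document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_chunks(doc: str, question: str, k: int = 2) -> list[str]:
    """Rank chunks by similarity to the question; only the top-k are
    placed in the prompt, keeping it under the model's context limit."""
    q = embed(question)
    return sorted(chunk(doc), key=lambda c: cosine(embed(c), q),
                  reverse=True)[:k]
```

Only the top-ranked chunks are sent to the model, so the prompt stays within the context window no matter how long the source document is.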