fucktx

Results 18 comments of fucktx

> Does parsing the CSV report any errors?

No errors, it's just very slow. One thing I can confirm is that the news content is quite long; some articles run to over 10,000 Chinese characters.

> > The CSV file with 2,000+ news articles took over ten hours to initialize. Is there another way to initialize it, e.g. splitting it into 100 small documents, initializing each one separately, and then merging the contents under the output folder?
>
> Are you using the OpenAI API or Azure OpenAI? Or some other model? Initializing data with other models seems to hit similar problems.

I'm using another third-party model; the endpoint is stable and rarely errors. I kept the default settings file and only changed `concurrent_requests: 10` and `batch_size: 5`.
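Splitting the input CSV into smaller parts before indexing, as asked above, can at least make progress visible and restarts cheaper. A minimal sketch using only Python's standard `csv` module; the function name `split_csv`, the `part_NNN.csv` naming, and `rows_per_file` are all hypothetical choices, and note that GraphRAG does not officially support merging separately built `output` folders:

```python
import csv
import os


def _write_part(out_dir, part, header, rows):
    """Write one part file, repeating the header row."""
    path = os.path.join(out_dir, f"part_{part:03d}.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def split_csv(src_path, out_dir, rows_per_file=100):
    """Split a large CSV into smaller CSVs of at most rows_per_file data rows.

    Returns the number of part files written. Hypothetical helper:
    GraphRAG itself ships no such utility.
    """
    os.makedirs(out_dir, exist_ok=True)
    with open(src_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        part = 0
        rows = []
        for row in reader:
            rows.append(row)
            if len(rows) == rows_per_file:
                _write_part(out_dir, part, header, rows)
                part += 1
                rows = []
        if rows:  # trailing partial batch
            _write_part(out_dir, part, header, rows)
            part += 1
    return part
```

For the 2,000-item CSV from the thread, `rows_per_file=100` would yield 20 parts that can be indexed one at a time.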

> Did you change the chunk size to 1200 and the overlap to 100?

No, I didn't.

> How much concurrency does your LLM model service support?

Chat: deepseek-chat (DeepSeek), `concurrent_requests: 10`. Embeddings: embedding-2 (Zhipu), `concurrent_requests: 5`.
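Both knobs discussed in this thread (per-model concurrency and chunk size/overlap) live in GraphRAG's generated `settings.yaml`. A sketch of the relevant fragment, assuming the field names used by GraphRAG 0.x (`llm`, `embeddings.llm`, `chunks`); check the `settings.yaml` your `--init` run produced for the exact nesting in your version:

```yaml
# Sketch only: GraphRAG 0.x settings.yaml field names assumed.
llm:
  model: deepseek-chat        # third-party chat model from the thread
  concurrent_requests: 10     # parallel chat requests

embeddings:
  llm:
    model: embedding-2        # Zhipu embedding model from the thread
    concurrent_requests: 5

chunks:
  size: 1200                  # tokens per chunk; larger chunks mean fewer requests
  overlap: 100                # token overlap between adjacent chunks
```

Fewer, larger chunks reduce the total number of LLM calls, which is why raising the chunk size is suggested in #460.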

> How are your chunks divided? Or should we divide chunks according to the OpenAI tiktoken?

It is done through `python -m graphrag.index --init --root ./`. The...

> This is the original discussion about chunk size, which should be able to decrease the total requests and your token consumption. #460

OK, thanks.

> For the official chunking logic you are using, you can refer to @KylinMountain's suggestion and try increasing the chunk size. It is best to combine this with the logs to carefully...

@kamikazechaser Hi, please evaluate whether this can be merged.

No, thank you.