KAG [Bug] [KAG] Error when running the bundled examples: Index query vector has 1536 dimensions, but indexed vectors have 1024.

Search before asking

[x] I had searched in the issues and found no similar issues.

Operating system information

Linux

What happened

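After starting the four services with the docker-compose file from dev/release, I logged in to the UI, changed the password, and configured the global parameters. Uploading knowledge through the web page works fine. Running KAG's example/baike or medicine, however, fails with the same error in both cases. The failing step is cd builder && python indexer.py: org.neo4j.driver.exceptions.ClientException: Failed to invoke procedure db.index.vector.queryNodes: Caused by: java.lang.IllegalArgumentException: Index query vector has 1536 dimensions, but indexed vectors have 1024.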

How to reproduce

git clone kag

cd example/baike
Edit kag_config.yaml
knext project restore .
knext schema commit
cd builder && python indexer.py (this step fails):
org.neo4j.driver.exceptions.ClientException: Failed to invoke procedure db.index.vector.queryNodes: Caused by: java.lang.IllegalArgumentException: Index query vector has 1536 dimensions, but indexed vectors have 1024.

Are you willing to submit PR?

[ ] Yes I am willing to submit a PR!

Jan 17 '25 04:01 zzyyll2

Modify kag_config.yaml.

Jan 17 '25 07:01 caszkgui

kag_config.yaml has already been modified, as follows:

Jan 17 '25 09:01 zzyyll2

I found the cause: the interface does not pass dimensions. Why is dimensions not passed here?

Jan 17 '25 17:01 zzyyll2

I found the cause: the interface does not pass dimensions. Why is dimensions not passed here?

The vector dimension of the text-embedding-ada-002 model is fixed at 1536. It is not a configurable parameter.

Jan 21 '25 02:01 xionghuaidong
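The mismatch described above can be caught before indexing by comparing the configured dimension with the length of a real embedding. The following is a minimal sketch, not KAG code: `check_vector_dimensions` and `KNOWN_DIMENSIONS` are hypothetical names, and the only model sizes listed are the ones mentioned in this thread.

```python
from typing import Sequence

# Output sizes mentioned in this thread (assumption: extend as needed).
KNOWN_DIMENSIONS = {
    "text-embedding-ada-002": 1536,  # fixed, per the reply above
    "bge-m3": 1024,                  # size expected by the last commenter
}


def check_vector_dimensions(model: str, configured: int,
                            sample_embedding: Sequence[float]) -> None:
    """Raise ValueError if the configured dimension cannot match the model.

    `configured` stands for the vector_dimensions value in the config;
    `sample_embedding` is one vector actually returned by the model.
    """
    actual = len(sample_embedding)
    if actual != configured:
        raise ValueError(
            f"{model} returned {actual} dimensions, "
            f"but vector_dimensions is set to {configured}"
        )
    expected = KNOWN_DIMENSIONS.get(model)
    if expected is not None and expected != configured:
        raise ValueError(
            f"{model} is known to produce {expected} dimensions, "
            f"but vector_dimensions is set to {configured}"
        )


# Reproduces the situation in this issue: a 1536-d vector against a 1024-d setting.
try:
    check_vector_dimensions("text-embedding-ada-002", 1024, [0.0] * 1536)
except ValueError as err:
    print(err)
```

Running a check like this once against a single test sentence, before `python indexer.py`, would surface the problem as a readable message instead of a Neo4j `ClientException`.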

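OK, I understand now: the value of vector_dimensions has to match the model. I hope this tip can be added to the documentation so that others can avoid the same detour. Thanks.

As shown in the screenshot above, suppose I use text-embedding-3-large, whose dimension is variable, with a minimum of 512 and a maximum of 3072. If I use this model, is setting vector_dimensions all that is needed? Can I set the value to 512?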

Jan 21 '25 07:01 zzyyll2
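For context on the question above: the OpenAI embeddings endpoint accepts an optional `dimensions` field in the request body for the text-embedding-3 models, while text-embedding-ada-002 always returns 1536 values. The sketch below only builds such a request body; `build_embedding_request` is a hypothetical helper, and whether KAG forwards this field is not established in this thread. Setting vector_dimensions alone would presumably not shorten the vectors unless the request also asks for that size.

```python
from typing import Any, Dict, Optional

# Models with a fixed output size (from the reply earlier in this thread).
FIXED_DIMENSION_MODELS = {"text-embedding-ada-002": 1536}


def build_embedding_request(model: str, text: str,
                            dimensions: Optional[int] = None) -> Dict[str, Any]:
    """Build a JSON body for POST /v1/embeddings (hypothetical helper)."""
    body: Dict[str, Any] = {
        "model": model,
        "input": text,
        "encoding_format": "float",
    }
    if dimensions is None:
        return body
    fixed = FIXED_DIMENSION_MODELS.get(model)
    if fixed is None:
        # Variable-size model: ask the service for the shortened vector.
        body["dimensions"] = dimensions
    elif dimensions != fixed:
        raise ValueError(
            f"{model} always returns {fixed} dimensions; "
            f"cannot request {dimensions}"
        )
    return body


request = build_embedding_request("text-embedding-3-large", "hello", dimensions=512)
print(request)
```

The same number would then have to be used for vector_dimensions so that the index and the query vectors agree.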

I ran into a similar issue with mismatched query-vector and index-vector dimensions while testing. I requested the bge-m3 model through the openai model type, but the returned embedding has 512 dimensions instead of 1024, and the values are all close to 0. The HTTP request in the code seems to set encoding_format to base64 instead of float; when I set it to float, the response has 1024 dimensions. I'm not sure whether this parameter is the cause of the mismatched dimensions. Do the indexing and querying code share the same embedding procedure and parameters? If so, there should be no mismatch between indexing and querying.

Feb 25 '25 03:02 zoidburg
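One possible explanation for the 512-versus-1024 observation above, which this thread does not confirm: with encoding_format set to base64, the embedding is sent as packed 32-bit floats, so 1024 values occupy 4096 bytes. If either side treats those bytes as 64-bit floats, the result is exactly 512 numbers. The sketch below demonstrates only that arithmetic on synthetic data; `decode_embedding` is a hypothetical helper and says nothing about how KAG or the serving backend actually decodes.

```python
import base64
import struct
from typing import List


def decode_embedding(b64: str, fmt: str = "f") -> List[float]:
    """Decode a base64 string of little-endian floats.

    fmt is a struct code: "f" for 32-bit floats, "d" for 64-bit floats.
    """
    raw = base64.b64decode(b64)
    item_size = struct.calcsize("<" + fmt)
    count = len(raw) // item_size
    return list(struct.unpack("<%d%s" % (count, fmt), raw[:count * item_size]))


# Synthetic 1024-d vector, packed the way a base64 float32 payload would be.
values = [0.01 * (i % 7) for i in range(1024)]
payload = base64.b64encode(struct.pack("<1024f", *values)).decode("ascii")

as_float32 = decode_embedding(payload, "f")  # correct interpretation
as_float64 = decode_embedding(payload, "d")  # wrong width: half as many values

print(len(as_float32), len(as_float64))  # 1024 512
```

In this synthetic case the misread 64-bit values also come out much closer to 0 than the originals, which matches the symptom reported above. Comparing the byte length of the decoded payload with 4 times the expected dimension is a quick way to test this hypothesis against a real response.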