
[Bug]: LLMBundle.encode can't update token usage

Open mayanlong2020 opened this issue 1 year ago • 5 comments

Is there an existing issue for the same bug?

  • [X] I have checked the existing issues.

RAGFlow workspace code commit ID

n/a

RAGFlow image version

nightly(v0.15.0-17-g35580af8 full)

Other environment information

Ubuntu 24
Ragflow nightly with infinity doc engine

Actual behavior

* The parsing process hung without showing any error message on the front-end page. [image: log7]

Meanwhile, the backend log shows many LLMBundle.encode errors; it seems it can't update the LLM token usage data. [image: log6]

* My task info is listed below. The LLM status is good, so this really confuses me.

2024-12-22 00:50:35,739 INFO 20 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2024-12-22T00:50:35.737928", "boot_at": "2024-12-22T00:46:35.409047", "pending": 1, "lag": 38, "done": 133, "failed": 0, "current": {"id": "61e542aabfbb11efab780242ac150006", "doc_id": "59bb2120bfa311efb8850242ac150006", "from_page": 0, "to_page": 12, "retry_count": 0, "kb_id": "eb856922bfa211ef97d60242ac150006", "parser_id": "book", "parser_config": {"auto_keywords": 3, "auto_questions": 1, "raptor": {"use_raptor": true, "prompt": "\u8bf7\u603b\u7ed3\u4ee5\u4e0b\u6bb5\u843d\u3002 \u5c0f\u5fc3\u6570\u5b57\uff0c\u4e0d\u8981\u7f16\u9020\u3002 \u6bb5\u843d\u5982\u4e0b\uff1a\n {cluster_content}\n\u4ee5\u4e0a\u5c31\u662f\u4f60\u9700\u8981\u603b\u7ed3\u7684\u5185\u5bb9\u3002", "max_token": 512, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}}, "name": "2023\u5e74\u7248GMP\u6307\u5357-\u8d28\u91cf\u7ba1\u7406\u4f53\u7cfb.pdf", "type": "pdf", "location": "2023\u5e74\u7248GMP\u6307\u5357-\u8d28\u91cf\u7ba1\u7406\u4f53\u7cfb.pdf", "size": 7081164, "tenant_id": "dd78793cbfa211efaa260242ac150006", "language": "Chinese", "embd_id": "nomic-ai/nomic-embed-text-v1.5@FastEmbed", "pagerank": 10, "img2txt_id": "", "asr_id": "", "llm_id": "Qwen/Qwen2-7B-Instruct@SILICONFLOW", "update_time": 1734799699414}}

* After switching to another LLM, the error message is the same.

2024-12-22 01:06:57,576 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: None, progress_msg: Page(25~37): Generate 94 chunks
2024-12-22 01:06:57,787 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 304
2024-12-22 01:06:57,789 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.601063829787234, progress_msg:
2024-12-22 01:06:57,924 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 304
2024-12-22 01:06:57,928 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.6180851063829788, progress_msg:
2024-12-22 01:06:58,064 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 304
2024-12-22 01:06:58,067 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.6351063829787233, progress_msg:
2024-12-22 01:06:58,203 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 304
2024-12-22 01:06:58,206 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.652127659574468, progress_msg:
2024-12-22 01:06:58,344 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 304
2024-12-22 01:06:58,347 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.6691489361702128, progress_msg:
2024-12-22 01:06:58,477 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 266
2024-12-22 01:06:58,480 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.6861702127659575, progress_msg:
2024-12-22 01:06:58,820 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 736
2024-12-22 01:06:58,823 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.702127659574468, progress_msg:
2024-12-22 01:06:59,281 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 784
2024-12-22 01:06:59,285 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.7361702127659574, progress_msg:
2024-12-22 01:06:59,489 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 432
2024-12-22 01:06:59,492 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.7702127659574468, progress_msg:
2024-12-22 01:06:59,863 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 816
2024-12-22 01:06:59,866 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.8042553191489361, progress_msg:
2024-12-22 01:07:00,133 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 624
2024-12-22 01:07:00,136 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.8382978723404255, progress_msg:
2024-12-22 01:07:00,390 ERROR 19 LLMBundle.encode can't update token usage for dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 588
2024-12-22 01:07:00,393 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: 0.8723404255319148, progress_msg:
2024-12-22 01:07:00,400 INFO 19 Embedding chunks (2.82s)
2024-12-22 01:07:00,403 INFO 19 set_progress(512598e6bfbd11efb3870242ac150006), progress: None, progress_msg: Page(25~37): Embedding chunks (2.82s)
2024-12-22 01:07:00,414 INFO 19 INFINITY created table ragflow_dd78793cbfa211efaa260242ac150006_eb856922bfa211ef97d60242ac150006, vector size 768
2024-12-22 01:07:15,590 INFO 19 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2024-12-22T01:07:15.589376", "boot_at": "2024-12-22T01:03:45.295776", "pending": 1, "lag": 36, "done": 41, "failed": 0, "current": {"id": "512598e6bfbd11efb3870242ac150006", "doc_id": "59bb2120bfa311efb8850242ac150006", "from_page": 24, "to_page": 36, "retry_count": 0, "kb_id": "eb856922bfa211ef97d60242ac150006", "parser_id": "book", "parser_config": {"auto_keywords": 3, "auto_questions": 1, "raptor": {"use_raptor": true, "prompt": "\u8bf7\u603b\u7ed3\u4ee5\u4e0b\u6bb5\u843d\u3002 \u5c0f\u5fc3\u6570\u5b57\uff0c\u4e0d\u8981\u7f16\u9020\u3002 \u6bb5\u843d\u5982\u4e0b\uff1a\n {cluster_content}\n\u4ee5\u4e0a\u5c31\u662f\u4f60\u9700\u8981\u603b\u7ed3\u7684\u5185\u5bb9\u3002", "max_token": 512, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}}, "name": "2023\u5e74\u7248GMP\u6307\u5357-\u8d28\u91cf\u7ba1\u7406\u4f53\u7cfb.pdf", "type": "pdf", "location": "2023\u5e74\u7248GMP\u6307\u5357-\u8d28\u91cf\u7ba1\u7406\u4f53\u7cfb.pdf", "size": 7081164, "tenant_id": "dd78793cbfa211efaa260242ac150006", "language": "Chinese", "embd_id": "nomic-ai/nomic-embed-text-v1.5@FastEmbed", "pagerank": 10, "img2txt_id": "", "asr_id": "", "llm_id": "THUDM/glm-4-9b-chat@SILICONFLOW", "update_time": 1734800530312}}
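
To gauge how often this error fires and how many tokens go unrecorded, the ERROR entries in the log above can be tallied per tenant. The following is a minimal sketch that assumes only the log line format shown in this issue; `TOKEN_ERROR` and `summarize_token_errors` are hypothetical helper names, not part of RAGFlow.

```python
import re
from collections import defaultdict

# Matches the ERROR entries as they appear in the pasted log (assumed format).
TOKEN_ERROR = re.compile(
    r"ERROR\s+\d+\s+LLMBundle\.encode can't update token usage for "
    r"(?P<tenant>[0-9a-f]+)/(?P<kind>[A-Z0-9_]+) used_tokens: (?P<tokens>\d+)"
)

def summarize_token_errors(log_text):
    """Return {(tenant_id, model_type): (error_count, total_used_tokens)}."""
    summary = defaultdict(lambda: [0, 0])
    # finditer scans the whole text, so it works whether the log has one
    # entry per line or several entries run together on one line.
    for match in TOKEN_ERROR.finditer(log_text):
        entry = summary[(match["tenant"], match["kind"])]
        entry[0] += 1
        entry[1] += int(match["tokens"])
    return {key: tuple(value) for key, value in summary.items()}

# Two entries copied from the log above.
sample = (
    "2024-12-22 01:06:57,787 ERROR 19 LLMBundle.encode can't update token usage for "
    "dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 304 "
    "2024-12-22 01:06:58,477 ERROR 19 LLMBundle.encode can't update token usage for "
    "dd78793cbfa211efaa260242ac150006/EMBEDDING used_tokens: 266"
)
print(summarize_token_errors(sample))
```

Comparing the error count with the number of set_progress entries in the same window shows whether every embedding batch triggers the error or only some of them.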

Expected behavior

No response

Steps to reproduce

n/a

Additional information

No response

mayanlong2020 avatar Dec 21 '24 17:12 mayanlong2020

You can ignore that error message; it doesn't affect anything.

KevinHuSh avatar Dec 23 '24 02:12 KevinHuSh

No. If I ignore it, I get no response to my question.

f3550491 avatar Feb 16 '25 14:02 f3550491

I encountered the same bug. Is there any progress on this?

Extract embeddings: 100%|██████████| 1/1 [00:00<00:00, 11.94it/s]
2025-02-28 15:05:37,303 ERROR    42 LLMBundle.encode can't update token usage for fe3bb6d6f33511ef87ba0242c0a89006/EMBEDDING used_tokens: 191
2025-02-28 15:05:37,310 INFO     42 PUT http://es01:9200/ragflow_fe3bb6d6f33511ef87ba0242c0a89006/_bulk?refresh=false&timeout=60s [status:200 duration:0.004s]
2025-02-28 15:05:37,364 INFO     42 POST http://es01:9200/ragflow_fe3bb6d6f33511ef87ba0242c0a89006/_search [status:200 duration:0.004s]
2025-02-28 15:05:37,417 INFO     42 POST http://es01:9200/ragflow_fe3bb6d6f33511ef87ba0242c0a89006/_search [status:200 duration:0.005s]
2025-02-28 15:05:37,483 INFO     42 POST http://es01:9200/ragflow_fe3bb6d6f33511ef87ba0242c0a89006/_search [status:200 duration:0.005s]
2025-02-28 15:05:37,485 INFO     42 Trigger summary: (MINERVA-DELTA技术平台, 消化道癌)
2025-02-28 15:05:39,801 INFO     42 HTTP Request: POST https://api.lingyiwanwu.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-02-28 15:05:39,874 INFO     42 POST http://es01:9200/ragflow_fe3bb6d6f33511ef87ba0242c0a89006/_search [status:200 duration:0.003s]
Extract embeddings: 100%|██████████| 1/1 [00:00<00:00,  9.09it/s]
2025-02-28 15:05:39,991 ERROR    42 LLMBundle.encode can't update token usage for fe3bb6d6f33511ef87ba0242c0a89006/EMBEDDING used_tokens: 183

biofer avatar Feb 28 '25 07:02 biofer

You can ignore this error message: LLMBundle.encode can't update token usage.

KevinHuSh avatar Feb 28 '25 08:02 KevinHuSh

I got the same error message on Ubuntu 24 with RAGFlow v0.16, and it drives me crazy. I hit this error when using the bge-large-zh-v1.5 embedding model.

dayu26 avatar Mar 02 '25 16:03 dayu26