# Issue: Token limit not enforced: `tokens_limit_reached` errors and unpacking crash
**Test setup:**

- PageIndex CLI (main branch, April 27)
- Model: `gpt-4o-mini` (context ~8k tokens)
- Python 3.10 on Ubuntu 22.04
- PDF: 22 pages (~14k tokens)
- Command:

```
python3 run_pageindex.py --pdf_path docs/q1-fy25-earnings.pdf --model gpt-4o-mini --max-tokens-per-node 100 --max-pages-per-node 1
```
### Observed Behavior
- Repeated 413 errors (`tokens_limit_reached`) from the OpenAI API despite:
  - low token and page limits (`--max-tokens-per-node 100`)
  - a small model (`gpt-4o-mini`)
- Eventually crashes with:

  ```
  response, finish_reason = ChatGPT_API_with_finish_reason(model=model, prompt=prompt)
  ValueError: too many values to unpack (expected 2)
  ```

- Excessive retries: the code keeps attempting oversized calls without checking the token count first.
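For context, that `ValueError` fires whenever the call returns something other than exactly two values, which suggests the error path returns a different shape than the success path. A minimal illustration (not PageIndex code):

```python
def returns_three():
    # stand-in for an API wrapper whose error path returns extra values
    return "text", "stop", {"usage": 123}

response, finish_reason = returns_three()
# ValueError: too many values to unpack (expected 2)
```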
### Root Cause (Hypothesis)
- `--max-tokens-per-node` is not actually enforced during prompt construction.
- There is no capping based on the maximum token count the model supports; maybe adding a max-token flag would be useful? (See the diagnostic sketch below.)
- Functions like `generate_toc_init`, `generate_toc_continue`, and `ChatGPT_API_with_finish_reason` appear to send raw or multi-page inputs to OpenAI without validating token length.
- The system assumes high-context models (e.g., GPT-4-32k) but fails silently on lower ones.
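To illustrate the kind of pre-flight check that would confirm this, here is a minimal sketch using `tiktoken`; `count_tokens`, `node_text`, and the 8,000-token ceiling are my own placeholders, not PageIndex internals:

```python
import tiktoken

def count_tokens(text: str, model: str = "gpt-4o-mini") -> int:
    """Count tokens the way the API will; fall back to a generic encoding."""
    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        enc = tiktoken.get_encoding("cl100k_base")  # generic fallback
    return len(enc.encode(text))

node_text = "..."  # placeholder: whatever text a node is about to send
if count_tokens(node_text) > 8000:  # the ceiling I appear to be hitting
    print("Prompt would exceed the model limit; truncate or skip it")
```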
### Expected Behavior
The CLI should:

- clip or truncate input prompts based on `--max-tokens-per-node` (a truncation sketch follows this list)
- warn or skip LLM usage if the input exceeds model limits
- avoid crashing if the API call returns an unexpected format (the unpacking issue)
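A rough sketch of the clipping behavior I would expect; `truncate_to_budget` is a hypothetical helper, not an existing PageIndex function:

```python
import tiktoken

def truncate_to_budget(text: str, budget: int, model: str = "gpt-4o-mini") -> str:
    """Return `text` hard-cut to at most `budget` tokens."""
    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    if len(tokens) <= budget:
        return text
    return enc.decode(tokens[:budget])  # lossy, but keeps the request valid
```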
### Suggested Fixes
- Enforce a prompt length budget:
  - add `count_tokens(prompt, model)` checks before each LLM call?
  - truncate the input text or skip summarization if it exceeds `--max-tokens-per-node`?
- Improve error handling:
  - safeguard the unpacking around `ChatGPT_API_with_finish_reason()` with something like:

    ```python
    try:
        response, finish_reason = ChatGPT_API_with_finish_reason(model=model, prompt=prompt)
    except ValueError:
        response, finish_reason = None, "error"  # degrade instead of crashing
    ```
- Add a `--safe-mode` or `--skip-llm` flag to avoid LLM calls altogether for lightweight use? Add a `--max_total_tokens` flag? (A sketch of how these flags might gate calls follows this list.)
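A sketch of how those proposed flags might gate LLM usage. The flag names are suggestions only; `count_tokens` is the helper sketched earlier, and `ChatGPT_API_with_finish_reason` is PageIndex's existing wrapper:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--safe-mode", action="store_true",
                    help="skip all LLM calls (structure-only run)")
parser.add_argument("--max_total_tokens", type=int, default=8000,
                    help="hard cap on tokens sent in any single request")
args = parser.parse_args()

def maybe_call_llm(model: str, prompt: str):
    """Gate an LLM call behind the proposed flags."""
    if args.safe_mode:
        return None, "skipped"        # no API call at all
    if count_tokens(prompt, model) > args.max_total_tokens:
        return None, "over_budget"    # refuse oversized calls up front
    try:
        return ChatGPT_API_with_finish_reason(model=model, prompt=prompt)
    except ValueError:
        return None, "error"          # unexpected return shape, don't crash
```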
### Confirmed Working (Sanity Check)
- On a 5-page PDF (~3k tokens), everything works fine with `gpt-4o-mini`, confirming it's a token overflow issue.
---

Thanks for the detailed report and suggestions! We're actively working on a fix and will keep you updated.
---

May I ask if there is any update?