
Issue: Token limit not enforced → `tokens_limit_reached` + unpacking crash

Chebil-Ilef opened this issue 7 months ago · 2 comments

Test setup:

  • PageIndex CLI (main branch, April 27)
  • Model: gpt-4o-mini (context ~8k tokens)
  • Python 3.10 on Ubuntu 22.04
  • PDF: 22 pages (14k tokens)
  • Command:
    python3 run_pageindex.py --pdf_path docs/q1-fy25-earnings.pdf --model gpt-4o-mini --max-tokens-per-node 100 --max-pages-per-node 1
    

Observed Behavior

  1. Repeated 413 errors (tokens_limit_reached) from the OpenAI API despite:

    • Low token and page limits (--max-tokens-per-node 100)
    • Small model (gpt-4o-mini)
  2. Eventually crashes with (a likely mechanism is sketched after this list):

    response, finish_reason = ChatGPT_API_with_finish_reason(model=model, prompt=prompt)
    ValueError: too many values to unpack (expected 2)
    
  3. Excessive retries: code continues attempting oversized calls without checking token count.
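
For what it's worth, one plausible way to get exactly that ValueError: if ChatGPT_API_with_finish_reason() returns a bare error string on the 413 path instead of a (response, finish_reason) tuple, the unpacking iterates over the string's characters. A tiny standalone illustration (the stub below is hypothetical, not the real helper):

    # Stub mimicking a helper that returns a plain string on its error path
    def api_stub():
        return "tokens_limit_reached"   # not a (response, finish_reason) pair

    response, finish_reason = api_stub()
    # ValueError: too many values to unpack (expected 2)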


Root Cause (Hypothesis)

  • --max-tokens-per-node is not actually enforced during prompt construction.
  • There is no capping based on the maximum token count supported by the model; maybe adding a max-token flag would be useful?
  • Functions like generate_toc_init, generate_toc_continue, and ChatGPT_API_with_finish_reason appear to send raw or multi-page inputs to OpenAI without validating token length.
  • The system assumes high-context models (e.g., GPT-4-32k) and fails without a clear diagnostic on lower-context ones.

Expected Behavior

  • The CLI should:
    • Clip or truncate input prompts based on --max-tokens-per-node
    • Warn or skip LLM usage if the input exceeds model limits
    • Avoid crashing if the API call returns an unexpected format (unpacking issue)

Suggested Fixes

  1. Enforce a prompt length budget (a sketch follows this list):

    • Add count_tokens(prompt, model) checks before each LLM call?
    • Truncate the input text or skip summarization when it exceeds max-tokens-per-node?
  2. Improve error handling:

    • Safeguard the unpacking around the ChatGPT_API_with_finish_reason() call with something like:
      try:
          response, finish_reason = ChatGPT_API_with_finish_reason(model=model, prompt=prompt)
      except ValueError:
          response, finish_reason = None, "error"
      
  3. Add a --safe-mode or --skip-llm flag to avoid LLM calls altogether for lightweight use? Or add a --max_total_tokens flag?
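
For fix 1, a minimal sketch of what the budget check could look like with tiktoken (count_tokens and truncate_to_budget are illustrative helpers, not existing PageIndex functions, and the call sites in generate_toc_init / generate_toc_continue would need to be adapted):

    import tiktoken

    def _encoding_for(model: str):
        """Best-effort tokenizer lookup; fall back for models tiktoken doesn't know."""
        try:
            return tiktoken.encoding_for_model(model)
        except KeyError:
            return tiktoken.get_encoding("cl100k_base")

    def count_tokens(text: str, model: str) -> int:
        """Number of tokens `text` occupies under the model's tokenizer."""
        return len(_encoding_for(model).encode(text))

    def truncate_to_budget(text: str, model: str, max_tokens: int) -> str:
        """Hard-truncate `text` so it fits within `max_tokens` tokens."""
        enc = _encoding_for(model)
        tokens = enc.encode(text)
        return text if len(tokens) <= max_tokens else enc.decode(tokens[:max_tokens])

    # Pre-flight guard before each LLM call (variable names illustrative):
    # if count_tokens(prompt, model) > max_tokens_per_node:
    #     prompt = truncate_to_budget(prompt, model, max_tokens_per_node)

A similar check against the model's total context window would also cover the --max_total_tokens idea in fix 3.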


Confirmed Working (Sanity Check)

  • On a 5-page PDF (~3k tokens), everything works fine with gpt-4o-mini, which confirms this is a token overflow issue.

Chebil-Ilef · Apr 27 '25

Thanks for the detailed report and suggestions! We're actively working on a fix and will keep you updated.

zmtomorrow · May 01 '25

May I know if there is any update?

limcheekin · Nov 08 '25