PageIndex
📑 PageIndex: Document Index for Reasoning-based RAG
- Document all 15 unique prompts used for LLM interactions
- Organize by category: TOC detection, structure generation, verification, etc.
- Include API configuration and common patterns
- Add detailed...
## Problem
PageIndex currently only supports OpenAI models, limiting user choice and potentially increasing costs.
## Solution
Add unified interface supporting both OpenAI GPT-4 and Gemini 2.5 Flash models with...
I followed the README and tried:
```sh
$ python3 run_pageindex.py --pdf_path /path/to/your/document.pdf
```
on my document. However, `run_pageindex.py` doesn't return the text of the content, only summaries:
```
{ 'doc_name': 'referee_guidelines_arm.pdf',...
```
The example contains the line `node_map = utils.create_node_mapping(tree)`, but that function is not there.
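For reference, a helper with that name could be sketched roughly as follows. This is a hypothetical stand-in, not the library's implementation: it assumes the tree is a dict or a list of dicts, that each node carries a `node_id` key, and that children sit under a `nodes` key. The sample tree is made up for illustration.

```python
from typing import Any, Dict, List, Union

Node = Dict[str, Any]


def create_node_mapping(tree: Union[Node, List[Node]]) -> Dict[str, Node]:
    """Flatten a nested tree into {node_id: node}.

    Hypothetical helper: the `node_id` and `nodes` keys are assumptions
    about the tree layout, not a confirmed PageIndex API.
    """
    mapping: Dict[str, Node] = {}
    # Accept either a single root node or a list of top-level nodes.
    stack: List[Node] = list(tree) if isinstance(tree, list) else [tree]
    while stack:
        node = stack.pop()
        node_id = node.get("node_id")
        if node_id is not None:
            mapping[node_id] = node
        # Children may be missing or None; treat both as "no children".
        stack.extend(node.get("nodes") or [])
    return mapping


# Made-up sample tree for illustration only.
tree = [
    {"node_id": "0001", "title": "Intro", "nodes": [
        {"node_id": "0002", "title": "Background"},
    ]},
    {"node_id": "0003", "title": "Methods", "nodes": []},
]
node_map = create_node_mapping(tree)
print(sorted(node_map))
```

With a mapping like this, a node selected by ID can be looked up directly (for example `node_map["0002"]`) instead of walking the tree again.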
The documentation has the AI model choose which document to use based on each document's information list. The problem is that some important information may not be...
Greetings to the creators of Page Index, I recently discovered your library, and it seems to be an excellent solution. From the description, it looks ideal for structured documents such...
ERROR:root:Failed to parse JSON even after cleanup
Traceback (most recent call last):
  File "G:\agent_service\outside_tools\PageIndex\run_pageindex.py", line 67, in <module>
    toc_with_page_number = page_index_main(args.pdf_path, opt)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\agent_service\outside_tools\PageIndex\pageindex\page_index.py", line 1102, in page_index_main
    return asyncio.run(page_index_builder())...
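Errors like this usually mean the model reply was not strict JSON. The sketch below shows one defensive way to parse such a reply. It is not PageIndex's own cleanup code; `extract_json` is a hypothetical name, and the rules it applies (strip a Markdown fence, cut to the outermost brackets, drop trailing commas) are assumptions about the common failure modes.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # Markdown code-fence marker, built here to keep the source readable


def extract_json(raw: str) -> Optional[Any]:
    """Best-effort JSON parse of an LLM reply; returns None if nothing parses."""
    text = raw.strip()

    # 1. If the reply wraps the payload in a Markdown fence, keep only the payload.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        text = fenced.group(1).strip()

    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # 2. Fall back to the span between the first opening and last closing bracket.
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    end = max(text.rfind("}"), text.rfind("]"))
    if not starts or end == -1:
        return None
    candidate = text[min(starts):end + 1]

    # 3. Drop trailing commas before a closing bracket.
    #    Caveat: this can also touch commas inside string values.
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)

    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


reply = "Here you go:\n" + FENCE + 'json\n{"a": [1, 2,],}\n' + FENCE
print(extract_json(reply))
```

Returning `None` instead of raising lets the caller decide whether to retry the request or skip the node.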
If we have thousands or millions of documents, using LLMs to pick the documents will take a long time and will be quite expensive as well. Is there any other...
Added cost tracking and support for AzureOpenAI.