OmniThink
[EMNLP 2025] OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
👏 Try OmniThink in our ModelScope online demo and 🤗 HuggingFace online demo!
Table of Contents
- 🚩Acknowledgement
- 🌻Quick Start
- 🌟Introduction
- 🔧Dependencies
- 🔍Local Search Support
- 📉Results
- 🧐Evaluation
🔔News
- 2025-08-24: We have added offline local search support using RAGFlow technology. You can now search local documents without an internet connection.
- 2025-03-12: We have optimized the Docker usage for OmniThink.
- 2025-02-20: We have added the evaluation methods from the paper to OmniThink and will integrate more evaluation methods in the future.
- 2025-01-28: We have added support for the deepseek-reasoner model. You can run ./examples/deepseekr1.py to test OmniThink's performance with deepseek-reasoner.
Previous News
- 2025-01-18: We open-sourced OmniThink, a machine writing framework.
🌻Acknowledgement
- This work is built on DSPy and STORM. Sincere thanks for their efforts.
- We are also very grateful to Zhangjiabao-nudt and techshoww for their contributions to this repository.
- If you have any questions, please feel free to contact us via [email protected], [email protected], or [email protected], or create an issue.
📖 Quick Start
- 🌏 The online demo is now available on ModelScope!
📌 Introduction
Welcome to OmniThink, an innovative machine writing framework designed to replicate the human cognitive process of iterative expansion and reflection in generating insightful long-form articles.
- Iterative Expansion and Reflection: OmniThink uses a unique mechanism that simulates human cognitive behaviors to deepen the understanding of complex topics.
- Enhanced Knowledge Density: OmniThink focuses on expanding knowledge boundaries, resulting in articles that are rich in information and insights.
- Comprehensive Article Generation: OmniThink constructs outlines and generates articles, delivering high-quality content that is both coherent and contextually robust.
🛠 Dependencies
📦 Conda
conda create -n OmniThink python=3.11
# Activate the environment so the requirements are installed into it
conda activate OmniThink
git clone https://github.com/zjunlp/OmniThink.git
cd OmniThink
# Install requirements
pip install -r requirements.txt
🔍 Local Search Support
OmniThink now supports offline local search using RAGFlow technology! This feature allows you to:
- Search local documents without an internet connection
- Use vector embeddings for semantic search
- Index and retrieve your own document collections
- Maintain data privacy with local-only processing
Local Search Features
- OfflineRAGFlow: Core RAG engine with FAISS vector database
- LocalSearch: DSPy-compatible search interface
- Sentence Transformers: High-quality text embeddings
- Smart Chunking: Intelligent document segmentation
- Semantic Retrieval: Context-aware search results
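To make the retrieval idea concrete, here is a minimal, self-contained sketch of top-k cosine-similarity search. It is not OmniThink's implementation: the `embed` function is a toy bag-of-words stand-in for a Sentence Transformers model, the linear scan stands in for a FAISS index, and the names `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


chunks = [
    "FAISS stores dense vectors for fast similarity search",
    "Docker images package an application with its dependencies",
    "Sentence embeddings map text to dense vectors",
]
results = top_k("dense vectors for text search", chunks, k=2)
```

A real setup would swap the toy embedding for dense model vectors and the linear scan for an approximate or exact vector index, but the ranking step keeps the same shape: embed the query, score every chunk, keep the best k.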
Quick Local Search Setup
from src.tools.rm import OfflineRAGFlow, LocalSearch

# Initialize the local RAG engine
rag_engine = OfflineRAGFlow(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    chunk_size=800,
    overlap=120,
    k=5,
)

# Add documents to your local index
rag_engine.ingest(
    text="Your document content here...",
    meta={"title": "Document Title", "doc_id": "doc1"},
)

# Create DSPy-compatible search interface
local_search = LocalSearch(search=rag_engine, k=3)

# Use in your DSPy pipeline
results = local_search.forward("your search query")
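The `chunk_size` and `overlap` arguments above suggest a sliding-window splitter. The sketch below shows one common way such a splitter works, assuming both sizes are measured in characters. OfflineRAGFlow's actual chunking may use tokens or sentence boundaries instead, and `chunk_text` is a hypothetical helper, not part of the repository.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 120) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


document = "x" * 2000
pieces = chunk_text(document, chunk_size=800, overlap=120)
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of indexing some text twice.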
🐳 Docker
git clone https://github.com/zjunlp/OmniThink.git
docker pull zjunlp/omnithink:latest
docker run -it zjunlp/omnithink:latest
🔑 Before running, please export the LM API key and the SEARCH key as environment variables:
export LM_KEY=YOUR_API_KEY
export SEARCHKEY=YOUR_SEARCHKEY
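A script that depends on these keys can fail early with a clear message if one is missing. The following is a minimal sketch, not code from the repository: `require_env` and `load_keys` are hypothetical helpers, and only the variable names LM_KEY and SEARCHKEY come from the instructions above.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; run `export {name}=...` first."
        )
    return value


def load_keys() -> dict[str, str]:
    """Collect the two keys this README asks you to export."""
    return {name: require_env(name) for name in ("LM_KEY", "SEARCHKEY")}
```

Calling `load_keys()` at the top of a script surfaces a missing key immediately instead of partway through a long generation run.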
Local Search Dependencies
For local search functionality, additional packages are required:
# Install local search dependencies
pip install sentence-transformers faiss-cpu numpy
# Or use the updated requirements.txt
pip install -r requirements.txt
You can also define your own LM API and SEARCH API.
Note that the output of the LM should be a list.
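One way to read this note is that a custom LM wrapper should return a list of completion strings, which is also how DSPy language models return their outputs. The sketch below assumes that reading. `ListOutputLM` and its `generate` callable are hypothetical names, not OmniThink's actual interface.

```python
from typing import Callable, Union


class ListOutputLM:
    """Wrap a text-generation callable so every call returns a list of strings."""

    def __init__(self, generate: Callable[[str], Union[str, list[str]]]):
        self.generate = generate  # hypothetical backend: prompt -> str or list[str]

    def __call__(self, prompt: str) -> list[str]:
        output = self.generate(prompt)
        if isinstance(output, str):
            return [output]  # normalize a single completion into a one-item list
        return list(output)


lm = ListOutputLM(lambda prompt: f"echo: {prompt}")
completions = lm("hello")
```

Normalizing at the wrapper boundary keeps downstream code simple, since it can always index or iterate over the result regardless of which backend produced it.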
Results in OmniThink
The performance of OmniThink is shown below:
Generate Article in OmniThink
Just one command required
sh run.sh
You can find your Article, Outline and mindmap in ./results/
🔍 Evaluation
We provide convenient scripts for evaluating your method. The evaluation is divided into three categories: Rubric_Grading, Knowledge_Density, and Information_Diversity.
We use the factscore library. Please run the following code before starting the evaluation.
cd eval
git clone https://github.com/shmsw25/FActScore.git
For Rubric Grading
python Rubric_Grading.py \
  --articlepath articlepath \
  --modelpath modelpath
For Information Diversity
python Information_Diversity.py \
  --mappath mappath \
  --model_path model_path
For Knowledge Density
python Knowledge_Density.py \
  --articlepath articlepath \
  --api_path api_path \
  --threads threads
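For intuition about what a knowledge-density score measures, the sketch below assumes it is the number of distinct atomic facts divided by the article length. This is an illustrative assumption, not the logic of Knowledge_Density.py: the function name, the whitespace token count, and the exact-match deduplication are all stand-ins, and the real script may extract and compare facts quite differently.

```python
def knowledge_density(atomic_facts: list[str], article: str) -> float:
    """Distinct atomic facts per whitespace-separated token of the article."""
    length = len(article.split())
    if length == 0:
        return 0.0
    # Exact match after case and whitespace normalization is a crude stand-in
    # for the semantic deduplication a real evaluator would need.
    unique = {" ".join(f.lower().split()) for f in atomic_facts if f.strip()}
    return len(unique) / length


article = "OmniThink is a machine writing framework. It was open-sourced in 2025."
facts = [
    "OmniThink is a machine writing framework.",
    "omnithink is a  machine writing framework.",  # duplicate after normalization
    "OmniThink was open-sourced in 2025.",
]
density = knowledge_density(facts, article)
```

Under this reading, repeating the same fact in different places lengthens the article without raising the numerator, so redundant text lowers the score.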
Citation
If you find our repo useful in your research, please consider citing:
@misc{xi2025omnithinkexpandingknowledgeboundaries,
title={OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking},
author={Zekun Xi and Wenbiao Yin and Jizhan Fang and Jialong Wu and Runnan Fang and Ningyu Zhang and Jiang Yong and Pengjun Xie and Fei Huang and Huajun Chen},
year={2025},
eprint={2501.09751},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.09751},
}