pg-aiguide
feat(postgis): add PostGIS documentation support
Summary
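This PR adds PostGIS documentation support to pg-aiguide: a scraper for the PostGIS manual, semantic and keyword search APIs over the scraped content, and the supporting database migration. The goal is to let AI coding assistants give better guidance on spatial database operations.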
New Features
- PostGIS Documentation Scraper (ingest/postgis_docs.py)
  - Dedicated scraper for the PostGIS manual (DocBook HTML format)
  - Supports both file and database storage modes
  - Header-based markdown chunking with token counting
- Search APIs
  - semantic_search_postgis_docs: Vector similarity search for PostGIS documentation
  - keyword_search_postgis_docs: BM25 keyword search for PostGIS documentation
- Database Migration
  - postgis_pages and postgis_chunks tables
  - HNSW index for fast vector similarity search
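For readers unfamiliar with header-based chunking, the following is a minimal sketch of the idea, not the code in ingest/postgis_docs.py. The names `Chunk`, `chunk_markdown`, `count_tokens`, and the `MAX_TOKENS` budget are hypothetical, and the token count is approximated by whitespace-separated words; the scraper may well use a real tokenizer and a different limit.

```python
import re
from dataclasses import dataclass

# Hypothetical per-chunk budget; the limit used by the actual scraper is not stated in this PR.
MAX_TOKENS = 200

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*)$")


@dataclass
class Chunk:
    header_path: str  # e.g. "ST_Buffer > Synopsis"
    text: str
    token_count: int


def count_tokens(text: str) -> int:
    """Rough estimate: whitespace-separated words. A real tokenizer would give different counts."""
    return len(text.split())


def chunk_markdown(markdown: str, max_tokens: int = MAX_TOKENS) -> list[Chunk]:
    """Split markdown at headers, then split oversized sections at paragraph breaks."""
    chunks: list[Chunk] = []
    path: list[str] = []  # stack of header titles leading to the current section
    body: list[str] = []  # lines collected under the current header

    def flush() -> None:
        text = "\n".join(body).strip()
        body.clear()
        if not text:
            return
        header_path = " > ".join(path)
        current: list[str] = []
        for para in re.split(r"\n\s*\n", text):
            candidate = "\n\n".join(current + [para])
            if current and count_tokens(candidate) > max_tokens:
                # Adding this paragraph would exceed the budget: emit what we have.
                joined = "\n\n".join(current)
                chunks.append(Chunk(header_path, joined, count_tokens(joined)))
                current = [para]
            else:
                current.append(para)
        # A single paragraph larger than the budget is emitted as-is.
        joined = "\n\n".join(current)
        chunks.append(Chunk(header_path, joined, count_tokens(joined)))

    for line in markdown.splitlines():
        match = HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headers at the same or deeper level (assumes levels are not skipped).
            del path[level - 1:]
            path.append(match.group(2).strip())
        else:
            body.append(line)
    flush()
    return chunks
```

Keeping the header path on each chunk means a search hit can be traced back to its section of the manual. The sketch does not handle `#` lines inside code blocks, which a production chunker would need to skip.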
Enhanced Embedding Configuration
Added support for custom embedding providers (beyond OpenAI):
- OPENAI_BASE_URL: Custom OpenAI-compatible API endpoint (e.g., Ollama, SiliconFlow)
- EMBEDDING_MODEL: Custom embedding model name
- EMBEDDING_DIMENSIONS: Configurable vector dimensions
This allows users to use alternative embedding services while maintaining compatibility with the existing database schema.
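As an illustration of how these three variables might be consumed, here is a hedged sketch of a config loader. `EmbeddingConfig`, `load_embedding_config`, and the two default values are assumptions made for this example; the PR does not state the defaults or show the actual loading code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed defaults for illustration only; the PR does not specify them.
DEFAULT_MODEL = "text-embedding-3-small"
DEFAULT_DIMENSIONS = 1536


@dataclass(frozen=True)
class EmbeddingConfig:
    base_url: Optional[str]  # None means "use the client's default endpoint"
    model: str
    dimensions: int


def load_embedding_config(env: Mapping[str, str] = os.environ) -> EmbeddingConfig:
    """Read the embedding settings from environment-style key/value pairs."""
    raw_dims = env.get("EMBEDDING_DIMENSIONS", "").strip()
    dimensions = int(raw_dims) if raw_dims else DEFAULT_DIMENSIONS
    if dimensions <= 0:
        raise ValueError("EMBEDDING_DIMENSIONS must be a positive integer")
    return EmbeddingConfig(
        base_url=env.get("OPENAI_BASE_URL") or None,
        model=env.get("EMBEDDING_MODEL") or DEFAULT_MODEL,
        dimensions=dimensions,
    )
```

If the chunk table stores fixed-width vectors, the configured dimension presumably has to match the column width, so validating it at startup is cheaper than failing on the first insert.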
Testing
- ✅ TypeScript build passes
- ✅ Python syntax validation passes
- ✅ Database migration tested
- ✅ Scraper tested with file mode (5 pages)
- ✅ Scraper tested with database mode (3 pages, 43 chunks)
- ✅ Semantic search verified with vector similarity queries
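For reviewers who want to sanity-check semantic search results outside the database, a small offline ranking by cosine similarity can serve as a reference. This is a sketch under assumptions: the PR does not say which distance metric the HNSW index uses, and `cosine_similarity` and `rank_chunks` are hypothetical helpers, not part of pg-aiguide.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(
    query: Sequence[float],
    chunks: dict[str, Sequence[float]],
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Return the top_k chunk names with their similarity to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in chunks.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Because this brute-force ranking is exact while an HNSW index is approximate, small differences in ordering near the bottom of the result list are not necessarily a bug.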
Usage
# Scrape PostGIS documentation
cd ingest
uv run python postgis_docs.py --version 3.5 --storage-type database
# With custom embedding provider
export OPENAI_BASE_URL=https://api.siliconflow.cn/v1
export EMBEDDING_MODEL=Qwen/Qwen3-Embedding-8B
uv run python postgis_docs.py --version 3.5 --storage-type database
Checklist
- Code follows project conventions
- All comments in English
- Documentation updated (README.md, .env.sample)
- Database migration included
- Tests performed locally