text-splitting topic

Repositories tagged text-splitting

semchunk

158 stars, 9 forks

A fast and lightweight pure Python library for splitting text into semantically meaningful chunks.

semantic-chunking

28 stars, 1 fork

🍱 semantic-chunking ⇢ semantically create chunks from large documents to pass to LLM workflows

gosbd

16 stars, 3 forks

A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.