spring-ai
TokenTextSplitter creates documents with a different id than its input document
Bug description: When a document is passed to TokenTextSplitter, the resulting documents have a different id than the input document. In a pipeline scenario this is completely unexpected.
Environment: Kotlin + Java 21, Spring Boot 3.2.1, Spring AI 1.0.0-M1 (also reproduced on 1.0.0-SNAPSHOT)
Steps to reproduce: Create a document with a specific id and pass it to TokenTextSplitter. The returned document will have a different id.
Expected behavior: The document id should not change in a text splitter, or in any DocumentTransformer for that matter, unless the change is made explicitly.
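Minimal Complete Reproducible example:
import org.springframework.ai.document.Document
import org.springframework.ai.transformer.splitter.TokenTextSplitter

val input = Document("123", "content", emptyMap())
val output = TokenTextSplitter().split(input).first()
// Fails: the split document carries a newly generated id
assert(input.id == output.id)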
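For illustration, here is one way the expected behavior could be sketched as a wrapper around a splitter. This is a self-contained sketch and does not use Spring AI: `Doc`, `splitWithFreshIds`, `splitPreservingId`, and the `parent_id` metadata key are all hypothetical names invented for this example. It assumes the input id should be kept when the split yields a single chunk, and that the id should otherwise be recorded in metadata, since several chunks cannot all share one unique id.

```kotlin
import java.util.UUID

// Stand-in document type for this sketch; NOT the Spring AI Document class.
data class Doc(
    val id: String,
    val content: String,
    val metadata: Map<String, Any> = emptyMap()
)

// Hypothetical splitter that mimics the reported behavior:
// every chunk receives a freshly generated id.
fun splitWithFreshIds(doc: Doc, chunkSize: Int): List<Doc> =
    doc.content.chunked(chunkSize).map { chunk ->
        Doc(UUID.randomUUID().toString(), chunk, doc.metadata)
    }

// Wrapper that restores the link to the input document.
// Single chunk: reuse the input id. Multiple chunks: keep the generated
// ids and store the input id under the (hypothetical) "parent_id" key.
fun splitPreservingId(doc: Doc, chunkSize: Int): List<Doc> {
    val chunks = splitWithFreshIds(doc, chunkSize)
    return if (chunks.size == 1) {
        listOf(chunks[0].copy(id = doc.id))
    } else {
        chunks.map { it.copy(metadata = it.metadata + ("parent_id" to doc.id)) }
    }
}
```

With this wrapper, `splitPreservingId(Doc("123", "content"), 100).first().id` equals the input id, which is the condition the reproduction above asserts. Whether chunks of a multi-chunk split should reuse, derive from, or merely reference the input id is a design choice this sketch does not settle.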