Ben Cox

Results 14 comments of Ben Cox

It would be great to be able to mix the benefit of `RecursiveCharacterTextSplitter` (i.e. splitting by useful things like sentences and paragraphs) with custom tokenisers (i.e. the knowledge that a...

Thanks a lot @bhperry, it wasn't clear to me that it worked for the subclasses like RCTS so I'm very happy to have that confirmed.

@bhperry Are there recommendations for settings we need to apply to the splitter or tokenizer when attempting to use `RecursiveCharacterTextSplitter` with a custom length function? The calculation isn't coming out...

Ah, nice! Little tweak, just need to count the list length:

```python
def length_function(text):
    return len(tokenizer.encode(text, add_special_tokens=False))
```

Thank you!
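For anyone wondering why `add_special_tokens=False` matters here, the following is a minimal sketch rather than anything taken from the library. `ToyTokenizer` is a hypothetical stand-in that only mimics the `encode(text, add_special_tokens=...)` signature of a Hugging Face tokenizer; it splits on whitespace and wraps the ids in two made-up special tokens. The point it illustrates is that counting special tokens inflates the length of every piece the splitter measures.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a Hugging Face style tokenizer (assumption)."""

    BOS_ID = 0  # made-up "start" special token id
    EOS_ID = 1  # made-up "end" special token id

    def encode(self, text, add_special_tokens=True):
        # One dummy id per whitespace-separated word; real tokenizers use subwords.
        ids = list(range(2, 2 + len(text.split())))
        if add_special_tokens:
            return [self.BOS_ID] + ids + [self.EOS_ID]
        return ids


tokenizer = ToyTokenizer()


def length_function(text):
    # Count only the content tokens, as in the comment above.
    return len(tokenizer.encode(text, add_special_tokens=False))


def length_with_specials(text):
    # Same count, but with the two special tokens included on every call.
    return len(tokenizer.encode(text))


chunk = "splitting by sentences and paragraphs"
print(length_function(chunk))       # content tokens only
print(length_with_specials(chunk))  # two extra tokens per measured piece
```

With a real tokenizer, the same `length_function` would be passed to the splitter through its `length_function` argument, alongside `chunk_size` and `chunk_overlap`, so that the sizes are measured in tokens instead of characters.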