stakenaka

3 comments by stakenaka

@tonybaloney @pamelafox Great suggestion :) It is true that the script doesn't always punctuate Japanese sentences. Currently, [TextSplitter](https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/scripts/prepdocslib/textsplitter.py#L23) keeps its `word_breaks` settings internal. IMO, it would be better to inject those...
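To make the idea of injecting break characters concrete, here is a minimal sketch. It is not the repository's `TextSplitter`: the class name `ConfigurableSplitter`, its parameters, and the default character lists are all hypothetical, chosen only to illustrate passing language-specific `sentence_endings` and `word_breaks` in from the caller instead of hardcoding them.

```python
from dataclasses import dataclass, field

# Hypothetical defaults for illustration; not copied from textsplitter.py.
DEFAULT_SENTENCE_ENDINGS = [".", "!", "?"]
DEFAULT_WORD_BREAKS = [",", ";", ":", " ", "(", ")", "\t", "\n"]


@dataclass
class ConfigurableSplitter:
    """Toy splitter whose break characters are supplied by the caller."""

    max_length: int = 20
    sentence_endings: list[str] = field(
        default_factory=lambda: list(DEFAULT_SENTENCE_ENDINGS)
    )
    word_breaks: list[str] = field(
        default_factory=lambda: list(DEFAULT_WORD_BREAKS)
    )

    def _find_break(self, text: str, start: int, end: int) -> int:
        # Scan backwards from the hard limit. Prefer a sentence ending,
        # then fall back to a word break, then to the hard limit itself.
        for candidates in (self.sentence_endings, self.word_breaks):
            for i in range(end - 1, start, -1):
                if text[i] in candidates:
                    return i + 1  # keep the break character in this chunk
        return end

    def split(self, text: str) -> list[str]:
        chunks = []
        start = 0
        while start < len(text):
            end = min(start + self.max_length, len(text))
            if end < len(text):
                end = self._find_break(text, start, end)
            chunks.append(text[start:end])
            start = end
        return chunks


# Assumed usage: inject Japanese punctuation instead of the defaults.
ja_splitter = ConfigurableSplitter(
    max_length=10,
    sentence_endings=["。", "！", "？"],
    word_breaks=["、", " "],
)
ja_chunks = ja_splitter.split("今日は晴れです。明日は雨でしょう。")
```

With this shape, supporting another language only needs a different pair of lists at construction time, and the splitting logic itself stays untouched.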

@tonybaloney @pamelafox > add some descriptions about customization for respective languages. My suggestion is to add information to the README on using NLP-related OSS such as [spaCy](https://spacy.io/) or [NLTK](https://www.nltk.org/). In fact, LangChain provides...

I'm not sure why the CI fails... Mentioning @olivakar, who checked in the latest update. Could you take a look at it?