Juha Jeronen
Thanks, I wasn't aware of that package. Looks useful. Starred, bookmarked, and promptly forgotten. :P It seems to require some kind of AI model ([`flair`](https://github.com/flairNLP/flair)). From the viewpoint of producing...
Ok, a new Extras module it is, then. Yes, quality will definitely be better with AI-based NLP, since unlike regexes, it was actually built to do things like this. :)...
> You can have both. Just install dehyphen as a requirement for Extras API.

Ok, will do.

> I don't have any preference. But also "Science mode" is not a...
@Cohee1207: There's one more thing I forgot to ask about: I'd like to have a progress indicator for the ingestion process, as it can still take a minute or two...
Thanks! I'll go ahead, then. Updated schedule, expect something in the upcoming weeks. :)

Summarizing, **TODO**:

- **Extras**:
  - Add text cleanup endpoint (`/api/sanitize` or something) to Extras, using `dehyphen`...
Never mind me, just syncing this with the latest staging.
Also, small status update: I installed `dehyphen` into my `extras` venv, played around with it, and investigated it in more detail. A user-selectable, character-based AI model, from `flair`, is used...
@Cohee1207: I had planned to do this in Extras, once I find the time to work on ST again. In the meantime, I've minimally updated the PR to resolve the...
> The preferred way of adding functionality is server plugins.

Ok. Thanks for the quick response!

> > fast local RAG embeddings provider
>
> Any reason why transformer.js embeddings...
Calling into CUDA from JS? I suppose I can look into it. I'll keep you posted.