[ QUESTION ] How does the summarize option work?
Right now I see the app only trims long text (from the beginning) or squeezes it (removes any empty space).
Is there no mini AI model doing this? Or will there be a mini AI model in the future, like Qwen 0.6B, Llama 1B, or Gemini 3, running offline?
@bi4key There are smaller AI models that can generate summaries, but their accuracy hasn't been very reliable in testing. I haven't had much time to research this further, as my current focus is on the core transcription features. I plan to explore summarization capabilities in the future. So for now it only trims long text and removes stop words.
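For readers curious what "trim long text and remove stop words" amounts to, here is a minimal illustrative sketch in Python. This is not NotelyVoice's actual implementation; the stop-word list, the 200-character limit, and the function name are all assumptions for illustration.

```python
import re

# Assumed, illustrative stop-word list (not the app's real one).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "and", "or",
              "of", "to", "in", "on", "it", "that", "this"}

def naive_summarize(text: str, max_chars: int = 200) -> str:
    """A 'summary' made by squeezing whitespace, dropping stop words,
    and trimming to a maximum length."""
    words = re.split(r"\s+", text.strip())
    # Drop common stop words (comparison ignores case and punctuation).
    kept = [w for w in words if w.lower().strip(".,!?") not in STOP_WORDS]
    squeezed = " ".join(kept)
    return squeezed[:max_chars]

print(naive_summarize("The meeting is on Monday and the agenda is in the shared folder."))
# → "meeting Monday agenda shared folder."
```

As the maintainer notes, this is purely mechanical: it shortens text but cannot paraphrase or pick out the most important sentences the way an LLM-based summarizer would.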
Ok, thx for your work!
Here's some inspiration for future text recognition and summarization:
https://github.com/docling-project/docling
https://huggingface.co/ibm-granite/granite-docling-258M
https://huggingface.co/unsloth/gemma-3-270m-it-qat-GGUF
Thanks for the links @bi4key
I would suggest that, once an offline-runnable LLM is found, the button let the user select from various pre-defined prompts configurable in the options, so that we can summarize, correct typos, fix grammar and syntax, remove redundancies, make the text more formal, etc.
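The preset idea above could be as simple as a mapping from preset names to prompt templates that get filled with the note text before being sent to whatever local model is eventually chosen. A minimal sketch, assuming hypothetical preset names and a `{text}` placeholder convention (none of this exists in the app yet):

```python
# Hypothetical user-configurable prompt presets; names and templates
# are invented for illustration.
PROMPT_PRESETS = {
    "Summarize": "Summarize the following note in 3 bullet points:\n\n{text}",
    "Fix typos": "Correct spelling and typos only, keep the wording:\n\n{text}",
    "Formalize": "Rewrite the following text in a formal tone:\n\n{text}",
}

def build_prompt(preset_name: str, note_text: str) -> str:
    """Fill the chosen preset template with the note text."""
    template = PROMPT_PRESETS[preset_name]
    return template.format(text=note_text)

prompt = build_prompt("Fix typos", "Here some inspiration for future")
# 'prompt' would then be passed to the local LLM runtime, once one is picked.
```

Keeping the presets as plain editable templates means users can add their own transformations (e.g. "translate to English") without any code changes.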
I am a bit behind in my monitoring of the current LLM landscape, but I have heard there are incredible sub-1 GB LLMs nowadays.
The linked models appear to be more useful for data extraction from various documents, similar to what Docling does but without having to run the whole program stack (this can be useful in situations where you either want more flexibility than the program provides, or you can only run LLM models and cannot run the Docling stack, so this is very niche IMHO). More general mini/nano LLM models exist.
And here is a nice mega thread:
https://www.reddit.com/r/LocalLLaMA/s/nBLTtkVPvK
https://huggingface.co/blog/ocr-open-models
@bi4key Very nice links, thank you very much!
However, IMHO a text-only LLM would be sufficient for NotelyVoice, and likely smaller. Or do you think OCR LLMs could provide additional value?