Petri Savolainen

Results: 75 comments by Petri Savolainen

@rjurney given it appears you're more recently familiar with the performance issues & the improvements, it would be great if you could submit those as PR(s). Using LRU cache seems...

The test job failure (see below) is caused by an assertion error in tests/test_cleanname.py at line 478, specifically in a test that checks handling of Unicode non-Latin scripts. The log indicates: AssertionError:...

Related: https://github.com/ollama/ollama/issues/5796, "Streaming for tool calls is unsupported"

sorely needing this, too...

Probably. Getting as close to real-time performance as possible is a key success factor for AI-based input augmentation, so yeah, streaming will be needed to get there.