
[App] [Feature] - add a config knob for the frequency at which void polls API endpoints to detect models

wolfspyre opened this issue 7 months ago

Currently, as discussed in Discord, the model auto-detection logic seems to be hard-coded to poll endpoints every ~10s. This causes an unnecessary volume of context switching on inference endpoints...

I'd propose a couple of changes:

  • decrease (or cease altogether) the polling frequency for an API target while an active inference operation is ongoing with that endpoint (sketch further below)

If the endpoint is actively servicing an ongoing inference request, is there REALLY sufficient value in having an updated model list to warrant hitting its API consistently?

Admittedly, not all API endpoints are the same. Some may be load-balancing abstractions, but many are not, and many of the workloads being performed ARE CPU-bound, so the polling frequency has a palpable impact on result speed. I mention this to illustrate that the polling is not without consequence in some circumstances.

  • offer a control mechanism to adjust the rate of polling, ideally overridable at each endpoint / tier (rough sketch below), i.e.:
    • locality (localhost / LAN / VPN / WAN / "private-remote" / external)
    • service (API / service type: LM Studio / Cortex / Ollama / OpenAI-API-compatible endpoint; tool server API)
    • this specific host:port

The current 'poll quickly all the time' strategy may make sense in a dynamic cloud environment, during periods of interactivity, or while anticipating a change (e.g. while waiting for a model to load),

but otherwise, IMO 10s is aggressive for steady-state local environments. Maybe look at DHCP lease times or ARP cache lifetimes as a guide for what other 'durable systems' use to balance cache recency against unnecessary noise.
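
To make the override tiers concrete, here's a rough sketch of what such a config could look like; every name here (PollingIntervalConfig, resolvePollIntervalMs, the tier keys) is hypothetical, not anything in Void's actual settings schema. The most specific override wins:

```ts
// Hypothetical shape for per-tier polling overrides; illustrative only.
type Locality = 'localhost' | 'lan' | 'vpn' | 'wan' | 'private-remote' | 'external';

interface PollingIntervalConfig {
  defaultMs: number;                               // global fallback, e.g. 30_000-60_000
  byLocality?: Partial<Record<Locality, number>>;  // e.g. { localhost: 120_000 }
  byService?: Record<string, number>;              // e.g. { ollama: 60_000, lmstudio: 60_000 }
  byHostPort?: Record<string, number>;             // most specific, e.g. { 'gpubox:11434': 300_000 }
}

// Resolution order: host:port > service > locality > global default.
function resolvePollIntervalMs(
  cfg: PollingIntervalConfig,
  endpoint: { hostPort: string; service: string; locality: Locality },
): number {
  return (
    cfg.byHostPort?.[endpoint.hostPort] ??
    cfg.byService?.[endpoint.service] ??
    cfg.byLocality?.[endpoint.locality] ??
    cfg.defaultMs
  );
}
```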

I'd suggest 30-60s as a default, while also doing some sort of incremental backoff on each identical result set, up to a max of something like 5-10 minutes... with a reset on change (or on address change).

i.e. if you look up the endpoint and get a different IP address than you had previously, that could be an event which warrants validating that the models the endpoint says it has match what you think it has, since its address changed...
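
To sketch what I mean by both the 'skip while an inference is in flight' idea and the backoff, here's a rough loop; things like hasInFlightRequest and the OpenAI-style /v1/models path are assumptions for illustration, not how Void wires this today:

```ts
// Illustrative polling loop, not Void's actual refresh logic: skip polls while an
// inference request is in flight, back off while the model list is unchanged, and
// reset to the base interval as soon as anything changes.
const BASE_INTERVAL_MS = 30_000;       // proposed steady-state default (30s)
const MAX_INTERVAL_MS = 10 * 60_000;   // cap the backoff at ~10 minutes

interface Endpoint {
  url: string;
  hasInFlightRequest: () => boolean;   // assumed hook into the inference client
}

function startPolling(endpoint: Endpoint) {
  let intervalMs = BASE_INTERVAL_MS;
  let lastResult = '';

  const tick = async () => {
    // Don't compete with an active inference request for the endpoint's CPU.
    if (!endpoint.hasInFlightRequest()) {
      try {
        const res = await fetch(`${endpoint.url}/v1/models`);
        const models = JSON.stringify(await res.json());
        if (models === lastResult) {
          // Identical result set: back off incrementally, up to the cap.
          intervalMs = Math.min(intervalMs * 2, MAX_INTERVAL_MS);
        } else {
          // Model list (or the endpoint behind the address) changed: poll eagerly again.
          lastResult = models;
          intervalMs = BASE_INTERVAL_MS;
        }
      } catch {
        // Endpoint unreachable; keep the current interval and try again later.
      }
    }
    setTimeout(tick, intervalMs);
  };

  void tick();
}
```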

Either way, there are a few paths toward 'be less noisy, and don't poke endpoints overly aggressively for no benefit' :)

Maybe one of them feels aligned with the direction you want to go? :)

wolfspyre · May 15 '25 21:05

Thanks for the suggestion; if anyone wants to implement this, it's a good first issue. You just need to:

  1. add a slider for the frequency in Settings.tsx
  2. add a new globalSettings entry in voidSettingsService.ts for the frequency
  3. use the frequency in refreshModelService instead of the hard-coded 10s (rough sketch below)
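
A minimal sketch of how those pieces could fit together, assuming a new globalSettings entry named modelRefreshIntervalMs; the real interfaces in voidSettingsService.ts and refreshModelService will look different, so treat this as a starting point only:

```ts
// Rough sketch only: real signatures in voidSettingsService.ts / refreshModelService differ.
const DEFAULT_REFRESH_INTERVAL_MS = 10_000; // current hard-coded behaviour

function getRefreshIntervalMs(settings: { globalSettings: { modelRefreshIntervalMs?: number } }): number {
  // New globalSettings entry (step 2), falling back to the existing 10s.
  return settings.globalSettings.modelRefreshIntervalMs ?? DEFAULT_REFRESH_INTERVAL_MS;
}

function scheduleModelRefresh(
  settings: { globalSettings: { modelRefreshIntervalMs?: number } },
  refreshModels: () => Promise<void>,
) {
  // Step 3: re-read the setting each cycle so slider changes take effect without a restart.
  const loop = async () => {
    await refreshModels();
    setTimeout(loop, getRefreshIntervalMs(settings));
  };
  void loop();
}
```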

Could probably ask Void Agent to build this!

andrewpareles · May 23 '25 07:05

Can I work on this?

prosperxo · May 25 '25 01:05