LocalAI icon indicating copy to clipboard operation
LocalAI copied to clipboard

feat: add machine tag and inference timings

Open mintyleaf opened this issue 10 months ago • 2 comments

Description For my P2P endeavors i need to measure inference time and get exactly from which machine the request was sent
Since the #3687 is unfinished and didn't even connected to the routers, as well as it didn't provide enough control in P2P environment i created opt-in extension to the OpenAI response with included timings data in ms (given token count and duration in ms is enough to get the other data like tokens per second which is calculated on grpc-server.cpp glue backend just for printing out the timings)
Also there is header with machine hostname or opt-in custom tag is specified, which is useful when roaming through rent machines on services like vast.ai to get the static identifier for them

I'm still not sure about naming and putting into Reply protobuf packet two more doubles, so i'm waiting for feedback

Signed commits

  • [ ] Yes, I signed my commits.

mintyleaf avatar Jan 11 '25 00:01 mintyleaf

Deploy Preview for localai ready!

Name Link
Latest commit 4f9ed3af266bd7b1d156e7705207e1094b223206
Latest deploy log https://app.netlify.com/sites/localai/deploys/678a29c3f91dc10008c7ba47
Deploy Preview https://deploy-preview-4577--localai.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

netlify[bot] avatar Jan 11 '25 00:01 netlify[bot]

Liking that, thanks! addition of two new fields in the reply field look reasonable, my other comments are mostly on adding the machine tag with a middleware to avoid repetitions, and to control that with a flag if returning that information back or not

mudler avatar Jan 13 '25 16:01 mudler

@mudler

I've written a help message for LOCALAI_MACHINE_TAG config env variable
Now if the MachineTag option is not null the http part just uses middleware which is modifying header to that specific tag
The hostname fallback seems unreasonable, since the most time that feature possibly being used - tag shouldn't really depend on varying hostname of rented machine

I have also renamed LocalAI-Machine-Tag to just Machine-Tag header, since the header '-' separated parts can only be just capitalized, and in further use that tag transforms into Localai-Machine-Tag and confuses usage with some header parser boilerplate, which is strange, because afaik the headers are case insensitive

Waiting for your feedback!

mintyleaf avatar Jan 17 '25 07:01 mintyleaf

Thank you @mintyleaf ! just few questions inline - direction looks good to me!

Also - could you signoff your commits? git rebase origin/master --signoff will do

mudler avatar Jan 17 '25 09:01 mudler

@mintyleaf I'm merging this as-is, but will have to make sure to adapt the docs for the headers that enables extra stats, maybe care to open a follow-up? Thanks anyways!

mudler avatar Jan 17 '25 16:01 mudler