Eero Tamminen

721 comments by Eero Tamminen

Providing such metrics is straightforward.

When a user query comes in:
* Timestamp first token start time
* Increase the `query_count` counter

When replying with the first token for that query:
* Add...

Note: metrics should have relevant prefix, e.g. `chatqna_` for ChatQnA service, so they can be identified better.
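The bookkeeping described above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not the code from any OPEA component: the class `FirstTokenMetrics`, its method names, and the `first_token_latency_sum` / `first_token_latency_count` metric names are hypothetical, and the assumption that the elapsed time is added to a running sum is a guess, since the original comment is cut off at that step. Only `query_count` and the `chatqna_` prefix come from the comments themselves.

```python
import time
from collections import defaultdict


class FirstTokenMetrics:
    """Hypothetical helper: counts queries and accumulates first-token latency."""

    def __init__(self, prefix, clock=time.monotonic):
        self.prefix = prefix              # e.g. "chatqna_" for the ChatQnA service
        self.clock = clock                # injectable so the logic can be tested
        self.values = defaultdict(float)  # metric name -> current value
        self._start = {}                  # query id -> arrival timestamp

    def query_received(self, query_id):
        # Timestamp the query and increase the query counter.
        self._start[query_id] = self.clock()
        self.values[f"{self.prefix}query_count"] += 1

    def first_token_sent(self, query_id):
        # Assumption: elapsed time is added to a sum, with a matching count,
        # so that an average first-token latency can be derived later.
        start = self._start.pop(query_id, None)
        if start is None:
            return  # unknown query, or first token already recorded
        self.values[f"{self.prefix}first_token_latency_sum"] += self.clock() - start
        self.values[f"{self.prefix}first_token_latency_count"] += 1
```

A service would call `query_received()` when the request arrives and `first_token_sent()` when the first token goes out. Keeping a sum and a count rather than a single average lets a monitoring system compute rates over arbitrary time windows.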

> Note: metrics should have relevant prefix, e.g. `chatqna_` for ChatQnA service, so they can be identified better.

Each metric should also have a label that identifies to which Helm...

Was more complicated than I thought, but fixed with https://github.com/opea-project/GenAIComps/pull/845

The last warnings in the log, about things no longer being supported, look quite suspicious. Btw. @mandalrajiv It's better to provide such (long) log files as attachments (instead of...

There's an odd warning about an invalid HTTP request, and I'm not sure how to interpret your log, as there seem to be multiple logs, interrupted in the middle? ```...

I guess upload processing time is linearly related to the amount of text => a 37-page doc could take roughly 20x longer than a 2-page one. I.e. if 2 page uploaded goes...
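As a rough check of that estimate, the linear scaling can be written out as below. This is only a sketch: the function `estimate_upload_seconds` is hypothetical, and the 10-second reference time in the usage comment is an invented placeholder, not a measurement from the thread.

```python
def estimate_upload_seconds(pages, ref_pages, ref_seconds):
    """Scale a measured upload time linearly with page count (assumed model)."""
    return ref_seconds * pages / ref_pages


# Ratio between the two document sizes mentioned in the comment.
ratio = 37 / 2  # 18.5, which the comment rounds to about 20x

# Hypothetical usage: if a 2-page upload took 10 s, a 37-page upload
# would be estimated via estimate_upload_seconds(37, 2, 10.0).
```

The linear model ignores fixed per-upload overhead, so for small documents the real ratio may be lower than the page ratio suggests.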

FYI: I have (k8s) readiness probes on TGI and TEI services, because otherwise things fail due to the (k8s) service endpoint sending traffic to recently scaled-up TGI instances, although TGI...

> Current test 32 requests with no such issues.

I don't think that is stressing it enough, if you do not even see Timeout issues. I'm seeing these kinds of...

> [@eero-t](https://github.com/eero-t) I have only one machine, so I could not send a large number of requests.
>
> Did you deploy on K8S with multiple machines? What's the rough number...