Eero Tamminen comments

Results: 721 comments of Eero Tamminen

[ChatQnA] Provide E2E performance metrics

Providing such metrics is straightforward.

When a user query comes in:
* Timestamp first token start time
* Increase `query_count` counter

When replying with the first token for that query:
* Add...
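The steps listed in this comment could be sketched roughly as follows. This is only an illustration using the Python standard library, not the actual ChatQnA implementation: the class and method names are hypothetical, and because the comment is cut off at "Add...", the sketch assumes the first-token step adds the elapsed time to a running latency sum.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FirstTokenMetrics:
    """Hypothetical in-process metrics holder (names are illustrative)."""

    query_count: int = 0
    first_token_count: int = 0
    first_token_latency_sum: float = 0.0
    _start: dict = field(default_factory=dict)

    def on_query(self, query_id, now=None):
        # Timestamp the query arrival and increase the query counter.
        self._start[query_id] = time.monotonic() if now is None else now
        self.query_count += 1

    def on_first_token(self, query_id, now=None):
        # Assumption: the truncated step accumulates first-token latency.
        start = self._start.pop(query_id, None)
        if start is None:
            return None  # unknown query, or first token already recorded
        now = time.monotonic() if now is None else now
        latency = now - start
        self.first_token_count += 1
        self.first_token_latency_sum += latency
        return latency
```

The optional `now` argument exists only so that the timing can be driven by hand when experimenting; a real service would more likely export such values through its metrics library.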

[ChatQnA] Provide E2E performance metrics

Note: metrics should have a relevant prefix, e.g. `chatqna_` for the ChatQnA service, so they can be identified more easily.

[ChatQnA] Provide E2E performance metrics

> Note: metrics should have relevant prefix, e.g. `chatqna_` for ChatQnA service, so they can be identified better.

Each metric should also have a label that identifies to which Helm...

[ChatQnA] Provide E2E performance metrics

Was more complicated than I thought, but fixed with https://github.com/opea-project/GenAIComps/pull/845

ChatQnA on Xeon Docker Implementation - Embedding into Vector DB Failing

The last warnings in the log, about things no longer being supported, look quite suspicious. Btw. @mandalrajiv It's better to provide such (long) log files as attachments (instead of...

ChatQnA on Xeon Docker Implementation - Embedding into Vector DB Failing

There's an odd warning about an invalid HTTP request, and I'm not sure how to interpret what your log is about, as there seem to be multiple logs, interrupted in the middle? ```...

ChatQnA on Xeon Docker Implementation - Embedding into Vector DB Failing

I guess upload processing time is linearly related to the amount of text => a 37-page doc could take roughly 20x longer than a 2-page one. I.e. if 2 page uploaded goes...
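The linear-scaling guess in this comment works out as below. This is a small sketch under the comment's own assumption that processing time is proportional to page count; the function name and the reference timing of 1.0 are hypothetical placeholders, not measured values.

```python
def estimate_upload_seconds(pages, ref_pages, ref_seconds):
    """Scale a reference upload time linearly with page count.

    Assumes processing time is proportional to the amount of text,
    which is the guess made in the comment, not a measured fact.
    """
    if ref_pages <= 0:
        raise ValueError("ref_pages must be positive")
    return ref_seconds * pages / ref_pages


# With a reference time of 1.0 for the 2-page doc, the result is the
# scale factor for the 37-page doc: 37 / 2 = 18.5, i.e. roughly 20x.
scale = estimate_upload_seconds(37, 2, 1.0)
```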

Exceptions in application logs with larger or larger number of requests

FYI: I have (k8s) readiness probes on the TGI and TEI services, because otherwise things fail due to the (k8s) service endpoint sending traffic to recently scaled-up TGI instances, although TGI...

Exceptions in application logs with larger or larger number of requests

> Current test 32 requests with no such issues.

I don't think that is stressing it enough, if you do not even see Timeout issues. I'm seeing these kinds of...

Exceptions in application logs with larger or larger number of requests

> [@eero-t](https://github.com/eero-t) I have only one machine, so I could not send a large number of requests.
>
> Did you deploy on K8S with multiple machines? What's the rough number...