Eero Tamminen
> Thanks a lot. I extended it to 40min, but unfortunately shard preparation hasn't finished within this time if I deploy the TGI service as kata-qemu-tdx (with TDX protection). Any hint...
> The TD VM (kata-qemu-tdx) pod is created without persistent storage, so while deploying a new TGI pod, it has to download the model data from the network. I assume TDX is used for...
I haven't tried using Gaudis (nor Docker-compose), but thought of a few possible issues... Based on your error output, sharding is enabled. By default, TGI tries to use all _available_ devices,...
> The problem here is that TEI and TGI seem to compete with each other for the only Gaudi card, and TGI failed with the error message. Ah,...
@louie-tsai Please don't assign things to me as I'm not a developer in this project (just another user testing it).
> If you could, please provide a PR for the README and describe the steps. Thanks. Sorry, that thing is in so many files [1] that cleaning it up is way too large...
> to move TEI embedding microservice to CPU Why? Is TEI-embedding Gaudi utilization too low for it to make sense, or is there some other reason?
> Could you provide more information about what issue the empty securityContexts cause? Such pods cannot be run in clusters with more strict pod security policies (see the "pod-security-standards" link)....
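To make the concern concrete, here is a minimal Python sketch of the kind of check a stricter cluster applies. It assumes the "restricted" Pod Security Standards profile and only looks at container-level fields (some of these may also be set at pod level, which this sketch ignores). The function name `restricted_violations` and the sample container dicts are made up for illustration and are not taken from any of the manifests discussed here.

```python
def restricted_violations(container):
    """Return a list of reasons why a container spec (as a dict) would
    likely be rejected under the 'restricted' Pod Security Standard.

    Simplification: only the container-level securityContext is checked.
    """
    sc = container.get("securityContext") or {}
    problems = []

    # Privilege escalation must be explicitly disabled.
    if sc.get("allowPrivilegeEscalation") is not False:
        problems.append("allowPrivilegeEscalation must be false")

    # Container must be declared as running as a non-root user.
    if sc.get("runAsNonRoot") is not True:
        problems.append("runAsNonRoot must be true")

    # All capabilities must be dropped.
    drops = (sc.get("capabilities") or {}).get("drop") or []
    if "ALL" not in drops:
        problems.append("capabilities.drop must include ALL")

    # A seccomp profile must be set explicitly.
    seccomp = (sc.get("seccompProfile") or {}).get("type")
    if seccomp not in ("RuntimeDefault", "Localhost"):
        problems.append("seccompProfile.type must be RuntimeDefault or Localhost")

    return problems


# Hypothetical container with an empty securityContext.
empty = {"name": "example", "securityContext": {}}

# Hypothetical container with the fields filled in.
hardened = {
    "name": "example",
    "securityContext": {
        "allowPrivilegeEscalation": False,
        "runAsNonRoot": True,
        "capabilities": {"drop": ["ALL"]},
        "seccompProfile": {"type": "RuntimeDefault"},
    },
}

for spec in (empty, hardened):
    print(spec["name"], restricted_violations(spec))
```

An empty `securityContext: {}` fails every one of these checks, which is why such pods get rejected when the namespace enforces the stricter profile.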
Thanks, the merged PR looks good, but there are a few things that could be improved: * `/mnt` is not a good host mount point. Dirs mounted from host should be...
> PR [opea-project/GenAIInfra#153](https://github.com/opea-project/GenAIInfra/pull/153) should have resolved this Yes, looks good! Any idea when those changes will also get to this (`GenAIExamples`) repository?