LLM-VM
Load-balancing / auto-scaling for LLM serving on AWS
What did you have in mind here? When you refer to auto-scaling, do you mean horizontal scaling across hosts, or parallelism on a single host?
@horahoradev I was actually referring to horizontal scaling of LLM instances.
Hi, can I please work on this issue? Thank you!