pcpLiu

Results: 5 comments by pcpLiu

Hi @meganeura. I had a similar problem to yours. My `x` was `-inf` and `width` was `inf`. I added code:

```
if (x
```

@ruisearch42 Yes, the code looks something like this:

```
@serve.deployment(
    max_ongoing_requests=5,
)
@serve.ingress(app)
class VllmDeployment:
    ....
    model = OpenAIServingChat(...)
```

@akshay-anyscale

### Deployment of 1st model

Ray deployment config

```
name: VllmDeployment
num_replicas: 1
max_ongoing_requests: 5
max_queued_requests: -1
user_config: null
graceful_shutdown_wait_loop_s: 2
graceful_shutdown_timeout_s: 20
health_check_period_s: 10
health_check_timeout_s: 30
ray_actor_options:
  runtime_env:...
```

Closing the issue, as we fixed this with a local patch. We submitted a PR in case it helps others who hit the same problem.

@kouroshHakha We bumped ray[server] to 2.43.0, but the issue still persists. We are using vllm==0.7.2.