Cade Daniel

Results: 20 issues and pull requests authored by Cade Daniel

## Why are these changes needed?

Fixing small things that fell through the cracks.

## Related issue number

Labels: docs, infra

Hi, thanks for the great work! I am setting up a GitHub Project [beta] to track incidents at my company. My flow is: [incident management SaaS] -> GitHub integration...

See https://github.com/ray-project/ray/issues/27299 for context.

### script.py

Testing with the following code:

```python3
#!/usr/bin/env python3
import ray

ray.init()

@ray.remote
def task(argument):
    import grpc
    import platform
    print(argument, grpc.__version__, platform.python_version())

ray.get(task.remote('Hello world'))
```
...

Labels: @author-action-required, core

This PR refactors part of the Raylet's worker pool so that the cache size is determined by a separate policy component instead of by the worker pool itself. The goal is...

Labels: stale
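The Raylet lives in Ray's C++ core, so the code below is only a minimal Python sketch of the separation this PR describes; every name in it (`CacheSizePolicy`, `FixedCacheSizePolicy`, `WorkerPool`) is hypothetical rather than Ray's actual API.

```python3
# Illustrative sketch only: the real Raylet worker pool is C++ inside Ray
# core, and every name here is hypothetical, not Ray's actual API.
from abc import ABC, abstractmethod


class CacheSizePolicy(ABC):
    """Decides how many idle workers the pool may keep cached."""

    @abstractmethod
    def max_cached_workers(self, num_running_tasks: int) -> int:
        ...


class FixedCacheSizePolicy(CacheSizePolicy):
    """Simplest possible policy: a fixed cap, independent of load."""

    def __init__(self, cap: int):
        self.cap = cap

    def max_cached_workers(self, num_running_tasks: int) -> int:
        return self.cap


class WorkerPool:
    """The pool no longer computes the cache size; it asks the policy."""

    def __init__(self, policy: CacheSizePolicy):
        self.policy = policy
        self.idle_workers = []

    def on_worker_idle(self, worker, num_running_tasks: int) -> None:
        if len(self.idle_workers) < self.policy.max_cached_workers(num_running_tasks):
            self.idle_workers.append(worker)  # cache the worker for reuse
        # else: over the cap, so the worker is released instead of cached


pool = WorkerPool(FixedCacheSizePolicy(cap=2))
for w in ["w1", "w2", "w3"]:
    pool.on_worker_idle(w, num_running_tasks=0)
print(pool.idle_workers)  # ['w1', 'w2'] -- the third worker is not cached
```

Keeping the sizing decision behind an interface like this is what lets the cache policy change (or be unit-tested) without touching the pool's bookkeeping.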

### Speculative decoding

This PR is part of a larger series of PRs implementing speculative decoding, contributed to open-source vLLM by Anyscale. See https://github.com/vllm-project/vllm/pull/2188 and [Speculative decoding open...

- 2b32260a5fe3694d8677f3dc42984c90d6ef141c FAILED [Buildkite :mac: :apple: Medium A-J](https://buildkite.com/ray-project/oss-ci-build-branch/builds/4806#01891d44-9d95-4bde-8ea4-84c7e986182b) ...

Generated from flaky test tracker. Please do not edit the signature in this section.
DataCaseName-osx://python/ray/tests:test_advanced_5-END
...

Labels: P0, core, flaky-tracker, ray 2.10

This PR introduces the concept of lookahead scheduling. In lookahead scheduling, we allocate, for each sequence in a decode batch, KV slots that do not yet have any token assigned...
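As a rough illustration of the excerpt above, here is a minimal sketch of how lookahead slots could factor into block allocation; `BLOCK_SIZE` and both function names are assumptions for the sketch, not vLLM's actual interface.

```python3
# Illustrative sketch of lookahead scheduling; constants and function names
# are hypothetical, not vLLM's actual interface.
BLOCK_SIZE = 16  # tokens per KV-cache block (assumed value)


def slots_needed(seq_len: int, num_lookahead_slots: int) -> int:
    """Total KV slots to reserve for one decode step of one sequence.

    Besides the slot for the token produced this step, reserve
    `num_lookahead_slots` extra slots that have no token assigned yet,
    so speculative tokens can be written without reallocating.
    """
    return seq_len + 1 + num_lookahead_slots


def blocks_to_allocate(seq_len: int, num_lookahead_slots: int,
                       allocated_blocks: int) -> int:
    """How many new blocks the block manager must grant this sequence."""
    needed_blocks = -(-slots_needed(seq_len, num_lookahead_slots) // BLOCK_SIZE)
    return max(0, needed_blocks - allocated_blocks)


# Example: a 30-token sequence with 4 lookahead slots needs ceil(35/16) = 3
# blocks; if 2 are already allocated, one more must be granted.
print(blocks_to_allocate(seq_len=30, num_lookahead_slots=4, allocated_blocks=2))
```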

## Note: this PR is stacked on top of https://github.com/vllm-project/vllm/pull/3250

This PR is a subset of `PR 6/9: Integrate speculative decoding with LLMEngine.` in the [speculative decoding open sourcing plan](https://docs.google.com/document/d/1rE4pr3IdspRw97XbImY4fS9IWYuJJ3HGtL7AdIKGrw8/edit)....

Recently, we refactored the block manager subsystem to improve testability by separating the concerns of each layer. See https://github.com/vllm-project/vllm/pull/3492 for more information. The following features are still missing for prefix caching support:...

Labels: misc
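For context on the prefix-caching entry above: prefix caching keys full KV-cache blocks by a hash that covers the block's tokens and its entire preceding prefix, so identical prompt prefixes can share blocks. A minimal sketch of that idea follows; the class and the exact hashing scheme are hypothetical, not the block manager's actual API.

```python3
# Illustrative sketch of content-hash-based prefix caching; the class and
# hashing scheme are hypothetical, not the block manager's actual API.
from typing import Dict, Tuple


class PrefixBlockCache:
    """Maps a content hash of a full block (plus everything before it)
    to a physical block id, so identical prefixes share KV-cache blocks."""

    def __init__(self):
        self._table: Dict[int, int] = {}
        self._next_block_id = 0

    def get_or_allocate(self, prefix_hash: int,
                        block_tokens: Tuple[int, ...]) -> Tuple[int, int, bool]:
        # The block's hash chains in `prefix_hash`, so a block is shared
        # only when the entire preceding prefix matches as well.
        block_hash = hash((prefix_hash, block_tokens))
        if block_hash in self._table:
            return self._table[block_hash], block_hash, True  # cache hit
        block_id = self._next_block_id
        self._next_block_id += 1
        self._table[block_hash] = block_id
        return block_id, block_hash, False  # newly allocated


cache = PrefixBlockCache()
h0 = hash(())  # hash of the empty prefix
blk_a = cache.get_or_allocate(h0, (1, 2, 3, 4))
blk_b = cache.get_or_allocate(h0, (1, 2, 3, 4))
print(blk_a[0] == blk_b[0])  # True: the identical prefix block is shared
```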