Antoni Baum issues

Results: 33 issues of Antoni Baum

[Train] Update documentation to show how to use `torchmetrics` for metric aggregation

### Description For some applications of Ray Train, it may be desirable to report metrics from all workers and/or report aggregations, such as mean and std. While Train currently reports...

enhancement, docs, train, air, ray-team-created

[RFC][Train] Allow for reporting results from multiple workers

### Description # Current state Currently, Ray Train only reports metrics from the first worker. This is fine in most cases, but for some applications, it may be desirable to...

enhancement, RFC, train, ray-team-created

[Train/CI] Fix flaky `test_reserved_cpu_warnings`

Signed-off-by: Antoni Baum ## Why are these changes needed? The issue seems to have been caused by Ray tasks / actors sometimes being kept alive between `fit` calls before garbage...

[Docs] Add tips on what to use for SSML

Signed-off-by: Antoni Baum ## Why are these changes needed? Adds a tip explaining to users when to use what tool for SSML. ## Related issue number ## Checks - [...

@author-action-required

[Tune] Checkpoints returned by Tuner do not point to cloud

### What happened + What you expected to happen If a Tuner has a `sync_config` set, the expectation is that the `Checkpoint`s contained within `Result`s returned would point to the...

bug, tune, ray-team-created

[2023-08-28] Sunsetting non-Llama model examples.

_This issue serves as a "news ticker" for the Aviary frontend._ **Current news:** We're sunsetting the non-Llama models as of the 0.3.0 release. The reason is that we've seen the...

Do not init process group if already initialized

# What does this PR do? Simple tweak to skip initialization of the torch process group if one is already initialized. ## Before submitting - [ ] This PR fixes...

[Dashboard] Confusing `workers` terminology

### Description In Ray Cluster Configuration, "worker" refers to worker nodes. However, in Dashboard, "worker" refers to Ray processes on nodes (which are also confusingly named "hosts"). This is inconsistent...

enhancement, dashboard

[Docs/AIR] Add install snippet to examples

## Why are these changes needed? Adds a cell to install the necessary libraries for AIR examples. ## Related issue number ## Checks - [ ] I've signed off every...

stale

[Serve] Support callables in `serve.batch` args

## Why are these changes needed? This PR allows for callables to be used in `serve.batch` arguments, allowing users to add logic to dynamically change the max batch size and...