Antoni Baum
### Description For some applications of Ray Train, it may be desirable to report metrics from all workers and/or report aggregations, such as mean and std. While Train currently reports...
### Description # Current state Currently, Ray Train only reports metrics from the first worker. This is fine in most cases, but for some applications, it may be desirable to...
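The two previews above ask for per-worker metrics to be aggregated (for example mean and std) rather than taking only the first worker's values. A minimal sketch of that aggregation step in plain Python follows; it is not Ray Train's API. The helper name `aggregate_worker_metrics`, the `_mean`/`_std` key suffixes, and the sample values are all hypothetical, and it assumes every worker reports the same numeric keys.

```python
from statistics import mean, pstdev


def aggregate_worker_metrics(per_worker):
    """Combine one metrics dict per worker into mean/std summaries.

    Hypothetical helper: assumes all dicts share the same numeric keys.
    """
    aggregated = {}
    for key in per_worker[0]:
        values = [metrics[key] for metrics in per_worker]
        aggregated[f"{key}_mean"] = mean(values)
        # Population standard deviation across the workers.
        aggregated[f"{key}_std"] = pstdev(values)
    return aggregated


# Illustrative values only: one dict per worker.
worker_metrics = [{"loss": 1.0}, {"loss": 2.0}, {"loss": 3.0}]
aggregated = aggregate_worker_metrics(worker_metrics)
print(aggregated)
```

Whether the population or sample standard deviation is the right choice depends on how the workers are treated; `statistics.stdev` could be swapped in for the latter.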
Signed-off-by: Antoni Baum ## Why are these changes needed? The issue seems to have been caused by Ray tasks / actors being sometimes kept alive between `fit` calls before garbage...
Signed-off-by: Antoni Baum ## Why are these changes needed? Adds a tip explaining to users when to use what tool for SSML. ## Related issue number ## Checks - [...
### What happened + What you expected to happen If a Tuner has a `sync_config` set, the expectation is that the `Checkpoint`s contained within `Result`s returned would point to the...
_This issue serves as a "news ticker" for the Aviary frontend._ **Current news:** We're sunsetting the non-Llama models as of the 0.3.0 release. The reason is that we've seen the...
# What does this PR do? Simple tweak to skip initialization of the torch process group if one is already initialized. ## Before submitting - [ ] This PR fixes...
### Description In Ray Cluster Configuration, "worker" refers to worker nodes. However, in Dashboard, "worker" refers to Ray processes on nodes (which are also confusingly named "hosts"). This is inconsistent...
## Why are these changes needed? Adds a cell to install the necessary libraries for AIR examples ## Related issue number ## Checks - [ ] I've signed off every...
## Why are these changes needed? This PR allows for callables to be used in `serve.batch` arguments, allowing users to add logic to dynamically change the max batch size and...
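The idea of accepting either a fixed value or a callable for a batching argument can be sketched generically as below. This is an illustration of the pattern only, not the `serve.batch` implementation; `resolve`, `BatchSettings`, and the `limit` dictionary are hypothetical names introduced for the example.

```python
def resolve(value):
    """Return value() if value is callable, otherwise value itself."""
    return value() if callable(value) else value


class BatchSettings:
    """Hypothetical holder for a max batch size given as an int or a callable."""

    def __init__(self, max_batch_size):
        self._max_batch_size = max_batch_size

    @property
    def max_batch_size(self):
        # Re-evaluated on every access, so a callable can change the limit at runtime.
        return resolve(self._max_batch_size)


static = BatchSettings(8)

limit = {"size": 4}
dynamic = BatchSettings(lambda: limit["size"])

first = dynamic.max_batch_size
limit["size"] = 16  # The next access picks up the new limit.
second = dynamic.max_batch_size
```

Resolving the value at access time, rather than once at construction, is what lets user logic adjust the limit between batches.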