Anda Zhou

Results 8 issues of Anda Zhou

## Description summary of calls in existing workload sequencer for reference: ``` startup callbacks set data loaders load from checkpoint hvd.broadcast parameters/optimizer state try: for op in searcher_ops: (workloads) for...

cla-signed

## Description HTTP responses outside of /det and /api/v1 are not grouped, so each distinct 4xx/5xx request path creates a new Prometheus metric label group and cardinality grows without bound. Solution is...

cla-signed
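The cardinality problem in the HTTP responses item above can be illustrated with a small sketch. The PR's actual solution is elided in the snippet, so this is only one possible approach: collapse every path outside the two known prefixes into a single catch-all label. The names `GROUPED_PREFIXES`, `OTHER_LABEL`, and `path_label` are hypothetical and do not come from the PR.

```python
# Prefixes named in the description; every other path is grouped together.
GROUPED_PREFIXES = ("/det", "/api/v1")
OTHER_LABEL = "other"  # hypothetical catch-all label value


def path_label(path: str) -> str:
    """Map a request path to a bounded set of metric label values."""
    for prefix in GROUPED_PREFIXES:
        if path == prefix or path.startswith(prefix + "/"):
            return prefix
    return OTHER_LABEL


# Arbitrary probe paths no longer create one label group each.
paths = ["/det/", "/api/v1/experiments/12", "/wp-login.php", "/.env", "/random/123"]
labels = {path_label(p) for p in paths}
```

With this mapping the number of label values stays fixed at three regardless of how many distinct unknown paths are requested.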

## Description This PR migrates existing profiling metrics (system metrics only) in `trial_profiler_metrics` to generic metrics `metrics` and changes existing APIs related to the profiler to shim old APIs to...

cla-signed

## Description Implement the system metric profiling functionality in Core API. This is a complete rewrite of the old `ProfilerAgent`. Timing metrics functionality was removed and system metrics are now...

cla-signed

## Description Introduce the option of using persistent HTTP sessions in `api.Session`. Previously, each API call was wrapped in its own `requests.Session`, which meant making a new underlying TCP connection/TLS...

cla-signed

The current implementation of GCS storage uses an anonymous client without credentials. Is GCS with auth a planned feature improvement?

enhancement
plugin

## Ticket ## Description Make sharded checkpoint uploads with `store_path` check for file conflicts across workers before upload. ## Test Plan ## Checklist - [ ] Changes have been manually...

cla-signed
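The conflict check described in the sharded checkpoint item above could look roughly like the following. This is a minimal sketch, assuming the per-worker file lists have already been gathered onto one process. The function name `find_conflicts` and the input shape (rank mapped to relative file names) are hypothetical, not taken from the PR.

```python
from collections import defaultdict
from typing import Dict, List


def find_conflicts(files_by_rank: Dict[int, List[str]]) -> Dict[str, List[int]]:
    """Return each file name that more than one worker intends to upload."""
    owners = defaultdict(list)
    for rank, files in sorted(files_by_rank.items()):
        for name in set(files):  # ignore duplicates within a single worker
            owners[name].append(rank)
    return {name: ranks for name, ranks in owners.items() if len(ranks) > 1}


# Two workers both try to write model.pt; the shard files are unique.
uploads = {0: ["model.pt", "shard-0.bin"], 1: ["model.pt", "shard-1.bin"]}
conflicts = find_conflicts(uploads)
if conflicts:
    print(f"refusing to upload, conflicting files: {conflicts}")
```

Running the check before any upload starts means a conflict can be reported without leaving a partially written checkpoint behind.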

## Ticket ## Description refactor of master-side searchers: - remove usage of searcher operations - remove usage of `max_length` master-side - rewrite of preview-search - renames trial -> run in...

documentation
cla-signed