Victor Skvortsov

Results 121 comments of Victor Skvortsov

Replacing `__root__` with RootModel is not obvious since RootModel is not compatible with pydantic-duality. In dstack, `__root__` models are used to parse unions discriminated by type, e.g.: ```python class AWSCreds(CoreModel):...
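
As a rough illustration of what "unions discriminated by type" means, here is a minimal sketch that uses plain dataclasses rather than pydantic. It is not dstack's code: `KeyCreds`, `DefaultCreds`, `parse_creds`, and the field names are hypothetical, and the only assumption carried over from the comment is that a `type` field selects the variant.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type, Union


@dataclass
class KeyCreds:
    """Hypothetical variant selected by type == "access_key"."""

    access_key: str
    secret_key: str


@dataclass
class DefaultCreds:
    """Hypothetical variant selected by type == "default"; carries no fields."""


AnyCreds = Union[KeyCreds, DefaultCreds]

# Maps the discriminator value to the class that should parse the payload.
_VARIANTS: Dict[str, Type[Any]] = {
    "access_key": KeyCreds,
    "default": DefaultCreds,
}


def parse_creds(data: Dict[str, Any]) -> AnyCreds:
    """Pick the variant from the `type` tag and build it from the other keys."""
    fields = dict(data)  # copy so the caller's dict is not mutated
    tag = fields.pop("type", None)
    if tag not in _VARIANTS:
        raise ValueError(f"unknown creds type: {tag!r}")
    return _VARIANTS[tag](**fields)
```

A wrapper model with a single `__root__` field plays the role of `parse_creds` here: it lets the caller parse a payload without knowing in advance which variant it contains.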

The most promising solution at the moment seems to be the instance-per-TPU-device model. Provisioning a multi-device TPU Pod creates an instance for each TPU device. For example, provisioning TPU v2-32...

I'd start with `instance` since it's what `dstack pool` currently works with. We can discuss how dstack is supposed to work with clusters in a separate issue: we could expand `instance`...

The description is updated to use the `fleet` configuration type.

This also seems to be the case with the cudo, tensordock, and vastai backends.

Currently, resource specification via `dstack run` CLI arguments is very limited. `cpu` and `memory` are not supported. This can be fixed if the CLI arguments are unified with `dstack pool add`....

One hypothesis is that the panic is caused by concurrent executions of `logsWsGetHandler()`. In that case I assume the channel may be closed twice. But I'm not sure why concurrent...

@colinjc, not all runs are created from yaml configurations since runs can be submitted via Python API or HTTP API. Thus, `dstack` doesn't store the submitted yaml internally but a...

Confirmed that when mocking the background task operations that establish SSH connections, server CPU utilization does not exceed 3%.

> The correct plan should be cpu=2.. mem=8GB.. disk=100GB.. gpu: 1..

@peterschmidt85, do you suggest changing the gpu default to `gpu: 1..`? Currently, it's `gpu: None`.