Andrey Velichkevich

Results: 1318 comments by Andrey Velichkevich

@Snehadas2005 Can you sign your commit and remove the long output from your Notebook? Currently, this PR has 35k lines of changes.
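
One way to cut the diff size is to clear the stored cell outputs before committing. The sketch below is a minimal, standard-library illustration; the function name `strip_notebook_outputs` is hypothetical, and it assumes the Notebook uses the usual nbformat 4 JSON layout with a top-level `cells` list.

```python
import json
from pathlib import Path


def strip_notebook_outputs(path: str) -> int:
    """Clear outputs of every code cell in place; return how many cells had content removed."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))

    cleared = 0
    for cell in notebook.get("cells", []):
        # Markdown and raw cells carry no outputs, so only code cells are touched.
        if cell.get("cell_type") != "code":
            continue
        if cell.get("outputs") or cell.get("execution_count") is not None:
            cleared += 1
        cell["outputs"] = []
        cell["execution_count"] = None

    # Write back with a 1-space indent and trailing newline (assumed formatting).
    nb_path.write_text(
        json.dumps(notebook, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return cleared
```

If Jupyter is installed, `jupyter nbconvert --clear-output --inplace <notebook>.ipynb` does the same job without custom code.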

@Snehadas2005 It looks like your speech recognition Notebook is failing:
```
Error loading audio file: TorchCodec is required for load_with_torchcodec. Please install torchcodec to use this function.
Error loading audio file:...
```

@Snehadas2005 Please rebase this PR to fix GPU CI.

@Snehadas2005 Can you also rebase this PR to fix CI?

@Snehadas2005 This should be fixed after a rebase. Can you add your changes on top of the latest changes from the `master` branch: https://github.com/kubeflow/trainer/commits/master/

Thanks for creating this @stivanov-intercom! Yeah, this is a good point: other resources that the TrainJob controller creates won't be cleaned up. For example, the Secret or ConfigMap for MPI-based runtimes....

Hi @stivanov-intercom – during the latest Trainer WG call we discussed how to introduce `TTLSecondsAfterFinished` and `ActiveDeadlineSeconds` to the TrainJob API: https://youtu.be/U6oMNAN4PE8?t=2318 One open question was whether these parameters should live...
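
For readers unfamiliar with the two fields, here is a rough sketch of the timing semantics they have on the Kubernetes `batch/v1` Job, where fields of the same names already exist. How TrainJob will define them is not settled in this thread, so the function names and the assumption that TrainJob mirrors the Job behavior are purely illustrative.

```python
from datetime import datetime, timedelta
from typing import Optional


def deadline_exceeded(
    start_time: datetime,
    now: datetime,
    active_deadline_seconds: Optional[int],
) -> bool:
    """True once the job has been active for at least the deadline (measured from start)."""
    if active_deadline_seconds is None:
        return False  # No deadline configured: the job may run indefinitely.
    return now - start_time >= timedelta(seconds=active_deadline_seconds)


def ttl_expired(
    finish_time: Optional[datetime],
    now: datetime,
    ttl_seconds_after_finished: Optional[int],
) -> bool:
    """True once a finished job has outlived its TTL and is eligible for cleanup."""
    if finish_time is None or ttl_seconds_after_finished is None:
        return False  # Still running, or no TTL configured: never cleaned up automatically.
    return now >= finish_time + timedelta(seconds=ttl_seconds_after_finished)
```

The deadline bounds how long a job may run, while the TTL only starts counting once the job has finished, so a TTL of `0` makes a finished job eligible for deletion immediately.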

/milestone v2.2