deepanker13
Issue → The PyTorch profiler is not capturing DataLoader time and runtime; both always show 0. Code used → I used the code given in the official PyTorch profiler documentation ([PyTorch documentation](https://pytorch.org/tutorials/intermediate/tensorboard_profiler_tutorial.html))...
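As a cross-check while the profiler reports 0, data-loading time can be measured by hand. The following is a minimal sketch, not part of the PyTorch profiler API: `time_loader_and_steps` and `step_fn` are hypothetical names, and the helper only assumes that the loader is iterable (a `DataLoader` would qualify). On a GPU the step timing would also need an explicit synchronization, which is left out here.

```python
import time


def time_loader_and_steps(loader, step_fn):
    """Split wall-clock time into batch-fetch time and step time.

    Hypothetical helper: `loader` is any iterable of batches and
    `step_fn` is whatever is done with each batch (e.g. a training step).
    Returns (data_seconds, compute_seconds, number_of_steps).
    """
    data_s = 0.0
    compute_s = 0.0
    steps = 0
    it = iter(loader)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent waiting for the next batch
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # time spent processing the batch
        t2 = time.perf_counter()
        data_s += t1 - t0
        compute_s += t2 - t1
        steps += 1
    return data_s, compute_s, steps


if __name__ == "__main__":
    # Stand-in loader and step so the sketch runs without PyTorch installed.
    fake_loader = ([i] * 4 for i in range(5))
    data_s, compute_s, steps = time_loader_and_steps(fake_loader, lambda b: sum(b))
    print(f"steps={steps} data={data_s:.6f}s compute={compute_s:.6f}s")
```

If the hand-measured data time is clearly non-zero while the profiler view still shows 0, that would suggest the problem lies in how the profiler is wrapped around the loop rather than in the loader itself.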
@andreyvelich the links in fine-tuning.md are giving 404 page not found. Am I missing something?
> > @andreyvelich the links in fine-tuning.md are giving 404 page not found. Am I missing something?
>
> @deepanker13 Did you check these links via Website preview: https://deploy-preview-3718--competent-brattain-de2d6d.netlify.app/ ?...
> @johnugeorge @deepanker13 Do we need to create tracking issue with remaining items for Train/Fine-tune API for LLMs ?

Okay, I will create one.
@StefanoFioravanzo I can help with the tutorial. Also, do you have a reference for the API documentation?
@tenzen-y I think environment variables like PET_RDZV_ENDPOINT, PET_RDZV_BACKEND, etc. are set for the containers only when we pass the elastic policy spec (https://github.com/kubeflow/training-operator/blob/0b6a30cd348e101506b53a1a176e4a7aec6e9f09/pkg/controller.v1/pytorch/envvar.go#L109). And the above mentioned environment variables are...
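One way to confirm this from inside a running container is to check which of these variables are actually present. The snippet below is a minimal sketch using only the standard library; the two variable names come from the comment above, while `rendezvous_env` and `missing_rendezvous_vars` are hypothetical helpers, not part of the training operator or PyTorch.

```python
import os

# Names taken from the comment above; any further PET_* variables are not listed here.
RDZV_VARS = ("PET_RDZV_ENDPOINT", "PET_RDZV_BACKEND")


def rendezvous_env(environ=None):
    """Map each rendezvous variable to its value, or None when it is unset."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in RDZV_VARS}


def missing_rendezvous_vars(environ=None):
    """List the rendezvous variables that are not set in the environment."""
    return [name for name, value in rendezvous_env(environ).items() if value is None]


if __name__ == "__main__":
    missing = missing_rendezvous_vars()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All rendezvous variables are set:", rendezvous_env())
```

Running this in a worker container with and without the elastic policy spec should show whether the variables appear only in the first case.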
@andreyvelich since the V2 implementation has started, should we take up the remaining tasks?
Shall we rename it to kubeflow/model-dataset-downloader? cc @andreyvelich @terrytangyuan