sagemaker-python-sdk Experiments/Runs, allow user-defined Run Group names

When creating an experiment's Run instance to track a training or processing job, the user should be able to specify a custom run_group_name for that run.

We've been using SageMaker Experiments previously and are currently migrating from the standalone smexperiments library to the new, integrated SDK. Previously we were able to define a Trial and effectively group a number of experiment/job runs under this Trial's name. Our understanding is that the new run group concept serves the same purpose, yet for a standalone job run (pipelines might be different) it is not possible to specify a user-defined run_group_name while defining the experiment's Run context.

Feb 08 '23 16:02 AndreiVoinovTR

We discovered an additional complication: apparently there is a quota of at most 50 trial components (runs) per trial, and since the run group name is effectively 'fixed' per experiment (as Default-Run-Group-<experiment-name>), the whole experiment ends up limited to 50 runs :( We are currently requesting an increase of this default limit.

Jun 05 '23 14:06 AndreiVoinovTR

any update on this? were you able to raise the limit or find a way to assign run group name?

Jul 03 '23 18:07 Selva163

We've got the limit for max trial components per experiment increased up to 200 (apparently this is the maximum possible). As to assigning the group name, the fix to this is still pending AFAIK.

Jul 04 '23 07:07 AndreiVoinovTR

In the source code of SageMaker Experiments' Run, there is a _generate_trial_component() method that relies on sagemaker.experiments.run.TRIAL_NAME_TEMPLATE. That value defaults to Default-Run-Group-<experiment-name>, and you can overwrite it before you create the run. We use it like this:

import sagemaker.session
import sagemaker.experiments.run

experiment_name = 'backtesting'

# Override the module-level template before creating the Run;
# its value is used as the run group name.
sagemaker.experiments.run.TRIAL_NAME_TEMPLATE = "week-30"

session = sagemaker.session.Session()

with sagemaker.experiments.Run(experiment_name=experiment_name, run_display_name='champion', sagemaker_session=session) as run:
    pass

This will create the run group with your desired name.
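
Since the override above changes a module-level value for the whole process, it may be worth scoping it. A minimal sketch of one way to do that follows; the helper name override_module_attr is hypothetical (not part of the SDK), and a stand-in module is used so the sketch runs without sagemaker installed. The "Default-Run-Group-{}" default is an assumption based on the default name described above.

```python
import contextlib
import types


@contextlib.contextmanager
def override_module_attr(module, name, value):
    """Temporarily set module.<name> to value and restore the original on exit.

    Hypothetical helper, not part of the SageMaker SDK.
    """
    # getattr raises AttributeError if the attribute no longer exists,
    # so a rename in a future SDK version fails loudly instead of silently.
    original = getattr(module, name)
    setattr(module, name, value)
    try:
        yield
    finally:
        setattr(module, name, original)


# Stand-in for sagemaker.experiments.run (assumed default template value).
fake_run_module = types.ModuleType("fake_run_module")
fake_run_module.TRIAL_NAME_TEMPLATE = "Default-Run-Group-{}"

with override_module_attr(fake_run_module, "TRIAL_NAME_TEMPLATE", "week-30"):
    # Inside the block the custom name is used; the experiment name is ignored
    # because the overriding string has no "{}" placeholder.
    name_inside = fake_run_module.TRIAL_NAME_TEMPLATE.format("backtesting")

# After the block the original template is back in place.
name_after = fake_run_module.TRIAL_NAME_TEMPLATE.format("backtesting")
```

With the real SDK, one would presumably pass sagemaker.experiments.run as the module and create the Run inside the with block. This still relies on the same undocumented constant, so it only limits the scope of the workaround rather than making it supported.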

Jul 11 '23 18:07 lorenzwalthert

@lorenzwalthert I feel this is more of a workaround (or even a hack) than a proper solution. We got our fingers burned before when we relied on SageMaker private code/API: they can change it without any prior notice, and would have every right to do so :( So I'd rather wait for an official resolution.

Aug 09 '23 12:08 AndreiVoinovTR

We would also like the requested capability to be added to sagemaker. Without the ability to specify its name, the run group is not useful.

Aug 17 '23 05:08 Drwhit

@AndreiVoinovTR I agree that official support with docs etc. would be better, but sagemaker.experiments.run.TRIAL_NAME_TEMPLATE is, strictly speaking, not private API. It's not uncommon to use module-level names as settings. Also, if they later change the API, my hope is that there will be another way to set the run group, so it would not be too detrimental. But I agree it's a narrow path 😄

@Selva163 @Drwhit upvoting the initial comment might be more helpful to give the issue traction instead of creating more comments without additional information (that triggers notifications).

Aug 17 '23 10:08 lorenzwalthert

I would still be grateful if one of the code owners from the sagemaker-python-sdk team could comment on this issue, and maybe share the current status of this request in the team's backlog.

Aug 17 '23 10:08 AndreiVoinovTR

FWIW @AndreiVoinovTR, it seems that you can specify the run group in the SDK if your job is part of a pipeline, here, under Specify a Custom Run Group Name.

Sep 11 '23 08:09 lorenzwalthert

Thank you for the info, @lorenzwalthert. Yes, you are correct: one can specify a run group (formerly a trial) for pipelines, and that works. The pipelines-related part of the experiments API has not been refactored (externally at least). Why that is, and whether it will eventually be refactored as well, is another question.

The new API (with the Run context) was introduced only for standalone jobs (training, processing), but it seems the ability to specify a custom run group (formerly a trial) was 'dropped' (intentionally or not) from the new jobs API. That is exactly what this issue is about.

Sep 11 '23 08:09 AndreiVoinovTR
