sagemaker-python-sdk

ParamValidationError thrown when setting parallelism_config in pipeline.upsert, pipeline.create etc.

Open fahran-wallace opened this issue 1 month ago • 0 comments

Quick version

The Problem

  • When calling pipeline.upsert, if you pass in a parallelism_config value with the correct type of ParallelismConfiguration like this:
pipeline.upsert(
    config.sm_role,
    parallelism_config=ParallelismConfiguration(max_parallel_execution_steps=5),
)

boto3 throws a ParamValidationError.

ParamValidationError: Parameter validation failed:
Invalid type for parameter ParallelismConfiguration, value: <sagemaker.workflow.parallelism_config.ParallelismConfiguration object at 0x16a80cbd0>, type: <class 'sagemaker.workflow.parallelism_config.ParallelismConfiguration'>, valid types: <class 'dict'>
  • Boto3 is expecting a dict, and isn't able to handle the ParallelismConfiguration object, which is passed through to it.
  • If you convert the ParallelismConfiguration to a dict before passing it in, using the to_request() method, the call succeeds.
  • However, this does require passing in an object with an incorrect type, which angers the type checker. A recent commit has enabled type checking for this module, which caused this problem to show up.
  • The tests in test_workflow.py erroneously pass in a dict, rather than the correct ParallelismConfiguration object.
  • Discovered in v2.254.1. It's been present for...quite a while I think. Issue became obvious in v2.245.0 when type validation was enabled.
  • I quickly checked pipeline.create and pipeline.update too - the behaviour seems to be the same.
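Until the SDK converts the object itself, one interim workaround is to keep calling to_request() and wrap the result in typing.cast so that mypy accepts it. The sketch below is self-contained and does not import sagemaker: the ParallelismConfiguration class is a stand-in, the MaxParallelExecutionSteps payload shape is an assumption, fake_upsert is a hypothetical substitute for pipeline.upsert, and TypeError stands in for boto3's ParamValidationError.

```python
from typing import Any, Dict, Optional, cast


class ParallelismConfiguration:
    """Stand-in for sagemaker.workflow.parallelism_config.ParallelismConfiguration."""

    def __init__(self, max_parallel_execution_steps: int) -> None:
        self.max_parallel_execution_steps = max_parallel_execution_steps

    def to_request(self) -> Dict[str, Any]:
        # Assumed payload shape; check the real class before relying on it.
        return {"MaxParallelExecutionSteps": self.max_parallel_execution_steps}


def fake_upsert(
    role_arn: str,
    parallelism_config: Optional[ParallelismConfiguration] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for pipeline.upsert: forwards the value untouched."""
    if parallelism_config is not None and not isinstance(parallelism_config, dict):
        # Mirrors the dict-only check quoted in the error message above.
        raise TypeError("Invalid type for parameter ParallelismConfiguration")
    return {"RoleArn": role_arn, "ParallelismConfiguration": parallelism_config}


# cast() does nothing at runtime: the dict is passed through unchanged,
# but mypy treats it as the annotated ParallelismConfiguration type.
config_dict = ParallelismConfiguration(max_parallel_execution_steps=5).to_request()
result = fake_upsert(
    "arn:aws:iam::123456789012:role/example",  # placeholder ARN
    parallelism_config=cast(ParallelismConfiguration, config_dict),
)
```

The cast only silences the type checker and documents the mismatch at the call site. It should be removed once the SDK accepts the object directly.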

The fix

Presumably, call to_request() somewhere in workflow/pipeline.py when handling the ParallelismConfiguration parameters.
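A minimal sketch of what that conversion could look like, assuming the SDK wants to keep accepting plain dicts for backwards compatibility. The helper name normalize_parallelism_config is hypothetical and is not part of the SDK, and the ParallelismConfiguration class here is a stand-in with an assumed payload shape so the example runs without sagemaker installed.

```python
from typing import Any, Dict, Optional, Union


class ParallelismConfiguration:
    """Stand-in for sagemaker.workflow.parallelism_config.ParallelismConfiguration."""

    def __init__(self, max_parallel_execution_steps: int) -> None:
        self.max_parallel_execution_steps = max_parallel_execution_steps

    def to_request(self) -> Dict[str, Any]:
        # Assumed payload shape; check the real class before relying on it.
        return {"MaxParallelExecutionSteps": self.max_parallel_execution_steps}


def normalize_parallelism_config(
    value: Optional[Union[ParallelismConfiguration, Dict[str, Any]]],
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: return the dict form that boto3 expects.

    None and dicts pass through unchanged, so existing callers that already
    use to_request() keep working; objects are converted via to_request().
    """
    if value is None or isinstance(value, dict):
        return value
    return value.to_request()


from_object = normalize_parallelism_config(ParallelismConfiguration(5))
from_dict = normalize_parallelism_config({"MaxParallelExecutionSteps": 5})
from_none = normalize_parallelism_config(None)
```

Calling a helper like this at the point where create, update and upsert build their boto3 arguments would let both call styles succeed. Whether the dict form should remain supported is a design choice for the maintainers.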

Long Version

  • We recently upgraded from v2.243.2 to v2.254.1
  • This included the fix that enabled type checking:
  • Our build failed our mypy type checking, as we were calling pipeline.upsert like this:
pipeline.upsert(
        config.sm_role,
        parallelism_config=ParallelismConfiguration(max_parallel_execution_steps=5).to_request(),
    )

to_request() was converting the ParallelismConfiguration into a RequestType, which under the hood was a dict.

  • The definition for upsert has long been
def upsert(
    self,
    role_arn: str = None,
    description: str = None,
    tags: Optional[Tags] = None,
    parallelism_config: ParallelismConfiguration = None,
) -> Dict[str, Any]:
  • The fix of the types therefore caused our type validator mypy to flag the issue - we were passing in a dict, and the correct type was a ParallelismConfiguration.
  • However, when we fixed it by removing the to_request() call, we got the following error when invoking the method:
ParamValidationError: Parameter validation failed:
Invalid type for parameter ParallelismConfiguration, value: <sagemaker.workflow.parallelism_config.ParallelismConfiguration object at 0x16a80cbd0>, type: <class 'sagemaker.workflow.parallelism_config.ParallelismConfiguration'>, valid types: <class 'dict'>
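This failure can be reproduced without AWS access, which also illustrates why tests that pass a dict would not catch it. The sketch below is an assumption-laden model rather than the SDK's code: StrictFakeClient is a hypothetical fake that enforces the same dict-only rule as the error above (raising TypeError in place of ParamValidationError), forward_unchanged models a call path that does no conversion, and ParallelismConfiguration is a stand-in class with an assumed payload shape.

```python
from typing import Any, Dict


class ParallelismConfiguration:
    """Stand-in for sagemaker.workflow.parallelism_config.ParallelismConfiguration."""

    def __init__(self, max_parallel_execution_steps: int) -> None:
        self.max_parallel_execution_steps = max_parallel_execution_steps

    def to_request(self) -> Dict[str, Any]:
        # Assumed payload shape; check the real class before relying on it.
        return {"MaxParallelExecutionSteps": self.max_parallel_execution_steps}


class StrictFakeClient:
    """Hypothetical fake client that rejects non-dict values."""

    def update_pipeline(self, **kwargs: Any) -> Dict[str, Any]:
        config = kwargs.get("ParallelismConfiguration")
        if config is not None and not isinstance(config, dict):
            raise TypeError(
                f"Invalid type for parameter ParallelismConfiguration: {type(config)}"
            )
        return kwargs


def forward_unchanged(client: StrictFakeClient, parallelism_config: Any) -> Dict[str, Any]:
    """Models the behaviour described in the issue: no conversion before the client call."""
    return client.update_pipeline(
        PipelineName="example-pipeline",  # placeholder name
        ParallelismConfiguration=parallelism_config,
    )


def raises_type_error(value: Any) -> bool:
    """Return True if forwarding the value is rejected by the fake client."""
    try:
        forward_unchanged(StrictFakeClient(), value)
    except TypeError:
        return True
    return False


# A test that passes a dict succeeds and hides the problem...
dict_form_fails = raises_type_error({"MaxParallelExecutionSteps": 5})
# ...while a test that passes the annotated type exposes it.
object_form_fails = raises_type_error(ParallelismConfiguration(5))
```

A test written against the annotated type, with a client fake that checks argument types, would have failed as soon as the object was forwarded unconverted.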

fahran-wallace, Dec 03 '25 17:12