
`metric_definitions` documentation claims arbitrary regex but requires metrics to be numeric.

Open mkakodka-amzn opened this issue 2 years ago • 0 comments

Describe the bug

The documentation for the SKLearn constructor says that custom metric_definitions can be supplied with a regex. I checked that this kwarg is passed to Framework, which then passes it to EstimatorBase.

In practice, the regex only captures numeric values, not strings. Either this should be mentioned in the documentation, or support for arbitrary strings should be added.
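The behavior described above can be approximated locally. The following is only a sketch under an assumption: that the service keeps a captured value only when it parses as a number, which is consistent with what this issue reports but is not taken from the SDK source. The helper `extract_numeric_metrics` is hypothetical and is not part of the SageMaker Python SDK.

```python
import re

# Same metric definitions as in the reproduction: capture everything between "=" and ";".
metric_definitions = [
    {"Name": name, "Regex": f"{name}=(.*?);"} for name in ["canary", "cat"]
]

# The two lines printed by cli.py.
log_lines = ["canary=1.01;", "cat=dog;"]


def extract_numeric_metrics(lines, definitions):
    """Hypothetical local stand-in for the metric scraper (assumed behavior)."""
    found = {}
    for line in lines:
        for definition in definitions:
            match = re.search(definition["Regex"], line)
            if match is None:
                continue
            try:
                # Assumption: only captures that parse as a number are kept.
                found[definition["Name"]] = float(match.group(1))
            except ValueError:
                # The regex matched, but the capture ("dog") is not numeric.
                pass
    return found


metrics = extract_numeric_metrics(log_lines, metric_definitions)
print(metrics)
```

Under that assumption both regexes match their log lines, but only `canary` survives as a metric, because `dog` cannot be converted to a number.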

To reproduce

cli.py

if __name__ == "__main__":
    # Numeric metric value.
    print("canary=1.01;")
    # String metric value.
    print("cat=dog;")

sagemaker.ipynb

import sagemaker
from sagemaker import get_execution_role
from sagemaker.sklearn.estimator import SKLearn

sagemaker_session = sagemaker.Session()
role = get_execution_role()

FRAMEWORK_VERSION = "1.0-1"
script_path = "cli.py"

# Settings shared by every estimator created below.
kwargs = dict(
    instance_type="ml.m5.large",
    instance_count=1,
    sagemaker_session=sagemaker_session,
    base_job_name="test-job-",
)

# One metric definition per name; each regex captures everything between "=" and ";".
base_model_gen = lambda: SKLearn(
    entry_point=script_path,
    framework_version=FRAMEWORK_VERSION,
    role=role,
    source_dir="./",
    metric_definitions=[dict(Name=m, Regex=f"{m}=(.*?);") for m in ["canary", "cat"]],
    **kwargs
)
base_model_gen().fit()

Expected behavior

The algorithm metrics should have displayed both canary and cat. Only canary is displayed.
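Until string values are supported or the limitation is documented, one possible workaround is to log a numeric code in place of the string so that the captured value always parses as a number. This is a minimal sketch, not an SDK feature; `CATEGORY_CODES` and `metric_line` are hypothetical names chosen for illustration.

```python
# Hypothetical mapping from category names to numeric codes.
CATEGORY_CODES = {"cat": 0.0, "dog": 1.0}


def metric_line(name, value):
    """Format a log line whose captured value always parses as a number."""
    if isinstance(value, str):
        # Replace the string with its numeric code before logging.
        value = CATEGORY_CODES[value]
    return f"{name}={value};"


lines = [metric_line("canary", 1.01), metric_line("cat", "dog")]
print("\n".join(lines))
```

The mapping has to be kept alongside the job so that the logged code can be translated back to its category when the metrics are read.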

Screenshots or logs

[image]

System information

  • SageMaker Python SDK version: '2.97.0'
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): SKLearn
  • Framework version: 1.0-1
  • Python version: 3.8.12
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N


mkakodka-amzn avatar Jul 01 '22 14:07 mkakodka-amzn