
[Bug Report] Model Evaluation in abalone HPO example, for 2 best models

Open edesz opened this issue 2 years ago • 4 comments

Link to the notebook

HPO Notebook: https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-pipelines/tabular/tuning-step/sagemaker-pipelines-tuning-step.ipynb

This is a really useful notebook (especially for beginners). Thanks for putting this together!

Describe the bug

I am trying to evaluate multiple models in the model evaluation step of the abalone pipeline in this notebook.

In this notebook, the top 2 models are created in the step before model evaluation (see best_model = Model(...) and second_best_model = Model(...)).

I want to evaluate both of these models. Currently, the notebook only evaluates the best model. There is a comment in the notebook

# This can be extended to evaluate multiple models from the HPO step

I am trying to modify the model evaluation step of the pipeline to do this.

To reproduce

In order to evaluate 2 models, I changed the following block of code

step_eval = ProcessingStep(
    name="EvaluateTopModel",
    processor=script_eval,
    inputs=[
        ProcessingInput(
            source=step_tuning.get_top_model_s3_uri(top_k=0, s3_bucket=model_bucket_key),
            destination="/opt/ml/processing/model",
        ),
        ProcessingInput(
            source=step_process.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri,
            destination="/opt/ml/processing/test",
        ),
    ],
    outputs=[
        ProcessingOutput(output_name="evaluation", source="/opt/ml/processing/evaluation"),
    ],
    code="evaluate.py",
    property_files=[evaluation_report],
    cache_config=cache_config,
)

I replaced the above block of code by the following

inputs=[
        ProcessingInput(
            source=step_tuning.get_top_model_s3_uri(top_k=0, s3_bucket=model_bucket_key),
            destination="/opt/ml/processing/model",
        ),
        ProcessingInput(
            source=step_tuning.get_top_model_s3_uri(top_k=1, s3_bucket=model_bucket_key),
            destination="/opt/ml/processing/model",
        ),

since top_k=1 retrieves the second-best model. In evaluate.py, I replaced

model_path = "/opt/ml/processing/model/model.tar.gz"
with tarfile.open(model_path) as tar:
    tar.extractall(path=".")
logger.debug("Loading xgboost model.")
model = pickle.load(open("xgboost-model", "rb"))

by

model1_path = "/opt/ml/processing/model/model.tar.gz"
with tarfile.open(model1_path) as tar:
    tar.extractall(path="./model1")

model2_path = "/opt/ml/processing/model/model.tar.gz"
with tarfile.open(model2_path) as tar:
    tar.extractall(path="./model2")

logger.debug("Loading xgboost model.")
model1 = pickle.load(open("model1/xgboost-model", "rb"))
model2 = pickle.load(open("model2/xgboost-model", "rb"))
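For reference, here is a runnable sketch of how evaluate.py could load two models once each tarball lands at its own path inside the container (the model1/model2 directory layout here is an assumption, not what the notebook currently does). The demo below simulates the two tuning-job artifacts in a temporary directory using only the standard library:

```python
import os
import pickle
import tarfile
import tempfile

def load_model(model_dir, extract_to):
    """Extract model.tar.gz from model_dir and unpickle the xgboost-model file."""
    with tarfile.open(os.path.join(model_dir, "model.tar.gz")) as tar:
        tar.extractall(path=extract_to)
    with open(os.path.join(extract_to, "xgboost-model"), "rb") as f:
        return pickle.load(f)

# --- demo: dummy artifacts standing in for the real tuning-job outputs ---
base = tempfile.mkdtemp()
for i, payload in enumerate(["best", "second-best"], start=1):
    model_dir = os.path.join(base, f"model{i}")
    os.makedirs(model_dir)
    artifact = os.path.join(model_dir, "xgboost-model")
    with open(artifact, "wb") as f:
        pickle.dump(payload, f)  # a real run would have an xgboost Booster here
    with tarfile.open(os.path.join(model_dir, "model.tar.gz"), "w:gz") as tar:
        tar.add(artifact, arcname="xgboost-model")

# Each tarball is extracted into its own directory, so nothing is overwritten.
model1 = load_model(os.path.join(base, "model1"), os.path.join(base, "x1"))
model2 = load_model(os.path.join(base, "model2"), os.path.join(base, "x2"))
print(model1, model2)  # best second-best
```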

Logs

I first tried to see whether I could simply change top_k=0 to top_k=1 (with no other changes anywhere) and get the pipeline to run successfully. When I run the pipeline by

  • replacing top_k=0 with top_k=1 in the pipeline step (with no other changes)
  • making no changes to the original evaluate.py

I get the following error in the Model Evaluation step

>>> execution = pipeline.start()
>>> execution.list_steps()

{'StepName': 'EvaluateTopModel',
  'StartTime': datetime.datetime(...),
  'EndTime': datetime.datetime(...),
  'StepStatus': 'Failed',
  'AttemptCount': 0,
  'FailureReason': 'ClientError: Cannot access S3 key.',
  'Metadata': ...}

This suggests to me that it cannot access the S3 key for top_k=1. This is confusing, since two models are clearly created in the earlier steps of the pipeline (step_create_first and step_create_second).

Second, when I run the pipeline with the exact modifications I have shown in the To reproduce section above

  • adding an input with k=1
  • changing evaluate.py as I described above

I get an error about not being able to handle duplicate keys. Presumably, it cannot handle destination="/opt/ml/processing/model" being the same in both step inputs (top_k=0 and top_k=1).
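If the collision really is the shared destination, giving each model its own container path would sidestep it. The sketch below uses plain dicts standing in for the two ProcessingInput objects so the idea is runnable here; the /opt/ml/processing/model1 and /opt/ml/processing/model2 paths and the S3 URIs are assumptions for illustration:

```python
# Hypothetical stand-ins for the two ProcessingInputs in step_eval. The real
# sources would come from step_tuning.get_top_model_s3_uri(top_k=0, ...) and
# step_tuning.get_top_model_s3_uri(top_k=1, ...).
inputs = [
    {
        "source": "s3://model-bucket/top-0/output/model.tar.gz",
        "destination": "/opt/ml/processing/model1",
    },
    {
        "source": "s3://model-bucket/top-1/output/model.tar.gz",
        "destination": "/opt/ml/processing/model2",
    },
]

# The duplicate-key failure corresponds to two inputs sharing one destination;
# with distinct paths per model the check below passes.
destinations = [i["destination"] for i in inputs]
print(len(destinations) == len(set(destinations)))  # True
```

With distinct destinations, evaluate.py would then read model1/model.tar.gz and model2/model.tar.gz from their respective directories instead of one shared /opt/ml/processing/model path.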

Question: I seem to be having trouble adding a second model to the model evaluation step of this pipeline. How can I modify it to evaluate the top 2 models?

edesz · Mar 14 '22 17:03