amazon-sagemaker-examples
fix- issues 2525: update Dockerfile for dask
Issue #, if available:
2525
Description of changes:
I updated the Dockerfile to install the latest version of dask. With the old Dockerfile, the Docker image for the processing job cannot be built; this change fixes that.
Testing done:
Done
Merge Checklist
Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.
- [x] I have read the CONTRIBUTING doc and adhered to the example notebook best practices
- [x] I have updated any necessary documentation, including READMEs
- [x] I have tested my notebook(s) and ensured it runs end-to-end
- [x] I have linted my notebook(s) and code using
tox -e black-format,black-nb-format
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-code-formatting
- Commit ID: 18de3c50f8b5cddfecabf62f14ae29eb2fe7fff4
- Result: SUCCEEDED
- Build Logs (available for 30 days)
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-link-check
- Commit ID: 18de3c50f8b5cddfecabf62f14ae29eb2fe7fff4
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-grammar
- Commit ID: 18de3c50f8b5cddfecabf62f14ae29eb2fe7fff4
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: amazon-sagemaker-examples-pr
- Commit ID: 18de3c50f8b5cddfecabf62f14ae29eb2fe7fff4
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-grammar
- Commit ID: 91470a50db11fe460ff0698df56018e3bb58859f
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-code-formatting
- Commit ID: 91470a50db11fe460ff0698df56018e3bb58859f
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-link-check
- Commit ID: 91470a50db11fe460ff0698df56018e3bb58859f
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: amazon-sagemaker-examples-pr
- Commit ID: 91470a50db11fe460ff0698df56018e3bb58859f
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-link-check
- Commit ID: a2e1da4ebd887b2a7d335125ba0104e43e627381
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-code-formatting
- Commit ID: a2e1da4ebd887b2a7d335125ba0104e43e627381
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: sagemaker-examples-grammar
- Commit ID: a2e1da4ebd887b2a7d335125ba0104e43e627381
- Result: SUCCEEDED
- Build Logs (available for 30 days)
AWS CodeBuild CI Report
- CodeBuild project: amazon-sagemaker-examples-pr
- Commit ID: a2e1da4ebd887b2a7d335125ba0104e43e627381
- Result: SUCCEEDED
- Build Logs (available for 30 days)
I tried to build the Docker image in SageMaker, and it failed because of many conflicting packages. How do I get your file changes?
You can get the files from https://github.com/ksmin23/amazon-sagemaker-examples/tree/fix-sagemaker_processing-issues-2525
Thanks! I have a question. I have multiple files in S3 that I'd like to preprocess and label encode. The example is great if you have a single dataset, but what if we have multiple files?
dask.dataframe can process multiple files; please check this URL: https://examples.dask.org/dataframes/01-data-access.html
So, if you would like to process multiple files in S3, you need to update preprocess.py to handle them.
I suggest looking at the following part of the sample code:
%%writefile preprocess.py
from __future__ import print_function, unicode_literals
import argparse
import json
import logging
......
if __name__ == "__main__":
    ......
    input_data_path = "s3://{}".format(
        os.path.join(
            script_args["s3_input_bucket"],
            script_args["s3_input_key_prefix"],
            "census-income.csv",  # TODO: need to be updated for multiple files
        )
    )
    ......
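As a rough sketch of what that TODO could look like, one option is to replace the fixed file name with a glob pattern, since `dd.read_csv` accepts a glob string and loads all matching files into a single dask DataFrame. The helper names `build_input_glob` and `read_all_csv`, the `*.csv` pattern, and the bucket and prefix values below are hypothetical and not part of the sample notebook; reading from S3 with dask also assumes `s3fs` is installed in the image.

```python
import posixpath


def build_input_glob(bucket, key_prefix, pattern="*.csv"):
    """Return an S3 glob such as s3://bucket/prefix/*.csv (hypothetical helper)."""
    # posixpath.join always uses "/" as the separator, which is what S3 URLs need.
    return "s3://" + posixpath.join(bucket, key_prefix.strip("/"), pattern)


def read_all_csv(input_glob):
    """Lazily load every file matching the glob into one dask DataFrame."""
    # Imported here so the path helper above works even without dask installed.
    import dask.dataframe as dd

    # dd.read_csv accepts a glob string; S3 access assumes s3fs is available.
    return dd.read_csv(input_glob)


# Hypothetical values standing in for script_args in preprocess.py.
script_args = {
    "s3_input_bucket": "my-bucket",
    "s3_input_key_prefix": "input/raw",
}

input_data_path = build_input_glob(
    script_args["s3_input_bucket"],
    script_args["s3_input_key_prefix"],
)
print(input_data_path)
```

In preprocess.py, the resulting `input_data_path` would then be passed to `dd.read_csv` in place of the single-file path. This only works as-is if all files share the same columns.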