aws-step-functions-data-science-sdk-python
Step Functions Data Science SDK for building machine learning (ML) workflows and pipelines on AWS
Currently, the `Workflow.create()` method does not accept a logging configuration parameter. Is there a way to enable logging using aws-step-functions-data-science-sdk-python? Reference: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/stepfunctions.html#SFN.Client.create_state_machine
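One workaround until the SDK exposes this: create the workflow with the SDK, then attach a logging configuration through boto3's `update_state_machine` call. A minimal sketch, assuming a pre-created CloudWatch log group and placeholder role/definition values:

```python
import boto3
from stepfunctions.steps import Chain, Pass
from stepfunctions.workflow import Workflow

# Minimal placeholder definition and role; replace with your real graph and role ARN.
definition = Chain([Pass("NoOp")])
role_arn = "arn:aws:iam::123456789012:role/StepFunctionsWorkflowExecutionRole"  # hypothetical

workflow = Workflow(name="MyWorkflow", definition=definition, role=role_arn)
state_machine_arn = workflow.create()  # create() returns the new state machine's ARN

# Attach a logging configuration with boto3, since the SDK does not expose one yet.
sfn = boto3.client("stepfunctions")
sfn.update_state_machine(
    stateMachineArn=state_machine_arn,
    loggingConfiguration={
        "level": "ALL",
        "includeExecutionData": True,
        "destinations": [
            {
                "cloudWatchLogsLogGroup": {
                    # Hypothetical log group ARN; create the log group beforehand.
                    "logGroupArn": "arn:aws:logs:us-east-1:123456789012:log-group:/aws/states/my-workflow:*"
                }
            }
        ],
    },
)
```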
Currently, [TrainingPipeline](https://github.com/aws/aws-step-functions-data-science-sdk-python/blob/b45b282592041d3c355f7cef492798bc3bf5415a/src/stepfunctions/template/pipeline/train.py#L95) uses the same instance type and count for both training and deployment. Different instance types and counts are desirable because the two workloads have different resource profiles.
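Until the template supports separate settings, a possible workaround is to assemble the equivalent steps by hand, so training takes its instance type from the estimator while the endpoint config gets its own. A sketch assuming an `estimator`, a training input `train_s3_uri`, and names chosen here for illustration:

```python
from stepfunctions.steps import Chain, TrainingStep, ModelStep, EndpointConfigStep, EndpointStep

# Training uses whatever instance type/count the estimator was built with.
training_step = TrainingStep(
    "Train",
    estimator=estimator,            # assumed to exist, e.g. built with a GPU training instance
    data={"train": train_s3_uri},   # assumed S3 training input
    job_name="my-training-job",     # hypothetical job name
)

model_step = ModelStep(
    "Save model",
    model=training_step.get_expected_model(),
    model_name="my-model",
)

# Serving gets its own (typically smaller) instance type and count.
endpoint_config_step = EndpointConfigStep(
    "Create endpoint config",
    endpoint_config_name="my-endpoint-config",
    model_name="my-model",
    initial_instance_count=1,
    instance_type="ml.m5.large",
)

endpoint_step = EndpointStep(
    "Deploy",
    endpoint_name="my-endpoint",
    endpoint_config_name="my-endpoint-config",
)

workflow_definition = Chain([training_step, model_step, endpoint_config_step, endpoint_step])
```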
Hello, there seems to be an issue if `stepfunctions.steps.states.State.output()` is used together with `result_path`. For example:

```python
lambda_state_first = LambdaStep(
    state_id="MyFirstLambdaStep",
    parameters={
        "FunctionName": "MakeApiCall",
        "Payload": {
            "input": "20192312"
        }
    },
    ...
```
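The excerpt above is cut off; for reference, a complete minimal sketch of the pattern being described, with a hypothetical second Lambda function, where the first step writes its result under `result_path` and a downstream step references it through `output()`:

```python
from stepfunctions.steps import Chain, LambdaStep

lambda_state_first = LambdaStep(
    state_id="MyFirstLambdaStep",
    parameters={
        "FunctionName": "MakeApiCall",          # function name from the report
        "Payload": {"input": "20192312"},
    },
    result_path="$.first_result",               # place the Lambda result under this key
)

lambda_state_second = LambdaStep(
    state_id="MySecondLambdaStep",
    parameters={
        "FunctionName": "ProcessApiCall",       # hypothetical follow-up function
        "Payload": lambda_state_first.output()["Payload"],
    },
)

definition = Chain([lambda_state_first, lambda_state_second])
```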
`ModelStep` does not handle `ModelPackage` correctly; we need branching logic to set the correct parameters when the model is an instance of `ModelPackage`. ModelPackage: https://github.com/aws/sagemaker-python-sdk/blob/1ff8bd623dc9a13cb38f8253d098a5fffee29833/src/sagemaker/model.py#L903
Dear collaborators, I am new to the AWS Step Functions Data Science SDK for Amazon SageMaker. I'm working with the "machine_learning_workflow_abalone" example from the SageMaker examples, and when I execute transform_step...
The hyperparameters set through `stepfunctions.template.pipeline.train.TrainingPipeline.execute(job_name=None, hyperparameters=None)` are not picked up during execution for DeepAR. Instead, hyperparameters need to be set using `estimator.set_hyperparameters(**hyperparameters)` before `estimator` is passed as an argument while instantiating...
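A sketch of the workaround described above, with a hypothetical DeepAR hyperparameter dict and an `estimator` assumed to be built elsewhere, setting the hyperparameters on the estimator before the pipeline is created rather than passing them to `execute()`:

```python
from stepfunctions.template.pipeline.train import TrainingPipeline

# Hypothetical DeepAR hyperparameters.
hyperparameters = {
    "time_freq": "D",
    "context_length": 30,
    "prediction_length": 30,
    "epochs": 100,
}

# Workaround: apply the hyperparameters on the estimator itself...
estimator.set_hyperparameters(**hyperparameters)

# ...before the estimator is handed to the pipeline.
pipeline = TrainingPipeline(
    estimator=estimator,
    role=workflow_execution_role,   # assumed Step Functions execution role ARN
    inputs=train_s3_uri,            # assumed S3 training input
    s3_bucket=bucket,               # assumed bucket for pipeline artifacts
)
pipeline.create()
pipeline.execute(job_name="deepar-training")  # no hyperparameters kwarg needed here
```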
Right now managing the parameters of the `iterator` to a `map` is challenging, because there's no way to use the input schema to ensure you're passing things to the right...
Extract from the notebook "machine_learning_workflow_abalone.ipynb". When adding tags to the following estimator:

```python
mes_tags = [{'key': 'cart', 'value': 'dataengineering'}]
xgb = sagemaker.estimator.Estimator(
    image_uris.retrieve("xgboost", region, "1.2-1"),
    sagemaker_execution_role,
    train_instance_count=1,
    train_instance_type="ml.m4.4xlarge",
    train_volume_size=5,
    output_path=bucket_path,
    ...
```
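If the failure is about the tag format (an assumption; the excerpt is cut off before the error), note that the SageMaker APIs expect capitalized `Key`/`Value` entries in each tag dictionary. A minimal sketch of the same estimator with that form, using the SageMaker SDK v2 argument names:

```python
import sagemaker
from sagemaker import image_uris

# Assumption: the error comes from the tag format; SageMaker tags use 'Key'/'Value'.
mes_tags = [{"Key": "cart", "Value": "dataengineering"}]

xgb = sagemaker.estimator.Estimator(
    image_uris.retrieve("xgboost", region, "1.2-1"),
    sagemaker_execution_role,
    instance_count=1,               # the train_* argument names were renamed in SageMaker SDK v2
    instance_type="ml.m4.4xlarge",
    volume_size=5,
    output_path=bucket_path,        # region, role, and bucket_path as defined in the notebook
    tags=mes_tags,
)
```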
https://aws.amazon.com/blogs/aws/step-functions-distributed-map-a-serverless-solution-for-large-scale-parallel-data-processing/ Can we add the Distributed Map state to the SDK? Thank you. --- This is a :rocket: Feature Request
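For context, Distributed Map is expressed in ASL as a `Map` state with an `ItemProcessor` whose `ProcessorConfig` mode is `DISTRIBUTED`. A sketch of the state the SDK would need to emit, shown here as a Python dict with a hypothetical S3 CSV item reader and a trivial inner state:

```python
# Sketch of the ASL a Distributed Map state produces; the bucket, key, and
# inner state are hypothetical placeholders.
distributed_map_state = {
    "Type": "Map",
    "ItemReader": {
        "Resource": "arn:aws:states:::s3:getObject",
        "ReaderConfig": {"InputType": "CSV", "CSVHeaderLocation": "FIRST_ROW"},
        "Parameters": {"Bucket": "my-bucket", "Key": "items.csv"},
    },
    "ItemProcessor": {
        "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
        "StartAt": "ProcessItem",
        "States": {
            "ProcessItem": {"Type": "Pass", "End": True},
        },
    },
    "MaxConcurrency": 100,
    "End": True,
}
```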
I'm trying to use execution inputs as container arguments for my processing job:

```python
execution_input = ExecutionInput(
    schema={
        "IngestaJobName": str,
        "PreprocessingJobName": str,
        "InferenceJobName": str,
        "Fecha": str,
    }
)
```
...
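The second code block in the report is cut off; the pattern being attempted presumably looks roughly like the sketch below (the processor, job names, and argument flag are hypothetical), where placeholders from the `ExecutionInput` are passed through `container_arguments`:

```python
from stepfunctions.inputs import ExecutionInput
from stepfunctions.steps import ProcessingStep

execution_input = ExecutionInput(
    schema={
        "IngestaJobName": str,
        "PreprocessingJobName": str,
        "InferenceJobName": str,
        "Fecha": str,
    }
)

# Hypothetical sketch: feed execution-input placeholders to the container as arguments.
preprocessing_step = ProcessingStep(
    "Preprocessing",
    processor=processor,                              # assumed SKLearnProcessor or similar
    job_name=execution_input["PreprocessingJobName"],
    container_arguments=["--fecha", execution_input["Fecha"]],
)
```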