sagemaker-pytorch-training-toolkit

model_fn is not recognized. SageMaker Studio template for model building, training, and deployment

Open babarory opened this issue 4 years ago • 1 comments

Hello everyone, I'm very new to SageMaker and I'm facing a strange issue that I can't solve.

My goal: I have created a CNN that I would like to train, build, and deploy in an MLOps pipeline with SageMaker.

First of all, I created a notebook instance in SageMaker in which I created a wasteClassification.ipynb notebook and a train.py file. The train.py file contains my neural network definition, some functions to train and save it, and several overridden functions: model_fn, predict_fn, and input_fn. In wasteClassification.ipynb I was able to create a PyTorch estimator, train the model, deploy the endpoint, and make predictions using the invoke_endpoint function without any issues.
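For readers unfamiliar with these hooks, here is a minimal sketch of what the three overridden functions in such a script usually look like. It is not the actual train.py from this issue: the file name model.pt, the TorchScript format, and the "inputs" JSON key are all assumptions made for illustration.

```python
import json
import os


def model_fn(model_dir):
    """Load the model that training saved into model_dir."""
    import torch  # imported here so the parsing helper below works without torch

    # Assumption: training exported a TorchScript model named "model.pt".
    model = torch.jit.load(os.path.join(model_dir, "model.pt"), map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, request_content_type):
    """Deserialize the request payload into nested Python lists."""
    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    # Assumption: the client sends {"inputs": [[...], ...]}.
    return json.loads(request_body)["inputs"]


def predict_fn(input_data, model):
    """Run a forward pass on the deserialized input."""
    import torch

    batch = torch.tensor(input_data, dtype=torch.float32)
    with torch.no_grad():
        return model(batch)
```

The serving container calls model_fn once when the model is loaded, then input_fn and predict_fn for each request, which is why all three must be visible to the inference container and not only to the training job.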

After that, I decided to create a pipeline to automate training, building, and deployment using SageMaker's new tooling for that. I created a SageMaker Studio project based on the template "MLOps template for model building, training, and deployment". This template provides two CodeCommit repos: modelbuild and modeldeploy. I only modified the modelbuild repo: I put my train.py script in the folder "/pipelines/abalone/" and edited the file "pipelines/abalone/pipeline.py", in which I created a PyTorch estimator linked to my train.py script. When the pipeline is launched, I can see in the training job logs that my model trains without any issue, and the final endpoint is created. But when I try to invoke the endpoint (invoke_endpoint), I get an error: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message " Please provide a model_fn implementation." This is strange because I did provide a model_fn implementation in my train.py file...

Do you have any idea how to solve this issue?
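One thing that may be worth checking, offered as an assumption and not as a confirmed diagnosis: the error suggests the serving container never loaded the script that defines model_fn. The SageMaker PyTorch serving containers can pick up an inference script stored under code/ inside model.tar.gz, so a sketch of building such an archive looks like the following. The helper name package_model and the file names model.pth and inference.py are hypothetical.

```python
import os
import tarfile


def package_model(model_path, script_path, output_path="model.tar.gz"):
    """Bundle model weights and an inference script into one archive.

    Layout produced (an assumed convention, check it against your container version):
        <model file name>
        code/inference.py
    """
    with tarfile.open(output_path, "w:gz") as tar:
        # Model artifact goes at the archive root.
        tar.add(model_path, arcname=os.path.basename(model_path))
        # Script defining model_fn / input_fn / predict_fn goes under code/.
        tar.add(script_path, arcname="code/inference.py")
    return output_path
```

For example, package_model("model.pth", "train.py") would place the training script, with its handler functions, at code/inference.py. Whether the pipeline template's model registration step passes an entry point to the deployed model is not something the post above establishes, so treat this only as one avenue to investigate.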

babarory avatar Apr 06 '21 07:04 babarory

@babarory Did you find the answer?

Soroush-aali-bagi avatar Oct 27 '23 21:10 Soroush-aali-bagi