PaddleCloud
[Feature] Add a `serve` command to start a serverless prediction serving endpoint
Users could run `paddlecloud serve -model-path xxx -scale 100 -cpu 1 -memory 8Gi -entry "infer.py"` to start a serverless URL endpoint that serves the model.
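For illustration, a request against the resulting endpoint might look like the sketch below; the host and the JSON payload shape are hypothetical, since this issue does not specify the endpoint's request format.

```bash
# Hypothetical request to the URL endpoint created by `paddlecloud serve`;
# the host (serve.example.com) and the payload shape are placeholders,
# not a documented API.
curl -X POST http://serve.example.com/ \
     -H "Content-Type: application/json" \
     -d '{"data": [[0.1, 0.2, 0.3]]}'
```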
Which inference type do we support, online or offline? If we set up online serving, we also need to add an ingress rule for the submitted serving job (see the sketch at the end of this thread).
Online, of course.
Offline inference can use the same method as training.
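For the online case, here is a minimal sketch of the kind of ingress rule that would have to be created for a submitted serving job. It assumes `paddlecloud serve` fronts the model replicas with a Kubernetes Service; the resource name (`paddle-serve-example`), host, and port (8866) are hypothetical, and the manifest uses the current `networking.k8s.io/v1` API.

```bash
# Hypothetical ingress rule routing external traffic to a serving Service;
# all names, the host, and the port below are placeholders, not resources
# actually created by `paddlecloud serve`.
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: paddle-serve-example
spec:
  rules:
  - host: serve.example.com          # hypothetical external host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: paddle-serve-example   # Service fronting the model replicas
            port:
              number: 8866               # hypothetical serving port
EOF
```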