faas-cli up/deploy failing right after build

Open vnourdin opened this issue 2 years ago • 3 comments

First of all, thanks a lot for all the great work done on this amazing tool. My current POC makes me pretty confident that faasd is exactly what I need for my web dev work.

Problem

Each time I deploy a function, either with faas-cli up or with faas-cli deploy right after a build, it fails with error:
Unexpected status: 500, message: error deleting container MY_FUNCTION, MY_FUNCTION, cannot delete running task MY_FUNCTION: failed precondition.

Workaround

If I run faas-cli deploy a second time, it succeeds.

This is OK when deploying from my computer, but I am setting up my CI/CD, and I am not comfortable deploying twice to work around this.
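For a CI/CD job, the "deploy a second time" workaround could be wrapped in a small retry helper. This is only a sketch of the workaround described above, not a fix for the underlying error; the `retry` function name, the attempt count and the delay are hypothetical choices, not part of faas-cli or faasd.

```shell
#!/bin/sh
# retry MAX DELAY CMD...  (hypothetical helper, not part of faas-cli)
# Runs CMD until it exits 0, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on success, 1 if every attempt failed.
retry() {
  max="$1"
  delay="$2"
  shift 2
  attempt=1
  while :; do
    # Run the command in the current shell; stop at the first success.
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Possible use in a CI step (assumed values: 3 attempts, 5 seconds apart):
#   retry 3 5 faas-cli deploy -f stack.yml
```

The helper only hides the failed first attempt, so the CI log will still show the "failed precondition" message on stderr before the successful retry.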

My server environment

  • Debian 11
  • 1 vCPU
  • 2 GB RAM
  • faasd version: 0.16.2 commit: b7be42e5ec47bc9a52eb3459b0f3084d61c55e58
  • containerd v1.6.4 212e8b6fa2f44b9c21b2798135fc6fb7c53efc16

vnourdin avatar Jul 27 '22 08:07 vnourdin

@vnourdin I tried to reproduce the issue, but could not. Could you briefly explain how you are deploying? Are there requests in flight while you are trying to deploy the function?

nitishkumar71 avatar Jul 31 '22 11:07 nitishkumar71

In fact, each time I try to deploy (with faas-cli up or deploy), it fails with the failed precondition error, and if I retry right after, it succeeds. This doesn't seem related to the build. There are no running functions when I deploy, as I only have one function deployed and call it by hand for testing. For the record, my faasd server is hosted on a VPS and I deploy from my computer.

vnourdin avatar Aug 04 '22 10:08 vnourdin

@welteki Can you please try to reproduce this issue?

nitishkumar71 avatar Aug 06 '22 15:08 nitishkumar71

I can't reproduce this and use faasd weekly, so suspect it's related to timeouts or your configuration / image size.

We'll need to see your stack.yml file and the image size found with ctr -n openfaas-fn

alexellis avatar Aug 18 '22 11:08 alexellis

Here is my stack.yml:

provider:
  name: openfaas
  gateway: https://my.gateway

functions:
  my-function:
    lang: faas-server
    handler: ./my-function
    image: registry.gitlab.com/vnourdin/my-project/my-function

And sudo ctr -n openfaas-fn images list tells me that the image size is 55.8 MiB.

I use a custom template, so maybe I missed something in my Node server's code? Here is my Dockerfile:

FROM --platform=linux/amd64 node:18-alpine

ENV AWS_AK=???
ENV AWS_SK=???

WORKDIR /usr/src/function
COPY function/package.json .
RUN npm i
COPY function/ .

WORKDIR /usr/src/
COPY index.js .

CMD ["node", "index.js"]

and my index.js contains a simple vanilla Node http server.

vnourdin avatar Aug 19 '22 09:08 vnourdin

If I'm interpreting your set-up correctly, I would be interested in what happens if you add a docker pull registry.gitlab.com/vnourdin/my-project/my-function between the ~build~ push and deploy steps.

rgee0 avatar Aug 19 '22 09:08 rgee0

I can't docker pull as I haven't installed Docker on the faasd host. Even without changing the image, the deploy fails, although the image is already on the host from the previous deployment...

vnourdin avatar Aug 19 '22 11:08 vnourdin

I was thinking more about the CI/CD steps. You're clearly (now) working interactively.

Does this look similar? https://github.com/alexellis/faas-containerd/issues/28

rgee0 avatar Aug 19 '22 11:08 rgee0

@vnourdin perhaps you can try this example? https://github.com/alexellis/expressjs-k8s

alexellis avatar Aug 20 '22 09:08 alexellis

Yes, I haven't yet set up my CI/CD. It does look similar: I have a task in the RUNNING state, the first deploy fails but the task becomes STOPPED, and then I can deploy.
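To observe the RUNNING to STOPPED transition around a deploy, the task state can be read from the containerd task list. The sketch below assumes the usual TASK / PID / STATUS column layout of `ctr tasks list`; `task_status` is a hypothetical helper name and `my-function` is the function name from the stack.yml above.

```shell
#!/bin/sh
# task_status NAME  (hypothetical helper)
# Reads `ctr tasks list` output on stdin and prints the STATUS column
# for the task called NAME. Prints nothing if the task is not listed.
task_status() {
  # NR > 1 skips the header row; columns are assumed to be TASK PID STATUS.
  awk -v name="$1" 'NR > 1 && $1 == name { print $3 }'
}

# Possible use on the faasd host, before and after a failed deploy:
#   sudo ctr -n openfaas-fn tasks list | task_status my-function
```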

vnourdin avatar Aug 20 '22 09:08 vnourdin

/set title: Question about updating functions via faas-cli

alexellis avatar Aug 20 '22 09:08 alexellis

faas-cli deploy is blocking with faasd, rather than asynchronous as it is with Kubernetes.

It could be that your Node.js image is huge and most of us use smaller images.

alex@am1 expressjs-k8s % faas-cli deploy -g 192.168.1.15:8080
Deploying: expressjs.
WARNING! You are not using an encrypted connection to the gateway, consider using HTTPS.

Is OpenFaaS deployed? Do you need to specify the --gateway flag?
Post "http://192.168.1.15:8080/system/functions": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Function 'expressjs' failed to deploy with status code: 500

I saw a timeout during the deployment, which I think may be set to 30 seconds for an initial, cold pull. alexellis2/service:0.4.1 is around 177MB in size.

The second time around, all the layers were cached, so it completed within 1-2 seconds:

alex@am1 expressjs-k8s % faas-cli deploy -g 192.168.1.15:8080
Deploying: expressjs.
WARNING! You are not using an encrypted connection to the gateway, consider using HTTPS.
Function expressjs already exists, attempting rolling-update.

Deployed. 200 OK.
URL: http://192.168.1.15:8080/function/expressjs

alex@am1 expressjs-k8s % 

I use faasd a lot, even with Node.js and haven't seen the error you had here.

So it could be due to the timeout you've set (if you've not shared that?)

You also don't appear to be using our templates or watchdog, which may make things worse for you, because you're going "off specification"

See also: https://github.com/openfaas/templates/tree/master/template/node17

alexellis avatar Aug 20 '22 09:08 alexellis

I have the same behavior with your expressjs example.

The initial deploy succeeds; it's only when a task is currently running that I can't deploy.

I haven't set any timeout.

I wanted to make my own template to understand how it works and be more comfortable with faasd, then maybe switch to your node17 template.

vnourdin avatar Aug 20 '22 09:08 vnourdin

I have tried to reduce the timeouts, as advised in https://github.com/alexellis/faas-containerd/issues/28, but it doesn't help:

ENV exec_timeout="1s"
ENV write_timeout="1s"
ENV read_timeout="1s"

vnourdin avatar Aug 20 '22 09:08 vnourdin

Check out the troubleshooting chapter of the faasd manual and see what logs you can find during the redeployment?

You may also have faasd set to "always pull" new images, which could be making it take a bit longer if you're on a slow internet connection.

I use faasd quite a lot, even with Node and generally, I've not had issues around this, or had them reported from other users.

Whatever we can learn about your changes and customisations will help us to help you. I'd also suggest moving to our template if you're having issues with your own.

Alex

alexellis avatar Aug 22 '22 14:08 alexellis

I'm not really comfortable being "forced" to buy the ebook to be able to complete my POC, it's kind of frustrating if I decide to go with another tool, but I understand it's a way of funding the product.

I don't know how this pull policy can be set, I haven't touched it. I'm on a VPS, so the network isn't a problem.

I have the same behavior with your express example, but not with my function running on the node17 template. I suppose it's coming from the missing watchdog. It's still curious if you don't have that problem on your side.

I'll buy the ebook and dig into the debugging part to see if I can get more details. I could go with the node17 template, but it involves reworking all my existing functions (from Netlify) and I would like to avoid that.

Thanks for your time!

vnourdin avatar Aug 22 '22 15:08 vnourdin

Hey @vnourdin no worries, we'll close this issue then.

Thanks for your interest in faasd.

Alex

alexellis avatar Aug 26 '22 10:08 alexellis