n8n icon indicating copy to clipboard operation
n8n copied to clipboard

`getWorkflowStaticData` doesn't always save

Open nihaals opened this issue 2 years ago • 32 comments

Describe the bug With some Code nodes, new $getWorkflowStaticData() aren't always read by future executions until a n8n restart. Future executions still (sometimes) read the previous value as if the new value wasn't being saved, but after a n8n restart it behaves as expected.

To Reproduce I haven't narrowed down exactly what's causing the issue. I have a few workflows that use getWorkflowStaticData, some of which have this issue while others don't. The steps:

  1. Create a workflow
  2. Add a Code node similar to:
const data = $getWorkflowStaticData("node")

const now = new Date().getTime()
const previous = data.previousTimestamp || "undefined"

// 10% chance of being `true`
const overwrite = Math.random() < 0.1

if (overwrite) {
    data.previousTimestamp = now
}

return [{ now: now, previous: previous, overwrite: overwrite }]
  1. Notice after an overwrite, on some executions, the value of previous is an older value while other executions it reads the correct value
  2. Restart n8n
  3. Notice every value of previous is correct again until another overwrite

Expected behavior The value of previous is always the now of the last execution where overwrite was true.

Environment (please complete the following information):

  • OS: Docker n8nio/n8n:0.231.3
  • n8n Version: v0.231.3
  • Node.js Version: Docker n8nio/n8n:0.231.3
  • Database system: SQLite
  • Operation mode: own

Additional context I've had this issue for a few months (when I started using n8n) but haven't managed to figure out solid reproduction steps. I just tried a simple schedule and Code node with the snippet above and couldn't reproduce it. Some of my workflows have this issue while others don't. The reason for the seemingly random example is to represent filtering based on changes of a remote which is what I've used Code nodes for mostly.

Example 1

An example that has never given me this issue:

const data = $getWorkflowStaticData("node")

const lastVersion = data.lastVersion
const out = []

for (const item of $input.all()) {
  if (lastVersion && lastVersion === item.json["title"]) {
    break
  }
  out.push(item)
}

const latestVersion = $input.first().json["title"]
data.lastVersion = latestVersion

out.reverse()
return out

Example 2

Another example that has never given me issues:

const data = $getWorkflowStaticData("node")

const lastTimestampProcessed = Number(data.lastTimestampProcessed || -1)

for (const tweet of $input.all()) {
  if (lastTimestampProcessed >= Number(tweet.json["created_at_timestamp"])) {
    break
  }
  ...
}

...

const latestTimestampProcessed = $input.first().json["created_at_timestamp"]
data.lastTimestampProcessed = latestTimestampProcessed

return out

Example 3

An example that has given me this issue (the input is a Sparkle Appcast):

const data = $getWorkflowStaticData("node")

const lastProcessedTimestamp = Number(data.lastProcessedTimestamp || -1)
const out = []

for (const post of $input.all()) {
  // Keep all posts newer than `lastProcessedTimestamp`
  if (lastProcessedTimestamp >= Number(post.json["timestamp"])) {
    continue
  }
  out.push(post)
}

out.sort((a, b) => (Number(a.json["timestamp"]) > Number(b.json["timestamp"])) ? 1 : -1)

const latestProcessedTimestamp = Math.max(...$input.all().map((post) => Number(post.json["timestamp"])))
data.lastProcessedTimestamp = latestProcessedTimestamp

return out

Example 4

An example that has given me this issue every time there's been an update:

const data = $getWorkflowStaticData("node")

const lastProcessedTimestamp = Number(data.lastProcessedTimestamp || -1)
$execution.customData.set("last_processed_timestamp", lastProcessedTimestamp.toString())
const out = []

for (const post of $input.all()) {
  // Keep all posts newer than `lastProcessedTimestamp`
  if (lastProcessedTimestamp >= Number(post.json["timestamp"])) {
    continue
  }
  // Debug
  post.json["last_processed_timestamp"] = lastProcessedTimestamp
  out.push(post)
}

out.sort((a, b) => (Number(a.json["timestamp"]) > Number(b.json["timestamp"]) ? 1 : -1))

const latestProcessedTimestamp = Math.max(...$input.all().map((post) => Number(post.json["timestamp"])))
data.lastProcessedTimestamp = latestProcessedTimestamp

// Debug
return out.map((post) => {
  post.json["latest_processed_timestamp"] = latestProcessedTimestamp
  return post
})

The result of this example has been that after a new post has been made, around 12.5-50% of the time, the previous value of lastProcessedTimestamp has been read, causing the same set of post to be returned as an older execution. I have confirmed that latestProcessedTimestamp is a different value through the debugs, but the next execution manages to read an older value. I have also confirmed that the input to the Code node has been the same during these executions. Restarting n8n has immediately fixed the problem until another post has been added to the input.

nihaals avatar Jun 28 '23 23:06 nihaals

Hey @nihaals

Just to double check, even though there are code node snippets in this report none of them actually reproduce the issue reliably?

Joffcom avatar Jun 29 '23 00:06 Joffcom

Essentially. With example 4, I've had the issue happen every time there's a new post, but even after that the actual incorrect read hasn't happened every time (the workflow is triggered every 3 hours and it happens around once a day every day after a new post is made until I restart n8n).

I'm currently working on reproducing the issue in a new n8n instance so I can share a complete database and workflow, which will hopefully help if comparing the examples above that do and don't trigger the issue doesn't help narrow down the cause. I know the original report is a bit vague and doesn't include solid repro steps but I have struggled to narrow down really anything with this issue despite having some workflows that have the issue and others that don't while having it occur for months.

nihaals avatar Jun 29 '23 00:06 nihaals

With that last example what is the source of the data? If you can share that maybe we can use that to reproduce it as it sounds like it happens all the time.

Joffcom avatar Jun 29 '23 06:06 Joffcom

This is the workflow I'm aiming to reproduce with in the new instance (which exactly matches a workflow I've had issues with other than the RSS URL and an extra node removed after the Code node). I've also confirmed that the feed hasn't been changing between the executions by checking the executions in n8n. The problem with this case is that it relies on the external source updating their posts. I haven't tried it with a URL I control, but something to note is the schedule being 3 hours seems to have an effect. My workflows with schedules of 15 minutes don't seem to have this issue which is even stranger.

Workflow
{
  "name": "Cloudflare Blog",
  "nodes": [
    {
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "hours",
              "hoursInterval": 3
            }
          ]
        }
      },
      "id": "87ac14e5-03e4-4ebd-8ee4-f5c8d0a0224c",
      "name": "Schedule Trigger",
      "type": "n8n-nodes-base.scheduleTrigger",
      "typeVersion": 1,
      "position": [
        640,
        380
      ]
    },
    {
      "parameters": {
        "url": "https://blog.cloudflare.com/rss/"
      },
      "id": "7f26b475-1ecd-433a-9f38-9224df15e364",
      "name": "RSS Read",
      "type": "n8n-nodes-base.rssFeedRead",
      "typeVersion": 1,
      "position": [
        820,
        380
      ]
    },
    {
      "parameters": {
        "value": "={{ $json[\"isoDate\"] }}",
        "dataPropertyName": "timestamp",
        "toFormat": "X",
        "options": {}
      },
      "id": "4b713028-e6b1-4124-9364-1010217b0475",
      "name": "Date & Time",
      "type": "n8n-nodes-base.dateTime",
      "typeVersion": 1,
      "position": [
        1000,
        380
      ]
    },
    {
      "parameters": {
        "jsCode": "const data = $getWorkflowStaticData(\"node\")\n\nconst lastProcessedTimestamp = Number(data.lastProcessedTimestamp || -1)\nconst out = []\n\nfor (const post of $input.all()) {\n  // Keep all posts newer than `lastProcessedTimestamp`\n  if (lastProcessedTimestamp >= Number(post.json[\"timestamp\"])) {\n    continue\n  }\n  out.push(post)\n}\n\n// Don't assume feed is in chronological order\n// Sort in ascending order\nout.sort((a, b) => (Number(a.json[\"timestamp\"]) > Number(b.json[\"timestamp\"])) ? 1 : -1)\n\n// Don't depend on `out` as it may be e.g. empty\n// Set last in case there's an error\nconst latestProcessedTimestamp = Math.max(...$input.all().map((post) => Number(post.json[\"timestamp\"])))\ndata.lastProcessedTimestamp = latestProcessedTimestamp\n\nreturn out"
      },
      "id": "1df62847-b151-451e-a5d5-6cf2643fa625",
      "name": "Code",
      "type": "n8n-nodes-base.code",
      "typeVersion": 1,
      "position": [
        1180,
        380
      ]
    }
  ],
  "pinData": {},
  "connections": {
    "Schedule Trigger": {
      "main": [
        [
          {
            "node": "RSS Read",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "RSS Read": {
      "main": [
        [
          {
            "node": "Date & Time",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Date & Time": {
      "main": [
        [
          {
            "node": "Code",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": true,
  "settings": {},
  "versionId": "101557a7-7ffd-437d-8066-fb30a35116e7",
  "id": "7Yf8gFcH47Ts5d0M",
  "meta": {
    "instanceId": "0000000000000000000000000000000000000000000000000000000000000000"
  },
  "tags": []
}

nihaals avatar Jun 29 '23 13:06 nihaals

I've now added a possible MWE to my main instance, this time with a 3 hour schedule to see if that's the cause.

nihaals avatar Jun 29 '23 14:06 nihaals

I have just set up your workflow but used the WTF1 RSS feed, As it is a race week there should be a lot of posts. When the issue triggers what will I be looking for in the output?

Joffcom avatar Jun 29 '23 14:06 Joffcom

  1. The first execution will return all posts
  2. The second execution (assuming no new post was made) will return no posts
  3. When a new post is made, that post is returned
  4. That post should not be returned again
  5. Randomly, that post will be returned again
  6. Restarting n8n stops the post from being returned again

In my case, new posts aren't made every day, so I'm not sure how this will work with the latest timestamp changing more often, but the issue may still reproduce. Based on the schedule having an impact, this may be something specific with my instance, but I assume even with higher load, random old values shouldn't be returned.

nihaals avatar Jun 29 '23 14:06 nihaals

Hey @nihaals,

I have not yet seen this happen in my test set up 🤔 I am wondering if maybe it is too active and I need a feed that doesn't post as often.

Joffcom avatar Jul 10 '23 13:07 Joffcom

I messed something up with my test instance so it hasn't been running the whole time but I'll see if I can reproduce it on this entirely separate instance (which might take a while). The correlation between the schedule and the bug occurring suggests to me that I might not be able to and it may be related to instance load, but I'm not sure how to debug that and I don't have that many workflows. I might also try moving some of the workflows exactly to this other instance (they generally fire webhooks so I removed that node) to see if that somehow makes a difference.

nihaals avatar Jul 10 '23 14:07 nihaals

Hey @nihaals,

If it helps the instance I have been running it on does around 600 executions a day, Not a lot but there is a bit of load going through it.

Joffcom avatar Jul 10 '23 15:07 Joffcom

Update, I just transferred my instance to another server with more resources and still reproduced the issue but wasn't able to reproduce it on the new instance. I'm now trying with more workflows on the new instance to see if it's due to multiple workflows being triggered at the same time.

nihaals avatar Jul 16 '23 20:07 nihaals

It could be if the same workflow launches multiple times at once, most of my workflows are schedules / polling so I would have excepted to see this by now with the test workflows.

Most of my own workflows moved away from static data though as it has some known limitations so I use Redis to store my static data.

Joffcom avatar Jul 16 '23 21:07 Joffcom

It could be if the same workflow launches multiple times at once

The affected workflows are scheduled every 3 hours and usually take less than 10 seconds to execute, could this still be the cause?

Most of my own workflows moved away from static data though as it has some known limitations so I use Redis to store my static data.

Ah I didn't realise n8n had Redis nodes and didn't think of that, although if it's possible to use static data then it feels nicer and more integrated, especially as I don't already have a Redis instance running. Are there docs on the limitations of static data? The only one I'm currently aware of is it not being usable from manual executions, which in combination with not being able to easily inspect the data is quite annoying (and is making running a persistent Redis instance tempting).

nihaals avatar Jul 17 '23 03:07 nihaals

I seem to have already reproduced the issue on the new instance which was a lot faster than I expected, I should be able to get a MWE soon and have an idea of what might be the cause.

nihaals avatar Jul 17 '23 03:07 nihaals

I haven't been able to get a MWE but I have the database of the new instance I managed to reproduce the issue with:

repro.zip

It hit a Discord rate limit since I didn't limit how many messages were going to be sent, but the execution logs for each workflow still show a mix of working correctly and sometimes not setting properly. I created the workflow, duplicated it 9 times and then restarted n8n so the workflows would be triggered at the same time. The Discord webhook has been deleted. I used n8nio/n8n:0.236.2. I've also confirmed that the input data to the Code nodes are identical (so it's not that one of the posts' date has changed).

nihaals avatar Jul 17 '23 16:07 nihaals

This is the simplest repro case I've found (notice execution #​5 and #​18 for example):

repro.zip

nihaals avatar Jul 17 '23 21:07 nihaals

I can also reproduce on v1.0.5.

nihaals avatar Jul 25 '23 17:07 nihaals

@nihaals I would expect that to be the case as I have not yet made the dev ticket to get it fixed. I will get that created tomorrow morning

Joffcom avatar Jul 25 '23 18:07 Joffcom

In case anyone finds this issue, I tried setting a slight offset on the schedule (instead of being on the hour, 10 minutes later) for a single workflow and that fixed the problem for that workflow. I assume setting all of them to the offset would cause the issue to occur again but I have one workflow that particularly causes issues with it repeating so it helped alleviate the impact.

nihaals avatar Aug 07 '23 06:08 nihaals

Hey @nihaals,

That is a good find.

Joffcom avatar Aug 07 '23 07:08 Joffcom

Also should this have https://github.com/n8n-io/n8n/labels/in%20linear?

nihaals avatar Aug 07 '23 16:08 nihaals

@nihaals it will when I create the ticket in Linear, Fallen a bit behind on that while working on other issues.

Joffcom avatar Aug 07 '23 17:08 Joffcom

I can confirm, there's something very wrong with how getWorkflowStaticData writes/reads the data. I have simple Code node with state and sometimes I get wrong data when reading from the state via getWorkflowStaticData.

Kagami avatar Aug 15 '23 19:08 Kagami

Internal ticket is N8N-6831

Joffcom avatar Aug 16 '23 10:08 Joffcom

I can confirm, there's something very wrong with how getWorkflowStaticData writes/reads the data. I have simple Code node with state and sometimes I get wrong data when reading from the state via getWorkflowStaticData.

Out of interest, do you also have a few workflows being triggered at the same time or are you getting it from individual runs?

nihaals avatar Aug 16 '23 16:08 nihaals

do you also have a few workflows being triggered at the same time or are you getting it from individual runs?

I have one workflow which is run every 3 hours by Schedule Trigger. And sometimes getWorkflowStaticData reads back the wrong value, probably the previous one, as you described.

Kagami avatar Aug 16 '23 18:08 Kagami

Interesting, I only reproduced it with exactly 3 hours too

nihaals avatar Aug 16 '23 18:08 nihaals

Oh, didn't notice it in your report, but that might be somehow related (although I don't understand how 😄). I had another workflow which run every 2 minutes and also used getWorkflowStaticData but I don't remember any issues with it.

I will change schedule to 1 hour to see if that reproduces again 👍

Kagami avatar Aug 16 '23 18:08 Kagami

Actually using a 10 minute offset didn't end up fixing the issue and it happened again. I'm going to try keeping it on 2 hours instead. @Kagami Did the issue continue with it set to 1 hour?

nihaals avatar Aug 26 '23 04:08 nihaals

Did the issue continue with it set to 1 hour?

Update of my workflow happens rarely so I can't say yet for sure. I will post if issue happens again.

Kagami avatar Aug 26 '23 11:08 Kagami