`getWorkflowStaticData` doesn't always save
Describe the bug
With some Code nodes, new values written via $getWorkflowStaticData() aren't always read by future executions until n8n is restarted. Future executions still (sometimes) read the previous value as if the new value wasn't saved, but after an n8n restart it behaves as expected.
To Reproduce
I haven't narrowed down exactly what's causing the issue. I have a few workflows that use getWorkflowStaticData, some of which have this issue while others don't. The steps:
- Create a workflow
- Add a Code node similar to:
const data = $getWorkflowStaticData("node")
const now = new Date().getTime()
const previous = data.previousTimestamp || "undefined"

// 10% chance of being `true`
const overwrite = Math.random() < 0.1
if (overwrite) {
  data.previousTimestamp = now
}

return [{ now: now, previous: previous, overwrite: overwrite }]
- Notice that after an overwrite, on some executions the value of previous is an older value, while on other executions it reads the correct value
- Restart n8n
- Notice that every value of previous is correct again until another overwrite
Expected behavior
The value of previous is always the now of the last execution where overwrite was true.
Environment (please complete the following information):
- OS: Docker (n8nio/n8n:0.231.3)
- n8n Version: v0.231.3
- Node.js Version: Docker (n8nio/n8n:0.231.3)
- Database system: SQLite
- Operation mode: own
Additional context
I've had this issue for a few months (since I started using n8n) but haven't managed to figure out solid reproduction steps. I just tried a simple schedule and Code node with the snippet above and couldn't reproduce it. Some of my workflows have this issue while others don't. The reason for the seemingly random example is to represent filtering based on changes to a remote source, which is what I've mostly used Code nodes for.
Example 1
An example that has never given me this issue:
const data = $getWorkflowStaticData("node")
const lastVersion = data.lastVersion

const out = []
for (const item of $input.all()) {
  // Stop once we reach the item seen in the previous execution
  if (lastVersion && lastVersion === item.json["title"]) {
    break
  }
  out.push(item)
}

const latestVersion = $input.first().json["title"]
data.lastVersion = latestVersion

out.reverse()
return out
Example 2
Another example that has never given me issues:
const data = $getWorkflowStaticData("node")
const lastTimestampProcessed = Number(data.lastTimestampProcessed || -1)

for (const tweet of $input.all()) {
  if (lastTimestampProcessed >= Number(tweet.json["created_at_timestamp"])) {
    break
  }
  ...
}
...

const latestTimestampProcessed = $input.first().json["created_at_timestamp"]
data.lastTimestampProcessed = latestTimestampProcessed

return out
Example 3
An example that has given me this issue (the input is a Sparkle Appcast):
const data = $getWorkflowStaticData("node")
const lastProcessedTimestamp = Number(data.lastProcessedTimestamp || -1)

const out = []
for (const post of $input.all()) {
  // Keep all posts newer than `lastProcessedTimestamp`
  if (lastProcessedTimestamp >= Number(post.json["timestamp"])) {
    continue
  }
  out.push(post)
}

out.sort((a, b) => (Number(a.json["timestamp"]) > Number(b.json["timestamp"])) ? 1 : -1)

const latestProcessedTimestamp = Math.max(...$input.all().map((post) => Number(post.json["timestamp"])))
data.lastProcessedTimestamp = latestProcessedTimestamp

return out
Example 4
An example that has given me this issue every time there's been an update:
const data = $getWorkflowStaticData("node")
const lastProcessedTimestamp = Number(data.lastProcessedTimestamp || -1)
$execution.customData.set("last_processed_timestamp", lastProcessedTimestamp.toString())

const out = []
for (const post of $input.all()) {
  // Keep all posts newer than `lastProcessedTimestamp`
  if (lastProcessedTimestamp >= Number(post.json["timestamp"])) {
    continue
  }
  // Debug
  post.json["last_processed_timestamp"] = lastProcessedTimestamp
  out.push(post)
}

out.sort((a, b) => (Number(a.json["timestamp"]) > Number(b.json["timestamp"]) ? 1 : -1))

const latestProcessedTimestamp = Math.max(...$input.all().map((post) => Number(post.json["timestamp"])))
data.lastProcessedTimestamp = latestProcessedTimestamp

// Debug
return out.map((post) => {
  post.json["latest_processed_timestamp"] = latestProcessedTimestamp
  return post
})
The result of this example has been that after a new post has been made, around 12.5-50% of the time, the previous value of lastProcessedTimestamp has been read, causing the same set of posts to be returned as in an older execution. I have confirmed through the debug fields that latestProcessedTimestamp is set to a different value, yet the next execution manages to read an older value. I have also confirmed that the input to the Code node has been the same during these executions. Restarting n8n has immediately fixed the problem until another post has been added to the input.
Hey @nihaals
Just to double check, even though there are code node snippets in this report none of them actually reproduce the issue reliably?
Essentially. With example 4, I've had the issue happen every time there's a new post, but even after that the actual incorrect read hasn't happened every time (the workflow is triggered every 3 hours and it happens around once a day every day after a new post is made until I restart n8n).
I'm currently working on reproducing the issue in a new n8n instance so I can share a complete database and workflow, which will hopefully help if comparing the examples above (those that do and don't trigger the issue) isn't enough to narrow down the cause. I know the original report is a bit vague and doesn't include solid repro steps, but I've struggled to narrow anything down despite having had the issue for months across some workflows but not others.
With that last example, what is the source of the data? If you can share that, maybe we can use it to reproduce the issue, as it sounds like it happens all the time.
This is the workflow I'm aiming to reproduce with in the new instance (it exactly matches a workflow I've had issues with, apart from the RSS URL and an extra node after the Code node that I removed). I've also confirmed that the feed hasn't been changing between the executions by checking the executions in n8n. The problem with this case is that it relies on the external source updating their posts. I haven't tried it with a URL I control, but something to note is that the 3-hour schedule seems to have an effect: my workflows with 15-minute schedules don't seem to have this issue, which is even stranger.
Workflow
{
"name": "Cloudflare Blog",
"nodes": [
{
"parameters": {
"rule": {
"interval": [
{
"field": "hours",
"hoursInterval": 3
}
]
}
},
"id": "87ac14e5-03e4-4ebd-8ee4-f5c8d0a0224c",
"name": "Schedule Trigger",
"type": "n8n-nodes-base.scheduleTrigger",
"typeVersion": 1,
"position": [
640,
380
]
},
{
"parameters": {
"url": "https://blog.cloudflare.com/rss/"
},
"id": "7f26b475-1ecd-433a-9f38-9224df15e364",
"name": "RSS Read",
"type": "n8n-nodes-base.rssFeedRead",
"typeVersion": 1,
"position": [
820,
380
]
},
{
"parameters": {
"value": "={{ $json[\"isoDate\"] }}",
"dataPropertyName": "timestamp",
"toFormat": "X",
"options": {}
},
"id": "4b713028-e6b1-4124-9364-1010217b0475",
"name": "Date & Time",
"type": "n8n-nodes-base.dateTime",
"typeVersion": 1,
"position": [
1000,
380
]
},
{
"parameters": {
"jsCode": "const data = $getWorkflowStaticData(\"node\")\n\nconst lastProcessedTimestamp = Number(data.lastProcessedTimestamp || -1)\nconst out = []\n\nfor (const post of $input.all()) {\n // Keep all posts newer than `lastProcessedTimestamp`\n if (lastProcessedTimestamp >= Number(post.json[\"timestamp\"])) {\n continue\n }\n out.push(post)\n}\n\n// Don't assume feed is in chronological order\n// Sort in ascending order\nout.sort((a, b) => (Number(a.json[\"timestamp\"]) > Number(b.json[\"timestamp\"])) ? 1 : -1)\n\n// Don't depend on `out` as it may be e.g. empty\n// Set last in case there's an error\nconst latestProcessedTimestamp = Math.max(...$input.all().map((post) => Number(post.json[\"timestamp\"])))\ndata.lastProcessedTimestamp = latestProcessedTimestamp\n\nreturn out"
},
"id": "1df62847-b151-451e-a5d5-6cf2643fa625",
"name": "Code",
"type": "n8n-nodes-base.code",
"typeVersion": 1,
"position": [
1180,
380
]
}
],
"pinData": {},
"connections": {
"Schedule Trigger": {
"main": [
[
{
"node": "RSS Read",
"type": "main",
"index": 0
}
]
]
},
"RSS Read": {
"main": [
[
{
"node": "Date & Time",
"type": "main",
"index": 0
}
]
]
},
"Date & Time": {
"main": [
[
{
"node": "Code",
"type": "main",
"index": 0
}
]
]
}
},
"active": true,
"settings": {},
"versionId": "101557a7-7ffd-437d-8066-fb30a35116e7",
"id": "7Yf8gFcH47Ts5d0M",
"meta": {
"instanceId": "0000000000000000000000000000000000000000000000000000000000000000"
},
"tags": []
}
I've now added a possible MWE to my main instance, this time with a 3-hour schedule to see if that's the cause.
I have just set up your workflow but used the WTF1 RSS feed. As it is a race week, there should be a lot of posts. When the issue triggers, what will I be looking for in the output?
- The first execution will return all posts
- The second execution (assuming no new post was made) will return no posts
- When a new post is made, that post is returned
- That post should not be returned again
- Randomly, that post will be returned again
- Restarting n8n stops the post from being returned again
In my case, new posts aren't made every day, so I'm not sure how this will work with the latest timestamp changing more often, but the issue may still reproduce. Based on the schedule having an impact, this may be something specific to my instance, but I assume even with higher load, random old values shouldn't be returned.
Hey @nihaals,
I have not yet seen this happen in my test setup 🤔 I am wondering if maybe it is too active and I need a feed that doesn't post as often.
I messed something up with my test instance so it hasn't been running the whole time, but I'll see if I can reproduce it on this entirely separate instance (which might take a while). The correlation between the schedule and the bug occurring suggests to me that I might not be able to, and that it may be related to instance load, but I'm not sure how to debug that and I don't have that many workflows. I might also try moving some of the affected workflows to this other instance exactly as they are (they generally fire webhooks, so I removed that node) to see if that somehow makes a difference.
Hey @nihaals,
If it helps, the instance I have been running it on does around 600 executions a day. Not a lot, but there is a bit of load going through it.
Update: I just transferred my instance to another server with more resources and still reproduced the issue, but wasn't able to reproduce it on the new instance. I'm now trying with more workflows on the new instance to see if it's due to multiple workflows being triggered at the same time.
It could be if the same workflow launches multiple times at once. Most of my workflows are schedules/polling, so I would have expected to see this by now with the test workflows.
Most of my own workflows have moved away from static data though, as it has some known limitations, so I use Redis to store my static data.
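For illustration, a rough sketch of what that can look like with the Example 3 logic: a Redis node before the Code node reads the stored timestamp, the Code node uses it, and a Redis node afterwards writes the new value back. The "Redis Get" node name and the value field are assumptions about how the surrounding workflow is wired up, not something from this report:

// Sketch only: assumes a preceding node named "Redis Get" that reads the stored
// timestamp and exposes it as `json.value` (the node name and field name are
// assumptions about the surrounding workflow).
const stored = $("Redis Get").first().json.value
const lastProcessedTimestamp = Number(stored ?? -1)

const out = []
for (const post of $input.all()) {
  // Keep all posts newer than `lastProcessedTimestamp`
  if (lastProcessedTimestamp >= Number(post.json["timestamp"])) {
    continue
  }
  out.push(post)
}

// Sort in ascending order
out.sort((a, b) => Number(a.json["timestamp"]) - Number(b.json["timestamp"]))

// Pass the new value through so a downstream Redis node can persist it with a Set operation.
const latestProcessedTimestamp = Math.max(...$input.all().map((post) => Number(post.json["timestamp"])))
return out.map((post) => {
  post.json["latest_processed_timestamp"] = latestProcessedTimestamp
  return post
})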
It could be if the same workflow launches multiple times at once
The affected workflows are scheduled every 3 hours and usually take less than 10 seconds to execute; could this still be the cause?
Most of my own workflows have moved away from static data though, as it has some known limitations, so I use Redis to store my static data.
Ah, I didn't realise n8n had Redis nodes and didn't think of that, although if it's possible to use static data then it feels nicer and more integrated, especially as I don't already have a Redis instance running. Are there docs on the limitations of static data? The only one I'm currently aware of is that it isn't usable from manual executions, which in combination with not being able to easily inspect the data is quite annoying (and is making running a persistent Redis instance tempting).
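As a side note on inspecting it, a minimal sketch I'd consider is temporarily adding something like the snippet below at the top of the affected Code node so the current static data is returned as output ("node"-scoped static data belongs to the node that owns it, so dumping it from a separate node would only show that node's own data):

// Minimal inspection sketch (assumption: pasted temporarily at the top of the
// affected Code node, since "node"-scoped static data is specific to that node).
// Returning it makes the current contents visible in the execution output.
// As noted above, static data isn't usable from manual executions.
const debugData = $getWorkflowStaticData("node")
return [{ json: { staticData: debugData } }]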
I seem to have already reproduced the issue on the new instance, which was a lot faster than I expected. I should be able to get an MWE soon and have an idea of what might be the cause.
I haven't been able to get an MWE, but I have the database of the new instance I managed to reproduce the issue with:
It hit a Discord rate limit since I didn't limit how many messages were going to be sent, but the execution logs for each workflow still show a mix of the value being read correctly and sometimes not being set properly. I created the workflow, duplicated it 9 times and then restarted n8n so the workflows would be triggered at the same time. The Discord webhook has been deleted. I used n8nio/n8n:0.236.2. I've also confirmed that the input data to the Code nodes is identical (so it's not that one of the posts' dates has changed).
I can also reproduce on v1.0.5.
@nihaals I would expect that to be the case, as I have not yet made the dev ticket to get it fixed. I will get that created tomorrow morning.
In case anyone finds this issue: I tried setting a slight offset on the schedule (10 minutes past the hour instead of on the hour) for a single workflow, and that fixed the problem for that workflow. I assume setting all of them to the same offset would cause the issue to occur again, but I have one workflow that particularly causes issues with repeats, so it helped alleviate the impact.
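For reference, expressed as a JS object, the Schedule Trigger rule with the offset would look roughly like the sketch below. I believe the hours interval stores the "Trigger at Minute" option as triggerAtMinute, but treat the exact field name as an assumption and check it against your node version:

// Sketch of the Schedule Trigger rule with a 10-minute offset, written as a JS
// object for readability. "triggerAtMinute" is an assumption about how the
// node stores its "Trigger at Minute" option; verify against your node version.
const rule = {
  interval: [
    {
      field: "hours",
      hoursInterval: 3,
      triggerAtMinute: 10,
    },
  ],
}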
Hey @nihaals,
That is a good find.
Also should this have https://github.com/n8n-io/n8n/labels/in%20linear?
@nihaals it will when I create the ticket in Linear. I've fallen a bit behind on that while working on other issues.
I can confirm there's something very wrong with how getWorkflowStaticData writes/reads the data. I have a simple Code node with state, and sometimes I get wrong data when reading the state via getWorkflowStaticData.
Internal ticket is N8N-6831
I can confirm there's something very wrong with how getWorkflowStaticData writes/reads the data. I have a simple Code node with state, and sometimes I get wrong data when reading the state via getWorkflowStaticData.
Out of interest, do you also have a few workflows being triggered at the same time or are you getting it from individual runs?
do you also have a few workflows being triggered at the same time or are you getting it from individual runs?
I have one workflow which is run every 3 hours by a Schedule Trigger, and sometimes getWorkflowStaticData reads back the wrong value, probably the previous one, as you described.
Interesting, I only reproduced it with exactly 3 hours too
Oh, I didn't notice it in your report, but that might somehow be related (although I don't understand how 😄). I had another workflow which ran every 2 minutes and also used getWorkflowStaticData, but I don't remember any issues with it.
I will change the schedule to 1 hour to see if that reproduces it again 👍
Actually, using a 10-minute offset didn't end up fixing the issue and it happened again. I'm going to try keeping it at 2 hours instead. @Kagami Did the issue continue with it set to 1 hour?
Did the issue continue with it set to 1 hour?
Updates happen rarely in my workflow, so I can't say for sure yet. I will post if the issue happens again.