go-spacemesh

Tracking Spacemesh memory problems on Windows

Open reythia opened this issue 1 year ago • 7 comments

Various memory errors, most commonly observed just after 'creating post verifier':

runtime: VirtualAlloc of 2924544 bytes failed with errno=1455
fatal error: out of memory
fatal error: runtime: cannot allocate memory
fatal error: runtime: cannot allocate memory
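
For readers triaging similar logs, here is a minimal Go sketch that pulls the allocation size and error number out of the `VirtualAlloc` line above and names the code. The helper names (`parseAllocFailure`, `describeErrno`) are made up for illustration and are not part of go-spacemesh. The table covers only three memory-related Windows system error codes; 1455 is `ERROR_COMMITMENT_LIMIT`, which points at the system commit limit rather than at physical RAM.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// winErrNames lists a few memory-related Windows system error codes.
// It is deliberately incomplete; unknown codes are reported as such.
var winErrNames = map[int]string{
	8:    "ERROR_NOT_ENOUGH_MEMORY",
	14:   "ERROR_OUTOFMEMORY",
	1455: "ERROR_COMMITMENT_LIMIT (the paging file is too small for this operation to complete)",
}

// allocRe matches the Go runtime message quoted in this issue.
var allocRe = regexp.MustCompile(`VirtualAlloc of (\d+) bytes failed with errno=(\d+)`)

// parseAllocFailure (hypothetical helper) extracts the requested size and
// the errno from a log line. ok is false when the line does not match.
func parseAllocFailure(line string) (size uint64, errno int, ok bool) {
	m := allocRe.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, false
	}
	n, err := strconv.ParseUint(m[1], 10, 64)
	if err != nil {
		return 0, 0, false
	}
	code, err := strconv.Atoi(m[2])
	if err != nil {
		return 0, 0, false
	}
	return n, code, true
}

// describeErrno (hypothetical helper) returns a symbolic name for the code.
func describeErrno(errno int) string {
	if name, found := winErrNames[errno]; found {
		return name
	}
	return "unknown error code"
}

func main() {
	line := "runtime: VirtualAlloc of 2924544 bytes failed with errno=1455"
	if size, errno, ok := parseAllocFailure(line); ok {
		fmt.Printf("%d bytes, errno %d: %s\n", size, errno, describeErrno(errno))
	}
}
```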

reythia avatar Nov 19 '23 21:11 reythia

I had closed this because I thought it was my end, but I've observed further issues.

Personally I didn't observe this prior to 1.2.6, but most of my nodes are on Linux systems, so that is likely a coincidence.

There's a variety of messages observed so far, plus other less obvious behaviour such as failing to generate a proof.

runtime: VirtualAlloc of 2924544 bytes failed with errno=1455
fatal error: out of memory

fatal error: runtime: cannot allocate memory

scrypt: out of memory

"errmsg": "create ATX: build NIPost: failed to generate Post: generate proof: generating proof: got nil", "name": "atxBuilder"}

The common thread is that a user observes more than adequate RAM in Windows Task Manager, sometimes tens of GBs shown as 'Available'.

However, it seems that once the 'Committed' RAM is high, spacemesh develops faults.

There's effectively no pagefile / swap as far as SM is concerned - shouldn't this be for the OS to decide?
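
The gap between 'Available' and 'Committed' can be illustrated with a toy model. This sketch assumes the usual Windows accounting, in which the system commit limit is roughly physical RAM plus page file size, and a new commit fails once the total commit charge would exceed that limit, regardless of how much physical RAM is still free. The machine sizes are invented for illustration and do not come from any report in this thread.

```go
package main

import "fmt"

const gib = uint64(1) << 30

// commitLimit approximates the system commit limit as RAM plus page file.
// This is a simplification for illustration only.
func commitLimit(physRAM, pageFile uint64) uint64 {
	return physRAM + pageFile
}

// canCommit reports whether a new request still fits under the limit,
// given the commit charge already held by all processes.
func canCommit(limit, charge, request uint64) bool {
	return charge <= limit && request <= limit-charge
}

func main() {
	// Hypothetical machine: 64 GiB RAM with a small 2 GiB page file.
	limit := commitLimit(64*gib, 2*gib)

	// Hypothetical state: processes have committed all but 1 MiB of the
	// limit, even though much of that memory was never touched, so
	// Task Manager can still show plenty of RAM as 'Available'.
	charge := limit - 1<<20

	request := uint64(2924544) // size from the VirtualAlloc log line

	fmt.Printf("commit limit: %d GiB\n", limit/gib)
	fmt.Printf("request of %d bytes fits: %v\n", request, canCommit(limit, charge, request))
}
```

In this model the request is refused with about 1 MiB of commit headroom left, which matches the shape of the symptom described here: a small allocation failing while physical memory looks plentiful. Whether that is what actually happens on the affected nodes is not established in this thread.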

image

reythia avatar Nov 26 '23 20:11 reythia

How to fix, sir?

Windy0606 avatar Dec 15 '23 07:12 Windy0606

I'd like to point to https://github.com/spacemeshos/smapp/issues/1439. Please look especially at https://github.com/spacemeshos/smapp/issues/1439#issuecomment-1675974406 and https://github.com/spacemeshos/smapp/issues/1439#issuecomment-1884621111 so we can debug it.

pigmej avatar Jan 11 '24 14:01 pigmej

I'm speaking only for my case. I have the impression that the "out of memory" message in the logs is not due to excessive memory or CPU use, but only to very slow downloading of the DB from the internet. I suspect this because, after I managed to synchronize one PC, I copied the same DB to all the other machines, and now they are all synchronized and working. Perhaps the "out of memory" indication was a false one. My current connection is 1 Giga, but there are many machines...

Odino1978 avatar Jan 14 '24 14:01 Odino1978

Another user reported this error recently: https://discord.com/channels/623195163510046732/1192476775792529448/1197781395347619840

lrettig avatar Mar 06 '24 23:03 lrettig

Node stopped and unable to restart even after PC restart.

Screenshot 2024-04-15 140802 Screenshot 2024-04-15 153730

OriginalCJay avatar Apr 15 '24 14:04 OriginalCJay

@OriginalCJay it's a different case. I think your DB got corrupted somehow. Please delete state.sql* files from the node directory and then sync from 0 or use quick sync to get the recent state.

Please also note that this has nothing to do with the memory problems on Windows.

pigmej avatar Apr 16 '24 08:04 pigmej