Crash: "double free or corruption (!prev)"

Open antiops opened this issue 2 years ago • 4 comments

I've been getting consistent crashes on the master server (in a master node + 1 replica setup). Both are on the latest version from the releases page. Uptime is inconsistent: sometimes it's up for a day before it crashes, and sometimes it crashes within a few hours.

Both nodes use basic configs, so I might be missing something important that I don't know about.

The replica server has been running fine with no crashes.

Configs:

Master (config-main.toml)

db_path="/home/tik/redis/videos-replica.v2.db"
seq_map_path="/tmp/videos-main.cbor"

node_id=1

publish=true
replicate=false

Replica (config-replica.toml)

db_path="/home/rep/tik/videos.v2.db"
seq_map_path="/tmp/videos-replica-1.cbor"

node_id=2

publish=false
replicate=true

Details

Each instance is run from the command line like this:

# Master
./marmot -config config-main.toml -cluster-addr 10.1.0.12:4223 -cluster-peers 'nats://10.1.0.1:14222/'

# Replica
./marmot -config config-replica.toml -cluster-addr 10.1.0.1:14222 -cluster-peers 'nats://10.1.0.12:4223/'

The database it's using is 1.8 GB with 4 tables, of which only one (videos_clean) is updated frequently. The master database is itself a replica, kept separate from the production one; a script pushes changes to it every minute.

Below is the output from the most recent crash.

marmot-v0.8.5-master-crashlog.txt

antiops avatar Aug 22 '23 08:08 antiops

Reading crash logs:

goroutine 12606 [syscall]:
runtime.cgocall(0xdc4280, 0xc00057ed50)
        /opt/hostedtoolcache/go/1.20.7/x64/src/runtime/cgocall.go:157 +0x5c fp=0xc00057ed28 sp=0xc00057ecf0 pc=0x40601c
github.com/mattn/go-sqlite3._Cfunc_sqlite3_close_v2(0x7f1b701351f8)
        _cgo_gotypes.go:631 +0x4c fp=0xc00057ed50 sp=0xc00057ed28 pc=0x884f0c
github.com/mattn/go-sqlite3.(*SQLiteConn).Close.func1(0x0?)
        /home/runner/go/pkg/mod/github.com/mattn/[email protected]/sqlite3.go:1772 +0x46 fp=0xc00057ed88 sp=0xc00057ed50 pc=0x8958c6
github.com/mattn/go-sqlite3.(*SQLiteConn).Close(0xc000502840)
        /home/runner/go/pkg/mod/github.com/mattn/[email protected]/sqlite3.go:1772 +0x25 fp=0xc00057edb8 sp=0xc00057ed88 pc=0x8957c5
database/sql.(*driverConn).finalClose.func2()
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:644 +0x3c fp=0xc00057ede0 sp=0xc00057edb8 pc=0x7ee3dc
database/sql.withLock({0x12f7620, 0xc00037a6c0}, 0xc00057ee88)
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:3405 +0x8c fp=0xc00057ee20 sp=0xc00057ede0 pc=0x7fc86c
database/sql.(*driverConn).finalClose(0xc00037a6c0)
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:642 +0x116 fp=0xc00057eec8 sp=0xc00057ee20 pc=0x7ee296
database/sql.finalCloser.finalClose-fm()
        <autogenerated>:1 +0x2b fp=0xc00057eee0 sp=0xc00057eec8 pc=0x7fddcb
database/sql.(*driverConn).Close(0xc00037a6c0)
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:623 +0x13f fp=0xc00057ef28 sp=0xc00057eee0 pc=0x7ee15f
database/sql.(*DB).connectionCleaner(0xc00047e340, 0xc00027b000?)
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:1078 +0x23d fp=0xc00057efc0 sp=0xc00057ef28 pc=0x7efffd
database/sql.(*DB).startCleanerLocked.func1()
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:1048 +0x2a fp=0xc00057efe0 sp=0xc00057efc0 pc=0x7efd8a
runtime.goexit()
        /opt/hostedtoolcache/go/1.20.7/x64/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00057efe8 sp=0xc00057efe0 pc=0x46e9e1
created by database/sql.(*DB).startCleanerLocked
        /opt/hostedtoolcache/go/1.20.7/x64/src/database/sql/sql.go:1048 +0x105

Sounds like some sort of connection cleanup in the SQLite library is messing it up: the crashing goroutine is closing a pooled connection via sqlite3_close_v2. You see it infrequently, probably because it's a race condition (guessing from startCleanerLocked). Seems like I might have to do some deeper digging into github.com/mattn/go-sqlite3.

maxpert avatar Aug 23 '23 02:08 maxpert

Would it be OK for you to join the discord channel and DM me? I am trying to reproduce the issue.

maxpert avatar Aug 23 '23 13:08 maxpert

Is this resolved?

computinglife avatar Jun 21 '24 08:06 computinglife

I've not been able to reproduce the issue. I am about to push out a newer version with a newer version of SQLite. Maybe you can try after that and tell me if it still reproduces for you?

maxpert avatar Aug 02 '24 15:08 maxpert