lnd icon indicating copy to clipboard operation
lnd copied to clipboard

[bug]: LND panics when calling `kvdb.lastCompactionDate`.

Open lorenzolfm opened this issue 10 months ago • 6 comments

Background

LND panics when calling lastCompactionDate: lnd[5921]: panic: runtime error: index out of range [7] with length 0. This happened after a power outage that rebooted my machine. I was able to avoid the issue by setting db.bolt.auto-compact=false.

Your environment

  • version of lnd: lnd version 0.17.4-beta commit=v0.17.4-beta
  • which operating system (uname -a on *Nix): Debian GNU/Linux 11 (bullseye) aarch64
  • version of btcd, bitcoind, or other backend: v26.1.0
  • any other relevant environment details

Expected behaviour

LND should be able to startup with db.bolt.auto-compact=true, or at least fail gracefully with better error handling.

Actual behaviour

LND panics, see stacktrace:

lnd[5921]: panic: runtime error: index out of range [7] with length 0
lnd[5921]: goroutine 1 [running]:
lnd[5921]: encoding/binary.bigEndian.Uint64(...)
lnd[5921]:         encoding/binary/binary.go:179
lnd[5921]: github.com/lightningnetwork/lnd/kvdb.lastCompactionDate({0x40003b2980?, 0x0?})
lnd[5921]:         github.com/lightningnetwork/lnd/[email protected]/backend.go:226 +0x184
lnd[5921]: github.com/lightningnetwork/lnd/kvdb.compactAndSwap(0x4000707098)
lnd[5921]:         github.com/lightningnetwork/lnd/[email protected]/backend.go:127 +0xc8
lnd[5921]: github.com/lightningnetwork/lnd/kvdb.GetBoltBackend(0x4000707098)
lnd[5921]:         github.com/lightningnetwork/lnd/[email protected]/backend.go:101 +0x74
lnd[5921]: github.com/lightningnetwork/lnd/lncfg.(*DB).GetBackends(0x400056c740, {0x1ca9928?, 0x4000578a50}, {0x40005744e0, 0x21}, {0x4000052870, 0x29}, {0x40005745 10, 0x2e}, 0x1, ...)
lnd[5921]:         github.com/lightningnetwork/lnd/lncfg/db.go:552 +0x2be4
lnd[5921]: github.com/lightningnetwork/lnd.(*DefaultDatabaseBuilder).BuildDatabase(0x4000566990, {0x1ca9928, 0x4000578a50})
lnd[5921]:         github.com/lightningnetwork/lnd/config_builder.go:918 +0x288
lnd[5921]: github.com/lightningnetwork/lnd.Main(0x4000366000, {{0x0?, 0x40001c41e0?, 0x40001c4240?}}, 0x400007ca20, {0x40002ee1e0, 0x40001c41e0, 0x40001c4240, 0x400 01c42a0, {0x0}})
lnd[5921]:         github.com/lightningnetwork/lnd/lnd.go:427 +0x1840
lnd[5921]: main.main()
lnd[5921]:         github.com/lightningnetwork/lnd/cmd/lnd/main.go:38 +0x198
systemd[1]: lnd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
systemd[1]: lnd.service: Failed with result 'exit-code'.

lorenzolfm avatar Apr 10 '24 10:04 lorenzolfm

Thanks for raising this issue. I'd recommend to keep lnd shutdown for the moment to not publish any outdated state.

hieblmi avatar Apr 10 '24 10:04 hieblmi

Could you try the compaction with chantools which is described a bit here and see if that succeeds? https://github.com/lightningnetwork/lnd/issues/8532#issuecomment-1985577491

hieblmi avatar Apr 10 '24 11:04 hieblmi

It seem to have compacted the db with no issue.

chantools compactdb channel.db --sourcedb ./channel.db --destdb ./compacted.db 2024-04-10 13:17:13.335 [INF] CHAN: chantools version v0.13.0 commit

~/tmp $ ls
channel.db  compacted.db  results

lorenzolfm avatar Apr 10 '24 12:04 lorenzolfm

Ok that's great, could you check that the .last-compacted file in your live-directory is empty? If so then move it out to a different temporary place, then turn the autocompaction of your node back on and restart lnd.

hieblmi avatar Apr 10 '24 12:04 hieblmi

Seems like the .last-compacted file was corrupted? Are you maybe running from an SD card?

bhandras avatar Apr 10 '24 13:04 bhandras

Ok that's great, could you check that the .last-compacted file in your live-directory is empty? If so then move it out to a different temporary place, then turn the autocompaction of your node back on and restart lnd.

Moving the channel.db.last-compacted still results is the same issue. The output of cat file is:

^WÄõÈy^[à<80>

Seems like the .last-compacted file was corrupted? Are you maybe running from an SD card?

No I'm using an SSD.

I'm not discarding corruption, but since moving the file and starting LND from scratch didn't work it doesn't seem that corruption is involved.

func lastCompactionDate(dbFile string) (time.Time, error) {
	zeroTime := time.Unix(0, 0)

	tsFile := fmt.Sprintf("%s%s", dbFile, LastCompactionFileNameSuffix)
	if !fileExists(tsFile) {
		return zeroTime, nil
	}

	tsBytes, err := ioutil.ReadFile(tsFile)
	if err != nil {
		return zeroTime, err
	}

	tsNano := byteOrder.Uint64(tsBytes)
	return time.Unix(0, int64(tsNano)), nil
}

If I understand correctly, if the .last-compacted file does not exists, the function returns early. Which is odd, since the file channel.db.last-compacted is created before this function even runs. Reading from an empty file could explain the crash on byteOrder.Uint64(tsBytes), tho.

lorenzolfm avatar Apr 10 '24 16:04 lorenzolfm