zed icon indicating copy to clipboard operation
zed copied to clipboard

"panic: runtime error: slice bounds out of range" in ztests that use fork operator

Open philrz opened this issue 2 years ago • 4 comments

This was most recently seen in Actions run https://github.com/brimdata/zed/actions/runs/7615875147 on Windows. There was a report of someone having seen it fail once before.

--- FAIL: TestZed (0.05s)
    --- FAIL: TestZed/runtime/op/merge/ztests (0.00s)
        --- FAIL: TestZed/runtime/op/merge/ztests/fork-merge (0.01s)
            ztest.go:389: runtime\op\merge\ztests\fork-merge.yaml: panic: runtime error: slice bounds out of range [:111] with capacity 2
                goroutine 110 [running]:
                runtime/debug.Stack()
                	C:/hostedtoolcache/windows/go/1.21.5/x64/src/runtime/debug/stack.go:24 +0x5e
                github.com/brimdata/zed/runtime/op.(*Catcher).Pull.func1()
                	D:/a/zed/zed/runtime/op/catcher.go:25 +0x3d
                panic({0x1656a20?, 0xc000886b88?})
                	C:/hostedtoolcache/windows/go/1.21.5/x64/src/runtime/panic.go:914 +0x21f
                github.com/brimdata/zed/zcode.(*Iter).Next(0xc001514ea0?)
                	D:/a/zed/zed/zcode/iter.go:33 +0x1bc
                github.com/brimdata/zed/runtime/expr.getNthFromContainer({0xc001f00002?, 0xc001514e70?, 0xc001f00002?}, 0x19aa7d0?)
                	D:/a/zed/zed/runtime/expr/eval.go:646 +0x7c
                github.com/brimdata/zed/runtime/expr.(*DotExpr).Eval(0xc0014e3140, {0x19a3e60, 0x2735680}, {{0x19aa7d0?, 0xc001514e70?}, 0xc001f00002?, 0xc001a375c0?})
                	D:/a/zed/zed/runtime/expr/dot.go:50 +0xe6
                github.com/brimdata/zed/runtime/expr.(*missingAsNull).Eval(0xc001a37548?, {0x19a3e60?, 0x2735680?}, {{0x19aa7d0, 0xc001514e70}, 0xc001f00002, 0x2})
                	D:/a/zed/zed/runtime/expr/sort.go:139 +0x3d
                github.com/brimdata/zed/runtime/expr.(*Comparator).Compare(0xc001a6a8c0, {{0x19aa7d0?, 0xc001514e70?}, 0xc001f00002?, 0x10?}, {{0x19aa7d0?, 0xc001514e70?}, 0xc001f00000?, 0x2?})
                	D:/a/zed/zed/runtime/expr/sort.go:153 +0x19a
                github.com/brimdata/zed/runtime/op/merge.(*Op).Less(0x100010000?, 0x157c840?, 0xc001156f60?)
                	D:/a/zed/zed/runtime/op/merge/merge.go:161 +0x6b
                container/heap.up({0x19bab40, 0xc00054cc00}, 0xc00054cae0?)
                	C:/hostedtoolcache/windows/go/1.21.5/x64/src/container/heap/heap.go:92 +0x74
                container/heap.Push({0x19bab40, 0xc00054cc00}, {0x157c840?, 0xc00054cae0?})
                	C:/hostedtoolcache/windows/go/1.21.5/x64/src/container/heap/heap.go:53 +0x55
                github.com/brimdata/zed/runtime/op/merge.(*Op).start(0xc00054cc00)
                	D:/a/zed/zed/runtime/op/merge/merge.go:129 +0x8e
                github.com/brimdata/zed/runtime/op/merge.(*Op).run(0xc00054cc00)
                	D:/a/zed/zed/runtime/op/merge/merge.go:113 +0x9a
                github.com/brimdata/zed/runtime/op/merge.(*Op).Pull.func1()
                	D:/a/zed/zed/runtime/op/merge/merge.go:50 +0x25
                sync.(*Once).doSlow(0x211273?, 0xc001a7a6c0?)
                	C:/hostedtoolcache/windows/go/1.21.5/x64/src/sync/once.go:74 +0xbf
                sync.(*Once).Do(...)
                	C:/hostedtoolcache/windows/go/1.21.5/x64/src/sync/once.go:65
                github.com/brimdata/zed/runtime/op/merge.(*Op).Pull(0xc00054cc00, 0x0)
                	D:/a/zed/zed/runtime/op/merge/merge.go:50 +0x[85](https://github.com/brimdata/zed/actions/runs/7615875147/job/20741356145#step:4:86)
                github.com/brimdata/zed/runtime/op.(*Single).Pull(0xc00098d4a0, 0x1)
                	D:/a/zed/zed/runtime/op/mux.go:133 +0x2b
                github.com/brimdata/zed/runtime/op.(*Catcher).Pull(0x0?, 0x0?)
                	D:/a/zed/zed/runtime/op/catcher.go:28 +0x62
                github.com/brimdata/zed/runtime.(*Query).Pull(0x2?, 0xd0?)
                	D:/a/zed/zed/runtime/query.go:90 +0x3e
                github.com/brimdata/zed/zbuf.CopyPuller({0x22beac009e0, 0xc00098d248}, {0x19a3f00, 0xc001a7a6c0})
                	D:/a/zed/zed/zbuf/batch.go:162 +0x4d
                github.com/brimdata/zed/ztest.runzq({0x0, 0x0}, {0xc00088ad80, 0x2a}, {0xc00077a630, 0x12}, {0xc00080b920?, 0x3, 0x3}, {0x2735680, ...})
                	D:/a/zed/zed/ztest/ztest.go:519 +0xb18
                github.com/brimdata/zed/ztest.(*ZTest).RunInternal(0xc00046b0e0, {0x0, 0x0})
                	D:/a/zed/zed/ztest/ztest.go:355 +0x20c
                github.com/brimdata/zed/ztest.(*ZTest).Run(0xc00046b0e0, 0xc000476000, {0x0, 0x0}, {0xc00088ad20, 0x27})
                	D:/a/zed/zed/ztest/ztest.go:3[86](https://github.com/brimdata/zed/actions/runs/7615875147/job/20741356145#step:4:87) +0x12f
                github.com/brimdata/zed/ztest.Run.func1(0xc000476000)
                	D:/a/zed/zed/ztest/ztest.go:201 +0xd3
                testing.tRunner(0xc000476000, 0xc0009c62d0)
                	C:/hostedtoolcache/windows/go/1.21.5/x64/src/testing/testing.go:1595 +0xff
                created by testing.(*T).Run in goroutine [89](https://github.com/brimdata/zed/actions/runs/7615875147/job/20741356145#step:4:90)
                	C:/hostedtoolcache/windows/go/1.21.5/x64/src/testing/testing.go:1648 +0x3ad

philrz avatar Jan 23 '24 21:01 philrz

I have a branch zed-5006-repro that I'm using to run go test -short ./... -count=1 in a loop on Windows Actions Runners in an attempt to reliably repro this issue and have a way to verify it's fixed when we figure it out. I've got five going at once and https://github.com/brimdata/zed/actions/runs/7644127528 just had a failure in this test with an incorrect result but without a panic, which probably still counts.

--- FAIL: TestZed (0.05s)
    --- FAIL: TestZed/runtime/sam/op/merge/ztests (0.00s)
        --- FAIL: TestZed/runtime/sam/op/merge/ztests/fork-merge (0.01s)
            ztest.go:389: runtime\sam\op\merge\ztests\fork-merge.yaml: expected and actual output differ:
                --- expected
                +++ actual
                @@ -1,4 +1,4 @@
                -{x:1}
                +{x:3}
                +{x:null(int64)}
                 {x:2}
                -{x:3}

philrz avatar Jan 24 '24 17:01 philrz

https://github.com/brimdata/zed/actions/runs/7643894261/job/20826962012 just saw this panic in runtime/sam/expr/ztests/rename-share.yaml which also uses fork.

--- FAIL: TestZed (0.81s)
    --- FAIL: TestZed/runtime/sam/expr/ztests (0.72s)
        --- FAIL: TestZed/runtime/sam/expr/ztests/rename-share (0.01s)
panic: runtime error: slice bounds out of range [2:1] [recovered]
	panic: runtime error: slice bounds out of range [2:1]

goroutine 649 [running]:
testing.tRunner.func1.2({0x1863300, 0xc00088a060})
	C:/hostedtoolcache/windows/go/1.21.5/x64/src/testing/testing.go:1545 +0x238
testing.tRunner.func1()
	C:/hostedtoolcache/windows/go/1.21.5/x64/src/testing/testing.go:1548 +0x397
panic({0x1863300?, 0xc00088a060?})
	C:/hostedtoolcache/windows/go/1.21.5/x64/src/runtime/panic.go:914 +0x21f
github.com/brimdata/zed/zcode.(*Iter).Next(0xc00215c510?)
	D:/a/zed/zed/zcode/iter.go:34 +0x1af
github.com/brimdata/zed/zson.(*Formatter).formatRecord(0xc00215c510, 0x411245?, 0xc001a6cba0?, {0xc001ccc001?, 0x4537a9?, 0x10000?}, 0x80?, 0xb?)
	D:/a/zed/zed/zson/formatter.go:414 +0x285
github.com/brimdata/zed/zson.(*Formatter).formatValue(0xc00215c510, 0xc0010b5982?, {0x1bb9110?, 0xc002204660?}, {0xc001ccc001, 0x1, 0x6}, 0x0, 0x0, 0x1)
	D:/a/zed/zed/zson/formatter.go:169 +0x27e
github.com/brimdata/zed/zson.(*Formatter).formatRecord(0xc00215c510, 0x0?, 0xc002288090?, {0xc001ccc000?, 0x7?, 0x7?}, 0x53?, 0x2b?)
	D:/a/zed/zed/zson/formatter.go:414 +0x2c5
github.com/brimdata/zed/zson.(*Formatter).formatValue(0xc00215c510, 0xc002288090?, {0x1bb9110?, 0xc002288090?}, {0xc001ccc000, 0x7, 0x7}, 0x0, 0x0, 0x0)
	D:/a/zed/zed/zson/formatter.go:169 +0x27e
github.com/brimdata/zed/zson.(*Formatter).formatValueAndDecorate(0xc00215c510, {0x1bb9110?, 0xc002288090?}, {0xc001ccc000, 0x7, 0x7})
	D:/a/zed/zed/zson/formatter.go:144 +0x145
github.com/brimdata/zed/zson.(*Formatter).Format(0xc00215c510, {{0x1bb9110?, 0xc002288090?}, 0xc001ccc000?, 0xc000b9e700?})
	D:/a/zed/zed/zson/formatter.go:97 +0xa5
github.com/brimdata/zed/zson.(*Formatter).FormatRecord(...)
	D:/a/zed/zed/zson/formatter.go:74
github.com/brimdata/zed/zio/zsonio.(*Writer).Write(0xc001755d58, {{0x1bb9110?, 0xc002288090?}, 0xc001ccc000?, 0xc0018f5980?})
	D:/a/zed/zed/zio/zsonio/writer.go:33 +0x90
github.com/brimdata/zed/zbuf.WriteBatch({0x2c064710b78, 0xc001755d58}, {0x1bc55c8?, 0xc000e4fcb0?})
	D:/a/zed/zed/zbuf/batch.go:41 +0x84
github.com/brimdata/zed/zbuf.CopyPuller({0x2c064710b78, 0xc001755d58}, {0x1bb2fe0, 0xc002204330})
	D:/a/zed/zed/zbuf/batch.go:166 +0x76
github.com/brimdata/zed/ztest.runzq({0x0, 0x0}, {0xc000772c00, 0x33}, {0xc000772d00, 0x33}, {0xc0010ab590?, 0x3, 0x3}, {0x2949680, ...})
	D:/a/zed/zed/ztest/ztest.go:519 +0xb58
github.com/brimdata/zed/ztest.(*ZTest).RunInternal(0xc001315680, {0x0, 0x0})
	D:/a/zed/zed/ztest/ztest.go:355 +0x20c
github.com/brimdata/zed/ztest.(*ZTest).Run(0xc001315680, 0xc0005d6340, {0x0, 0x0}, {0xc00077b6e0, 0x29})
	D:/a/zed/zed/ztest/ztest.go:386 +0x12f
github.com/brimdata/zed/ztest.Run.func1(0xc0005d6340)
	D:/a/zed/zed/ztest/ztest.go:201 +0xd3
testing.tRunner(0xc0005d6340, 0xc000101ef0)
	C:/hostedtoolcache/windows/go/1.21.5/x64/src/testing/testing.go:1595 +0xff
created by testing.(*T).Run in goroutine 100
	C:/hostedtoolcache/windows/go/1.21.5/x64/src/testing/testing.go:1648 +0x3ad

philrz avatar Jan 24 '24 18:01 philrz

I've also confirmed the problem is not unique to Windows. I just just repro'ed a panic in rename-share on an Ubuntu Actions Runner in https://github.com/brimdata/zed/actions/runs/7645061917.

--- FAIL: TestZed (0.42s)
    --- FAIL: TestZed/runtime/sam/expr/ztests (0.43s)
        --- FAIL: TestZed/runtime/sam/expr/ztests/rename-share (0.01s)
panic: runtime error: slice bounds out of range [:16] with capacity 7 [recovered]
	panic: runtime error: slice bounds out of range [:16] with capacity 7

goroutine 548 [running]:
testing.tRunner.func1.2({0x18564a0, 0xc0004b4fd8})
	/opt/hostedtoolcache/go/1.21.6/x64/src/testing/testing.go:1545 +0x238
testing.tRunner.func1()
	/opt/hostedtoolcache/go/1.21.6/x64/src/testing/testing.go:1548 +0x397
panic({0x18564a0?, 0xc0004b4fd8?})
	/opt/hostedtoolcache/go/1.21.6/x64/src/runtime/panic.go:914 +0x21f
github.com/brimdata/zed/zcode.(*Iter).Next(0xc000ee05a0?)
	/home/runner/work/zed/zed/zcode/iter.go:33 +0x1bc
github.com/brimdata/zed/zson.(*Formatter).formatRecord(0xc000ee05a0, 0x0?, 0xc00175b410?, {0xc001f5e000?, 0x7?, 0x7?}, 0xf3?, 0x65?)
	/home/runner/work/zed/zed/zson/formatter.go:414 +0x285
github.com/brimdata/zed/zson.(*Formatter).formatValue(0xc000ee05a0, 0xc00175b410?, {0x1ba7db0?, 0xc00175b410?}, {0xc001f5e000, 0x7, 0x7}, 0x0, 0x0, 0x0)
	/home/runner/work/zed/zed/zson/formatter.go:169 +0x27e
github.com/brimdata/zed/zson.(*Formatter).formatValueAndDecorate(0xc000ee05a0, {0x1ba7db0?, 0xc00175b410?}, {0xc001f5e000, 0x7, 0x7})
	/home/runner/work/zed/zed/zson/formatter.go:144 +0x145
github.com/brimdata/zed/zson.(*Formatter).Format(0xc000ee05a0, {{0x1ba7db0?, 0xc00175b410?}, 0xc001f5e000?, 0xc00095f200?})
	/home/runner/work/zed/zed/zson/formatter.go:97 +0xa5
github.com/brimdata/zed/zson.(*Formatter).FormatRecord(...)
	/home/runner/work/zed/zed/zson/formatter.go:74
github.com/brimdata/zed/zio/zsonio.(*Writer).Write(0xc000c46408, {{0x1ba7db0?, 0xc00175b410?}, 0xc001f5e000?, 0xc001cfd980?})
	/home/runner/work/zed/zed/zio/zsonio/writer.go:33 +0x90
github.com/brimdata/zed/zbuf.WriteBatch({0x7f2c6ce6e8e8, 0xc000c46408}, {0x1bb4448?, 0xc0011fb920?})
	/home/runner/work/zed/zed/zbuf/batch.go:41 +0x84
github.com/brimdata/zed/zbuf.CopyPuller({0x7f2c6ce6e8e8, 0xc000c46408}, {0x1ba1c40, 0xc000699140})
	/home/runner/work/zed/zed/zbuf/batch.go:166 +0x[76](https://github.com/brimdata/zed/actions/runs/7645061917/job/20830769959#step:4:77)
github.com/brimdata/zed/ztest.runzq({0x0, 0x0}, {0xc000c7a740, 0x33}, {0xc000c7a7c0, 0x33}, {0xc0013a3f20?, 0x3, 0x3}, {0x2902480, ...})
	/home/runner/work/zed/zed/ztest/ztest.go:519 +0xb58
github.com/brimdata/zed/ztest.(*ZTest).RunInternal(0xc000d50d20, {0x0, 0x0})
	/home/runner/work/zed/zed/ztest/ztest.go:355 +0x20c
github.com/brimdata/zed/ztest.(*ZTest).Run(0xc000d50d20, 0xc0014e91e0, {0x0, 0x0}, {0xc0006cc630, 0x29})
	/home/runner/work/zed/zed/ztest/ztest.go:386 +0x12f
github.com/brimdata/zed/ztest.Run.func1(0xc0014e91e0)
	/home/runner/work/zed/zed/ztest/ztest.go:201 +0xd3
testing.tRunner(0xc0014e91e0, 0xc00148b180)
	/opt/hostedtoolcache/go/1.21.6/x64/src/testing/testing.go:1595 +0xff
created by testing.(*T).Run in goroutine [79](https://github.com/brimdata/zed/actions/runs/7645061917/job/20830769959#step:4:80)
	/opt/hostedtoolcache/go/1.21.6/x64/src/testing/testing.go:1648 +0x3ad

philrz avatar Jan 24 '24 19:01 philrz

In a recent team sync @nwt expressed an interest in digging into this one after he's wrapped his current Arena efforts.

philrz avatar Feb 08 '24 00:02 philrz