syzkaller icon indicating copy to clipboard operation
syzkaller copied to clipboard

prog: `panic: no result` during serialization

Open cpaasch opened this issue 1 year ago • 16 comments
trafficstars

For quite a while now my syz-manager instances stopped running for long. Very soon they end up with an error and terminate (see below).

Let me know what kind of information I can provide to debug this. (go version: go1.22.2)

panic: no result

goroutine 519346 [running]:
github.com/google/syzkaller/prog.(*ResultArg).serialize(0xc0338054d0, 0xc02d7700f0)
        /home/cpaasch/syzkaller/prog/encoding.go:224 +0x2a5
github.com/google/syzkaller/prog.(*serializer).arg(0x19a07e0?, {0x19adaa0?, 0xc0338054d0?})
        /home/cpaasch/syzkaller/prog/encoding.go:121 +0x29
github.com/google/syzkaller/prog.(*serializer).call(0xc02d7700f0, 0xc0325d0af0)
        /home/cpaasch/syzkaller/prog/encoding.go:82 +0x211
github.com/google/syzkaller/prog.(*Prog).serialize(0xc005e511c0, 0x0)
        /home/cpaasch/syzkaller/prog/encoding.go:46 +0xd8
github.com/google/syzkaller/prog.(*Prog).Serialize(...)
        /home/cpaasch/syzkaller/prog/encoding.go:30
main.(*Runner).logProgram(0xc01deef810, 0x4, 0xc005e511c0)
        /home/cpaasch/syzkaller/syz-manager/rpc.go:455 +0x52
main.(*RPCServer).StartExecuting(0xc0004fa000, 0xc02d737ec0, 0x0?)
        /home/cpaasch/syzkaller/syz-manager/rpc.go:251 +0x1c5
reflect.Value.call({0xc000586780?, 0xc002302dd0?, 0xc013d32c40?}, {0x14deee0, 0x4}, {0xc00aa3aef8, 0x3, 0xc0134ddce0?})
        /usr/local/go/src/reflect/value.go:596 +0xca6
reflect.Value.Call({0xc000586780?, 0xc002302dd0?, 0xc034a4b320?}, {0xc00aa3aef8?, 0x0?, 0x0?})
        /usr/local/go/src/reflect/value.go:380 +0xb9
net/rpc.(*service).call(0xc00270c940, 0xc00270c900, 0xc0288912b8, 0xc0288912d0, 0xc0003a6c80, 0xc024106120, {0x11123a0?, 0xc02d737ec0?, 0xc00af75230?}, {0x1114a20, ...}, ...)
        /usr/local/go/src/net/rpc/server.go:381 +0x20e
created by net/rpc.(*Server).ServeCodec in goroutine 422942
        /usr/local/go/src/net/rpc/server.go:478 +0x3d7

cpaasch avatar Apr 25 '24 16:04 cpaasch

Hi Christoph,

How frequently/soon does it crash? I don't remember seeing anything similar on syzbot nor locally. Do you have any local modifications? What syzkaller commit are you using? Recently this piece of logic was moved from the VM to the syz-manager, so potentially previously your VMs were crashing now and then with this error, but now syz-manager process started crashing.

dvyukov avatar Apr 26 '24 04:04 dvyukov

Hi @dvyukov !

How frequently/soon does it crash

I would say within less than 24 hours.

Do you have any local modifications

No modifications.

What syzkaller commit are you using

I am on 8bdc0f220628c9347b3581fead4c026272439799 right now.

cpaasch avatar Apr 30 '24 19:04 cpaasch

We do this trick to dump more info when decoding programs: https://github.com/google/syzkaller/blob/master/prog/encoding.go#L252-L257

But it won't work for encoding, since if encoding panics we cannot encode to dump it :)

Maybe if you flip this var to true, it may give some useful info earlier:

prog/validation.go:var debug = false // enabled in tests and fuzzers

With this flag program manipulation will do more checks every now and then.

dvyukov avatar May 02 '24 04:05 dvyukov

Been running with that suggested change for a while now and I'm no longer seeing the crash reported above. What I AM seeing instead is a lot of this:

2024/05/02 18:27:28 vm-12007: crash: panic: call #NUM mmap: ptr vma has bad address ADDR/NUM/NUM

I'll keep this running over night to see if that is really the difference in behavior or just another artifact of the debug variable being set.

bjoernd avatar May 02 '24 18:05 bjoernd

What's in the crash log files for these? They should contain unmasked values for these NUM/ADDR + maybe some additional info (stack, programs).

dvyukov avatar May 02 '24 18:05 dvyukov

Hi @dvyukov !

How frequently/soon does it crash

I would say within less than 24 hours.

Do you have any local modifications

No modifications.

What syzkaller commit are you using

I am on 8bdc0f2 right now.

It's really interesting that we don't see this happen on syzbot and when we run syzkaller locally. If it's easily reproducible on your side, I wonder what we are doing differently.

I think it would be great to find some minimized setup (Linux kernel revision + config, syzkaller revision + config, host and target arches) that would let us reproduce the problem too.

a-nogikh avatar May 02 '24 19:05 a-nogikh

It also may be some bad programs in the corpus.

dvyukov avatar May 02 '24 19:05 dvyukov

╰─○ cat log0
Warning: Permanently added '[localhost]:39534' (ED25519) to the list of known hosts.
2024/05/02 19:03:36 fuzzer started
2024/05/02 19:03:36 dialing manager at localhost:40649
2024/05/02 19:03:36 checking machine...
2024/05/02 19:03:36 testing simple program...
syzkaller login: [   17.819672] cgroup: Unknown subsys name 'net'
[   17.977755] cgroup: Unknown subsys name 'rlimit'
[   18.172610] Adding 124996k swap on ./swap-file.  Priority:0 extents:1 across:124996k FS
panic: call #0 mmap: ptr vma has bad address 18446744073709547520/4096/4096

goroutine 1 [running]:
github.com/google/syzkaller/prog.(*Prog).debugValidate(...)
	/home/ubuntu/src/syzkaller/prog/validation.go:19
github.com/google/syzkaller/prog.(*Prog).SerializeForExec(0xc00110f8c0)
	/home/ubuntu/src/syzkaller/prog/encodingexec.go:69 +0x3d5
github.com/google/syzkaller/pkg/ipc.(*Env).Exec(0xc000560000?, 0x0?, 0xdc0798?)
	/home/ubuntu/src/syzkaller/pkg/ipc/ipc.go:339 +0x25
main.checkSimpleProgram(0xc00110feb8, 0x0?)
	/home/ubuntu/src/syzkaller/syz-fuzzer/testing.go:258 +0x21e
main.checkMachine(0xc00110feb8)
	/home/ubuntu/src/syzkaller/syz-fuzzer/testing.go:148 +0x265
main.main()
	/home/ubuntu/src/syzkaller/syz-fuzzer/fuzzer.go:156 +0xa28

It's a separate problem from the one that @cpaasch described and I remember seeing it too when I ran syz-manger with prog validation enabled. We use negative vma addresses for our test program to emulate protected memory areas around the data buffer, and the validator is not particularly happy about this.

a-nogikh avatar May 02 '24 19:05 a-nogikh

Sorry. Should I move this to a separate issue or do we have a fix?

bjoernd avatar May 02 '24 19:05 bjoernd

Yes, let's create a separate issue for that. To not let it stop debugging of the original problem, does it work if you remove the following line and recompile syzkaller?

https://github.com/google/syzkaller/blob/ddfc15a1555eaea52bb3186b2c53245cc91ac5d2/sys/targets/common.go#L43

a-nogikh avatar May 02 '24 19:05 a-nogikh

Moving my posts to https://github.com/google/syzkaller/issues/4750

bjoernd avatar May 02 '24 19:05 bjoernd

Yes, let's create a separate issue for that. To not let it stop debugging of the original problem, does it work if you remove the following line and recompile syzkaller?

https://github.com/google/syzkaller/blob/ddfc15a1555eaea52bb3186b2c53245cc91ac5d2/sys/targets/common.go#L43

Strange this wasn't caught in tests. Tests should enable debug=true. Would be good to check and fix as well.

dvyukov avatar May 02 '24 19:05 dvyukov

I think I got an instrumented backtrace:

panic: bad group arg size 32, should be <= 24 for &prog.GroupArg{ArgCommon:prog.ArgCommon{ref:0x2bb, dir:0x0}, Inner:[]prog.Arg{(*prog.UnionArg)(0xc094edf530)}} type "array"

goroutine 12512734 [running]:
github.com/google/syzkaller/prog.foreachArgImpl({0x1971880?, 0xc001fd5160?}, 0xc04b547310, 0xc07cbf7380)
        /home/ubuntu/src/syzkaller/prog/analysis.go:192 +0x774
github.com/google/syzkaller/prog.foreachArgImpl({0x1971880?, 0xc001fd5120?}, 0xc04b547310, 0xc07cbf7380)
        /home/ubuntu/src/syzkaller/prog/analysis.go:181 +0x4ce
github.com/google/syzkaller/prog.foreachArgImpl({0x19716c0?, 0xc094edf440?}, 0xc04b547310, 0xc07cbf7380)
        /home/ubuntu/src/syzkaller/prog/analysis.go:200 +0x3d8
github.com/google/syzkaller/prog.foreachArgImpl({0x1971880?, 0xc001fd50e0?}, 0xc04b547310, 0xc07cbf7380)
        /home/ubuntu/src/syzkaller/prog/analysis.go:181 +0x4ce
github.com/google/syzkaller/prog.foreachArgImpl({0x19716c0?, 0xc094edf410?}, 0xc04b547310, 0xc07cbf7380)
        /home/ubuntu/src/syzkaller/prog/analysis.go:200 +0x3d8
github.com/google/syzkaller/prog.foreachArgImpl({0x1971880?, 0xc001fd50c0?}, 0xc04b547310, 0xc07cbf7380)
        /home/ubuntu/src/syzkaller/prog/analysis.go:181 +0x4ce
github.com/google/syzkaller/prog.foreachArgImpl({0x19716c0?, 0xc094edf3b0?}, 0xc04b547310, 0xc07cbf7380)
        /home/ubuntu/src/syzkaller/prog/analysis.go:200 +0x3d8
github.com/google/syzkaller/prog.foreachSubArgWithStack(...)
        /home/ubuntu/src/syzkaller/prog/analysis.go:139
github.com/google/syzkaller/prog.forEachStaleUnion(0xc0005bf4a0, 0xc0b7a2df90, 0xc07cbf73d8)
        /home/ubuntu/src/syzkaller/prog/expr.go:105 +0x56
github.com/google/syzkaller/prog.(*Call).checkConditions(0xc06ccc6140?, 0x1971680?, 0x50?)
        /home/ubuntu/src/syzkaller/prog/expr.go:178 +0x58
github.com/google/syzkaller/prog.(*validCtx).validateCall(0xc06ccc6140, 0xc0b7a2df90)
        /home/ubuntu/src/syzkaller/prog/validation.go:79 +0x185
github.com/google/syzkaller/prog.(*Prog).validateWithOpts(0xc07d4293c0, {0x10?, 0x37?})
        /home/ubuntu/src/syzkaller/prog/validation.go:51 +0x115
github.com/google/syzkaller/prog.(*Prog).validate(...)
        /home/ubuntu/src/syzkaller/prog/validation.go:25
github.com/google/syzkaller/prog.(*Prog).debugValidate(...)
        /home/ubuntu/src/syzkaller/prog/validation.go:18
github.com/google/syzkaller/prog.(*Prog).serialize(0xc07d4293c0, 0x0)
        /home/ubuntu/src/syzkaller/prog/encoding.go:38 +0x2e
github.com/google/syzkaller/prog.(*Prog).Serialize(...)
        /home/ubuntu/src/syzkaller/prog/encoding.go:30
main.(*Runner).logProgram(0xc0aa19fea0, 0xc068dddec0?, 0x5e7?)
        /home/ubuntu/src/syzkaller/syz-manager/rpc.go:552 +0x49
main.(*RPCServer).StartExecuting(0xc0005b6000, 0xc05dcaa660, 0x1?)
        /home/ubuntu/src/syzkaller/syz-manager/rpc.go:309 +0x1c5
reflect.Value.call({0xc0001927e0?, 0xc00013c8b0?, 0x0?}, {0x14ad9a6, 0x4}, {0xc087339ef8, 0x3, 0x0?})
        /home/ubuntu/src/go/src/reflect/value.go:596 +0xce7
reflect.Value.Call({0xc0001927e0?, 0xc00013c8b0?, 0x2?}, {0xc087339ef8?, 0x196c968?, 0x0?})
        /home/ubuntu/src/go/src/reflect/value.go:380 +0xb9
net/rpc.(*service).call(0xc0005e6b40, 0xc02ac16001?, 0xc087339fd0?, 0xc01f1592a0, 0xc0006d2080, 0xc00518c1e0?, {0x10e4a40?, 0xc05dcaa660?, 0xc0442e4580?}, {0x10e7000, ...}, ...)
        /home/ubuntu/src/go/src/net/rpc/server.go:382 +0x214
created by net/rpc.(*Server).ServeCodec in goroutine 12431003
        /home/ubuntu/src/go/src/net/rpc/server.go:479 +0x410

bjoernd avatar May 06 '24 06:05 bjoernd

Then again, 6 minutes later I get the backtrace from the top of this issue, though I've been running with the debug config as suggested. What would we expect to see with debugging on?

bjoernd avatar May 06 '24 06:05 bjoernd

Then again, 6 minutes later I get the backtrace from the top of this issue, though I've been running with the debug config as suggested. What would we expect to see with debugging on?

No concrete expectation, it was just a simple way to try to get something more useful.

I've tried to run prog package test for tens of CPU hours in a loop, but no success so far. Trying to run syz-manager with debug enabled.

dvyukov avatar May 06 '24 08:05 dvyukov

No progress with --debug so far. I see it's only running a single VM despite my config asking for more. That might as well be a reason to not hit the problem. Anyways, I'll keep running for some more.

bjoernd avatar May 07 '24 12:05 bjoernd

I've run syz-manager compiled with -race for a while and it has found the problem -- logProgram() races with the code that has been waiting for the execution to have been finished.

The problem is fixed in https://github.com/google/syzkaller/pull/4802.

a-nogikh avatar May 13 '24 14:05 a-nogikh

Forgot to get back to this, but just wanted to thank you as my syzkaller instances were running fine again last week!

cpaasch avatar May 24 '24 05:05 cpaasch