
Segfault on Mac when testing a bad WARC that ends in a gzip header (10 bytes) with no data

Open ikreymer opened this issue 11 months ago • 5 comments

Tried out the latest release on Mac, and am getting this segfault with Browsertrix Crawler WARCs:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x68 pc=0x16932fa]

goroutine 16 [running]:
github.com/nlnwa/warchaeology/cmd/validate.validateFile({0x1912600, 0x1e30c20}, {0xc001b2c780, 0x5f})
	/home/runner/work/warchaeology/warchaeology/cmd/validate/validate.go:149 +0x37a
github.com/nlnwa/warchaeology/internal/filewalker.(*filewalker).Walk.func2.1()
	/home/runner/work/warchaeology/warchaeology/internal/filewalker/filewalker.go:224 +0x183
github.com/nlnwa/warchaeology/internal/workerpool.New.func1(0x0?)
	/home/runner/work/warchaeology/warchaeology/internal/workerpool/workerpool.go:42 +0xa5
created by github.com/nlnwa/warchaeology/internal/workerpool.New in goroutine 1
	/home/runner/work/warchaeology/warchaeology/internal/workerpool/workerpool.go:37 +0x89

Have not had time to isolate which WARC is causing it exactly, but could later, if that would be helpful. Tried both with -r and with a --source-file-list with default concurrency.

ikreymer, Mar 20 '24 20:03