refmt
                                
                                
                                
                                    refmt copied to clipboard
                            
                            
                            
                        Panics and OOMs in cbor decoding
Hi, I've started fuzzing go-ipld-cbor in which we use the refmt.
I've found following crashers: https://ipfs.io/ipfs/QmeGTCjuXnjVzQHZJ8krt7UtPqEKpCCDpK2T9RBJA7DtMm/ (tar here: https://ipfs.io/ipfs/QmV7nc5WhA4k5rpdtjQEC9G66boq13FtaNoec7g3jFiUZP)
You can test them with refmt cbor=pretty < crasherFile.
Some of them cause out of memory, some are tagged timeout but those probably are OOM as well.
Hey, thanks for reporting. Just to let you know, I've seen this, and started looking into it, but I'm feeling a bit under the weather and fixes might take a few days.
Looks like allocating 3472328296227680304 bytes is a bit of a no-go :)  I guess the first round of fixes here will be to double-check the sanity of length headers before continuing on to an alloc, and that should fix the basic crashers.  A "good" solution probably should count the rough total of alloc'd space over time so we can aim for finite memory usage even on outright antagonistic input -- that will take a little more coding though.
I went ahead and added the basic guardrails around overly-huge allocs in the byte and string paths in the cbor decoder. It fixes all the crashers in the dataset you provided, turning them into error messages instead. Those are merged to master now.
The further solution of keeping a stateful running tally of estimated memory demanded by the deserialization so far is something I'd still like to do as well, but I figured no harm in plucking the lowest hanging fruit first :)
Cool, this will allow me to fuzz it further. Is there any reason for 33554432 as a limit?
Also would you be interested in hosting fuzzing corpus and test directly in this repo? Corpus currently is very small <10KB.
Here is second set of crashers, all connected with either makeslice: len out of range or slice bounds out of range
crashers2.tar.gz
Yeah! I've never used this kind of tool before, so I don't know exactly how to most productively incorporate it. But I'm all ears!
Does this tool have a minimizer feature that goes further? I feel some of these should be integrated as regression test cases as well, but I'm a little adding test cases that have a bunch of coincidental fluff in them.
Arbitrarily 1024*1024*32 = 32MB.  This should probably be configurable.  I was basically going precisely for the "Cool, this will allow me to fuzz it further" reaction :D
Does this tool have a minimizer feature that goes further? Yes it does.
What you can do is have the minimized corpus tested with CI on commit (read corpus input off the file and test it).
Yeah, could you go ahead and fire off a PR with your setup? I'm been thinking about this more and I'd love to take whatever you've got and run with it.