
Improve versifier

Open dvyukov opened this issue 9 years ago • 3 comments

Currently versifier (automatic protocol reverse engineering) does only very basic analysis of text protocols. There are plenty of things that can be improved:

  • Rewrite analysis (currently it is more of a quick prototype); it probably should use some structured approach like Sequitur [0].
  • If a token sequence can be both, say, a list of key-value pairs and a key-value pair where the value is a list, versifier could build an AST for both possibilities and then join them using ChoiceNode. Then, during generation, we choose one of the possibilities.
  • Try to join several inputs into a single AST. For example, if one input contains a key-value pair with an alphanum value, while another input contains a key-value pair with a num value at the very same position, we could join both inputs and say that this is a key-value pair whose value can be either alphanum or num. This joining can be done on parts of the same input as well. For example, if we have an HTTP request with 10 headers, we can figure out that these headers have a similar structure and build a common dictionary of header names and a common representation of header values.
  • Explore analysis of binary protocols (see [1] and [2]).
  • Better detection of text/binary protocols. I think I saw versifier being triggered for a binary protocol.
  • Better approach for mutations. Currently it does too many mutations. Also see whether there are other interesting mutations.
  • Better testing story. E.g. unit tests could check that it recognizes an input as, say, a list of key-value pairs, while system tests could check that we can generate an output with the required properties from a given input in finite time.
  • Investigate how it works on some common inputs (xml, json, http, protobufs). This can uncover bugs in analysis and suggest new interesting mutation strategies.
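
For the text/binary detection point, one possible heuristic is sketched below. It is only an illustration, not what versifier does today: the function name `looksLikeText` and the `minRatio` threshold are hypothetical. The idea is to reject inputs that are not valid UTF-8 and then require that most bytes are printable ASCII or common whitespace.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// looksLikeText is a hypothetical heuristic (not go-fuzz's actual check).
// It reports whether data is non-empty, valid UTF-8, and at least minRatio
// of its bytes are printable ASCII or common whitespace.
func looksLikeText(data []byte, minRatio float64) bool {
	if len(data) == 0 || !utf8.Valid(data) {
		return false
	}
	printable := 0
	for _, b := range data {
		if (b >= 0x20 && b < 0x7f) || b == '\n' || b == '\r' || b == '\t' {
			printable++
		}
	}
	return float64(printable)/float64(len(data)) >= minRatio
}

func main() {
	req := []byte("GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
	bin := []byte{0x00, 0x01, 0xff, 0xfe, 'a', 'b'}
	fmt.Println(looksLikeText(req, 0.95))
	fmt.Println(looksLikeText(bin, 0.95))
}
```

A stricter threshold makes versifier less likely to trigger on binary inputs that happen to contain embedded strings, at the cost of skipping text protocols with occasional binary payloads.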

[0] https://en.wikipedia.org/wiki/Sequitur_algorithm
[1] Discoverer: Automatic Protocol Reverse Engineering from Network Traces http://research.microsoft.com/pubs/153196/discoverer-security07.pdf
[2] Reverse Engineering of Protocols from Network Traces http://www.di.fc.ul.pt/~nuno/PAPERS/WCRE11.pdf
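
The idea of joining several inputs could look roughly like the sketch below. It rests on several simplifying assumptions that are not part of versifier: inputs are plain `key=value` lines, values are matched by key rather than by position in an AST, and the `Class` type, `classify` and `joinInputs` are hypothetical names. The resulting set of classes per key plays the role that a ChoiceNode would play in a real AST.

```go
package main

import (
	"fmt"
	"strings"
)

// Class is a hypothetical value class for the value of a key-value pair.
type Class int

const (
	Num Class = iota
	AlphaNum
	Other
)

func (c Class) String() string {
	return [...]string{"num", "alphanum", "other"}[c]
}

// classify returns Num for all-digit values, AlphaNum for values made of
// ASCII letters and digits with at least one letter, and Other otherwise.
func classify(v string) Class {
	if v == "" {
		return Other
	}
	onlyDigits := true
	for _, r := range v {
		switch {
		case r >= '0' && r <= '9':
		case (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z'):
			onlyDigits = false
		default:
			return Other
		}
	}
	if onlyDigits {
		return Num
	}
	return AlphaNum
}

// joinInputs parses each input as "key=value" lines and records, per key,
// the set of value classes observed across all inputs.
func joinInputs(inputs ...string) map[string]map[Class]bool {
	joined := make(map[string]map[Class]bool)
	for _, in := range inputs {
		for _, line := range strings.Split(in, "\n") {
			i := strings.IndexByte(line, '=')
			if i < 0 {
				continue // not a key-value pair; ignored in this sketch
			}
			key, val := line[:i], line[i+1:]
			if joined[key] == nil {
				joined[key] = make(map[Class]bool)
			}
			joined[key][classify(val)] = true
		}
	}
	return joined
}

func main() {
	joined := joinInputs("id=42\nname=bob", "id=a1b2\nname=alice")
	for _, key := range []string{"id", "name"} {
		fmt.Println(key, joined[key])
	}
}
```

During generation, a value for a given key could then be drawn from any of the recorded classes, and the set of keys itself serves as the common dictionary mentioned for HTTP header names.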

dvyukov, Jul 18 '15 20:07

Also https://github.com/aoh/radamsa

dgryski, Aug 18 '15 19:08

Also https://github.com/jasantunes/reverx

dgryski, Jan 06 '17 04:01

I now have a rough initial implementation of Sequitur, based on the reference C++ code: https://github.com/dgryski/go-sequitur

dgryski, Dec 22 '17 08:12