dk.brics.automaton

Generate matching strings randomly instead of exhaustively

Open • davisjam opened this issue 6 years ago • 1 comment

This is not a serious pull request, but the idea might be of interest and other people might want this feature.

Generating strings exhaustively may take a long time if all one desires is a random set of matching strings.

This patch:

  • introduces a getRandomStrings method that picks transitions, and the characters on those transitions, at random, so only a sample of the matching strings is generated.
  • uses Gson for machine-parseable output.
  • modifies pom.xml so that mvn package produces a jar that can be executed to generate random strings up to the requested length.

For example, to generate 10 strings for the regex /abc[0-9]+/ with 0 probability of "excessive" exploration:

(11:02:04) jamie@woody /tmp/dk.brics.automaton $ java -jar target/brics-string-generator-1.0.jar 'abc[0-9]+' 10 0 2>&1 | grep 'RAND STR'
RAND STR: "abc68715"
RAND STR: "abc2971316"
RAND STR: "abc53"
RAND STR: "abc7732"
RAND STR: "abc3"
RAND STR: "abc679"
RAND STR: "abc503916"
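The random selection of transitions and characters described above can be sketched roughly as follows. This is not the code from the patch and does not use the dk.brics.automaton classes: the State and Transition types, the hand-built automaton for abc[0-9]+, and the stopProb parameter are all hypothetical stand-ins chosen so the example is self-contained. In particular, stopProb is not the "excessive exploration" probability from the command line above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative random walk over a tiny hand-built automaton.
// All names here are hypothetical; this is not the patch's getRandomStrings.
class RandomWalkSketch {
    // A transition accepts any character in the inclusive range [min, max].
    static final class Transition {
        final char min, max;
        final State dest;
        Transition(char min, char max, State dest) {
            this.min = min; this.max = max; this.dest = dest;
        }
    }

    static final class State {
        boolean accept;
        final List<Transition> transitions = new ArrayList<>();
        void on(char min, char max, State dest) {
            transitions.add(new Transition(min, max, dest));
        }
    }

    // Hand-built automaton for abc[0-9]+ (assumed shape, for illustration only).
    static State buildAbcDigits() {
        State s0 = new State(), s1 = new State(), s2 = new State();
        State s3 = new State(), s4 = new State();
        s0.on('a', 'a', s1);
        s1.on('b', 'b', s2);
        s2.on('c', 'c', s3);
        s3.on('0', '9', s4);
        s4.on('0', '9', s4);
        s4.accept = true;
        return s0;
    }

    // Walks from start, choosing a transition and then a character uniformly
    // at random. In an accepting state the walk stops with probability
    // stopProb, or unconditionally once it cannot continue. Returns null if
    // the walk gets stuck in a non-accepting state or hits maxLen there.
    static String randomString(State start, int maxLen, double stopProb, Random rng) {
        StringBuilder sb = new StringBuilder();
        State cur = start;
        while (true) {
            boolean canContinue = !cur.transitions.isEmpty() && sb.length() < maxLen;
            if (cur.accept && (!canContinue || rng.nextDouble() < stopProb)) {
                return sb.toString();
            }
            if (!canContinue) {
                return null;
            }
            Transition t = cur.transitions.get(rng.nextInt(cur.transitions.size()));
            char c = (char) (t.min + rng.nextInt(t.max - t.min + 1));
            sb.append(c);
            cur = t.dest;
        }
    }

    public static void main(String[] args) {
        Random rng = new Random();
        State start = buildAbcDigits();
        for (int i = 0; i < 5; i++) {
            System.out.println("RAND STR: \"" + randomString(start, 10, 0.3, rng) + "\"");
        }
    }
}
```

Each walk costs time proportional to the length of the string it produces, regardless of how many strings the language contains. Choosing uniformly among transitions does not make the result uniform over the matching strings; the distribution depends on the automaton's shape and on stopProb.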

davisjam, Jan 17 '19 16:01

Thanks for the suggestion. Just a quick comment: Random strings could be generated with many different probability distributions, and it is not clear why this one is particularly useful.

amoeller, Jan 17 '19 20:01