scalacheck-gen-regexp
A library for creating ScalaCheck generators from regular expressions
Cross-built for Scala 2.12/2.13/3.1/3.2
Installation
In your `build.sbt`:
libraryDependencies += "io.github.wolfendale" %% "scalacheck-gen-regexp" % "[VERSION]"
Quick start
import org.scalacheck.Gen
import wolfendale.scalacheck.regexp.RegexpGen

val generator: Gen[String] = RegexpGen.from("[1-9]\\d?(,\\d{3})+")
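
As a rough usage sketch, the resulting generator can be passed to `Prop.forAll` like any other `Gen[String]`. The object and property names below are illustrative only, and the property assumes that generated strings match the source pattern under `java.util.regex` semantics, which this README implies but does not state outright.

```scala
import org.scalacheck.{Gen, Prop}
import wolfendale.scalacheck.regexp.RegexpGen

object QuickStartExample {
  // Same pattern as the quick start: one or two leading digits,
  // followed by one or more ",ddd" groups.
  val pattern: String = "[1-9]\\d?(,\\d{3})+"

  val generator: Gen[String] = RegexpGen.from(pattern)

  // Assumption: every generated string matches the pattern it was built from.
  val matchesPattern: Prop = Prop.forAll(generator) { s =>
    s.matches(pattern)
  }
}
```

The property can then be registered in a `Properties` suite or run through whichever test framework integration you already use.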
Supported syntax
| Feature | Example | Notes |
|---|---|---|
| Literals | `a`, `\\w`, `7` | Literals are transformed into constant generators |
| Character Classes | `[abc]`, `[^abc]`, `[a-zA-Z0-9]` | Character classes are transformed with `Gen.oneOf` |
| Default Classes | `\w`, `\d`, `\S`, `.` | These are treated as predefined character classes |
| Quantifiers | `a?`, `b+`, `c*`, `d{3}`, `e{4,5}`, `f{5,}` | These use `Gen.listOfN` to create sized lists of the preceding term |
| Groups | `(abc)`, `(?:def)` | Backreferences are not supported; groups can only be used for grouping terms |
| Alternates | `a\|b\|c`, `a(b\|c)d` | Alternates are also transformed with `Gen.oneOf` |
| Boundaries | `^`, `$`, `\b` | Although these will be parsed, they do not modify the generator output |
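
To make the mapping in the table more concrete, here is a hand-written sketch of roughly what a pattern such as `[abc]\d{3}(x|y)` corresponds to in plain ScalaCheck. It only illustrates the combinators named in the table. It is not the library's actual implementation, and every name in it is made up for the example.

```scala
import org.scalacheck.Gen

object HandWrittenEquivalent {
  // Character class [abc]: one option chosen with equal weighting.
  val charClass: Gen[Char] = Gen.oneOf('a', 'b', 'c')

  // Default class \d, treated here as the predefined class [0-9] (an assumption).
  val digit: Gen[Char] = Gen.choose('0', '9')

  // Quantifier {3}: a fixed-size list of the preceding term.
  val threeDigits: Gen[List[Char]] = Gen.listOfN(3, digit)

  // Alternation (x|y): literals become constant generators, combined with oneOf.
  val alternate: Gen[String] = Gen.oneOf(Gen.const("x"), Gen.const("y"))

  // Concatenate the pieces in pattern order.
  val combined: Gen[String] = for {
    c  <- charClass
    ds <- threeDigits
    a  <- alternate
  } yield s"$c${ds.mkString}$a"
}
```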
Unsupported syntax
| Feature | Example | Notes |
|---|---|---|
| Backreferences | `([ab]\1)` | With the current implementation there is no simple way to do this; it is definitely under consideration for a future release |
| Octal / Hex / Special Literals | `\012`, `\xF1`, `\p{Lower}` | Most of these should be simple to implement, but I wanted to get an initial release out first |
| Character Class Intersection | `[a&&[b]]`, `[a[b]]` | Difficult to implement currently but not impossible; definitely under consideration for a future release |
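
Until backreferences are supported, one possible workaround is to generate the repeated part once and reuse it with ordinary `Gen` combinators. The sketch below approximates a pattern such as `([ab])\1`. It is a suggestion rather than a documented feature, and the object and value names are hypothetical.

```scala
import org.scalacheck.Gen
import wolfendale.scalacheck.regexp.RegexpGen

object BackreferenceWorkaround {
  // Generate the contents of the capturing group on its own.
  val group: Gen[String] = RegexpGen.from("[ab]")

  // Emulate \1 by repeating whatever the group produced.
  val doubled: Gen[String] = group.map(g => g + g)
}
```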
Tips for usage
- In order to represent any character, `RegexpGen#from` takes an implicit `Arbitrary[Char]`. ScalaCheck provides a default instance; however, for most uses you probably want to provide your own.
- If you use the `+` or `*` quantifiers you'll end up with huge variance in string sizes. If this isn't what you want, consider bounding the lengths of certain string segments with the `{min,max}` quantifier.
- Negated character classes / default classes are implemented by generating an arbitrary `Char` within certain bounds via `suchThat`. Because of this, you can end up throwing away a lot of cases, and in certain circumstances your tests may fail. Try to refactor out negated cases where you can.
- In character classes each option is given equal weighting. If you'd prefer to weight a particular entry you can add it multiple times; this is made easier with string interpolation: `s"[${"a-z"*5}\\s]"` (the backslash is doubled because the `s` interpolator processes escapes). In this example the generator is 5 times more likely to generate a lowercase letter than a whitespace character.
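
The tips above can be combined roughly as follows. This is a sketch under assumptions: restricting arbitrary characters to alphanumerics is just one possible choice of `Arbitrary[Char]`, the comment about `.` reflects how the first tip reads rather than verified behaviour, and the object and value names are illustrative.

```scala
import org.scalacheck.{Arbitrary, Gen}
import wolfendale.scalacheck.regexp.RegexpGen

object TipsExample {
  // Supply your own Arbitrary[Char]; here arbitrary characters are limited
  // to ASCII letters and digits (an arbitrary choice for this example).
  implicit val alphaNumChars: Arbitrary[Char] = Arbitrary(Gen.alphaNumChar)

  // Bound repetition with {min,max} instead of + or * to keep sizes predictable.
  // Assumption: "." draws its characters from the implicit Arbitrary[Char] above.
  val bounded: Gen[String] = RegexpGen.from(".{3,8}")

  // Repeat an entry to weight it. The s interpolator processes escapes,
  // so the backslash in \s has to be doubled.
  val weightedPattern: String = s"[${"a-z" * 5}\\s]"

  val weighted: Gen[String] = RegexpGen.from(weightedPattern)
}
```

Here `weightedPattern` expands to `[a-za-za-za-za-z\s]`, so the class contains five copies of the `a-z` range and one `\s` entry.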