fast-check
arbitrary: string matching regex
🚀 Feature Request
Test with strings matching a given regex
Motivation
fast-check doesn't do white-box testing the way afl-fuzz and similar tools do, but when testing parsers and the like we can still go a bit deeper by throwing known-valid strings that pass the "first layer of defenses".
See genex and regexp-enumerator (though we probably need one that can generate from a given seed rather than just iterate through the whole universe of possible strings).
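To make the "generate from a seed" idea concrete, here is a minimal sketch, hand-written for the single pattern `https://[a-z0-9-.]+/.*`. Every name in it (`seededRandom`, `repeatFrom`, `generateUrlLike`) is hypothetical and none of it comes from fast-check, genex or regexp-enumerator. The length bounds and the path alphabet are assumptions made to keep the output small.

```typescript
// Small deterministic PRNG (mulberry32-style): the same seed gives the same sequence.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

const HOST_CHARS = 'abcdefghijklmnopqrstuvwxyz0123456789-.'; // [a-z0-9-.]
const PATH_CHARS = 'abcdefghijklmnopqrstuvwxyz0123456789-._~/?=&'; // assumed subset of `.`

// Builds a string of length min..max (inclusive) from the given alphabet.
function repeatFrom(chars: string, min: number, max: number, next: () => number): string {
  const length = min + Math.floor(next() * (max - min + 1));
  let out = '';
  for (let i = 0; i < length; i++) {
    out += chars.charAt(Math.floor(next() * chars.length));
  }
  return out;
}

// One seed in, one matching string out: no enumeration of the whole universe.
function generateUrlLike(seed: number): string {
  const next = seededRandom(seed);
  const host = repeatFrom(HOST_CHARS, 1, 10, next); // `+` means at least one character
  const path = repeatFrom(PATH_CHARS, 0, 10, next); // `*` means zero or more
  return `https://${host}/${path}`;
}
```

Calling `generateUrlLike` twice with the same seed returns the same string, which is what makes a failing case reproducible.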
Example
- test url parser with pattern like "https://[a-z0-9-.]+/.*" and not just random strings
Actually, one could go even deeper and build CFG generators, not just regex ones, using invertible parsers like nearley, but we have to stop somewhere :)
Edit: this will help with #484 by iterating through the possibilities faster :)
Thanks for the suggestion, it will clearly push #484 a step further. Let's see how it goes for the simple case of #484 and iterate over it to try something way more powerful as you suggested 🤔
The main challenges would probably be the performance of string generation and, possibly, the ability to shrink the generated strings properly.
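To illustrate why shrinking is the delicate part, here is a naive sketch that is not how fast-check shrinks: it drops one character at a time and keeps only the candidates that still match the regex. The helper name `shrinkCandidates` is hypothetical, and the pattern is assumed to have no `g` or `y` flag (those make `test` stateful).

```typescript
// Returns every string obtained by removing exactly one character (by code point)
// from `value` that still matches `pattern`.
function shrinkCandidates(value: string, pattern: RegExp): string[] {
  const chars = Array.from(value); // split by code point, not by UTF-16 unit
  const candidates: string[] = [];
  for (let i = 0; i < chars.length; i++) {
    const candidate = chars.slice(0, i).concat(chars.slice(i + 1)).join('');
    if (pattern.test(candidate)) {
      candidates.push(candidate);
    }
  }
  return candidates;
}
```

For `'https://ab/'` and `/^https:\/\/[a-z0-9.-]+\/.*$/`, only `'https://b/'` and `'https://a/'` survive: removing anything from the fixed prefix or the mandatory `/` breaks the match. Filtering like this throws away most candidates, which is why a shrinker that knows the structure of the regex would be preferable.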
For the moment, for the regex you passed, the best way would be to use a stringOf generating characters from a-z0-9-. for the domain, a fullUnicodeString for the .*, and then combine the two and map the result to prepend the https:// prefix.
Here is the current approach to building this kind of value with fast-check:
import fc from 'fast-check';

// Characters allowed by the class [a-z0-9-.]
const alphaNumericCharacterArb = fc.mapToConstant(
  { num: 26, build: v => String.fromCharCode(v + 0x61) }, // a-z
  { num: 10, build: v => String.fromCharCode(v + 0x30) }, // 0-9
  { num: 1, build: v => '-' }, // -
  { num: 1, build: v => '.' }, // .
);
// Note: stringOf can also produce the empty string, whereas `+` in the regex
// requires at least one character.
const urlArb = fc.record({
  domain: fc.stringOf(alphaNumericCharacterArb),
  path: fc.fullUnicodeString(),
}).map(opts => `https://${opts.domain}/${opts.path}`);
Please note that there is a built-in builder for URLs: see webUrl.
Another example is:
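fc.property(
  fc.oneof(
    // IPv4 address with an optional port
    fc
      .record({
        ip: fc.ipV4(),
        port: fc.option(fc.integer({ min: 1, max: 65535 })),
      })
      .map(({ ip, port }) => `${ip}${port ? `:${port}` : ""}`),
    // IPv6 address with an optional port (bracketed when a port is present)
    fc
      .record({
        ip: fc.ipV6(),
        port: fc.option(fc.integer({ min: 1, max: 65535 })),
      })
      .map(({ ip, port }) => {
        if (port) {
          return `[${ip}]:${port}`;
        }
        return ip;
      }),
    // Bare port
    fc.integer({ min: 1, max: 65535 }).map((port) => `:${port}`)
  ),
  (address) => {
    // Placeholder predicate: exercise the code under test with `address` here.
  }
)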
which generates either an IPv4 address with an optional port, an IPv6 address with an optional port, or a bare port.
I've recently enjoyed the fc.record({}).map(()=>{}) pattern for easily creating one-off string arbitraries.
fc.stringMatching(/[\w.-]*/) would be more concise and likely more efficient than the fc.stringOf(fc.char().filter((c) => /[\w.-]/.test(c))) I'm currently using.
fast-check could parse the regex with https://github.com/fent/ret.js, map its AST to an fc.Arbitrary<number[]> of code points, and then apply String.fromCodePoint.
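A rough sketch of that AST-to-code-points idea follows. The `RegexNode` type is a hypothetical stand-in, not the actual output format of ret.js, and a plain random function is used where fast-check would compose arbitraries. The cap of 10 repetitions for `*` is also an assumption.

```typescript
// Hypothetical mini-AST standing in for what a regex parser would return.
type RegexNode =
  | { kind: 'char'; codePoint: number }
  | { kind: 'range'; from: number; to: number } // inclusive code point range
  | { kind: 'choice'; options: RegexNode[] }
  | { kind: 'sequence'; items: RegexNode[] }
  | { kind: 'repeat'; item: RegexNode; min: number; max: number };

type Random = () => number; // float in [0, 1)

function intBetween(min: number, max: number, random: Random): number {
  return min + Math.floor(random() * (max - min + 1));
}

// Walks the AST and produces the code points of one matching string.
function generateCodePoints(node: RegexNode, random: Random): number[] {
  switch (node.kind) {
    case 'char':
      return [node.codePoint];
    case 'range':
      return [intBetween(node.from, node.to, random)];
    case 'choice':
      return generateCodePoints(node.options[intBetween(0, node.options.length - 1, random)], random);
    case 'sequence': {
      const out: number[] = [];
      for (const item of node.items) out.push(...generateCodePoints(item, random));
      return out;
    }
    case 'repeat': {
      const out: number[] = [];
      const count = intBetween(node.min, node.max, random);
      for (let i = 0; i < count; i++) out.push(...generateCodePoints(node.item, random));
      return out;
    }
  }
}

// Hand-built AST for /[\w.-]*/, with `*` capped at 10 repetitions.
const wordDotDash: RegexNode = {
  kind: 'repeat',
  min: 0,
  max: 10,
  item: {
    kind: 'choice',
    options: [
      { kind: 'range', from: 0x61, to: 0x7a }, // a-z
      { kind: 'range', from: 0x41, to: 0x5a }, // A-Z
      { kind: 'range', from: 0x30, to: 0x39 }, // 0-9
      { kind: 'char', codePoint: 0x5f }, // _
      { kind: 'char', codePoint: 0x2e }, // .
      { kind: 'char', codePoint: 0x2d }, // -
    ],
  },
};

const sample = String.fromCodePoint(...generateCodePoints(wordDotDash, Math.random));
```

Working on code points until the final `String.fromCodePoint` call avoids having to think about surrogate pairs while walking the AST.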
Good news: I'm currently working on a first version of a stringMatching(Regex) helper. So far, it doesn't handle contextual parts such as ^, $ or \b. This feature will probably ship as a separate package because it needs to pull in an external dependency, so I prefer to keep it apart for the moment.
In the meantime, depending on how complex blob formats are, I may provide a built-in arbitrary for strings matching a blob as part of fast-check, while the regex one will stay apart (for now).