randexp.js

Lookaheads generate content and negative lookaheads have no effect

Open blixt opened this issue 8 years ago • 10 comments

Lookaheads are only assertions that the following character(s) match, but don't actually consume characters. It seems randexp generates random content for them, however:

^(?=a).$ // can only match a single "a" but generates two characters ("a" and a random one)
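
The non-consuming behaviour can be checked against the native JavaScript engine, independently of randexp. A minimal sketch (the variable names are only for illustration):

```javascript
// A positive lookahead asserts without consuming: (?=a) checks that the next
// character is "a", then "." consumes that very same character.
const regex = /^(?=a).$/;

const matchesA = regex.test('a');   // the only string this pattern accepts
const matchesAx = regex.test('ax'); // two characters, so "$" fails after "."
const matchesB = regex.test('b');   // the lookahead fails on "b"

console.log(matchesA, matchesAx, matchesB); // true false false
```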

Negative lookaheads should remove possible matches ahead, but randexp ignores them:

^(?!a)[a-b]$ // can only match "b" but also generates "a"
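
The same kind of check works for the negative case. This sketch filters every string the character class alone could produce through the native engine, to show which ones the lookahead rules out:

```javascript
// [a-b] on its own allows "a" and "b"; the negative lookahead (?!a)
// rejects any position where the next character is "a".
const regex = /^(?!a)[a-b]$/;

const candidates = ['a', 'b'];
const accepted = candidates.filter((s) => regex.test(s));

console.log(accepted); // [ 'b' ]
```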

blixt avatar Sep 03 '15 21:09 blixt

example of this behavior in https://github.com/fent/randexp.js/issues/17

georgePadolsey avatar Sep 05 '15 14:09 georgePadolsey

Going to unify #15 and #17 into this issue.

Right now, I don't know how this would be accomplished, especially with groups within groups and groups that come after lookaheads.

I'll have to think about it some more.

fent avatar Sep 21 '15 02:09 fent

I think the behaviour with a lookahead should be the same as without one. For example, randexp('x(?=y)') should produce the same kind of strings as randexp('xy'), because I think it is important to satisfy the tautology rule described in #19.

It is true that 'xy'.match(/x(?=y)/) returns ['x']. But /x(?=y)/.test('x') is false. For that reason, I think randexp('x(?=y)') should return 'xy', because then /x(?=y)/.test(randexp(/x(?=y)/)) is true.
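
The difference between what is matched and what is required can be seen directly in the native engine. A small sketch; since the result of match is an array, it is compared by its first element here:

```javascript
const regex = /x(?=y)/;

// The "y" must be present, but it is not part of the matched text.
const matched = 'xy'.match(regex);
console.log(matched[0]);        // "x"
console.log(matched[0].length); // 1

// Without a following "y" the assertion fails, so "x" alone is rejected.
const xAlone = regex.test('x');
const xThenY = regex.test('xy');
console.log(xAlone, xThenY);    // false true
```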

In summary, I think lookahead expressions are not useful for generating random strings: any string produced by a regular expression with lookaheads can also be produced by a regular expression without them.

xgbuils avatar Jul 18 '16 18:07 xgbuils

@xgbuils This issue is different than what you describe. Look again at the example:

var regex = /^(?=a).$/;
'a'.match(regex); // ["a"]
var example = randexp(regex); // "ax"
regex.test(example); // false
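
Until lookaheads are handled by the generator itself, one possible workaround is to reject outputs that the native engine does not accept. The sketch below is not part of randexp: `generateMatching` and the stand-in generator `sloppy` are hypothetical names, and `sloppy` only imitates the behaviour reported above. It also assumes the regex has no `g` or `y` flag, since those make `test` stateful.

```javascript
// Hypothetical helper: retry a generator until the native regex accepts its
// output. `generate` stands in for any string generator (for example a
// randexp call); it is an assumption of this sketch, not randexp's API.
function generateMatching(regex, generate, maxTries = 1000) {
  for (let i = 0; i < maxTries; i++) {
    const candidate = generate();
    if (regex.test(candidate)) return candidate;
  }
  throw new Error(`no match for ${regex} after ${maxTries} tries`);
}

// Stand-in generator that imitates the reported bug: about half the time it
// emits an extra character after the "a".
const sloppy = () => (Math.random() < 0.5 ? 'a' : 'ax');

const example = generateMatching(/^(?=a).$/, sloppy);
console.log(example); // "a"
```

Rejection like this only helps when valid outputs are reasonably likely; for a pattern whose lookaheads exclude almost everything, the retry limit would be hit instead.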

blixt avatar Jul 18 '16 18:07 blixt

But /^(?=a).$/ is a very weird regular expression. I cannot understand how this regexp works, because a lookahead should be used after some pattern. Otherwise, it looks like an attempt to do a lookbehind.

xgbuils avatar Jul 18 '16 19:07 xgbuils

I'm having the same issue with ^(?=[a-hj-npr-zA-HJ-NPR-Z0-9]{17}$)(?=.*[0-9])(?=.*[a-hj-npr-zA-HJ-NPR-Z])([a-hj-npr-zA-HJ-NPR-Z0-9]+)$

esperancaJS avatar Oct 13 '16 09:10 esperancaJS

@fent any chance of having lookaheads implemented in the near future? They're extremely practical when trying to do a logical AND on regexes from multiple sources... (thanks for the awesome work!)
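
To illustrate the logical AND use case, here is a minimal sketch that combines two patterns with lookaheads. The patterns `hasDigit` and `hasLetter` and the combined `both` are hypothetical examples, not taken from any real source:

```javascript
// Hypothetical patterns, e.g. supplied by two different sources.
const hasDigit = /[0-9]/;
const hasLetter = /[a-z]/;

// Wrap each source in its own lookahead so that both must hold for the same
// string. The (?:...) group keeps a top-level alternation in a source intact.
const both = new RegExp(
  `^(?=.*(?:${hasDigit.source}))(?=.*(?:${hasLetter.source})).+$`
);

console.log(both.test('a1'));  // true
console.log(both.test('abc')); // false: no digit
console.log(both.test('123')); // false: no letter
```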

JonathanMontane avatar Oct 14 '16 14:10 JonathanMontane

I might take a shot at it eventually, but not anytime soon. :/

fent avatar Oct 15 '16 20:10 fent

@PedroEsperanca what are you expecting to get back, and what are you actually getting back?

WesTyler avatar Oct 17 '16 15:10 WesTyler

a VIN number.

Something like:

1MEFM50293A615995

(taken from http://randomvin.com/)
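
As a quick check, the sample above can be tested against the pattern from the earlier comment. This sketch assumes the two short lookaheads are meant to read `(?=.*[0-9])` and `(?=.*[a-hj-npr-zA-HJ-NPR-Z])`, i.e. "contains at least one digit" and "contains at least one allowed letter":

```javascript
// Pattern under the assumption stated above: 17 allowed characters in total,
// with at least one digit and at least one letter other than I, O, Q.
const vinPattern =
  /^(?=[a-hj-npr-zA-HJ-NPR-Z0-9]{17}$)(?=.*[0-9])(?=.*[a-hj-npr-zA-HJ-NPR-Z])([a-hj-npr-zA-HJ-NPR-Z0-9]+)$/;

const sampleOk = vinPattern.test('1MEFM50293A615995');  // the sample above
const digitsOnly = vinPattern.test('12345678901234567'); // 17 digits, no letter
const tooShort = vinPattern.test('1MEFM50293A61599');    // only 16 characters

console.log(sampleOk, digitsOnly, tooShort); // true false false
```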

esperancaJS avatar Oct 19 '16 09:10 esperancaJS