pyWhat

IPv6 regex matches on "::"

Open SkeletalDemise opened this issue 4 years ago • 8 comments

Matched on: ::
Name: Internet Protocol (IP) Address Version 6
Link:  https://www.shodan.io/host/::

This shouldn't match.

SkeletalDemise avatar Oct 09 '21 18:10 SkeletalDemise

Can you explain a little more?

Alireza-Sampour avatar Oct 09 '21 19:10 Alireza-Sampour

Can you explain a little more?

It matches a false positive. Either the regex needs to change or, if the regex itself is fine, its rarity should be lowered.

ghost avatar Oct 09 '21 21:10 ghost

I think we should remove :: :)

bee-san avatar Oct 11 '21 10:10 bee-san

I think we should remove :: :)

What do you mean by that? We can't remove all instances of :: because then the IPv6 regex won't work.

SkeletalDemise avatar Oct 12 '21 03:10 SkeletalDemise

I think we should remove :: :)

What do you mean by that? We can't remove all instances of :: because then the IPv6 regex won't work.

Why not? :: will also match on things like Dictionary::Entry::Point which is not an ipv6 address. It has too many false positives and isn't helpful.

I doubt the IPv6 address regex will 100% break if we ask it to not match things which are 2 characters

Also, please explain why it won't work in the future instead of making me ask you why :) Saves a lil bit of time / effort ! :)

bee-san avatar Oct 12 '21 07:10 bee-san

I doubt the IPv6 address regex will 100% break if we ask it to not match things which are 2 characters

Sorry I got confused and thought you meant removing all instances of :: and not just :: by itself.

Why not? :: will also match on things like Dictionary::Entry::Point which is not an ipv6 address. It has too many false positives and isn't helpful.

Now I'm confused again. :: is used in IPv6 addresses. See here.

Also, please explain why it won't work in the future instead of making me ask you why :) Saves a lil bit of time / effort ! :)

The reason it wouldn't work is because some IPv6 addresses use :: in them and the regex wouldn't match those.

SkeletalDemise avatar Oct 13 '21 05:10 SkeletalDemise

I worked on the IPv6 regex, so I can explain why and in what case this happens. :)

The regex doesn't match a plain :: on its own: we wanted to rule that out, so a check was added that there must be at least one number. By default, with boundaries, the lookahead check applies only to the match, so there is no issue. In boundaryless mode, however, the check applies to the whole line. For example, in boundaryless mode, if a line has :: and a number somewhere after it, :: is reported as a match.
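To make the lookahead behaviour concrete, here is a minimal sketch in Python. It does not use pyWhat's actual IPv6 regex: `CORE`, `LINE_WIDE_CHECK`, `SCOPED_CHECK` and `first_match` are hypothetical names, the pattern is heavily simplified, and modelling boundaries as `^...$` anchors is an assumption. It only illustrates how an unscoped "at least one number" lookahead behaves once the anchors are gone, and one possible way to scope it.

```python
import re

# Simplified stand-in for an IPv6 pattern (NOT pyWhat's real regex).
HEX = r"[0-9A-Fa-f]"
GROUP = HEX + r"{1,4}"
GROUPS = rf"(?:{GROUP}(?::{GROUP})*)?"   # optional run of hex groups
CORE = rf"{GROUPS}::{GROUPS}"            # something containing "::"

# "At least one number" check that scans the rest of the line.
LINE_WIDE_CHECK = rf"(?=.*{HEX})"
# Hypothetical alternative: only look through address-like characters.
SCOPED_CHECK = rf"(?=[0-9A-Fa-f:]*{HEX})"

# Assumption: boundaries are modelled here as ^...$ anchors.
bounded = re.compile(rf"^{LINE_WIDE_CHECK}{CORE}$")
boundaryless = re.compile(rf"{LINE_WIDE_CHECK}{CORE}")
scoped = re.compile(rf"{SCOPED_CHECK}{CORE}")


def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(first_match(bounded, "::"))              # None: no number in the match
print(first_match(bounded, "::1"))             # ::1
print(first_match(boundaryless, "x :: y 42"))  # :: (the 42 satisfies the check)
print(first_match(scoped, "x :: y 42"))        # None
print(first_match(scoped, "addr ::1 ok"))      # ::1
```

With anchors, the lookahead can only see the candidate itself, so a bare :: is rejected. Without them, `.*` runs to the end of the line and any later number satisfies the check. Restricting the lookahead to hex digits and colons is one way to keep the check tied to the candidate; whether that fits the real regex is untested here.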

The reason it wouldn't work is because some IPv6 addresses use :: in them and the regex wouldn't match those.

That is correct. This is why I didn't put any more restrictions in. An IPv6 address can start with, end with, or contain ::, since it is shorthand for one or more consecutive groups that are all zero.
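As a quick illustration of :: being shorthand for zero groups, the sketch below uses Python's standard `ipaddress` module to expand a few addresses. The helper name `expand` is just for this example and is not part of pyWhat.

```python
import ipaddress


def expand(addr):
    """Return the full eight-group form of an IPv6 address string."""
    return ipaddress.IPv6Address(addr).exploded


print(expand("2001:db8::1"))  # 2001:0db8:0000:0000:0000:0000:0000:0001
print(expand("::1"))          # 0000:0000:0000:0000:0000:0000:0000:0001
print(expand("fe80::"))       # fe80:0000:0000:0000:0000:0000:0000:0000

# A bare "::" also parses: it is the all-zero "unspecified" address.
print(expand("::"))                                 # all eight groups are 0000
print(ipaddress.IPv6Address("::").is_unspecified)   # True
```

So a bare :: is syntactically a valid address, which is why excluding it has to be a deliberate extra check rather than something the address grammar gives for free.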

amadejpapez avatar Oct 13 '21 06:10 amadejpapez