Create regexes for profile name extraction on each site

Open WebBreacher opened this issue 4 years ago • 2 comments

OSINTCombine asked for us to go through each site and add a line for fullname extraction regex/pattern. This will allow downstream tools to extract names and match to usernames.

WebBreacher avatar Apr 15 '20 14:04 WebBreacher

[...] add a line for fullname extraction regex/pattern.

I suggest using an array of regexes rather than a single regex string. Sites commonly render user profiles in several formats. For example, the HTML view for a "pro" user may have a different layout to that of a "free" user, or a "staff" user may have a different layout to a regular "user".
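A minimal sketch of the array-of-regexes idea, assuming each pattern has one capture group and patterns are tried in order until one matches. The pattern strings, the `FULLNAME_PATTERNS` name, and the `extract_first` helper are hypothetical illustrations, not part of the WhatsMyName data format.

```python
import re

# Hypothetical patterns for a single site; each one targets a different
# profile layout (e.g. "pro" vs "free"). The HTML shapes are assumptions.
FULLNAME_PATTERNS = [
    r'<h1 class="pro-name">([^<]+)</h1>',
    r'<span class="display-name">([^<]+)</span>',
]

def extract_first(patterns, html):
    """Return group 1 of the first pattern that matches, or None."""
    for pattern in patterns:
        match = re.search(pattern, html)
        if match:
            return match.group(1).strip()
    return None
```

With this shape, supporting a new layout means appending one pattern rather than growing a single regex with alternations.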

Also, to clarify: full name extraction or username extraction?

There is benefit in implementing both. For example, the capitalization of some letters in the username might be useful (PenIsland vs PenisLand), or the displayed username might include extra characters not represented in the URL-safe version, or may be different entirely (URL-safe 420iq360nosc0pe69 vs xx_[420iq]360-no-sc(+)pe-69_xx).

In which case, if you're going to go this route, it would be useful to add fields for extracting other data (dob, address, date registered, etc.), although obviously this adds maintenance overhead.
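One way this could look, as a rough sketch only: a per-site mapping from field name to a list of candidate regexes, with each field resolved independently. The field names, patterns, and the `extract_fields` helper are all hypothetical and do not reflect any existing schema.

```python
import re

# Hypothetical per-site entry: field name -> list of candidate patterns.
# Every pattern is assumed to have exactly one capture group.
SITE_PATTERNS = {
    "fullname": [r'<h1 class="name">([^<]+)</h1>'],
    "username": [r'<span class="handle">@([^<]+)</span>'],
    "date_registered": [r'Joined\s+(\d{4}-\d{2}-\d{2})'],
}

def extract_fields(field_patterns, html):
    """Return {field: first matching capture, or None if nothing matched}."""
    results = {}
    for field, patterns in field_patterns.items():
        results[field] = None
        for pattern in patterns:
            match = re.search(pattern, html)
            if match:
                results[field] = match.group(1).strip()
                break
    return results
```

Fields with no match stay `None`, so a downstream tool can tell "not found" apart from an empty value. Each extra field is another set of patterns to keep current when a site changes its markup, which is the maintenance overhead mentioned above.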

bcoles avatar Sep 15 '20 11:09 bcoles

This may be useful for you: Maigret can extract information from dozens of sites through socid_extractor. There is also automatic generation of regexps for sites in the Maigret database. With such regexps, it is possible to extract new usernames from URLs parsed from account pages.

soxoj avatar May 08 '21 18:05 soxoj

Closing; this will not be implemented.

WebBreacher avatar Mar 14 '23 23:03 WebBreacher