.ua domains might erroneously end up with two creation dates

Open lelutin opened this issue 2 years ago • 0 comments

One of the tests in the test suite is broken: the expectation for the creation_date field is to have a list of two dates.

This is actually a bug in the behaviour of the .ua parser: we get the correct creation date, but we also pick up the creation date of one of the contacts in the response.

Here's an example of this happening. In the parsed whois response, creation_date should contain only datetime.datetime(2002, 12, 4, 0, 0), but there is an additional date:

In [1]: import whois

In [2]: whois.whois("google.com.ua")
Out[2]: 
{'domain_name': 'google.com.ua',
 'status': ['clientDeleteProhibited',
  'clientTransferProhibited',
  'clientUpdateProhibited',
  'ok',
  'linked'],
 'registrar': 'MarkMonitor Inc.',
 'registrar_name': 'ua.markmonitor',
 'registrar_url': 'http://markmonitor.com',
 'registrar_country': 'US',
 'registrar_city': 'Meridian, Idaho',
 'registrar_address': 'US 83642 Meridian, Idaho 2150 S. Bonito Way, Suite 150',
 'registrar_email': '[email protected]',
 'registrant_name': 'Google LLC',
 'registrant_country': 'US',
 'registrant_city': 'Mountain View',
 'registrant_state': 'CA',
 'registrant_address': '1600 Amphitheatre Parkway',
 'registrant_email': '[email protected]',
 'registrant_postal_code': '94043',
 'registrant_phone': '+1.6502530000',
 'registrant_fax': '+1.6502530001',
 'admin': 'Google LLC',
 'admin_country': 'US',
 'admin_city': 'Mountain View',
 'admin_state': 'CA',
 'admin_address': '1600 Amphitheatre Parkway',
 'admin_email': '[email protected]',
 'admin_postal_code': '94043',
 'admin_phone': '+1.6502530000',
 'admin_fax': '+1.6502530001',
 'updated_date': datetime.datetime(2020, 12, 11, 1, 7, 19),
 'creation_date': [datetime.datetime(2002, 12, 4, 0, 0),
  datetime.datetime(2018, 2, 27, 21, 7, 26)],
 'expiration_date': datetime.datetime(2021, 12, 4, 0, 0),
 'name_servers': ['ns1.google.com',
  'ns2.google.com',
  'ns3.google.com',
  'ns4.google.com']}

The regexp for creation_date in WhoisUA needs to be made stricter so that it only catches the first occurrence of ^created:. I tried the following, but it didn't work. From what I understand, WhoisEntry.parse() ends up using re.findall(), which makes the negative lookbehind useless.

'creation_date':                  r'(?<!Registrant:)created: +(.+)',

I'm stumped about how to fix this.
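
To illustrate the problem outside the library, here is a minimal sketch using only the standard re module. SAMPLE is a hypothetical, heavily abbreviated response and its layout is an assumption, not a verbatim .ua registry reply. It shows that the lookbehind only inspects the characters immediately before "created:", so both lines still match, and that taking the first match of an anchored pattern returns only the domain-level date.

```python
import re

# Hypothetical, abbreviated response (layout assumed for illustration):
# a domain-level "created:" line, then a contact block with its own.
SAMPLE = """\
domain:           google.com.ua
created:          2002-12-04 00:00:00
modified:         2020-12-11 01:07:19

% Registrant:
person:           Google LLC
created:          2018-02-27 21:07:26
"""

# The attempted pattern. The negative lookbehind only checks the 11
# characters directly before "created:", and those are never "Registrant:"
# because other lines sit in between, so nothing is excluded.
attempted = re.compile(r'(?<!Registrant:)created: +(.+)')
all_matches = attempted.findall(SAMPLE)

# One alternative: anchor on the line start and keep only the first match.
first = re.search(r'^created: +(.+)$', SAMPLE, re.MULTILINE)
first_created = first.group(1) if first else None

print(all_matches)
print(first_created)
```

This only demonstrates the regex behaviour. How to make WhoisUA keep just the first match (for example by post-processing the list that parse() builds) depends on the library's internals and is left open here.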

lelutin, Aug 12 '21 04:08