Datetime Profiler cannot currently detect dates whose day has an ordinal suffix

Open • JGSweets opened this issue 3 years ago • 1 comment

General Information:

  • Library version: v0.7.6

Describe the bug: The date Nov 15, 2013 is detected, but adding an ordinal suffix to the day, e.g. Nov 15th, 2013, prevents it from being detected.

To Reproduce:

import pandas as pd
from dataprofiler.profilers.datetime_column_profile import DateTimeColumn

DateTimeColumn._get_datetime_profile(pd.Series(['Nov 15th, 2013']))

# output:
# {
#     'date_formats': [],
#     'min': None,
#     'max': None,
#     'min_obj': datetime.datetime(9999, 12, 31, 23, 59, 59, 999999),
#     'max_obj': datetime.datetime(1, 1, 1, 0, 0),
#     'match_count': 0
#  }

Expected behavior:

DateTimeColumn._get_datetime_profile(pd.Series(['Nov 15th, 2013']))

# output:
# {
#     'date_formats': ['%b %d, %Y'],  # something in this should indicate a suffix, currently doesn't.
#     'min': 'Nov 15th, 2013',
#     'max': 'Nov 15th, 2013',
#     'min_obj': Timestamp('2013-11-15 00:00:00'),
#     'max_obj': Timestamp('2013-11-15 00:00:00'),
#     'match_count': 1
# }

JGSweets, Feb 09 '22 17:02

A short-term fix is probably to strip the suffixes th, nd, rd, and st so that these dates can at least be recognized.

JGSweets, Feb 16 '22 16:02
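
For illustration, the suffix-stripping idea from the comment above could look roughly like the sketch below. It is not DataProfiler code: the helper name strip_ordinal_suffix and the regular expression are assumptions made for this example, and it uses only the standard library. The pattern removes st, nd, rd, or th only when it directly follows a digit, so month names such as August are left alone.

```python
import re
from datetime import datetime

# Hypothetical helper, not part of DataProfiler.
# Matches an ordinal suffix only when it directly follows a digit
# and ends at a word boundary ("15th," matches, "August" does not).
_ORDINAL_SUFFIX = re.compile(r"(?<=\d)(st|nd|rd|th)\b", flags=re.IGNORECASE)


def strip_ordinal_suffix(value: str) -> str:
    """Return the string with day suffixes such as 'th' removed."""
    return _ORDINAL_SUFFIX.sub("", value)


# After stripping, the existing '%b %d, %Y' format can parse the value.
cleaned = strip_ordinal_suffix("Nov 15th, 2013")
parsed = datetime.strptime(cleaned, "%b %d, %Y")
```

One drawback of this approach is that the reported format ('%b %d, %Y') no longer reflects the suffix in the original string, which is the concern raised in the expected-behavior comment.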