dateparser
Incorrect result
In [37]: dateparser.parse(date_string="11 de a")
Out[37]:
datetime.datetime(
    year=2011,
    month=7,
    day=14,
    hour=15,
    minute=57,
    second=17,
    microsecond=334749
)
This happens a lot with strings of the form "\d\d \w\w \w" (such as "11 de a"), and not with strings longer or shorter than that.
Here is a small list of patterns for the invalid date strings I've seen, along with a helper to detect them.
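import re

# Patterns for date-like strings that dateparser has parsed incorrectly.
invalid_regexp_list: list[str] = [
    r"\D*\d{1,2} \D{1,2} \D",
    r"\D*\d{1,2} \d{1,2} \D",
    r"\D*\d{1,2} \D{1,2} \D\d",
    r"\D*\d{1,2} \d{1,2} \d\D",
    r"\D*\d{1,2} \D \D{1,2}",
]

def detect_invalid_date(text: str) -> bool:
    """Return True if any known-bad pattern occurs anywhere in text."""
    for regexp in invalid_regexp_list:
        # re.search is enough: we only need to know whether a match exists.
        if re.search(regexp, text):
            return True
    return False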
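One caveat with the helper above: because it matches anywhere in the text, it also flags longer, valid strings such as "11 de agosto de 2020", which contains "11 de a". Since the problem only seems to affect strings of exactly this short shape, a whole-string match may be a better fit. The following is a minimal sketch under that assumption; `looks_like_invalid_date` and `guarded_parse` are hypothetical names, not part of dateparser, and the parser is passed in as a callable so the guard can be tried without the library installed.

```python
import re
from typing import Any, Callable, Optional

# Same patterns as in the list above, compiled once.
INVALID_PATTERNS = [
    re.compile(pattern)
    for pattern in (
        r"\D*\d{1,2} \D{1,2} \D",
        r"\D*\d{1,2} \d{1,2} \D",
        r"\D*\d{1,2} \D{1,2} \D\d",
        r"\D*\d{1,2} \d{1,2} \d\D",
        r"\D*\d{1,2} \D \D{1,2}",
    )
]


def looks_like_invalid_date(text: str) -> bool:
    """Return True only if the whole stripped string has a known-bad shape."""
    stripped = text.strip()
    return any(p.fullmatch(stripped) for p in INVALID_PATTERNS)


def guarded_parse(text: str, parser: Callable[[str], Any]) -> Optional[Any]:
    """Call parser(text) unless text has a known-bad shape (hypothetical helper)."""
    if looks_like_invalid_date(text):
        return None
    return parser(text)
```

With dateparser installed, this could be used as `guarded_parse("11 de a", dateparser.parse)`, which would return `None` instead of a fabricated datetime, while longer strings are still handed to the parser.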