dataprep
Date Cleaning (clean_date) fails to clean dates with 'August'
Describe the bug When cleaning dates with the clean_date function, a source date containing 'August' is not recognized as a date. All other text months, including the abbreviation 'Aug', are identified and cleaned properly.
To Reproduce
from dataprep.clean import clean_date
import pandas as pd
samp = pd.DataFrame({'date': ['2021 August 21', '2021 Aug 21', '2021 July 21', '2021 Jul 21', '2021 08 21', 'Aug 21 2021']})
clean_date(samp, 'date')
Expected behavior E.g. '2021 August 21' should be cleaned into '2021-08-21 00:00:00'.
Screenshots
Desktop (please complete the following information):
- OS: Windows 11
- Browser: N/A
- Platform: VSCode
- Platform Version: 1.80.0
- Python Version: 3.10.11
- Dataprep Version: 0.4.5
Additional context I noticed that there is already an issue open on FutureWarning: Meta is not valid.
The issue might be in tokens = split(date, JUMP), where 'st' is in the JUMP list: 'August' contains 'st', so the month name may be split apart before it can be matched.
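To illustrate the suspected cause, here is a minimal sketch of how splitting on a delimiter list that includes 'st' would break 'August' but leave 'Aug' intact. It is not dataprep's actual code: naive_split, suffix_aware_split, and the shortened JUMP list are hypothetical stand-ins, and the real split helper may behave differently.

```python
import re

# Hypothetical subset of a JUMP-like delimiter list; the real list in dataprep may differ.
JUMP = [" ", ",", "st", "nd", "rd", "th"]


def naive_split(text, delimiters):
    """Split on any delimiter by plain substring matching (assumed behavior)."""
    pattern = "|".join(re.escape(d) for d in delimiters)
    # Drop the empty strings produced by adjacent delimiters.
    return [tok for tok in re.split(pattern, text) if tok]


def suffix_aware_split(text):
    """One possible remedy: treat st/nd/rd/th as delimiters only right after a digit."""
    pattern = r"(?<=\d)(?:st|nd|rd|th)\b|[ ,]+"
    return [tok for tok in re.split(pattern, text) if tok]


print(naive_split("2021 August 21", JUMP))     # ['2021', 'Augu', '21']
print(naive_split("2021 Aug 21", JUMP))        # ['2021', 'Aug', '21']
print(suffix_aware_split("2021 August 21st"))  # ['2021', 'August', '21']
```

If the real split works like naive_split, 'Augu' would no longer match any month name, which would explain why only 'August' fails. Restricting ordinal suffixes to positions after a digit, as in suffix_aware_split, is just one option and would need to be checked against the actual implementation.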