parsedatetime
Unexpected output with cal.nlp('Sunday is a time of rest')
Using version 1.4. I'm wondering why I got a datetime back at all.
CODE:
    import parsedatetime
    cal = parsedatetime.Calendar()
    nlp = cal.nlp('Sunday is a time of rest')
    print(nlp)
OUTPUT: ((datetime.datetime(2014, 11, 2, 17, 52, 48), 1, 0, 6, 'sunday'),)
The nlp() routine tries every word in the string to find one that could be parsed as a date/time, so it found "Sunday" and went with it.
@bear Hmm. OK. Makes sense.
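For anyone hitting the same thing, here is a minimal sketch of a caller-side workaround: filter the nlp() result and drop matches whose text is nothing but a weekday name. It assumes the result shape shown in the OUTPUT above (datetime, flag, start, end, matched text). The helper name `drop_bare_weekdays` and the `WEEKDAYS` set are made up for illustration and are not part of parsedatetime.

```python
import datetime

# Full weekday names only; abbreviations such as "sun" are not handled here.
WEEKDAYS = {
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday",
}


def drop_bare_weekdays(nlp_results):
    """Keep only matches whose matched text is more than a lone weekday name.

    Hypothetical helper: expects tuples shaped like the OUTPUT above,
    (datetime, flag, start, end, matched_text). A None input is treated
    as "no matches".
    """
    return tuple(
        match
        for match in (nlp_results or ())
        if match[4].strip().lower() not in WEEKDAYS
    )


# The result from the report above, pasted in by hand.
sample = ((datetime.datetime(2014, 11, 2, 17, 52, 48), 1, 0, 6, "sunday"),)
print(drop_bare_weekdays(sample))  # → ()
```

This only hides the match after the fact; it does not change what nlp() itself considers a date.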
I was hoping parsedatetime could return some level of confidence that it found a day, month, and year (a date without a time) close together in the string, perhaps based on a Levenshtein distance. I need to be able to say: "Hey, this is most probably a date, because I found a day, month, and year very close to one another in this string."
Perhaps parse() or nlp() could take an optional parameter that makes it output nulls when it does not find a day, month, or year? The output would then look like:
OUTPUT: ((datetime.datetime(null, null, null, null, null, null), null, null, null, 'sunday'),)
Another option would be returning 4 for "invalid date, missing either a day, month, or year" and adding it to the existing flag values:
0 = not parsed at all
1 = parsed as a C{date}
2 = parsed as a C{time}
3 = parsed as a C{datetime}
4 = parsed as an invalid date, missing either a day, month, or year
Thoughts?
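To make the "flag 4" idea concrete, here is a rough sketch of how a caller could approximate it today, outside the library. It checks only whether a day number, a month name, and a four-digit year appear somewhere in the text; it does not measure how close together they are, so the Levenshtein part of the proposal is not covered. The names `date_parts_found`, `adjusted_flag`, and `INCOMPLETE_DATE` are hypothetical, and the value 4 is just the one proposed above, not something parsedatetime returns.

```python
import re

# Hypothetical flag value from the proposal above; not part of parsedatetime.
INCOMPLETE_DATE = 4

MONTH_RE = re.compile(
    r"\b(january|jan|february|feb|march|mar|april|apr|may|june|jun|july|jul|"
    r"august|aug|september|sept|sep|october|oct|november|nov|december|dec)\b"
)
DAY_RE = re.compile(r"\b([12]?\d|3[01])(st|nd|rd|th)?\b")
YEAR_RE = re.compile(r"\b(19|20)\d{2}\b")


def date_parts_found(text):
    """Report which of day, month, and year appear in the text.

    Crude heuristic: the verb "may" counts as a month and any standalone
    number up to 31 counts as a day.
    """
    lowered = text.lower()
    return {
        "day": bool(DAY_RE.search(lowered)),
        "month": bool(MONTH_RE.search(lowered)),
        "year": bool(YEAR_RE.search(lowered)),
    }


def adjusted_flag(source_text, flag):
    """Downgrade a date (1) or datetime (3) flag to 4 if any part is missing."""
    if flag in (1, 3) and not all(date_parts_found(source_text).values()):
        return INCOMPLETE_DATE
    return flag


print(adjusted_flag("Sunday is a time of rest", 1))  # → 4
print(adjusted_flag("Meet on 2 November 2014", 1))   # → 1
```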
I've always wanted to add a new result to both parse() and nlp() that was a list of possibles with weights - so your Levenshtein distance value idea would be a perfect use case for this.
So nlp("Sunday I owe $300") would return something like: [((2014, 11, 3, 12, 0, 0, 0, 0, 1), 2, "Sunday", 0), ((2014, 10, 29, 3, 0, 0, 2, 302, 1), 2, "$300", 5)]
So yea, something like the above being returned when a parameter is enabled ... +1 for sure
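A quick sketch of how a caller might consume such a weighted list if it existed. It assumes each entry is (time tuple, flag, matched text, weight) as in the example, and that a lower weight means a more likely match, which fits a distance-style score but is only a guess at the intended meaning. `best_candidate` is a made-up name, and nothing here is an existing parsedatetime feature.

```python
def best_candidate(candidates):
    """Return the entry with the lowest weight, or None for an empty list.

    Assumes (time_tuple, flag, matched_text, weight) entries and that a
    lower weight means a better match.
    """
    return min(candidates, key=lambda entry: entry[3], default=None)


# The hypothetical return value from the example, with balanced parentheses.
candidates = [
    ((2014, 11, 3, 12, 0, 0, 0, 0, 1), 2, "Sunday", 0),
    ((2014, 10, 29, 3, 0, 0, 2, 302, 1), 2, "$300", 5),
]

print(best_candidate(candidates)[2])  # → Sunday
```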