Incorrect Recognition of "in [month]" pattern
The parser fails to correctly process date strings that consist of "in" followed by an abbreviated month name (e.g., "in Jun"). The parser strips the "in" and identifies the month (e.g., "Jun"), but the resulting text does not retain "in". When that text is parsed again on its own, the standalone month is not recognized as a date.
Environment
- Version: "chrono-node": "^2.7.5"
Steps to Reproduce
- Configure a new chrono instance with default settings:
  const configuration = chrono.casual.defaultConfig.createCasualConfiguration(false);
  const chronoInstance = new chrono.Chrono(configuration);
  const forwardFrom = new Date();
- Parse the string "in Jun" with the following settings:
  const result = chronoInstance.parse('in jun', { forwardDate: true, startDayHour: 8 });
  Expected result: result[0].text should be "in jun".
  Actual result: result[0].text is "jun".
- Repeat the parsing without the "in" prefix:
  const result = chronoInstance.parse('jun', { forwardDate: true, startDayHour: 8 });
  Expected result: Date recognition for "jun".
  Actual result: No date is returned.
Expected Behavior
The parser should retain the "in" prefix in the recognized text because its removal results in the standalone month not being recognized as a date. Ideally, both "in Jun" and "Jun" should be correctly parsed as dates with the context retained when necessary.
Actual Behavior
The parser outputs "Jun" instead of "in Jun" when parsing "in Jun", and subsequently fails to recognize "Jun" as a valid date in the absence of the "in" prefix.
Proposed Fix
Retain the "in" prefix when needed to subsequently recognize the text.
Hello. Thanks for reporting this.
I do not quite agree with the expected behavior.
> The parser should retain the "in" prefix in the recognized text because its removal results in the standalone month not being recognized as a date.
We do not have a precise definition of what makes up the result text (and its index location).
What has been the case so far is: the text consists of the phrases the date/time is parsed/extracted from. It does not have to include other "clue" or "context" phrases that explain why we think the text is a date/time.
In your example, the word "in" (from "in jun") is only a clue. It is not part of the date/time.
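For callers who do need the clue word alongside the date text, one possible workaround is to widen the span on the caller's side using the result's index and text. The sketch below is only an illustration: withLeadingClue is a hypothetical helper, not part of chrono-node, and the parsed object is a hand-written stand-in for the result described in this report rather than real parser output.

```javascript
// Hypothetical helper (not part of chrono-node): widen a parsed span so that
// it includes a clue word such as "in" when that word directly precedes it.
// `result` only needs `index` and `text`, mirroring the fields on a parsed result.
// Assumption: clue words are plain letters, so they are not regex-escaped here.
function withLeadingClue(input, result, clues = ["in"]) {
  const before = input.slice(0, result.index);
  for (const clue of clues) {
    // The clue must be a whole word, followed by whitespace, at the end of `before`.
    const pattern = new RegExp(`(^|\\s)(${clue}\\s+)$`, "i");
    const match = before.match(pattern);
    if (match) {
      const clueText = match[2];
      return {
        index: result.index - clueText.length,
        text: clueText + result.text,
      };
    }
  }
  // No clue found: return the span unchanged.
  return { index: result.index, text: result.text };
}

// Hand-written stand-in for the result the report describes for "in jun".
const parsed = { index: 3, text: "jun" };
const widened = withLeadingClue("in jun", parsed);
console.log(widened.index, widened.text);
```

This keeps the parser's notion of text limited to the date phrase, while leaving it to the caller to decide which surrounding words count as context.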
> Ideally, both "in Jun" and "Jun" should be correctly parsed as dates with the context retained when necessary.
I don't think I agree with that assumption.
Unless you know the input domain/context, "Jun" or "jun" does not always mean "June". (If anything, I am not sure that "in jun" should be recognized as June either.)
--
Could you share the use cases where the current behavior is inconvenient for you?