rustling-ontology

Year not identified for dates before the Unix epoch (January 1 1970)

Open koenvervloesem opened this issue 6 years ago • 10 comments

Parsing Error

Rustling doesn't identify the year in dates before the Unix epoch (January 1 1970).

This is similar to #102 and the same issue as in my comment there, but I am filing it as a new issue because I found the exact date where it goes wrong.

Version

0.17.7

Language

en

Parser input

december 31 1969

Parser output

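+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p)       | p          | text             | value                                                                                               |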
+====+==============+============+==================+=====================================================================================================+
| 1  | -0.072079904 | 0.9304565  | ____________1969 | Integer(IntegerOutput(1969))                                                                        |
+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+
| 0  | -0.17216337  | 0.84184164 | december 31_____ | Time(TimeOutput { moment: 2019-12-31T00:00:00+01:00, grain: Day, precision: Exact, latent: false }) |
+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+

Parser expected output

+----+------------+-----------+------------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p)     | p         | text             | value                                                                                               |
+====+============+===========+==================+=====================================================================================================+
| 0  | -0.4431368 | 0.6420194 | december 31 1969 | Time(TimeOutput { moment: 1969-12-31T00:00:00+01:00, grain: Day, precision: Exact, latent: false }) |
+----+------------+-----------+------------------+-----------------------------------------------------------------------------------------------------+

koenvervloesem commented Jan 20 '19 14:01

It also goes wrong when you add a time.

Parser input

december 31 1969 23:59

Parser output

+----+--------------+------------+------------------------+--------------------------------------------------------------------------------------------------------+
| ix | log(p)       | p          | text                   | value                                                                                                  |
+====+==============+============+========================+========================================================================================================+
| 2  | -0.072079904 | 0.9304565  | ____________1969______ | Integer(IntegerOutput(1969))                                                                           |
+----+--------------+------------+------------------------+--------------------------------------------------------------------------------------------------------+
| 1  | -0.11270594  | 0.89341336 | _________________23:59 | Time(TimeOutput { moment: 2019-01-20T23:59:00+01:00, grain: Minute, precision: Exact, latent: false }) |
+----+--------------+------------+------------------------+--------------------------------------------------------------------------------------------------------+
| 0  | -0.17216337  | 0.84184164 | december 31___________ | Time(TimeOutput { moment: 2019-12-31T00:00:00+01:00, grain: Day, precision: Exact, latent: false })    |
+----+--------------+------------+------------------------+--------------------------------------------------------------------------------------------------------+

Parser expected output

+----+-----------+------------+------------------------+--------------------------------------------------------------------------------------------------------+
| ix | log(p)    | p          | text                   | value                                                                                                  |
+====+===========+============+========================+========================================================================================================+
| 0  | -1.499431 | 0.22325715 | december 31 1969 23:59 | Time(TimeOutput { moment: 1969-12-31T23:59:00+01:00, grain: Minute, precision: Exact, latent: false }) |
+----+-----------+------------+------------------------+--------------------------------------------------------------------------------------------------------+


koenvervloesem commented Jan 20 '19 14:01

Other cases in https://github.com/snipsco/rustling-ontology/issues/102

rosastern commented Sep 27 '19 09:09

@kali or @hdlj, it would be great to investigate this with your help!

rosastern commented Oct 04 '19 09:10

I also have this issue. Looking at the code in moment/src/interval_constraints.rs, you can see that 1970 is hardcoded. On a local copy I tested what would happen if I changed it to 1900, for example, and with that change it was able to resolve dates after 1900.

pub fn for_reference(now: Interval<T>) -> Context<T> {
    // TODO: Should be refactor with the min, max date offer by chrono crate
    let now_end = now.end_moment();
    let max_year = if 2038 > now_end.year() + 70 {
        now_end.year() + 70
    } else {
        2038
    };
    let min_year = if 1970 < now.start.year() - 70 {
        now.start.year() - 70
    } else {
        1970
    };
    let min_interval = Interval::starting_at(
        Moment(now.timezone().ymd(min_year, 1, 1).and_hms(0, 0, 0)),
        Grain::Second,
    );
    let max_interval = Interval::starting_at(
        Moment(now.timezone().ymd(max_year, 1, 1).and_hms(0, 0, 0)),
        Grain::Second,
    );
    Context::new(now, min_interval, max_interval)
}

Do you think we can change this value?
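
For illustration, the year clamping in for_reference above can be reduced to a small standalone sketch. This is not code from the repository: `year_bounds`, `floor`, and `ceiling` are hypothetical names, with `floor` and `ceiling` standing in for the hardcoded 1970 and 2038, and a single `now_year` standing in for both `now.start.year()` and `now_end.year()`.

```rust
/// Hypothetical standalone version of the year clamping in `for_reference`.
/// `floor` and `ceiling` replace the hardcoded 1970 and 2038 (assumption
/// made for illustration; the real function takes an `Interval<T>`).
fn year_bounds(now_year: i32, floor: i32, ceiling: i32) -> (i32, i32) {
    // Lower bound: 70 years before the reference year, but never below `floor`.
    let min_year = (now_year - 70).max(floor);
    // Upper bound: 70 years after the reference year, but never above `ceiling`.
    let max_year = (now_year + 70).min(ceiling);
    (min_year, max_year)
}
```

With a reference year of 2019, `year_bounds(2019, 1970, 2038)` gives `(1970, 2038)`, which is why the year in "december 31 1969" falls outside the supported range. Lowering the floor to 1900 gives `(1949, 2038)`: the 70-year window then becomes the limiting factor rather than the floor.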

canicegen commented Jun 26 '20 14:06

@canicegen I have opened PR https://github.com/snipsco/rustling-ontology/pull/212 in relation to this.

schutza commented Jul 03 '20 17:07

Hello, unfortunately that's not an issue we'll be able to tackle this way. I'll let @hdlj comment and explain why :)

RosaSternSonos commented Jul 16 '20 11:07

Thanks for opening this issue!

Rustling computes a date relative to a given context. This context gives the algorithms the minimum and maximum dates to support. The reference context is designed to be cross-platform. The issue was with Raspbian (a 32-bit system); it is not an issue on 64-bit systems.

When Rustling parses a date before 1970 on Raspbian, it returns a date between 1970 and 2038 without throwing an error. The reference Context is there to prevent this situation.

Removing this constraint is not a good idea. However, if the platform you are using doesn't have this issue, you should be able to build the ResolverContext as you want (1900 -> 2100 or 1000 -> 3000, for instance). Unfortunately, this API is not available yet, but it would be easy to add.

Thanks for using Rustling :) !!

hdlj commented Jul 16 '20 15:07

@hdlj - thank you for your comments. If this is a guard to protect 32-bit systems, would you accept altering the for_reference function to check the architecture and branch on it? Something like:

if env::consts::ARCH.ends_with("64") {
    // 64-bit, so allow a wider range of years
} else {
    // current behavior: 1970 as the minimum year
}

That way there is no need to wire through an extra API?
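
A minimal sketch of how the conditional above could select the lower year bound. The helper name `min_year_floor` and the value 1900 are assumptions chosen for illustration, not part of rustling-ontology. The architecture string is passed as a parameter so the logic can be checked without depending on the machine it is compiled on.

```rust
/// Hypothetical helper: choose the earliest supported year from an
/// architecture string such as the value of `std::env::consts::ARCH`.
fn min_year_floor(arch: &str) -> i32 {
    if arch.ends_with("64") {
        // 64-bit target: allow a wider range (1900 is an illustrative value).
        1900
    } else {
        // Current behavior: keep 1970 as the minimum year.
        1970
    }
}
```

In real use the call would be `min_year_floor(std::env::consts::ARCH)`. For example, "x86_64" and "aarch64" select 1900, while "arm" keeps 1970.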

canicegen commented Jul 16 '20 19:07

Many thanks @canicegen for your comment. Indeed, we should apply the restriction only on 32-bit operating systems. Does your approach work if we install Raspbian (32-bit) on a 64-bit architecture (Raspberry Pi 3)?

In PR #212 I have added the manual API to bypass the hardcoded span of 70 years.

hdlj commented Jul 17 '20 12:07

Hi @hdlj - I just did a quick test on a Raspberry Pi 3B with Raspbian and it looks OK. The CPU is a Broadcom BCM2837RIFBG, which is 64-bit.

cat /etc/os-release 

PRETTY_NAME="Raspbian GNU/Linux 8 (jessie)"
NAME="Raspbian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

uname -m

armv7l


use std::env;

fn main() {
    println!("{}", env::consts::ARCH);
}


arm
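
The `arm` output above is consistent with `std::env::consts::ARCH` being a compile-time constant: it describes the target the binary was built for (here presumably a 32-bit ARM toolchain on Raspbian), not the physical CPU. As a sketch of an alternative check, the pointer width of the compilation target can be tested directly. The function name `is_64_bit_target` is hypothetical, and whether pointer width is the right proxy for the platform's time representation is an assumption.

```rust
/// Hypothetical check: true when the binary was compiled for a target
/// with 64-bit pointers. Evaluated at compile time via `cfg!`.
fn is_64_bit_target() -> bool {
    cfg!(target_pointer_width = "64")
}
```

Unlike matching on the architecture name's suffix, this does not depend on how each architecture happens to be named.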

canicegen commented Jul 17 '20 15:07