pysaml2 icon indicating copy to clipboard operation
pysaml2 copied to clipboard

Does not handle valid time properly

Open talebbits opened this issue 8 years ago • 12 comments

ValueError: time data '2017-08-29T09:16:45.0631274-05:00' does not match format '%Y-%m-%dT%H:%M:%SZ'

https://github.com/rohe/pysaml2/blob/fd7a4f694b137a92f2a8b7f502d51fc21e3528c7/src/saml2/time_util.py#L19

talebbits avatar Aug 29 '17 20:08 talebbits

What fails, how and under which circumstances ?

jkakavas avatar Aug 29 '17 20:08 jkakavas

when timestr = '2017-08-29T09:16:45.0631274-05:00' this function does no handle this format https://github.com/rohe/pysaml2/blob/fd7a4f694b137a92f2a8b7f502d51fc21e3528c7/src/saml2/time_util.py#L232

talebbits avatar Aug 29 '17 20:08 talebbits

Where does this timestr comes from ?Is it an IssueInstant, a NotBeforeOrAfter ? Can we get the stacktrace ?

I think this is a valid bug and we probably need to come up with a different regexp to validate dateTime even if the one we use now matches the most commonly used format

jkakavas avatar Aug 30 '17 06:08 jkakavas

It comes as <samlp:Response and it is in every datetime attribute including IssueInstant

talebbits avatar Aug 30 '17 14:08 talebbits

I think that this could be helpfull

import datetime
import dateutil.parser

def getDateTimeFromISO8601String(s):
    d = dateutil.parser.parse(s)
    return d

s = '2017-08-29T09:16:45.0631274-05:00'
getDateTimeFromISO8601String(s)

datetime.datetime(2017, 8, 29, 9, 16, 45, 63127, tzinfo=tzoffset(None, -18000))

Tested on Python 2.7.12. It also worked with the following combinations

s = '2017-08-29T09:16:45.0631274-05:00'
s = '2017-08-29T09:16:45.0631274'
s = '2017-08-29T09:16:45.0631'
s = '2017-08-29T09'
s = '2017-08-29 03:01:05'

# italian format
s = '29/08/2017 03:01:05'

# also if string seems truncated
 s = '2017-08-29T'

In other words we could subsitute the actual str_to_time function with dateutil.parser.parse or, more in the deep, refactor the entire time_util.py file with standard python libraries.

peppelinux avatar May 27 '18 18:05 peppelinux

Looking at the SAML2 xsd, I see that

<attribute name="IssueInstant" type="dateTime" use="required"/>

IssueInstant is of type xsd:dateTime. The XML Schema datatypes spec mandates

The ·lexical space· of dateTime consists of finite-length sequences of characters of the form: '-'? yyyy '-' mm '-' dd 'T' hh ':' mm ':' ss ('.' s+)? (zzzzzz)?

This means that 2017-08-29T09:16:45.0631274-05:00 is a valid value which makes this issue a proper bug report, and it should be fixed 👍

c00kiemon5ter avatar Jun 08 '18 15:06 c00kiemon5ter

According to the dateTime datatype spec, it is mandatory to include the year, the month, the day, the hour, the minutes and the seconds. One can omit only the second fragments and the timezone info. This means that some of the results that dateutil.parser.parse returns (as proposed by @peppelinux) are invalid.

I feel it is OK to relax this rule and accept datetime values such as 29/08/2017 03:01:05 and 2017-08-29T which would otherwise be invalid by the spec. If people think this should be strict we can do that by fixing the TIME_FORMAT_WITH_FRAGMENT regex and make consistent use of that.

I also agree that time_utils.py should be refactored to use standard built-in methods to do time calculations.

c00kiemon5ter avatar Jun 08 '18 15:06 c00kiemon5ter

I went through the code and it seems pysaml2 uses many different libs:

  • time builtin module
  • datetime builtin module
  • calendar builtin module
  • dateutil lib
  • pytz lib

Additionally, not all time-related operations are handled through time_utils.py.

I also went through the XML Schema spec about dateTime datatype and I feel very weird reading that

The ·value space· of dateTime is closely related to the dates and times described in ISO 8601

meaning that the dateTime datatype does not represent/support the full ISO 8601 specification, and at the same time there is no clear text about the differences. I cannot understand why someone found it better to semi-specify a new format instead of taking advantage of a specification about the exact thing they wanted to specify in the first place 😞.

Regardless, I started looking for libraries with ISO 8601 support. To my surprise, there is not much support for a well-defined standard, but there are lots of libs. I looked at the python modules {time, datetime, calendar}, and libraries: pytz, arrow, moment, iso8601, isodatetime, ciso8601, numpy/datetime64, pandas/timeseries and finally aniso8601.

I think that aniso8601 stands out as the most complete implementation that is taking edge cases seriously. Additionally, it is a pure python implementation and works with the standard datetime python type. I think a refactoring should be based on that lib.

c00kiemon5ter avatar Jun 08 '18 19:06 c00kiemon5ter

@c00kiemon5ter you're right, I didn't want to introduce invalid datetime lenght but only show dateutils usage flexibility. Nothing more. I agree that iso format should be mandatory.

aniso8601 looks good.

I found also: pyiso8601 and iso8601utils It seems also that starting from Python 3.7 there's a native support with colon delimiters in UTC offsets. It also come with datetime.fromisoformat function.

peppelinux avatar Jun 10 '18 12:06 peppelinux

Coming back to this.

The SAML2 core specification specifies:

1.3.3 Time Values

All SAML time values have the type xs:dateTime, which is built in to the W3C XML Schema Datatypes specification [Schema2], and MUST be expressed in UTC form, with no time zone component. SAML system entities SHOULD NOT rely on time resolution finer than milliseconds. Implementations MUST NOT generate time instants that specify leap seconds.

This conflicts with the dateTime datatype specification as it disallows the timezone component and considers all dates to have been specified as UTC beforehand. The dateTime specification defines two timelines - one for timezoned dateTimes and one for untimezoned dateTimes. The SAML2 spec essentially dictates that there is only one timeline, always timezoned as UTC.

This makes things easier for the implementers, as for example, they need not consider how to compare dateTimes from different timelines (a timezoned dateTime with an untimezoned datetime.)

As such, the originally posted date 2017-08-29T09:16:45.0631274-05:00 is invalid, as it contains timezone info -05:00. The correct representation is 2017-08-29T14:16:45.063127 (converted to UTC/Z/±00:00 and timezone info removed.)

The question now becomes whether implementations actually hold that promise, and if not, how do we handle it..

c00kiemon5ter avatar Jun 18 '18 14:06 c00kiemon5ter

Personally I would like to implement the most flexible and smart way to handle time, with or without timezone, with the possibility to succesfully handle exceptions on theri occurrence.

It's more like a hopefull whish, not a real TODO

peppelinux avatar Jun 18 '18 15:06 peppelinux

This was further discussed in the mailing list and ..it seems everything is backwards. This is what the technical committee had to say:

Subject: Re: [saml-dev] dateTime "time zone component" Date: Wed, 2 Oct 2013 15:53:00 +0000

The SAML spec has this part about "MUST be expressed in UTC form, with no time zone component." which would literally support the format given above (I think; xmlschema dateTime itself would certainly allow for that) but as Ian (and Olav, I think) point out in this thread

https://groups.google.com/forum/#!topic/simplesamlphp/7m07m9gLzzA

the spec probably intended to say no timezone /offset/?

I would think so.

-- Scott

which means that the timezone information should be there, but it is only allowed to be 'Z'.


The plan is:

  • accept all (ISO-8601) formats that we can parse, and convert them to UTC.
  • if there is no tzinfo, we assume the other side is respecting the current wording of the SAML2-core specification, and thus the time is already in UTC.

All dates are normalised to UTC and can be compared and worked-with under that assumption. Dates we produce should be in ISO-8601 format always in UTC with timezone info present as 'Z'.

c00kiemon5ter avatar Jun 19 '18 15:06 c00kiemon5ter