Bug reading datetimes (floating datetime format)
When I look at my data on Socrata this column is datetime type, and I can see the date and time down to the second, but when I use read.socrata it comes in as character type and while the date is there, the times are all 00:00:00. Example:
sessiondf <- read.socrata(https://data.cambridgema.gov/resource/99xk-ybak.json, email = SocrataUsername, password = SocrataPW)
class(sessiondf$starttime) [1] "character" head(sessiondf$starttime) [1] "2020-01-01T00:00:00.000Z" "2020-01-01T00:00:00.000Z" "2020-01-01T00:00:00.000Z" "2020-01-01T00:00:00.000Z" [5] "2020-01-01T00:00:00.000Z" "2020-01-01T00:00:00.000Z" tail(sessiondf$starttime) [1] "2024-04-16T00:00:00.000Z" "2024-04-16T00:00:00.000Z" "2024-04-16T00:00:00.000Z" "2024-04-15T00:00:00.000Z" [5] "2024-04-16T00:00:00.000Z" "2024-04-16T00:00:00.000Z"
It's difficult to troubleshoot without access, but the trailing Z looks suspicious in this example "2024-04-16T00:00:00.000Z". Can you provide more detail on the format?
@levyj do you have any insight into this?
posixify seems to be missing the pattern for the trailing Z, but I'd like to confirm that this is formatted as a date time, it's a valid format, and I'm not missing anything else.
Thanks
I think it is real. It indicates UTC time zone.
https://en.wikipedia.org/wiki/ISO_8601
I would like to note that if I am right, I just out-dorked @geneorama on a date/time issue! :astonished:
@levyj I was not sure if that's an expected format for how Socrata returns timestamps. I didn't think Socrata supported time zones or even specified UTC.
I did misremember the ISO-8601 format as having the Z in the middle (replacing the T), so thanks for confirming that it's a valid representation.
I see three possibilities.
- This is text that happens to be in ISO8601 format, but it isn't stored as a timestamp and should be treated as text.
- This is just a new issue that nobody has noticed or I've forgotten.
- This is a new bug because Socrata has changed their supported formats and there may be a slow trickle of bug reports as other new formats are discovered.
If it's number 3 I'm most inclined to suggest that the user convert the column manually with something like as.POSIXct("2020-11-14",format="%Y-%m-%d")
(I would suggest doing the conversion manually anyway because it may be a while before we can patch this.)
I was able to fix the problem by changing the data type on socrata from fixed datetime to floating datetime (ie from with a timezone to without).
On Wed, Apr 24, 2024 at 7:10 PM Gene Leynes @.***> wrote:
@levyj https://github.com/levyj I was not sure if that's an expected format for how Socrata returns timestamps. I didn't think Socrata supported time zones or even specified UTC.
I did misremember the ISO-8601 format as having the Z in the middle (replacing the T), so thanks for confirming that it's a valid representation.
I see three possibilities.
- This is text that happens to be in ISO8601 format, but it isn't stored as a timestamp and should be treated as text.
- This is just a new issue that nobody has noticed or I've forgotten.
- This is a new bug because Socrata has changed their supported formats and there may be a slow trickle of bug reports as other new formats are discovered.
If it's number 3 I'm most inclined to suggest that the user convert the column manually with something like as.POSIXct("2020-11-14",format="%Y-%m-%d")
(I would suggest doing the conversion manually anyway because it may be a while before we can patch this.)
— Reply to this email directly, view it on GitHub https://github.com/Chicago/RSocrata/issues/231#issuecomment-2076011149, or unsubscribe https://github.com/notifications/unsubscribe-auth/AXIFSKCSWEV3LPPDSFU4TQDY7A3XRAVCNFSM6AAAAABGXL6VESVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANZWGAYTCMJUHE . You are receiving this because you authored the thread.Message ID: @.***>
-- Alexandra Epstein @.***
@alixepstein thanks for letting us know! I modified the issue title for this additional information which will help tracking in the future.
This is still an issue, even though you've found a way around it. I think the right way to resolve it would be to do client side time zone conversions.