pgloader icon indicating copy to clipboard operation
pgloader copied to clipboard

Way to change epoch milliseconds into seconds (from a csv source)?

Open empireshades opened this issue 4 years ago • 5 comments

I've got a CSV file with one field containing milliseconds since epoch expressed like '157204849300'. Using the unix-timestamp-to-timestampt transformation doesn't seem to pick up on this and instead detects the date as some time far in the future. Besides an external perl streamer, is there anyway I can use FORMATting to just drop the 3 trailing 0s in the column before insert? Thank you.

empireshades avatar Apr 12 '20 16:04 empireshades

You could use a bit of Lisp trickery to always drop the last three digits of a number.

CL-USER> (prin1-to-string (round (parse-integer "157204849300") 1000))
"157204849"

If the number is already read as an integer, drop the parse-integer call.

phoe avatar Apr 12 '20 17:04 phoe

Thanks. Can you provide an example of how this might be used in a command-file?

empireshades avatar Apr 12 '20 19:04 empireshades

I sadly cannot, as my pgloader-fu is too small; let's wait on @dimitri for that one.

phoe avatar Apr 12 '20 19:04 phoe

Have a look at https://github.com/dimitri/pgloader/issues/81 which should be enlightening as of how to make that happen.

dimitri avatar Apr 16 '20 17:04 dimitri

I was able to work out a user-defined function for parsing milliseconds-since-unix-epoch timestamps which does not lose precision, based on the existing transformation function unix-timestamp-to-timestamptz and using the hints in #81 - thanks, @dimitri and @oldlance!

epochms.lisp:

(in-package #:pgloader.transforms)

(defun unix-milliseconds-to-timestamptz (unixtime-string)
  "Takes a unix timestamp (milliseconds since beginning of 1970) and converts it
  into a string of format \"YYYY-MM-DD hh:mm:ss.sssZ\".

  Assumes that the unix timestamp is in UTC time."
  (when unixtime-string
    (let ((unixtime-ms (ensure-parse-integer unixtime-string))
          ;; Universal time uses a different epoch than unix time
          (unix-universal-diff (load-time-value
                                 (encode-universal-time 0 0 0 1 1 1970 0))))
      (multiple-value-bind    
        (unixtime millisecond)
        (floor unixtime-ms 1000)
        (multiple-value-bind
          (second minute hour date month year)
          (decode-universal-time (+ unixtime unix-universal-diff) 0)
          (format nil
                  "~d-~2,'0d-~2,'0d ~2,'0d:~2,'0d:~2,'0d.~3,'0dZ"
                  year month date hour minute second millisecond))))))

Add the file to the pgloader command line with --load-lisp-file epochms.lisp, and specify the new unix-milliseconds-to-timestamptz transformation function in the command file: CAST COLUMN example.happened_at TO timestamptz USING unix-milliseconds-to-timestamptz.

@dimitri, would you be interested in a pull request that would add this to the standard set of transformation functions? If so, any feedback on naming, functionality, or style?

jasonaowen avatar Sep 21 '22 21:09 jasonaowen