
Speed increase for in memory decoding

Open psilord opened this issue 8 years ago • 4 comments

Hello, I found a way to get a roughly tenfold speedup when decoding in-memory encoded data. Thank you!

;; In the fast-io package, replace these two macros with the
;; supplied forms. The THE is the part I added. It informs the
;; compiler that fast-read-byte returns an (unsigned-byte 8).
;; This way the compiler's type inference can propagate
;; properly, which removes the need for bignum arithmetic and
;; a bunch of type checks.
;;
;; Fix this in file:
;; io.lisp in fast-io.

(defmacro read-unsigned-be (size buffer)
  (with-gensyms (value)
    (once-only (buffer)
      `(let ((,value 0))
         ,@(loop for i from (* (1- size) 8) downto 0 by 8
                 collect `(setf (ldb (byte 8 ,i) ,value) (the (unsigned-byte 8) (fast-read-byte ,buffer))))
         ,value))))

(defmacro read-unsigned-le (size buffer)
  (with-gensyms (value)
    (once-only (buffer)
      `(let ((,value 0))
         ,@(loop for i from 0 below (* 8 size) by 8
                 collect `(setf (ldb (byte 8 ,i) ,value) (the (unsigned-byte 8) (fast-read-byte ,buffer))))
         ,value))))
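
To show what the THE form buys outside of fast-io, here is a minimal standalone sketch of the same pattern. It does not use fast-io at all: `read-u32-be-sketch` and its `next-byte` argument are hypothetical names made up for illustration. Because the compiler cannot know what a FUNCALL returns, the THE is what tells it each piece fits in 8 bits.

```lisp
;; Hypothetical sketch, not part of fast-io.
;; NEXT-BYTE is assumed to be a zero-argument function that
;; returns one octet per call.
(defun read-u32-be-sketch (next-byte)
  (declare (type function next-byte))
  (let ((value 0))
    ;; Same shape as the expansion of read-unsigned-be for size 4:
    ;; the most significant byte is read first.
    (setf (ldb (byte 8 24) value) (the (unsigned-byte 8) (funcall next-byte)))
    (setf (ldb (byte 8 16) value) (the (unsigned-byte 8) (funcall next-byte)))
    (setf (ldb (byte 8 8) value) (the (unsigned-byte 8) (funcall next-byte)))
    (setf (ldb (byte 8 0) value) (the (unsigned-byte 8) (funcall next-byte)))
    value))
```

Calling it with a closure that pops the octets 1, 2, 3, 4 from a list yields #x01020304. Whether the declaration changes the generated code depends on the compiler and its version.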

psilord avatar Jun 11 '16 18:06 psilord

Looking at this more closely, my change actually has an error. I hadn't realized that fast-read-byte could also return an eof-value in addition to an unsigned-byte. I would suggest fixing it to return multiple values (via VALUES) when eof-value is not an (unsigned-byte 8).

Hrm, thinking about it more: there is no generic eof-value that can be an (unsigned-byte 8), since it would overlap the domain of bytes that fast-read-byte can normally return. So returning multiple values is very much a solution here, because it carries the eof-value outside the domain of (unsigned-byte 8).

It also seems that no current code uses the eof-value feature, so my fix is valid for now, but it is a hidden land mine.
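
A rough sketch of the multiple-values idea, reading from a plain octet vector rather than a fast-io buffer. The name `read-octet-or-eof` and its argument list are assumptions for illustration and are not fast-io's API. The primary value is always an (unsigned-byte 8), and end of input is reported in a second value.

```lisp
;; Hypothetical helper, not part of fast-io.
;; Returns (values octet nil) when INDEX is in range,
;; and (values 0 t) when INDEX is at or past the end.
(defun read-octet-or-eof (octets index)
  (declare (type (simple-array (unsigned-byte 8) (*)) octets)
           (type fixnum index))
  (if (< index (length octets))
      (values (aref octets index) nil)
      (values 0 t)))

;; Example caller: sum all octets, stopping when EOF is signalled.
(defun sum-octets-sketch (octets)
  (loop with total = 0
        for index from 0
        do (multiple-value-bind (octet eof-p) (read-octet-or-eof octets index)
             (when eof-p (return total))
             (incf total octet))))
```

With this shape, a caller that wraps the primary value in (the (unsigned-byte 8) ...) stays correct even at end of input, since EOF never travels through the primary value.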

psilord avatar Jun 11 '16 22:06 psilord

And here is a patch that changes fast-read-byte to fill a 1 MB cache vector when reading from the stream, as opposed to reading byte by byte. This made my read performance from disk about 25% faster.

This patch also includes the above patch to allow type inference to happen.

io.txt
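
For readers without the attachment, the general buffering idea can be sketched as follows. This is not the contents of the patch: `chunked-reader`, `make-chunked-reader`, and `chunked-read-byte` are hypothetical names, and the code assumes a binary input stream with element type (unsigned-byte 8). It refills a cache vector with one READ-SEQUENCE call and then hands out bytes from the cache.

```lisp
;; Hypothetical sketch of a cached byte reader, not the actual patch.
(defstruct (chunked-reader (:constructor %make-chunked-reader))
  source                                  ; the underlying binary stream
  (cache (make-array 0 :element-type '(unsigned-byte 8))
         :type (simple-array (unsigned-byte 8) (*)))
  (pos 0 :type fixnum)                    ; next unread index in CACHE
  (end 0 :type fixnum))                   ; number of valid bytes in CACHE

(defun make-chunked-reader (stream &optional (cache-size (* 1024 1024)))
  (%make-chunked-reader
   :source stream
   :cache (make-array cache-size :element-type '(unsigned-byte 8))))

(defun chunked-read-byte (reader)
  "Return (values byte nil), or (values 0 t) at end of file."
  (when (= (chunked-reader-pos reader) (chunked-reader-end reader))
    ;; Cache exhausted: refill it with a single READ-SEQUENCE call.
    (setf (chunked-reader-end reader)
          (read-sequence (chunked-reader-cache reader)
                         (chunked-reader-source reader))
          (chunked-reader-pos reader) 0))
  (if (= (chunked-reader-pos reader) (chunked-reader-end reader))
      (values 0 t)                        ; the refill read nothing: EOF
      (let ((byte (aref (chunked-reader-cache reader)
                        (chunked-reader-pos reader))))
        (incf (chunked-reader-pos reader))
        (values byte nil))))
```

The cache size defaults to 1 MB to match the figure mentioned above. A smaller cache works the same way and just refills more often.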

psilord avatar Jun 11 '16 23:06 psilord

And, here is the function I used to test it:

(defun cpk-test (num-elements file)
  (let ((data (make-array num-elements
                          :element-type '(unsigned-byte 32)
                          :initial-element 0)))
    (format t "Initializing data array.~%")
    (loop for i from 0 below num-elements do (setf (aref data i) i))

    (format t "encoding...~%")
    (time (encode-to-file data file))

    (format t "decoding...~%")
    (let ((result (car (time (decode-file file)))))
      (format t "Found ~A elements.~%" (length result))
      (format t "First 32 elements: ")
      (loop for i from 0 below (min num-elements 32)
            do (format t "~A " (aref result i)))
      (format t "~%")))
  (sb-ext:gc :full t)
  T)

psilord avatar Jun 11 '16 23:06 psilord

I checked out sbcl 1.3.6 and discovered that the THE patch has almost no effect, so the compiler seems to have gotten smarter, but the buffering patch indeed improved performance. So, the THE patch seems superfluous at this time.

I did my initial tests with sbcl 1.3.1, and there the THE patch produced huge speedups. Now I just get the same speedups naturally with 1.3.6.

psilord avatar Jun 14 '16 01:06 psilord