byte-streams

Late declarations of lower-cost conversions are ineffective

Open DerGuteMoritz opened this issue 3 years ago • 3 comments

Declaring a new conversion via def-conversion that would make converting between two types less costly has no effect if that conversion has already been performed at least once. This is caused by the global converter memoization, which captures the state of the conversion graph at the time of the very first invocation for a given pair of source and destination types.

Reproducer:

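;; Assumes byte-streams 0.3.x, where the namespace is clj-commons.byte-streams
(require '[clj-commons.byte-streams :as bs])

(defrecord Foo [data])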
(defrecord Bar [data])

(defn run [n]
  (prn :begin n)
  (bs/convert (Foo. "foo") String)
  (prn :end n)
  (newline))

(bs/def-conversion ^{:cost 0} [Foo Bar]
  [x _]
  (prn :foo->bar)
  (Bar. (:data x)))

(bs/def-conversion ^{:cost 0} [Bar String]
  [x _]
  (prn :bar->string)
  (:data x))

;; At this point we can only indirectly convert from `Foo` to `String` via `Bar`
(run 1)

;; Now we declare a direct conversion path from `Foo` to `String`
(bs/def-conversion ^{:cost 0} [Foo String]
  [x _]
  (prn :foo->string)
  (:data x))

;; But because the first invocation has already memoized the more costly path, it has no effect
(run 2)

Output:

:begin 1
:foo->bar
:bar->string
:end 1

:begin 2
:foo->bar
:bar->string
:end 2

In contrast, moving both run invocations after the last bs/def-conversion outputs:

:begin 1
:foo->string
:end 1

:begin 2
:foo->string
:end 2

This is a bit of a gotcha which might at least be worth documenting. Alternatively, def-conversion could reset the memo, which should solve this issue.
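To illustrate the reset-the-memo idea, here is a minimal sketch using a toy registry rather than byte-streams internals. Every name in it (`conversions`, `path-cache`, `find-path`, `cached-path`, `convert`, `register!`) is hypothetical, and the path search is a plain breadth-first search by hop count, not the library's cost-based search. The only point is that clearing the cache inside the registration function lets later lookups see the updated graph.

```clojure
;; Toy model only: none of these names exist in byte-streams.
(def conversions (atom {}))  ; {[src dst] conversion-fn}
(def path-cache  (atom {}))  ; {[src dst] [conversion-fn ...]}, i.e. the "memo"

(defn find-path
  "Breadth-first search for the shortest chain of conversion fns from src
  to dst. Returns a vector of fns, or nil if dst is unreachable."
  [src dst]
  (loop [queue (conj clojure.lang.PersistentQueue/EMPTY [src []])
         seen  #{src}]
    (when-let [[node fns] (peek queue)]
      (if (= node dst)
        fns
        (let [nexts (for [[[from to] f] @conversions
                          :when (and (= from node) (not (seen to)))]
                      [to (conj fns f)])]
          (recur (into (pop queue) nexts)
                 (into seen (map first nexts))))))))

(defn cached-path
  "Looks up the memoized path for [src dst], searching the graph on a miss."
  [src dst]
  (or (get @path-cache [src dst])
      (let [p (find-path src dst)]
        (swap! path-cache assoc [src dst] p)
        p)))

(defn convert
  "Applies the (possibly cached) chain of conversion fns to x."
  [x src dst]
  (if-let [fns (cached-path src dst)]
    (reduce (fn [v f] (f v)) x fns)
    (throw (ex-info "No conversion path" {:src src :dst dst}))))

(defn register!
  "Registers a conversion and drops every memoized path, so the next
  lookup searches the updated graph."
  [src dst f]
  (swap! conversions assoc [src dst] f)
  (reset! path-cache {}))

(register! :foo :bar    #(str % " via-bar"))
(register! :bar :string identity)
(convert "x" :foo :string) ; => "x via-bar" (indirect path gets cached)

(register! :foo :string #(str % " direct"))
(convert "x" :foo :string) ; => "x direct" (cache was cleared on register!)
```

Without the `reset!` in `register!`, the second `convert` call would keep returning `"x via-bar"`, which mirrors the behaviour shown in the reproducer. Clearing the whole cache is the bluntest option; it trades a repeated path search after each registration for not having to work out which cached entries a new edge could affect.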

DerGuteMoritz avatar Sep 12 '22 10:09 DerGuteMoritz

Just found https://github.com/clj-commons/byte-streams/issues/10 which also touches on the issue of the hidden initialization cost which resulted in the introduction of a precache-conversions API. This didn't solve the early capturing issue but at least is some prior art in the spirit of my suggestion. However, it later got removed again without further explanation. Hm!

DerGuteMoritz avatar Sep 12 '22 10:09 DerGuteMoritz

The main issue here could also be solved by invalidating the memo whenever new conversions are declared. However, this wouldn't address the performance gotcha as well, so I decided to break it out into its own issue.

DerGuteMoritz avatar Sep 12 '22 11:09 DerGuteMoritz

It's important to articulate the problem; is this a real concern for anyone?

The only non-toy/demo/example use of def-conversion I found anywhere on GitHub was in clj-fdb, and that was for a new conversion, and in the tests.

This issue is currently theoretical afaict, and I don't think anyone should spend time on it. 😄

KingMob avatar Sep 13 '22 05:09 KingMob