sparkling
Sparkling incompatibility with prismatic/schema
I'm getting this error from my code:
16/06/02 15:10:30 WARN Utils: Error serializing IFn sparkling.core$ftruthy_QMARK_$fn__3467@7ce0f591
java.io.NotSerializableException: clojure.lang.Delay
whenever I run (spark/filter parse/error?) on an RDD.
The function in question is here, I've simplified it as much as possible:
(s/defn error? :- s/Bool
  [log :- s/Str]
  true)
So, turns out it's prismatic/schema that returns a Delay. Any thoughts on how to use schema'd functions with spark?
Maybe now's the time to check out spec
Not sure if it's related, but I found that when passing higher-order functions to Spark, it saved me a lot of weird errors if I explicitly typed out (fn [x] (parse/error? x)) instead of using the # reader macro or just passing the function symbol. Both of those led to a lot of weirdness that I don't fully understand.
Ok, I'm pretty sure you don't want an RDD full of Delays, so serializing a Delay doesn't really make sense. What you need to do (a guess, I haven't tried it out) is deref the Delay to get the actual value...
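For anyone who wants to poke at this outside Spark, here is a minimal sketch in plain Clojure. It uses ordinary Java serialization rather than sparkling's own serializer, and serializable? is a hypothetical helper made up for this illustration, so take it as a way to reproduce the Delay problem and not as what sparkling does internally.

```clojure
(import '(java.io ByteArrayOutputStream ObjectOutputStream NotSerializableException))

;; Hypothetical helper: true if x survives plain Java serialization.
(defn serializable? [x]
  (try
    (.writeObject (ObjectOutputStream. (ByteArrayOutputStream.)) x)
    true
    (catch NotSerializableException _ false)))

(def d (delay true))

;; A Delay itself cannot be serialized...
(serializable? d)                             ;; => false
;; ...but the value it wraps can.
(serializable? @d)                            ;; => true

;; A function that closes over a Delay drags the Delay along with it.
(def closes-over-delay (let [d (delay true)] (fn [_] @d)))
(serializable? closes-over-delay)             ;; => false

;; Deref first, then close over the plain value instead.
(def closes-over-value (let [v @d] (fn [_] v)))
(serializable? closes-over-value)             ;; => true
```

If the schema'd function really does hold on to a Delay internally, the third case is probably the closest to what the warning above is complaining about.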