pickling icon indicating copy to clipboard operation
pickling copied to clipboard

A path forward for pickling Java objects

Open eed3si9n opened this issue 9 years ago • 5 comments

Related: #60, #263, #33, #211, #295, #301

This is a general issue for discussing pickling Java objects. Hopefully we can reach some positive path forward.

Joda-Time isn't fixed

First, I would like to dispel the notion that Pickling 0.10.0 works with Joda-Time. It really doesn't. Here's a demonstration of #33 and #60 combined:

scala> import scala.pickling._, Defaults._, json._
import scala.pickling._
import Defaults._
import json._

scala> import org.joda.time._
import org.joda.time._

scala> val dt = DateTime.now.plusDays(3)
dt: org.joda.time.DateTime = 2015-02-15T23:12:36.915-05:00

scala> dt.pickle
res0: scala.pickling.json.pickleFormat.PickleType =
JSONPickle({
  "$type": "org.joda.time.DateTime",
  "Millis": "1424059956915",
  "Chronology": {
    "$type": "org.joda.time.chrono.ISOChronology"
  }
})

scala> res0.unpickle[DateTime]
res1: org.joda.time.DateTime = 2015-02-15T23:12:36.915-05:00

scala> val ld = LocalDate.now.plusDays(3)
ld: org.joda.time.LocalDate = 2015-02-15

scala> ld.pickle
res2: scala.pickling.json.pickleFormat.PickleType =
JSONPickle({
  "$type": "org.joda.time.LocalDate"
})

scala> res2.unpickle[LocalDate]
res3: org.joda.time.LocalDate = 2015-02-12

Why does it "work" for DateTime, but not for LocalDate? What was added in #211 is a matching on method name that starts with "set" and "get". DateTime appears to roundtrip because it happens to have to have setters setChronology and setMillis. LocalDate doesn't have any methods that starts with "set" so Pickling 0.10.0 happily generates a pickler that pickles an empty object. Note "set" and "get" are not the only pair allowed in JavaBeans Specification.

Guessing is dangerous

#263 illustrates that java.lang.Byte can roundtrip any number to 0.

scala> val r1: java.lang.Byte = 10.toByte
r1: Byte = 10

scala> val r2 = r1.pickle.unpickle[java.lang.Byte]
r2: Byte = 0

The fact that it's a boxed primitives is irrelevant. Due to the current logic of only picking up "set" and "get" pairs, the state of java.lang.Byte is not the exception, it's the rule. Note the problem with java.lang.Byte is not the fact that it's not handled, but the fact that it did not fail to compile. We should stop guessing, and fail when we don't have the complete information.

(More reports from the field)

scala> val uri = new java.net.URI("urn:isbn:096139210x")
uri: java.net.URI = urn:isbn:096139210x

scala> uri.pickle.unpickle[java.net.URI]
res5: java.net.URI = null

Java reflection

Are we doomed about Java? If we give up the speed and fall back to using reflection, we can still achieve correctness. I think that's a better approach.

scala> val r1: java.lang.Byte = 10.toByte
r1: Byte = 10

scala> r1.getClass.getDeclaredFields.toList foreach println
public static final byte java.lang.Byte.MIN_VALUE
public static final byte java.lang.Byte.MAX_VALUE
public static final java.lang.Class java.lang.Byte.TYPE
private final byte java.lang.Byte.value
public static final int java.lang.Byte.SIZE
private static final long java.lang.Byte.serialVersionUID

scala> {
     |   val f = r1.getClass.getDeclaredField("value")
     |   f.setAccessible(true)
     |   f.getByte(r1)
     | }
res1: Byte = 10

BeanInfo

There's BeanInfo, but I don't think we can discover setters reliably using it.

scala> import org.joda.time._
import org.joda.time._

scala> import java.beans.Introspector
import java.beans.Introspector

scala> val info = Introspector.getBeanInfo(classOf[DateTime])
info: java.beans.BeanInfo = java.beans.GenericBeanInfo@b508085

scala> info.getPropertyDescriptors.toList 
res2: List[java.beans.PropertyDescriptor] = List(java.beans.PropertyDescriptor[name=afterNow; propertyType=boolean; readMethod=public boolean org.joda.time.base.AbstractInstant.isAfterNow()]....

scala> info.getPropertyDescriptors.toList collect { case p if p.getWriteMethod != null => p }
res3: List[java.beans.PropertyDescriptor] = List()

eed3si9n avatar Feb 13 '15 06:02 eed3si9n

interesting, this means that Joda doesn't abide by the JavaBeans spec (not surprising).

SO, another option you don't mention is sun.misc.Unsafe (see Kryo's usage as an example), or ASM bytecode reading. We could probably compile time verify these.

jsuereth avatar Feb 13 '15 14:02 jsuereth

Another case important for me: java.net.URI. When compiling this:

  implicit val URIPickler = Pickler.generate[URI]

I get:

exception during macro expansion: 
scala.pickling.PicklingException: No fields are captured. You need a custom pickler to handle this.

Not a big deal to write a custom pickler, but I am wondering in the universe of classes just how many of these I would need to write. Speaking of which, is there a library of custom picklers for things? It seems like we would want to have a separate project with various jar files full of picklers for often used things.

reid-spencer avatar Jun 12 '15 18:06 reid-spencer

@reid-spencer The error message "No fields are captured" that you got is something I added in #295. It's not ideal, but at least the failed state is captured at compile time. The previous releases of Pickling would have happily generated a pickler/unpickler that loses data silently. It was a stopgap measure until the issue described in this issue is resolved.

I think @jsuereth is exploring the ASM route, and with luck we might be able to correctly capture the fields for both Java and Scala objects.

Speaking of which, is there a library of custom picklers for things? It seems like we would want to have a separate project with various jar files full of picklers for often used things.

A separate projects for the community to share pickler definition is actually pretty good idea. It might need to maintain lockstep with Pickling since the macro expansion might change, but it could also evolve faster than Scala Pickling. The sbt project has sbt/serialization project which adds a few things around Picking including a custom JSON format.

eed3si9n avatar Jun 12 '15 18:06 eed3si9n

@eed3si9n - Thanks for the explanation. Makes sense given the current state of the library.

I've seen the sbt serialization project and it is what gave me the idea that there ought to be some sort of plugin system to scala pickling so that we can build up libraries of hand-coded picklers that the macro generator can make use of when it comes across specific types. That is, the plugins would provide some sort of mapping between TypeTags and their picklers or just directly call something like scala.pickling.registerPickler[T](Pickler_For_T).

This might actually hasten adoption of scala pickling because it provides a workaround, even if temporary, until less direct methods (i.e. scala pickling does it for you) become generally available and work across a wide enough variety of java and scala classes that customer picklers are only relevant for highly optimized performance needs or to handle special cases.

As a mechanism, scala pickling ought to use the same system for optimized hand crafted picklers for specific often-used classes, especially ones that the macros do not generate correctly today.

reid-spencer avatar Jun 12 '15 18:06 reid-spencer

Note: #357 is meant to give us the means to solve the issues with Java classes. Also, please take a look at the ability to provide your own pickling runtime. I'd like to discuss/improve how we can register runtime picklers (not autogenreated code, but just a runtime pickler registry lookup for statically-generated picklers).

In any case, I hope to move this bug into the solved category after #357 is merged and I can work on phase 2 of fixing java. At least #357 will accurately warn us when things go awry and fail to create a pickler/unpickler instead of creating an erroneous one.

jsuereth avatar Aug 08 '15 20:08 jsuereth