gremlin-scala icon indicating copy to clipboard operation
gremlin-scala copied to clipboard

@id attribute of case class without Option

Open kusumakarb opened this issue 7 years ago • 6 comments

We are using gremlin-scala v-3.3.3.4. We use JanusGraph and have the following data model w.r.t case classes. For each vertex, we have two case classes one for writing to the database and one for reading from the database. The only difference is that the case class which is used for writing to the database has the id as Option[id] whereas in the case class which used for reading from database the Id is not an Option.

For example, we have the user vertex as follows:

@label("user") final case class UserDB(@id id: Option[Long], name: String, createdAt: Long)
@label("user") final case class UserVertex(@id id: Long, name: String, createdAt: Long)

// UserDB is used when the vertex is created
val user = graph + userDB

// UserVertex is when we convert the vertex in to the case class
val userVertex = user.toCC[UserVertex]

The reason why we have a different case class during the read is because we are sure that the Id is going to be there when we read from the Database and we can avoid matching on the Option[id] to get the Id. Upgrading to the latest version of gremlin-scala is giving the following errors for the all case classes which doesn't have Id as Option.

[error] java.lang.AssertionError: assertion failed: @id parameter *must* be of type `Option[A]`. In the context of Marshallable, we have to let the graph assign an id
[error] 	at scala.Predef$.assert(Predef.scala:219)
[error] 	at gremlin.scala.Marshallable$.handleId$1(Marshallable.scala:142)
[error] 	at gremlin.scala.Marshallable$.$anonfun$materializeMappableImpl$1(Marshallable.scala:163)
[error] 	at scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:156)
[error] 	at scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:156)
[error] 	at scala.reflect.internal.Scopes$Scope.foreach(Scopes.scala:408)
[error] 	at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:156)
[error] 	at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:154)
[error] 	at scala.reflect.internal.Scopes$Scope.foldLeft(Scopes.scala:60)
[error] 	at gremlin.scala.Marshallable$.materializeMappableImpl(Marshallable.scala:28)
[error]           .map(_.toCC[UserVertex])

We understand the error message we have to let the graph assign an id which is only when adding the vertex to the database, but why is this enforced on toCC[UserVertex] ?

kusumakarb avatar Nov 30 '18 08:11 kusumakarb

This is enforced in Marshallable, and since that's used by both graph.+ and element.toCC, the same restriction applies. You're right though, this isn't a technical necessity, it's just to keep the implementation short. Most people use the same case class definition for both use cases.

You could do the following though:

@label("user") final case class UserVertex(@id _id: Option[Long], name: String, createdAt: Long) {
  val id = _id.get
}

mpollmeier avatar Dec 02 '18 21:12 mpollmeier

Initially I expected that wartremover will complain about the usage of .get, which surprisingly didn't, but the above code will fail during run time as it will do a None.get when it tries to initialise the id. We can use a lazy val, but that's costly in the sense that the id will get initialised every time it's used. Is there way you are going to change this in the code to work differently for graph.+ and element.toCC? Or If you are open for a PR, I can do it and submit it.

kusumakarb avatar Dec 04 '18 06:12 kusumakarb

but that's costly in the sense that the id will get initialised every time it's used

I don't follow - lazy val is only initialized once. Or what do you mean? I think that option with lazy val is the easiest solution.

I'm hesitant of duplicating Marshallable. If the above really doesn't work for you, maybe it's best to get the old behaviour back and remove that assertion in the macro. Still, please do consider the lazy val option.

mpollmeier avatar Dec 04 '18 07:12 mpollmeier

Sorry, my bad, I meant that the lazy val is checked for initialization (whether it is initialized or not) every time the variable is accessed, which is costly because we operate on a lot of vertices. We can't do that.

kusumakarb avatar Dec 04 '18 09:12 kusumakarb

The internet disagrees:

The numbers show that lazy vals performance penalty is quite small and can be ignored in practice. 

https://dzone.com/articles/cost-laziness

Anyway, we can lift that assertion if you like (basically back to how it was). Mind sending me a PR?

mpollmeier avatar Dec 05 '18 01:12 mpollmeier

Sure, will submit the PR for this.

kusumakarb avatar Dec 05 '18 08:12 kusumakarb