Provide a way to construct a `TypedColumn` reference from a Symbol

Open iravid opened this issue 8 years ago • 8 comments

Currently, we cannot chain selects that reference columns introduced in a select without introducing intermediate vals. Here's a (contrived) example:

case class X2(i: Int, j: Int)

val ds = TypedDataset.create(Seq(X2(10, 10)))

ds.select(ds('i), ds('j), ds('i) + 1).select(ds('_3))

One possible solution for this would be to provide an overload to TypedDataset.select that uses Symbol arguments and a similar frameless.functions.col function.

iravid avatar Sep 21 '17 18:09 iravid

Hi @iravid, is this covered by #187?

imarios avatar Sep 24 '17 02:09 imarios

Yes, with some limitations to type inference, discussed there

iravid avatar Sep 24 '17 03:09 iravid

I think we tried this a couple of times over the past year, and type inference always stopped us from having something that works in a useful way. The macro approach that @OlivierBlanvillain suggested is probably the best way to go. Also, @jeremyrsmith's alternative select macro was kind of solving a similar problem.

imarios avatar Sep 24 '17 04:09 imarios

Yes, there’s very little flexibility if T is a type parameter on the column.

I personally find this better than nothing currently but understood if you disagree :-)

Macros are very attractive usage-wise, yes, but I have a feeling they’d present a maintenance challenge to the project compared to shapeless-based approaches.

iravid avatar Sep 24 '17 04:09 iravid

If I remember correctly, the issue with this approach is that it works great in some contexts and not in others, which makes the API kind of "clumsy".

I do get the issue with macros, and personally I barely understand what is going on there. At the same time, I do believe that not all macros are evil. Some macros are worse than others. If the macro is simple enough and solves an important problem, I would surely consider using it and spending the extra brain power to understand and maintain it.

imarios avatar Sep 24 '17 05:09 imarios

Btw - can you point me to the macro suggestion you’re referring to?

iravid avatar Sep 24 '17 06:09 iravid

Hi @iravid, that's #110. Also, what @OlivierBlanvillain suggested about intermediate case classes might work well. Not sure which one is easier, but I would go with the easiest.

imarios avatar Sep 25 '17 17:09 imarios

Ah, thanks for pointing that out, @imarios. How would #110 solve this use case, though? You still need to name the intermediate dataset.

For example, this doesn't work:

case class X2(i: Int, j: Int)

val ds = TypedDataset.create(Seq(X2(10, 10)))

ds.select(ds(_.i), ds(_.j), ds(_.i) + 1)
  .select(ds(_._3))

But this does:

case class X2(i: Int, j: Int)

val ds = TypedDataset.create(Seq(X2(10, 10)))

val intermediate = ds.select(ds(_.i), ds(_.j), ds(_.i) + 1)

intermediate.select(intermediate(_._3))

iravid avatar Sep 26 '17 19:09 iravid
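
Editorial sketch of why the last two snippets behave differently. This is a toy model in plain Scala, not frameless code: `Ds`, `Col`, and `ChainedSelectSketch` are hypothetical names made up for illustration, and `ds(_.i + 1)` stands in for `ds(_.i) + 1`. The assumption is only that a column carries the row type of the dataset it was built from, so a column built from `ds` cannot refer to a field that exists only on the result of `select`.

```scala
object ChainedSelectSketch {
  // Toy column: a typed accessor tied to the row type T (hypothetical, not frameless).
  final case class Col[T, A](get: T => A)

  // Toy dataset: columns are created from the dataset instance, like ds(_.i) in the thread.
  final case class Ds[T](rows: Seq[T]) {
    def apply[A](f: T => A): Col[T, A] = Col(f)

    def select[A](a: Col[T, A]): Ds[A] =
      Ds(rows.map(a.get))

    def select[A, B, C](a: Col[T, A], b: Col[T, B], c: Col[T, C]): Ds[(A, B, C)] =
      Ds(rows.map(r => (a.get(r), b.get(r), c.get(r))))
  }

  case class X2(i: Int, j: Int)

  val ds: Ds[X2] = Ds(Seq(X2(10, 10)))

  // Naming the intermediate dataset gives access to its row type (Int, Int, Int).
  val intermediate: Ds[(Int, Int, Int)] = ds.select(ds(_.i), ds(_.j), ds(_.i + 1))
  val result: Ds[Int] = intermediate.select(intermediate(_._3))

  // The chained form is rejected by the compiler in this model:
  // `ds` is a Ds[X2], and X2 has no member `_3`.
  // ds.select(ds(_.i), ds(_.j), ds(_.i + 1)).select(ds(_._3))

  def main(args: Array[String]): Unit =
    println(result.rows) // prints List(11)
}
```

In this model the second `select` needs a column typed against the tuple row, and the only value with that row type is the unnamed result of the first `select`, which is why the intermediate val is required.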