Expressive types for Spark.

Results 63 frameless issues

My understanding is that `Option` should be used to represent columns that one might mark nullable in vanilla Spark. I tried something along the lines of the following: ``` case...

feature

Consider the following example:
```scala
import frameless.functions.aggregate.{collectSet, max, min}
import frameless.syntax._
import frameless.TypedDataset

case class Foo(bar: Int)

val ds = TypedDataset.create(List.empty[Foo])

ds
  .agg(
    min(ds('bar)),
    collectSet(ds('bar))
  )
  .collect
  .run
```
It...

We are currently missing these two `Dataset` methods:
- `DataStreamWriter writeStream()`
- `Dataset withWatermark(String eventTime, String delayThreshold)`

Both require some understanding of Spark streaming to be properly typed and tested....

enhancement
help wanted

Would it make sense to introduce support for Avro schemas in `TypedDataset`? The current code defines the schema based on the `SparkSQL` "language": https://github.com/typelevel/frameless/blob/576eb675dbd121453679a57ae7117e4fb53d9212/dataset/src/main/scala/frameless/TypedDatasetForwarded.scala#L43-L44 On the other hand...

discussion

It is possible to define a case class that uses reserved words as field names by wrapping them in backticks.
```scala
case class Foo(a: String, `if`: Int)
val t = TypedDataset.create(Seq(Foo("a",2), Foo("b",2)))
```
Fails with the...

bug

Hi, I have recently started exploring frameless and am trying to figure out joins, especially left and right joins. Would it be possible to add more examples to the documentation? It...
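For readers with the same question, here is a minimal sketch of left and right joins. It is not taken from the frameless documentation. It assumes a recent frameless release in which `TypedDataset` provides `joinLeft` and `joinRight`, and a local `SparkSession`. The `User` and `Order` case classes and the `JoinSketch` object are hypothetical names used only for illustration.

```scala
import frameless.TypedDataset
import org.apache.spark.sql.SparkSession

object JoinSketch {
  // Hypothetical record types, used only for illustration.
  case class User(id: Int, name: String)
  case class Order(userId: Int, item: String)

  // Assumption: a local session; frameless picks it up implicitly
  // when creating datasets and when running jobs.
  implicit val spark: SparkSession =
    SparkSession.builder().master("local[*]").appName("join-sketch").getOrCreate()

  val users: TypedDataset[User] =
    TypedDataset.create(Seq(User(1, "ann"), User(2, "bob")))
  val orders: TypedDataset[Order] =
    TypedDataset.create(Seq(Order(1, "book")))

  // Left join: every user is kept; a user without an order pairs with None.
  val left: TypedDataset[(User, Option[Order])] =
    users.joinLeft(orders)(users('id) === orders('userId))

  // Right join: every order is kept; an order without a user pairs with None.
  val right: TypedDataset[(Option[User], Order)] =
    users.joinRight(orders)(users('id) === orders('userId))

  def main(args: Array[String]): Unit = {
    left.collect().run().foreach(println)
    right.collect().run().foreach(println)
    spark.stop()
  }
}
```

The result types carry the optionality of the unmatched side, so a missing row appears as `None` rather than as a null field. Exact method names and signatures may differ between frameless versions, so check them against the version in use.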

Vanilla Spark: ```scala val df: DataFrame = ??? val filtered = df.filter(df("value")

bug
work in progress

Hello, I am starting with Frameless and I am having a hard time converting my code based on Spark DataFrames to Frameless. The blocking point I have reached now is...

documentation

Meta-issue to list what has been done in frameless-ml and what remains to be done. Spark ML docs: https://spark.apache.org/docs/latest/ml-guide.html

# Abstractions
- [x] `TypedTransformer`, the type-safe equivalent of Spark ML...

help wanted
beginner friendly

Exhaustive status of the API implemented by `frameless.TypedColumn` compared to Spark's `Column`. It is split into two parts: the methods implemented directly on `Column`, and the methods coming from `org.apache.spark.sql.functions._`

### Column...

feature
beginner friendly