spark-daria
Essential Spark extensions and helper methods ✨😲
@alfonsorr made a better release process for spark-fast-tests, see [this issue](https://github.com/MrPowers/spark-fast-tests/pull/106). Perhaps we should make the same changes for this repo so JAR files are easily accessible for Scala 2.11,...
I had some code that was working correctly with DariaWriters, where I wrote data to a Staging directory under a bucket in S3. The data was effectively moved from there...
([This single commit](145/commits/a90b4bb) PR requires the deep DF validation changes) When validating data against a schema, it verifies that all field elements are the same. However, a data frame column...
Hi! I recently upgraded my project to **Spark 3** (Scala 2.12.0) and therefore I am now using _spark-daria_ v1.0.0. I am a big fan of the "Optimize Imports" functionality in...
It uses the regular expression `part.+c\d{3}` to find uncompacted Parquet files. But I think only files written by Spark follow this naming pattern.
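To illustrate the concern, here is a minimal sketch that applies the quoted pattern to two file names. The object and method names (`PartFileMatch`, `looksUncompacted`) and both sample file names are hypothetical, and searching for the pattern anywhere in the name is an assumption; the library may apply the pattern differently.

```scala
object PartFileMatch {
  // Pattern quoted in the issue. It is unanchored here, so it is
  // searched for anywhere in the file name (an assumption).
  val uncompacted = """part.+c\d{3}""".r

  // Hypothetical helper: true when the name contains the pattern.
  def looksUncompacted(fileName: String): Boolean =
    uncompacted.findFirstIn(fileName).isDefined
}

object PartFileMatchDemo extends App {
  // Name in the style Spark uses when writing Parquet (sample value).
  val sparkStyle = "part-00000-6a3c1f2e-1b2c-4d5e-8f90-123456789abc-c000.snappy.parquet"
  // Name a non-Spark writer might produce (sample value).
  val otherStyle = "data_2020_01.parquet"

  println(PartFileMatch.looksUncompacted(sparkStyle))
  println(PartFileMatch.looksUncompacted(otherStyle))
}
```

Under these assumptions, only the Spark-style name matches, so Parquet files written by other tools would not be detected as uncompacted.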
Can you add two functions: 1. `greatest` that does not filter nulls but returns null if one of the inputs is null 2. `least` that does not filter nulls but...
The error message when comparing data frames with different record counts is misleading.
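One possible shape for the requested behavior, sketched with Spark's built-in column functions. The names `NullPropagating`, `greatestOrNull`, and `leastOrNull` are hypothetical and are not part of Spark or spark-daria. The sketch relies on `when` without `otherwise` evaluating to null when the condition is false, and it assumes at least two input columns, as Spark's `greatest` and `least` require.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{greatest, least, when}

object NullPropagating {
  // True for a row when any of the given columns is null.
  private def anyNull(cols: Seq[Column]): Column =
    cols.map(_.isNull).reduce(_ || _)

  // Hypothetical: like greatest, but null if any input is null.
  def greatestOrNull(cols: Column*): Column =
    when(!anyNull(cols), greatest(cols: _*))

  // Hypothetical: like least, but null if any input is null.
  def leastOrNull(cols: Column*): Column =
    when(!anyNull(cols), least(cols: _*))
}
```

A call such as `df.select(NullPropagating.greatestOrNull(col("a"), col("b")))` would then yield null for any row where `a` or `b` is null, instead of skipping the null as the built-in `greatest` does.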
As noted here: https://github.com/MrPowers/spark-daria/issues/121#issuecomment-610867295 Thanks for pointing this out @gorros.
@MrPowers, @nvander1, do you think it's a good idea to add this project to https://github.com/fthomas/scala-steward to keep the dependencies up to date?
spark-daria follows the standard Scala / Java deep nesting package convention that's annoying when importing code. Users currently need to import code like this: `import com.github.mrpowers.spark.daria.sql.ColumnExt._` I noticed that some...