Matthew Powers comments

Results: 285 comments of Matthew Powers

Create Dask Delta writer

@rajagurunath - the global _metadata file that Dask is currently creating is not scalable and I think we should ignore that file for purposes of this project. Assume `write_metadata_file=False`. Let's...

Try implementing this lib with delta-rs

@rajagurunath - I actually think this is going to be super easy. Think this will work:

```python
from deltalake import DeltaTable
import dask.dataframe as dd

dt = DeltaTable("tmp/some-delta-pyspark")
ddf =...
```

Try implementing this lib with delta-rs

Sounds good, keep me posted ;)

Has this project been abandoned?

Awesome, thanks @kalbasit, appreciate the response. I don't have time to maintain this either. Maybe we could put out some messaging that we're looking for a maintainer? There is a...

Use sort order for second dataset when using orderedComparison = false & ignoreColumnNames = true

@pkoplik24 - Thanks for pointing out this edge case. I think the function should error out if orderedComparison=false and ignoreColumnNames=true. We can have it return a descriptive error message that...
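The guard proposed in this comment could look roughly like the sketch below. It is written in Python purely for illustration (the library itself is Scala), and the function name `validate_comparison_flags` and the error wording are hypothetical, not part of the library.

```python
def validate_comparison_flags(ordered_comparison: bool, ignore_column_names: bool) -> None:
    """Hypothetical guard: reject the unsupported flag combination up front."""
    if not ordered_comparison and ignore_column_names:
        # Fail fast with a descriptive message instead of producing a confusing comparison.
        raise ValueError(
            "orderedComparison=false cannot be combined with ignoreColumnNames=true. "
            "Set orderedComparison=true or ignoreColumnNames=false."
        )


# Supported combinations pass silently.
validate_comparison_flags(ordered_comparison=True, ignore_column_names=True)
validate_comparison_flags(ordered_comparison=False, ignore_column_names=False)
```

Calling the guard before any comparison work starts means the caller sees the descriptive message rather than a downstream failure.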

assertColumnEquality doesn't work with floating point numbers

@snithish - Have you used `assertColumnEquality` yet? It's way better than `assertSmallDataFrameEquality` usually... Can you please help me get the `assertDoubleTypeColumnEquality` method to work properly: https://github.com/MrPowers/spark-fast-tests/blob/master/src/main/scala/com/github/mrpowers/spark/fast/tests/ColumnComparer.scala#L50-L65 This method is...
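For context on why exact equality is a problem for floating point columns, here is a minimal Python sketch of a tolerance-based comparison. It is not the library's Scala implementation; the helper name `columns_approx_equal` and the default tolerance are assumptions made for illustration.

```python
def columns_approx_equal(left, right, tolerance=1e-9):
    """Hypothetical helper: compare two columns of floats within an absolute tolerance."""
    if len(left) != len(right):
        return False
    # Each pair of values must differ by no more than the tolerance.
    return all(abs(a - b) <= tolerance for a, b in zip(left, right))


# Exact equality fails because 0.1 + 0.2 is not exactly 0.3 in binary floating point.
exact_match = [0.1 + 0.2] == [0.3]
approx_match = columns_approx_equal([0.1 + 0.2], [0.3])
```

Here `exact_match` is `False` while `approx_match` is `True`, which is the behavior a double-typed column comparison generally needs.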

Add support for StructType columns in equality checks

Make the Dataset equality inequality messages better