spark-csv

CSV Data Source for Apache Spark 1.x

Results 11 spark-csv issues

Adds Relation, LineReader and BulkReader traits to avoid duplicated code. Largely derived from https://github.com/quartethealth/spark-csv and https://github.com/quartethealth/spark-fixedwidth. This is in response to the following PR (created by @blrnw3) being closed without...

Added comments for CSV file paths.

This change adds an option to render parsing errors, such as number format exceptions, as nulls. It was included in pull request https://github.com/databricks/spark-csv/pull/298, but I thought...
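To illustrate the idea of turning parse errors into nulls, here is a minimal sketch in plain Python. It is not the spark-csv implementation or its API; the helper names `parse_int_or_none` and `read_ints` are hypothetical and exist only for this example.

```python
import csv
import io


def parse_int_or_none(text):
    """Return int(text), or None when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # A number format error becomes a null instead of aborting the read.
        return None


def read_ints(csv_text):
    """Parse every cell as an integer, mapping unparseable cells to None."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[parse_int_or_none(cell) for cell in row] for row in reader]


rows = read_ints("1,2,x\n4,,6\n")
print(rows)
```

The trade-off in this kind of design is that bad cells no longer stop the job, but they also become indistinguishable from genuinely missing values unless they are logged separately.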

Several parsing options are added. They are organized in classes because there are many of them. A "text"-based API to configure options is provided. Another feature is the ability...

stale / awaiting update

I don't know Scala (at all!), so there are almost certainly cleaner ways; my apologies. The logging at the moment is sometimes unhelpful as it's hard to see the real...

For the context and discussion on this, please refer to https://github.com/databricks/spark-csv/pull/244.

There are datasets where each column has its own marker for missing values. spark-csv assumes only the empty string for missing values. To avoid additional data transformation and saving on the user's side...
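The per-column missing-value markers described above could be sketched as follows in plain Python. This is only an illustration under stated assumptions, not the option that the pull request adds to spark-csv: the function `read_with_null_markers`, the `null_markers` mapping, and the sample data are all hypothetical.

```python
import csv
import io


def read_with_null_markers(csv_text, null_markers):
    """Read CSV text with a header row, replacing each column's
    missing-value marker with None. Columns without an explicit
    marker fall back to the empty string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        rows.append({
            column: None if value == null_markers.get(column, "") else value
            for column, value in record.items()
        })
    return rows


# Hypothetical sample: "age" uses -1 and "city" uses N/A for missing values;
# "note" has no explicit marker, so an empty string counts as missing.
sample = "age,city,note\n-1,Paris,\n30,N/A,ok\n"
markers = {"age": "-1", "city": "N/A"}
cleaned = read_with_null_markers(sample, markers)
print(cleaned)
```

Handling the markers at read time means the caller does not need a separate pass over the data to normalize missing values.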

stale / awaiting update

If you are not using userSchema, all fields in the CSV file are assumed by default to be StringType. This commit adds the possibility to set up types for fields which are not...

stale / awaiting update