Compare files with different columns count

Open tmtben opened this issue 5 years ago • 5 comments
Hello,

Thanks for this great tool!

Here is my use case:
• my base-csv contains whole rows with many columns.
• my delta-csv contains a list of primary keys (one column only).

I would like to get a diff comparing only the primary keys.

I got this error with csvdiff version 1.3.0:

# csvdiff base.csv pk.csv
csvdiff: command failed - base-file and delta-file columns count do not match

Best regards, Ben

tmtben avatar Feb 05 '20 07:02 tmtben

While validating both the configuration files, there is a check for the column count here

How about

csvdiff base.csv pk.csv --columns 0

This means only one column is compared, and csvdiff will also check that the column at that index is present in both CSV files.

aswinkarthik avatar Feb 23 '20 11:02 aswinkarthik

Using version 1.4 with "columns" flag:

# csvdiff base.csv pk.csv --columns 0
csvdiff: command failed - base-file and delta-file columns count do not match

tmtben avatar Feb 23 '20 14:02 tmtben

I was thinking of introducing a feature where, if the columns flag is specified, we don't need to check whether the column counts match. All that matters is that the specified column is present in both CSVs.

That would satisfy your requirement, I believe.
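
A minimal sketch of the relaxed check described here, for illustration only. The helper `validateColumns` and its parameters are hypothetical and are not taken from csvdiff's source: with no selected columns it keeps the strict column-count check, and otherwise it only requires each selected index to exist in both files.

```go
package main

import (
	"errors"
	"fmt"
)

// validateColumns is a hypothetical helper, not csvdiff's actual code.
// baseCols and deltaCols are the column counts of the two files;
// selected holds the column indexes requested for comparison.
func validateColumns(baseCols, deltaCols int, selected []int) error {
	if len(selected) == 0 {
		// No columns selected: whole rows are compared, so counts must match.
		if baseCols != deltaCols {
			return errors.New("base-file and delta-file columns count do not match")
		}
		return nil
	}
	// Columns selected: each index only has to exist in both files.
	for _, idx := range selected {
		if idx < 0 || idx >= baseCols || idx >= deltaCols {
			return fmt.Errorf("column %d is not present in both files", idx)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateColumns(5, 1, []int{0})) // column 0 exists in both files
	fmt.Println(validateColumns(5, 1, nil))      // strict check still applies
	fmt.Println(validateColumns(5, 1, []int{3})) // column 3 is missing from the delta file
}
```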

aswinkarthik avatar Feb 23 '20 17:02 aswinkarthik

Any update on this?

This would be really nice. Furthermore, it would be nice if we could check which columns have been added (and with which values).

pascalbe-dev avatar Nov 21 '20 13:11 pascalbe-dev

Maybe just replace the following line:
https://github.com/aswinkarthik/csvdiff/blob/1007bf3599b3077a22e754281e671e8ac2996d42/cmd/config.go#L55
with this one:
if len(valueColumnPositions) == 0 && baseRecordCount != deltaRecordCount {

Now, I can use "--columns 0" to compare only the first column of two CSV files having different headers.
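
To illustrate what comparing only the first column means when the two files have different column counts, here is a small standalone sketch. It is not csvdiff's implementation: `keySet` and `diffKeys` are hypothetical helpers, and the sample data is made up.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"sort"
	"strings"
)

// keySet parses CSV text and collects the values found at column index col.
// Rows too short to contain that column are skipped.
func keySet(data string, col int) (map[string]bool, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.FieldsPerRecord = -1 // do not enforce a fixed number of fields per row
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	keys := make(map[string]bool)
	for _, row := range rows {
		if col < len(row) {
			keys[row[col]] = true
		}
	}
	return keys, nil
}

// diffKeys returns the keys present only in delta (added) and only in
// base (removed), each sorted for stable output.
func diffKeys(base, delta map[string]bool) (added, removed []string) {
	for k := range delta {
		if !base[k] {
			added = append(added, k)
		}
	}
	for k := range base {
		if !delta[k] {
			removed = append(removed, k)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	base := "1,alice,paris\n2,bob,rome\n3,carol,oslo\n" // full rows
	delta := "2\n3\n4\n"                                 // primary keys only

	b, err := keySet(base, 0)
	if err != nil {
		panic(err)
	}
	d, err := keySet(delta, 0)
	if err != nil {
		panic(err)
	}
	added, removed := diffKeys(b, d)
	fmt.Println("added:", added)     // added: [4]
	fmt.Println("removed:", removed) // removed: [1]
}
```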

tmtben avatar Dec 24 '20 14:12 tmtben