bad-data-guide icon indicating copy to clipboard operation
bad-data-guide copied to clipboard

Records combined around end-of-line characters

Open onyxfish opened this issue 7 years ago • 0 comments

Via Chris Wright: "Finally record counts in general should be checked when receiving and loading data. Excel gives people the option to add in new lines within cells, this is stored as a Line Feed (LF) character (at least under Windows where I work), some applications reading this in will take everything after that as a new record, potentially resulting in data being loaded into wrong columns if you're loading into a database. Another fun trick is when you end up with an end of file character embedded in a text string. I've yet to work out how on earth these end up in the files (it's happened to me maybe 3 times over the last 5 years), but these essentially tell the process reading the data that it has reached the end of the file and to stop reading it there. The ASCII code for it resolves to CTRL+Z, so my current working theory is that the source system is capturing people undoing an typo. I've never been able to replicate this though. In both cases, knowing up front how many records you are expecting, and counting the number of records you've loaded into your working system captures these problems."

onyxfish avatar Oct 31 '16 19:10 onyxfish