csvjdbc
csvjdbc copied to clipboard
Facing issue with case sensitive columns like ID, Id
Is it correct that you have columns named ID and Id in the same CSV file?
Column names in CsvJdbc are case insensitive (same as in a real database such as PostgreSQL).
CsvJdbc will not work correctly if there are columns named ID and Id in the same CSV file. There is no check for this situation in CsvJdbc, and CsvJdbc may pick either of the columns when column id is used in an SQL query.
The best solution is to ensure that the column names in your CSV files are different. For example, ID1 and Id2.
This problem is solved by the changes for Issue #37.
When database property defectiveHeaders=true, any duplicate column name is renamed to COLUMNx, where x is the column index. Case insensitive comparisons are used, so column names ID and Id are duplicates, and one of the columns will be renamed.
@simoc A small correction to Column names in CsvJdbc are case insensitive (same as in a real database such as PostgreSQL).
Columns are case sensitive in Postgres and all other databases. It is just that if they are not double-quoted, the identifiers get converted to uppercase.
Example:
create table test1 (id int, "id" int);
leads to a table with two columns, ID and id.
Hence my strong recommendation would be to take the header columns as is. If the header consists of id, ID then this is absolutely fine.
But adjust the SQL parser. The statement select id from table returns the ID column as all unquoted qualifiers are converted to uppercase. The statement select "id" from table returns the id column as all quoted qualifiers are kept as is.
The alternative, for better backward compatibility, is to use the same double-quoted logic for headers just like with the example of the create table statement. Quoted are treated as is, unquoted are converted to uppercase.