dbplyr icon indicating copy to clipboard operation
dbplyr copied to clipboard

`copy_to` creates tables with invalid column names

Open jensmassberg opened this issue 1 year ago • 2 comments

I use dbplyr with a database where some column names are invalid. If a table with invalid column name is copied to the data base, invalid names are changed to valid ones. Thus when using copy_to, the table in the database might have column names than are different from the original data frame.

This was no problem in dbplyr 2.1.1, but in dbplyr 2.2.1 the returned tbl object contains the old (invalid) column names instead of the new correct ones. This leads to bugs when working with the table.

A closer look into the code showed that the following line led to the problem: https://github.com/tidyverse/dbplyr/blame/main/R/verb-copy-to.R#L86

Before the line was changes, the column names were read from the table. Now it is assumed that they are the same as the ones in the data frame. This assumption isn't true in my case and I think that there are also other data bases where this isn't necessarily true (random example: https://www.techtalkcorner.com/parquet-files-column-names-with-spaces/).

Can we either

  • get back to the old behaviour in copy_to to read the column names from the table or
  • add a parameter to switch the behavior (use column names from data frame or read them from table)?

jensmassberg avatar Sep 29 '22 08:09 jensmassberg

It feels quite unexpected to me that the database changes the column names. So, I'm not a big fan of the old behaviour. At least, this should be a warning. So it also does not make sense to me to add such a parameter. Instead you should rather rename the columns before copying. But the current behaviour is not that good either. Though, it feels like quite an edge case.

@hadley What's your opinion on that?

mgirlich avatar Oct 06 '22 07:10 mgirlich

Hmmmmmm, given that some databases do this, and we can't really protect it, it seems like it would be safest to ask the database for the column names, and then maybe warn the user if they're changed from what they supplied?

hadley avatar Oct 06 '22 11:10 hadley