csvkit icon indicating copy to clipboard operation
csvkit copied to clipboard

csvsql: Opt-in to use INTEGER instead of DECIMAL

Open bahoo opened this issue 5 years ago • 6 comments

My use-case: I use csvsql to quickly build schemas based on text files. I'm very happy with it, but sometimes forget to modify some DECIMAL fields to be integers, as a lot of the data I work with is relational and uses integer-based foreign keys.

The docs are unambiguous in its opinion, for good reason I think:

Output favors consistency over brevity: Numbers always include at least one decimal place, even if they are round

But I wondered if adding an optional flag, i.e. --int to suppress this behavior, would be a welcome change? I may put together a PR if there's interest.

bahoo avatar Apr 06 '20 23:04 bahoo

We welcome PRs 😃 See also #151. A generic solution might be to declare the desired type of individual columns.

jpmckinney avatar Jul 07 '20 17:07 jpmckinney

I agree, but the problem here is that inference does not work for integers (DECIMAL seems always used).

By the way, great software! :)

frafra avatar Jun 11 '21 18:06 frafra

I think it's fine if the data is read as decimals via inference. The feature is to allow the user to override the column type used by csvsql when writing the data to the database. (I suppose an alternative way to implement the feature is to add "integer" as another inferred type, but that will likely require changing a lot more code.)

jpmckinney avatar Jul 12 '21 22:07 jpmckinney

We were happily using csvsql and when installing on a new system it got upgraded from 0.9.0 to 1.0.6 and completely broke our application as all the integer columns in postgresql went to numeric. Our application which reads the database does not expect to get .0 appended to ID numbers used to generate file names, for example. Perhaps a backwards compatibility mode would be useful?

mikeubell avatar Feb 05 '22 00:02 mikeubell

1.0 is not backwards compatible with 0.9 - that’s just semantic versioning. You can still install and run 0.9.

jpmckinney avatar Feb 05 '22 08:02 jpmckinney

You can hack around this by issuing cast(columnname as int) in the SQL query for all the columns getting converted to decimal where you need ints. A flag would be helpful.

tgoeg avatar Aug 21 '23 16:08 tgoeg