csvkit
csvkit copied to clipboard
csvsql: Opt-in to use INTEGER instead of DECIMAL
My use-case: I use csvsql to quickly build schemas based on text files. I'm very happy with it, but sometimes forget to modify some DECIMAL fields to be integers, as a lot of the data I work with is relational and uses integer-based foreign keys.
The docs are unambiguous in its opinion, for good reason I think:
Output favors consistency over brevity: Numbers always include at least one decimal place, even if they are round
But I wondered if adding an optional flag, i.e. --int to suppress this behavior, would be a welcome change? I may put together a PR if there's interest.
We welcome PRs 😃 See also #151. A generic solution might be to declare the desired type of individual columns.
I agree, but the problem here is that inference does not work for integers (DECIMAL seems always used).
By the way, great software! :)
I think it's fine if the data is read as decimals via inference. The feature is to allow the user to override the column type used by csvsql when writing the data to the database. (I suppose an alternative way to implement the feature is to add "integer" as another inferred type, but that will likely require changing a lot more code.)
We were happily using csvsql and when installing on a new system it got upgraded from 0.9.0 to 1.0.6 and completely broke our application as all the integer columns in postgresql went to numeric. Our application which reads the database does not expect to get .0 appended to ID numbers used to generate file names, for example. Perhaps a backwards compatibility mode would be useful?
1.0 is not backwards compatible with 0.9 - that’s just semantic versioning. You can still install and run 0.9.
You can hack around this by issuing cast(columnname as int) in the SQL query for all the columns getting converted to decimal where you need ints.
A flag would be helpful.