grass
grass copied to clipboard
Accept more numeric types in Python modules
There is more numeric types in SQL than just INTEGER and DOUBLE PRECISION, for example, typical one in SQLite is REAL, so db.univar (and v.db.univar) need to accept those too as numeric types. Currently, db.univar errors when column has type REAL, although it should just work. This PR fixes that.
v.db.update allows user to pass invalid numbers because it does apply quoting even for some numeric types such as REAL. This causes zeros being recorded in the table without any warning with SQLite backend. The new version in this PR skips quoting for more types and causes SQL syntax error when the value provided is not a number (e.g. when user puts an expression into value instead of query_column).
v.dissolve does not seem to have issues with SQLite unlike the other two (three), but it seems to me that a similar fix is needed. I reused the functions created for the other cases, but the v.dissolve case is little more complex since string types are involved too. If someone can test this with PostgreSQL or comment on it, that would be great. The question is if v.dissolve accepts all data types it should accept? Maybe the best course of action is to solely rely on v.reclass for this test (and an informative error message!) or drop v.dissolve from this PR completely. Let me know what you think.
Anyway, the functionality needed for db.univar and v.db.update seems general enough, so I added the utility functions to the library (grass.script.db). (In fact, the initial implementation of this is in v.db.pyupdate, so that's another use for them.) An alternative would be to somehow mimic the db_sqltype_to_Ctype() from the C library or somehow wrap that in Python, but I'm not sure about the cost/benefit ratio there.
There is some more info in the individual commit messages.
Well, the two usages we have now are Should I quote it? and Can I use it for statistics?.
The function name is now sql_type_is_numeric(), but perhaps there should be two, sql_type_needs_quotes() (same as the current sql_type_is_numeric(), just opposite) and sql_type_is_number() (like the current one, but with a more neural name and may include DECIMAL/NUMERIC).
The thing is that the C modules do their own test and if the Python module is just doing some checking before running the C module as opposed to doing the computation in Python or sending a query to the database, the test needs to fit the C module. This suggests there is even more context than just quoting versus computing. This brings back the idea of getting result of db_sqltype_to_Ctype() into Python, perhaps as a column type with some v.info -c alternative.
I think the current approach is that DECIMAL needs special treatment which is not provided outside of the actual database, so if users wants to do computation, the right course of action is to convert that to DOUBLE and than do the computation (and than convert back to DECIMAL if needed).