sqlacodegen
sqlacodegen copied to clipboard
Add option to use dialect specific column types instead of generic ones
Discussed in https://github.com/agronholm/sqlacodegen/discussions/202
Originally posted by shengha2 April 29, 2022 We are generating our python code using sqlacodegen so far it works really well. But recently, we bump into an issue with only the Postgres dialect ARRAY support feature that could not be supported with sqlalchemy ARRAY.
We have (generated from generator.py)
from sqlalchemy import ARRAY
We need
from sqlalchemy.dialects.postgresql import ARRAY
Is it possible to specify a flag for generating the postgresql dialect?
I think this can be done in get_adapted_type()
, but I would rather finish my big refactoring work before working on this.
I think we can skip calling get_adapted_type
altogether and just use whatever datatype MetaData.reflect
gives us. The reflection gives the most specific datatype by default. From docs:
When the columns of a table are reflected, using either the Table.autoload_with parameter of Table or the Inspector.get_columns() method of Inspector, the datatypes will be as specific as possible to the target database.
But yes, this can definitely wait until the big refactoring is done
Is there an ETA of the refactoring? In our case, we are blocked by not having the ability to select Array from Postgres dialect. We will need to write raw SQL queries as a workaround.
Why are you not able to manually change the import? sqlacodegen is like MonkeyType, it saves you a lot of typing but was never intended to produce a perfect output for everyone.
Why are you not able to manually change the import? sqlacodegen is like MonkeyType, it saves you a lot of typing but was never intended to produce a perfect output for everyone.
@agronholm I like to keep sqlacodegen's output untouched because that way I can rerun it to check if anything has changed in the database schema.
So why not just compare the postprocessed results of "before" and "after"?
So why not just compare the postprocessed results of "before" and "after"?
@agronholm sorry, I don't understand what you mean
- Run sqlacodegen
- Run
sed
or similar tool to change the column types - (time passes)
- Run sqlacodegen again
- Run
sed
or similar tool to change the column types, against the newly generated code - Compare the results of the first run vs second run
@agronholm Yeah that's good, as long as you can automate the "postprocessing"
If you only need postgresql arrays, that seems like a simple thing, yes?
If this feature was added, it would be nice to be able to have unsigned columns in MySQL. Currently, the BIGINT UNSIGNED columns in my database use the SqlAlchemy BigInt type. But if I wanted to compare that the database can be created properly using the metadata object, the unsigned property is lost completely.
Does the unsigned
flag appear in the reflected column type?
It does not show up in the reflected column type. But it shows up in get_adapted_type()
on this line: compiled_type = coltype.compile(self.bind.engine.dialect)
.
This isn't too much of a problem for my project because the data matches the correct types on the Python side of things. I've just been looking into seeing if there was a way to modify the generator to do something like this for BIGINT UNSIGNED columns in MySQL:
col = Column(mysql.BIGINT(unsigned=True))
I'm confused. When sqlacodegen reflects the schema from MySQL, and the unsignedness does not show up in the column type, how could it possibly render it correctly?
My apologies. I'm new to SqlAlchemy reflection. I've done all of my previous development from scratch.
I was incorrect in my last comment. The reflected column (from tables.c
) for a BIGINT UNSIGNED type does show that unsigned=True
. It's an attribute in the sqlalchemy.dialects.mysql.types._NumericType
class that the sqlalchemy.dialects.mysql.types.BIGINT
inherits from.
Doe this happen also with v3.0.0rc1?
Yes. It happens in both v2.3.0 and v3.0.0rc1.
Edit: I could be wrong but it seems the issue is coming from render_column_type()
. It doesn't seem like attributes from parent classes are being rendered.
If the column type was adapted, then it most certainly lost the unsigned
flag, as it does not exist in the general BigInteger
class.
I don't believe the column type was adapted because in the output file, it still uses the mysql dialect classes. I was able to add a quick and dirty solution in render_column_type()
to get the unsigned
flag working again.
if isinstance(coltype, mysql.types._IntegerType):
if getattr(coltype, "unsigned", None):
kwargs["unsigned"] = True
But nuances like these would be nice to have a flag set so it either uses the logic you currently have or it uses dialect specific logic.
My plan was to use the original column type unless it can be cleanly adapted into a generic column type. Would that not suffice?
That should work in the vast majority of cases but it seems for some dialects like mysql
, there seem to be some attributes that can be set that need to be applied to the Column Type object.
Another column type I know that MySQL databases allocate differently is the Float type. The database I am working with right now uses Doubles for the column because we need the size of Doubles over Floats. When running sqlacodegen on the database and then re-creating it, because of the generic SQLAlchemy Float type, it will make those columns Float in MySQL, not Double.
That's just my point: when adaptation causes the column type to be changed in the round-trip, then that adaptation should not be done.
Yeah, I just thought I'd drop my two cents here because my team is happy that we saw (and now use) this library, and really do appreciate all the work you've put into it so far. We saw this issue posted and figured to add some more support in favor of it, while also addressing the missing attribute information from parent classes.
For now, we can use sqlacodegen as a starting point because we have some big databases we have to replicate now in a web application versus an old desktop application.
If you need a sed solution, this snippet might help you:
sed -i.bak '/ARRAY,/d' models.py \
&& sed -i.bak 's/from sqlalchemy import (/from sqlalchemy.dialects.postgresql import ARRAY\n&/' models.py
A native solution would be very nice!