sqlacodegen Add option to use dialect specific column types instead of generic ones

Discussed in https://github.com/agronholm/sqlacodegen/discussions/202

^{Originally posted by shengha2 April 29, 2022} We are generating our python code using sqlacodegen so far it works really well. But recently, we bump into an issue with only the Postgres dialect ARRAY support feature that could not be supported with sqlalchemy ARRAY.

We have (generated from generator.py) from sqlalchemy import ARRAY

We need from sqlalchemy.dialects.postgresql import ARRAY

Is it possible to specify a flag for generating the postgresql dialect?

May 02 '22 14:05 leonarduschen

I think this can be done in get_adapted_type(), but I would rather finish my big refactoring work before working on this.

May 05 '22 20:05 agronholm

I think we can skip calling get_adapted_type altogether and just use whatever datatype MetaData.reflect gives us. The reflection gives the most specific datatype by default. From docs:

When the columns of a table are reflected, using either the Table.autoload_with parameter of Table or the Inspector.get_columns() method of Inspector, the datatypes will be as specific as possible to the target database.

But yes, this can definitely wait until the big refactoring is done

May 07 '22 08:05 leonarduschen

Is there an ETA of the refactoring? In our case, we are blocked by not having the ability to select Array from Postgres dialect. We will need to write raw SQL queries as a workaround.

May 09 '22 14:05 shengha2

Why are you not able to manually change the import? sqlacodegen is like MonkeyType, it saves you a lot of typing but was never intended to produce a perfect output for everyone.

May 12 '22 07:05 agronholm

Why are you not able to manually change the import? sqlacodegen is like MonkeyType, it saves you a lot of typing but was never intended to produce a perfect output for everyone.

@agronholm I like to keep sqlacodegen's output untouched because that way I can rerun it to check if anything has changed in the database schema.

May 12 '22 19:05 andrew222651

So why not just compare the postprocessed results of "before" and "after"?

May 12 '22 20:05 agronholm

So why not just compare the postprocessed results of "before" and "after"?

@agronholm sorry, I don't understand what you mean

May 12 '22 20:05 andrew222651

Run sqlacodegen
Run sed or similar tool to change the column types
(time passes)
Run sqlacodegen again
Run sed or similar tool to change the column types, against the newly generated code
Compare the results of the first run vs second run

May 12 '22 20:05 agronholm

@agronholm Yeah that's good, as long as you can automate the "postprocessing"

May 12 '22 20:05 andrew222651

If you only need postgresql arrays, that seems like a simple thing, yes?

May 12 '22 20:05 agronholm

If this feature was added, it would be nice to be able to have unsigned columns in MySQL. Currently, the BIGINT UNSIGNED columns in my database use the SqlAlchemy BigInt type. But if I wanted to compare that the database can be created properly using the metadata object, the unsigned property is lost completely.

Oct 20 '22 15:10 cmpgamer

Does the unsigned flag appear in the reflected column type?

Oct 20 '22 15:10 agronholm

It does not show up in the reflected column type. But it shows up in get_adapted_type() on this line: compiled_type = coltype.compile(self.bind.engine.dialect).

This isn't too much of a problem for my project because the data matches the correct types on the Python side of things. I've just been looking into seeing if there was a way to modify the generator to do something like this for BIGINT UNSIGNED columns in MySQL:

col = Column(mysql.BIGINT(unsigned=True))

Oct 20 '22 15:10 cmpgamer

I'm confused. When sqlacodegen reflects the schema from MySQL, and the unsignedness does not show up in the column type, how could it possibly render it correctly?

Oct 20 '22 18:10 agronholm

My apologies. I'm new to SqlAlchemy reflection. I've done all of my previous development from scratch.

I was incorrect in my last comment. The reflected column (from tables.c) for a BIGINT UNSIGNED type does show that unsigned=True. It's an attribute in the sqlalchemy.dialects.mysql.types._NumericType class that the sqlalchemy.dialects.mysql.types.BIGINT inherits from.

Oct 20 '22 18:10 cmpgamer

Doe this happen also with v3.0.0rc1?

Oct 20 '22 18:10 agronholm

Yes. It happens in both v2.3.0 and v3.0.0rc1.

Edit: I could be wrong but it seems the issue is coming from render_column_type(). It doesn't seem like attributes from parent classes are being rendered.

Oct 20 '22 18:10 cmpgamer

If the column type was adapted, then it most certainly lost the unsigned flag, as it does not exist in the general BigInteger class.

Oct 20 '22 18:10 agronholm

I don't believe the column type was adapted because in the output file, it still uses the mysql dialect classes. I was able to add a quick and dirty solution in render_column_type() to get the unsigned flag working again.

if isinstance(coltype, mysql.types._IntegerType):
    if getattr(coltype, "unsigned", None):
        kwargs["unsigned"] = True

But nuances like these would be nice to have a flag set so it either uses the logic you currently have or it uses dialect specific logic.

Oct 20 '22 18:10 cmpgamer

My plan was to use the original column type unless it can be cleanly adapted into a generic column type. Would that not suffice?

Oct 20 '22 21:10 agronholm

That should work in the vast majority of cases but it seems for some dialects like mysql, there seem to be some attributes that can be set that need to be applied to the Column Type object.

Another column type I know that MySQL databases allocate differently is the Float type. The database I am working with right now uses Doubles for the column because we need the size of Doubles over Floats. When running sqlacodegen on the database and then re-creating it, because of the generic SQLAlchemy Float type, it will make those columns Float in MySQL, not Double.

Oct 20 '22 21:10 cmpgamer

That's just my point: when adaptation causes the column type to be changed in the round-trip, then that adaptation should not be done.

Oct 20 '22 21:10 agronholm

Yeah, I just thought I'd drop my two cents here because my team is happy that we saw (and now use) this library, and really do appreciate all the work you've put into it so far. We saw this issue posted and figured to add some more support in favor of it, while also addressing the missing attribute information from parent classes.

For now, we can use sqlacodegen as a starting point because we have some big databases we have to replicate now in a web application versus an old desktop application.

Oct 20 '22 21:10 cmpgamer

If you need a sed solution, this snippet might help you:

sed -i.bak '/ARRAY,/d' models.py \
    && sed -i.bak 's/from sqlalchemy import (/from sqlalchemy.dialects.postgresql import ARRAY\n&/' models.py

A native solution would be very nice!

Dec 14 '23 00:12 zakandrewking