gitea icon indicating copy to clipboard operation
gitea copied to clipboard

Converts columns from NULL to NOT NULL DEFAULT

Open KN4CK3R opened this issue 3 years ago • 2 comments

depends on #18868 and #18869

This PR converts columns from NULL to NOT NULL DEFAULT. This should concern only the tests because the fixtures insert NULL values. The usual xorm.Inserts do not insert NULL values for int and bool.

KN4CK3R avatar Feb 23 '22 20:02 KN4CK3R

Codecov Report

Merging #18871 (de70352) into main (1546580) will decrease coverage by 0.02%. The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #18871      +/-   ##
==========================================
- Coverage   46.60%   46.58%   -0.03%     
==========================================
  Files         854      854              
  Lines      122891   122891              
==========================================
- Hits        57271    57243      -28     
- Misses      58701    58734      +33     
+ Partials     6919     6914       -5     
Impacted Files Coverage Δ
models/issue.go 53.84% <ø> (ø)
models/issue_label.go 65.05% <ø> (ø)
models/issue_milestone.go 67.62% <ø> (ø)
models/org_team.go 54.33% <ø> (ø)
models/repo/attachment.go 50.33% <ø> (ø)
models/repo/repo.go 63.04% <ø> (ø)
models/repo/topic.go 61.53% <ø> (ø)
models/user/user.go 57.97% <ø> (ø)
modules/queue/queue_bytefifo.go 46.90% <0.00%> (-5.46%) :arrow_down:
modules/charset/charset.go 69.74% <0.00%> (-4.21%) :arrow_down:
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1546580...de70352. Read the comment docs.

codecov-commenter avatar Mar 08 '22 15:03 codecov-commenter

I checked the models in the migration just now. Changes are still valid.

KN4CK3R avatar May 02 '22 19:05 KN4CK3R

conflicted.

lunny avatar Sep 19 '22 08:09 lunny

Yes this can be done using alter table set default and alter table set not null for all databases except sqlite3

lafriks avatar Sep 27 '22 08:09 lafriks

@lunny Looks like xorm.ColumnString generates an invalid command for mssql. In this case it's ALTER TABLE [label] ALTER COLUMN [num_issues] INT DEFAULT 0 NOT NULL.

KN4CK3R avatar Sep 27 '22 12:09 KN4CK3R

blocking as recreating whole database is not feasable for big instances like...

gitea.com, codeberg.org, ...

I would against about this point. Gitea is not only for codeberg, it's for everyone.

In history, Gitea (gogs)'s code base is not ideal, there will be a lot of refactoring to make Gitea more "right". One instance should not be the blocker for a general purpose project.

For the people who need to customize Gitea a lot, they could alter the database without downtime ahead and skip the migration manually.


Update: this comment is a general thought, not for this PR. For this PR itself, I haven't fully understand the problem and have some worries and added a new comment.

wxiaoguang avatar Oct 13 '22 04:10 wxiaoguang

I would against about this point. Gitea is not only for codeberg, it's for everyone.

In history, Gitea (gogs)'s code base is not ideal, there will be a lot of refactoring to make Gitea more "right". One instance should not be the blocker for a general purpose project.

For the people who need to customize Gitea a lot, they could alter the database without downtime ahead and skip the migration manually.

But why do you think that there could not be other large private instances?

lafriks avatar Oct 13 '22 09:10 lafriks

I would against about this point. Gitea is not only for codeberg, it's for everyone. In history, Gitea (gogs)'s code base is not ideal, there will be a lot of refactoring to make Gitea more "right". One instance should not be the blocker for a general purpose project. For the people who need to customize Gitea a lot, they could alter the database without downtime ahead and skip the migration manually.

But why do you think that there could not be other large private instances?

There could be, IMO they could either take a downtime or apply the changes manually?

wxiaoguang avatar Oct 13 '22 09:10 wxiaoguang

Although I think it's good to make the database columns more strict and said "I couldn't agree with that 'database migrations should be blocked by large instances'", I just mean a general thought. Sorry that I didn't say clearly in first time.

For this PR I have some worries (I haven't thought about these problems carefully, just questions):

  • Does it need a full test for all databases to make sure the migration works well?
  • Does the x.Exec("UPDATE ... need to be merged into one UPDATE SET col1=...,col2=... to speed up for large tables?
  • Does the DEFAULT false or UPDATE SET col=false(default) work for all databases? IIRC XORM does some tricks for bool/true/false

  • One more question, if this PR is mainly for tests and it's difficult to make the migration clear or too much time-consuming, could there be an alternative approach to only make tests correct while keeping the database structure unchanged?

wxiaoguang avatar Oct 16 '22 15:10 wxiaoguang

Does it need a full test for all databases to make sure the migration works well?

How would a "full test" look like?

Does the x.Exec("UPDATE ... need to be merged into one UPDATE SET col1=...,col2=... to speed up for large tables?

That's "not" possible because of the WHERE statement (should not update col1 if col2 is null). A different solution could be col1 = ISNULL(...) but I don't know if that's better or supported.

Does the DEFAULT false or UPDATE SET col=false(default) work for all databases? IIRC XORM does some tricks for bool/true/false

We already have lots of DEFAULT false/true statements so I think that works?!

One more question, if this PR is mainly for tests and it's difficult to make the migration clear, could there be an alternative approach to only make tests correct while keeping the database structure unchanged?

Sure, we would have to add all fields to the fixtures (or enforce default values somehow). Then there would be no change to the database which is a pro or a con...?

KN4CK3R avatar Oct 16 '22 16:10 KN4CK3R

Does it need a full test for all databases to make sure the migration works well?

How would a "full test" look like?

Just a question, not sure whether "tests/integration/migration-test" covers all cases.

Does the DEFAULT false or UPDATE SET col=false(default) work for all databases? IIRC XORM does some tricks for bool/true/false

We already have lots of DEFAULT false/true statements so I think that works?!

I recalled some reviews mentioned that "do not write col=true/false" because it would cause broken SQL, I am not sure whether it's a real problem, so just ask the question. Update: checked with XORM code, it will change the column.Default to dialect values. So no problem

One more question, if this PR is mainly for tests and it's difficult to make the migration clear, could there be an alternative approach to only make tests correct while keeping the database structure unchanged?

Sure, we would have to add all fields to the fixtures (or enforce default values somehow). Then there would be no change to the database which is a pro or a con...?

Just a thought, maybe there could be some "default fixtures" during testing, which contains all default values for the fields, then all loaded fixtures first get filled by default fixtures, then load the real fixture data.

Update: Pros: no changes to database. Cons: may lead to inconsistent test results ..... So maybe it's not a good idea ... in the end the database should have strict structure .......

wxiaoguang avatar Oct 16 '22 16:10 wxiaoguang

@lunny Looks like xorm.ColumnString generates an invalid command for mssql. In this case it's ALTER TABLE [label] ALTER COLUMN [num_issues] INT DEFAULT 0 NOT NULL.

Is this a bug of xorm?

lunny avatar Oct 16 '22 16:10 lunny

Does the x.Exec("UPDATE ... need to be merged into one UPDATE SET col1=...,col2=... to speed up for large tables?

That's "not" possible because of the WHERE statement (should not update col1 if col2 is null). A different solution could be col1 = ISNULL(...) but I don't know if that's better or supported.

Since most columns do not have an index, so most UPDATE will be a full-table scan. So the all-in-one UPDATE might be: UPDATE ... SET col1=COALESCE(col1, DefaultValue), ... (without the WHERE)

wxiaoguang avatar Oct 16 '22 16:10 wxiaoguang

It's safer to update all NULL values to non NULL values before change the columns to NOT NULL, otherwise it will fail when upgrading.

lunny avatar Oct 16 '22 16:10 lunny

It's safer to update all NULL values to non NULL values before change the columns to NOT NULL, otherwise it will fail when upgrading.

Yup, that's what this PR has done, first UPDATE, then modifyColumns

wxiaoguang avatar Oct 16 '22 16:10 wxiaoguang

Hmm ..... after some more thinking, my personal conclusion at the moment is:

  • I support making database structure strict if it's very very necessary.
  • For tests, the fixtures could be fixed by adding missing fields (although it needs some work)
  • If this PR is only for test purpose, maybe we can leave it until there is a more strong requirement for it, because this PR only adds "NOT NULL" to some columns. If we take it, in the future, maybe another similar PR will appear to add more "NOT NULL" for other columns, which is pretty annoying (as a non-native speaker, I can not find another word for it 😁 ) ....

So I would propose to leave this PR as it is, fix the fixtures for Add some api integration tests #18872 manually. I could also help if it's needed.

wxiaoguang avatar Oct 16 '22 17:10 wxiaoguang

@KN4CK3R are you currently working on this? we can close this PR.

puni9869 avatar Sep 01 '23 05:09 puni9869

Not working on this because it was finished back then. It's still relevant but not an easy migration for existing large instances.

KN4CK3R avatar Sep 01 '23 06:09 KN4CK3R

Any forward plans for this. As piece by piece migration. There are lots of migration we need

puni9869 avatar Sep 01 '23 06:09 puni9869