mssql icon indicating copy to clipboard operation
mssql copied to clipboard

feat: Add ` mssql_ha_ag_is_contained` and `mssql_ha_ag_reuse_system_db`

Open zhuatuzilo opened this issue 10 months ago • 34 comments

Feature: add support to create Availability Group (AG) with contained database, and reuse system database

add two new var in playbook: 1、 mssql_ha_ag_is_contained if true, the ag created will be contained if false, the var mssql_ha_ag_reuse_system_db will be false Default: false Type: bool

2、 mssql_ha_ag_reuse_system_db if true, the contained ag will be reuse system database Default: false Type: bool

Reason: These are used to specify whether to CONTAINED and REUSE SYSTEM DATABASE when creating new availability group, it's feature in SQL Server 2022 - see the doc from microsoft: https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/contained-availability-groups-overview?view=sql-server-ver16

Result: Users are able to create contained databases and reuse system databases as per the new features of SQL Server 2022.

zhuatuzilo avatar Feb 21 '25 10:02 zhuatuzilo

And please add an entry to README.md about 2 new variables

spetrosi avatar Feb 21 '25 11:02 spetrosi

And please add an entry to README.md about 2 new variables

sorry,today is my first pr in my life,i will commit README.md later

zhuatuzilo avatar Feb 21 '25 11:02 zhuatuzilo

is work in my test,this is final version of configre_ag.j2 if you do not have any suggest,thx

zhuatuzilo avatar Feb 21 '25 12:02 zhuatuzilo

Can you share how you test it? I'd like to add test for it to this repo too.

spetrosi avatar Feb 21 '25 12:02 spetrosi

ok i just upload my inventory and playbook,if have any question ask me please

zhuatuzilo avatar Feb 21 '25 12:02 zhuatuzilo

@zhuatuzilo I need to cover these new variables with tests. What would be the verification steps to test that this setup works? Is it possible to test this with a single node? It would be the easiest scenario.

How we can verify that the role configures SQL Server according to the new variables that you add?

spetrosi avatar Feb 25 '25 13:02 spetrosi

you must create ag to verify it and can verify it through this sql select name,is_contained from sys.availability_groups if is_contained = 1, mean is work

zhuatuzilo avatar Feb 25 '25 13:02 zhuatuzilo

i try to write task and i hope to help you ` tasks:

  • name: Ensure mssql_version is 2022 block:

    • name: Check if mssql_version is 2022 assert: that: - mssql_version == '2022' fail_msg: "Error: mssql_version must be 2022. Current value: {{ mssql_version }}" success_msg: "mssql_version is already set to 2022"
  • name: Run on all hosts to configure HA cluster and contained availability group include_role: name: microsoft.sql.server

  • name: Verify that the availability group is contained vars: __mssql_sql_content_to_input: - |- IF EXISTS( SELECT is_contained FROM sys.availability_groups WHERE name = '{{ mssql_ha_ag_name }}' and is_contained = 1 ) BEGIN PRINT 'SUCCESS, The availability group {{ mssql_ha_ag_name }} is contained' END ELSE BEGIN PRINT 'FAIL, The availability group {{ mssql_ha_ag_name }} is not contained or does not exist' END include_role: name: linux-system-roles.mssql tasks_from: input_sql_files.yml

  • name: Assert that the template reported the correct state assert: that: >- "SUCCESS, The availability group {{ mssql_ha_ag_name }} is contained" in __mssql_sqlcmd_input.stdout fail_msg: "Failure: The availability group {{ mssql_ha_ag_name }} is not contained or does not exist" success_msg: "Success: The availability group {{ mssql_ha_ag_name }} is contained." `

zhuatuzilo avatar Feb 25 '25 14:02 zhuatuzilo

Is there a way to read the value of REUSE_SYSTEM_DATABASES?

spetrosi avatar Feb 25 '25 14:02 spetrosi

@zhuatuzilo thank you for the submission. Can you please explain a little bit more about what exactly this change does, why it is needed, and what it allows the user to do that the user could not previously do?

richm avatar Feb 25 '25 17:02 richm

Is there a way to read the value of REUSE_SYSTEM_DATABASES?

I think cant't,but it will create tow new database ag_name_master and ag_name_msdb like ExampleAG_master and ExampleAG_msdb in this playbook when create new ag . maybe you can use this SQL to verify the dataabse is created or not select 1 from sys.databases where name in ('ExampleAG_master','ExampleAG_msdb') having COUNT(*) = 2

you can refet to this table

重用系统数据库 AG专用系统库 结果
存在 成功创建,并重用AG专用系统库
不存在 自动创建AG专用系统库
存在 创建失败
不存在 自动创建AG专用系统库

zhuatuzilo avatar Feb 26 '25 02:02 zhuatuzilo

@zhuatuzilo thank you for the submission. Can you please explain a little bit more about what exactly this change does, why it is needed, and what it allows the user to do that the user could not previously do?

ok It is used to specify whether to CONTAINED and REUSE SYSTEM DATABASE when creating new availability group, it's feature in SQL Server 2022 see the doc from microsoft: https://learn.microsoft.com/zh-cn/sql/database-engine/availability-groups/windows/contained-availability-groups-overview?view=sql-server-ver16

zhuatuzilo avatar Feb 26 '25 02:02 zhuatuzilo

Is there a way to read the value of REUSE_SYSTEM_DATABASES?

I think cant't,but it will create tow new database ag_name_master and ag_name_msdb like ExampleAG_master and ExampleAG_msdb in this playbook when create new ag . maybe you can use this SQL to verify the dataabse is created or not select 1 from sys.databases where name in ('ExampleAG_master','ExampleAG_msdb') having COUNT(*) = 2

you can refet to this table

含 重用系统数据库 AG专用系统库 结果 是 是 存在 成功创建,并重用AG专用系统库 是 是 不存在 自动创建AG专用系统库 是 否 存在 创建失败 是 否 不存在 自动创建AG专用系统库

Please translate it in English

xiaohuazi123 avatar Feb 26 '25 06:02 xiaohuazi123

Hi guys,can we merge this pr? If have any suggestion,tell me plz. @spetrosi @richm

zhuatuzilo avatar Apr 01 '25 02:04 zhuatuzilo

@spetrosi lgtm - are you still waiting for information from the author?

richm avatar Apr 01 '25 14:04 richm

@spetrosi ready to merge?

richm avatar Apr 11 '25 21:04 richm

Thank you for your contribution, I am sorry this takes so long from our part. I need to cover this with automated tests, but was busy with other staff. I plan to write the tests in May.

spetrosi avatar May 06 '25 14:05 spetrosi

I am solving how to identify the status of those settings. With contained, it's easy: SELECT name, is_contained FROM sys.availability_groups;

But with reuse system databases, it seems like it's not possible at all. It's not possible to change the REUSE_SYSTEM_DATABASES settings after AG is created, need to drop and re-create the AG. So if we add this variable, the role will need to follow this logic: If mssql_ha_ag_is_contained = true -> drop the AG and create it with proper REUSE_SYSTEM_DATABASES. It would create an offline time for the cluster, which I do not like. It also makes the role not idempotent.

I am currently solving this with a Microsoft expert.

spetrosi avatar May 15 '25 13:05 spetrosi

you can refert this table:

is_contained reuse_system_database ag's system db result
true true exist create ag successful, and reuse system database which is exist (like xxxAg_master,xxxAg_msdb)
true true not exist create ag successful, and will create ag's system db
true false exist create ag fail, error message: "the contained system database xxx already exists"
true false not exist create ag successful, and will create ag's system db

if need reuse system database but reuse_system_database = false in playbook and ag'system db is existed, create ag will fail. if reuse_system_database = true , it will created ag successful whether ag's system db is existed or not. because of create ag will be failed when is_contained = true and reuse_system_database = false and ag's system db is existed, so need check ag's system db is existed or not, we can use this sql select 1 from sys.databases where name in ('ExampleAG_master','ExampleAG_msdb') having COUNT(*) = 2(the 'ExampleAG' is ag name), if result is 1 ,means ag's system db is existed, if result is NULL, it means ag's system db is not existed.

zhuatuzilo avatar May 16 '25 07:05 zhuatuzilo

Thank you for the clarification, this helps a lot. I am now able to verify if reuse_system_db is used and configure ag accordingly. I've added a commit with my improvements. I've added tests inside the existing test that runs in our CI in multihost manner. Let's see how it works.

spetrosi avatar May 16 '25 19:05 spetrosi

[citest]

spetrosi avatar May 16 '25 19:05 spetrosi

If mssql_ha_ag_is_contained and not mssql_ha_ag_reuse_system_db, because of the command DROP AVAILABILITY GROUP {{ mssql_ha_ag_name }} without drop ag's system db, so we must drop ag's system db which is existed for ag re-created successfully.

zhuatuzilo avatar May 17 '25 04:05 zhuatuzilo

  1. is_contained = 0 to is_contained = 1, need re-create ag, need check and drop ag's system db when reuse_system_db = 0.
  2. is_contained = 1 to is_contained = 0, need re-create ag, no need pay attention to whether ag's system db is existed.
  3. is_contained = 1 to is_contained = 1, no need re-create ag, no need pay attention to whether ag's system db is existed.
  4. is_contained = 0 to is_contained = 0, no need re-create ag, no need pay attention to whether ag's system db is existed
  5. create new ag, the vars are is_contained = 1 and reuse_system_db = 0, but ag's system db is existed, we just report the error message from SQL Server.

zhuatuzilo avatar May 17 '25 07:05 zhuatuzilo

I made a mistake, we should not drop ag's system db, it maybe casuing DBA lost his data. So if is_contained = 1 and reuse_system_db = 0 and ag's system db is existed, we just report 'Please check it' and RAISERROR. Hope you can tell me your thoughts.

zhuatuzilo avatar May 17 '25 08:05 zhuatuzilo

I change this ag_is_reuse_system_db.j2. because of where name in means where name = 'xx_master' or name ='xx_msdb',so we need control count(*) = 2, it means ‘xx_master’ and 'xx_msdb' are coexisted.

I think count(*) = 2 is not correct because if either of the databases exists, the role must fail with the error message. So not both of them but even 1.

With SELECT name vs SELECT 1, it doesn't matter because if you look closely, the script does IF EXISTS(...)BEGIN PRINT (1) ELSE PRINT (0). This if for the script to return exactly one int. But having name is more helpful for troubleshooting because I can just copy-paste it and see names. If I do SELECT 1 without the construct with PRINT, it returns extra symbols:

1> SELECT 1 FROM sys.databases WHERE name in ('ExampleAG_msdb','ExampleAG_master') 
2> go
           
-----------
          1

spetrosi avatar May 19 '25 09:05 spetrosi

[citest]

spetrosi avatar May 19 '25 11:05 spetrosi

[citest]

spetrosi avatar May 19 '25 12:05 spetrosi

[citest]

spetrosi avatar May 19 '25 13:05 spetrosi

[citest]

spetrosi avatar May 19 '25 14:05 spetrosi

[citest]

spetrosi avatar May 19 '25 16:05 spetrosi