NServiceBus.Persistence.Sql icon indicating copy to clipboard operation
NServiceBus.Persistence.Sql copied to clipboard

SQL Server: Add a SQL2019 dialect that supports storing text as Utf8

Open ramonsmits opened this issue 2 years ago • 0 comments

Describe the suggested improvement

Is your improvement related to a problem? Please describe.

Currently, outbox transport operations (MessageId, Operations columns), saga data (Datacolumn) and saga string correlation (Correlation_% column) use the nvarchar type.

This requires twice as much storage for most data stored in these columns as SQL Server its nvarchar type uses 16 bit characters.

Supporting varchar UTF8 collation would half the required storage.

Describe the suggested solution

SQL Server 2019 now supports UTF8:

  • https://learn.microsoft.com/en-us/sql/sql-server/what-s-new-in-sql-server-2019?view=sql-server-ver15&preserve-view=true#unicode-support
  • https://learn.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support?view=sql-server-ver15

Likely the easiest approach is to:

  1. Clone SqlDialect.cs to Sql2019Dialect.cs and make it inherit from SqlDialect:
  2. Change column definition to UTF8 in the script builder to use an explicit collation that supports UTF8:
  • From: Operations nvarchar(max) not null
  • To: Operations varchar(max) COLLATE Latin1_General_100_CI_AI_SC_UTF8 not null

Describe alternatives you've considered

An alternative is to not use nvarchar but varbinary and convert to utf8 byte structure in the persisted which would also be compatible with older versions of SQL Server.

ramonsmits avatar Nov 16 '23 16:11 ramonsmits