SWIG Layer.SetAttributeFilter method fails with MSSQL for non utf-8 strings
What is the bug?
SWIG SetAttributeFilter method converts strings to utf-8 byte array. As i understand MSSQL trying to decode string as cp1251 encoded (or other depending on MSSQL collation) and filter is not working properly
Tried this with C# bindings and python bindings with the same result.
Linked issue: MaxRev-Dev/gdal.netcore#135
Steps to reproduce the issue
Use SetAttributeFilter with any cyrillic symbols on MSSQL dataset with cyrillic collation.
Versions and provenance
OS: Windows 11 GDAL version: 3.9.0 C# library: https://github.com/MaxRev-Dev/gdal.netcore Python library: GDAL 3.3.3 MSSQL: docker image mcr.microsoft.com/mssql/server:2019-latest with collation: Cyrillic_General_CI_AS
Additional context
Temporary Solution Invoke gdal function directly and pass string with proper encoding (in c#) or properly encoded bytearray (C# and python)
In C# with cyrillic encoding (string)
OGR_L_SetAttributeFilter(Layer.getCPtr(inputLayer), filter);
[DllImport("gdal")]
static extern int OGR_L_SetAttributeFilter(HandleRef layer, [MarshalAs(UnmanagedType.LPStr)] string filter);
In C# with cyrillic encoding (bytearray)
OGR_L_SetAttributeFilter(Layer.getCPtr(inputLayer), Encoding.GetEncoding(1251).GetBytes(filter));
[DllImport("gdal")]
static extern int OGR_L_SetAttributeFilter(HandleRef layer, byte[] filter);
In Python with cyrillic encoding
import ctypes as C
gdal_dll = C.CDLL("gdal\dll\path")
gdal_dll.OGR_L_SetAttributeFilter(C.c_void_p(int(input_layer.this)), filter_string.encode("cp1251"))
Sugestion I think best solution would be modify OGRSQL converting to transact sql - passing all strings like unicode (N'string') (if i understand the problem correctly) or add overload of SetAttributeFilter in SWIG with bytearray filter argument so we can pass propetly encoded byte arrays to method if needed
CC @szekerest