
SQL query containing Unicode characters does not work properly (encoding issue)

Open gocha opened this issue 3 years ago • 4 comments

Issue

A SQL query containing Unicode characters does not work properly. The query works fine in the portal, but the Get-CosmosDbDocument cmdlet does not return the matching documents.

$query = "SELECT * FROM customers c WHERE (c.content = '杉本 司')"
Get-CosmosDbDocument -Context $cosmosDbContext -CollectionId 'MyNewCollection' -Query $query

This looks like the same problem as #151. It would probably be solved if Invoke-CosmosDbRequest encoded the request as UTF-8. When I inspected the request with Fiddler, the Unicode characters in the query appeared garbled.
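To illustrate the kind of garbling described here, the following is a minimal sketch that encodes the same query with UTF-8 and with a non-Unicode encoding. It does not reproduce what Invoke-CosmosDbRequest does internally; ASCII is used only as an assumed stand-in for whatever non-UTF-8 default was in effect, and the variable names are illustrative.

```powershell
# Sketch only: compare the bytes produced for the same query by UTF-8
# and by a non-Unicode encoding (ASCII is an assumed stand-in here).
$query = "SELECT * FROM customers c WHERE (c.content = '杉本 司')"

$utf8  = [System.Text.Encoding]::UTF8
$ascii = [System.Text.Encoding]::ASCII

# Bytes as they would be placed in a request body under each encoding.
$utf8Bytes  = $utf8.GetBytes($query)
$asciiBytes = $ascii.GetBytes($query)

# Decode again to see what a receiver would reconstruct.
$utf8RoundTrip  = $utf8.GetString($utf8Bytes)    # identical to $query
$asciiRoundTrip = $ascii.GetString($asciiBytes)  # non-ASCII characters become '?'

"UTF-8 round trip intact: $($utf8RoundTrip -ceq $query)"
"ASCII round trip:        $asciiRoundTrip"
```

With the non-Unicode encoding the filter value no longer matches any stored document, which would be consistent with a query that returns nothing rather than failing outright.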

I'm wondering: when would the client ever need to send the request in an encoding other than UTF-8, and does that work properly? I suspect that if Invoke-CosmosDbRequest always sent the request as UTF-8, this kind of problem would not occur.

Also, Unicode characters are nothing unusual outside the English-speaking world, so I find it inconvenient to have to specify the encoding explicitly in order to use them.

  • PowerShell version: 5.1 and 7.1
  • Operating system: Windows 10
  • Version of CosmosDB PowerShell Module: 4.4.3

Thank you for reading.

gocha avatar Mar 03 '21 10:03 gocha

Although I haven't tried it, I expect that the Invoke-CosmosDbStoredProcedure cmdlet has the same problem and will not work correctly if you use Unicode characters in the StoredProcedureParameters parameter.

gocha avatar Mar 03 '21 13:03 gocha

Thanks for raising this @gocha - I suspect you're right. I think sending everything as UTF-8 is the way to go. I'll aim to make the change over the weekend.

PlagueHO avatar Mar 03 '21 19:03 PlagueHO

Could it be that New-CosmosDbDocument is also unable to upload JSON content containing Unicode characters (UTF-8) correctly?

Solved for New-CosmosDbDocument by adding the parameter -Encoding 'UTF-8'.

weyCC81 avatar Nov 16 '23 23:11 weyCC81

Sometimes I run into problems when using UTF-8 encoding. It turned out I just needed to disable the BOM, and then it worked fine.
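The comment does not say where the BOM came from, so the following is only a sketch of one common case: producing UTF-8 text from PowerShell without the three-byte preamble. It uses the .NET UTF8Encoding class directly; the file name and the sample JSON are hypothetical.

```powershell
# Sketch only: UTF-8 with and without a byte order mark (BOM).
$text = '{"content":"杉本 司"}'

$withBom    = [System.Text.UTF8Encoding]::new($true)   # emits EF BB BF preamble
$withoutBom = [System.Text.UTF8Encoding]::new($false)  # no preamble

# GetPreamble() returns the bytes a writer would put ahead of the content.
$bomLength   = $withBom.GetPreamble().Length
$noBomLength = $withoutBom.GetPreamble().Length

# Write the text without a BOM and read the raw bytes back.
# The temp file name is hypothetical.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'cosmos-doc-nobom.json'
[System.IO.File]::WriteAllText($path, $text, $withoutBom)
$bytes = [System.IO.File]::ReadAllBytes($path)
Remove-Item -Path $path

"Preamble length with BOM:    $bomLength"
"Preamble length without BOM: $noBomLength"
"First byte of file:          0x{0:X2}" -f $bytes[0]
```

A consumer that does not expect a preamble may treat those leading bytes as part of the payload, which is one plausible way a BOM could interfere with otherwise valid UTF-8 JSON.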

loligans avatar Mar 06 '24 14:03 loligans