Enforce UTF-8 as the default charset for API mappings
This fixes #1346.
A CharacterEncodingFilter is added to enforce UTF-8 encoding for every API request.
@amvanbaren - At your earliest convenience, could you please take a look at this MR?
fyi: this is just one solution to the problem, I am happy to discuss other approaches but we should certainly ensure utf-8 encoding throughout the server imho.
Another option would be to add this to the application.yaml:
server:
  servlet:
    encoding:
      charset: UTF-8 # it's already the default, just to make it clear that this is what we want
      force: true
A third option would be to explicitly set the content encoding to UTF-8 on every response, but that is tedious and it is easy to miss some occurrences.
The downside of changing the configuration is that every instance has to be configured this way, whereas hardcoding it in the application itself covers all instances.
fyi: this is just one solution to the problem, I am happy to discuss other approaches but we should certainly ensure utf-8 encoding throughout the server imho.
Throughout or only /api?
The change is currently only for /api, as these routes are most affected, but the whole app should probably default to UTF-8. Not sure why that's not the case; the Spring documentation on this is rather sparse.
Some claim that UTF-8 is the default, but I failed to find official documentation about it. Maybe the charset already defaults to UTF-8 and only the force parameter is not set.
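For context, here is a minimal plain-Java sketch (no Spring involved) of the symptom this PR is about. It assumes the text was written as UTF-8 but is decoded as ISO-8859-1, which, as far as I know, is the fallback the Servlet specification prescribes when no charset is declared. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the server code base.
final class CharsetMismatchDemo {

    // Encodes the text as UTF-8 and then decodes the bytes as ISO-8859-1,
    // which is what a client does when the response declares no charset
    // and it falls back to Latin-1.
    static String decodeAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "\u00E9" is the letter e with an acute accent. UTF-8 encodes it
        // as the two bytes 0xC3 0xA9, and Latin-1 maps each byte to its
        // own character, so one letter turns into two.
        String garbled = decodeAsLatin1("caf\u00E9");
        System.out.println(garbled.length()); // 5 instead of 4
    }
}
```

Pure ASCII content survives this round trip unchanged, which may be why the problem only shows up for some extensions.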
This works for local storage, but not for cloud storage.
I have not been able to test on cloud storage yet, so I feared that it would not work.
Digging more into this topic, you can actually set properties for files stored in a blob: https://learn.microsoft.com/en-us/rest/api/storageservices/set-blob-properties?tabs=microsoft-entra-id
That should also include the content type and encoding, so we should change the existing storage provider to set the encoding to UTF-8 by default.
The question is how we modify the existing files: there are currently 1.3M entries, of which several hundred thousand are text / JSON files that should be changed, afaict.
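To make the migration question more concrete, here is a rough sketch of how the new content type for an existing file could be decided. Everything here is an assumption for discussion: the class and method names are hypothetical, "textual" is assumed to mean text/* plus application/json, and the actual call that updates the blob properties is left out on purpose.

```java
import java.util.Locale;

// Hypothetical helper, not part of the existing storage provider.
final class Utf8ContentType {

    // Assumption: only text/* and application/json need a charset.
    static boolean isTextual(String contentType) {
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.startsWith("text/") || mediaType.equals("application/json");
    }

    // Returns the content type with "; charset=utf-8" appended when the
    // type is textual and declares no charset yet. All other values,
    // including null and blank ones, are returned unchanged.
    static String withUtf8Charset(String contentType) {
        if (contentType == null || contentType.trim().isEmpty()) {
            return contentType;
        }
        if (!isTextual(contentType)) {
            return contentType;
        }
        if (contentType.toLowerCase(Locale.ROOT).contains("charset=")) {
            return contentType;
        }
        return contentType.trim() + "; charset=utf-8";
    }
}
```

A migration job could skip every file for which withUtf8Charset returns the stored value unchanged, so only the text / JSON entries without a charset would need an update.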