libs-base
gnustep-base 1.28.0 triggers a regression in sogo
See https://www.sogo.nu/bugs/view.php?id=5416 for more details. On OpenBSD 7.0 I've found that gnustep-base 1.28.0 triggers wrong behaviour in SOGo: ICS content served by the database comes back with extra XML-encoded content. This breaks things for end users, who cannot see events they just created in Thunderbird and pushed to SOGo.
working curl caldav request with gnustep-base 1.27.0 gives:
...
<D:propstat><D:prop><C:calendar-data>BEGIN:VCALENDAR
...
END:VCALENDAR</C:calendar-data><D:getetag>"gcs00000000"</D:getetag>...
with gnustep-base 1.28.0 this gives:
<D:propstat><D:prop><C:calendar-data>BEGIN:VCALENDAR
...
END:VCALENDAR</C:calendar-data><D:getetag>"gcs00000000"</D:getetag>
note the extra character reference, which is an XML-encoded representation of the UTF-8 BOM character. I'm not sure where the gnustep-base code is called from SOGo, but I know it's a regression from the gnustep-base upgrade.
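For anyone wanting to check their own CalDAV response for this, here is a rough Python sketch (not SOGo or gnustep-base code). It assumes the BOM shows up either as the raw character U+FEFF or as a numeric character reference. The helper name `find_bom_forms` and both sample strings are made up for illustration, including the position of the reference in the second one.

```python
import re

BOM = "\ufeff"  # U+FEFF, which UTF-8 encodes as the bytes EF BB BF

# Numeric character references for U+FEFF: decimal 65279 or hexadecimal FEFF.
_BOM_REF = re.compile(r"&#(?:0*65279|[xX]0*[fF][eE][fF][fF]);")


def find_bom_forms(xml_text):
    """Return which BOM forms occur in xml_text (hypothetical helper)."""
    found = []
    if BOM in xml_text:
        found.append("raw")
    if _BOM_REF.search(xml_text):
        found.append("reference")
    return found


clean = "<C:calendar-data>BEGIN:VCALENDAR"
dirty = "<C:calendar-data>&#xFEFF;BEGIN:VCALENDAR"  # made-up sample, not from the report

print(find_bom_forms(clean), find_bom_forms(dirty))
```

Running something like this over the body returned by the curl request would show quickly whether a given setup is affected.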
looking, but i'm suspicious about bd5f2909e6edc8012a0a6e44ea1402dfbe1353a4 now.
testing with sogo 5.2.0 built against gnustep-1.28.0 with bd5f290 reverted, the setup is working fine. So that's definitely a regression from bd5f290, to which i dunno if sogo should adapt ...
It really would help to understand which encoding your code is requesting here. Most likely it is NSUnicodeStringEncoding with the dataUsingEncoding: method on NSString. You should be fine when using NSUTF16BigEndianStringEncoding.
Looking into the original bug report you linked to I can see that this is about UTF-8 and there this makes no sense. As far as I am aware we do not add a BOM for UTF-8. What could happen is that there is already a BOM in the string. Now it really would help to understand how this data gets produced.
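To make the distinction concrete, here is a small Python analogy. It is not gnustep-base code and says nothing about what gnustep-base actually does. The assumption is that Python's generic "utf-16" codec plays the role described above for NSUnicodeStringEncoding (BOM at the start), and "utf-16-be" the role of NSUTF16BigEndianStringEncoding (no BOM). It also shows that UTF-8 encoding adds no BOM by itself, but faithfully carries one that is already in the string.

```python
import codecs

text = "BEGIN:VCALENDAR"

# The generic "utf-16" codec prepends a BOM (byte order follows the platform).
with_bom = text.encode("utf-16")

# An explicit byte order emits no BOM.
no_bom = text.encode("utf-16-be")

# Plain UTF-8 encoding does not add a BOM either...
utf8 = text.encode("utf-8")

# ...but a U+FEFF already in the string is encoded like any other character.
utf8_carried = ("\ufeff" + text).encode("utf-8")

print(with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
print(no_bom[:2])
print(utf8.startswith(codecs.BOM_UTF8), utf8_carried.startswith(codecs.BOM_UTF8))
```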
From what i can understand (and i understand absolutely nothing to obj-c) getetag (mapped to davEntityTag in sope, https://github.com/inverse-inc/sope/search?q=getetag + https://github.com/inverse-inc/sogo/search?q=davEntityTag) and calendar-data (mapped to davCalendarData in sope, https://github.com/inverse-inc/sope/search?q=calendar-data + https://github.com/inverse-inc/sogo/search?q=davCalendarData) seem to be NSString. maybe @cgx would know ?
I've been bitten by this, too; for details please see
https://www.sogo.nu/bugs/view.php?id=5416#c15591
@landryb's findings and patch saved my day, thanks!
From the code I could find in SOGo this is not triggered by an encoding conversion. The most likely davEntityTag method looks like this:
- (id) davEntityTag { return [NSString stringWithFormat: @"\"gcs%.8d\"", version]; }
(where version is an unsigned int value)
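As a quick sanity check of that format string, here is a Python equivalent. This is only an analogy: Python's %-formatting handles "%.8d" the way C does (at least eight digits, zero padded), and the function name `dav_entity_tag` is made up. It suggests that formatting on its own yields pure ASCII, so a BOM would have to come from somewhere else.

```python
def dav_entity_tag(version):
    # "%.8d" pads the integer with leading zeros to at least 8 digits.
    return '"gcs%.8d"' % version


tag = dav_entity_tag(0)
print(tag)
print(tag.isascii())
```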
But I don't see where this gets used together with the calendar-data. Could somebody with a bit of SoGo knowledge please explain how to reproduce this issue with pure GNUstep code?
@rfm could you please have a look here? I am a bit confused on what is actually going on here. While looking at the change that seems to cause all this, I noticed that the optimisation for the dataUsingEncoding: method is only present for the constant string class, not for the other ones. Was that on purpose? This should be unrelated to this issue, as the code in NSString looks correct to me.
(Drive-by comment: It seems like this would be grounds for adding a minimum unit test demonstrating the expected behavior, so we don't break this in the future..)
@ivucica you are correct. The problem is just that we don't know what features of NSString SOGo is using here.
@rfm this is a test of whether you now get mails for bug reports. Your help here is appreciated.
Well, it appears likely that something is putting a BOM in a string and that's then being output as UTF-8, but we really need to find out where. I agree that a unit test to reproduce it would be good. NB. The only BOM-related changes I recall in the last year or two were making gnustep-base behavior consistent with OSX behavior when converting to/from encodings, so ideally we need a testcase that can be run on OSX as well as on gnustep, to see where any problem is.
The place to look is the code generating that vcard. Perhaps it does something like converting bits of the vcard to NSData objects using NSUnicodeStringEncoding (which will produce a BOM at the start of each NSData object), then concatenating the NSData objects, then creating a string with embedded BOMs? Alternatively, perhaps in gnustep-base there is something that works like that somewhere? Either way, looking at the SoGo code generating the vcard string is a good starting point to see what is happening.
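The first scenario can be sketched in Python. This is purely an illustration of the hypothesised mechanism, not SOGo or gnustep-base code: `encode_with_bom` is a made-up stand-in for any conversion that prefixes each result with a BOM, and the two pieces are placeholder strings.

```python
import codecs


def encode_with_bom(piece):
    # Hypothetical stand-in for a conversion that puts a BOM on every result.
    return codecs.BOM_UTF16_BE + piece.encode("utf-16-be")


pieces = ["BEGIN:VCALENDAR\n", "END:VCALENDAR\n"]
joined = b"".join(encode_with_bom(p) for p in pieces)

# Python's "utf-16" decoder uses the first BOM to pick the byte order and
# drops it, but any later BOM survives as an ordinary U+FEFF character.
text = joined.decode("utf-16")
print(text.count("\ufeff"))

# Serialising to UTF-8 carries the stray U+FEFF through as EF BB BF.
print(codecs.BOM_UTF8 in text.encode("utf-8"))
```

If something like this is happening, the fix belongs wherever the pieces are converted and joined, not in the final UTF-8 output step.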
@cgx please can you have a look at this ? Your insight as a sogo developer would be helpful, since it affects sogo pretty badly.
In Alinto/sogo#324, there is a patch that looks good and solves the problem on the SOGo side.
fwiw, with sogo 5.8.0 i've tested that Alinto/sogo#324 properly fixed the issue i was seeing, so i'd close this issue as "not a gnustep-base regression" :) thanks !