gnustep-base 1.28.0 triggers a regression in sogo

Open landryb opened this issue 3 years ago • 15 comments

See https://www.sogo.nu/bugs/view.php?id=5416 for more details. On OpenBSD 7.0 i've found that gnustep-base 1.28.0 triggers wrong behaviour in sogo, where ICS content served by the database is wrongly encoded in the XML response. As a result, end users cannot see events they just created in thunderbird and pushed to SOGo.

working curl caldav request with gnustep-base 1.27.0 gives:

...
<D:propstat><D:prop><C:calendar-data>BEGIN:VCALENDAR
...
END:VCALENDAR</C:calendar-data><D:getetag>&quot;gcs00000000&quot;</D:getetag>...

with gnustep-base 1.28.0 this gives:

<D:propstat><D:prop><C:calendar-data>&#65279;BEGIN:VCALENDAR
...
END:VCALENDAR</C:calendar-data><D:getetag>&#65279;&quot;gcs00000000&quot;</D:getetag>

note the extra &#65279;, which is the XML numeric character reference for U+FEFF, the byte order mark (BOM) character. I'm not sure where the gnustep-base code is called from SOGo, but i know it's a regression from the gnustep-base upgrade.
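To make the claim about &#65279; concrete, here is a small check, written in Python only because it is easy to run; it is not SOGo or gnustep-base code, and the helper name `describe_reference` is made up for this illustration. It decodes the character reference and shows the code point and its UTF-8 bytes.

```python
import codecs
from html import unescape


def describe_reference(ref):
    """Return (code point label, UTF-8 bytes) for a single XML character reference."""
    char = unescape(ref)  # "&#65279;" decodes to one character
    return "U+%04X" % ord(char), char.encode("utf-8")


code_point, utf8_bytes = describe_reference("&#65279;")
# 65279 decimal is 0xFEFF; its UTF-8 form is the three-byte sequence EF BB BF
print(code_point, utf8_bytes == codecs.BOM_UTF8)
```

So the extra reference in the response is U+FEFF itself, carried as a normal character inside the string and escaped on output.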

landryb avatar Oct 26 '21 14:10 landryb

looking, but i'm suspicious about bd5f2909e6edc8012a0a6e44ea1402dfbe1353a4 now.

landryb avatar Oct 26 '21 14:10 landryb

testing with sogo 5.2.0 built against gnustep-base 1.28.0 with bd5f290 reverted, the setup is working fine. So that's definitely a regression from bd5f290, though i dunno if sogo should adapt to it ...

landryb avatar Oct 26 '21 14:10 landryb

It really would help to understand which encoding your code is requesting here. Most likely it is NSUnicodeStringEncoding with the dataUsingEncoding: method on NSString. You should be fine when using NSUTF16BigEndianStringEncoding.

fredkiefer avatar Oct 26 '21 15:10 fredkiefer

Looking into the original bug report you linked to, I can see that this is about UTF-8, where this makes no sense. As far as I am aware we do not add a BOM for UTF-8. What could happen is that there is already a BOM in the string. It really would help to understand how this data gets produced.

fredkiefer avatar Oct 26 '21 16:10 fredkiefer

From what i can understand (and i understand absolutely nothing about obj-c), getetag (mapped to davEntityTag in sope, https://github.com/inverse-inc/sope/search?q=getetag + https://github.com/inverse-inc/sogo/search?q=davEntityTag) and calendar-data (mapped to davCalendarData in sope, https://github.com/inverse-inc/sope/search?q=calendar-data + https://github.com/inverse-inc/sogo/search?q=davCalendarData) both seem to be NSString. Maybe @cgx would know?

landryb avatar Oct 27 '21 06:10 landryb

I've been bitten by this, too; for details please see

https://www.sogo.nu/bugs/view.php?id=5416#c15591

@landryb's findings and patch saved my day, thanks!

mcmer avatar Oct 28 '21 13:10 mcmer

From the code I could find in SOGo, this is not triggered by an encoding conversion. The most likely davEntityTag method looks like this (where version is an unsigned int value):

    - (id) davEntityTag
    {
      return [NSString stringWithFormat: @"\"gcs%.8d\"", version];
    }

But I don't see where this gets used together with the calendar-data. Could somebody with a bit of SoGo knowledge please explain how to reproduce this issue with pure GNUstep code?

fredkiefer avatar Oct 28 '21 18:10 fredkiefer

@rfm could you please have a look here? I am a bit confused on what is actually going on here. While looking at the change that seems to cause all this, I noticed that the optimisation for the dataUsingEncoding: method is only present for the constant string class, not for the other ones. Was that on purpose? This should be unrelated to this issue, as the code in NSString looks correct to me.

fredkiefer avatar Oct 30 '21 11:10 fredkiefer

(Drive-by comment: It seems like this would be grounds for adding a minimum unit test demonstrating the expected behavior, so we don't break this in the future..)

ivucica avatar Oct 31 '21 17:10 ivucica

@ivucica you are correct. The problem is just that we don't know what features of NSString SOGo is using here.

fredkiefer avatar Oct 31 '21 17:10 fredkiefer

@rfm this is a test of whether you now get mails for bug reports. Your help here is appreciated.

fredkiefer avatar Nov 16 '21 20:11 fredkiefer

Well, it appears likely that something is putting a BOM in a string and that's then being output as utf-8, but we really need to find out where. I agree that a unit test to reproduce it would be good. NB. The only BOM related changes I recall in the last year or two were making gnustep-base behavior consistent with OSX behavior when converting to/from encodings, so ideally we need a testcase that can be run on OSX as well as on gnustep, to see where any problem is.

rfm avatar Nov 17 '21 09:11 rfm

The place to look is the code generating that vcard. Perhaps it does something like converting bits of the vcard to NSData objects using NSUnicodeStringEncoding (which will produce a BOM at the start of each NSData object), then concatenating the NSData objects, then creating a string with embedded BOMs? Alternatively, perhaps in gnustep-base there is something that works like that somewhere? Either way, looking at the SoGo code generating the vcard string is a good starting point to see what is happening.
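The mechanism described above (encode pieces separately with a BOM, concatenate the bytes, decode once) can be sketched as follows. This is a Python analogy only, not GNUstep or SOGo code, and it illustrates the hypothesis rather than the confirmed cause: Python's "utf-16" codec is used as a stand-in because it also prepends a BOM on every encode. The function names `join_via_utf16` and `xml_escape_non_ascii` are invented for the example.

```python
def join_via_utf16(chunks):
    """Encode each chunk separately (each gets its own BOM), concatenate, decode once."""
    data = b"".join(chunk.encode("utf-16") for chunk in chunks)
    # The decoder consumes only the leading BOM; later ones survive as U+FEFF.
    return data.decode("utf-16")


def xml_escape_non_ascii(text):
    """Write non-ASCII characters as XML numeric character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


joined = join_via_utf16(["<C:calendar-data>", "BEGIN:VCALENDAR"])
print(xml_escape_non_ascii(joined))
# prints: <C:calendar-data>&#65279;BEGIN:VCALENDAR
```

The stray U+FEFF lands at the start of the second piece, which has the same shape as the output reported at the top of this issue. An equivalent test against NSString would still be needed to see whether gnustep-base or the calling code is responsible.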

rfm avatar Nov 17 '21 09:11 rfm

@cgx please can you have a look at this ? Your insight as a sogo developer would be helpful, since it affects sogo pretty badly.

landryb avatar Nov 17 '21 10:11 landryb

In Alinto/sogo#324, there is a patch that looks good and solves the problem on the SOGo side.

SlavekB avatar Aug 16 '22 09:08 SlavekB

fwiw, with sogo 5.8.0 i've tested that Alinto/sogo#324 properly fixed the issue i was seeing, so i'd close this issue as "not a gnustep-base regression" :) thanks !

landryb avatar Feb 01 '23 15:02 landryb