ews-java-api
Invalid character reference 
Not exactly an issue with the EWS library, but the XML parser complains that a reference in one of the mails I had to read is not a valid character reference.
I worked around it with the following patch. I perfectly understand if you don't want to add this to the library by default, as it will silently parse any invalid XML. But perhaps a way could be added to instrument the stream and/or the XMLEventReader so that we can set up this behavior from outside of your library?
@@ -99,7 +101,14 @@ public class EwsXmlReader {
     XMLInputFactory inputFactory = XMLInputFactory.newInstance();
     inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
-    return inputFactory.createXMLEventReader(stream);
+    XMLEventReader reader = inputFactory.createXMLEventReader(stream);
+    // IM: continue after fatal error to prevent "invalid character reference"
+    XMLErrorReporter errorReporter =
+        (XMLErrorReporter) reader.getProperty(Constants.XERCES_PROPERTY_PREFIX + Constants.ERROR_REPORTER_PROPERTY);
+    errorReporter.setFeature(Constants.XERCES_FEATURE_PREFIX + Constants.CONTINUE_AFTER_FATAL_ERROR_FEATURE, true);
+    return reader;
   }
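For anyone who wants to reproduce the failure without an Exchange server, here is a minimal sketch that uses only the JDK's StAX API. The class name `CharRefRepro`, the helper `parses`, and the sample documents are made up for illustration and are not part of ews-java-api. It feeds the parser a document whose declaration says XML 1.0 but which contains a reference to the control character U+001F, which XML 1.0 does not allow.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical helper, not part of ews-java-api.
final class CharRefRepro {

    /** Returns true if the document can be read to the end, false on a parse error. */
    static boolean parses(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Same factory setting as in the patch above.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try {
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.nextEvent(); // throws XMLStreamException on a fatal error
            }
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Well-formed XML 1.0 document.
        System.out.println(parses("<?xml version=\"1.0\"?><s>ok</s>"));
        // U+001F is outside the XML 1.0 character range, so the reference is rejected.
        System.out.println(parses("<?xml version=\"1.0\"?><s>bad &#x1F; ref</s>"));
    }
}
```

A check like this could also serve as the starting point for a failing test case, assuming the test feeds the same kind of document through the library's reader.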
Not 100% sure if this is the same issue, but I've run into a case where the Exchange server sends an XML 1.0 preamble together with character references that are only valid in XML 1.1, specifically Unicode control characters. Fixing the preamble resolves all the parse errors without having to ignore them.
I have a patch for this, but it currently introduces another dependency that probably isn't necessary.
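To make the idea of "fixing the preamble" concrete, here is a minimal sketch that needs no extra dependency. It is not the code from the patch mentioned above: the class `XmlPreamble` and the method `declareXml11` are hypothetical, and the sketch works on an in-memory string rather than on the response stream. It only rewrites a leading declaration that claims version 1.0; whether the XML parser in use then accepts the control character references has to be verified separately, and a reference to U+0000 stays invalid in XML 1.1 as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ews-java-api.
final class XmlPreamble {

    // An XML declaration that claims version 1.0; group 1 is everything up to
    // the opening quote, group 2 is the closing quote.
    private static final Pattern DECL_1_0 =
        Pattern.compile("(<\\?xml\\s+version\\s*=\\s*[\"'])1\\.0([\"'])");

    /** Rewrites a leading XML 1.0 declaration to XML 1.1; other input is returned unchanged. */
    static String declareXml11(String xml) {
        Matcher m = DECL_1_0.matcher(xml);
        // lookingAt() only matches at the very start of the input, so a
        // version="1.0" string further down in the document is left alone.
        if (!m.lookingAt()) {
            return xml;
        }
        return m.group(1) + "1.1" + m.group(2) + xml.substring(m.end());
    }

    public static void main(String[] args) {
        System.out.println(declareXml11("<?xml version=\"1.0\" encoding=\"utf-8\"?><s>text</s>"));
    }
}
```

In a real integration the same rewrite would have to happen on the first bytes of the response stream before the parser sees them, which is the part that needs the most care.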
Thanks @easel will you provide a PR for this?
@serious6 I've pushed what I've got so far at https://github.com/OfficeDev/ews-java-api/pull/409
I think it should be possible to remove the dependency on streamflyer, and also to create a failing test case, but I haven't completed either of those successfully yet.
Also, do you think this functionality should be switchable? I've been running in production with it on for over a year with no issues, but it's hard to say that there will be no side effects elsewhere.
I am also facing the same issue while calling item.load()
Exception in thread "main" microsoft.exchange.webservices.data.core.exception.service.remote.ServiceRequestException: The request failed. ParseError at [row,col]:[9,2543]
Message: Character reference "&#
    at microsoft.exchange.webservices.data.core.request.SimpleServiceRequestBase.internalExecute(SimpleServiceRequestBase.java:74)
    at microsoft.exchange.webservices.data.core.request.MultiResponseServiceRequest.execute(MultiResponseServiceRequest.java:158)
    at microsoft.exchange.webservices.data.core.ExchangeService.internalLoadPropertiesForItems(ExchangeService.java:1324)
    at microsoft.exchange.webservices.data.core.service.item.Item.internalLoad(Item.java:193)
    at microsoft.exchange.webservices.data.core.service.ServiceObject.load(ServiceObject.java:384)
    at cots.sg.test.EWSBBEmailServices.readAndParseEmails(EWSBBEmailServices.java:163)
    at cots.sg.test.EWSBBEmailServices.main(EWSBBEmailServices.java:219)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[9,2543]
Message: Character reference "&#
Have the same problem calling item.load()
microsoft.exchange.webservices.data.core.exception.service.remote.ServiceRequestException: The request failed. ParseError at [row,col]:[9,6]
Message: Zeichenreferenz "&#
    at microsoft.exchange.webservices.data.core.request.SimpleServiceRequestBase.internalExecute(SimpleServiceRequestBase.java:74)
    at microsoft.exchange.webservices.data.core.request.MultiResponseServiceRequest.execute(MultiResponseServiceRequest.java:158)
    at microsoft.exchange.webservices.data.core.ExchangeService.internalLoadPropertiesForItems(ExchangeService.java:1324)
    at microsoft.exchange.webservices.data.core.service.item.Item.internalLoad(Item.java:193)
    at microsoft.exchange.webservices.data.core.service.ServiceObject.load(ServiceObject.java:384)
    at microsoft.exchange.webservices.data.core.service.ServiceObject$load.call(Unknown Source)
I don't understand how the EWS API is usable at all with this bug open.
+1 on the pull request to make it into the next release.
I'm getting this one:
com.sun.org.apache.xerces.internal.xni.XNIException: Character reference "&#
Debugging it, it looks like the entity parsed is 
which is coming out of this fragment:
<t:InternetMessageHeader HeaderName="Thread-Topic"> Recall: Theorem / Comprehend Training - Session 2 - Tuesday, January 21, 2014</t:InternetMessageHeader>
The subject of this same message is this:
<t:Subject>Recall: Theorem / Comprehend Training - Session 2 - Tuesday, January 21, 2014</t:Subject>
How did a 
make it into this?
@beders the solution from @ben-thompson-ravn worked perfectly for me. How this critical fix hasn't made it into a release despite being available for 6+ months is beyond me.
Has any fix been provided for this invalid XML character issue?
Any update here?
Hi @celloni, have you already tried this: https://github.com/OfficeDev/ews-java-api/pull/409
Jan
Hey @OS-JaR Thanks for your answer, after a few tests the fix from #409 seems to work.
Hi @celloni,
if this fix doesn't work, you can try to use InvalidXmlCharacterModifier
or write a custom Modifier like
public class ExtendedInvalidXmlCharacterModifier implements Modifier
to replace invalid chars or even something like a bad-word-filter:
@Override
public AfterModification modify(StringBuilder characterBuffer, int firstModifiableCharacterInBuffer, boolean endOfStreamHit) {
    matcherInvalidChar.reset(characterBuffer);
    matcherInvalidChar.region(firstModifiableCharacterInBuffer, characterBuffer.length());
    int start = firstModifiableCharacterInBuffer;
    while (matcherInvalidChar.find(start)) {
        start = onMatch(characterBuffer);
    }
    return factory.skipEntireBuffer(characterBuffer, firstModifiableCharacterInBuffer, endOfStreamHit);
}
and
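protected int onMatch(StringBuilder characterBuffer)
{
    // Replace the matched region and continue searching right after the replacement.
    String replacement = "HERE IS A REPLACED BAD WORD";
    characterBuffer.replace(matcherInvalidChar.start(), matcherInvalidChar.end(), replacement);
    return matcherInvalidChar.start() + replacement.length();
}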
with
this.matcherInvalidChar = Pattern.compile("really bad word").matcher("");
// This is pseudocode; I don't know whether it works as written.
Jan
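For readers who would rather not pull in a streaming-modifier library, the same idea of replacing invalid characters can be sketched with the JDK alone. The following is only an illustration under stated assumptions: the class `CharRefSanitizer` and its methods are hypothetical, the code works on a complete string instead of a stream, and it does not distinguish CDATA sections, where `&#...;` is literal text rather than a reference. It finds numeric character references, checks the referenced code point against the character range allowed by XML 1.0, and replaces references that fall outside that range.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ews-java-api or streamflyer.
final class CharRefSanitizer {

    // Numeric character references: hexadecimal (&#x1F;) or decimal (&#31;).
    private static final Pattern CHAR_REF =
        Pattern.compile("&#(?:x([0-9A-Fa-f]+)|([0-9]+));");

    /** True if the code point is allowed by the Char production of XML 1.0. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces every numeric reference to a character that XML 1.0 forbids. */
    static String stripInvalidCharRefs(String xml, String replacement) {
        Matcher m = CHAR_REF.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            boolean hex = m.group(1) != null;
            int cp;
            try {
                cp = Integer.parseInt(hex ? m.group(1) : m.group(2), hex ? 16 : 10);
            } catch (NumberFormatException e) {
                cp = -1; // too large to be a code point, treat as invalid
            }
            String keep = isXml10Char(cp) ? m.group() : replacement;
            m.appendReplacement(out, Matcher.quoteReplacement(keep));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripInvalidCharRefs("<s>Recall&#x1F;: line&#10;break</s>", ""));
    }
}
```

Dropping or replacing characters changes the message content, so it is probably worth logging when a replacement happens.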
Thanks for your help @OS-JaR ! 👍
I successfully parsed mail with the characters from the first screenshot in the subject and body, but parsing failed with the characters from the second screenshot in the subject. Any suggestions or help? @OS-JaR @serious6 @easel
Yeah, don't use this library. I think MSFT has a new library available for MS Graph API
@beders MS Graph API for Java is available here: https://github.com/microsoftgraph/msgraph-sdk-java That would be ok for those who are a) starting a new app and b) using Office 365. It will not work with older versions of Exchange Server.
Graph API doesn't support Exchange on-premises, only Office 365 and Hybrid (Exchange Server 2016).
Yup, MSFT wants you to move to the cloud ASAP. Looks like this very irritating bug still hasn't been fixed. (I had to use my own fork to make it work. I still do)
What is the solution for those users who are still working with ews-java-api?
Keep in mind that the API for Office 365 won't support Basic Authentication for EWS to access Exchange Online after October 13th, 2020. It will still work, though, for on-premises installations.
More details: https://techcommunity.microsoft.com/t5/exchange-team-blog/upcoming-changes-to-exchange-web-services-ews-api-for-office-365/ba-p/608055
In fact, it's just a question of authentication. You can add support for OAuth 2.0 to the EWS Java API and continue using EWS.
@celloni This has now been deferred to the second half of 2021.
https://developer.microsoft.com/en-us/office/blogs/deferred-end-of-support-date-for-basic-authentication-in-exchange-online/