Invalid character.

Open: mscottco opened this issue 10 years ago • 1 comment

Hi there,

I'm having an issue with certain articles, which cause an error like "'', hexadecimal value 0x08, is an invalid character." to be thrown.

An example article is http://www.lifehacker.com.au/2015/01/read-more-every-day-by-creating-reading-triggers/

This is running on ASP.NET 4.0, and the NReadability package was downloaded via NuGet.

Current running code is:

    NReadability.NReadabilityWebTranscoder tc = new NReadability.NReadabilityWebTranscoder();
    NReadability.WebTranscodingInput ti = new NReadability.WebTranscodingInput(url);
    NReadability.WebTranscodingResult tcr = tc.Transcode(ti); // Exception thrown on this line.
    Response.Write(tcr.ExtractedContent);

(However, I've also tried variations of this code, including the example from the readme.)

I realise this is due to incorrect tags being read by the XML reader within NReadability; however, I do not seem to be able to work around this.

Suggestions?

mscottco, Jan 21 '15 07:01

Hi,

Maybe you could try preprocessing the article before transcoding it, for example by removing the invalid characters, which are non-printable anyway.
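A minimal sketch of such a preprocessing step, assuming the page HTML has already been downloaded into a string. The class and method names (`XmlCharSanitizer`, `StripInvalidXmlChars`) are hypothetical and not part of NReadability; the only library calls used are `XmlConvert.IsXmlChar` and `XmlConvert.IsXmlSurrogatePair`, which are available from .NET 4.0 onwards.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of NReadability.
public static class XmlCharSanitizer
{
    // Returns a copy of the input with every character that is not allowed
    // in XML 1.0 (such as 0x08, backspace) removed. Valid surrogate pairs
    // are kept; lone surrogates are dropped.
    public static string StripInvalidXmlChars(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        var sb = new StringBuilder(text.Length);

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];

            if (XmlConvert.IsXmlChar(c))
            {
                // Ordinary valid character (includes tab, CR and LF).
                sb.Append(c);
            }
            else if (i + 1 < text.Length
                     && XmlConvert.IsXmlSurrogatePair(text[i + 1], c))
            {
                // High surrogate followed by a low surrogate: keep both.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Anything else is an invalid XML character and is skipped.
        }

        return sb.ToString();
    }
}
```

The cleaned string would then need to go to a transcoding call that accepts HTML content rather than a URL, since the URL-based call in the question downloads the page itself and leaves no place to hook in the cleanup.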

marek-stoj, Jan 21 '15 18:01