
CreateDocument throws a System.NullReferenceException when an HTML string is passed in

Open cartics opened this issue 11 years ago • 2 comments

Update: added stack trace

  1. Browse to a web page and save it to disk as "aaa.txt".
  2. Invoking CQ.CreateDocumentFromFile with the file location succeeds, whereas invoking CQ.CreateDocument with the contents of the file throws a NullReferenceException.

Sample code with bug:

var htmlFileName = @"courthouse.txt";
var htmlBody = File.ReadAllText(htmlFileName);
var cq = CQ.CreateDocument(htmlBody); // throws NullReferenceException

Code which works with filename:

var cq = CQ.CreateDocumentFromFile(@"courthouse.txt");

Stack Trace:

Unhandled Exception: System.NullReferenceException: Object reference not set to an instance of an object.
   at CsQuery.HtmlParser.ElementFactory.Parse(Stream inputStream, Encoding encoding)
   at CsQuery.HtmlParser.ElementFactory.Create(Stream html, Encoding streamEncoding, HtmlParsingMode parsingMode, HtmlParsingOptions parsingOptions, DocType docType)
   at CsQuery.CQ.CreateNew(CQ target, Stream html, Encoding encoding, HtmlParsingMode parsingMode, HtmlParsingOptions parsingOptions, DocType docType)
   at CsQuery.CQ..ctor(String html, HtmlParsingMode parsingMode, HtmlParsingOptions parsingOptions, DocType docType)
   at CsQuery.CQ.CreateDocument(String html)

cartics avatar Oct 21 '14 19:10 cartics

Not able to reproduce this in a simple test, e.g.

... save amazon.com homepage as "c:\amazon.html"

var filename = "c:\\amazon.html";
var test = File.ReadAllText(filename);
var cq = CQ.CreateDocument(test);
var cq2 = CQ.CreateDocumentFromFile(filename);

... works fine. This must have something to do specifically with the content, perhaps with character set conversion when saving? I would take a look at the contents of the file as saved from your browser.
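
One way to check the saved file for encoding trouble is a small stand-alone program like the sketch below. It uses only the .NET base class library, not CsQuery. The class name EncodingCheck, the helper DescribeBom, and the default path "courthouse.txt" are assumptions made for illustration. It reports which byte order mark (if any) the file starts with, and how many U+FFFD replacement characters and NUL characters appear after File.ReadAllText decodes it. Whether any of these relates to the exception is not established here; it is only a starting point for inspecting the content.

```csharp
using System;
using System.IO;

// Hypothetical diagnostic helper; not part of CsQuery.
static class EncodingCheck
{
    // Returns the name of the byte order mark at the start of the data, or "none".
    public static string DescribeBom(byte[] data)
    {
        // UTF-32 LE must be tested before UTF-16 LE because both start with FF FE.
        if (data.Length >= 4 && data[0] == 0xFF && data[1] == 0xFE && data[2] == 0x00 && data[3] == 0x00)
            return "UTF-32 LE";
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return "UTF-8";
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return "UTF-16 LE";
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return "UTF-16 BE";
        return "none";
    }

    static void Main(string[] args)
    {
        // Assumed default path, taken from the sample code in the report.
        var path = args.Length > 0 ? args[0] : "courthouse.txt";

        var bytes = File.ReadAllBytes(path);
        Console.WriteLine("Size in bytes: " + bytes.Length);
        Console.WriteLine("BOM: " + DescribeBom(bytes));

        // File.ReadAllText honors a BOM if present and otherwise decodes as UTF-8;
        // bytes that are not valid in that encoding become U+FFFD.
        var text = File.ReadAllText(path);
        int replacements = 0, nuls = 0;
        foreach (var c in text)
        {
            if (c == '\uFFFD') replacements++;
            if (c == '\0') nuls++;
        }
        Console.WriteLine("Replacement characters (U+FFFD): " + replacements);
        Console.WriteLine("NUL characters: " + nuls);
    }
}
```

A non-zero count of replacement or NUL characters would suggest the file was saved in an encoding other than the one used to read it back.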

jamietre avatar Oct 21 '14 20:10 jamietre

I can repro this consistently with the following site: http://courthouseproperty.com/. Since I am not able to attach the text file here, I can send it separately.

cartics avatar Oct 21 '14 20:10 cartics