HandsomeSoup
Content missing for some websites.
The following URL doesn't seem to load any actual content (just metadata). Some pages on that site seem fine. Any idea what's up?
λ let url = "http://www.goodreads.com/book/show/301538.The_Darkness_That_Comes_Before?from_search=true"
λ runX $ fromUrl url
[NTree (XTag "/" [NTree (XAttr "http-Content-Length") [NTree (XText "386810") []]
,NTree (XAttr "http-Transfer-Encoding") [NTree (XText "chunked") []]
,NTree (XAttr "http-Set-Cookie") [NTree (XText "_session_id2=82884d397b7fcd985680433233ba3154; path=/; expires=Fri, 22-Aug-2014 04:20:14 GMT; HttpOnly") []]
,NTree (XAttr "http-X-Runtime") [NTree (XText "1.612029") []]
,NTree (XAttr "http-Cache-Control") [NTree (XText "max-age=0, private, must-revalidate") []]
,NTree (XAttr "http-ETag") [NTree (XText "\"d5ff33fa33ea6cd6c3f85076da8e4132\"") []]
,NTree (XAttr "http-X-UA-Compatible") [NTree (XText "IE=Edge,chrome=1") []]
,NTree (XAttr "http-X-Request-Id") [NTree (XText "0VR3CZ02NQRRFSJK9KT3") []]
,NTree (XAttr "http-Vary") [NTree (XText "User-Agent,Accept-Encoding") []]
,NTree (XAttr "http-Status") [NTree (XText "200 OK") []]
,NTree (XAttr "http-Content-Type") [NTree (XText "text/html; charset=utf-8") []]
,NTree (XAttr "transfer-Encoding") [NTree (XText "UTF-8") []]
,NTree (XAttr "transfer-MimeType") [NTree (XText "text/html") []]
,NTree (XAttr "http-Server") [NTree (XText "Server") []]
,NTree (XAttr "http-Date") [NTree (XText "Thu, 21 Aug 2014 22:20:14 GMT") []]
,NTree (XAttr "transfer-Version") [NTree (XText "HTTP/1.1") []]
,NTree (XAttr "transfer-Message") [NTree (XText "OK") []]
,NTree (XAttr "transfer-Status") [NTree (XText "200") []]
,NTree (XAttr "transfer-URI") [NTree (XText "http://www.goodreads.com/book/show/301538.The_Darkness_That_Comes_Before?from_search=true") []]
,NTree (XAttr "source") [NTree (XText "http://www.goodreads.com/book/show/301538.The_Darkness_That_Comes_Before?from_search=true") []]]) []]
That's weird. Those are all the HTTP headers, and the Content-Length is 386810, which means the whole page is being sent. I'm not sure where the body of the response is going.
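One guess: HXT's native HTTP backend might be tripping over the chunked (and, given that `Vary: User-Agent,Accept-Encoding` header, possibly compressed) body, leaving only the headers in the tree. A sketch of fetching through the curl backend instead, assuming the `hxt-curl` package is installed (`withCurl` comes from `Text.XML.HXT.Curl`):

```haskell
import Text.XML.HXT.Core
import Text.XML.HXT.Curl (withCurl)   -- from the hxt-curl package (assumes it is installed)
import Text.HandsomeSoup (css)

main :: IO ()
main = do
  let url = "http://www.goodreads.com/book/show/301538.The_Darkness_That_Comes_Before?from_search=true"
  -- readDocument with withCurl fetches via libcurl instead of HXT's native HTTP code,
  -- which sidesteps the native backend if it is the one dropping the body
  titles <- runX $ readDocument [withParseHTML yes, withWarnings no, withCurl []] url
                   >>> css "title" /> getText
  mapM_ putStrLn titles
```

If this prints the page title, the selector pipeline is fine and the problem is confined to how the native backend reads the response.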
I tried a workaround like this, but it looks like there's something wrong with parsing that document:
λ import Network.HTTP
λ import Text.XML.HXT.Core
λ import Text.HandsomeSoup
λ html <- simpleHTTP (getRequest url) >>= getResponseBody
-- html looks correct
λ runX $ parseHtml html >>> css "span"
[]
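To narrow down whether the empty result comes from `parseHtml` or from the fetched string itself, it may help to run the same selector over a tiny inline document. This is just a debugging sketch, not from the original report; if it matches, the parser and `css` are fine and the suspect is the downloaded bytes (e.g. a compressed body arriving as a `String`):

```haskell
import Text.XML.HXT.Core
import Text.HandsomeSoup

main :: IO ()
main = do
  -- same parseHtml >>> css pipeline as above, but over a known-good document
  texts <- runX $ parseHtml "<html><body><span>one</span><span>two</span></body></html>"
                  >>> css "span" /> getText
  print texts  -- expect ["one","two"]
```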