
Maintain whitespace or insert whitespace with .Text()

Open nokturnal opened this issue 10 years ago • 2 comments

My original question was too vague, so let me try to refine it. I am attempting to obtain the text from a set of elements obtained using .NextUntil(). This text will go into a Lucene index for searching. The issue is that the Text() call correctly strips the HTML from the set, but I need it to insert some whitespace so that words from different elements stay separated:

<p>this is a para</p><p>this is another para</p>

... becomes

this is a parathis is another para

... instead of the desired output of

this is a para this is another para

What is the best way to accomplish this?

Cheers :)

nokturnal, Jul 19 '13 20:07

Try InnerText -- ported from the nonstandard IE DOM element member of the same name, but I kept it around because it does something much like that. e.g.

CQ selection = myDom["some selector"];
string text = selection.First()[0].InnerText;

Note that InnerText is a member of DomObject, not of CQ, so you need to access the element directly with the [0] in this example.

jamietre, Jul 29 '13 16:07

Is there a general way to get all the inner text from any given CQ? I'm looking for a solution that can handle stuff like a CQ with multiple root elements, for example.

RudeySH, Jun 27 '18 09:06
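
The thread has no posted answer to that last question. One possible approach, offered only as a minimal sketch and not as the library author's recommendation: since a CQ object can be enumerated as a sequence of its matched elements, the InnerText of each one can be read and the results joined with a space. The sample HTML and the "p" selector are assumptions for illustration, and the sketch assumes InnerText is readable on every element in the selection.

```csharp
using System;
using System.Linq;
using CsQuery;

class Program
{
    static void Main()
    {
        // Hypothetical input: several root-level elements, as in the original question.
        CQ dom = CQ.Create("<p>this is a para</p><p>this is another para</p>");
        CQ selection = dom["p"];

        // Enumerate the matched elements and read InnerText from each one,
        // then join with a single space so text from different elements
        // does not run together.
        string text = string.Join(" ", selection.Select(el => el.InnerText));

        Console.WriteLine(text);
    }
}
```

The separator passed to string.Join only guarantees whitespace between the top-level elements of the selection. How text nested inside each element is spaced depends on what InnerText does, so the result is worth checking against real markup before it is fed to an index.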