
How to get text without nested children's texts

Open · Micka33 opened this issue 4 years ago • 1 comment

How can I get just "This is some text", and not "This is some textFirst span textSecond span text"?

<li id="listItem">
    This is some text
    <span id="firstSpan">First span text</span>
    <span id="secondSpan">Second span text</span>
</li>

Example:

let cheerio = require('cheerio');
let $ = cheerio.load(`
<li id="listItem">
    This is some text
    <span id="firstSpan">First span text</span>
    <span id="secondSpan">Second span text</span>
</li>`)

let jsonframe = require('jsonframe-cheerio')
jsonframe($)

let frame = {"text": "li#listItem"}
console.log( $('body').scrape(frame, { string: true } ))
// {
//   "text": "This is some text First span text Second span text"
// }

Micka33 · Mar 09 '20 18:03

This does the trick.

let frame = {"text": "li#listItem < html || ([\\w\\s\\.\\d]+)<span"}

But it only works because my text is followed either by nothing or by <span. Is there a better built-in solution?
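One way to loosen the pattern is to capture everything up to the first `<`, whatever tag follows. The sketch below checks that idea in plain JavaScript against the inner HTML from the example; `leadingText` is a hypothetical helper name, not part of jsonframe-cheerio. It assumes jsonframe hands the part after `||` to a JavaScript RegExp and returns the first capture group, which I have not verified, so a frame such as `"li#listItem < html || ^([^<]*)"` is untested.

```javascript
// Inner HTML of li#listItem, copied from the example above.
const innerHtml = `
    This is some text
    <span id="firstSpan">First span text</span>
    <span id="secondSpan">Second span text</span>
`;

// Hypothetical helper: return the text that precedes the first tag.
// [^<]* matches any run of characters up to (not including) the first "<",
// so it does not depend on the next element being a <span>.
function leadingText(html) {
  const match = /^([^<]*)/.exec(html);
  // The pattern can match an empty string, so match is never null here.
  return match[1].trim();
}

console.log(leadingText(innerHtml));
```

This only captures text that comes before the first child element; text sitting between or after children is ignored. Outside jsonframe, plain cheerio can also select the element's direct text nodes with `.contents()` and a filter on the node type.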

Micka33 · Mar 09 '20 18:03