meteor-scrape

When scraping a website it would be great to get back the HTML from within the article.

Open tosbourn opened this issue 10 years ago • 4 comments

Getting the content from the scraped page is excellent, but one thing it lacks is formatting. When I am scraping articles (from a website I own), it would be great if I could display the content with elements such as <p> intact.

So right now you can call response.text to get the text, .title for the title, etc. I would like .html to return the contents of .text but with the stripped elements intact.
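
To illustrate the difference between the existing .text and the requested .html, here is a minimal sketch in plain JavaScript. It is not meteor-scrape's implementation: `stripTags` and `buildResponse` are hypothetical helpers invented for this example, and the regex-based tag stripping is a naive stand-in for whatever the library does internally.

```javascript
// Hypothetical sketch: NOT meteor-scrape's actual code.
// Naive tag stripping with a regex; a real scraper would use an HTML parser.
function stripTags(html) {
  return html
    .replace(/<[^>]*>/g, " ") // replace each tag with a space
    .replace(/\s+/g, " ")     // collapse runs of whitespace
    .trim();
}

// Build a response-like object exposing both forms of the article body.
function buildResponse(articleHtml) {
  return {
    text: stripTags(articleHtml), // plain text, as .text returns today
    html: articleHtml.trim(),     // the requested .html: same content, markup intact
  };
}

const response = buildResponse("<p>First paragraph.</p><p>Second paragraph.</p>");
console.log(response.text); // First paragraph. Second paragraph.
console.log(response.html); // <p>First paragraph.</p><p>Second paragraph.</p>
```

With something like this, .html could be rendered directly on a page I control, while .text stays available for previews or search.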

tosbourn avatar Apr 18 '15 19:04 tosbourn

Hey, we are in the process of customizing the readability algorithm we use (#10), so this feature should be included in the next major version.

Anonyfox avatar Apr 19 '15 09:04 Anonyfox

:+1: Great stuff, I will keep an eye out!

tosbourn avatar Apr 19 '15 12:04 tosbourn

Waiting for this issue to be solved.

theankitgaurav avatar Feb 27 '16 07:02 theankitgaurav

I am also waiting for this. It would be great to pull the HTML.

rschlack avatar May 16 '16 16:05 rschlack