meteor-scrape
When scraping a website it would be great to get back the HTML from within the article.
Getting the content from the scraped page works well, but the one thing it lacks is formatting. When I scrape articles (from a website I own), it would be great if I could display the content with elements such as <p> intact.
Right now you can call response.text to get the text, .title for the title, etc. I would like .html to return the same content as .text, but with the stripped elements left intact.
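To make the request concrete, here is a rough sketch of the shape I have in mind. It does not use meteor-scrape at all: buildArticleResult is a made-up helper, and the regex tag stripping is only a stand-in for whatever the package really does internally. The point is just that .text and .html would carry the same article content, one without tags and one with them.

```javascript
// Hypothetical helper, NOT part of meteor-scrape: builds a result object
// that carries both the plain text and the markup of an article fragment.
function buildArticleResult(title, articleHtml) {
  // Naive tag stripping for illustration only; a real implementation
  // would use an HTML parser rather than a regular expression.
  const text = articleHtml
    .replace(/<[^>]+>/g, " ") // replace every tag with a space
    .replace(/\s+/g, " ")     // collapse runs of whitespace
    .trim();

  return {
    title: title,
    text: text,        // tags removed, like the existing .text field
    html: articleHtml, // the requested .html: same content, tags intact
  };
}

const response = buildArticleResult(
  "Example article",
  "<p>First paragraph.</p><p>Second paragraph with <em>emphasis</em> inside.</p>"
);

console.log(response.text);
console.log(response.html);
```

With something like this, I could render response.html directly in a template and keep the paragraph breaks, while response.text stays available for previews or search.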
Hey, we are in the process of customizing the readability algorithm we use (#10), so this feature should be included in the next major version.
:+1: Great stuff, I will keep an eye out!
Waiting for this issue to be resolved.
I am also waiting for this. It would be great to be able to pull the HTML.