
Strip HTML

Open leolobato opened this issue 8 years ago • 3 comments

This is a feature suggestion.

Since the HTML is already parsed, maybe it would be possible to add a method which strips the HTML but keeps the line breaks?

I'm especially thinking of watchOS 2 projects, where NSAttributedString can't be used to strip HTML, even though that had been quite a popular solution.
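
As a rough illustration of the behavior being requested, here is a minimal Python sketch using only the standard library's `html.parser`. It is not HTMLReader code. The names `LineBreakStripper`, `strip_html`, and `BLOCK_TAGS`, and the choice of which tags produce a line break, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumption: closing one of these tags ends a line in the stripped output.
BLOCK_TAGS = {"p", "div", "li"}


class LineBreakStripper(HTMLParser):
    """Collects text and inserts a newline for <br> and block-level tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities like &amp;
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <br> has no closing tag, so the break is emitted when it opens.
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html):
    """Return the text of `html` with tags removed and line breaks kept."""
    parser = LineBreakStripper()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts).strip()


print(strip_html("<p>Hello<br>world</p><p>Bye &amp; thanks</p>"))
# Hello
# world
# Bye & thanks
```

The sketch makes no attempt to skip `<script>` or `<style>` contents or to collapse runs of whitespace; a real implementation would need to decide on both.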

leolobato avatar Aug 24 '15 20:08 leolobato

Sorry for the delay in getting back to you, and thank you for your patience!

Just to make sure I understand, how does the output you're looking for differ from simply taking an HTMLDocument instance's -textContent? Special handling of <br> tags?

Could you supply a brief sample of a document and the stripped output you'd expect? It'd be perfect for a unit test.

I'm guessing this'll be a pretty easy feature to implement, so I'm excited to add it!

nolanw avatar Sep 04 '15 02:09 nolanw

@nolanw maybe like .strings and .stripped_strings in BeautifulSoup?
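
For readers who don't know those two BeautifulSoup attributes: `.strings` yields every text node as-is, while `.stripped_strings` trims each one and skips those that are only whitespace. The sketch below imitates that distinction with Python's standard `html.parser` rather than BeautifulSoup itself. `TextNodeCollector`, `strings`, and `stripped_strings` are hypothetical names for this example, and they return lists where BeautifulSoup returns generators.

```python
from html.parser import HTMLParser


class TextNodeCollector(HTMLParser):
    """Records each run of text found between tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.nodes = []

    def handle_data(self, data):
        self.nodes.append(data)


def strings(html):
    """All text runs, whitespace included (loosely like .strings)."""
    collector = TextNodeCollector()
    collector.feed(html)
    collector.close()
    return collector.nodes


def stripped_strings(html):
    """Trimmed text runs, dropping whitespace-only ones (loosely like .stripped_strings)."""
    return [s.strip() for s in strings(html) if s.strip()]


sample = "<div>\n  <p> Hello </p>\n  <p>world</p>\n</div>"
print(stripped_strings(sample))  # ['Hello', 'world']
```

Joining the stripped strings with `"\n"` gives one line per text node, which is close to, but not the same as, breaking only at `<br>` and block boundaries.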

eternalphane avatar Apr 15 '17 15:04 eternalphane

@EternalPhane looks about right. Still not sure if it solves the original issue though.

nolanw avatar Apr 18 '17 21:04 nolanw