Convert to text

Open fetchinson opened this issue 3 years ago • 9 comments

Hi, rdrview is absolutely fantastic! The fastest and most relevant output I've come across from all the firefox readability based tools I've tried.

I think one new feature would be a great addition: converting the readable HTML output to text. Right now I use rdrview to get the readable HTML with "-H", then use the links or lynx browser to dump the formatted text with its -dump option.
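The two-step workflow described here can be sketched in Python as follows. This is only an illustration of the pipeline `rdrview -H URL | lynx -dump -force_html -stdin`; the function names `build_pipeline` and `readable_text` are hypothetical, and the sketch assumes both rdrview and lynx are installed and on the PATH.

```python
import shutil
import subprocess


def build_pipeline(url):
    """Return the two argv lists for: rdrview -H URL | lynx -dump -force_html -stdin."""
    # -H makes rdrview print the readable HTML to standard output.
    rdrview_cmd = ["rdrview", "-H", url]
    # -stdin makes lynx read the document from standard input;
    # -force_html tells it to treat that input as HTML.
    dump_cmd = ["lynx", "-dump", "-force_html", "-stdin"]
    return rdrview_cmd, dump_cmd


def readable_text(url):
    """Run both commands and return the formatted plain text."""
    rdrview_cmd, dump_cmd = build_pipeline(url)
    for prog in (rdrview_cmd[0], dump_cmd[0]):
        if shutil.which(prog) is None:
            raise FileNotFoundError(f"{prog} not found on PATH")
    # First stage: fetch the page and extract the readable HTML.
    html = subprocess.run(rdrview_cmd, check=True, capture_output=True).stdout
    # Second stage: feed that HTML to lynx and capture the text dump.
    text = subprocess.run(dump_cmd, input=html, check=True, capture_output=True).stdout
    return text.decode("utf-8", errors="replace")
```

Calling `readable_text("https://example.com")` would then return the dumped text as a string, provided both programs are available.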

Would be nice if rdrview would have an option for outputting text.

In any case, thanks a lot for rdrview! (By the way I also had to throw away the sandbox stuff from the code because libseccomp would not compile on my system.)

fetchinson avatar Dec 30 '20 13:12 fetchinson

Hi, rdrview is absolutely fantastic! The fastest and most relevant output I've come across from all the firefox readability based tools I've tried.

Thanks! I'm glad to hear it.

Would be nice if rdrview would have an option for outputting text.

You can use mailcap for this purpose. Create a file under ~/.mailcap with a line such as the following:

text/html; /usr/bin/lynx -dump -force_html %s; copiousoutput; description=HTML Text; nametemplate=%s.html

This is the default on Debian (if lynx is installed). I wish sane mailcap defaults were more common; they are very useful.
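To make the parts of that mailcap line explicit, here is a minimal Python sketch that splits an entry into its content type, view command, flags, and named fields. The function name `parse_mailcap_entry` is hypothetical, and the parser is deliberately simplified: it splits on every semicolon and does not handle backslash-escaped semicolons, which real mailcap files may contain.

```python
def parse_mailcap_entry(line):
    """Split one mailcap line into type, command, flags and named fields.

    Simplification (assumption): semicolons are never escaped.
    """
    parts = [p.strip() for p in line.split(";")]
    if len(parts) < 2:
        raise ValueError("a mailcap entry needs a type and a command")
    entry = {"type": parts[0], "command": parts[1], "flags": set(), "fields": {}}
    for part in parts[2:]:
        if not part:
            continue
        name, sep, value = part.partition("=")
        if sep:
            # name=value pairs such as description=... or nametemplate=...
            entry["fields"][name.strip()] = value.strip()
        else:
            # bare flags such as copiousoutput
            entry["flags"].add(name)
    return entry


if __name__ == "__main__":
    line = ("text/html; /usr/bin/lynx -dump -force_html %s; copiousoutput; "
            "description=HTML Text; nametemplate=%s.html")
    print(parse_mailcap_entry(line))
```

In the entry, `%s` in the command stands for the file to view, `copiousoutput` marks a command that writes a lot of text to standard output, and `nametemplate=%s.html` asks for a temporary file name ending in `.html`.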

(By the way I also had to throw away the sandbox stuff from the code because libseccomp would not compile on my system.)

Can you share any more details here? What's your system? If libseccomp is not always available I should do something to simplify the build in those cases. Or maybe just give up and use autoconf.

eafer avatar Dec 30 '20 17:12 eafer

(By the way I also had to throw away the sandbox stuff from the code because libseccomp would not compile on my system.)

Can you share any more details here? What's your system? If libseccomp is not always available I should do something to simplify the build in those cases. Or maybe just give up and use autoconf.

I have a very old Fedora 17 installation, about 8 years old, and Red Hat no longer provides updates for it. I compile almost everything from source, and the only time I run into trouble is when my glibc is too old and the code I'm trying to compile relies on newer glibc features, which does happen sometimes. I couldn't compile libseccomp, but it wasn't a glibc-related problem; it threw

system.c:461:16: error: ‘__NR_seccomp’ undeclared (first use in this function)

and after looking at the code for a while and googling around I couldn't figure out where __NR_seccomp should come from. So I gave up on libseccomp, but could easily compile your code by simply deleting everything which was sandbox related.

By the way, what's the downside of running it without a sandbox?

fetchinson avatar Dec 30 '20 20:12 fetchinson

I couldn't figure out where __NR_seccomp should come from.

That's the syscall number for seccomp(); it comes from the kernel headers. The syscall was introduced in Linux 3.17, so your kernel probably doesn't have it at all. Some of the seccomp stuff can also work with prctl(), but I guess libseccomp doesn't support that. If you want to build it, you need to upgrade to a more recent kernel.
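As a rough illustration of that version cutoff, the Python sketch below compares a kernel release string against 3.17. The function names and the sample release strings are hypothetical, and the check is only a heuristic: the build error actually depends on whether the installed kernel headers define `__NR_seccomp`, and distributions sometimes backport features to older kernels.

```python
import platform
import re

# The seccomp() syscall first appeared in Linux 3.17.
SECCOMP_SYSCALL_MIN = (3, 17)


def parse_release(release):
    """Extract (major, minor) from a release string such as '3.3.4-5.fc17.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def has_seccomp_syscall(release=None):
    """Guess from the version number whether the kernel has seccomp()."""
    if release is None:
        # Fall back to the release string of the running kernel.
        release = platform.release()
    return parse_release(release) >= SECCOMP_SYSCALL_MIN


if __name__ == "__main__":
    for sample in ("3.3.4-5.fc17.x86_64", "3.17.0", "5.10.0-8-amd64"):
        print(sample, has_seccomp_syscall(sample))
```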

By the way, what's the downside of running it without a sandbox?

The sandbox is a security measure, in case there are exploitable bugs in my code. Since you are using such an old distro, I'm guessing security is not a concern for you, so you can ignore it.

If you want, you can try to secure rdrview by running it as a separate unprivileged user. It's not the same as the sandbox, but it's probably good enough.

eafer avatar Dec 30 '20 21:12 eafer

Okay, thanks, you're right, security is not really an issue in my setup. In the Makefile you could introduce a setting to have the sandbox not compile at all if it's a problem for other people too. But like I said, it's easy to just delete those parts of the code which refer to the sandbox, so it's not a big issue.

fetchinson avatar Dec 30 '20 21:12 fetchinson

You can use mailcap for this purpose. Create a file under ~/.mailcap with a line such as the following:

text/html; /usr/bin/lynx -dump -force_html %s; copiousoutput; description=HTML Text; nametemplate=%s.html

Did you try this? Did it work for you? I ask so that I can close the issue.

eafer avatar Jan 02 '21 21:01 eafer

I had the same issue and was looking for an option to output text directly (an easy way to scroll). Adding text/html; /usr/bin/lynx -dump -force_html %s; copiousoutput; description=HTML Text; nametemplate=%s.html to $HOME/.mailcap is a good solution for me. Maybe also add this to the documentation?

sdsddsd1 avatar Feb 02 '21 11:02 sdsddsd1

I'd also like an easy option to have plain text output. Right now I have to pipe the HTML output into html2text, strip out the formatting marks, then use that to print+TTS.
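For a use case like this, where the formatting marks are unwanted anyway, a bare-bones HTML-to-text step can be sketched with Python's standard library alone. This is not how rdrview or html2text work; the class and function names are hypothetical, and the sketch simply drops all tags, skips script and style content, and puts block-level elements on their own lines.

```python
from html.parser import HTMLParser

# Tags that should start and end on their own line (an assumed, partial list).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "pre", "blockquote",
              "h1", "h2", "h3", "h4", "h5", "h6"}
# Tags whose content is never shown to the reader.
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(html_to_text("<h1>Title</h1><p>Hello <b>world</b>.</p>"))
```

Because the output carries no markup, it could be passed to a text-to-speech program without a separate cleanup pass, at the cost of losing links and emphasis.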

csehszlovakze avatar Dec 28 '22 19:12 csehszlovakze