
Should suppress header info

Open kbenoit opened this issue 8 years ago • 6 comments

I see no need to include the header info in the output, so either disable it by default or make this an argument. Something similar to:

unrtf --text --quiet thefile.rtf

Ideally the output would not include any meta-information.

Note: These can be set in the .conf files.

kbenoit avatar Jun 09 '17 15:06 kbenoit

Edit: hmmm I can't get --quiet to work. At least it keeps printing header info. Does that work for you?

We can add an option to specify the conf files if that's helpful.

jeroen avatar Jun 09 '17 18:06 jeroen

@kbenoit I added a parameter conf_dir; can you test this? I can't find the --quiet option that you mention.

jeroen avatar Jun 09 '17 18:06 jeroen

I'm not sure the ability to specify conf files will be interesting or comprehensible to most users... Here are the behaviours I'm getting with the command-line options. All still have the header info, which is defined in the text.conf file. I'd suggest overriding that to turn off the header info, font messages, etc. by default. If the header info is needed, it could be requested with a `verbose = TRUE` argument to the R function and printed as a console message rather than embedded in the output.

KBsMBP15:quanteda kbenoit$ unrtf --text ~/Downloads/Hungarian.rtf 
###  Translation from RTF performed by UnRTF, version 0.21.9 
### font table contains 0 fonts total
### invalid font number 0

-----------------
Nem tudjuk, mikor kezddhetnek meg Nagy-Britanni?val a kil?p?si t?rgyal?sok, csak azt, mikor kell befejezdni?k - k?z?lte Donald Tusk, az Eur?pai Tan?cs eln?ke.?Jean-Claude Juncker, az Eur?pai Bizotts?g eln?ke b?zik abban, hogy a brit parlamenti v?laszt?sok eredm?nye nem lesz hat?ssal a Brexitrl sz?l? t?rgyal?sokra, ?gy azok min?l hamarabb megkezddnek Nagy-Britannia az Eur?pai Uni? k?z?tt. A n?met k?l?gyminiszter szerint a brit v?laszt?s eredm?nye a Brexit elutas?t?s?t t?kr?zi.?

KBsMBP15:quanteda kbenoit$ unrtf --text --quiet ~/Downloads/Hungarian.rtf 
###  Translation from RTF performed by UnRTF, version 0.21.9 
### font table contains 0 fonts total
### invalid font number 0

-----------------
Nem tudjuk, mikor kezddhetnek meg Nagy-Britanni?val a kil?p?si t?rgyal?sok, csak azt, mikor kell befejezdni?k - k?z?lte Donald Tusk, az Eur?pai Tan?cs eln?ke.?Jean-Claude Juncker, az Eur?pai Bizotts?g eln?ke b?zik abban, hogy a brit parlamenti v?laszt?sok eredm?nye nem lesz hat?ssal a Brexitrl sz?l? t?rgyal?sokra, ?gy azok min?l hamarabb megkezddnek Nagy-Britannia az Eur?pai Uni? k?z?tt. A n?met k?l?gyminiszter szerint a brit v?laszt?s eredm?nye a Brexit elutas?t?s?t t?kr?zi.?

KBsMBP15:quanteda kbenoit$ unrtf --text --verbose  ~/Downloads/Hungarian.rtf 
This is UnRTF version 0.21.9
By Dave Davey, Marcos Serrou do Amaral and Arkadiusz Firus
Original Author: Zachary Smith
show_dirs: 1 directories
directory = /usr/local/Cellar/unrtf/0.21.9/share/unrtf/
Processing /Users/kbenoit/Downloads/Hungarian.rtf...
###  Translation from RTF performed by UnRTF, version 0.21.9 
### font table contains 0 fonts total
### invalid font number 0

-----------------
Nem tudjuk, mikor kezddhetnek meg Nagy-Britanni?val a kil?p?si t?rgyal?sok, csak azt, mikor kell befejezdni?k - k?z?lte Donald Tusk, az Eur?pai Tan?cs eln?ke.?Jean-Claude Juncker, az Eur?pai Bizotts?g eln?ke b?zik abban, hogy a brit parlamenti v?laszt?sok eredm?nye nem lesz hat?ssal a Brexitrl sz?l? t?rgyal?sokra, ?gy azok min?l hamarabb megkezddnek Nagy-Britannia az Eur?pai Uni? k?z?tt. A n?met k?l?gyminiszter szerint a brit v?laszt?s eredm?nye a Brexit elutas?t?s?t t?kr?zi.?
Done.
127 words were hashed.
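
As a stopgap until the header is off by default, the leading lines seen in the transcripts above could be filtered out after the fact. This is only a minimal sketch, not how the package does it: `strip_unrtf_header` is a hypothetical helper, and it assumes the header consists solely of `###` lines, blank lines, and the dashed separator that appear before the first line of body text.

```shell
# Hypothetical helper (not part of unrtf or the R package).
# Assumption: everything before the first "real" line is header noise,
# i.e. lines starting with "###", blank lines, or a line made only of dashes.
strip_unrtf_header() {
  awk '
    body { print; next }                                  # past the header: pass lines through unchanged
    /^###/ || /^-+[ \t]*$/ || /^[ \t]*$/ { next }         # still in the header: drop this line
    { body = 1; print }                                   # first other line starts the body
  '
}
```

Usage would look like `unrtf --text thefile.rtf | strip_unrtf_header`. Because the filter stops dropping lines once the body starts, dashed or blank lines inside the document itself are left alone; a document whose first line starts with `###` would be misread as header, which is a limitation of this sketch.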

kbenoit avatar Jun 10 '17 09:06 kbenoit

How do I turn that off by default? Want to have a look and do a PR?

jeroen avatar Jun 10 '17 10:06 jeroen

Happy to do that. I'll add more test files locally too. Might be a few days but I'll get to it soon.

kbenoit avatar Jun 10 '17 10:06 kbenoit

OK, I'll push a version to CRAN now and then we can do a follow-up release next week.

jeroen avatar Jun 10 '17 21:06 jeroen