unrtf
Should suppress header info
I see no need to include the header info in the output, so either disable it by default or make it an argument. Something similar to:
unrtf --text --quiet thefile.rtf
Ideally the output would not include any meta-information.
Note: These can be set in the .conf files.
Edit: hmmm, I can't get --quiet to work. It keeps printing the header info. Does it work for you?
We can add an option to specify the conf files if that's helpful.
@kbenoit I added a parameter conf_dir. Can you test this? I can't find the --quiet option that you mention.
I'm not sure the ability to specify conf files will be interesting or comprehensible to most users... Here are the behaviours I'm getting with the command line options. All of them still print the header info, which is defined in the text.conf file. I'd suggest overwriting that file to turn off the header info, font messages, etc. by default. If that information is needed, it could be enabled with a verbose = TRUE argument to the R function and printed as a console message rather than embedded in the file.
KBsMBP15:quanteda kbenoit$ unrtf --text ~/Downloads/Hungarian.rtf
### Translation from RTF performed by UnRTF, version 0.21.9
### font table contains 0 fonts total
### invalid font number 0
-----------------
Nem tudjuk, mikor kezddhetnek meg Nagy-Britanni?val a kil?p?si t?rgyal?sok, csak azt, mikor kell befejezdni?k - k?z?lte Donald Tusk, az Eur?pai Tan?cs eln?ke.?Jean-Claude Juncker, az Eur?pai Bizotts?g eln?ke b?zik abban, hogy a brit parlamenti v?laszt?sok eredm?nye nem lesz hat?ssal a Brexitrl sz?l? t?rgyal?sokra, ?gy azok min?l hamarabb megkezddnek Nagy-Britannia az Eur?pai Uni? k?z?tt. A n?met k?l?gyminiszter szerint a brit v?laszt?s eredm?nye a Brexit elutas?t?s?t t?kr?zi.?
KBsMBP15:quanteda kbenoit$ unrtf --text --quiet ~/Downloads/Hungarian.rtf
### Translation from RTF performed by UnRTF, version 0.21.9
### font table contains 0 fonts total
### invalid font number 0
-----------------
Nem tudjuk, mikor kezddhetnek meg Nagy-Britanni?val a kil?p?si t?rgyal?sok, csak azt, mikor kell befejezdni?k - k?z?lte Donald Tusk, az Eur?pai Tan?cs eln?ke.?Jean-Claude Juncker, az Eur?pai Bizotts?g eln?ke b?zik abban, hogy a brit parlamenti v?laszt?sok eredm?nye nem lesz hat?ssal a Brexitrl sz?l? t?rgyal?sokra, ?gy azok min?l hamarabb megkezddnek Nagy-Britannia az Eur?pai Uni? k?z?tt. A n?met k?l?gyminiszter szerint a brit v?laszt?s eredm?nye a Brexit elutas?t?s?t t?kr?zi.?
KBsMBP15:quanteda kbenoit$ unrtf --text --verbose ~/Downloads/Hungarian.rtf
This is UnRTF version 0.21.9
By Dave Davey, Marcos Serrou do Amaral and Arkadiusz Firus
Original Author: Zachary Smith
show_dirs: 1 directories
directory = /usr/local/Cellar/unrtf/0.21.9/share/unrtf/
Processing /Users/kbenoit/Downloads/Hungarian.rtf...
### Translation from RTF performed by UnRTF, version 0.21.9
### font table contains 0 fonts total
### invalid font number 0
-----------------
Nem tudjuk, mikor kezddhetnek meg Nagy-Britanni?val a kil?p?si t?rgyal?sok, csak azt, mikor kell befejezdni?k - k?z?lte Donald Tusk, az Eur?pai Tan?cs eln?ke.?Jean-Claude Juncker, az Eur?pai Bizotts?g eln?ke b?zik abban, hogy a brit parlamenti v?laszt?sok eredm?nye nem lesz hat?ssal a Brexitrl sz?l? t?rgyal?sokra, ?gy azok min?l hamarabb megkezddnek Nagy-Britannia az Eur?pai Uni? k?z?tt. A n?met k?l?gyminiszter szerint a brit v?laszt?s eredm?nye a Brexit elutas?t?s?t t?kr?zi.?
Done.
127 words were hashed.
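Until the defaults change, the header could also be filtered out after the fact. This is only a minimal sketch, not how UnRTF or the R package handles it: it assumes the header is always a run of leading lines starting with ###, closed by a line made up only of dashes, as in the output above. The name strip_unrtf_header is a hypothetical helper, not part of UnRTF.

```shell
# Hypothetical helper: drop the leading UnRTF header block from text on stdin.
# Assumption: the header is a run of lines starting with "###", ended by a
# line consisting only of dashes, as in the sample output in this thread.
strip_unrtf_header() {
  awk '
    in_body  { print; next }          # already past the header: pass through
    /^###/   { next }                 # header comment line: drop it
    /^-+$/   { in_body = 1; next }    # separator line: drop it, body follows
             { in_body = 1; print }   # no header found: treat input as body
  '
}
```

It would be used as unrtf --text thefile.rtf | strip_unrtf_header. Only the leading block is removed, so a body line that happens to start with ### is kept.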
How do I turn that off by default? Want to have a look and do a PR?
Happy to do that. I'll add more test files locally too. Might be a few days but I'll get to it soon.
OK I'll push a version to CRAN now and then we can do a follow up release next week.