email2pdf
Add --input-encoding switch and document potential need to use with procmail
Hi again,
I have one strange problem. I'm using email2pdf with procmail. Everything works fine except with one specific email: there is no PDF output and I cannot find any hints in the syslog. It just stops with "INFO 83 Output file name is:/home/my/path". However, it works if I run it from the command line, "outside" procmail.
I don't know how to debug to find the problem.
Thank you for any hint...
Chris
Chris, so, firstly - have you tried invoking email2pdf with the -vv option? That will maximise the debugging output so we can see what's going on. The next step after that line 83 is to read in the email from stdin (I assume you are not using the -i/-inputFile option?), so I would guess that maybe procmail isn't passing it in correctly or completely.
What does your procmailrc look like? Can you reduce it and/or your email down to a small test case that shows the problem?
Hi,
you're right - I'm using stdin. Running with "-vv" and cleaning up the procmail config now shows me the following error. At least ... an error message ;-)
Traceback (most recent call last):
  File "/opt/email2pdf/email2pdf-master/email2pdf", line 560, in call_main
    main(sys.argv, syslog_handler, syserr_handler)
  File "/opt/email2pdf/email2pdf-master/email2pdf", line 85, in main
    input_data = get_input_data(args)
  File "/opt/email2pdf/email2pdf-master/email2pdf", line 196, in get_input_data
    for line in sys.stdin:
  File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3306: ordinal not in range(128)
Chris
Chris, OK, that's promising, thanks. I think what's happening here is that procmail is handing the email's input data to email2pdf in a different encoding from the one email2pdf is expecting.
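To illustrate the failure mode in isolation, here is a minimal sketch (not email2pdf's code; the sample string and the helper name try_decode are made up for the example). Byte 0xc3 is the first byte of many two-byte UTF-8 sequences, so it decodes fine as UTF-8 but is rejected by the ASCII codec:

```python
# "é" is encoded in UTF-8 as the two bytes 0xc3 0xa9.
raw = "Caf\u00e9 receipt".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or a short failure message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: {}".format(exc)


print(try_decode(raw, "utf-8"))  # decodes cleanly
print(try_decode(raw, "ascii"))  # fails on byte 0xc3, as in the traceback
```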
I've just pushed up a change in commit fa03d4c65 that should show what email2pdf thinks the encoding is (assuming -vv is still being used). Could you please pull that change down, then retry, both inside and outside procmail? You should see lines like this:
DEBUG: System preferred encoding is: XYZ
It would be helpful to know what XYZ is, both inside and outside procmail.
Also, what operating system are you using? Could you tell me the output of the command locale? It should look something like:
LANG="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_CTYPE="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_ALL=
It seems you are right...
procmail:
DEBUG: System preferred encoding is: ANSI_X3.4-1968
command line:
DEBUG: System preferred encoding is: UTF-8
locale
LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Thank you Chris
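For anyone reading along: ANSI_X3.4-1968 is the formal name for ASCII, which is why the traceback ends up in encodings/ascii.py. The sketch below (the function name describe_encodings is made up for illustration, and this is not the code from the commit) checks that alias and prints the values Python sees in the current environment:

```python
import codecs
import locale
import sys


def describe_encodings():
    """Collect the encodings Python reports for the current environment."""
    return {
        "preferred": locale.getpreferredencoding(),
        "stdin": getattr(sys.stdin, "encoding", None),
    }


# Python resolves the name procmail's environment produced to its ASCII codec.
print(codecs.lookup("ANSI_X3.4-1968").name)
print(describe_encodings())
```

Running this once from a shell and once from a procmail recipe should show the same difference as the DEBUG lines above.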
OK, do you have a simplified copy of your procmailrc (that still exhibits the problem) that you can share in here? I'm not that familiar with procmail, so I'm not sure why the character encoding is coming out differently. I would guess it's still passing the same mail body into email2pdf, which is why you are seeing the original error.
It's now very simple - maybe that's the problem?
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
LOGFILE=/tmp/log.`date +%y-%m-%d`
:0
*^content-Type:
{
:0c:
| /opt/email2pdf/email2pdf-master/email2pdf -d /tmp/test --headers -vv
}
Yeah, I can't see any reason why that should be changing the system encoding, given what I know about procmail (which isn't a huge amount).
You should not need to do this, but one workaround might be to set an additional environment variable at the top of your procmailrc:
PYTHONIOENCODING=utf-8
Can you try that please? If that works, I can come up with a more elegant way of forcing/specifying the input encoding as an argument to email2pdf.
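A quick way to see what that variable does, independent of procmail, is to start a child Python process with it set and ask which encoding it will use for stdin. This is only a sketch; the helper name child_stdin_encoding is made up for the example:

```python
import os
import subprocess
import sys


def child_stdin_encoding(extra_env):
    """Run a child Python with extra environment variables set and
    return the encoding it reports for sys.stdin."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.encoding)"],
        env=env,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


print(child_stdin_encoding({"PYTHONIOENCODING": "utf-8"}))
```

Note that PYTHONIOENCODING applies to stdout and stderr as well as stdin.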
It works! That's great - thank you so much. As it seems to be a procmail issue, it's fine for me to set this variable...
Chris
Chris, OK. Technically that will affect email2pdf's output as well (i.e. errors and log messages), so when I get a moment I will add a switch to force the encoding more correctly. But what you have should probably work for now.
Note to self:
- Add a '--input-encoding' switch, along with tests, for both stdin and file input.
- Document potentially strange interaction with procmail.
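One possible shape for such a switch, as a rough sketch only: this is not the actual email2pdf implementation, and the function names build_parser and read_text, and the fallback to the locale's preferred encoding, are assumptions made for illustration. The idea is to read the input as bytes and decode it explicitly, so that only the input is affected and log output keeps its own encoding:

```python
import argparse
import io
import locale


def build_parser():
    """Hypothetical parser with only the proposed option."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--input-encoding",
        default=None,
        help="encoding of the input email (default: locale's preferred encoding)",
    )
    return parser


def read_text(binary_stream, encoding=None):
    """Read all bytes from a binary stream and decode them explicitly."""
    encoding = encoding or locale.getpreferredencoding()
    return binary_stream.read().decode(encoding)


# Usage with an in-memory stream; for stdin, sys.stdin.buffer would be passed instead.
args = build_parser().parse_args(["--input-encoding", "utf-8"])
sample = io.BytesIO("Subject: Caf\u00e9\n".encode("utf-8"))
print(read_text(sample, args.input_encoding))
```

The same read_text helper would work for file input if the file is opened in binary mode.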
Thank you again for your efforts!
No problem, thanks for your support in making email2pdf better!