
collect_mail.py from IETF collects empty .mail files

Open sbenthall opened this issue 2 years ago • 2 comments

Something is quite wrong with the IETF data collection process.

$ python bin/collect_mail.py -u https://www.ietf.org/mail-archive/text/dns-security/
['2008-05.mail',
 '2008-06.mail',
 '2008-07.mail',
 '2008-08.mail',
 '2008-09.mail',
 '2008-10.mail',
 '2008-11.mail',
 '2008-12.mail']

So far so good, but then:

archives/dns-security/2008-09.mail (END)

So no data is getting collected.
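
One way to confirm this across the whole archive, rather than paging through files one at a time, is to scan the download directory for .mail files with no content. This is only a sketch and not part of collect_mail.py: the function name `find_empty_mail_files` is hypothetical, and the `archives/dns-security` path is taken from the pager output above.

```python
from pathlib import Path


def find_empty_mail_files(archive_dir):
    """Return the sorted names of .mail files that contain only whitespace.

    Hypothetical helper for illustration; it is not part of BigBang.
    """
    empty = []
    for path in sorted(Path(archive_dir).glob("*.mail")):
        # Treat a file as empty if nothing remains after stripping whitespace.
        if not path.read_text(errors="replace").strip():
            empty.append(path.name)
    return empty


if __name__ == "__main__":
    # Directory name assumed from the pager output in this issue.
    print(find_empty_mail_files("archives/dns-security"))
```

If every collected file shows up in the returned list, the problem is with what is being downloaded rather than with any single month.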

sbenthall avatar Apr 08 '22 20:04 sbenthall

The mail collection script is downloading all the .mail files from this page:

https://www.ietf.org/mail-archive/text/dns-security/

But these .mail files are empty; the data is actually in the .txt files.
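
A possible direction for a fix is to pick up the .txt links from the index page instead of the .mail ones. The sketch below is an assumption about how that could look, not the current behavior of collect_mail.py: `archive_links` and `_LinkParser` are hypothetical names, and it assumes the index page lists files as plain `<a href="...">` links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def archive_links(index_html, base_url, suffixes=(".txt",)):
    """Return absolute URLs for links whose href ends with one of `suffixes`.

    Hypothetical helper; assumes the archive index uses ordinary anchor tags.
    """
    parser = _LinkParser()
    parser.feed(index_html)
    wanted = tuple(suffixes)
    return [urljoin(base_url, href) for href in parser.hrefs if href.endswith(wanted)]


if __name__ == "__main__":
    # Tiny hand-written page standing in for the real index (assumed layout).
    page = '<a href="2008-05.mail">2008-05.mail</a> <a href="dns-security.200003.txt">txt</a>'
    base = "https://www.ietf.org/mail-archive/text/dns-security/"
    for url in archive_links(page, base):
        print(url)
```

Passing `suffixes=(".mail", ".txt")` would return both kinds of link, which could be useful if some lists still serve their data as .mail files.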

sbenthall avatar Apr 08 '22 21:04 sbenthall

This is likely related to the fact that dns-security is a deprecated working group. An example of one of its .txt files:

https://www.ietf.org/mail-archive/text/dns-security/dns-security.200003.txt

sbenthall avatar Apr 08 '22 21:04 sbenthall