
Use a safer extraction method for untarring the archive

Open MartijnBraam opened this issue 9 years ago • 3 comments

This code replaces the extractall call with custom extraction code. The .tar.gz from rfc-editor.org seems to contain unneeded .pdf files and, for some reason, has the executable bit set on some files. The new code extracts only the .txt files to the rfc directory and also shields your home directory from a broken or malicious tar file (since filenames inside a .tar can contain ../ in the path).

Also see the warning in the python docs for extractall: https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall
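
For illustration, here is a minimal sketch of this kind of extraction. It is not the actual patch: the function name `extract_txt_members` and its parameters are hypothetical, and it assumes that flattening every member to its base name is acceptable for the rfc directory.

```python
import os
import shutil
import tarfile


def extract_txt_members(archive_path, dest_dir):
    """Copy only regular .txt members into dest_dir (hypothetical helper)."""
    os.makedirs(dest_dir, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            # Skip directories, links, devices, and anything that is not a .txt file.
            if not member.isfile() or not member.name.endswith(".txt"):
                continue
            # Keep only the last path component so "../" cannot leave dest_dir.
            name = os.path.basename(member.name)
            source = tar.extractfile(member)
            if not name or source is None:
                continue
            target = os.path.join(dest_dir, name)
            # Writing through open() ignores the mode stored in the archive,
            # so the executable bit is not carried over.
            with source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            extracted.append(name)
    return extracted
```

Called as `extract_txt_members("archive.tar.gz", "rfc")` (both paths are placeholders), it returns the list of file names it wrote.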

MartijnBraam, Jun 13 '16 12:06

What is wrong with the original method?

monsieurh, Jun 13 '16 12:06

It essentially downloads a tar file from somewhere on the internet and extracts it, applying whatever permissions the archive specifies, including the executable bit. If the server is compromised, someone could upload a different .tar file that overwrites any file I have write permission on, anywhere on my laptop, with arbitrary content, and sets its filesystem permissions too.
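
To make the risk concrete, here is a rough sketch that lists members of an archive that would be unsafe to extract blindly. The function name `suspicious_members` and the reason strings are made up for this example, and the checks are not exhaustive.

```python
import tarfile


def suspicious_members(archive_path):
    """Return (name, reason) pairs for risky members (hypothetical helper)."""
    findings = []
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            parts = member.name.replace("\\", "/").split("/")
            # Absolute paths or ".." components can land outside the target directory.
            if member.name.startswith("/") or ".." in parts:
                findings.append((member.name, "escapes target directory"))
            # Symbolic and hard links can redirect later writes elsewhere.
            if member.issym() or member.islnk():
                findings.append((member.name, "link"))
            # Any execute bit set on a regular file.
            if member.isfile() and member.mode & 0o111:
                findings.append((member.name, "executable bit set"))
    return findings
```

An empty list only means none of these three checks fired; it does not prove the archive is safe.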

Also, this doesn't extract the .pdf files, which the program doesn't use, so it saves some space. It also provides a convenient place to gzip the extracted files to save more space (the folder takes up half a gigabyte, which is not nice on a 16GB Chromebook). This is also one of the reasons man pages are stored gzipped on your local filesystem.
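
As a sketch of the gzip idea, assuming each extracted file is compressed after it is written and decompressed when it is read: the helper names `gzip_in_place` and `read_gzipped_text` are hypothetical, and the UTF-8 decoding with replacement is an assumption about the RFC text files.

```python
import gzip
import os
import shutil


def gzip_in_place(path):
    """Compress path to path + '.gz' and remove the original (hypothetical helper)."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path


def read_gzipped_text(gz_path):
    """Read a gzipped text file back as a string."""
    # errors="replace" is an assumption: it avoids failing on odd bytes.
    with gzip.open(gz_path, "rt", encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

The reader side would then open `rfc1234.txt.gz` (a placeholder name) with `read_gzipped_text` instead of a plain `open`.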

MartijnBraam, Jun 13 '16 12:06

Fair point. I'll merge it as soon as I have time to write a test for it. Sooner if you get to it first, of course ;)

monsieurh, Jun 14 '16 20:06