Some PDF streams not detected

Open samcowger opened this issue 6 years ago • 0 comments

If I run polyfile on an off-the-shelf PDF, it generally detects all streams. If I use macOS Preview to highlight/annotate the file (5 times, in this case), a number of streams are added. (Many of them appear to be ICC color spaces, and oddly, many are identical: seemingly 20 in this case.) polyfile is unable to detect most of the newly added streams:

> ls -lh optician*
-rw-r--r--@ 1 sam  staff   515K Jan  8 10:41 optician-with-annots.pdf
-rw-r--r--@ 1 sam  staff   370K Jan  8 10:41 optician.pdf
> strings optician.pdf | grep -c endstream
158
> polyfile -q optician.pdf | jq | grep -c "\"type\": \"EndStream\""
158
> strings optician-with-annots.pdf | grep -c endstream
198
> polyfile -q optician-with-annots.pdf | jq | grep -c "\"type\": \"EndStream\""
158
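
For anyone who wants to repeat the comparison without strings/grep/jq, here is a rough Python sketch of the same check. It is only an illustration: the function names `count_token` and `count_nodes_of_type` are made up for this example, and it assumes the output of `polyfile -q` has been saved to a JSON file first. The walk makes no assumption about polyfile's schema beyond the `"type": "EndStream"` entries that the grep above matches. Note that `grep -c` counts matching lines while this counts occurrences, so the numbers could differ slightly on some files.

```python
import json
import sys


def count_token(data: bytes, token: bytes = b"endstream") -> int:
    """Count non-overlapping occurrences of a raw token in the file bytes."""
    return data.count(token)


def count_nodes_of_type(node, type_name: str = "EndStream") -> int:
    """Walk arbitrary parsed JSON and count dicts whose "type" equals type_name.

    Iterative (explicit stack) so deeply nested output cannot hit the
    recursion limit.
    """
    count = 0
    stack = [node]
    while stack:
        current = stack.pop()
        if isinstance(current, dict):
            if current.get("type") == type_name:
                count += 1
            stack.extend(current.values())
        elif isinstance(current, list):
            stack.extend(current)
    return count


if __name__ == "__main__":
    # Hypothetical usage: python compare_streams.py file.pdf polyfile-output.json
    pdf_path, json_path = sys.argv[1], sys.argv[2]
    with open(pdf_path, "rb") as f:
        raw_count = count_token(f.read())
    with open(json_path, "r", encoding="utf-8") as f:
        parsed_count = count_nodes_of_type(json.load(f))
    print(f"raw endstream tokens: {raw_count}")
    print(f"EndStream nodes in polyfile output: {parsed_count}")
```

A raw byte count can also match `endstream` in places that are not real stream terminators, so treat a mismatch as a hint rather than proof.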

The files in question: optician-with-annots.pdf, optician.pdf

samcowger, Jan 08 '20 19:01