Trim trailing space in ascii PBM files

Open ace-dent opened this issue 6 months ago • 3 comments

Is your feature request related to a problem? Please describe.

When saving an ASCII / plain text PBM file, we are limited to 70 characters per line. For binary images, we can have up to 35 pixels and maintain 1 line of cells = 1 row of pixels. However, we currently add a trailing space at the end of each line. I don't believe this extra space serves any function, and it reduces our limit to 34 pixels.
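The arithmetic behind the 35-versus-34 figure can be checked with a small shell sketch. The helper name `line_length` is hypothetical, and whether the line terminator counts toward the 70-character limit is an assumption; the specification text quoted in this thread does not say.

```shell
# line_length N TRAILING: characters in one raster line of N pixels,
# each pixel a single digit, with single spaces between pixels,
# plus TRAILING (0 or 1) extra space after the last pixel.
# The newline itself is not counted here.
line_length() {
  echo $(( $1 + ($1 - 1) + $2 ))
}

line_length 35 0   # prints 69
line_length 35 1   # prints 70
line_length 34 1   # prints 68
```

If the newline is counted as well, 35 pixels with a trailing space come to 71 characters, which is over the limit, while 34 pixels still fit.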

Describe the solution you'd like

We should consider removing the trailing white space when writing out pixel values. It may be worth reviewing other NetPBM files too.

Describe alternatives you've considered

It is possible to script removal of these spaces in bash, etc. A user is unlikely to realise this is required.

Additional context

Minimal test case:

magick -size 1x1 xc:black -depth 1 -compress None pbm:- \
  | sed 's/[[:space:]]*$//' \
  > "TEST.pbm" # Strip trailing whitespace from rows

ace-dent, Jun 13 '25 12:06

SED Remove Leading or Trailing Spaces

sed 's/^[ ]*//'    leading

sed 's/[ ]*$//'    trailing

But how does one know that the trailing or leading space is not intended by the user? Changing how the IM code handles this for NetPBM files may therefore be an issue.
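For reference, the effect of the trailing-space expression can be checked without ImageMagick on a hand-written sample. The 3x2 image and the function names `sample` and `strip_trailing` are made up for illustration.

```shell
# Hypothetical 3x2 plain PBM with a space after every raster value,
# mimicking the output described in this issue.
sample() {
  printf 'P1\n3 2\n1 0 1 \n0 1 0 \n'
}

# Remove trailing whitespace from every line.
strip_trailing() {
  sed 's/[[:space:]]*$//'
}

sample | wc -c                    # 21 bytes
sample | strip_trailing | wc -c   # 19 bytes
```

Only the two raster lines change; the header lines have no trailing whitespace to remove.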

fmw42, Jun 13 '25 15:06

When saving an ASCII / plain text PBM file, we are limited to 70 characters per line.

Who is limited to 70 characters per line? What does the limiting? Not ImageMagick.

However, currently we add a trailing whitespace at the end of each line.

Yes, IM appends a space character to each output value, so there is a space at the end of each line.
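One common way to avoid a trailing separator is to write it before every value except the first, rather than after every value. The sketch below only illustrates that idea in shell; it is not ImageMagick's code, and `emit_row` is a hypothetical name.

```shell
# emit_row V1 V2 ...: print the values separated by single spaces.
# The separator is written before each value except the first,
# so nothing follows the last value but the newline.
emit_row() {
  sep=''
  for v in "$@"; do
    printf '%s%s' "$sep" "$v"
    sep=' '
  done
  printf '\n'
}

emit_row 1 0 1   # prints "1 0 1" with no trailing space
```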

It is possible to script removal of these spaces in bash, etc. A user is unlikely to realise this is required.

What requires users to remove the space? They can if they want, of course, but why is it required?

snibgo, Jun 13 '25 18:06

@snibgo - thanks for comments.

The PBM Specification:

No line should be longer than 70 characters.

This may be specified for all of the plain text (/ASCII) NetPBM formats. I'm sure it's just an old convention for displaying in 80char terminals? However...

Whitespace (blanks, TABs, CRs, LFs).

and

White space in the raster section is ignored.

Which suggests that any file readers will strip all white space, including the newline LF character... i.e. the 70char row wrapping would never affect decoding. As a User, keeping the current behaviour (1 raster line = 1 row of pixels) is very convenient! (Previously discussed)
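A rough way to see this is to compare the raster digits of two files after discarding all whitespace. This is a sketch, not a real PBM reader: `raster_bits` is a hypothetical helper that assumes a two-line header ("P1", then "width height") and no `#` comment lines.

```shell
# raster_bits: read a plain PBM on stdin and print its pixel digits
# with all whitespace (spaces, tabs, CRs, LFs) removed.
# Assumes the header occupies exactly the first two lines.
raster_bits() {
  tail -n +3 | tr -d '[:space:]'
}

# One pixel row per line, with trailing spaces
printf 'P1\n3 2\n1 0 1 \n0 1 0 \n' | raster_bits; echo   # 101010
# Same pixels with no whitespace inside the raster
printf 'P1\n3 2\n101010\n' | raster_bits; echo            # 101010
```

Both inputs reduce to the same digit sequence, so where the line breaks and spaces fall does not change the decoded pixels.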

@fmw42 - Thanks; that's what I'm doing.

I'm removing the trailing whitespace because, when archiving very small pixel art (hundreds of images), it bloats the file size unnecessarily. It only occurs in the raster lines, and I can understand why (we always add a space after outputting a digit). No trailing spaces are present in the file header. The trailing space is not required and removing it seems like an easy fix... but I can accept it may be a legacy 'feature' whose removal could break other people's workflow... though that seems unlikely.

ace-dent, Jun 19 '25 11:06

Thank you for reporting the issue. We have successfully reproduced it and are actively working on a patch to resolve it. You can expect this patch to be merged into the main Git branch later today. As part of our commitment to quality, this fix will also be included in the upcoming beta releases of ImageMagick by tomorrow. Your patience and feedback are greatly appreciated.

urban-warrior, Jun 23 '25 13:06

Thanks @urban-warrior ...

Looking over the recent commit, I think there may be a problem!

It was discussed here and in issue #294 that there is no need to limit the scan lines to 70 characters. The new commit seems to add a limit of 70 pixels... Perhaps that should be reconsidered?

ace-dent, Jun 23 '25 21:06

In practical terms, the 70-character limit helps ensure that image data remains easy to parse and edit manually, and that software tools processing these formats can handle the input reliably without needing to manage excessively long lines. It's not a strict technical requirement, but a widely respected guideline. Given that, we previously supported 2048 characters, which is an arbitrary length we chose out of the ether. What do you recommend?

urban-warrior, Jun 23 '25 23:06