[Feature Request]: FormattedIO: support for formatted arrays of numbers

Open ninivert opened this issue 1 year ago • 6 comments

Summary of Feature

Description:

Format strings for numbers could be applied to arrays of numbers, as a "broadcast" of the scalar format string.

Is this a blocking issue with no known work-arounds? no

The work-around is to call writef once per element in a for-loop. (It might also be possible to generate one large format string and unpack a large tuple containing the array's values in a single writef(largeFormatString, (...arrayAsTuple)) call.)

The documentation https://chapel-lang.org/docs/modules/standard/IO/FormattedIO.html does not seem to mention array formatting.

Code Sample

use Random, OS;

config const n = 3, m = 2;

var A: [1..n, 1..m] real;

fillRandom(A, 42);
A[1,1] = 0.0;
A[3,2] = 0.0;  // let's throw off alignment of default format

// default print does not align the columns nicely:
writeln("A=\n", A);
// A=
// 0.0 0.896538
// 0.347376 0.95269
// 0.903904 0.0

// custom, column-aligned print:
writeln("A=");
for (i,j) in A.domain do
  writef("%6.3dr" + if j == A.domain.dim(1).last then "\n" else "", A(i,j));
// A=
//  0.000 0.897
//  0.347 0.953
//  0.904 0.000

// this feature request would allow the following syntax:
writef("A=\n %6.3dr", A);
// currently throws:
// A=
//  uncaught SystemError: Invalid argument: Argument type mismatch in argument 0 (in fileWriter.writef(fmt:string) with path "/dev/pts/12" offset 50)
//   formatArrays.chpl:14: thrown here
//   formatArrays.chpl:14: uncaught here

Example using numpy, where the formatter is specified for the array elements and "broadcast" to the whole array when it is printed:

>>> import numpy as np
>>> A = np.array([[0.0, 0.896538], [0.347376, 0.95269], [0.903904, 0.0]])
>>> with np.printoptions(formatter={'float': '{:6.3f}'.format}):
...     print(A)
... 
[[ 0.000  0.897]
 [ 0.347  0.953]
 [ 0.904  0.000]]

ninivert, Apr 12 '24 11:04

Thanks for filing! I agree that this would be a useful thing to have

lydia-duncan, Apr 12 '24 18:04

In talking with one of my colleagues (Michael) about this, we were thinking a good way to accomplish this could involve adding/modifying one of the serializers to set the precision generally. Have you had a chance to play with the serializer/deserializer feature? (Note that if it seemed useful, we'd expect to provide a separate serializer/deserializer generally for users rather than having users rely on writing their own)

lydia-duncan, Apr 15 '24 16:04

Apologies for the late answer, but yes, looking through the docs, the serializer feature looks very interesting for this!

If I understand correctly, this would allow us to wrap the for ... do writef(...) loop. However, I'm wondering whether all those writef calls could be optimized (although I'm not sure of the inner workings of writef, and arguably one should only print small arrays for debugging, where speed is not crucial).

The idea of constructing the large format string and using tuple destructuring could be used with ArrayHelper.writeBulkElements, but I'm wondering how that would work concretely. From my understanding, we'd need to construct a large tuple (and corresponding format string) containing array elements (say, rows of a matrix). However, tuple sizes must be known at compile time, whereas array sizes can change at runtime, so that might be a problem?
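
One way around the compile-time tuple size is to skip the tuple entirely: format each element into a string and write the assembled text in one call. The following is only a sketch under assumptions: `formatRows` is a hypothetical helper (not part of Chapel's standard modules), and it assumes `string.format` from the IO module accepts the same `%6.3dr` specifier that `writef` does.

```chapel
use IO, Random;

// Hypothetical helper, not a standard-library routine: applies one scalar
// format string to every element of a 2D real array and returns the text,
// one matrix row per line, with columns separated by a single space.
proc formatRows(A: [?D] real, fmt: string = "%6.3dr"): string throws {
  var text: string;
  for i in D.dim(0) {
    for j in D.dim(1) {
      if j != D.dim(1).first then text += " ";
      text += fmt.format(A[i, j]);  // string.format can throw on a bad format
    }
    text += "\n";
  }
  return text;
}

config const n = 3, m = 2;

var A: [1..n, 1..m] real;
fillRandom(A, 42);

// A single write call for the whole array, sized at runtime.
write("A=\n", formatRows(A));
```

This still formats one scalar at a time, so it reduces the number of write calls rather than the formatting work itself; whether that makes a measurable difference has not been checked here.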

ninivert, Apr 18 '24 15:04

It would definitely allow you to wrap the for ... do writef(...) call, but if the only reason you're using writef is for the precision setting, then it could also allow you to use write or writeln instead of writef, since Chapel's generics allow the easy mixing of various types in the same write call.

I tend to not use writef myself as often in Chapel because I've typically found the base writeln powerful enough for my needs. But I also have spent most of my technical career here so didn't have as much time to form habits around format strings prior to writing Chapel code, and mostly don't write the sorts of code that would benefit from formatting. Which is to say, I'm curious if your use case lends itself to something like writeln("some text ", myInt, " ", myReal); or if it relies a lot more heavily on the other features of writef in a way that varies between calls.

The idea of constructing the large format string and using tuple destructuring could be used with ArrayHelper.writeBulkElements, but I'm wondering how that would work concretely. From my understanding, we'd need to construct a large tuple (and corresponding format string) containing array elements (say, rows of a matrix). However, tuple sizes must be known at compile time, whereas array sizes can change at runtime, so that might be a problem?

I'm not sure I'm quite following. It sounds like you have a particular use case in mind here, perhaps? I think understanding the motivation behind what you're asking might help me answer the question better.

lydia-duncan, Apr 19 '24 15:04

I'm curious if your use case lends itself to something like writeln("some text ", myInt, " ", myReal);

The use case I was mostly thinking of was pretty-printing matrices (aligned columns, truncated precision) for debugging purposes, something like writef("A = %6.3dr", A);. Currently writeln seems to truncate the representation of numbers as much as possible, up to what seems to be 5 digits (writeln(0.0) gives 0.0, writeln(7.0/3.0) gives 2.33333). I'm not sure what you mean by also using write or writeln? The use case I was thinking of requires the use of format strings.

I'm not sure I'm quite following.

Sorry about the confusion, I haven't taken the time to fully form an example. I was more concerned with the performance of having to call writef on every scalar, but it's probably not that important.

ninivert, Apr 23 '24 10:04

The use case I was mostly thinking of was pretty-printing matrices (aligned columns, truncated precision) for debugging purposes, something like writef("A = %6.3dr", A);. Currently writeln seems to truncate the representation of numbers as much as possible, up to what seems to be 5 digits (writeln(0.0) gives 0.0, writeln(7.0/3.0) gives 2.33333).

Ah, I see! That makes sense.

I'm not sure what you mean by also using write or writeln? The use case I was thinking of requires the use of format strings.

I was mostly trying to think of ways to reduce what you're having to write, especially if there's repetition due to not having the array support. But it looks like you're using enough format string features that it's probably a moot point.

Sorry about the confusion, I haven't taken the time to fully form an example. I was more concerned with the performance of having to call writef on every scalar, but it's probably not that important.

Ah, okay. That's an interesting point. I don't know that we've done a study on the performance impact of calling writef independently versus in joined batches. My hope is that it would be negligible, but that could be worth verifying.

lydia-duncan, Apr 23 '24 22:04

Noting that a somewhat related idea for specifying precision in writef calls was discussed here: https://github.com/chapel-lang/chapel/issues/19906#issuecomment-1497965334

jeremiah-corrado, May 23 '24 20:05

https://github.com/chapel-lang/chapel/pull/25532 added a new precisionSerializer that allows you to specify how floating-point values are printed by a fileWriter.

It supports the requested behavior described above with the following code:

use IO, Random, PrecisionSerializer;

config const n = 3, m = 2;

var A: [1..n, 1..m] real;

fillRandom(A, 42);
A[1,1] = 0.0;
A[3,2] = 0.0;

const ps = new precisionSerializer(precision=3, padding=6);

stdout.withSerializer(ps).writeln(A);

which will print:

 0.000  0.897
 0.347  0.953
 0.904  0.000

I believe this issue can be closed with that change. Please feel free to re-open if that's not the case.

jeremiah-corrado, Jul 15 '24 14:07

Thanks for the work here, @jeremiah-corrado!

And thanks for filing this, @ninivert—let us know if you have any feedback!

bradcray, Jul 22 '24 18:07

This is exactly what I was looking for! Many thanks! :tada:

ninivert, Jul 26 '24 09:07