[Feature Request]: FormattedIO: support for formatted arrays of numbers
Summary of Feature
Description:
Format strings for numbers could be applied to arrays of numbers, as a "broadcast" of the scalar format string over every element.
Is this a blocking issue with no known work-arounds? no
The work-around is to call writef multiple times in a for-loop.
(Alternatively, it might be possible to generate one large format string and unpack a large tuple containing the values of the array in a single writef(largeFormatString, (...arrayAsTuple)) call.)
The documentation https://chapel-lang.org/docs/modules/standard/IO/FormattedIO.html does not seem to mention array formatting.
Code Sample
use Random, OS;
config const n = 3, m = 2;
var A: [1..n, 1..m] real;
fillRandom(A, 42);
A[1,1] = 0.0;
A[3,2] = 0.0; // let's throw off alignment of default format
// default print does not align the columns nicely:
writeln("A=\n", A);
// A=
// 0.0 0.896538
// 0.347376 0.95269
// 0.903904 0.0
// custom, column-aligned print:
writeln("A=");
for (i,j) in A.domain do
  writef("%6.3dr" + if j == A.domain.dim(1).last then "\n" else "", A(i,j));
// A=
// 0.000 0.897
// 0.347 0.953
// 0.904 0.000
// this feature request would allow the following syntax:
writef("A=\n %6.3dr", A);
// currently throws:
// A=
// uncaught SystemError: Invalid argument: Argument type mismatch in argument 0 (in fileWriter.writef(fmt:string) with path "/dev/pts/12" offset 50)
// formatArrays.chpl:14: thrown here
// formatArrays.chpl:14: uncaught here
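Until something like the requested syntax exists, the loop-based work-around can be wrapped in a small reusable procedure. The following is only a minimal sketch: writefEach is a hypothetical helper name, not part of FormattedIO, and it assumes a 2D array whose element type matches the given format string.

```chapel
use IO;

// Hypothetical helper, not part of FormattedIO: applies one scalar format
// string to every element of a 2D array and ends each row with a newline.
proc writefEach(fmt: string, A: [?D] ?t) throws {
  for i in D.dim(0) {
    for j in D.dim(1) do
      writef(fmt, A[i, j]);
    writeln();
  }
}

// Small example array; the values are arbitrary.
var A: [1..3, 1..2] real = 0.5;
A[1, 1] = 0.0;

writeln("A=");
writefEach("%6.3dr", A);
```

This keeps the call site close to the proposed writef("A=\n %6.3dr", A) form, at the cost of still issuing one writef call per element.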
Example using numpy, where the formatter is specified on the array elements, and "broadcasted" to the entire array print:
>>> import numpy as np
>>> A = np.array([[0.0, 0.896538], [0.347376, 0.95269], [0.903904, 0.0]])
>>> with np.printoptions(formatter={'float': '{:6.3f}'.format}):
...     print(A)
...
[[ 0.000  0.897]
 [ 0.347  0.953]
 [ 0.904  0.000]]
Thanks for filing! I agree that this would be a useful thing to have.
In talking with one of my colleagues (Michael) about this, we were thinking a good way to accomplish this could involve adding/modifying one of the serializers to set the precision generally. Have you had a chance to play with the serializer/deserializer feature? (Note that if it seemed useful, we'd expect to provide a separate serializer/deserializer generally for users rather than having users rely on writing their own)
Apologies for the late answer, but yes, looking through the docs, the serializer feature looks very interesting for this!
If I understand correctly, this would allow us to wrap the for ... do writef(...) loop. However, I'm wondering whether all those writef calls could be optimized (although I'm not sure of the inner workings of writef, and arguably one should only print small arrays for debugging, where speed is not crucial).
The idea of constructing the large format string and using tuple destructuring could be used with ArrayHelper.writeBulkElements, but I'm wondering how that would work concretely.
From my understanding, we'd need to construct a large tuple (and a corresponding format string) containing array elements (say, the rows of a matrix).
However, tuple sizes must be known at compile time (while array sizes can change at runtime), so that might be a problem?
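To make the compile-time concern concrete, here is a rough sketch of the tuple-expansion idea for one row at a time. It only works under a loudly stated assumption: the column count m is declared as a config param (fixed at compile time) rather than a config const. The names rowFmt and row are illustrative only.

```chapel
use IO;

// Assumption for this sketch: the column count is a compile-time `param`,
// because tuple sizes cannot depend on runtime values.
config param m = 2;
config const n = 3;

var A: [1..n, 1..m] real = 0.5;

// One format specifier per column, followed by a newline.
const rowFmt = "%6.3dr" * m + "\n";

for i in 1..n {
  var row: m*real;              // homogeneous tuple of m reals
  for j in 0..<m do
    row(j) = A[i, j+1];         // tuples are indexed from 0
  writef(rowFmt, (...row));     // expand the tuple into m arguments
}
```

With a runtime-sized array, the declaration var row: m*real would not compile, which is exactly the limitation raised above.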
It would definitely allow you to wrap the for ... do writef(...) call, but if the only reason you're using writef is for the precision setting, then it could also allow you to use write or writeln instead of writef, since Chapel's generics allow the easy mixing of various types in the same write call.
I tend to not use writef myself as often in Chapel because I've typically found the base writeln powerful enough for my needs. But I also have spent most of my technical career here so didn't have as much time to form habits around format strings prior to writing Chapel code, and mostly don't write the sorts of code that would benefit from formatting. Which is to say, I'm curious if your use case lends itself to something like writeln("some text ", myInt, " ", myReal); or if it relies a lot more heavily on the other features of writef in a way that varies between calls.
The idea of constructing the large format string and using tuple destructuring could be used with ArrayHelper.writeBulkElements, but I'm wondering how that would work concretely. From my understanding, we'd need to construct a large tuple (and corresponding format string) containing array elements (say, rows of a matrix). However tuple sizes should be known at compile time (and array sizes can change at runtime), so that might be a problem?
I'm not sure I'm quite following. It sounds like you have a particular use case in mind here, perhaps? I think understanding the motivation behind what you're asking might help me answer the question better.
I'm curious if your use case lends itself to something like writeln("some text ", myInt, " ", myReal);
The use case I was mostly thinking of was pretty-printing matrices (aligned columns, truncated precision) for debugging purposes, something like writef("A = %6.3dr", A);.
Currently writeln seems to truncate the representation of numbers as much as possible, up to what seems to be 5 digits (writeln(0.0) gives 0.0, writeln(7.0/3.0) gives 2.33333).
I'm not sure what you mean by also using write or writeln? The use case I was thinking of requires the use of format strings.
I'm not sure I'm quite following.
Sorry about the confusion, I haven't taken the time to fully form an example. I was more concerned with the performance of having to call writef on every scalar, but it's probably not that important.
The use case I was mostly thinking of was pretty-printing matrices (aligned columns, truncated precision) for debugging purposes, something like writef("A = %6.3dr", A);. Currently writeln seems to truncate the representation of numbers as much as possible, up to what seems to be 5 digits (writeln(0.0) gives 0.0, writeln(7.0/3.0) gives 2.33333).
Ah, I see! That makes sense.
I'm not sure what you mean by also using write or writeln? The use case I was thinking of requires the use of format strings.
I was mostly trying to think of ways to reduce what you're having to write, especially if there's repetition due to not having the array support. But it looks like you're using enough format string features that it's probably a moot point.
Sorry about the confusion, I haven't taken the time to fully form an example. I was more concerned with the performance of having to call writef on every scalar, but it's probably not that important.
Ah, okay. That's an interesting point. I don't know that we've done a study on the performance impact of calling writef independently versus in joined batches. My hope is that it would be negligible, but that could be worth verifying.
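For anyone who wants to try the "joined batches" variant, one possible shape is to format each row into a single string with string.format and then issue one write per row. This is only a sketch of the idea: whether it is actually faster than per-element writef calls is unmeasured here, and the array values are placeholders.

```chapel
use IO;

// Placeholder data; any 2D array of reals would do.
var A: [1..3, 1..2] real = 0.5;
A[1, 1] = 0.0;

for i in A.domain.dim(0) {
  // Build the whole row as one string, so its length can vary at runtime
  // without needing a compile-time-sized tuple.
  var line: string;
  for j in A.domain.dim(1) do
    line += "%6.3dr".format(A[i, j]);
  writeln(line);                // one write call per row
}
```

Timing this against the per-element loop on a large array would be one way to check whether the difference matters in practice.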
Noting that a somewhat related idea for specifying precision in writef calls was discussed here: https://github.com/chapel-lang/chapel/issues/19906#issuecomment-1497965334
https://github.com/chapel-lang/chapel/pull/25532 added a new precisionSerializer that allows you to specify how floating-point values are printed out by a fileWriter.
It supports the requested behavior described above with the following code:
use IO, Random, PrecisionSerializer;
config const n = 3, m = 2;
var A: [1..n, 1..m] real;
fillRandom(A, 42);
A[1,1] = 0.0;
A[3,2] = 0.0;
const ps = new precisionSerializer(precision=3, padding=6);
stdout.withSerializer(ps).writeln(A);
which will print:
0.000 0.897
0.347 0.953
0.904 0.000
I believe this issue can be closed with that change, please feel free to re-open if that's not the case.
Thanks for the work here, @jeremiah-corrado!
And thanks for filing this, @ninivert—let us know if you have any feedback!
This is exactly what I was looking for! Many thanks! :tada: