gonum/gonum
mat: format complex dense matrices
Please take a look.
This PR adds a `CFormat` function in gonum/mat to format complex dense matrices; this function is intended to complement the current `Format` function. The relevant issue is #641, and there is a related PR, #1414, although that PR appears to have stalled. Further, PR #1414 takes the approach of wholesale duplicating `format.go`, which may not be preferred.
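For orientation, here is a rough usage sketch. The exported name (`CFormat`) and its argument list are taken from this PR's description and should be treated as assumptions until the API settles; `mat.NewCDense` and the `%v` verb behaviour mirror the existing real-valued path.

```go
package main

import (
	"fmt"

	"gonum.org/v1/gonum/mat"
)

func main() {
	// A small complex dense matrix (mat.NewCDense is existing gonum API).
	c := mat.NewCDense(2, 2, []complex128{
		1 + 2i, 3 - 1i,
		0, 2 + 5i,
	})

	// Hypothetical: wrap the complex matrix so it prints with the fmt
	// verbs, analogous to mat.Formatted for real matrices. The name
	// CFormat and its signature are assumptions based on the PR text.
	fmt.Printf("c = %v\n", mat.CFormat(c))
}
```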
Summary:
Following https://github.com/gonum/gonum/pull/1414#issuecomment-666282217, an objective was to avoid code duplication while implementing a formatter for complex matrices. To summarize the proposed approach: value formatters are added for formatting floats and complex numbers, and these are then used during matrix formatting. Since the value formatters satisfy the `fmt.Formatter` interface, `fmt.State` can be passed through, which simplifies several areas (e.g., reconstruction of a format string is no longer necessary). As it would be inefficient to spin up a new formatter for each element in a matrix, a value formatter holds a buffer for reuse.
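To make the value-formatter idea concrete, here is a minimal sketch, with illustrative names rather than the PR's actual code, of a float formatter that satisfies `fmt.Formatter` and reuses an internal buffer across elements:

```go
package main

import (
	"fmt"
	"strconv"
)

// floatFormatter holds the value to print and a scratch buffer, and
// satisfies fmt.Formatter so a fmt.State can be passed through directly
// during matrix formatting. Names here are illustrative only.
type floatFormatter struct {
	v   float64
	buf []byte // reused across elements to avoid per-element allocations
}

func (f *floatFormatter) Format(fs fmt.State, verb rune) {
	prec, ok := fs.Precision()
	if !ok {
		prec = -1
	}
	format := byte(verb)
	if format == 'v' {
		format = 'g' // %v for floats behaves like %g
	}
	f.buf = strconv.AppendFloat(f.buf[:0], f.v, format, prec, 64)
	if w, ok := fs.Width(); ok {
		for pad := w - len(f.buf); pad > 0; pad-- {
			fs.Write([]byte{' '}) // right-align within the requested width
		}
	}
	fs.Write(f.buf)
}

func main() {
	f := &floatFormatter{}
	for _, v := range []float64{1, -2.5, 3.25} {
		f.v = v
		fmt.Printf("[%8.3v]\n", f) // width and precision flow in via fmt.State
	}
}
```

The same pattern extends to complex values by formatting the real and imaginary parts into the shared buffer.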
To tackle widthing, a `state` type is proposed (satisfying `fmt.State`) that can be fitted to a matrix and that stores information on column widths, using the existing `widther` type. As before, iterating through the matrix elements is required. After a call to `At`, a `state` will respond to subsequent `Width` calls with the relevant column width. Recommendations for improving the implementation are appreciated.
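A minimal sketch of that widthing idea (again with illustrative names, not the PR's actual types): a type that satisfies `fmt.State` and answers `Width` with the column width recorded by the most recent `At` call.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// colState is an illustrative stand-in for the proposed state type: it
// satisfies fmt.State and reports per-column widths rather than the width
// parsed from the verb.
type colState struct {
	w      io.Writer
	widths []int // one width per matrix column, computed by fitting to the matrix
	col    int   // column selected by the most recent At call
}

// At records which element is about to be formatted, so that subsequent
// Width calls answer with that element's column width.
func (s *colState) At(i, j int) { s.col = j }

func (s *colState) Write(b []byte) (int, error) { return s.w.Write(b) }
func (s *colState) Width() (int, bool)          { return s.widths[s.col], true }
func (s *colState) Precision() (int, bool)      { return 0, false }
func (s *colState) Flag(c int) bool             { return false }

var _ fmt.State = (*colState)(nil) // compile-time check

func main() {
	s := &colState{w: os.Stdout, widths: []int{3, 7}}
	for j := range s.widths {
		s.At(0, j)
		w, _ := s.Width()
		fmt.Fprintf(s, "column %d width: %d\n", j, w)
	}
}
```

The intent is that the value formatters above consult this state, so element formatting and column alignment stay decoupled.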
Notwithstanding the need for additional internal types, this feels simpler to me on the whole, and I believe this approach should make extending to new syntaxes or data types easier, if desired. Thanks for reviewing; I'll address feedback and comments.
Bench:
I've included a few benchmarks below. These compare the current and proposed implementation when formatting 10x10, 100x100 and 1000x1000 dense matrices.
implementation | benchmark | iterations | time/op | B/op | allocs/op |
---|---|---|---|---|---|
current | BenchmarkFormat10 | 36134 | 33063 ns/op | 2693 B/op | 16 allocs/op |
current | BenchmarkFormat100 | 358 | 3747403 ns/op | 1672025 B/op | 137 allocs/op |
current | BenchmarkFormat1000 | 3 | 334034817 ns/op | 163557800 B/op | 1066 allocs/op |
proposed | BenchmarkFormat10 | 22086 | 54554 ns/op | 2886 B/op | 19 allocs/op |
proposed | BenchmarkFormat100 | 170 | 7072273 ns/op | 1670752 B/op | 49 allocs/op |
proposed | BenchmarkFormat1000 | 2 | 697675398 ns/op | 163542896 B/op | 80 allocs/op |
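For reference, the whole-matrix benchmarks above have roughly the following shape. This is my reconstruction against the existing `mat.Formatted` path; the PR's actual benchmark code may differ.

```go
package mat_test

import (
	"fmt"
	"io"
	"testing"

	"gonum.org/v1/gonum/mat"
)

// benchmarkFormat formats an n×n dense matrix once per iteration,
// discarding the output. The helper name and setup are assumptions.
func benchmarkFormat(b *testing.B, n int) {
	data := make([]float64, n*n)
	for i := range data {
		data[i] = float64(i)
	}
	f := mat.Formatted(mat.NewDense(n, n, data))

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fmt.Fprintf(io.Discard, "%v", f)
	}
}

func BenchmarkFormat10(b *testing.B)   { benchmarkFormat(b, 10) }
func BenchmarkFormat100(b *testing.B)  { benchmarkFormat(b, 100) }
func BenchmarkFormat1000(b *testing.B) { benchmarkFormat(b, 1000) }
```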
Codecov Report
Merging #1707 (5e086de) into master (85ca896) will increase coverage by 0.03%. The diff coverage is 92.59%.
:exclamation: Current head 5e086de differs from pull request most recent head 6ce02c8. Consider uploading reports for the commit 6ce02c8 to get more accurate results.
```diff
@@            Coverage Diff             @@
##           master    #1707      +/-   ##
==========================================
+ Coverage   73.26%   73.29%   +0.03%
==========================================
  Files         510      511       +1
  Lines       60132    60152      +20
==========================================
+ Hits        44056    44090      +34
+ Misses      13584    13581       -3
+ Partials     2492     2481      -11
```
Impacted Files | Coverage Δ | |
---|---|---|
mat/format_value.go | 84.41% <84.41%> (ø) | |
mat/format.go | 98.67% <98.21%> (+8.88%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Just a drive-by comment: could you run the benchmarks before/after the proposed optimization a couple of times (say 10-20) and show the result of `benchstat`?
With the limited statistics you presented in the commit message, I get this rather poor statistical-power result:
```
$> benchstat ./ref.txt ./new.txt
name        old time/op    new time/op    delta
Format10      33.1µs ± 0%    54.6µs ± 0%    ~    (p=1.000 n=1+1)
Format100     3.75ms ± 0%    7.07ms ± 0%    ~    (p=1.000 n=1+1)
Format1000     334ms ± 0%     698ms ± 0%    ~    (p=1.000 n=1+1)

name        old alloc/op   new alloc/op   delta
Format10      2.69kB ± 0%    2.89kB ± 0%    ~    (p=1.000 n=1+1)
Format100     1.67MB ± 0%    1.67MB ± 0%    ~    (p=1.000 n=1+1)
Format1000     164MB ± 0%     164MB ± 0%    ~    (p=1.000 n=1+1)

name        old allocs/op  new allocs/op  delta
Format10        16.0 ± 0%      19.0 ± 0%    ~    (p=1.000 n=1+1)
Format100        137 ± 0%        49 ± 0%    ~    (p=1.000 n=1+1)
Format1000     1.07k ± 0%     0.08k ± 0%    ~    (p=1.000 n=1+1)
```
i.e.:
```
$> go test -run=NONE -bench=Format -benchmem -count=20 |& tee ref.txt
## apply optimizations
$> go test -run=NONE -bench=Format -benchmem -count=20 |& tee new.txt
$> benchstat ./ref.txt ./new.txt
```
thanks.
PS: `benchstat` is available here: [golang.org/x/perf/benchstat](https://golang.org/x/perf/benchstat)
@wamuir Thank you for doing this. I've only taken a brief look, but it looks like nice work so far. I hope to be able to have a deeper look over the weekend.
Thanks @sbinet, I wasn't aware of benchstat. Cool. Output is below. I added `-benchtime=1x`, because otherwise each reported result appears to be a pooled mean; if so, I'd think the dispersion of time/op (±) would otherwise be understated. I guess that's a nit on `go test` for not having a flag to output variance (or a nit on me for not being able to find it).
name | old time/op | new time/op | delta |
---|---|---|---|
Format10 | 44.2µs ±17% | 73.9µs ±38% | +67.07% (p=0.000 n=20+18) |
Format100 | 3.28ms ± 3% | 7.55ms ± 5% | +130.21% (p=0.000 n=20+18) |
Format1000 | 329ms ± 1% | 764ms ± 2% | +132.17% (p=0.000 n=20+20) |
The increase in clock time is statistically significant. @kortschak provided review comments, which I'll work through over the weekend before rerunning the benchmarks. One of these comments is to use micro-benchmarking, and I expect this will offer greater insight into the differences. Addressing his comments may also close the current gap (e.g., his comment regarding `fmt.Fprint` and `reflect`).
The most recent commit addresses the review comments, except for the switch to sub-benchmarks; I'll need to push a second commit to update the benchmarks. I attempted some further simplification of the format functions, which I think was productive, and I added comments along the way. Updated benchstat output is below, and a rough sketch of the sub-benchmark layout follows the tables.
name | old time/op | new time/op | delta |
---|---|---|---|
Format10 | 44.9µs ±10% | 62.9µs ±31% | +40.08% (p=0.000 n=18+20) |
Format100 | 3.31ms ± 1% | 4.27ms ± 1% | +29.20% (p=0.000 n=18+20) |
Format1000 | 326ms ± 1% | 426ms ± 1% | +30.41% (p=0.000 n=20+20) |
name | old time/op | new time/op | delta |
---|---|---|---|
Format/10/General | 52.6µs ±17% | 68.8µs ±18% | +30.74% (p=0.000 n=19+19) |
Format/10/DotByte | 49.5µs ±21% | 67.9µs ±17% | +37.19% (p=0.000 n=19+20) |
Format/10/Excerpt | 50.3µs ±11% | 66.1µs ±21% | +31.40% (p=0.000 n=19+20) |
Format/10/Prefix | 49.4µs ±10% | 66.4µs ±17% | +34.54% (p=0.000 n=17+20) |
Format/10/Squeeze | 53.0µs ±22% | 69.7µs ±14% | +31.49% (p=0.000 n=20+20) |
Format/10/MATLAB | 51.5µs ±12% | 66.4µs ±12% | +28.79% (p=0.000 n=20+19) |
Format/10/MATLAB# | 43.0µs ±18% | 42.9µs ±23% | ~ (p=0.967 n=20+20) |
Format/10/Python | 50.4µs ±13% | 64.0µs ± 9% | +26.99% (p=0.000 n=20+19) |
Format/10/Python# | 42.4µs ±37% | 43.1µs ±30% | ~ (p=0.757 n=20+20) |
Format/100/General | 3.13ms ± 7% | 4.43ms ± 9% | +41.42% (p=0.000 n=17+20) |
Format/100/DotByte | 3.94ms ± 1% | 4.35ms ± 8% | +10.32% (p=0.000 n=19+20) |
Format/100/Excerpt | 4.00ms ± 1% | 4.54ms ± 6% | +13.45% (p=0.000 n=20+20) |
Format/100/Prefix | 4.00ms ± 1% | 5.61ms ± 1% | +40.23% (p=0.000 n=20+16) |
Format/100/Squeeze | 4.42ms ± 1% | 5.35ms ±18% | +21.01% (p=0.000 n=14+20) |
Format/100/MATLAB | 4.68ms ±15% | 6.46ms ±13% | +37.92% (p=0.000 n=20+20) |
Format/100/MATLAB# | 2.91ms ± 3% | 2.90ms ± 2% | ~ (p=0.247 n=20+20) |
Format/100/Python | 6.33ms ±16% | 6.38ms ±35% | ~ (p=0.890 n=20+20) |
Format/100/Python# | 3.56ms ±13% | 1.69ms ± 1% | -52.45% (p=0.000 n=20+19) |
Format/1000/General | 306ms ± 0% | 405ms ± 0% | +32.08% (p=0.000 n=19+20) |
Format/1000/DotByte | 303ms ± 0% | 392ms ± 0% | +29.54% (p=0.000 n=20+20) |
Format/1000/Excerpt | 308ms ± 1% | 405ms ± 0% | +31.76% (p=0.000 n=20+20) |
Format/1000/Prefix | 307ms ± 0% | 404ms ± 0% | +31.67% (p=0.000 n=20+20) |
Format/1000/Squeeze | 307ms ± 0% | 405ms ± 0% | +31.83% (p=0.000 n=20+20) |
Format/1000/MATLAB | 307ms ± 0% | 405ms ± 0% | +31.99% (p=0.000 n=19+20) |
Format/1000/MATLAB# | 164ms ± 1% | 164ms ± 1% | ~ (p=0.561 n=19+19) |
Format/1000/Python | 308ms ± 1% | 406ms ± 1% | +31.86% (p=0.000 n=20+18) |
Format/1000/Python# | 164ms ± 1% | 164ms ± 1% | ~ (p=0.604 n=20+19) |
[Geo mean] | 3.74ms | 4.43ms | +18.29% |
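As an aside, the `Format/<size>/<syntax>` names in the table above come from nested sub-benchmarks via `b.Run`. Here is a sketch of that layout, using existing format options as assumed cases rather than the PR's actual table of cases:

```go
package mat_test

import (
	"fmt"
	"io"
	"testing"

	"gonum.org/v1/gonum/mat"
)

// BenchmarkFormat sketches a Format/<size>/<case> sub-benchmark layout.
// The cases below use existing format options; the PR's cases (DotByte,
// Excerpt, MATLAB, Python, ...) may be constructed differently.
func BenchmarkFormat(b *testing.B) {
	for _, n := range []int{10, 100, 1000} {
		data := make([]float64, n*n)
		for i := range data {
			data[i] = float64(i)
		}
		m := mat.NewDense(n, n, data)

		cases := []struct {
			name string
			f    fmt.Formatter
		}{
			{"General", mat.Formatted(m)},
			{"Prefix", mat.Formatted(m, mat.Prefix("    "))},
			{"Squeeze", mat.Formatted(m, mat.Squeeze())},
		}
		for _, c := range cases {
			b.Run(fmt.Sprintf("%d/%s", n, c.name), func(b *testing.B) {
				for i := 0; i < b.N; i++ {
					fmt.Fprintf(io.Discard, "%v", c.f)
				}
			})
		}
	}
}
```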
This PR has gone a bit stale. @wamuir are you still interested in submitting this change?
@vladimir-ch Yes. I think this needs another review. I can make revisions.