LiveSPICE
Command line parameters parsing in Test, Gaussian elimination benchmarks.
Hi. I've tested how the memory layout of the arrays affects the performance of Gaussian elimination in the inner loop of the simulation lambda, and here are the results:
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1110 (21H1/May2021Update)
Intel Core i7-8550U CPU 1.80GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.6.21355.2
[Host] : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
Job-WMMCQC : .NET 6.0.0 (6.0.21.35212), X64 RyuJIT
InvocationCount=1 UnrollFactor=1
Method | M | Mean | Error | StdDev | Ratio | RatioSD | CacheMisses/Op | BranchMispredictions/Op | BranchInstructions/Op | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|---|
CompactColumnMajor | 12 | 168.1 ms | 3.26 ms | 3.62 ms | 1.09 | 0.03 | 5,895,782 | 3,725,107 | 292,493,722 | - | 336 B |
CompactRowMajor | 12 | 170.8 ms | 3.35 ms | 4.47 ms | 1.10 | 0.04 | 5,845,524 | 3,755,540 | 268,238,848 | - | 336 B |
ArrayOfArrays | 12 | 155.1 ms | 3.09 ms | 3.68 ms | 1.00 | 0.00 | 6,609,769 | 3,571,795 | 257,758,787 | - | 336 B |
ArrayOfArraysVectorized | 12 | 135.8 ms | 2.64 ms | 4.10 ms | 0.88 | 0.04 | 6,657,404 | 2,985,282 | 198,694,853 | - | 336 B |
MathNetSolve | 12 | 312.0 ms | 6.08 ms | 9.46 ms | 2.02 | 0.09 | 10,237,689 | 4,685,561 | 467,613,872 | 49000.0000 | 210,400,336 B |
I've also checked how well MathNet's LU factorization performs, and it's significantly slower (about 2x in the table above). It might be superior in cases where the A matrix is constant, which is unfortunately not our case. Looking at the results, only the vectorized method performs better than the current implementation, by about 10%. I would be happy to see how it performs on a different machine, because I'm currently testing on my laptop and thermal throttling might skew the results a little bit ;) Sidenote: messing with the equation order in MNA seems to have a bigger influence on the performance, and sometimes on the stability of the solution. I suspect it might be related to cache misses, but this requires more investigation ;)
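For readers unfamiliar with the variants being compared, here is a minimal sketch of what a vectorized array-of-arrays elimination could look like. It is not the code from this PR: the class name `GaussianSketch`, the augmented-matrix layout (n rows of n + 1 columns), and the use of `System.Numerics.Vector<double>` are all assumptions made for illustration, and singular matrices are not handled.

```csharp
using System;
using System.Numerics;

// Hypothetical illustration only; names and layout are assumptions, not LiveSPICE code.
static class GaussianSketch
{
    // Solves A x = b, where ab is a jagged augmented matrix [A | b]
    // with n rows of n + 1 columns each. The input is modified in place.
    public static double[] Solve(double[][] ab)
    {
        int n = ab.Length;
        int width = Vector<double>.Count;

        for (int i = 0; i < n; i++)
        {
            // Partial pivoting: pick the row with the largest entry in column i.
            int pivot = i;
            for (int r = i + 1; r < n; r++)
                if (Math.Abs(ab[r][i]) > Math.Abs(ab[pivot][i])) pivot = r;
            // With a jagged array, a row swap is just a reference swap.
            (ab[i], ab[pivot]) = (ab[pivot], ab[i]);

            double[] pivotRow = ab[i];
            for (int r = i + 1; r < n; r++)
            {
                double[] row = ab[r];
                double f = row[i] / pivotRow[i];
                if (f == 0.0) continue;

                int c = i;
                // SIMD part: row[c..] -= pivotRow[c..] * f, one vector at a time.
                for (; c <= n + 1 - width; c += width)
                {
                    var v = new Vector<double>(row, c) - new Vector<double>(pivotRow, c) * f;
                    v.CopyTo(row, c);
                }
                // Scalar tail for the remaining columns (including the b column).
                for (; c <= n; c++)
                    row[c] -= pivotRow[c] * f;
            }
        }

        // Back substitution on the upper-triangular system.
        var x = new double[n];
        for (int i = n - 1; i >= 0; i--)
        {
            double s = ab[i][n];
            for (int c = i + 1; c < n; c++)
                s -= ab[i][c] * x[c];
            x[i] = s / ab[i][i];
        }
        return x;
    }
}
```

For example, calling `GaussianSketch.Solve` on the rows `{2, 1, 5}` and `{1, 3, 10}` solves 2x + y = 5, x + 3y = 10. With M = 12 each row holds only 13 doubles, so the vector loop covers at most a few iterations per row, which may be part of why the measured gain is modest.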
I've also managed to fix this little TODO:
// TODO: Make these command line arguments.
using the fresh beta version of the System.CommandLine package ;)
Really nice work, thanks for this! I have a few suggestions on the benchmarks that might affect the results.
For one of the benchmarks, it would be nice to actually try calling
https://github.com/dsharlet/LiveSPICE/blob/0d59564ea4ab6dc64c0e7e9d16dadb2d050e4ae4/Circuit/Simulation/Simulation.cs#L453 so we see how the currently used method compares.
It's the one called 'ArrayOfArrays' with the ratio of 1.00 😉
Right, I understand it's the same implementation strategy, but it's hard to be sure it's actually the same :) (now and in the future).
I think I've got it 😄 I've added the current implementation as a benchmark.
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1288 (21H1/May2021Update)
Intel Core i7-8550U CPU 1.80GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-rc.1.21458.32
[Host] : .NET 5.0.10 (5.0.1021.41214), X64 RyuJIT
Job-IUEOEC : .NET 5.0.10 (5.0.1021.41214), X64 RyuJIT
InvocationCount=1 UnrollFactor=1
Method | M | Mean | Error | StdDev | Ratio | RatioSD | BranchInstructions/Op | CacheMisses/Op | BranchMispredictions/Op | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|---|
CompactColumnMajor | 12 | 161.7 ms | 2.47 ms | 2.19 ms | 1.18 | 0.03 | 292,431,462 | 5,869,841 | 3,769,003 | - | - |
CompactRowMajor | 12 | 147.2 ms | 2.63 ms | 2.33 ms | 1.08 | 0.03 | 267,941,751 | 5,904,725 | 3,291,887 | - | - |
Current | 12 | 135.6 ms | 2.68 ms | 3.58 ms | 1.00 | 0.00 | 257,833,697 | 6,623,519 | 3,595,141 | - | - |
ArrayOfArrays | 12 | 142.9 ms | 2.80 ms | 4.59 ms | 1.06 | 0.04 | 257,580,038 | 6,666,079 | 3,448,328 | - | - |
ArrayOfArraysVectorized | 12 | 116.2 ms | 2.31 ms | 3.09 ms | 0.86 | 0.03 | 205,933,773 | 6,618,399 | 3,122,586 | - | - |
MathNetSolve | 12 | 283.7 ms | 5.58 ms | 9.01 ms | 2.08 | 0.09 | 466,561,477 | 9,956,233 | 4,458,830 | 49000.0000 | 210,400,000 B |