Andrew Robbins comments

Results: 113 comments of Andrew Robbins

Added scaffolding for Oryon arch as in Snapdragon X Elite

Extremely unscientific runthrough:

Stock Rblas.dll:
```
& "C:\Program Files\R-aarch64\R-4.5.2\bin\Rscript.exe" .\benchmark\scripts\R\deig.R &&
& "C:\Program Files\R-aarch64\R-4.5.2\bin\Rscript.exe" .\benchmark\scripts\R\dgemm.R &&
& "C:\Program Files\R-aarch64\R-4.5.2\bin\Rscript.exe" .\benchmark\scripts\R\dsolve.R
From 128 To 2048 Step=128 Loops=1
SIZE Flops Time
128x128...
```

Added scaffolding for Oryon arch as in Snapdragon X Elite

Gonna be completely honest here: I can't quite tell. Looks like there are some sizes for which it performs better and some for which it performs worse. Any recs for drilling down...

Added scaffolding for Oryon arch as in Snapdragon X Elite

3.30.0dev
```
➜ & "C:\Program Files\R-aarch64\R-4.5.2\bin\Rscript.exe" .\benchmark\scripts\R\deig.R
From 128 To 2048 Step=128 Loops=10
SIZE Flops Time
128x128 : 6988.76 MFlops 0.080000 sec
256x256 : 9516.61 MFlops 0.470000 sec
384x384 :...
```

Added scaffolding for Oryon arch as in Snapdragon X Elite

I think there's definitely _something_ here, judging by the decent improvement at certain matrix sizes, but this is not it, judging by the degraded performance at _other_ matrix sizes. May...
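One way to see which matrix sizes improve and which regress is to line up two benchmark runs size by size. The sketch below is only an illustration, not part of the PR: it assumes both runs were saved as text containing rows in the `128x128 : 6988.76 MFlops 0.080000 sec` format shown in the output above, and the names `parse_results` and `compare` are hypothetical helpers.

```python
import re

# Matches result rows such as "128x128 : 6988.76 MFlops 0.080000 sec".
ROW = re.compile(
    r"(\d+)x(\d+)\s*:\s*(\d+(?:\.\d+)?)\s*MFlops\s+(\d+(?:\.\d+)?)\s*sec"
)


def parse_results(text):
    """Return {matrix_size: mflops} for every result row found in text."""
    return {int(m.group(1)): float(m.group(3)) for m in ROW.finditer(text)}


def compare(baseline, candidate):
    """Return [(size, ratio)] for sizes present in both runs.

    ratio > 1 means the candidate build reached more MFlops at that size,
    ratio < 1 means it regressed.
    """
    shared = sorted(baseline.keys() & candidate.keys())
    return [(n, candidate[n] / baseline[n]) for n in shared]
```

Feeding the stock and patched outputs through `parse_results` and then `compare` gives one ratio per size, which makes it easier to tell whether the regressions cluster around particular sizes or are scattered like noise.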

Added scaffolding for Oryon arch as in Snapdragon X Elite

...I had an idea. This is an 8-wide chip, while Neoverse is 5-wide. I wonder what happens if I run the VORTEX target (which is 7-wide and should be otherwise compatible...

Added scaffolding for Oryon arch as in Snapdragon X Elite

Scratch that, it would do nothing, as there's no difference.

Added scaffolding for Oryon arch as in Snapdragon X Elite

Yeah, and even if there is optimization here (and there almost certainly is), I don't even know that the cache sizes are an improvement.

Added scaffolding for Oryon arch as in Snapdragon X Elite

> Probably needs larger loops to get more stable benchmark results. I do have an Oryon system on loan from Qualcomm, it's just that I'm away from it at the...

Added scaffolding for Oryon arch as in Snapdragon X Elite

> I'd _guess_ a hundred instead of ten should help

Will report back. With bonus ArmPL for comparison.
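To judge whether a given loop count is enough, it can help to look at how much the individual timings scatter. The following is a minimal sketch of that idea, not how the OpenBLAS benchmark scripts themselves work: `time_loops` and `summarize` are hypothetical helpers, and `fn` stands in for whatever operation is being measured.

```python
import statistics
import time


def time_loops(fn, loops):
    """Call fn() `loops` times and return the per-call wall-clock times in seconds."""
    samples = []
    for _ in range(loops):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (median, relative_spread), with spread = (max - min) / median."""
    med = statistics.median(samples)
    spread = (max(samples) - min(samples)) / med if med > 0 else float("inf")
    return med, spread
```

If the relative spread stays large when going from ten to a hundred loops, the noise is probably coming from something other than the loop count, such as background load or thermal throttling.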

Added scaffolding for Oryon arch as in Snapdragon X Elite

So, it turns out the issue was mostly that running BLAS on 12 cores _well_ exceeds the cooling capacity of my laptop. Fixed that one. Anyway: Seems that there's a...
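The comment does not say how the thermal problem was fixed. One common option, sketched here purely as an assumption, is to cap the number of BLAS threads for the benchmark process. This relies on the BLAS in use being an OpenBLAS build, which reads the `OPENBLAS_NUM_THREADS` environment variable; `blas_env` and `run_capped` are hypothetical helper names.

```python
import os
import subprocess


def blas_env(threads, base=None):
    """Return a copy of the environment with OpenBLAS capped at `threads` threads."""
    env = dict(os.environ if base is None else base)
    env["OPENBLAS_NUM_THREADS"] = str(threads)
    return env


def run_capped(command, threads):
    """Run `command` (a list of arguments) with the thread cap and return its stdout."""
    result = subprocess.run(
        command,
        env=blas_env(threads),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_capped([rscript_path, script_path], 4)` would run one benchmark script on four threads, where `rscript_path` and `script_path` are placeholders for the Rscript executable and the benchmark script.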