sarah quiñones comments

Results: 105 comments of sarah quiñones

`zip` does not satisfy the semantic requirements of `bidirectional_iterator` when the ranges have different lengths

one potential solution would be to "align" the end iterators (and maybe cache the result?) when `end` is computed, if the ranges are all sized. if the ranges aren't sized,...
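The "align the end iterators" idea for sized ranges can be illustrated with a small sketch. It is written in Rust on plain slices rather than on the C++ ranges the comment refers to, and the helper names `naive_rev_zip` and `aligned_rev_zip` are hypothetical, chosen only for this illustration. The point is that stepping each range back from its own end pairs up the wrong elements when the lengths differ, while trimming every range to the common length first makes backward traversal the mirror image of forward traversal.

```rust
/// Steps each slice back from its own end, without aligning the ends.
/// With inputs of different lengths this pairs up elements that a
/// forward zip would never put together.
fn naive_rev_zip(a: &[i32], b: &[char]) -> Vec<(i32, char)> {
    a.iter()
        .rev()
        .zip(b.iter().rev())
        .map(|(&x, &y)| (x, y))
        .collect()
}

/// Aligns the ends first: both slices are sized, so the common length
/// is known up front and the surplus tail of the longer one is dropped.
fn aligned_rev_zip(a: &[i32], b: &[char]) -> Vec<(i32, char)> {
    let n = a.len().min(b.len());
    a[..n]
        .iter()
        .rev()
        .zip(b[..n].iter().rev())
        .map(|(&x, &y)| (x, y))
        .collect()
}
```

For `a = [1, 2, 3, 4]` and `b = ['a', 'b']`, the forward zip yields `(1, 'a'), (2, 'b')`. The aligned version walks exactly those pairs in reverse, whereas the naive version starts from `(4, 'b')`, a pair that forward iteration never produces.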

internal iteration for `&mut I`

well, these don't look like the most promising results ^^'

internal iteration for `&mut I`

are there tests that do?

internal iteration for `&mut I`

rustc-perf seems to take forever on my machine and i can't display the results after it's finished. so that doesn't seem like a good option for me :/

internal iteration for `&mut I`

thanks for the tips! i managed to get it working thanks to your help. it seems that the biggest culprit was inlining the ops::function wrappers. but even without it i...

loads and stores with vectors can read/write more than the vector size

it's not just sizes smaller than the smallest native size. for example, the above code generates the correct instructions if we use `float64`, but not with `float64` (loads/stores 4 `double`s)...

Proposal: Provide nalgebra wrappers to `faer` factorizations

i like the idea of leaving it as an experimental feature for now, since faer is relatively new as a library and the api might still change in the future....

replace matrixmultiply with gemm

matmul benchmark results on my machine after enabling `gemm`:
```
mat100_mul_mat100       time:   [39.299 µs 39.346 µs 39.398 µs]
                        change: [-4.7112% -4.5478% -4.3846%] (p = 0.00 < 0.05)
                        Performance has improved....
```

replace matrixmultiply with gemm

the CI seems to be failing on cuda but i'm not sure what's causing it. it says it can't find `aligned-vec = "0.5"`, but it's right there on crates.io: https://crates.io/crates/aligned-vec

replace matrixmultiply with gemm

so, gemm is based on the BLIS papers, which can be found here: https://github.com/flame/blis#citations. it's more or less a faithful implementation of the algorithms described in the fourth paper. i'm...
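As rough background on the BLIS approach, the sketch below shows only its simplest ingredient, cache blocking (tiling) of the three matrix-multiply loops. It is not the `gemm` crate's code and not the algorithm from the cited papers: the function name `blocked_matmul`, the row-major square layout, and the single `block` parameter are assumptions made for this illustration, and the packing and micro-kernel stages that BLIS-style implementations rely on are left out entirely.

```rust
/// Computes `c += a * b` for row-major `n x n` matrices, visiting the
/// iteration space in `block x block` tiles so that the working set of
/// the inner loops stays small.
/// Hypothetical helper for illustration only, not an API of `gemm`.
fn blocked_matmul(a: &[f64], b: &[f64], c: &mut [f64], n: usize, block: usize) {
    assert!(block > 0, "block size must be positive");
    assert_eq!(a.len(), n * n);
    assert_eq!(b.len(), n * n);
    assert_eq!(c.len(), n * n);

    // Outer loops walk over tiles of rows (ii), the shared dimension (kk)
    // and columns (jj).
    for ii in (0..n).step_by(block) {
        for kk in (0..n).step_by(block) {
            for jj in (0..n).step_by(block) {
                // Inner loops do an ordinary multiply restricted to one tile;
                // `min(n)` handles tiles that stick out past the matrix edge.
                for i in ii..(ii + block).min(n) {
                    for k in kk..(kk + block).min(n) {
                        let a_ik = a[i * n + k];
                        for j in jj..(jj + block).min(n) {
                            c[i * n + j] += a_ik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

The result is the same as the plain triple loop; only the order of the updates changes. A production kernel would additionally choose separate block sizes per dimension and copy tiles into contiguous buffers before multiplying.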