Sameer Agarwal
Currently we use CG, or more precisely CGNR, for solving problems with general sparsity. LSQR and more recently LSRN are reputed to have better numerical behavior and should be used...
Michael Saunders argues that MINRES may be better suited to solving linear systems where the aim is to reduce the residual rather than the quadratic form associated with it. This...
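For reference, the distinction between the two objectives can be written out as follows. This is the standard textbook characterization, not something taken from the issue itself; the symbols $x_0$, $r_0 = b - A x_0$ and the Krylov subspace $\mathcal{K}_k$ are notation introduced here for illustration.

```latex
% CG (A symmetric positive definite): minimizes the quadratic form
x_k^{\mathrm{CG}} = \arg\min_{x \in x_0 + \mathcal{K}_k(A, r_0)}
    \tfrac{1}{2} x^\top A x - b^\top x

% MINRES (A symmetric, possibly indefinite): minimizes the residual norm
x_k^{\mathrm{MINRES}} = \arg\min_{x \in x_0 + \mathcal{K}_k(A, r_0)}
    \lVert b - A x \rVert_2

% Both search the same subspace
\mathcal{K}_k(A, r_0) = \operatorname{span}\{ r_0, A r_0, \dots, A^{k-1} r_0 \}
```

Both methods search the same subspace at step $k$, so the difference lies only in which quantity is guaranteed to decrease monotonically.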
Here is a minimal repro. Let's call it `trace_fmt_repro.x`. ``` pub const TWENTYFIVE = u8:25; pub fn foo() -> u8 { let _ = trace_fmt!("{}", TWENTYFIVE); u8:1 } #![test] fn...
Currently, one can construct a literal s8:128, even though 128 cannot fit in a signed 8-bit integer. In fact, what happens is that the bit pattern of 128 is used...
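To illustrate why this is surprising, here is a small C++ sketch of what reusing the bit pattern would mean, assuming the 8 bits are reinterpreted as two's complement (the truncated text above suggests this but does not confirm it). The helper names `InterpretAsSigned8` and `FitsInSigned8` are hypothetical and are not part of any DSLX tooling.

```cpp
#include <cassert>

// Interprets the low 8 bits of `value` as a two's complement signed integer.
// Hypothetical helper for illustration only.
int InterpretAsSigned8(unsigned value) {
  assert(value <= 0xFFu);  // the literal must fit in 8 bits
  // Bit 7 set means the value is negative: subtract 2^8.
  return value >= 0x80u ? static_cast<int>(value) - 256
                        : static_cast<int>(value);
}

// True when `value` is representable as a signed 8-bit integer,
// i.e. lies in the range [-128, 127].
bool FitsInSigned8(int value) { return value >= -128 && value <= 127; }
```

Under that assumption, `FitsInSigned8(128)` is false while `InterpretAsSigned8(128)` yields -128, so a range check of this kind at literal construction time would catch the mismatch.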
Currently, if you have a variable `x` that is a `u8` and has value 3, printing it using `trace_fmt!("{:b}",x)` will print out `11`. It would be nice to be able...
With @DmitriyKorchemkin's ParallelFor changes, and C++17 being the standard model we are compiling with, there is no reason to keep these two backends in place. They are not really tested...
Right multiplication should be fairly straightforward to speed up, since we can process each row block in parallel. Left multiplication will require a scatter and gather.
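A minimal sketch of the two cases, assuming a plain compressed-row-storage matrix and `std::thread`. The `CrsMatrix` struct and the `ForEachRowRange`, `RightMultiply` and `LeftMultiply` helpers are hypothetical names introduced for illustration and are not the library's block sparse types or its ParallelFor. The scatter and gather is done here with one private buffer per thread, which is only one possible strategy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical compressed-row-storage matrix, for illustration only.
struct CrsMatrix {
  std::size_t num_rows = 0;
  std::size_t num_cols = 0;
  std::vector<std::size_t> row_ptr;  // size num_rows + 1
  std::vector<std::size_t> col_idx;  // size nnz
  std::vector<double> values;        // size nnz
};

// Calls fn(thread_id, begin, end) on contiguous row ranges, one per thread.
template <typename Fn>
void ForEachRowRange(std::size_t num_rows, std::size_t num_threads, Fn fn) {
  num_threads = std::max<std::size_t>(1, num_threads);
  const std::size_t chunk = (num_rows + num_threads - 1) / num_threads;
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < num_threads; ++t) {
    const std::size_t begin = std::min(num_rows, t * chunk);
    const std::size_t end = std::min(num_rows, begin + chunk);
    workers.emplace_back(fn, t, begin, end);
  }
  for (auto& w : workers) w.join();
}

// y += A * x. Row r writes only y[r], so rows need no synchronization.
void RightMultiply(const CrsMatrix& a, const std::vector<double>& x,
                   std::vector<double>& y, std::size_t num_threads) {
  assert(x.size() == a.num_cols && y.size() == a.num_rows);
  ForEachRowRange(a.num_rows, num_threads,
                  [&](std::size_t, std::size_t begin, std::size_t end) {
    for (std::size_t r = begin; r < end; ++r) {
      double sum = 0.0;
      for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
        sum += a.values[k] * x[a.col_idx[k]];
      }
      y[r] += sum;
    }
  });
}

// y += A^T * x. Different rows write to the same entries of y, so each
// thread scatters into a private buffer and the buffers are gathered after.
void LeftMultiply(const CrsMatrix& a, const std::vector<double>& x,
                  std::vector<double>& y, std::size_t num_threads) {
  assert(x.size() == a.num_rows && y.size() == a.num_cols);
  num_threads = std::max<std::size_t>(1, num_threads);
  std::vector<std::vector<double>> partial(
      num_threads, std::vector<double>(a.num_cols, 0.0));
  // Scatter: each thread accumulates its rows into partial[t].
  ForEachRowRange(a.num_rows, num_threads,
                  [&](std::size_t t, std::size_t begin, std::size_t end) {
    for (std::size_t r = begin; r < end; ++r) {
      for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
        partial[t][a.col_idx[k]] += a.values[k] * x[r];
      }
    }
  });
  // Gather: sum the per-thread buffers into y (serial in this sketch).
  for (const auto& p : partial) {
    for (std::size_t c = 0; c < a.num_cols; ++c) y[c] += p[c];
  }
}
```

The per-thread buffers cost memory proportional to the number of threads times the number of columns, and the gather step is serial here. A real implementation might instead partition by column blocks or keep a transposed structure, which trades that memory for extra bookkeeping.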
Currently we package googletest and gmock using a now-obsolete method that generated single headers from the repo. This is not supported anymore. The modern instructions for using googletest are...
https://cmake.org/cmake/help/latest/module/FeatureSummary.html
The IDENTITY, JACOBI, SCHUR_JACOBI and SCHUR_POWER_SERIES_EXPANSION preconditioners should all be possible to run on the GPU relatively easily. The key idea is to export E, F, inverse(E'E), F'F and inverse(F'F), schur_jacobi...