rulinalg In place transposition for Matrix (and MatrixSliceMut)

From @AtheMathmo on July 6, 2016 22:10

We currently have a transpose(&self) -> Matrix<T> method. It would be nice if we could do this operation in place, i.e. in_place_transpose(&mut self).

We can achieve this by swapping places in the data. We could do this fairly easily by utilizing the existing slice::swap method. This is acceptable but will include unnecessary bound checking, to speed things up we could use the underlying technique from that swap method:

unsafe {
    let pa: *mut T = self.get_unchecked_mut(i);
    let pb: *mut T = self.get_unchecked_mut(j);
    ptr::swap(pa, pb);
}

It would be nice to have this method in place for both Matrix and MatrixSliceMut. The Matrix implementation could simply call self.as_mut_slice().in_place_transpose().

To resolve this issue

In place transposition for square Matrix.
In place transposition for square MatrixSliceMut.
Tests for both.

Bonus points

Using ptr::swap to avoid bounds checking.
Explore vectorization and cache-line utilization.
Explore more advanced techniques.
Use permutations to transpose non-square matrices.

Copied from original issue: AtheMathmo/rusty-machine#97

Jul 12 '16 02:07 AtheMathmo

These slides provide an overview of a good (and relatively simple) implementation: http://www.netlib.org/utk/people/JackDongarra/CCDSC-2014/talk35.pdf

There is a paper linked in the article which may be a valuable read.

Edit: I think I'll try implementing this paper. It looks relatively simple to get working!

Aug 05 '16 19:08 AtheMathmo

I've got an initial (messy, slow) implementation of the algorithm in the linked paper here: https://github.com/theotherphil/rulinalg/commit/4f2355992487856c1e02662627efa9d9f4c875fa

I'll tidy and speed it up before creating a pull request.

May 28 '17 21:05 theotherphil

Hmmm. I've just found this: https://athemathmo.github.io/2016/08/29/inplace-transpose.html

Looks like you've already done this! But it doesn't appear to have been merged.

May 29 '17 12:05 theotherphil

Hey @theotherphil ! Sorry that I'm slow to reply.

Yes I had already tried implementing this and you're right that it is slower than out-of-place transpose, which is expected. I had not merged it yet due to the need for in-place asm and/or some 128 bit integer work. Once these are easier to support on all systems I'd be happy to bring the old branch up to master and merge.

Sorry that I didn't comment sooner to let you know.

May 29 '17 13:05 AtheMathmo

I was googling for 128 bit integer support in Rust when I found your article! It looks like 128 bit integers should be getting stabilised soon. Fingers crossed.

May 29 '17 13:05 theotherphil

The fast divide trick is already covered by an existing (nightly-only) crate. Might be worth reusing that when u128 is stabilised. https://github.com/fulmicoton/fastdivide

Jun 17 '17 22:06 theotherphil

rulinalg rulinalg copied to clipboard

In place transposition for Matrix (and MatrixSliceMut)

To resolve this issue

Bonus points

rulinalg
rulinalg copied to clipboard