rulinalg icon indicating copy to clipboard operation
rulinalg copied to clipboard

In place transposition for Matrix (and MatrixSliceMut)

Open AtheMathmo opened this issue 9 years ago • 6 comments

From @AtheMathmo on July 6, 2016 22:10

We currently have a transpose(&self) -> Matrix<T> method. It would be nice if we could do this operation in place, i.e. in_place_transpose(&mut self).

We can achieve this by swapping places in the data. We could do this fairly easily by utilizing the existing slice::swap method. This is acceptable but will include unnecessary bound checking, to speed things up we could use the underlying technique from that swap method:

unsafe {
    let pa: *mut T = self.get_unchecked_mut(i);
    let pb: *mut T = self.get_unchecked_mut(j);
    ptr::swap(pa, pb);
}

It would be nice to have this method in place for both Matrix and MatrixSliceMut. The Matrix implementation could simply call self.as_mut_slice().in_place_transpose().

To resolve this issue

  1. In place transposition for square Matrix.
  2. In place transposition for square MatrixSliceMut.
  3. Tests for both.

Bonus points

  1. Using ptr::swap to avoid bounds checking.
  2. Explore vectorization and cache-line utilization.
  3. Explore more advanced techniques.
  4. Use permutations to transpose non-square matrices.

Copied from original issue: AtheMathmo/rusty-machine#97

AtheMathmo avatar Jul 12 '16 02:07 AtheMathmo

These slides provide an overview of a good (and relatively simple) implementation: http://www.netlib.org/utk/people/JackDongarra/CCDSC-2014/talk35.pdf

There is a paper linked in the article which may be a valuable read.

Edit: I think I'll try implementing this paper. It looks relatively simple to get working!

AtheMathmo avatar Aug 05 '16 19:08 AtheMathmo

I've got an initial (messy, slow) implementation of the algorithm in the linked paper here: https://github.com/theotherphil/rulinalg/commit/4f2355992487856c1e02662627efa9d9f4c875fa

I'll tidy and speed it up before creating a pull request.

theotherphil avatar May 28 '17 21:05 theotherphil

Hmmm. I've just found this: https://athemathmo.github.io/2016/08/29/inplace-transpose.html

Looks like you've already done this! But it doesn't appear to have been merged.

theotherphil avatar May 29 '17 12:05 theotherphil

Hey @theotherphil ! Sorry that I'm slow to reply.

Yes I had already tried implementing this and you're right that it is slower than out-of-place transpose, which is expected. I had not merged it yet due to the need for in-place asm and/or some 128 bit integer work. Once these are easier to support on all systems I'd be happy to bring the old branch up to master and merge.

Sorry that I didn't comment sooner to let you know.

AtheMathmo avatar May 29 '17 13:05 AtheMathmo

I was googling for 128 bit integer support in Rust when I found your article! It looks like 128 bit integers should be getting stabilised soon. Fingers crossed.

theotherphil avatar May 29 '17 13:05 theotherphil

The fast divide trick is already covered by an existing (nightly-only) crate. Might be worth reusing that when u128 is stabilised. https://github.com/fulmicoton/fastdivide

theotherphil avatar Jun 17 '17 22:06 theotherphil