h5cpp
h5cpp copied to clipboard
writing an armadillo matrix produces a transposed matrix in h5
Consider this code:
int main()
{
arma::mat M = arma::linspace(1, 10, 10);
M.reshape(2, 5);
std::cout << "my M shape is:"
<< "\n";
M.brief_print();
auto h = h5::create("test.h5", H5F_ACC_TRUNC);
h5::write(h, "/test", M);
std::cout << "reading back..."
<< "\n";
M = h5::read<arma::mat>("test.h5", "/test");
std::cout << "my M shape is:"
<< "\n";
M.brief_print();
return 0;
}
That give this output:
my M shape is:
[matrix size: 2x5]
1.0000 3.0000 5.0000 7.0000 9.0000
2.0000 4.0000 6.0000 8.0000 10.0000
reading back...
my M shape is:
[matrix size: 5x2]
1.0000 6.0000
2.0000 7.0000
3.0000 8.0000
4.0000 9.0000
5.0000 10.0000
That looks wrong in h5dump:
HDF5 "test.h5" {
GROUP "/" {
DATASET "test" {
DATATYPE H5T_IEEE_F64LE
DATASPACE SIMPLE { ( 5, 2 ) / ( 5, 2 ) }
DATA {
(0,0): 1, 2,
(1,0): 3, 4,
(2,0): 5, 6,
(3,0): 7, 8,
(4,0): 9, 10
}
}
}
}
What I expected is:
my M shape is:
[matrix size: 2x5]
1.0000 3.0000 5.0000 7.0000 9.0000
2.0000 4.0000 6.0000 8.0000 10.0000
reading back...
my M shape is:
[matrix size: 2x5]
1.0000 3.0000 5.0000 7.0000 9.0000
2.0000 4.0000 6.0000 8.0000 10.0000
Which is fine in h5dump too:
HDF5 "test.h5" {
GROUP "/" {
DATASET "test" {
DATATYPE H5T_IEEE_F64LE
DATASPACE SIMPLE { ( 2, 5 ) / ( 2, 5 ) }
DATA {
(0,0): 1, 2, 3, 4, 5,
(1,0): 6, 7, 8, 9, 10
}
}
}
}
To have what I expected I must revert the commit "arma matrix rows/cols swapped to correct C/Fortran order mismatch" (be0a1918156433457ded8e33206ba74438e107e4)
Can you confirm that something is wrong in the library or I'm doing something stupid? :)
Great find! Hmmm there is more to this, get to it in a sec... The top one shouldn't be happening: write - read == 0
whereas the h5dump case is not quite straightforward.
The first case is a clear bug. Thanks for letting me know! The second one is complicated: Linear algebra libraries mark the layout with transpose
flag but HDF5 doesn't have this feature, leading to:
- give up zero copy (performance) and copy each dataset from col major to row major. That is lot's of copying! Considering that the majority of scientific platforms are colmajor. In cases where not enough memory is left then it will fail
- as a feature request: convince the HDFGroup to add a flag marking
transpose
Will take a closer look when I can, until I am adding @gheber to the conversation. best: steve
Ok, let me know if you need some tests from me. Then I wish HDFGroup consider to add a transpose flag! In the meanwhile I'll transpose the data before writing them, or I'll do the trick of reverting be0a191