TileDB
How to efficiently read a float array from TileDB into a double buffer
Right now I have to read it into a temporary float buffer:
vector<float> buf(nrow * ncol);
query.set_buffer("mat", buf);
Then I manually copy it over to the double buffer:
arma::Mat<double> data(nrow, ncol);
for (int i = 0; i < nrow * ncol; i++)
    data.memptr()[i] = buf[i];
According to my profiling, the copying is a significant overhead. I wonder if there is a better way. By the way, libhdf5's read API supports reading HDF5 data into a buffer of any arbitrary type,
dataset.read(data.memptr(), PredType::NATIVE_DOUBLE, memspace, dataspace);
and I believe it is doing the conversion internally, but it is a lot faster than my own copying.
@mikejiang, depending on what guarantees Armadillo makes (or does not), pulling the data.memptr() access out of the loop may help in the interim.
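A minimal sketch of that suggestion, using only the standard library so it stands alone. The helper name widen_to_double is hypothetical (it is not part of TileDB or Armadillo); the destination pointer is assumed to be something like data.memptr(), pointing at storage with room for at least src.size() doubles.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of TileDB or Armadillo): widen a float
// buffer into caller-owned double storage. The caller must guarantee that
// dst has room for at least src.size() elements.
inline void widen_to_double(const std::vector<float>& src, double* dst) {
    // Fetch the raw pointer and length once, outside the copy loop.
    const float* in = src.data();
    const std::size_t n = src.size();
    // std::copy converts each float to double on assignment.
    std::copy(in, in + n, dst);
}
```

With the buffers from the question this would be called as widen_to_double(buf, data.memptr()). Whether it beats the original loop depends on the compiler and optimization level, so it is worth profiling rather than assuming.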
Thanks @ihnorton. But that doesn't make a difference based on my testing.
Just in case: I assume you know Armadillo is perfectly happy with float too?
Depending on what you are doing, you could possibly do the computation in float and then only propagate the final result to, say, R, which only knows double.
@eddelbuettel You are right, returning the result to R is the exact reason why I need to do double conversion :)
Just a quick guess then because Conrad is also more than mildly performance obsessed:
- Instantiate an arma::fvec fv(n), pass that to TileDB via query.set_buffer("name", fv).
- Cast the result: arma::vec dv = arma::conv_to<arma::vec>::from(fv);
Untested code, but you get the idea. But there may be tricks in the conversion (I haven't looked). Worst case it performs like your loop -- but then you still saved an explicit loop :)
(Actually, instantiate the standard STL vector, then use its known memory pointer to create the arma vector via the advanced constructors 'trusting the allocation'. That gives you an arma::fvec, so then you could cast later.)
arma::conv_to does seem to be a little faster (and indeed cleaner). Thanks, @eddelbuettel!