how to efficiently read float array from tiledb into double buffer

Open mikejiang opened this issue 4 years ago • 6 comments

Right now I have to read it into a temporary float buffer:

vector<float> buf(nrow * ncol);
query.set_buffer("mat", buf);

Then I manually copy it over to the double buffer:

arma::Mat<double> data(nrow, ncol);
for (int i = 0; i < nrow * ncol; i++)
    data.memptr()[i] = buf[i];

According to my profiling, the copying is a significant overhead. I wonder if there is a better way. Btw, libhdf5's read API supports reading h5 data into a buffer of any arbitrary type:

dataset.read(data.memptr(), PredType::NATIVE_DOUBLE, memspace, dataspace);

and I believe it is doing the conversion internally, but it is a lot faster than my own copying.

mikejiang avatar May 04 '20 21:05 mikejiang

@mikejiang, depending on what guarantees armadillo makes (or does not), pulling the data.memptr() access out of the loop may help in the interim.
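
A minimal sketch of that suggestion, not a tested fix for this issue. The helper name `widen_to_double` is hypothetical, and the destination pointer stands in for whatever `data.memptr()` returns, so the snippet needs only the standard library. The pointer is fetched once by the caller and `std::copy` does the float-to-double widening on assignment:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: widen a float buffer into a pre-allocated double buffer.
// `dst` is obtained once by the caller (e.g. from data.memptr()) rather than
// on every loop iteration, and must have room for src.size() doubles.
void widen_to_double(const std::vector<float>& src, double* dst) {
    assert(dst != nullptr);
    // Each float is implicitly converted to double when assigned.
    std::copy(src.begin(), src.end(), dst);
}
```

With the names from the original post this would be called as `widen_to_double(buf, data.memptr());`. Whether it beats the hand-written loop depends on the compiler and optimization flags, so it is worth profiling both.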

ihnorton avatar May 04 '20 22:05 ihnorton

Thanks @ihnorton. But that doesn't make a difference in my testing.

mikejiang avatar May 05 '20 00:05 mikejiang

Just in case: I assume you know Armadillo is perfectly happy with float too?

Depending on what you are doing, you could possibly do the computation in float and then only propagate the final result to, say, R, which only knows double.

eddelbuettel avatar May 05 '20 00:05 eddelbuettel

@eddelbuettel You are right, returning the result to R is the exact reason why I need to do double conversion :)

mikejiang avatar May 05 '20 00:05 mikejiang

Just a quick guess then because Conrad is also more than mildly performance obsessed:

  • Instantiate an arma::fvec fv(n), pass that to TileDB via query.set_buffer("name", fv).
  • Cast the result: arma::vec dv = arma::conv_to<arma::vec>::from(fv);

Untested code, but you get the idea. But there may be tricks in the conversion (I haven't looked). Worst case it performs like your loop -- but then you still saved an explicit loop :)

(Actually, instantiate the standard STL vector, then use its data pointer (data()) to create the arma vector via the advanced constructors, 'trusting the allocation'. That gives you an arma::fvec, which you can then cast later.)

eddelbuettel avatar May 05 '20 01:05 eddelbuettel

arma::conv_to does seem to be a little faster (and indeed cleaner). Thanks, @eddelbuettel!

mikejiang avatar May 05 '20 03:05 mikejiang