Francesc Alted

320 comments by Francesc Alted

Yes, I think the suggestion by Andrew of using the plugin method would work best for your needs. The Blosc HDF5 repo should support this already, see: https://github.com/Blosc/hdf5/blob/master/src/blosc_plugin.c Hope this...

Yes, Erik made very good points here. If what you want is archivability and compatibility with the standard HDF5 library, then you have to use what it supports by default...

Yes, I can reproduce this, but I don't have time right now to look into it. If you have time and want to try to do some debugging on this,...

Yup, that's curious. Here is what valgrind says about this:

```
==4648== Warning: set address range perms: large range [0x3a04b040, 0x69b3b840) (undefined)
==4648== Warning: set address range perms:...
```

And here is a workaround:

```
import bcolz
import numpy as np

ca = bcolz.carray(np.arange(1e8).astype(int))
# np.fromiter consumes the wheretrue() iterator directly
print(np.fromiter(bcolz.eval('ca > 2').wheretrue(), dtype=int))
```

Provided that this works, I suspect this is...

And this:

```
import numpy as np

ca = np.arange(3e8).astype(int)
_ = np.array(np.where(ca > 2)[0])
```

produces this output:

```
Traceback (most recent call last):
  File "segfault2.py", line 4, in...
```

My hunch is that the OP's issue is that numpy tries to fully materialize the iterator's result before creating the final array, and this consumes more memory. Why...
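
To illustrate the difference this hunch points at, here is a minimal sketch in plain NumPy. The `where_true` generator is a hypothetical stand-in for an index iterator such as the one returned by `wheretrue()`; it is not bcolz code, and whether this is what actually happens in the OP's case is not confirmed here.

```python
import numpy as np


def where_true(mask):
    """Hypothetical stand-in for a lazy index iterator (not the bcolz API)."""
    for i, flag in enumerate(mask):
        if flag:
            yield i


values = np.arange(10)
mask = values > 2

# np.fromiter pulls items from the iterator one at a time and stores them
# directly in a typed 1-D array, without an intermediate list of objects.
streamed = np.fromiter(where_true(mask), dtype=np.int64)

# Going through list() first materializes every index as a Python int
# before NumPy copies them into the final array.
materialized = np.array(list(where_true(mask)), dtype=np.int64)

print(streamed)
print(np.array_equal(streamed, materialized))
```

Both paths give the same array; the second one simply holds a full list of Python objects in memory on top of the result, which adds up for results with hundreds of millions of elements.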

Nope, the following also fails:

```
import bcolz
import numpy as np

ca = bcolz.carray(np.arange(10).astype(int))
_ = np.array(bcolz.eval('ca > 2').wheretrue())
```

and valgrind is issuing the same error, so...

Yes, that may very well be the reason. The thing is that calling len() on an iterator makes no sense, so this needs a fix (e.g. the iterators returning something that is...
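
As a small illustration of why `len()` and iterators do not mix, the sketch below uses a plain Python generator (`count_up` is a hypothetical example, not a bcolz iterator) and shows two standard alternatives: the advisory `operator.length_hint` and the `count` argument of `np.fromiter`. It does not show how the fix was or should be implemented.

```python
import operator

import numpy as np


def count_up(n):
    """Hypothetical generator standing in for a lazy result iterator."""
    for i in range(n):
        yield i


# A plain generator defines no __len__, so len() raises TypeError.
try:
    len(count_up(5))
    supports_len = True
except TypeError:
    supports_len = False

# operator.length_hint is only advisory: it returns the supplied default
# when the object exposes neither __len__ nor __length_hint__.
hint = operator.length_hint(count_up(5), -1)

# When the caller knows the size up front, np.fromiter can preallocate
# the output through its count argument.
arr = np.fromiter(count_up(5), dtype=np.int64, count=5)

print(supports_len, hint, arr)
```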

Yeah, a PR on that would be great. Thanks in advance!