ARF1
Benchmarking with c-blosc 1.5 is on hold as it depends on resolving issue Blosc/c-blosc#92. (On my machine, the c-blosc 1.5 series either fails to compile or produces runtime failures.)
@esc This PR arose from the effort to parallelise factorisation in bquery. I have not worked on it recently but plan to get back to it soonish. C-blosc 1.5.4 does...
@waylonflinn I am not completely sure that the bottleneck this was supposed to circumvent was real. There were issues with my test database as well as with c-blosc at the...
@waylonflinn Thanks. That actually looks very useful to me too! Is there a list of bcolz-based projects anywhere? More and more seem to be popping up. More to the point,...
@FrancescAlted With regular `range` you of course lose the ability to process your chunks in parallel. I think @CartVaartjes was asking for a silver bullet: parallel processing of chunks, in-order results,...
@FrancescAlted > Would you mind to add some benchmarks in the 'bench/' directory showing the advantage of this approach? I would be happy to. I just need to clarify what...
@FrancescAlted On reflection, I probably was not as clear as I could have been: when you speak of "this approach", do you mean - the column-major (vs. row-major) result array...
I would be quite interested in this as well. Rationale: combining pandas with bcolz string columns is currently extremely slow, as each entry in the string column is converted to a...
This issue would probably benefit from being considered together with the possibility of introducing a pandas `out_flavor` to ctable (though not necessarily for carray). (See #176) A pandas `out_flavor` would...
@FrancescAlted Thanks for taking the time to respond. I appreciate the desire to keep things simple and maintainable. **Categorical dtype issue:** As you suggested I currently use compression to deal...