Carst Vaartjes comments

Results: 14 comments of Carst Vaartjes

ctable takes 16 hours (and still running) saving to disk - a better way??

Another thing that we do: if the character values are recurring (think of things such as product codes), we create a unique integer for each code, save the combination in a...
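The integer-mapping idea above can be sketched in plain Python. This is only an illustration under assumptions: the helper names `encode_codes` and `decode_ids` are hypothetical, and where the code-to-integer combination gets stored is left open, since the comment is cut off at that point.

```python
def encode_codes(values):
    """Map each distinct string to a small integer, in order of first appearance."""
    code_to_id = {}
    ids = []
    for value in values:
        # setdefault assigns the next free integer the first time a code is seen
        ids.append(code_to_id.setdefault(value, len(code_to_id)))
    return ids, code_to_id


def decode_ids(ids, code_to_id):
    """Turn the integers back into the original strings."""
    id_to_code = {i: code for code, i in code_to_id.items()}
    return [id_to_code[i] for i in ids]


# Hypothetical product codes with repeats
product_codes = ["AB-100", "ZX-7", "AB-100", "QQ-42", "ZX-7"]
ids, lookup = encode_codes(product_codes)
```

The integer column is what would go into the table, while the lookup is kept alongside it so the original codes can be restored when reading.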

Generation of high number of files on disk

You can give a target length when generating a new bcolz ctable or carray. There's an explanation in the docs, but basically every carray is split into small chunks, with...

Generation of high number of files on disk

No, you cannot encapsulate it; there are many advantages to the smaller chunks (performance, adding chunks when streaming data into bcolz, etc.). The expected length parameter (`expectedlen`) will make the carray...

Dependency updates

Hi! Not yet, I think I have to ask one of my more C-savvy colleagues to help me out here :/

introduce chunk._getitem_r(...), a thread-safe version of chunk._getitem(...)

One newbie but serious question (it's something I'm struggling with in bquery): how do you ensure the order of chunk processing when running them in parallel? Because I can imagine...
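One common way to handle the ordering question is to let chunks finish in any order but collect their results by input position. The sketch below uses only the standard library, and it is not a description of how bcolz or bquery actually handle this. `process_chunk` and `split_into_chunks` are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Stand-in for per-chunk work; here it just sums the values."""
    return sum(chunk)


def split_into_chunks(data, chunk_len):
    """Cut a list into consecutive pieces of at most chunk_len items."""
    return [data[i:i + chunk_len] for i in range(0, len(data), chunk_len)]


data = list(range(10))
chunks = split_into_chunks(data, 4)

with ThreadPoolExecutor(max_workers=3) as pool:
    # Executor.map yields results in input order, whatever order the
    # workers happen to complete in
    partial_sums = list(pool.map(process_chunk, chunks))
```

With this pattern the processing itself is unordered and only the collection step is ordered. That is enough when each chunk's result is independent of the others.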

fail to pip install bcolz in python 2.7 64 bit environment

Which OS? I don't really have issues on Linux myself, and for bquery I test Linux + OS X and they work fine. Is it Windows? See also https://github.com/visualfabriq/bquery/blob/master/.travis.yml Edit: ```...

bcolz support for parallel writes

You could write to separate ctables with fixed chunk lengths in each process and then combine them in the end, changing the naming and internal carray dicts + appending the...

ctable.append does not accept a pandas.DataFrame

In our (still to be open sourced) query engine, we do this:

```
def df_to_bcolz(import_df, path_name, expected_len=None, debug=False):
    import_df = df_to_natural_name(import_df)
    if not os.path.exists(path_name):
        if not debug:
            print 'Creating ctable directory'...
```

ctable.append does not accept a pandas.DataFrame

My example code works for appending without an index error, but normally I do not use indexes in pandas except for monotonic selections. Do you have a complex index?