Carst Vaartjes
Another thing that we do: if the characters are recurring (think of things such as product codes), we create a unique integer for each code, save the combination in a...
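The integer-coding idea above can be sketched in plain Python. This is only an illustration, not the code we actually run: the names `encode_codes` and `decode_codes` and the sample product codes are made up for the example, and where the lookup gets saved is left out.

```python
def encode_codes(values):
    """Map each distinct string to a small integer, in order of first appearance."""
    lookup = {}   # string -> integer id (hypothetical in-memory lookup)
    encoded = []
    for value in values:
        if value not in lookup:
            # First time we see this code: give it the next free integer.
            lookup[value] = len(lookup)
        encoded.append(lookup[value])
    return encoded, lookup


def decode_codes(encoded, lookup):
    """Turn the integers back into the original strings."""
    reverse = {idx: value for value, idx in lookup.items()}
    return [reverse[idx] for idx in encoded]


# Made-up sample data for illustration.
product_codes = ["A-100", "B-200", "A-100", "C-300", "B-200"]
encoded, lookup = encode_codes(product_codes)
```

The column then only has to hold the integers, and the lookup holds each distinct string once.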
I will look into it this weekend!
You can give a target length when generating a new bcolz ctable or carray. There's an explanation in the docs, but basically every carray is split into small chunks, with...
No, you cannot encapsulate it; there are many advantages to the smaller chunks (performance, adding chunks when streaming data into bcolz, etc.). The expected length parameter will make the carray...
Hi! Not yet, I have to ask one of my more C-savvy colleagues to help me out here, I think :/
One newbie but serious question (it's something I'm struggling with in bquery): how do you ensure the order of chunk processing when putting them in parallel? Because I can imagine...
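For reference, one common way to keep results in chunk order is to let the pool hand them back in input order, as `Executor.map` in Python's standard `concurrent.futures` does. This is a minimal sketch of that general pattern, not what bquery or bcolz does internally; `process_chunk` and `process_in_order` are hypothetical names and the chunks are made-up lists.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    # Stand-in for the real per-chunk work (here: just sum the values).
    return sum(chunk)


def process_in_order(chunks, max_workers=4):
    """Process chunks in parallel, returning results in the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(process_chunk, chunks))


# Made-up sample chunks for illustration.
chunks = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
results = process_in_order(chunks)
```

The chunks may still be processed in any order; only the collected results are ordered, so this fits cases where each chunk can be handled independently.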
Which OS? I don't really have issues on Linux myself, and for bquery I test Linux + OS X and they work fine. Is it Windows? See also https://github.com/visualfabriq/bquery/blob/master/.travis.yml Edit: ```...
You could write to separate ctables with fixed chunk lengths in each process and then combine them in the end, changing the naming and internal carray dicts + appending the...
In our (still to be open sourced) query engine, we do this:
```
def df_to_bcolz(import_df, path_name, expected_len=None, debug=False):
    import_df = df_to_natural_name(import_df)
    if not os.path.exists(path_name):
        if not debug:
            print 'Creating ctable directory'...
```
My example code works for appending without an index error, but normally I do not use indexes in pandas except for monotonic selections. Do you have a complex index?