parallel-pandas
Parallel processing on pandas with progress bars
When using p_apply from a groupby operation, like this example: `df2 = df.groupby("col1", as_index=False)["col1", "col2", "col3"].p_apply(my_func, parm1=val1, parm2=val2).reset_index(drop=True)` I get a FutureWarning from Pandas: `parallel_pandas/core/parallel_groupby.py:41: FutureWarning: DataFrameGroupBy.grouper is deprecated and...
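One possible stopgap until the library is updated is to silence only `FutureWarning` around the call. The sketch below uses the standard `warnings` module and assumes the warning is raised in the calling process rather than in a worker. `run_quietly` and `noisy_step` are hypothetical names, and `noisy_step` merely stands in for the groupby `p_apply` call.

```python
import warnings


def run_quietly(func, *args, **kwargs):
    """Call func with FutureWarning suppressed; other warnings stay visible."""
    with warnings.catch_warnings():
        # The filter is restored automatically when the block exits.
        warnings.simplefilter("ignore", category=FutureWarning)
        return func(*args, **kwargs)


def noisy_step():
    # Hypothetical stand-in for the groupby p_apply call from the report.
    warnings.warn("DataFrameGroupBy.grouper is deprecated", FutureWarning)
    return "done"


result = run_quietly(noisy_step)
```

This hides the message but does not remove the deprecated access itself, so it is only a workaround on the caller's side.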
In p_apply I call a function that uses numpy (or any other third-party package), and it says the package doesn't exist?
Awesome package, lifesaver! This looks copied from `mode` and is probably incorrect? https://github.com/dubovikmaster/parallel-pandas/blob/ec0ce813f62b3576f52ae755d35b29d7e75b3d74/parallel_pandas/core/parallel_dataframe.py#L571
A `min(split_size, ...)` should be added to avoid errors when the split size is larger than the number of groups in split_by_col in chunk_apply; otherwise array_split returns...
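The clamping idea can be sketched with plain NumPy. The report is cut off before it says what `array_split` returns, so this assumes the concern is the empty chunks that `np.array_split` produces when asked for more sections than there are elements. `split_groups` and `group_keys` are hypothetical names, not part of parallel-pandas.

```python
import numpy as np


def split_groups(group_keys, split_size):
    """Split group keys into at most split_size non-empty chunks."""
    # Clamp the chunk count to the number of groups (and to at least 1).
    n_chunks = max(1, min(split_size, len(group_keys)))
    return np.array_split(np.asarray(group_keys), n_chunks)


# Without clamping: 3 groups split 5 ways gives 5 chunks, some of them empty.
chunks_unclamped = np.array_split(np.arange(3), 5)

# With clamping: 3 groups give 3 chunks of one key each.
chunks_clamped = split_groups([0, 1, 2], 5)
```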
I did not find a parallel version of DataFrame.pivot_table. I was wondering if it's possible to parallelise that function too.
Hi, I would like to ask whether there is a way to redirect pr_bar to a new logger. If so, please show me how. Thank you very much.