He Kaisheng

Results 9 issues of He Kaisheng

## What do these changes do? ## Related issue number This PR implements a method that sends small objects directly to the receiver without splitting them into small blocks. Fixes #xxxx...

type: enhancement
to be backported
mod: storage

The web service sometimes hangs while a task is running; the stack is below: ``` Current thread 0x00007fce86d99740 (most recent call first): File "/opt/conda/lib/python3.8/logging/__init__.py", line 1069 in flush File "/opt/conda/lib/python3.8/logging/__init__.py", line 1089 in...

type: bug
mod: web

## What do these changes do? There is currently a barrier: reducers start fetching mapper data only after all mappers are done. This PR tries to prefetch map data once...

type: enhancement
to be backported
mod: task service
mod: subtask service
shuffle

**Describe the bug** `groupby` failed when using categorical columns with `as_index=False`. **To Reproduce** ``` Python In [14]: a = pd.DataFrame({'a':['a','b', 'c'] * 5, 'b': ['d', 'e', 'f'] * 5, 'c':...

type: bug
mod: dataframe

**Describe the bug** Mars integrates some deep learning frameworks (PyTorch, TensorFlow). These frameworks usually need some environment variables set for distributed training, such as `TF_CONFIG` for TensorFlow and `MASTER_ADDR` for PyTorch. We use `ctx.get_worker_addresses()`...

type: bug
mod: learn
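
For context on the environment variables named in the report above, here is a minimal sketch of how `TF_CONFIG` and the PyTorch rendezvous variables are typically derived from a list of worker addresses. It illustrates the variables only, not the bug itself. The helper names `build_tf_config` and `build_torch_env`, the sample addresses, and the port `29500` are assumptions made for illustration and are not Mars APIs; in practice the address list would come from something like `ctx.get_worker_addresses()`.

```python
import json


def build_tf_config(worker_addresses, index):
    """Hypothetical helper: build a TF_CONFIG JSON string for one worker."""
    return json.dumps({
        "cluster": {"worker": list(worker_addresses)},
        "task": {"type": "worker", "index": index},
    })


def build_torch_env(worker_addresses, rank, port=29500):
    """Hypothetical helper: build the env vars PyTorch reads for env:// init."""
    # Assumption: the first worker acts as the rendezvous master.
    master_host = worker_addresses[0].rsplit(":", 1)[0]
    return {
        "MASTER_ADDR": master_host,
        "MASTER_PORT": str(port),
        "RANK": str(rank),
        "WORLD_SIZE": str(len(worker_addresses)),
    }


# Sample addresses (made up); each worker would compute its own index/rank.
workers = ["10.0.0.1:12345", "10.0.0.2:12345"]
tf_config = build_tf_config(workers, index=1)
torch_env = build_torch_env(workers, rank=1)
```

Each process would then export these values (for example via `os.environ`) before the framework initializes. Every worker must see the same address list in the same order, which is why the source of that list matters.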

We currently implement `GroupBy.nunique` using `GroupBy.transform`; some optimizations could be applied to reduce the intermediate data size.

type: enhancement
mod: dataframe
task: medium
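
One possible direction for the `GroupBy.nunique` optimization above, sketched in plain Python rather than Mars code: deduplicate `(key, value)` pairs inside each chunk before shuffling, so the intermediate data holds one entry per distinct pair instead of one per input row. The function names and sample chunks are hypothetical, and this is not necessarily the optimization the issue has in mind.

```python
from collections import defaultdict


def map_unique_pairs(chunk):
    """Map stage: keep each distinct (key, value) pair once per chunk."""
    return set(chunk)


def reduce_nunique(partials):
    """Reduce stage: merge deduplicated pairs and count distinct values per key."""
    values_by_key = defaultdict(set)
    for pairs in partials:
        for key, value in pairs:
            values_by_key[key].add(value)
    return {key: len(values) for key, values in values_by_key.items()}


# Two sample chunks of (group key, value) rows; six rows in total.
chunks = [
    [("a", 1), ("a", 1), ("b", 2)],
    [("a", 3), ("b", 2), ("b", 2)],
]
partials = [map_unique_pairs(chunk) for chunk in chunks]
result = reduce_nunique(partials)
```

Here the six input rows shrink to four intermediate pairs before the reduce stage. The saving grows with the number of repeated values per group.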

I tried to build a wheel from source, but an error was raised when running `python setup.py bdist_wheel`, as below: ``` Python [ 53%] Building CXX object CMakeFiles/opencc_clib.dir/src/py_opencc.cpp.o In file included from /home/OpenCC/deps/pybind11-2.5.0/include/pybind11/pytypes.h:12:0, from...

**Is your feature request related to a problem? Please describe.** Implement `classification_report` for classification metrics (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html).

good first issue
type: feature
mod: learn
task: easy

**Describe the bug** Failed to execute `Series.drop_duplicates`. ``` Python In [75]: a = md.DataFrame(np.random.rand(10, 2), columns=['a', 'b'], chunk_size=2) In [76]: a['a'].drop_duplicates().execute() 0%| | 0/100 [00:00

type: bug
good first issue
mod: dataframe
pr welcome
task: easy