bigflow
Baidu Bigflow is an interface that allows for writing distributed computing programs and provides lots of simple, flexible, powerful APIs. Using Bigflow, you can easily handle data of any scale. Bigfl...
Could Bigflow support either SQL or a familiar DataFrame API to query structured data?
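After a group_by on a key, it would be convenient to compute the mean and variance, and to sort and then iterate over each group. It would be good to have built-in functions for operations like these.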
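As a rough illustration of the behavior this request describes, here is a minimal plain-Python sketch. It does not use Bigflow at all; `per_key_stats` and the sample records are hypothetical names and data chosen only for illustration, and population variance is an assumed choice since the request does not say which variance is meant.

```python
from collections import defaultdict
from statistics import mean, pvariance


def per_key_stats(records):
    """Group (key, value) pairs by key and summarize each group."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    # For each key: mean, population variance, and the values in sorted order.
    return {
        key: {
            "mean": mean(values),
            "variance": pvariance(values),
            "sorted": sorted(values),
        }
        for key, values in groups.items()
    }


# Hypothetical sample data, not taken from the original request.
records = [("a", 3), ("a", 1), ("b", 4), ("a", 2)]
stats = per_key_stats(records)

# Sort within each group, then iterate over the sorted values.
for key in sorted(stats):
    for value in stats[key]["sorted"]:
        print(key, value)
```

A built-in equivalent would presumably apply the same per-group summaries in a distributed fashion rather than collecting each group into a local list.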
At what data volume does the join function cause the program to crash?
When running a Bigflow program, I find that it outputs some byproducts, e.g., entity-* .flume ... After several runs, they leave the folder in a mess. Could you please put...

## Definitions

Structured input formats specifically mean [ORC](https://github.com/apache/orc) files and [Parquet](https://github.com/apache/parquet-format) files.

## Current Status

Bigflow on DCE supports ORC files (read only) and Parquet files with its own loader as...
As illustrated in this picture, `pip install readline` failed but the build continued. The build scripts should be improved to build readline successfully and to detect this kind of...
Continuous integration is required for Bigflow, and we should have a system to support it. Maybe we can use travis-ci.org or TeamCity, or we can set up a jenkins...
Read/write InputFormat/OutputFormat, SerDe from/to Hive Metastore. Read/write data from/to Hive Table or Partition.
So that users can use it with any version of Python.