Dspash fixes
Fixes many bugs and reliability problems:
- Support multiple inputs in HDFS cat, e.g. "hdfs dfs -cat $IN1 $IN2 .."
- Fix bugs with multi-sink scripts
- Support HDFS blocks that are not stored in subdir0 locally
- Merge graph on one worker on the first aggregator (performance improvement)
- Use processes instead of threads in workers (fixes a hanging issue)
- Fix other bugs and edge cases in graph splitting
- Improve the setup and installation scripts
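
To illustrate the switch from threads to processes in the workers, here is a minimal sketch. It is not the code from this PR: `handle_request` and `run_with_timeout` are hypothetical names, and the timeout-based cleanup is an assumption about one way a process-based worker can recover from a hang. The general point is that a Python thread cannot be forcibly stopped from outside, while a process can be terminated.

```python
import multiprocessing as mp
import time


def handle_request(request_id, delay):
    """Hypothetical stand-in for a worker executing one request."""
    time.sleep(delay)


def run_with_timeout(request_id, delay, timeout):
    """Run one request in its own process and kill it if it hangs.

    Returns True if the request finished within `timeout` seconds,
    False if it had to be terminated.
    """
    proc = mp.Process(target=handle_request, args=(request_id, delay))
    proc.start()
    proc.join(timeout)       # wait at most `timeout` seconds
    if proc.is_alive():      # still running: treat it as hung
        proc.terminate()     # not possible with threading.Thread
        proc.join()          # reap the terminated process
        return False
    return proc.exitcode == 0


if __name__ == "__main__":
    print(run_with_timeout(1, 0.1, 2.0))   # short request
    print(run_with_timeout(2, 60.0, 0.5))  # simulated hang
```

With a thread in place of `mp.Process`, the simulated hang would keep running until it finished on its own, since `threading.Thread` has no equivalent of `terminate()`.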
OS: ubuntu-20.04, Tue Jun 7 17:59:51 UTC 2022
- intro: 2/2 tests passed
- interface: 34/34 tests passed
- compiler: 54/54 tests passed
- agg: 109/109 tests passed

OS: ubuntu-18.04, Tue Jun 7 18:00:15 UTC 2022
- intro: 2/2 tests passed
- interface: 34/34 tests passed
- compiler: 54/54 tests passed
- agg: 109/109 tests passed

- OS = Debian 10
- CPU = Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
- Ram = 15752
- Hash = 226346ab
- Kernel = Linux 4.15.0-167-generic x86_64

| benchmark | tests | passed | failed | untested | unresolved | unsupported | not_in_use | other_status |
|---|---|---|---|---|---|---|---|---|
| posix | 494 | 375 | 41 | 31 | 6 | 40 | 1 | 0 |
| intro | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| interface | 34 | 34 | 0 | 0 | 0 | 0 | 0 | 0 |
| compiler | 54 | 54 | 0 | 0 | 0 | 0 | 0 | 0 |
| aggregator | 109 | 109 | 0 | 0 | 0 | 0 | 0 | 0 |
I would prefer redirecting these changes to a different branch (other than future). Let's have a dspash-future branch that we will work on temporarily, and when everything stabilizes, we can push to the normal future (to avoid stalling on PRs now). We can still make PRs to this dspash-future branch.