graphkit icon indicating copy to clipboard operation
graphkit copied to clipboard

FIX: refactor Network and DAG SOLVER to fix bad pruning

Open ankostis opened this issue 6 years ago • 4 comments

This PR refactors the DAG-solver to prune correctly when intermediate data are given as input.

  • Needed code from, and merged with: #23 (ordered sets) & #18 (prune unsatisfied operations) and #18 (for travis to work on <PY3.4).
  • Significant refactorings explained below.
  • STILL minor bug in PARALLEL PIN, but new failing code detecting them in 2 TCs is DISABLED:
    • test_pruning_with_given_intermediate_and_asked_out()
    • test_unsatisfied_operations_same_out()

ENH: NEW DAG SOLVER

  • Pruning now works by breaking incoming provide-links to any given intermedediate inputs and retrofitting satisfaction detector to include those operations without outputs due to these broken links.
  • Fix #24 by not pruning useful operations when intermediate data given.
  • From #25 the 1st case is now ok also when no outputs asked.

REFACT: Network class COMPILE+COMPUTE:

  • Read the doc-only c570a18 commit to understand changes.
  • Renamed/Refactored:
    --- net.steps --> net.execution_plan.
       ,-- _find_necessary_steps() ------------.
    ---+-- (old)compile() ---------------------+--> (new)compile() + _solve_dag()         
       `-- _collect_unsatisfied_operations --'
    
  • compile() became the master function invoking _solve_dag() & _build-execution_plan(), and do the caching.
  • WIP/refact show_layers() to allow yielding string-list (not just printing).
  • refact(net): move check if value in asked outputs before cache-evicting it when building DelInstructs in execution-plan; compute methods don't need outputs anymore.
  • TCs: speed up parallel/multihtread TCs by reducing delays & repetitions.
  • refact: network adopted stray functions for parallel processing - they all work on net.graph attribute; pave the way for a separate class.
  • refact: consistent use of networkx API when indexing dag.nodes views.
  • enh: add log message when deleting in parallel (in par with sequential code).
  • Raise AssertionError when invalid class in plan. it's a logic error, not a language type-error.
  • refact: var-renames, if-then-else simplifications, pythonisms.
  • doc: A lot!

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

ankostis avatar Oct 03 '19 13:10 ankostis

Codecov Report

Merging #26 into master will increase coverage by 0.44%. The diff coverage is 85.35%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #26      +/-   ##
==========================================
+ Coverage   77.87%   78.31%   +0.44%     
==========================================
  Files           5        5              
  Lines         348      392      +44     
==========================================
+ Hits          271      307      +36     
- Misses         77       85       +8
Impacted Files Coverage Δ
graphkit/functional.py 93.82% <100%> (+0.07%) :arrow_up:
graphkit/base.py 75.34% <69.23%> (-4.03%) :arrow_down:
graphkit/network.py 73.16% <86.02%> (+2.95%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update e70718b...fb1b074. Read the comment docs.

codecov-io avatar Oct 04 '19 02:10 codecov-io

Can merge now:

  • STILL buggy PIN on PARALLEL, but code in 2 failing TCs is DISABLED:
    • test_pruning_with_given_intermediate_and_asked_out()
    • test_unsatisfied_operations_same_out()
  • refact(net): move check if value in asked outputs before cache-evicting it when building DelInstructs in execution-plan; compute methods don't need outputs anymore.
  • TCs: speed up parallel/multihtread TCs by reducing delays & repetitions.
  • refact: network adopted stray functions for parallel processing - they all work on net.graph attribute; pave the way for a separate class.
  • refact: consistent use of networkx API when indexing dag.nodes views.
  • enh: add log message when deleting in parallel (in par with sequential code).
  • refact: var-renames, if-then-else simplifications, pythonisms.
  • doc: A lot!

ankostis avatar Oct 04 '19 22:10 ankostis

$ date
Tue 08 Oct 2019 07:00:21 PM EEST


$ cloc --by-file-by-lang  graphkit 
       5 text files.
       5 unique files.                              
       0 files ignored.

github.com/AlDanial/cloc v 1.82  T=0.01 s (525.9 files/s, 119687.0 lines/s)
------------------------------------------------------------------------------------
File                                  blank        comment           code
------------------------------------------------------------------------------------
graphkit/network.py                     145            260            260
graphkit/functional.py                   51             75             89
graphkit/base.py                         38             86             84
graphkit/__init__.py                      3              3              5
graphkit/modifiers.py                     8             29              2
------------------------------------------------------------------------------------
SUM:                                    245            453            440
------------------------------------------------------------------------------------

-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                           5            245            453            440
-------------------------------------------------------------------------------
SUM:                             5            245            453            440
-------------------------------------------------------------------------------

ankostis avatar Oct 08 '19 16:10 ankostis

Just pushed x1 TC for multithreading-executions (originally given n huyng#1 and repeated in #31), it passes ok.

ankostis avatar Oct 15 '19 12:10 ankostis