NF(WiP): SpecObject .find and .__iadd__

Open yarikoptic opened this issue 5 years ago • 4 comments

Sits on top of #273. Only the last commit (6e9178f3652ec1d7ced44dbbed71c16ec050e1f7, to be rewritten during WiP) is pertinent to this PR at the moment.

Should make it possible to "join" multiple specifications, and also avoid duplicates while retracing.

It seems to run without crashing on a sample, but I do not think it works correctly yet.
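To make the intent concrete, here is a minimal sketch of how `find` and `__iadd__` could cooperate to join two specifications without duplicating entries. The `ToySpec` and `Package` classes, the sample values, and the choice of `(name, version)` as the identity of a package are all assumptions made for illustration; this is not ReproMan's actual `SpecObject` code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Package:
    """Hypothetical package record; identity is assumed to be (name, version)."""
    name: str
    version: str
    files: List[str] = field(default_factory=list)

    def identity(self):
        return (self.name, self.version)


@dataclass
class ToySpec:
    """Hypothetical stand-in for a specification that holds packages."""
    packages: List[Package] = field(default_factory=list)

    def find(self, other: Package) -> Optional[Package]:
        """Return the stored package matching `other`, or None."""
        for pkg in self.packages:
            if pkg.identity() == other.identity():
                return pkg
        return None

    def __iadd__(self, other: "ToySpec") -> "ToySpec":
        """Merge `other` into self in place without duplicating packages."""
        for pkg in other.packages:
            known = self.find(pkg)
            if known is None:
                # Unknown package: store a copy so the specs stay independent.
                self.packages.append(Package(pkg.name, pkg.version, list(pkg.files)))
                continue
            # Package seen before: only record files that are not listed yet.
            for path in pkg.files:
                if path not in known.files:
                    known.files.append(path)
        return self


# Made-up sample values, not taken from a real trace.
first = ToySpec([Package("dash", "1.0", ["/bin/sh"])])
second = ToySpec([
    Package("dash", "1.0", ["/bin/dash"]),
    Package("python3.7", "1.0", ["/usr/lib/python3.7/abc.py"]),
])
first += second
print([(p.name, p.files) for p in first.packages])
```

In this sketch `+=` leans on `find` to decide between appending a new entry and extending an existing one, which is the behavior the description asks for when retracing hits the same package twice.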

May 16 '19 17:05 yarikoptic

Codecov Report

Merging #418 into master will decrease coverage by 4.8%. The diff coverage is 77.77%.

@@            Coverage Diff             @@
##           master     #418      +/-   ##
==========================================
- Coverage   89.48%   84.68%   -4.81%     
==========================================
  Files         148      148              
  Lines       12112    12282     +170     
==========================================
- Hits        10839    10401     -438     
- Misses       1273     1881     +608

| Impacted Files | Coverage Δ | |
|---|---|---|
| reproman/distributions/venv.py | 89.55% <100%> (+0.4%) | :arrow_up: |
| reproman/distributions/redhat.py | 94.54% <100%> (+0.03%) | :arrow_up: |
| reproman/distributions/debian.py | 95.3% <100%> (ø) | :arrow_up: |
| reproman/distributions/conda.py | 94.19% <100%> (+0.02%) | :arrow_up: |
| reproman/distributions/vcs.py | 95.85% <100%> (+0.02%) | :arrow_up: |
| reproman/interface/tests/test_diff.py | 100% <100%> (ø) | :arrow_up: |
| reproman/interface/diff.py | 95.77% <100%> (+0.06%) | :arrow_up: |
| reproman/formats/reproman.py | 80.16% <22.22%> (-10.22%) | :arrow_down: |
| reproman/distributions/base.py | 81.63% <72.89%> (-7.56%) | :arrow_down: |
| reproman/interface/retrace.py | 89.76% <75%> (-5.48%) | :arrow_down: |
| ... and 14 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 62084cb...db188c2.

Sep 12 '19 23:09 codecov[bot]

Codecov Report

Merging #418 into master will decrease coverage by 4.94%. The diff coverage is 77.91%.

@@            Coverage Diff             @@
##           master     #418      +/-   ##
==========================================
- Coverage   89.54%   84.59%   -4.95%     
==========================================
  Files         148      148              
  Lines       12132    12296     +164     
==========================================
- Hits        10863    10402     -461     
- Misses       1269     1894     +625

| Impacted Files | Coverage Δ | |
|---|---|---|
| reproman/distributions/venv.py | 89.55% <100%> (+0.4%) | :arrow_up: |
| reproman/distributions/redhat.py | 94.54% <100%> (+0.03%) | :arrow_up: |
| reproman/distributions/debian.py | 95.3% <100%> (ø) | :arrow_up: |
| reproman/distributions/conda.py | 94.19% <100%> (+0.02%) | :arrow_up: |
| reproman/distributions/vcs.py | 95.85% <100%> (+0.02%) | :arrow_up: |
| reproman/interface/tests/test_diff.py | 100% <100%> (ø) | :arrow_up: |
| reproman/interface/diff.py | 95.77% <100%> (+0.06%) | :arrow_up: |
| reproman/formats/reproman.py | 80.16% <22.22%> (-10.22%) | :arrow_down: |
| reproman/distributions/base.py | 82% <73.07%> (-7.19%) | :arrow_down: |
| reproman/interface/retrace.py | 89.76% <75%> (-5.48%) | :arrow_down: |
| ... and 14 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 2e39954...3fffc18.

Sep 12 '19 23:09 codecov[bot]

On the call, @chaselgrove asked if anyone had a snippet for triggering the duplicate distributions in order to test this PR. I wasn't able to find a snippet posted anywhere (e.g., to generate linked traces in gh-367). If my understanding of the issue is correct, the necessary condition is for a tracer to add a new file to the list of unknown files that will be picked up by a tracer on the next iteration. Because of this, the order of the tracers in retrace.py matters. The current order in retrace.py is DebTracer, RPMTracer, CondaTracer, VenvTracer, VCSTracer, DockerTracer, SingularityTracer.

On a Debian system, here's a snippet that triggers two Debian distribution entries:

cd $(mktemp -d --tmpdir reproman-retrace-XXXXXXX)
virtualenv --python=python3 venv
. ./venv/bin/activate
reproman retrace /bin/sh $PWD/venv/lib/python3.7/abc.py

So for the first pass, the DebTracer adds a debian distribution entry because it picked up dash from /bin/sh, then VenvTracer adds a venv distribution based on encountering abc.py. In the virtual environment, abc.py is a link to /usr/lib/python3.7/abc.py, so the VenvTracer passes that file back as an unknown file. On the next iteration DebTracer picks it up, and adds a second debian distribution.
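The two-pass behavior described here can be mimicked with a toy model of the loop. This is only a sketch under simplifying assumptions: the tracer functions, the prefix-based matching, the relative `venv/` path, and the `SYMLINKS` table are hypothetical stand-ins, not the code in retrace.py.

```python
# Toy model of the retrace loop. Tracer names mirror the ones in this thread,
# but the matching rules and data structures are simplified assumptions.

# Hypothetical link table: the venv copy of abc.py points at the system file.
SYMLINKS = {"venv/lib/python3.7/abc.py": "/usr/lib/python3.7/abc.py"}


def deb_tracer(files):
    """Claim system files; return (distribution or None, unclaimed files)."""
    claimed = [f for f in files if f.startswith(("/bin/", "/usr/"))]
    rest = [f for f in files if f not in claimed]
    return ({"name": "debian", "files": claimed} if claimed else None), rest


def venv_tracer(files):
    """Claim venv files and hand symlink targets back as unknown files."""
    claimed = [f for f in files if f.startswith("venv/")]
    rest = [f for f in files if f not in claimed]
    rest += [SYMLINKS[f] for f in claimed if f in SYMLINKS]
    return ({"name": "venv", "files": claimed} if claimed else None), rest


def retrace(files, tracers):
    """Iterate over the tracers until no files remain or nothing changes."""
    distributions = []
    changed = True
    while files and changed:
        changed = False
        for tracer in tracers:
            dist, files = tracer(files)
            if dist is not None:
                distributions.append(dist)  # appended blindly, no merging
                changed = True
    return distributions


result = retrace(["/bin/sh", "venv/lib/python3.7/abc.py"],
                 [deb_tracer, venv_tracer])
print([d["name"] for d in result])  # ['debian', 'venv', 'debian']
```

Because each pass appends whatever a tracer returns, the second visit of the Debian tracer yields a separate entry rather than extending the first one.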


On a non-Debian system I think this is harder to trigger. The easiest thing I came up with so far is to put VCSTracer before VenvTracer:

diff --git a/reproman/interface/retrace.py b/reproman/interface/retrace.py
index 62f1be60d..597b327aa 100644
--- a/reproman/interface/retrace.py
+++ b/reproman/interface/retrace.py
@@ -244,6 +244,6 @@ def get_tracer_classes():
     from reproman.distributions.vcs import VCSTracer
     from reproman.distributions.docker import DockerTracer
     from reproman.distributions.singularity import SingularityTracer
-    Tracers = [DebTracer, RPMTracer, CondaTracer, VenvTracer, VCSTracer,
+    Tracers = [DebTracer, RPMTracer, CondaTracer, VCSTracer, VenvTracer,
         DockerTracer, SingularityTracer]
     return Tracers

Then pass reproman retrace a file from a git repo and a file from a virtual environment that has an editable package for that git repo:

cd $(mktemp -d --tmpdir reproman-retrace-XXXXXXX)

git clone https://github.com/docker/docker-py

virtualenv --python=python3 venv
. ./venv/bin/activate
pip install -e docker-py

reproman retrace $PWD/venv/lib/python3.7/site-packages/six.py $PWD/docker-py/setup.py

I see the distribution for the virtual environment sandwiched between two git distributions.

output
[...]
2019-11-07 15:37:55,707 [INFO] Entering iteration #1 over Tracers 
2019-11-07 15:37:55,874 [INFO] DebTracer: 0 envs with 2 other files remaining 
2019-11-07 15:37:55,876 [INFO] RPMTracer: 0 envs with 2 other files remaining 
2019-11-07 15:37:55,897 [INFO] CondaTracer: 0 envs with 2 other files remaining 
2019-11-07 15:37:55,981 [INFO] VCSTracer: 1 envs with 1 other files remaining 
2019-11-07 15:37:57,308 [INFO] VenvTracer: 1 envs with 1 other files remaining 
2019-11-07 15:37:57,354 [INFO] SingularityTracer: 0 envs with 1 other files remaining 
2019-11-07 15:37:57,354 [INFO] Entering iteration #2 over Tracers 
2019-11-07 15:37:57,356 [INFO] RPMTracer: 0 envs with 1 other files remaining 
2019-11-07 15:37:57,364 [INFO] CondaTracer: 0 envs with 1 other files remaining 
2019-11-07 15:37:57,438 [INFO] VCSTracer: 1 envs with 0 other files remaining 
2019-11-07 15:37:57,438 [INFO] No more changes or files to track.  Exiting the loop 
# ReproMan Environment Configuration File
# This file was created by ReproMan 0.2.0 on 2019-11-07 15:37:57.438721
version: 0.0.1
distributions:
- name: git
  packages:
  - path: /tmp/reproman-retrace-2GSWKtd/docker-py
    files:
    - setup.py
    root_hexsha: 8e0b54109ad2f1a6c4d4972675d523240038bf44
    branch: master
    hexsha: a0b9c3d0b38abd4af1880ca3dde2845556dd2f70
    describe: 3.7.1-159-ga0b9c3d
    tracked_remote: origin
    remotes:
      origin:
        contains: true
        url: https://github.com/docker/docker-py
- name: venv
  path: /usr/bin/virtualenv
  venv_version: 15.1.0
  environments:
  - path: /tmp/reproman-retrace-2GSWKtd/venv
    python_version: 3.7.3
    packages:
    - name: certifi
      version: 2019.9.11
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: chardet
      version: 3.0.4
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: docker
      version: 4.1.0.dev0
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/docker-py
      editable: true
    - name: idna
      version: '2.8'
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: pip
      version: 19.3.1
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: pkg-resources
      version: 0.0.0
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: requests
      version: 2.22.0
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: setuptools
      version: 41.6.0
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: six
      version: 1.13.0
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
      files:
      - lib/python3.7/site-packages/six.py
    - name: urllib3
      version: 1.25.6
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: websocket-client
      version: 0.56.0
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
    - name: wheel
      version: 0.33.6
      local: true
      location: /tmp/reproman-retrace-2GSWKtd/venv/lib/python3.7/site-packages
- name: git
  packages:
  - path: /tmp/reproman-retrace-2GSWKtd/docker-py
    root_hexsha: 8e0b54109ad2f1a6c4d4972675d523240038bf44
    branch: master
    hexsha: a0b9c3d0b38abd4af1880ca3dde2845556dd2f70
    describe: 3.7.1-159-ga0b9c3d
    tracked_remote: origin
    remotes:
      origin:
        contains: true
        url: https://github.com/docker/docker-py
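
For a test of this PR, a check along the following lines could flag the repeated entry. It is a rough sketch: the `spec` dictionary is a hand-abbreviated copy of the output above (names and paths only), and `duplicated_distribution_names` is a hypothetical helper, not something ReproMan provides.

```python
from collections import Counter

# Abbreviated by hand from the retrace output above: only the distribution
# names and the paths are kept.
spec = {
    "distributions": [
        {"name": "git",
         "packages": [{"path": "/tmp/reproman-retrace-2GSWKtd/docker-py"}]},
        {"name": "venv",
         "environments": [{"path": "/tmp/reproman-retrace-2GSWKtd/venv"}]},
        {"name": "git",
         "packages": [{"path": "/tmp/reproman-retrace-2GSWKtd/docker-py"}]},
    ]
}


def duplicated_distribution_names(spec):
    """Return the distribution names that occur more than once, sorted."""
    counts = Counter(d["name"] for d in spec.get("distributions", []))
    return sorted(name for name, n in counts.items() if n > 1)


print(duplicated_distribution_names(spec))  # ['git']
```

Once distributions of the same type are merged, the helper would be expected to return an empty list for this trace.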

Nov 07 '19 20:11 kyleam

@chaselgrove will you have some time in the near future to address @kyleam's comment?

Nov 29 '19 16:11 yarikoptic