
Identify language-specific packages (name/version)

Open remram44 opened this issue 6 years ago • 4 comments

When a runtime with a package manager is used, we should try to identify which packages are being used. For example, for Python, record the package name and version for site packages.

This could be done for:

  • Python site packages
  • Ruby gems
  • R (maybe)
  • Java (not interpreted, no run-time package manager, but JARs include version information)

remram44, Jul 05 '19 15:07

For Python: pip freeze can identify installed packages. However, it runs in the "target" interpreter, not ReproZip's, so we should probably read from the filesystem instead.

pip uses distlib to do this, which cites a variety of PEPs:

  • PEP 241: replaced by PEP 314. Metadata 1.0 format for PKG-INFO file in sdists, and .dist-info/METADATA files
  • PEP 314: Metadata 1.1 format
  • PEP 345: Metadata 1.2 format
  • PEP 566: Metadata 2.1 format
  • PEP 376: On-disk layout of packages and metadata (.dist-info, *.egg-info)
  • PEP 386: replaced by PEP 440. Version number and version requirements format, irrelevant
  • PEP 426: meant to replace PEP 345, but withdrawn in favor of PEP 566
  • PEP 440: Version number and version requirements format, irrelevant

So there don't really seem to be competing standards or formats. Reading .dist-info/METADATA or .egg-info/PKG-INFO (layout from PEP 376) should give all the information we want (in PEP 566 format), though really the version number is already in the folder name.
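A minimal sketch of reading that metadata straight from the filesystem, without running the target interpreter. The function name `read_installed_packages` is hypothetical and not part of ReproZip; the sketch assumes the metadata files use RFC 822-style headers with `Name` and `Version` fields, and it ignores `.egg-info` entries that are single files rather than directories.

```python
from email.parser import Parser
from pathlib import Path


def read_installed_packages(site_packages):
    """Yield (name, version) for each distribution found under site_packages.

    Hypothetical helper: only looks at directory-style metadata.
    """
    root = Path(site_packages)
    # PEP 376 layout: <name>-<version>.dist-info/METADATA
    # Older setuptools layout: <name>-<version>.egg-info/PKG-INFO
    candidates = list(root.glob("*.dist-info/METADATA"))
    candidates += list(root.glob("*.egg-info/PKG-INFO"))
    for metadata_file in sorted(candidates):
        with open(metadata_file, encoding="utf-8", errors="replace") as fp:
            # Only the header block is needed, not the long description
            headers = Parser().parse(fp, headersonly=True)
        name, version = headers.get("Name"), headers.get("Version")
        if name and version:
            yield name, version
```

Calling `list(read_installed_packages("/home/vagrant/venv/lib/python3.8/site-packages"))` would return one tuple per installed distribution. Parsing the file rather than the folder name avoids having to undo the name escaping used in directory names.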

remram44, Jul 05 '19 15:07

I'd like to highlight a possible pitfall in this task. Some packages have a single release for both Py2 and Py3, in which certain features are unavailable to Py2 users. Depending on how you plan to implement it, the dependency tracker might run into an issue with this (as happened with Sumatra). More here.

Looking forward to seeing this functionality implemented in ReproZip.

appukuttan-shailesh, Jul 30 '19 12:07

We should make sure to record the Python version as well, then. Thanks.

remram44, Jul 30 '19 17:07

This needs a change in the config file format.

Currently it's a flat list of packages (each with its files), implicitly meant for whatever that distribution's default package manager is:

packages:
  - name: "libc6"
    version: "2.31-0ubuntu9.3"
    size: 13563904
    packfiles: true
    meta: {"section": "libs"}
    files:
      # Total files used: 3.80 MB
      # Installed package size: 12.94 MB
      - "/lib/i386-linux-gnu/ld-2.31.so" # 176.40 KB
      - "/lib/ld-linux.so.2" # Link to /lib/i386-linux-gnu/ld-2.31.so
      - "/lib/x86_64-linux-gnu/ld-2.31.so" # 186.99 KB
  - name: "libexpat1"
    version: "2.2.9-1build1"
    size: 410624
    packfiles: true
    meta: {"section": "libs"}
    files:
      # Total files used: 178.28 KB
      # Installed package size: 401.00 KB
      - "/lib/x86_64-linux-gnu/libexpat.so.1" # Link to /lib/x86_64-linux-gnu/libexpat.so.1.6.11
      - "/lib/x86_64-linux-gnu/libexpat.so.1.6.11" # 178.28 KB

We can either add fields to each package stating which package manager & environment it's for, or make it a nested list environment->package:

packages:
  - package_manager: dpkg
    environment: /
    packages:
      - name: "libc6"
        version: "2.31-0ubuntu9.3"
        size: 13563904
        packfiles: true
        meta: {"section": "libs"}
        files:
          # Total files used: 3.80 MB
          # Installed package size: 12.94 MB
          - "/lib/i386-linux-gnu/ld-2.31.so" # 176.40 KB
          - "/lib/ld-linux.so.2" # Link to /lib/i386-linux-gnu/ld-2.31.so
          - "/lib/x86_64-linux-gnu/ld-2.31.so" # 186.99 KB
  - package_manager: python
    environment: /home/vagrant/venv
    python: "3.8"
    packages:
      - name: "urllib3"
        version: "1.26.4"
        size: 12345
        packfiles: true
        files:
          # Total files used: 678 KB
          # Installed package size: 1.5 MB
          - /home/vagrant/venv/lib/python3.8/site-packages/urllib3/response.py # 28 KB
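
The two options carry the same information, so one can be derived from the other. A rough sketch, assuming the first option stores hypothetical `package_manager` and `environment` keys on each package entry (the function name `group_by_environment` is also made up for illustration; extra per-environment fields such as `python` are not handled here):

```python
def group_by_environment(flat_packages):
    """Turn per-package manager/environment fields into the nested layout."""
    groups = {}
    for pkg in flat_packages:
        key = (pkg["package_manager"], pkg["environment"])
        # Everything except the two grouping fields stays on the package entry
        entry = {k: v for k, v in pkg.items()
                 if k not in ("package_manager", "environment")}
        groups.setdefault(key, []).append(entry)
    # Dicts preserve insertion order, so groups appear in first-seen order
    return [
        {"package_manager": manager, "environment": env, "packages": pkgs}
        for (manager, env), pkgs in groups.items()
    ]
```

The nested form avoids repeating the environment on every package, which is probably easier to read and edit by hand in the config file.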

remram44, Jul 08 '21 20:07