python-tuf Indexing targets with TUF is hard if using delegations

The Importance of a Target Index

First, let's be clear about this: if you're looking for the trustworthy target info (hash, length, etc.) for a target whose name you know, TUF has that clearly solved for you: the algorithms are in the specification (item 4.5 here) and the reference implementation provides Updater.get_one_valid_targetinfo() implementing them. This works fine with or without delegations.

However, an important part of many update systems is finding out what is available in the first place. There are reasonable arguments for using TUF to find target listings:

1. The security of a target index is important, because incorrect listings pose security risks:
   - Removed items: If items are removed from the listing a user receives when asking what targets are available on a repository, the result is a denial of service. An attacker could prevent important updates or reduce the visibility of some package.
   - (Added items: Regardless of how this listing is obtained, added items cannot be particularly harmful as long as the item 4.5 procedure from the spec is followed for each one (e.g. Updater.get_one_valid_targetinfo()). You can lose cycles looking for them, but that's about it.)
2. TUF has to include target lists anyway, in each role.

Obtaining a Target Index from TUF

Without Delegations

If you do not use delegations, the listing of targets on the repository is easy to obtain and verify. The same protection provided for an individual target is extended for the listing of targets itself, with all of TUF's guarantees. This is because the list exists in one location, the top-level Targets role, and can always be verified using the information in the Root metadata. Just refreshing top-level metadata automatically gets you a verified index of targets, secured by the Targets role via Root.
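As a minimal sketch of the no-delegations case: assuming the client has already fetched and verified the top-level Targets metadata during a normal refresh, the index is just the keys of its targets map. The dict layout below follows the signed/targets structure of TUF targets metadata; the function name list_targets and the sample data are illustrative and not part of python-tuf.

```python
def list_targets(verified_targets_metadata):
    """Return target names from ALREADY VERIFIED top-level Targets metadata.

    Hypothetical helper: verification (signatures, Snapshot/Timestamp checks)
    is assumed to have happened before this is called.
    """
    signed = verified_targets_metadata["signed"]
    # If Targets delegates to other roles, this simple listing is incomplete,
    # so refuse rather than return a misleading index.
    if signed.get("delegations", {}).get("roles"):
        raise ValueError("Targets delegates to other roles; listing would be incomplete")
    return sorted(signed["targets"])


# Illustrative metadata only; hashes are placeholders.
EXAMPLE_TARGETS = {
    "signed": {
        "_type": "targets",
        "version": 3,
        "targets": {
            "tool-1.0.tgz": {"length": 1024, "hashes": {"sha256": "placeholder"}},
            "docs/readme.txt": {"length": 12, "hashes": {"sha256": "placeholder"}},
        },
    },
    "signatures": [],
}

print(list_targets(EXAMPLE_TARGETS))
```

The guard on delegations is a design choice for this sketch: it makes the "be careful to keep it that way" condition explicit instead of silently returning a partial list.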

With Delegations

If you do use delegations, getting a listing of available targets on the repository is more involved and has more complex security implications.

All Roles

If you don't already know what role lists the targets you might be interested in, you would have to walk the graph from Targets to obtain and verify all delegated roles to fill your list. (We can and should provide that in a function, to replace the deprecated Updater.all_targets(), and the algorithm should be added to the specification IMO so that no implementer produces something like the current insecure all_targets(). Note also that this is not very efficient.) The security of such an all-targets query is not the same as the no-delegations security provided above, where the list is secured by Targets via Root. Instead, using delegations, there's a generally-weaker Snapshot role guarantee that you are not unknowingly missing role files or updates to role files, and a guarantee from each delegated role file itself that it has not added or excluded targets.
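The graph walk described above might look roughly like the following sketch. It is a toy model, not python-tuf code: roles stands in for "fetch and verify this role's metadata", snapshot_versions is a simplified role-name-to-version view of the trusted Snapshot, and harvest_target_names is a hypothetical name. It only harvests names; it does not decide which role is actually trusted for each name.

```python
def harvest_target_names(roles, snapshot_versions, start="targets"):
    """Pre-order walk of the delegation graph, collecting listed target names.

    roles: role name -> verified 'signed' dict (toy stand-in for fetch+verify).
    snapshot_versions: role name -> version expected by the trusted Snapshot.
    Raises if a role is missing or older than Snapshot expects.
    """
    names = set()
    visited = set()
    stack = [start]
    while stack:
        role_name = stack.pop()
        if role_name in visited:
            continue  # guard against delegation cycles
        visited.add(role_name)

        expected = snapshot_versions.get(role_name)
        signed = roles.get(role_name)
        if expected is None or signed is None:
            raise LookupError(f"role {role_name!r} is missing")
        if signed["version"] < expected:
            raise ValueError(f"role {role_name!r} is older than Snapshot expects")

        names.update(signed["targets"])
        children = [d["name"] for d in signed.get("delegations", {}).get("roles", [])]
        stack.extend(reversed(children))  # keep delegation order when popping
    return sorted(names)


ROLES = {
    "targets": {
        "version": 1,
        "targets": {"top.txt": {}},
        "delegations": {"roles": [{"name": "a"}, {"name": "b"}]},
    },
    "a": {"version": 2, "targets": {"a/x.txt": {}}},
    "b": {"version": 1, "targets": {"b/y.txt": {}}},
}
SNAPSHOT = {"targets": 1, "a": 2, "b": 1}

print(harvest_target_names(ROLES, SNAPSHOT))
```

Because every reachable role must be present and at least as new as Snapshot says, a mirror that withholds a role file causes an error rather than a quietly shorter list.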

Specified Role(s)

If you do already know what delegated role lists the targets you're interested in, then you can jump to that role and that role is essentially providing you a listing service. That service then has the following ill-defined security:

Removed items:
- The Targets role again does not protect you from removed items.
- Your protection from missing items now comes down to the Snapshot role telling you what version of the delegated role you should have, and if you already had that version of the role, you can tell if it has changed (presumably maliciously). There is no longer any well-defined Targets or even delegated-targets role protection for the listing itself. The latter is because there is no well-defined way to verify the delegated targets role itself, since it doesn't come in the context of walking the delegation graph in search of a specific target. You can establish your own expectations for how the role you directly sought has been signed, but that is not part of the TUF spec, even if it seems intuitive in a particular case; there are edge cases with results that are not defined. (If this is not clear, please ask and I'll elaborate with an example.)
Added items:
- Because you should still be walking the delegation graph for each target that remains in the list you receive, you will still have the same protection from maliciously added items in the target listing.

Jan 15 '19 17:01 awwad

(If this is not clear, please ask and I'll elaborate with an example.)

Please provide an example. I understand the rest and think the snapshot role is fine for protection in most scenarios.

Jan 15 '19 17:01 JustinCappos

@trishankatdatadog

I'm finishing up rewriting all_targets and targets_of_role now (no longer deprecated 🎉). targets_of_role isn't really interesting, and I'm content with the way I've rewritten it (docstring here).

There are two ways I could go about all_targets, though.

In both models, there will be two steps:

First, get a list of target names.
Second, perform the full get_one_valid_targetinfo pre-order DFS on each.

The two options pertain to step one, and the security guarantees provided for inclusion of target names. I can take two approaches:

1A. Fetch unverified targets roles, grab the targets names from each, then discard them and continue to step 2.

OR

1B. Walk the whole delegation graph, verifying every delegation as you go, and produce a list of target names.

Since step 2 will protect you from attackers' adding targets, the distinction is in what credentials are required for an attacker to de-list things. For 1A, a mirror without keys can de-list targets by providing junk metadata, without detection. For 1B, missing roles are noticed, so nobody can be quietly deprived of their ability to list targets.

I'm going the 1B direction. That results in this docstring for all_targets. LMK if you actually find 1A more appealing.

    """
    <Purpose>
      Obtain every verified target info listed by the repository.

      This method does that in two stages:

        1. Walk the targets delegation graph, beginning at the top-level
           Targets role, harvesting all target names listed by each targets
           role it traverses.

        2. Then -- after the full traversal yields a list of target names --
           call get_one_valid_targetinfo on each target name to get the
           verified target info for that target, walking the delegation graph
           as necessary.  This is done rather than reproducing the security
           functionality of get_one_valid_targetinfo during step 1 along the
           way.

      This is not efficient, but neither is it catastrophically inefficient,
      and it avoids dangerous code reproduction of core TUF code from
      get_one_valid_targetinfo.  (A more efficient single-traversal procedure
      involves a good bit of overhead tracking all delegation properties from
      delegations traversed to reach each node -- so that it can make decisions
      about every target it sees -- and it would be easy for a maintainer to
      allow its behavior to deviate from get_one_valid_targetinfo.)

      This method does not update top-level metadata (Root, Timestamp,
      Snapshot, Targets).  It WILL, however, if necessary, update delegated
      targets metadata in the process of walking the delegation graph, if and
      only if (1) a needed delegated targets metadata has not previously been
      obtained or (2) the currently-trusted version of Snapshot indicates that
      that delegated targets role is out of date (has a lower version number
      than Snapshot expects).

    <Arguments>
      None.

    <Exceptions>
      tuf.exceptions.RepositoryError:
        If the metadata for the 'targets' role is missing from
        the 'snapshot' metadata.

      tuf.exceptions.UnknownRoleError:
        If one of the top-level roles could not be found in the role database.

    <Side Effects>
      The metadata for target roles is updated and stored.

    <Returns>
      A list of targets, conformant to
      'tuf.formats.LABELED_FILEINFOS_SCHEMA'.
    """
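To make the two stages in that docstring concrete, here is a rough, self-contained sketch over a toy in-memory model. It is not the actual implementation: ROLES, resolve_target and all_verified_targets are hypothetical names, metadata verification is assumed to have happened already, and fnmatchcase is only an approximation of TUF path-pattern matching.

```python
from fnmatch import fnmatchcase


def resolve_target(roles, target_name):
    """Toy pre-order DFS: return (role_name, info) from the first role that is
    both delegated the target's path and lists it, else None."""
    stack, visited = ["targets"], set()
    while stack:
        role_name = stack.pop()
        if role_name in visited:
            continue
        visited.add(role_name)
        signed = roles[role_name]
        if target_name in signed["targets"]:
            return role_name, signed["targets"][target_name]
        children = []
        for d in signed.get("delegations", {}).get("roles", []):
            if any(fnmatchcase(target_name, p) for p in d["paths"]):
                children.append(d["name"])
                if d.get("terminating", False):
                    stack.clear()  # nothing outside this delegation is consulted
                    break
        stack.extend(reversed(children))
    return None


def all_verified_targets(roles):
    # Stage 1: harvest every name listed by any reachable role.
    names, stack, seen = set(), ["targets"], set()
    while stack:
        role_name = stack.pop()
        if role_name in seen:
            continue
        seen.add(role_name)
        signed = roles[role_name]
        names.update(signed["targets"])
        stack.extend(d["name"] for d in signed.get("delegations", {}).get("roles", []))
    # Stage 2: resolve each name independently; names nobody is trusted for drop out.
    resolved = {}
    for name in sorted(names):
        found = resolve_target(roles, name)
        if found is not None:
            resolved[name] = found
    return resolved


ROLES = {
    "targets": {
        "targets": {"top.txt": {"length": 1}},
        "delegations": {"roles": [
            {"name": "a", "paths": ["a/*"], "terminating": True},
            {"name": "b", "paths": ["b/*"], "terminating": False},
        ]},
    },
    "a": {"targets": {"a/x.txt": {"length": 2}}},
    # Role b also lists a path that was never delegated to it.
    "b": {"targets": {"b/y.txt": {"length": 3}, "a/evil.txt": {"length": 4}}},
}

print(sorted(all_verified_targets(ROLES)))
```

In this toy data, role b lists a/evil.txt, but that path is delegated only to role a (terminating), so stage 2 drops it. That is the "added items are harmless" property: stage 1 can over-collect names without affecting what is finally trusted.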

May 06 '19 16:05 awwad

@awwad Great stuff, and I'm on board with you about 1B. I haven't thought about the algorithm yet, but I think it should work and be feasible.

May 06 '19 16:05 trishankatdatadog

Just upgraded to 0.12.2, the all_targets method is still there but deprecated. I'm not using delegations so is it okay to carry on using it?

Feb 26 '20 00:02 davedoesdev

You should be safe if you use no targets role delegations, but be careful to keep it that way. :)

Feb 26 '20 00:02 awwad

I think I've ended up with the same results in #1995 as awwad did here: exposing a delegated role's list of target paths is unsafe unless each target path is confirmed to be delegated to the role. Doing that confirmation might still be worth the trouble.

I'll close this issue. Please add relevant comments to #1995 if you think there is something not covered there.

Dec 02 '22 13:12 jku

(and just to document the current situation: ngclient does not expose the metadata to callers and does not implement all_targets() or targets_of_role() for these reasons).

Dec 02 '22 13:12 jku

(and just to document the current situation: ngclient does not expose the metadata to callers and does not implement all_targets() or targets_of_role() for these reasons).

Yes: it was very much a bad idea, unfortunately.

Dec 02 '22 18:12 trishankatdatadog

something like targets_of_role() could still make sense with caveats:

- role name can't be the sole argument -- no way to know what the delegation tree is if you do that
- every returned target must be confirmed to really be delegated to the role
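Purely as an illustration of those two caveats (this is not the API from the WIP branch), a sketch could take the whole delegation chain rather than a bare role name, and keep only targets whose paths match every delegation along that chain. The names targets_of_role and chain, the toy ROLES data, and the use of fnmatchcase for path patterns are all assumptions; the sketch also ignores prioritization and terminating delegations among sibling roles.

```python
from fnmatch import fnmatchcase


def targets_of_role(roles, chain):
    """List targets of chain[-1] that are really delegated to it.

    roles: role name -> verified 'signed' dict (toy model, verification assumed).
    chain: role names from "targets" down to the role of interest.
    """
    if not chain or chain[0] != "targets":
        raise ValueError("chain must start at the top-level 'targets' role")

    # Collect the path patterns on each delegation edge along the chain.
    edge_patterns = []
    for parent, child in zip(chain, chain[1:]):
        delegations = roles[parent].get("delegations", {}).get("roles", [])
        delegation = next((d for d in delegations if d["name"] == child), None)
        if delegation is None:
            raise LookupError(f"{parent!r} does not delegate to {child!r}")
        edge_patterns.append(delegation["paths"])

    listed = roles[chain[-1]]["targets"]
    # Keep a target only if every edge on the chain delegates its path.
    return {
        name: info
        for name, info in listed.items()
        if all(any(fnmatchcase(name, p) for p in patterns) for patterns in edge_patterns)
    }


ROLES = {
    "targets": {
        "targets": {},
        "delegations": {"roles": [{"name": "b", "paths": ["b/*"]}]},
    },
    # Role b lists one path that was never delegated to it.
    "b": {"targets": {"b/y.txt": {"length": 3}, "a/evil.txt": {"length": 4}}},
}

print(targets_of_role(ROLES, ["targets", "b"]))
```

Here a/evil.txt is filtered out because the delegation from targets to b only covers b/*.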

last 3 commits in WIP branch in https://github.com/jku/python-tuf/commits/list-targets . The API is not immediately obvious... but it seems to solve the problem for specific cases at least -- repos like sigstore could use this

Dec 03 '22 10:12 jku