python-tuf icon indicating copy to clipboard operation
python-tuf copied to clipboard

How much metadata behaviour should be implemented on their classes?

Open lukpueh opened this issue 5 years ago • 6 comments

The TUF refactor aims at a more idiomatic use of OOP (see #1112). As such it seems reasonable to implement metadata entity behaviour as instance methods on the corresponding classes.

However, not all users of the metadata model need the same behaviour. For instance, a TUF client is likely to only need Metadata.verify and read access on all metadata object attributes, but none of the methods to modify attributes, which are needed by a repository library/tool, (e.g. sign, bump_version, bump_expiration, delegate, add_keys, add_targets, etc.).

This question about how to draw the line is especially important if unneeded functionality requires 3rd party dependencies, which we would have to vendor on a client.

Some possible approaches (brain storming):

  • Use classes exclusively for attributes (except maybe for methods on sslib's Metadata and Signed classes, such as sign, verify, canonicalise) and implement all behaviour on metadata user specific controller classes, e.g. Repository, Client (or something like Project/Developer/Workspace for PEP480.
  • Use Subclasses, e.g. RepositoryTargets, ClientTargets, etc...
  • Expose all methods to all users of the model, but handle missing optional dependencies. E.g. raise something like an UnsupportedLibraryError if a client calls Metadata.sign but does not have the underlying optional 3rd-party dependencies (see https://github.com/secure-systems-lab/securesystemslib/issues/179)

Premise: Find a balance between OOP purism and pragmatism

lukpueh avatar Sep 10 '20 08:09 lukpueh

Related discussion: Is there a need for library functions that should be implemented neither on a metadata class nor on a controller class?

lukpueh avatar Sep 10 '20 08:09 lukpueh

dropping some thoughts here ...

On one hand the general preference should be towards composition over inheritance, in this case the controller classes option. On the other hand, Client needs only a subset of all metadata methods, meaning there is a common base functionality between Client and Repository which repository extends which hints toward inheritance. However, Signed class already has one level of inheritance hierarchy which I find meaningful and I don't want to suggest ruining.

Having said that maybe something like option one:

Use classes exclusively for attributes (except maybe for methods on sslib's Metadata and Signed classes, such as sign, verify, canonicalise)

plus a base Controller class that can be extended by a RepositoryController, if needed, seems like the best compromise.

I like option three too,

Expose all methods to all users of the model, but handle missing optional dependencies. E.g. raise something like an UnsupportedLibraryError if a client calls Metadata.sign but does not have the underlying optional 3rd-party dependencies

mainly for its simplicity but I can't add much knowledge on potential dependencies issues

sechkova avatar Nov 20 '20 12:11 sechkova

Related discussion: Is there a need for library functions that should be implemented neither on a metadata class nor on a controller class?

In the current code there are a lot of different functions performing hashing. Probably this can be done using directly securesystemslib but if such a pattern starts to appear again, maybe a helper/utility/wrapper class? for the record I don't like going in this direction

sechkova avatar Nov 20 '20 12:11 sechkova

To answer Luke's question, my 🔥 take is 💯

trishankatdatadog avatar Nov 21 '20 05:11 trishankatdatadog

From client implementation experience: The metadata API should definitely handle:

  • verify metadata signature with threshold
  • verify metadata file hash/length
  • verify target file hash/length

The rest is just nice-to-have functionality, mostly easier accessors to existing attributes. Examples:

  • timestamp.signed.meta["snapshot.json"].version is "correct" but also completely unnecessary as "snapshot.EXT" is the only valid key in meta... so we could have timestamp.signed.snapshot_meta.version (which is a lot easier to statically type check)
  • Key object does not contain keyid but when keys are used, a keyid is usually needed: we could include keyid in the key object (while of course preserving the current serialized format)

jku avatar May 26 '21 07:05 jku

The stable release of the Metadata API lets me assume that we have made a decision about this.

I wonder if we should track it in an ADR. It might keep us from cramming more functionality into the API when designing a repository library (#1136). Opinions?

lukpueh avatar Mar 17 '22 10:03 lukpueh