specification icon indicating copy to clipboard operation
specification copied to clipboard

Are targets lengths optional or not?

Open trishankatdatadog opened this issue 4 years ago • 48 comments

In Section 5.6.2 of the current spec, we say:

5.6.2. Otherwise, download the target (up to the number of bytes specified in the targets metadata), and verify that its hashes match the targets metadata. (We download up to this number of bytes, because in some cases, the exact number is unknown. This may happen, for example, if an external program is used to compute the root hash of a tree of targets files, and this program does not provide the total size of all of these files.)

But in Section 4.5 of the same spec, we say:

TARGETS is an object whose format is the following:

   { TARGETPATH : {
         "length" : LENGTH,
         "hashes" : HASHES,
         ("custom" : CUSTOM) }
     , ...
   }

So, are targets lengths optional or not?

trishankatdatadog avatar Dec 04 '20 11:12 trishankatdatadog

Which of the two quotes paragraphs suggests optionality?

lukpueh avatar Dec 04 '20 11:12 lukpueh

Which of the two quotes paragraphs suggests optionality?

The first one. However, the second one says "custom" is optional, but not "length".

trishankatdatadog avatar Dec 04 '20 11:12 trishankatdatadog

But the first one says

... download the target (up to the number of bytes specified in the targets metadata), and verify that its hashes match the targets metadata. (We download up to this number of bytes

To me that does not suggest optionality.

lukpueh avatar Dec 04 '20 11:12 lukpueh

But I don't really understand what an external program that is used to compute the root hash of a tree of targets files has to do with anything. Do you remember why that was added to the specification, @trishankatdatadog?

lukpueh avatar Dec 04 '20 11:12 lukpueh

To me that does not suggest optionality.

The optionality is in "because in some cases, the exact number is unknown."

Do you remember why that was added to the specification?

IIRC it was to allow using Merkle trees for targets files themselves. Cc @erickt @patrickvacek

trishankatdatadog avatar Dec 04 '20 11:12 trishankatdatadog

To me that does not suggest optionality.

The optionality is in "because in some cases, the exact number is unknown."

That does not make the length in targets optional, it just says that the number might not be exact, it does not say that the number might not be there.

Do you remember why that was added to the specification?

IIRC it was to allow using Merkle trees for targets files themselves. Cc @erickt @patrickvacek

I suggest we remove it then, because nowhere else in the specification do we mention Merkle trees (yet).

lukpueh avatar Dec 04 '20 11:12 lukpueh

Interestingly, if you look at the reference client, it requires targets lengths to be exact, not just an upper bound:

1 -> 2 -> 3

trishankatdatadog avatar Dec 04 '20 12:12 trishankatdatadog

That does not surprise me, just like the specification (except for the confusing note discussed above), the reference implementation client does not implement any "merkle tree root hash for target hash" logic.

lukpueh avatar Dec 04 '20 12:12 lukpueh

The surprise is in the mismatch between spec and client. Now we need to decide which one to change...

trishankatdatadog avatar Dec 04 '20 12:12 trishankatdatadog

I'm not so worried about "up to number of bytes" vs. "exact number of bytes". The purpose of the length check is to prevent downloading too much data, for which both < and == seem equally suited. That said, I am not opposed to aligning spec and reference implementation there. If we do so we should also check snapshot and targets download. In the spec we also say "up to number of bytes" for these, but we probably do "exact number of bytes" in the reference implementation.

Regardless, we should remove the trailing two sentences in that paragraph, because there is no reason to confuse the reader with irrelevant use cases.

~~(We download up to this number of bytes, because in some cases, the exact number is unknown. This may happen, for example, if an external program is used to compute the root hash of a tree of targets files, and this program does not provide the total size of all of these files.)~~

lukpueh avatar Dec 04 '20 12:12 lukpueh

In the spec we also say "up to number of bytes" for these, but we probably do "exact number of bytes" in the reference implementation.

Precisely part of the reason for raising this issue. For all metadata files, we may never have the exact size, so we need the <=. For target files, we need to be clear about whether it is always ==, or <= is allowed.

trishankatdatadog avatar Dec 04 '20 12:12 trishankatdatadog

We need exact target lengths in bytes for Fuchsia. We use a content addressed file system on Fuchsia called blobfs, where each file (we call them blobs at this point) is tracked by a merkle root, and a package, which we call a meta.far really is just another blob that's happens to be an archive that contains a list of paths to merkle roots (plus some other stuff that's not important for this discussion). blobfs is a write-once file system for security purposes. Since we only have the roots and not the full tree, blobfs needs to know the exact size in order to tell when to compute the tree to see if we fetched it correctly. In a meta.far, we list the exact blob size for each blob in the package (*), but we still need to know the size of the meta.far in order to write it correctly into blobfs. So we use the TUF target length for this.

So if the target length isn't exact, I suppose we could add the proper length in the custom field, but that seems a little redundant.

I don't quite understand though the argument for this value to not be exact. As I read 5.6.2, it seem the use case is for targets that aren't actually pointing at a file, but instead computed over a set of other target files. If this is the case, then that might be more appropriately stored as a separate field in a targets role, rather than trying to overload the target notion.

(*): this is a bit of a lie, we currently get the blob size from the server content-length field, but we're going to be fixing this soon.

erickt avatar Dec 04 '20 18:12 erickt

Thanks for the detailed explanation, Erick! So the upper instead of exact bound on a targets file size must have been for Uptane, no? @JustinCappos If so, we should document this in an ADR...

trishankatdatadog avatar Dec 04 '20 18:12 trishankatdatadog

In Uptane, the targets file size has to be an exact match so that it is the same for the director and image repos (see step 10 in full verification).

I think the upper limit should be fine in terms of security, but if no one is using it, we can remove it to eliminate any confusion.

mnm678 avatar Dec 04 '20 20:12 mnm678

For all metadata files, we may never have the exact size.

Why, @trishankatdatadog? The spec says:

LENGTH is the integer length in bytes of the metadata file at METAPATH.

It does say that the value is optional, yes, but to me that does not mean it is optionally accurate. I would expect it to be the exact size if provided.

The main problem here is the parenthesized bit in the client workflow that claims that the number may not be exact. If we remove that there should be no ambiguity about the accuracy of the length field, even if the client workflow does a <= check. <= might just be an artifact of good coding practice.

lukpueh avatar Dec 07 '20 12:12 lukpueh

It does say that the value is optional, yes, but to me that does not mean it is optionally accurate. I would expect it to be the exact size if provided.

I'm pretty sure we did this for Uptane. Would you please confirm or deny, @pattivacek?

trishankatdatadog avatar Dec 08 '20 12:12 trishankatdatadog

It does say that the value is optional, yes, but to me that does not mean it is optionally accurate. I would expect it to be the exact size if provided.

I'm pretty sure we did this for Uptane. Would you please confirm or deny, @pattivacek?

Uptane removed the requirement to check length and instead relies on version and hash. See for example here. IIRC the change was made because the lengths provided no additional security. I think there is some history of the discussion somewhere for the curious.

Note that there is this: "Download up to Z number of bytes [...]. The value for Z is set by the implementer. For example, Z may be tens of kilobytes." However, that doesn't seem relevant here.

pattivacek avatar Dec 08 '20 12:12 pattivacek

Uptane removed the requirement to check length and instead relies on version and hash. See for example here. IIRC the change was made because the lengths provided no additional security. I think there is some history of the discussion somewhere for the curious.

We don't need to check the length of the targets metadata (which is optional in the snapshot metadata), but we do ask to check the length of a target itself:

When checking hashes, the ECU SHOULD additionally check that the length of the image matches the length listed in the metadata.

trishankatdatadog avatar Dec 08 '20 13:12 trishankatdatadog

We don't need to check the length of the targets metadata (which is optional in the snapshot metadata), but we do ask to check the length of a target itself:

When checking hashes, the ECU SHOULD additionally check that the length of the image matches the length listed in the metadata.

Ah yes, sorry. Missed the context there. Anyway, I'm still not aware of a case where you wouldn't know the exact length.

pattivacek avatar Dec 08 '20 13:12 pattivacek

Ah yes, sorry. Missed the context there. Anyway, I'm still not aware of a case where you wouldn't know the exact length.

I'm almost certain we added this text for someone's automotive use case. That text sounds so oddly specific to their use case. I'm ok with it, but we just need to decide whether to keep it or not.

trishankatdatadog avatar Dec 08 '20 13:12 trishankatdatadog

Maybe theupdateframework/tuf@c5deaa3 helps to refresh your memory, @trishankatdatadog?

lukpueh avatar Dec 08 '20 13:12 lukpueh

Maybe theupdateframework/tuf@c5deaa3 helps to refresh your memory, @trishankatdatadog?

But of course, the enemy is always thine own self...

I plead the fifth.

(No, but srsly, does anyone have strong objections about this? That sounds like a valid use case, even if I don't remember exactly who asked for it.)

trishankatdatadog avatar Dec 08 '20 13:12 trishankatdatadog

does anyone have strong objections about this? That sounds like a valid use case, even if I don't remember exactly who asked for it.

:) No objections myself, although it seems a bit weird. I don't really get it, but it probably isn't going to break any security guarantees. What I want to know is how you could know the hash but not the length.

pattivacek avatar Dec 08 '20 14:12 pattivacek

I think I have expressed my strong opinion regarding the seemingly unrelated and apparently unremembered use case, and my not so strong opinion regarding == vs. <= above. I'm happy to provide a corresponding PR.

lukpueh avatar Dec 08 '20 14:12 lukpueh

:) No objections myself, although it seems a bit weird. I don't really get it, but it probably isn't going to break any security guarantees. What I want to know is how you could know the hash but not the length.

Beats me. IIRC this implementor depends completely on this external tool which gives you a root hash but not the length. It is an odd constraint.

trishankatdatadog avatar Dec 08 '20 14:12 trishankatdatadog

I think I have expressed my strong opinion regarding the seemingly unrelated and apparently unremembered use case, and my not so strong opinion regarding == vs. <= above. I'm happy to provide a corresponding PR.

Yes, we certainly need to update either way. If ==, then we need to say it's exact, not up to. If <=, we need to make the length optional in the targets metadata schema. We will also need to check with go-tuf, rust-tuf, tough, etc.

trishankatdatadog avatar Dec 08 '20 14:12 trishankatdatadog

What I want to know is how you could know the hash but not the length

e.g. an ostree commit hash, OTA Connect's Target type OSTREE?

mike-sul avatar Dec 08 '20 14:12 mike-sul

If <=, we need to make the length optional in the targets metadata schema. We will also need to check with go-tuf, rust-tuf, tough, etc.

I don't follow. If it's <=, that just means the number is inexact, not that it isn't required.

e.g. an ostree commit hash, OTA Connect's Target type OSTREE?

Ha! Good call. Yes, of course, and that's probably where this came from. I was interpreting it differently, but yes, for OSTree we don't know the actual length.

pattivacek avatar Dec 08 '20 14:12 pattivacek

I agree with @pattivacek that <= does not imply that the number is optional. And strictly speaking it doesn't imply that the number is inexact either. All it says is that the match does not have to be exact.

lukpueh avatar Dec 08 '20 14:12 lukpueh

Ha! Good call. Yes, of course, and that's probably where this came from. I was interpreting it differently, but yes, for OSTree we don't know the actual length.

Well, it has to be optional for this reason. And if it's optional, we make the check <= if it isn't there, and can make it == if it is there.

trishankatdatadog avatar Dec 08 '20 14:12 trishankatdatadog