epub-specs icon indicating copy to clipboard operation
epub-specs copied to clipboard

Made the XML conformance section more succinct

Open iherman opened this issue 3 years ago • 4 comments
trafficstars

This PR is a replacement for #2450, which was mistakenly branched off on an earlier state of the spec.

The PR also adds some text to the security section (also anticipating some WG discussion on #2447 to happen on 2022-10-06).

See:

iherman avatar Oct 06 '22 14:10 iherman

The issue was discussed in a meeting on 2022-10-07

  • no resolutions were taken
View the transcript

2. XML security risk: expansion of internal parsed entities (issue epub-specs#2447)

See github issue epub-specs#2447.

See github issue epub-specs#2433.

Murata Makoto: old issue; more than 20 years ago.
… related to external parsed entities issue 2433.
… this issue is about INTERNAL parsed entities.
… doesn't use external identifiers or URLs.
… these are defined in internal DTD subsets.

Dave Cramer: See description of the attack.

Murata Makoto: if an internal reference references a different internal reference, things can get out of hand quickly.
… some EPUB reading systems ignore rec of XML and ignore internal parsed entities. This is non-conformant but avoids security issue.
… what should we do? Prohibit internal DTD subsets?.
… which gets rid of internal parsed entities..
… can exissting RSes be destryeed by malicious content?.

See github pull request epub-specs#2451.

Ivan Herman: thanks Makoto for raising the issue.
… the problem is that removing the possibility to use internal entities is something we can't do because of existing epub docs that might use this.
… we cannot invalidate existing content per our charter.
… as part of a pull request that makoto and I did together is a separate section.
… in the security section of RS spec.

Ivan Herman: See proposed text in the PR's security section.

Ivan Herman: which say that RS should be aware of this problem and should deal with it..
… so we should draw attention to the issue.

Dave Cramer: I don't think we can or should forbid internal entities.
… the problem has been around 20 years and EPUB is still functional.
… it's a security vulnerability, we may need to warn people in our security section.
… ordering RSs to follow a specific algorithm might be overkill.
… let's warn people.

Murata Makoto: Basically, not our problem.

Brady Duga: this is a known problem in XML. Are there known solutions?.
… do they have mitigations? Like in libxml?.

Dave Cramer: "Defenses against this kind of attack include capping the memory allocated in an individual parser if loss of the document is acceptable, or treating entities symbolically and expanding them lazily only when (and to the extent) their content is to be used.".

Ivan Herman: makoto may know about libraries.
… makoto made some sample epubs, and we tested them.
… I tested on Thorium and Apple Books.
… both reacted by rejecting the EPUB.
… without identifying the problem.
… we suspect they don't accept internal entities.

Murata Makoto: I'm not aware of general solutions.
… XML WG was aware of this issue, but had no good solutions.
… thorium doesn't reject publication, it ignores the entity but not the entire publication.
… this behavior is non-conformant.

Dave Cramer: I don't think it's entirely fair to say that this implementation is non-conformant because it's not deploying my vulnerability.
… not fair to RS.

Wendy Reid: it's like the viewport discussion.
… we want to avoid putting burdens on readers.
… does the fail quietly solution work?.
… is that a bad thing?.
… I think because we are running out of time.
… people should look at the PR.
… there are other things in the PR.
… then we move on.

Dave Cramer: I think we add something to the security section.
… it hasn't been a problem in the past, internal entities aren't usually a problem unless malicious.
… what are the consequences of a malicious use.
… the consequences seem small.
… good enough to say "XML has a known vulnerability:.

Brady Duga: I agree putting this in the security section, maybe mentioning other xml vulnerabilities.
… any attack like this can jeopardize personal information.
… although no one has bothered to do this.

Dave Cramer: Some of this is also outside the scope of the WG, we can't ask for people to patch the OS they are running dev machines on.

Wendy Reid: consensus: let's look people look at Ivan's PR.
… comment on it.
… ten minutes remaining.
… two more topics.

iherman avatar Oct 07 '22 15:10 iherman

@murata2makoto @mattgarrish we got stalled... are we ready to merge this?

Note that the WG has briefly discussed the addition on the security section on its last call and we may get input from there, too. But at least the three of us should be aligned...

iherman avatar Oct 11 '22 08:10 iherman

Seems fine to me now.

mattgarrish avatar Oct 12 '22 14:10 mattgarrish

@murata2makoto I intend to merge this on the week end unless you have some further comments.

iherman avatar Oct 14 '22 04:10 iherman

Sorry for my belated response. This PR looks fine to me.

murata2makoto avatar Oct 15 '22 06:10 murata2makoto