
Report XML Processing Instructions looking like XML declarations (typos) as USAGE?

Open chadwood opened this issue 7 years ago • 7 comments

Tested in opf and html files:

Passes validation:

<?xxml version="1.0" encoding="utf-8" standalone="yes"?>
<?xl version="1.0" encoding="utf-8" standalone="no"?>

chadwood avatar May 04 '18 18:05 chadwood

There's actually nothing technically wrong here. You can have processing instructions to start a file, and when the xml declaration is misnamed it just becomes a regular old processing instruction.

The XML declaration doesn't really do anything in EPUB, either, since version 1.0 and UTF-8 are the defaults, so the lack of a correct xml declaration isn't going to negatively affect the rendering.

I haven't heard of anyone releasing EPUBs with processing instructions still in their documents, though, so it probably would be harmless to warn if any are found in the document prolog (ignoring actual xml declarations, of course).
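To illustrate what such a prolog check could look like, here is a minimal sketch in Python using the standard library's `xml.dom.minidom`. It is not EpubCheck code (EpubCheck is written in Java), and the function name `find_prolog_pis` is hypothetical. It assumes the parser exposes a genuine XML declaration separately from processing instructions, which is how `minidom` behaves.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def find_prolog_pis(xml_bytes: bytes) -> list[str]:
    """Return the targets of processing instructions that precede the root element."""
    doc = minidom.parseString(xml_bytes)
    targets = []
    for node in doc.childNodes:
        if node is doc.documentElement:
            break  # the prolog ends where the root element starts
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            targets.append(node.target)
    return targets


# The first sample is taken from this issue; <package/> is a stand-in for a real OPF root.
typo = b'<?xxml version="1.0" encoding="utf-8" standalone="yes"?>\n<package/>'
real = b'<?xml version="1.0" encoding="utf-8" standalone="yes"?>\n<package/>'

print(find_prolog_pis(typo))  # the misnamed declaration shows up as a PI
print(find_prolog_pis(real))  # a genuine XML declaration is not a PI node
```

A checker built along these lines could emit a message for every target returned, since real XML declarations never appear in the list.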

mattgarrish avatar May 05 '18 00:05 mattgarrish

So XML parsers are supposed to ignore this?

Note: That's not sarcasm

chadwood avatar May 05 '18 01:05 chadwood

Such a PI is certainly correct as a part of XML. In my understanding, neither WHATWG HTML nor W3C HTML 5.2 disallows the use of such PIs. So, such PIs are perfectly legitimate, as I see it. WHATWG DOM allows PIs. I thus conclude that your example (<?xl ...>) will appear in the DOM tree as a PI, but is not used for rendering.
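A small sketch of that behaviour, using Python's `xml.dom.minidom` as a stand-in XML parser (an assumption made for illustration; a reading system or browser DOM is not involved here). The XHTML fragment is invented for the example, and only the `<?xl ...?>` line comes from this issue.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

source = (
    b'<?xl version="1.0" encoding="utf-8" standalone="no"?>'
    b'<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hello</p></body></html>'
)
doc = minidom.parseString(source)

# The misnamed declaration is kept in the tree as an ordinary PI node.
pi = doc.firstChild
print(pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE)
print(pi.target, "|", pi.data)

# The document content itself is unaffected by the PI.
paragraph = doc.getElementsByTagName("p")[0]
print(doc.documentElement.tagName, "|", paragraph.firstChild.data)
```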

murata2makoto avatar May 05 '18 02:05 murata2makoto

neither WHATWG HTML nor W3C HTML 5.2 disallows the use of such PIs

No, but they're also generally useless. XML style sheet PIs haven't gone anywhere in EPUB, so other than relics from internal processes and applications, what function would they serve? In any case, there's no harm in telling someone they're an odd choice in the output.

But I would keep the warning at the info level if implemented, as they are legitimate.

mattgarrish avatar May 05 '18 13:05 mattgarrish

So isn't this an enhancement and not a bug?

rkwright avatar Sep 11 '18 18:09 rkwright

So isn't this an enhancement and not a bug?

I agree.

But I would keep the warning at the info level if implemented, as they are legitimate.

But there currently is no warning. So what to do? Create one? I don't think it's really necessary...

tofi86 avatar Sep 11 '18 20:09 tofi86

Yeah, I would be tempted to just do nothing on that one: EpubCheck is a validation tool, not a best practice tool.

That said, if several people upvote this issue, I guess it wouldn't hurt to implement that as an INFO-level message. But if the usage is rare, I doubt it's worth implementing.

rdeltour avatar Sep 11 '18 20:09 rdeltour