[BUG] fn:doc should raise an error if the resource cannot be retrieved

Open joewiz opened this issue 3 years ago • 6 comments

Describe the bug

The spec for the fn:doc function says that the processor should raise an error if the URI passed in via the $uri parameter cannot be retrieved.

eXist does not raise an error.

Expected behavior

The spec says:

A dynamic error is raised err:FODC0002 if the resource cannot be retrieved or cannot be parsed successfully as XML.

That error is defined as follows:

Raised by fn:doc, fn:collection, and fn:uri-collection to indicate that either the supplied URI cannot be dereferenced to obtain a resource, or the resource that is returned is not parseable as XML.

Given a query like doc("foo") (where doc-available("foo") returns false()), Saxon raises an I/O error and BaseX raises the expected FODC0002 error:

Stopped at /Users/joe/workspace/file, 1/4: [FODC0002] Resource '/Users/joe/workspace/foo' does not exist.

Only eXist fails to raise an error. It returns an empty sequence.

To Reproduce

xquery version "3.1";

module namespace t="http://exist-db.org/xquery/test";

declare namespace test="http://exist-db.org/xquery/xqsuite";

declare
    %test:assertError("err:FODC0002")
function t:test() {
    doc("foo")
};

Context (please always complete the following information):

  • OS: macOS 11.3.1
  • eXist-db version: eXist 5.3.0-SNAPSHOT 895a475f22a85883ad40fb18d077c2c418446dac 20210520055838
  • Java Version: OpenJDK 1.8.0_292 (liberica-jdk8-full)

Additional context

  • How is eXist-db installed? built from source, started via DMG
  • Any custom changes in e.g. conf.xml? none

joewiz avatar May 20 '21 17:05 joewiz

I could imagine this fix breaking a lot of stuff that's out there ;-) So maybe target it for 6.0.0.

adamretter avatar May 20 '21 17:05 adamretter

I agree with 6.0.0, but I hope there won’t be that many instances out there relying on the broken behavior.

duncdrum avatar May 20 '21 17:05 duncdrum

I am reading the spec differently: there is always the possibility of returning an empty sequence (see the signature), and there is room for implementation-dependent behavior: 'One possible processing model for this function is as follows:' and then there are things that may not be spec-dependent, but rather implementation-dependent. Also: isn't it the case that Saxon and others also return an empty sequence if a doc cannot be found? I think raising an error would be nice when there is something to be found but it turns out not to be parseable as an XML document. I agree with @adamretter that this is going to break in a lot of places where an empty sequence is expected by the code. I'd call it a feature rather than a bug.

PieterLamers avatar May 25 '21 07:05 PieterLamers

there is always the possibility of returning an empty sequence

The function only returns an empty sequence if $uri is an empty sequence.
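As a minimal sketch of that one case (the variable name $no-uri is made up for illustration), the following query passes an empty sequence as $uri. Per the fn:doc definition, a conforming processor should evaluate it to true():

```xquery
xquery version "3.1";

(: $no-uri is a hypothetical variable holding an empty sequence. :)
let $no-uri as xs:string? := ()
(: fn:doc is specified to return the empty sequence when $uri is empty. :)
return empty(doc($no-uri))
```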

Also: isn't it the case that Saxon and others also return an empty sequence if a doc cannot be found?

No. As I mentioned in the original post:

Given a query like doc("foo") (where doc-available("foo") returns false()), Saxon raises an I/O error and BaseX raises the expected FODC0002 error.

Regarding the implementation-dependent behavior, I agree, the spec leaves room for implementations to define the behavior of fn:doc(). I guess I could revise my original post from saying "the spec says the processor should raise an error" to "the spec says that the processor may raise an error".

Indeed, under error conditions, the spec clearly says:

A dynamic error may be raised err:FODC0005 if $uri is not a valid URI reference.

I guess I would also say that the spec's suggestion here for implementations (which Saxon and BaseX follow) makes just as good sense for eXist too. The language provides ample means such as doc-available() and try-catch to prevent URI errors from halting execution of code.
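To illustrate the two guards mentioned above, here is a rough sketch. The function names local:doc-or-empty and local:try-doc are hypothetical, and the example assumes that "foo" does not resolve to a document and that the processor reports the failure as err:FODC0002:

```xquery
xquery version "3.1";

declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: Hypothetical helper: check availability first, then load. :)
declare function local:doc-or-empty($uri as xs:string) as document-node()? {
    if (doc-available($uri)) then doc($uri) else ()
};

(: Hypothetical helper: attempt the load and catch the retrieval error. :)
declare function local:try-doc($uri as xs:string) as document-node()? {
    try {
        doc($uri)
    } catch err:FODC0002 {
        (: Fall back to the empty sequence instead of halting the query. :)
        ()
    }
};

(: Assumes "foo" cannot be retrieved; both calls then yield no document. :)
(
    exists(local:doc-or-empty("foo")),
    exists(local:try-doc("foo"))
)
```

Either pattern would let code that currently relies on an empty sequence keep working after such a change, at the cost of making the fallback explicit.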

I agree that this would need a major version.

joewiz avatar May 25 '21 14:05 joewiz

Case in point for the need for a major version for such changes: https://github.com/eXist-db/existdb-packageservice/issues/19.

joewiz avatar May 25 '21 21:05 joewiz

Thanks for your comments, @joewiz. I've tried a few tricks with Saxon, running the following (10.3 EE, via Oxygen):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="3.0">
  <xsl:template match="/">
    <xsl:variable name="test" as="document-node()?" select="doc('file:///C:\temp\test.xml')"/>
    <xsl:sequence select="exists($test)"/>
  </xsl:template>
</xsl:stylesheet>

which returns 'true'. Running the same with <xsl:sequence select="$test"/> returns C:\Temp\test.xml (The system cannot find the file specified). So Saxon is not consistently reporting a breaking error. But that's another discussion (and maybe I am not testing correctly). I just checked in Mike Kay's XSLT 2.0 and XPath 2.0, and there too it is simple: the implementation should throw an error for doc() if the given $uri cannot be mapped to one or more document nodes. I am just so used to the current eXist behavior that I took it for a feature.

PieterLamers avatar May 26 '21 11:05 PieterLamers