What characters are allowed in an import/export/part URI?
Currently (see SDK versions below) we get the following results, which are the same on Linux and Windows:
import 'some_lib.dart' as l1; // Ok
import 'some_lib.dart#' as l2; // Ok
import 'some_lib.dart?' as l3; // Ok
import 'some_lib.dart?x' as l4; // Analyzer: Ok, Runtime: Unsupported operation: Cannot extract a file path from a URI with a query component
import 'some_lib.dart?#' as l5; // Ok
import 'some_lib.dart#x' as l6; // Analyzer: Target of URI doesn't exist, Runtime: Unsupported operation: Cannot extract a file path from a URI with a fragment component
import 'some_lib.dart:' as l7; // Analyzer: Invalid URI syntax, Runtime: Couldn't parse URI: Illegal scheme character.
import 'some_lib.dart:x' as l8; // Analyzer: Invalid URI syntax, Runtime: Couldn't parse URI: Illegal scheme character.
// "Ok" above means that the analyzer reports no error and that at run time the library is imported and accessible.
The specification doesn't say how a URI should be parsed; it only states:
any relative URI is interpreted as relative to the location of the current library. All further interpretation of URIs is implementation dependent.
I think that cases l1, l7 and l8 are obvious. But what should we expect in cases l2-l6?
Dart SDK version: 3.6.0-326.0.dev (dev) (Fri Oct 4 21:02:43 2024 -0700) on "windows_x64"
Dart SDK version: 3.6.0-edge.5eacff3ab8f9b6418c86d71e9eacc5bcb7b6ce32 (main) (Tue Oct 8 05:44:20 2024 +0000) on "linux_x64"
Summary: The issue concerns the allowed characters in import/export/part URIs in Dart. The user is confused about the behavior of URIs with query and fragment components, as the specification doesn't explicitly define their parsing. They want clarification on the expected behavior for these cases.
I think there are some discrepancies, but a good starting point would be that a Dart <uri> follows https://www.rfc-editor.org/rfc/rfc3986.
@lrhn, does the RFC differ from the Dart notion of a URI? I think you mentioned something a while ago.
Agree that l1, l7 and l8 are obvious.
The Dart Uri class follows RFC3986, and equality is up to path/case/escape-normalization (because it eagerly normalizes those).
The parsing escapes invalid input characters, so it's a permissive parser. That means that an import URI may be accepted even if it's not a valid RFC 3986 URI, if it can be converted into one by escaping.
It should definitely accept all RFC 3986 URIs.
The Dart SDK uses the Dart Uri class for implementing the language <uri> semantics, and that should be fine.
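As a rough illustration of that permissive parsing, here is a minimal sketch using only `dart:core`. The helper name `describe` is made up for this example, and the input strings mirror the imports from the original report; this shows what the `Uri` class does on its own, not necessarily what the analyzer or CFE do with the result.

```dart
/// Hypothetical helper: summarizes how dart:core parses [source],
/// or returns null when the string is not parseable as a URI at all.
String? describe(String source) {
  // Uri.tryParse returns null where Uri.parse would throw FormatException.
  final uri = Uri.tryParse(source);
  if (uri == null) return null;
  return 'path=${uri.path} '
      'hasQuery=${uri.hasQuery} '
      'hasFragment=${uri.hasFragment}';
}

void main() {
  const inputs = [
    'some_lib.dart',
    'some_lib.dart?x',
    'some_lib.dart#',
    'some lib.dart', // not valid RFC 3986, but the space can be escaped
    'some_lib.dart:x', // "some_lib.dart" is not a valid scheme
  ];
  for (final input in inputs) {
    print('$input -> ${describe(input) ?? 'not parseable'}');
  }
}
```

The space case is the kind of input that is accepted by escaping, while the colon case cannot be rescued that way because everything before the colon has to be a scheme.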
The tools might also do something to convert a URI to a file path afterwards, and that may involve more than just calling File.fromUri.
At least, from what is shown here, I guess the analyzer might remove queries at some point.
The behavior likely depends on whether the containing file has a file: or package: URI, because that changes the kind of URI the relative import URI resolves to.
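To make the dependence on the containing file's URI concrete, the following sketch resolves the same relative import strings against a file: base and a package: base. The base URIs and the helper `resolveImport` are assumptions for illustration; the real tools may resolve imports through more machinery than a plain `Uri.resolve`.

```dart
/// Hypothetical helper: resolves [import] against the URI of the
/// containing library, the way a relative import is interpreted.
Uri resolveImport(String containing, String import) =>
    Uri.parse(containing).resolve(import);

void main() {
  // Both base URIs are made up for this example.
  const fileBase = 'file:///home/user/testpkg/bin/testuri.dart';
  const packageBase = 'package:testpkg/testuri.dart';

  for (final import in ['some_lib.dart', 'some_lib.dart#', 'some_lib.dart?x']) {
    print('$import');
    print('  from file:    ${resolveImport(fileBase, import)}');
    print('  from package: ${resolveImport(packageBase, import)}');
  }
}
```

The query or fragment survives resolution in both cases, so whether it is later rejected depends on what the tool does with a file: URI versus a package: URI.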
package: URI: If I run a version of the example above from inside lib/ (dart lib/testuri.dart), it has a package: URI, and I get errors on every line after the first valid one. The errors are:
Error: Invalid package URI 'package:testpkg/testuri.dart#':
Invalid argument (packageUri): Package URIs must not have a fragment part: Instance of '_SimpleUri'.
Error: Invalid package URI 'package:testpkg/testuri.dart?':
Invalid argument (packageUri): Package URIs must not have a query part: Instance of '_SimpleUri'.
...
That's what I would have expected too. The analyzer only fails on l2, l5 and l6, saying that the target does not exist, but accepts l3 and l4. (Seems it ignores a query, but not a fragment.)
file: URI: Running the same file inside bin/, I get the same behavior you see from the analyzer,
and from the front end (dart bin/testuri.dart) I get compiler crashes.
Unsupported operation: Cannot extract a file path from a URI with a query component
If I run it with dart compile js instead, it doesn't crash, and consistently says that it cannot convert a URI
with a fragment or query into a file path.
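The "Cannot extract a file path" message matches what `Uri.toFilePath` in `dart:core` reports, so a small sketch like the following reproduces the failure without involving the compiler. The helper `tryFilePath` and the `/tmp` paths are hypothetical; whether the tools call `toFilePath` directly is an assumption here.

```dart
/// Hypothetical helper: returns the POSIX file path for [uri], or null
/// when Uri.toFilePath refuses to produce one.
String? tryFilePath(Uri uri) {
  try {
    return uri.toFilePath(windows: false);
  } on UnsupportedError {
    // Thrown for URIs that carry a query or fragment component.
    return null;
  }
}

void main() {
  const inputs = [
    'file:///tmp/some_lib.dart',
    'file:///tmp/some_lib.dart?x',
    'file:///tmp/some_lib.dart#x',
  ];
  for (final input in inputs) {
    print('$input -> ${tryFilePath(Uri.parse(input)) ?? 'no file path'}');
  }
}
```

A tool that catches the `UnsupportedError` and reports it as a normal compile-time error would avoid the crash while keeping the same verdict.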
So same as you again. (I expect you commented lines out to find the crashers.)
So, how should it work?
The CFE compiler should not crash. That's a given. (Since dart compile js doesn't crash, and dart run and dart compile exe do, it looks like a VM issue. @mkustermann)
There is no use for a fragment in a Dart file URI. At least until Dart starts understanding an input format where it makes sense referencing a fragment, like collection.zip#lib/foo.dart.
We can choose to always ignore and discard fragments. That would be a problem if we ever get zip-reading, but if we don't, it's probably safe. But that also means we have to define whether foo.dart and foo.dart# are "the same URI" for defining library identity. They aren't today. The Dart Uri class can distinguish an empty fragment from an absent one.
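The distinction between an empty and an absent fragment can be checked directly with `dart:core`. This is a minimal sketch; `sameUri` is a hypothetical helper, and using `Uri` equality as the library identity test is an assumption about how a tool might define it.

```dart
/// Hypothetical helper: compares two URI strings using Uri equality.
bool sameUri(String a, String b) => Uri.parse(a) == Uri.parse(b);

void main() {
  final plain = Uri.parse('foo.dart');
  final emptyFragment = Uri.parse('foo.dart#');

  // The empty fragment is retained, so the two URIs compare unequal.
  print('hasFragment: ${plain.hasFragment} vs ${emptyFragment.hasFragment}');
  print('equal as written: ${sameUri('foo.dart', 'foo.dart#')}');

  // Discarding the fragment first makes them compare equal.
  print('equal after removeFragment: '
      '${emptyFragment.removeFragment() == plain}');
}
```

If fragments were always discarded, the `removeFragment` step would have to happen before any identity comparison, otherwise the same file could be loaded as two libraries.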
If every source file must have a file system path, then queries make no sense, and should probably also be errors.
They are perfectly reasonable for http URIs, like http://mysource.example.com/source?lib/foo.dart, but it's not something that can be converted to a file path. We should fail instead of silently removing or accepting queries, if we cannot retain them in the actual file path.
My suggestion is:
- Both CFE and analyzer should treat fragments as errors (even empty ones) in every source URI. Just say "A Dart source URI cannot contain a fragment." and fail compiling.
- They should treat both queries and fragments as errors for all package URIs. "A package URI cannot have a query or fragment".
- And they should treat queries as errors if they ever need to convert the URI to a file path. (Which is likely all the time.) "A URI with a query does not refer to a file", or something.
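The three rules above could be sketched as a single check on the resolved URI. The function name `checkSourceUri` is hypothetical, and "needs to be converted to a file path" is approximated here by "has the file scheme", which is an assumption rather than how either tool decides it.

```dart
/// Hypothetical check: returns an error message for a resolved source
/// URI that violates the suggested rules, or null if it is acceptable.
String? checkSourceUri(Uri uri) {
  // Package URIs: neither a query nor a fragment is allowed.
  if (uri.isScheme('package') && (uri.hasQuery || uri.hasFragment)) {
    return 'A package URI cannot have a query or fragment.';
  }
  // Any source URI: fragments are errors, even empty ones.
  if (uri.hasFragment) {
    return 'A Dart source URI cannot contain a fragment.';
  }
  // Assumption: file: URIs are the ones converted to file paths.
  if (uri.isScheme('file') && uri.hasQuery) {
    return 'A URI with a query does not refer to a file.';
  }
  return null;
}

void main() {
  const inputs = [
    'file:///tmp/some_lib.dart',
    'file:///tmp/some_lib.dart#',
    'file:///tmp/some_lib.dart?x',
    'package:testpkg/some_lib.dart?',
  ];
  for (final input in inputs) {
    print('$input -> ${checkSourceUri(Uri.parse(input)) ?? 'ok'}');
  }
}
```

Because the checks use `hasQuery` and `hasFragment` rather than comparing against the empty string, empty queries and fragments are rejected as well.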
It should make no difference for valid programs, and it will be consistent about which programs are invalid.
I'm not sure any of this is specified behavior, but we should be consistent. (The conversion from URI to file path is unspecified, but the equality of URIs is not, so even if the tools allow importing files from URIs with queries or fragments, they should be imported as different libraries.)
"So same as you again. (I expect you commented lines out to find the crashers.)"
Yes, of course.
I have the same results for package:package_name/uri. But I have different results for file:.... My testing shows that after file: only an absolute path is accepted. file:some_lib.dart is found neither in the same directory nor in bin. There is also an additional issue with absolute paths on Windows. On Linux the following works well in both the analyzer and the VM (referencing the same library).
import 'same_uri_common_lib.dart' as lib1;
import 'file:/home/sgrekhov/Google/sgrekhov/co19/Language/Libraries_and_Scripts/Imports/Semantics_of_Imports/same_uri_common_lib.dart' as lib2;
import '/home/sgrekhov/Google/sgrekhov/co19/Language/Libraries_and_Scripts/Imports/Semantics_of_Imports/same_uri_common_lib.dart' as lib3;
On Windows we have:
import 'same_uri_common_lib.dart' as lib1;
import 'file:C:/Users/sgrek/Work/Google/sgrekhov/co19/Language/Libraries_and_Scripts/Imports/Semantics_of_Imports/same_uri_common_lib.dart' as lib2;
It works in the analyzer, but the VM crashes:
root::file:///C:/Users/sgrek/Work/Google/sgrekhov/co19/Language/Libraries_and_Scripts/Imports/Semantics_of_Imports/same_uri_common_lib.dart is already bound to Reference to file:///C:/Users/sgrek/Work/Google/sgrekhov/co19/Language/Libraries_and_Scripts/Imports/Semantics_of_Imports/same_uri_common_lib.dart with node library file:///C:/Users/sgrek/Work/Google/sgrekhov/co19/Language/Libraries_and_Scripts/Imports/Semantics_of_Imports/same_uri_common_lib.dart (Library:7322), trying to bind to Reference to library file:///C:/Users/sgrek/Work/Google/sgrekhov/co19/Language/Libraries_and_Scripts/Imports/Semantics_of_Imports/same_uri_common_lib.dart with node library file:///C:/Users/sgrek/Work/Google/sgrekhov/co19/Language/Libraries_and_Scripts/Imports/Semantics_of_Imports/same_uri_common_lib.dart (Library:7323)
#0 CanonicalName.bindTo (package:kernel/canonical_name.dart:215:7)
#1 Library.bindCanonicalNames (package:kernel/src/ast/libraries.dart:254:47)
#2 Library.ensureCanonicalNames (package:kernel/src/ast/libraries.dart:259:35)
#3 Component.computeCanonicalNamesForLibrary (package:kernel/src/ast/components.dart:107:13)
#4 BinaryPrinter._computeCanonicalNames (package:kernel/binary/ast_to_binary.dart:571:19)
#5 BinaryPrinter.writeComponentFile.<anonymous closure> (package:kernel/binary/ast_to_binary.dart:590:7)
#6 Timeline.timeSync (dart:developer/timeline.dart:173:22)
#7 BinaryPrinter.writeComponentFile (package:kernel/binary/ast_to_binary.dart:588:14)
And an absolute path on Windows without file: doesn't work in either the analyzer or the VM. No surprise here, 'c:...' is treated as a scheme in this case.
Very good, and thanks @lrhn!
I think this is a tool issue for now. We may then decide whether or not we'd change the language specification to say more about URIs when the tools agree. I removed the 'area-language'.
The 'file:some_lib.dart' shouldn't work. That is an absolute URI, meaning the same thing as file:///some_lib.dart, a file in the root directory.
The file:/absolute/path/ works because it points to the correct file. The lib2 and lib3 on Linux should count as the same library: they both resolve to file:///absolute/path, so it's the same URI.
You can use import '/C:/absolute/path'; and import '///C:/absolute/path'; on Windows to refer to an absolute path without conflicting with the scheme syntax. (Also escaping the colon, import 'C%3A/absolute/path';, is actually a relative path at the URI level, so it would be resolved against the current directory.)
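These forms can be compared with a short `dart:core` sketch. The base URI `file:///C:/proj/bin/main.dart` and the helper `resolveFrom` are made up for illustration, and the sketch only shows URI-level behavior, not what the analyzer or VM do afterwards.

```dart
/// Hypothetical helper: resolves [import] against the containing
/// library's URI.
Uri resolveFrom(String base, String import) =>
    Uri.parse(base).resolve(import);

void main() {
  // Made-up location of the importing library on Windows.
  const base = 'file:///C:/proj/bin/main.dart';

  // A bare drive letter parses as a URI scheme, not as a path.
  print('C:/absolute/path has scheme: '
      '${Uri.parse('C:/absolute/path').hasScheme}');

  // A leading slash (or empty authority) keeps "C:" inside the path.
  final rooted = resolveFrom(base, '/C:/absolute/path');
  print(rooted);
  print(resolveFrom(base, '///C:/absolute/path'));
  print(rooted.toFilePath(windows: true));

  // An escaped colon gives a relative reference, resolved against base.
  print(Uri.parse('C%3A/absolute/path').hasScheme);
  print(resolveFrom(base, 'C%3A/absolute/path'));

  // file: followed by a non-rooted path, as in file:some_lib.dart.
  print(Uri.parse('file:some_lib.dart'));
}
```

Printing the last line is a quick way to see how the `Uri` class itself interprets `file:some_lib.dart`, independent of the current directory.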