Portable resolver test suite
This was brought up in https://github.com/nodejs/modules/issues/451#issuecomment-575282242 for resolver stability, that having tests available for tools and userland resolvers would be a great help in ensuring that they are compliant with the Node.js ES module resolver.
If we could move the resolution tests into something resembling an independent test suite that could easily be run against any async resolver function that could assist the ecosystem in this transition process.
So -
- Do we agree this would be a useful thing to provide? There were no objections when this was discussed previously in the meeting I believe.
- If so, how would we offer such a portable test suite in a useful way? Suggestions re useful patterns here would be welcome.
-
This would be amazingly useful.
-
Perhaps a function that takes a user-provided "resolve" function, and runs a bunch of assertions or tape tests with it?
Yes please, that would be awesome 👍
I'm a big fan of data driven conformance tests for this kind of thing. If somebody has a Rust/Java/C++ implementation, they can use it without having to worry about JS bindings. A test suite that takes a JS resolution function could be implemented on top of it but I could imagine that some projects would prefer to pull the conformance repo and integrate into their own test suite and runner instead.
If you provided a JS function (that could be integrated into an existing JS test suite) and then also provided a hook for someone to provide a CLI tool, then that would enable both patterns.
Possible straw API:
import RESOLUTIONS from '@nodejs/resolver-conformance-tests';
describe('node resolver-conformance', () => {
for (const resolutionTest of RESOLUTIONS) {
// Don't want to support non-import resolution.
if (!resolutionTest.environments.includes('import')) continue;
it(resolutionTest.id, async () => {
for (const file of resolutionTest.files) {
if (file.type === 'symlink') { /* ... */ }
testFS.writeFileSync(file.relativePath, file.content);
}
assertEquals(
resolutionTest.resolvedURL,
await resolve(resolutionTest.specifier, resolutionTest.referrerURL)
);
});
}
});
Some of that is boilerplate but a lot of it is really handy for people who want it to work nicely in their test suite (e.g. rerunning failing tests only). E.g. we could have dedicated support for writing the files that belong to the test case to a given FS implementation.
@jkrems, would you mind clarifying what the contents of RESOLUTIONS is in your example? I also don't understand why you are writing to a file?
The content of RESOLUTIONS is a data structure hinted at in the code consuming it. It’s a collection of conformance test cases with the file layout they require, the resolution they test, and the expected URL after resolution finishes. The test case is writing files to realize the file layout for the active test case because it’s not a given that the resolver under test directly interacts with the file system.
It seems like this is the data structure. Can you reconfirm?
// @nodejs/resolver-conformance-tests
// RESOLUTIONS
export default [
{
// resolutionTest[0]
/** @type {Array<string>} Array of conditional exports condition names. */
environments: ['import'],
/** @type {string} Test case ID. */
id: 'relative-from-referrer-module',
/** @type {string} The specifier to be tested for resolution. */
specifier: './example.mjs',
/** @type {string} The conclusion of the resolution. */
resolvedURL: 'file:///absolute/path/to/example.mjs',
/** @type {string} The file URL string of the module making the request. */
referrerURL: 'file:///absolute/path/to/referrer.mjs',
/** @type {Array<Object>} */
files: [
{
/** @type {string} The filetype. */
type: 'hardlink',
/** @type {string} Relative file path to a module. */
relativePath: './alpha.mjs',
/** @type {string} Source code of the module. */
content: 'import * as ns from ...',
},
],
},
];
The test case is writing files to realize the file layout for the active test case because it’s not a given that the resolver under test directly interacts with the file system.
It sounds like the resolver you have in mind would be interacting with an in-memory file system (at least during testing).
I'm not 100% convinced that the filesystem needs to be replicated.
It seems like this is the data structure. Can you reconfirm?
That looks about right. Just want to point out that this was just a rough pseudo-code example of the kind of data required for a conformance data set. I assume gaps would pop up as one would actually try to implement one.
It sounds like the resolver you have in mind would be interacting with an in-memory file system (at least during testing).
I wanted to make sure that virtualized file systems could use the test suite. Another approach for this would be that the test case references a fixture directory. That way resolvers that use anything but the actual disk could at least copy the directory into whatever data structure they use.
export default [ // RESOLUTIONS
One important note: The test cases would hopefully exist as plain JSON. That way a Java implementation for some IDE could use it without having to execute JavaScript just to get the data.
If I understand correctly, files is an array of objects each with a potentially different value for the content (source text) key. I imagine they would be duplicates, so why not make it its own top-level key?
According to the spec, HostResolveImportedModule states the following.
Multiple different referencingScriptOrModule, specifier pairs may map to the same Module Record instance. The actual mapping semantic is implementation-defined but typically a normalization process is applied to specifier as part of the mapping process. A typical normalization process would include actions such as alphabetic case folding and expansion of relative and abbreviated path specifiers.
The files array of objects can have various relativePaths mapping to the same Module Record instance, but once normalization of the paths occurs, (i.e. path.resolve + fs.realpath), they should all be the same if I understand correctly.
Along with specifier to absolute URL resolutions, I'm also interested in what format was resolved.
The getFormat hook may one of the ways we're able to confirm/deny validity of resolutions.
I imagine they would be duplicates, so why not make it its own top-level key?
Can you explain what you mean by duplicates? I think it would be upon the maintainers of the conformance tests to ensure that the files aren't duplicated within a test. Using potentially shared fixture directories would make that easier than to put the file contents inside of the data structure though.
The
filesarray of objects can have variousrelativePathsmapping to the same Module Record instance, [...]
I think there's a misunderstanding about what files is, maybe. It's strictly meant as a "test environment". The relativePaths says where the file is. It's a path, not a specifier. So the quote of the spec about specifier normalization shouldn't apply. relativePaths would always be normalized since it doesn't really make sense to put non-normalized paths into test setup data.
One example for "why have files at all" would be package.json. E.g. the following test case may exist:
{
"id": "self-reference-from-pkg-root",
"specifier": "enclosing-pkg-name",
"resolvedURL": "./pkg-root/lib/pkg.mjs",
"referrerURL": "./pkg-root/ref.mjs",
"environments": ["import"],
"files": [
{
"type": "file",
"relativePath": "./pkg-root/package.json",
"content": "{\"name\":\"enclosing-pkg-name\",\"exports\":\"./lib/pkg.mjs\"}",
}
]
}
Alternatively, with fixture directories it may be:
{
"id": "self-reference-from-pkg-root",
"specifier": "enclosing-pkg-name",
"resolvedURL": "./pkg-root/lib/pkg.mjs",
"referrerURL": "./pkg-root/ref.mjs",
"environments": ["import"],
"fixturesDir": "self-reference"
}
Where <test suite>/fixtures/self-reference/pkg-root/package.json would contain the same content as the files entry above.
Using potentially shared fixture directories would make that easier than to put the file contents inside of the data structure though.
👍
I think there's a misunderstanding about what files is, maybe.
Since the intention is to stabilize the resolver, I don't think specifiers would be limited to bare specifiers. Specifiers can be paths (absolute, relative) and the default resolver always returns an object containing url property, which is itself an absolute file URL string. This would need to be processed prior to comparison.
It's strictly meant as a "test environment". The
relativePathsays where the file is. It's a path, not a specifier. So the quote of the spec about specifier normalization shouldn't apply.
Okay, I understand this point now. relativePath is supposed have fixturesDir define the relative location.
I must admit, I've been a bit stumped on this issue due to its massive/unclear scope.
I would need clarity on what module specifier resolution we should be testing? Here are the various situations I'm currently aware of (and I've found bugs/inconsistencies between):
1. Module specifiers as arguments to the node CLI:
node ./foo.mjs
- In Bash
- In PowerShell
- In cmd.exe
2. Module specifiers as import specifiers in static import statements:
import * as ns from './foo.mjs';
- In the REPL (currently impossible)
- As a string passed to
--eval - In a module file
3. Module specifiers as import specifiers in dynamic import():
(async () => { await import('./foo.mjs'); })();
- In the REPL
- As a string passed to
--eval - In a module file
That's not too bad. 33 = 27
The thing is, that's only considering relative ./ when there's actually 3 of these:
- FS root:
/ - FS relative:
./ - FS updir:
../
That's not too bad. 34 = 81
Then, you also have:
- Bare specifiers:
resolve - Scoped package bare specifiers:
@nodejs/resolver-conformance-tests - Deep import specifiers
@nodejs/resolver-conformance-tests/test/fixtures/constants.mjs - File URLs:
file://,node://,data://(are there others?) - Are they absolute?
file:/// - Do they work on Windows?
file://C:/
Thanks in advance.
Footnotes (off topic)
Terms
- Sigil: A way to self-reference that is independent of the package name.
TODO
- Document the correct rooted Windows URL specifier as it is currently wrong.
- Double-check math after investigating this potential compatability matrix.
- Console
- Shell
- Various terminal emulators
The test suite should be based on resolve(specifier, context) -> Promise<resolved> being tested, and possibly also a getFormat - effectively just like the loader hooks implementation themselves.
So I would worry less about testing syntax / REPL etc and more about effectively getting "coverage" on the specification (https://nodejs.org/dist/latest-v13.x/docs/api/esm.html#esm_resolver_algorithm).
Then, yes all of these that you have listed are cases that need to be covered:
- FS root: /
- FS relative: ./
- FS updir: ../
- Bare specifiers: resolve
- Package-scoped bare specifiers: @nodejs/resolver-conformance-tests
- Deep import specifiers @nodejs/resolver-conformance-tests/test/fixtures/constants.mjs
- File URLs: file://, node://, data:// (are there others?)
- Are they absolute? file:///
- Do they work on Windows? file://C:/
As well as things like symlinks, exports, self resolution, exports edge cases, errors and conditions, file extension lookups on the CJS legacy paths, path encoding (eg emojis in file names, and avoiding URL injections).
All of these cases are covered with the resolutions done by the tests over all of the test/es-modules. So simply extracting those variations out of the main test suite could be one approach.
Using JSON for this isn't going to work due to needing to constantly escape the quotes needed for module specifiers. We need a new data interchange format. YAML is perfect for this. I highly recommend it. Does anyone foresee problems with distributing this data structure in YAML?
I would strongly prefer "literally anything but yaml". TOML, even.
TOML, even.
If I'm not mistaken, TOML uses quotes too, so we would have the same problem as we have with JSON.
Why does escaping make it untenable? The JSON doesn't need to be hand-edited.
In https://github.com/nodejs/modules/issues/472#issuecomment-584994986 there is a files.content that is supposed to contain actual module source text. Escaping all quotes there will inaccurately portray reality.
Only to a human reading the raw JSON file - is that a valuable audience?
I think it would be best to keep that as a .mjs file using a template literal for the value of the files.content key (just have to escape backticks if they ever come up) and then we can serialize the whole data structure to any other static data interchange format later. The convenience of being able to copy source and paste it into there is why I was advocating for YAML, which could be one of the several formats we distribute.
It’d probably be better for the json to have a file path pointing to an mjs file, so we can link and evaluate the js more easily.