endo icon indicating copy to clipboard operation
endo copied to clipboard

when `bundleSource` cannot find a module, show the import graph path

Open warner opened this issue 1 year ago • 0 comments

@erights and I were just tracking down a confusing error message from bundleSource(). It turned out that the nominally-pure module (a contract entrypoint) he was trying to bundle had a library which was modified to include a utility from SwingSet. That utility was pure, however the library's import statement pointed at Swingset's top-level index.js (it was not a "deep import"). And Swingset's index.js is mainly used to expose the swingset controller, which is not pure (it imports fs and process and other Node.js builtins).

So the files looked something like this:

  • contract directory has:
    • contract.js: import { thing } from './library.js';
    • library.js: import { pureUtility } from '@agoric/swingset-vat';
  • and then @agoric/swingset-vat has:
    • index.js: export { makeController } from './src/controller.js'; export { pureUtility } from './src/pureUtility.js''
    • src/controller.js: import fs from fs;
    • src/pureUtility.js: no imports

The bundleSource failed, of course, because the import-reachability graph strayed into the Node.js builtin fs module, which has no source. The swingset index.js re-exports pureUtility.js, but that doesn't mean getting it from there will be "pure" (in the "all source code can be found" sense, which is maybe 10 degrees off from the "comes with additional authority" sense). Having that path name in your object graph causes the entire object graph to become impure.

The two ways to fix this problem are 1: do a deep import { pureUtility } from '@agoric/swingset-vat/src/pureUtility.js', or 2: move the pure utility into a separate package (in which everything, index.js on down, is pure). Once we realized the problem, the options are fairly clear.

But the real issue is that the error message it emitted was a big list of all the Node.js builtins that controller.js and its friends were trying to import, none of which helps to track down the bug, which was really in library.js. From the text, I think it was just a Set of pathnames, joined with spaces (which made it hard to even parse, visually).

What I think I want instead is an ordered path through the import graph, from the entry point to the module whose source could not be found, displayed as one path per line. If there are multiple files that cannot be found (and there will probably dozens), I might be content to only show the path info for the first one, plus maybe a line like "... and N other modules".

If it looked more like:

unable to find source for module 'fs'
* import path follows:
* `./contract.js`
* `./library.js`
* `@agoric/swingset-vat` (`index.js`)
* `@agoric/swingset-vat/src/controller.js`
* `fs`

then we'd probably glance at the list and say "hey, wait a minute, we weren't trying to import controller.js, why is that there?", and then we'd notice the index.js, and then we'd realize that we were relying on a re-export that wasn't trying to maintain purity.

I'm guessing this would require a lot more bookkeeping in the archive traversal code, especially because the module graph isn't even a DAG (yay cycles), much less a single-parent tree. So the question "what is the path from entry point to missing module" can have lots of answers, and we kinda want the shortest path. But I bet any path would do. But even avoiding getting into trouble with cycles might make this a nuisance to implement.

But, I suspect it's going to become pretty important when we have more folks writing contracts, and they try to import their favorite (non-pure) libraries, and then wonder why the bundling process fails.

https://github.com/Agoric/agoric-sdk/issues/5823 was the issue.

cc @kriskowal

warner avatar Jul 26 '22 00:07 warner