pulumi
pulumi copied to clipboard
Consider lazily exporting sub-modules
The Azure Native package is quite large, and the transitive closure of source code required to load if you import the base package in at least TypeScript can incur sizable startup penalties.
For example, the pulumi new azure-typescript
template takes just a few seconds to preview on my laptop (not bad):
import * as pulumi from "@pulumi/pulumi";
import * as resources from "@pulumi/azure-native/resources";
import * as storage from "@pulumi/azure-native/storage";
// Create an Azure Resource Group
const resourceGroup = new resources.ResourceGroup("resourceGroup");
// Create an Azure resource (Storage Account)
const storageAccount = new storage.StorageAccount("sa", {
resourceGroupName: resourceGroup.name,
sku: {
name: storage.SkuName.Standard_LRS,
},
kind: storage.Kind.StorageV2,
});
// Export the primary key of the Storage Account
const storageAccountKeys = pulumi.all([resourceGroup.name, storageAccount.name]).apply(([resourceGroupName, accountName]) =>
storage.listStorageAccountKeys({ resourceGroupName, accountName }));
export const primaryStorageKey = storageAccountKeys.keys[0].value;
real 0m4.859s
user 0m3.937s
sys 0m1.231s
An innocuous variant, however, explodes that time to 50 seconds to preview:
import * as pulumi from "@pulumi/pulumi";
import * as az from "@pulumi/azure-native";
// Create an Azure Resource Group
const resourceGroup = new az.resources.ResourceGroup("resourceGroup");
// Create an Azure resource (Storage Account)
const storageAccount = new az.storage.StorageAccount("sa", {
resourceGroupName: resourceGroup.name,
sku: {
name: az.storage.SkuName.Standard_LRS,
},
kind: az.storage.Kind.StorageV2,
});
// Export the primary key of the Storage Account
const storageAccountKeys = pulumi.all([resourceGroup.name, storageAccount.name]).apply(([resourceGroupName, accountName]) =>
az.storage.listStorageAccountKeys({ resourceGroupName, accountName }));
export const primaryStorageKey = storageAccountKeys.keys[0].value;
real 0m49.224s
user 0m44.315s
sys 0m9.584s
Notice that this imports the root package instead of submodules directly. Many of our API examples do such things, and a lot of the idiomatic Pulumi TypeScript code we've written over the years does this. It's actually my personally preferred style.
It turns out I think there's an easy, compatible fix, that would let this style perform just as well as importing submodules directly: lazily fetch the submodules using JavaScript properties, rather than eagerly loading them. I did a quick-and-dirty prototype, and it indeed brought the program's runtime down to the same ballpark as the submodule approach.
The root index.ts
has to do something like this:
// The code already eagerly declares properties for each submodule, before requiring them; kept as-is:
exports.datalakeanalytics = exports.datafactory = exports.datadog = exports.datacatalog = exports.databricks = exports.databoxedge = exports.databox = exports.customproviders = exports.customerinsights = exports.costmanagement = exports.containerservice = ex... = void 0;
exports.m365securityandcompliance = exports.logz = exports.logic = exports.labservices = exports.kusto = exports.kub... exports.storage = exports.sqlvirtualmachine = void 0;
// We simply remove all of the lines such as:
// const aad = require("./aad");
// exports.aad = aad;
// const aadiam = require("./aadiam");
// exports.aadiam = aadiam;
// ...
// And replace them with code that dynamically loads (and caches) modules as they are actually used:
for (let prop of Object.getOwnPropertyNames(exports)) {
let __cachedModule;
Object.defineProperty(exports, prop, {
get: function() {
let cached = __cachedModule;
if (cached) {
return cached;
}
return __cachedModule = require("./" + prop);
}
});
}
Now, when I go back and run the originally slow program, I get:
real 0m4.933s
user 0m3.954s
sys 0m1.221s
I think this is a compatible change, but I am no JavaScript guru, and there may be some subtle reason this won't work.
An end user originally reported this and we had a pretty detailed internal conversation about possible paths forward. Assuming it works, we may want to use this technique for other modules, even if they aren't quite as enormous as this one.
One reason this may not work is that it prevents module side effects from taking place, like:
pulumi.runtime.registerResourceModule("azure-native", "storage", _module);
This didn't impact my ability to run the simple program above, but I assume something (multilang components?) needs it.
I'm not sure if there's a way to work around that. I suppose we could split the metadata out into a separate initialization routine, but that begs the question, perhaps the slowness is actually caused by some of these codepaths.
This looks awesome! Should we track this in pulumi/pulumi instead, since all the SDK generation lives there?
Wonderful.
Things to watch out here similar to Python besides making sure side effects are applied (in Python there were important side-effects related to init'ing providers) is that type-checking continues to work. If exports.datadog
type becomes Any
there's some unfortunate information loss to consider...
https://github.com/pulumi/pulumi/pull/6827 for ref was the python change, and there are still a few tickets pending about making sure the IDE experience and type checking is on par with the status quo ante.
Agree with @mikhailshilkov here, should we move to pulumi/pulumi since this seems more generally applicable?
This didn't impact my ability to run the simple program above, but I assume something (multilang components?) needs it.
This is to support deserializing resource references. We should be able to make it lazy.
I should note, if you do the workaround (import modules specifically), it too skips a whole bunch of the initialization. So, if this is a problem with the proposal above, it's likely already a problem with code being written in the wild.
Just my 2c.. Based on having solved it for Python, I hope the initialization problem is not a real blocker here. In Python we edited codegen to move init work to the top-level module, which always gets run no matter which sub-module you import (Python feature). Since the init work declares resource tables, an alternative path forward would be to tinker with the machinery that consults these tables to load modules on-demand.
Based on having solved it for Python, I hope the initialization problem is not a real blocker here. In Python we edited codegen to move init work to the top-level module
Agree 100%. The positive benefit of this would be significant. And as noted, if the module loading issues are a problem with this change, they are already a problem with the very workarounds we're giving to folks (per-module importing).
Feels like we should take this change ASAP to bank the wins, address the negative user experience, and figure out the module loading story as a longer-term initiative as it (at least to me) appears rather theoretical at the moment.
Did a quick spike and found a way to implement this in only typescript - to avoid having to do a replace on the generated JS: https://github.com/pulumi/pulumi-azure-native/pull/1879/commits/d00083902d008d906eb4418cd4973665dfb5e6e4. The only downside of this approach is that the export types look a little unusual though function fine:
Typescript implementation to create correctly typed export stub:
export const myModule: typeof import("./myModule") = undefined as any;
Generated definition:
export declare const myModule: typeof import("./myModule");
The loop to do the actual export can then just be written in typescript too:
for (let prop of Object.getOwnPropertyNames(exports)) {
if (prop == "Provider" || prop.startsWith("_")) {
continue;
}
let __cachedModule: any;
Object.defineProperty(exports, prop, {
get: function () {
let cached = __cachedModule;
if (cached) {
return cached;
}
return (__cachedModule = require("./" + prop));
},
});
}
There's also additional edge cases of other members which are also exported (e.g. __esModule
and Provider
) but these wouldn't be too hard address.
JS manipulation
The removal and insertion of the JS could work, though it would have to be part of the make target rather than codegen so this isn't an idea route.
As a side note, if using the JS replacement, we should also switch to combining the import & export step to use the form:
export * as myModule from "./myModule";
Hitting a snag in the typescript-only solution space. There are two styles of import that are currently supported, and I cannot figure out a solution that does not break one or the other of them. For example:
==> m1.ts <==
import { output } from "./types";
export function foo(x: output.MyOutputType): output.MyOutputType {
return x;
}
console.log(output.myConst);
==> m2.ts <==
import * as types from "./types";
export function foo(x: types.output.MyOutputType): types.output.MyOutputType {
return x;
}
console.log(types.output.myConst);
==> output.ts <==
export interface MyOutputType {
x: number;
}
export const myConst: number = 1;
==> types.ts <==
import * as outputType from "./output";
const output: typeof outputType = {} as any;
/*
Before:
import * as output from "./output";
*/
export {
output
}
This complains:
m1.ts:3:24 - error TS2503: Cannot find namespace 'output'.
3 export function foo(x: output.MyOutputType): output.MyOutputType {
~~~~~~
m1.ts:3:24 - error TS4078: Parameter 'x' of exported function has or is using private name 'output'.
3 export function foo(x: output.MyOutputType): output.MyOutputType {
~~~~~~
m1.ts:3:46 - error TS2503: Cannot find namespace 'output'.
3 export function foo(x: output.MyOutputType): output.MyOutputType {
~~~~~~
m1.ts:3:46 - error TS4060: Return type of exported function has or is using private name 'output'.
3 export function foo(x: output.MyOutputType): output.MyOutputType {
~~~~~~
m2.ts:3:30 - error TS2694: Namespace '"/Users/anton/tmp/pulumi-7653/types"' has no exported member 'output'.
3 export function foo(x: types.output.MyOutputType): types.output.MyOutputType {
~~~~~~
m2.ts:3:58 - error TS2694: Namespace '"/Users/anton/tmp/pulumi-7653/types"' has no exported member 'output'.
3 export function foo(x: types.output.MyOutputType): types.output.MyOutputType {
~~~~~~
Found 6 errors in 2 files.
Errors Files
4 m1.ts:3
2 m2.ts:3
It seems that TypeScript distinguishes between namespace exports and value exports, but I can't find a way to have it accept unrelated definitions under one export "outputs" - one for the namespace "outputs" and another one for the prop "outputs".
Another attempt:
==> m1.ts <==
import { output } from "./types";
export function foo(x: output.MyOutputType): output.MyOutputType {
return x;
}
console.log(output.myConst);
==> m2.ts <==
import * as types from "./types";
export function foo(x: types.output.MyOutputType): types.output.MyOutputType {
return x;
}
console.log(types.output.myConst);
==> output.ts <==
export interface MyOutputType {
x: number;
}
export const myConst: number = 1;
==> types.ts <==
import type * as output from "./output";
export {
output
}
Result:
m1.ts(7,13): error TS1361: 'output' cannot be used as a value because it was imported using 'import type'.
m2.ts(7,19): error TS2339: Property 'output' does not exist on type 'typeof import("/Users/anton/tmp/pulumi-7653/types")'.
I've worked out a better example building off Daniel's prototype here: https://github.com/pulumi/pulumi-azure-native/pull/1944
The break is in compiling TS programs that use non-recommended import forms and reference those imported forms in type positions (values positions are OK, which would be a lot more common).
We're still looking at some options to make the change completely non-breaking.
But it might be worth discussing accepting this breaking change and documenting it in the changelog. The workaround for users hitting this is switching to the preferred form of imports e.g. "@pulumi/azure-native/storage" vs "@pulumi/azure-native".
Implemented this as a codegen feature to benefit all providers n https://github.com/pulumi/pulumi/pull/10538
The next upgrade of providers should result in lazy-loaded modules.
For azure-native specifically we also recommend setting useTypeOnlyReferences=true switch in the nodejs codegen options to further optimize programs that are avoiding top-level import.