ProxyProviderNotResolvedException thrown even after resolveProxyProviders(...)

Open Natashkinsasha opened this issue 9 months ago • 11 comments

ProxyProviderNotResolvedException thrown even after resolveProxyProviders(...) inside cls.run(...)

Summary

We occasionally get ProxyProviderNotResolvedException when accessing a proxy provider, despite resolving it right before usage inside clsService.run(...).

This happens rarely and non-deterministically in production workers (BullMQ). It looks like a race between proxy resolution and access, or a context boundary issue.

Error

ProxyProviderNotResolvedException: Cannot access the property "getHash" on the Proxy provider UserSettingDocument because is has not been resolved yet and has been registered with the "strict: true" option. Make sure to call "await cls.resolveProxyProviders()" before accessing the Proxy provider.
    at Function.create (/app/node_modules/nestjs-cls/dist/src/lib/proxy-provider/proxy-provider.exceptions.js:69:16)
    at checkAllowedPropertyAccess (/app/node_modules/nestjs-cls/dist/src/lib/proxy-provider/proxy-provider-manager.js:180:81)
    at Object.get (/app/node_modules/nestjs-cls/dist/src/lib/proxy-provider/proxy-provider-manager.js:119:51)
    at QuestPrizeGeneratorWorker.resolveProxyProviders (/apps/meta-server/src/@job/common/job-worker.ts:46:59)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at QuestPrizeGeneratorWorker.prepareContext (/apps/meta-server/src/@job/common/context-job.worker.ts:24:5)
    at QuestPrizeGeneratorWorker.execute (/apps/meta-server/src/@job/common/context-job.worker.ts:36:5)
    at /app/node_modules/bullmq/src/classes/worker.ts:910:26
    at Worker.retryIfFailed (/app/node_modules/bullmq/src/classes/worker.ts:1174:16)

What we do

@Injectable()
export class ExampleService {
  constructor(
    private readonly clsService: ClsService,
    @Inject(UserSettingDocument) private readonly settingDocument: UserSettingDocument, // proxied provider
  ) {}

  public example() {
    return this.clsService.run(async () => {
      await this.clsService.resolveProxyProviders([UserSettingDocument]);
      // Rarely throws ProxyProviderNotResolvedException here:
      this.settingDocument.getHash();
      return true;
    });
  }
}

Expected behavior

No exception is thrown after await clsService.resolveProxyProviders([...]) has completed within the same cls.run(...) scope.

Actual behavior

Very rarely (hard to reproduce locally), an exception is thrown saying the proxy has not been resolved yet. It happens in BullMQ worker jobs.

Environment

  • nestjs-cls: 5.0.1
  • NestJS: 10.3.5
  • Node.js: 18.20.0
  • Package manager: yarn 1.22.19
  • Platform: BullMQ workers (separate process), Linux (Docker)
  • BullMQ: 5.8.4
  • strict mode: strict: true for proxy provider registration
  • cls setup: ClsModule.forRoot({ global: true, middleware: { mount: true } })
  • Repository type: monorepo (private)

Minimal reproduction (suggested outline)

I haven’t been able to produce a stable repro, but a flaky one could be:

  • Start a BullMQ worker that wraps each job in clsService.run(...).
  • Register a proxy provider with strict: true.
  • Inside a job handler:
    1. await cls.resolveProxyProviders([ProxyToken])
    2. await new Promise(r => setImmediate(r)) (force a macro-task hop)
    3. Access a method on the proxied provider. Sometimes this still throws in our environment.

If there’s an official recommended pattern, I’m happy to test it and provide a full repro repo.
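
For reference, a minimal sketch of what step 2 of the outline does to the async context, using Node's AsyncLocalStorage directly (the primitive nestjs-cls is built on) rather than nestjs-cls itself. The names `JobContext`, `readJobIdAfterHop` and `runConcurrentJobs` are hypothetical and exist only for illustration. The sketch suggests that a `setImmediate` hop alone should not lose the context, so this step may not be what triggers the error.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical per-job context, standing in for the CLS store.
interface JobContext {
  jobId: string;
}

const storage = new AsyncLocalStorage<JobContext>();

// Enter a context, force a macro-task hop, then read the context again.
async function readJobIdAfterHop(jobId: string): Promise<string | undefined> {
  return storage.run({ jobId }, async () => {
    await new Promise<void>((resolve) => setImmediate(resolve));
    return storage.getStore()?.jobId;
  });
}

// Run two "jobs" concurrently to check that their contexts do not mix.
async function runConcurrentJobs(): Promise<(string | undefined)[]> {
  return Promise.all(['job-1', 'job-2'].map((id) => readJobIdAfterHop(id)));
}

runConcurrentJobs().then((ids) => console.log(ids));
```

Each job reads back its own id after the hop, which points away from a plain context-boundary problem at this step.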

Workarounds tried

  • Moving resolveProxyProviders earlier in the call chain — still happens.
  • Ensuring access happens synchronously right after resolve (this reduced the frequency but didn’t eliminate the error).

Thanks a lot! Happy to provide more logs or build a repro if you can point me to the right pattern to ensure context consistency.

Natashkinsasha • Aug 11 '25 13:08

Hi, thank you for the detailed description.

This could happen if the UserSettingDocument Proxy provider depends on another Proxy provider.

Before we continue, please update nestjs-cls to at least 5.3.0, as there were changes made to the internal resolution handling of Proxy providers and this behavior was fixed: https://github.com/Papooch/nestjs-cls/releases/tag/nestjs-cls%405.3.0

If the issue is still present after updating, then I'll ask for a runnable reproduction.

Papooch • Aug 12 '25 05:08

Thanks for the quick reply!

We’ve already updated to nestjs-cls 5.4.1 (so ≥ 5.3.0). The issue still occurs very rarely.

Per your hint about dependencies: our UserSettingDocument proxy is registered via ClsModule.forFeatureAsync and depends on other providers. Sharing the exact snippet:

ClsModule.forFeatureAsync({
  provide: UserSettingDocument,
  imports: [SettingDocumentCoreModule],
  inject: [ClsService, AllSettingDocument, SettingDocumentService],
  useFactory: async (
    cls: ClsService,
    allSettingDocument: AllSettingDocument,
    settingDocumentService: SettingDocumentService,
  ): Promise<DefaultSettingDocument> => {
    const user = cls.get<User>(USER);
    if (user?.isExist() && user.getSettingDescriptor()) {
      const settingDescriptor = user.getSettingDescriptor();
      if (settingDescriptor) {
        return settingDocumentService.get(settingDescriptor.groups);
      }
    }
    return allSettingDocument;
  },
  global: true,
  strict: true,
});

Natashkinsasha • Aug 12 '25 17:08

Hm, looking at the snippet, is it possible that

 settingDocumentService.get(settingDescriptor.groups);

returns undefined sometimes?

Because then the value of the entire Proxy provider would be undefined, which is what triggers the error. There is currently no way to tell the difference between a Proxy provider that resolved to undefined and one that has not been resolved.

Papooch • Aug 12 '25 17:08

Thanks for pointing that out.

In our case, settingDocumentService.get(settingDescriptor.groups) should always return an object (never undefined). That said, if we imagine there was a chance for it to return undefined, then yes — doing something like:

return (await settingDocumentService.get(settingDescriptor.groups)) ?? null;

would indeed prevent the ProxyProviderNotResolvedException, because the provider would be explicitly resolved to null rather than being indistinguishable from an unresolved proxy.

Natashkinsasha • Aug 12 '25 18:08

I'm afraid using null would make no difference currently. Looking at the implementation: https://github.com/Papooch/nestjs-cls/blob/4d54fc3fcbe6e0af1600ab4f4c385629f0597890/packages/core/src/lib/proxy-provider/proxy-provider-manager.ts#L158-L180

It uses the ?? check, which makes no distinction between null and undefined. Using primitive values for Proxy providers is not supported anyway. The factory should always return a value of type object or function to avoid runtime errors.
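
To illustrate the point about the ?? check, here is a simplified model, not the actual nestjs-cls implementation. The `Store` type, the `UNRESOLVED` sentinel and both helper functions are hypothetical names introduced for this sketch. It compares a nullish check with a key-presence check on a plain Map.

```typescript
// Hypothetical stand-in for the per-context store of resolved provider values.
type Store = Map<string, unknown>;

const UNRESOLVED = Symbol('unresolved');

// A nullish check: null, undefined and a missing key all look "unresolved".
function isResolvedNullish(store: Store, token: string): boolean {
  const value = store.get(token) ?? UNRESOLVED;
  return value !== UNRESOLVED;
}

// A presence check: any stored value, even null or undefined, counts as resolved.
function isResolvedByPresence(store: Store, token: string): boolean {
  return store.has(token);
}

const store: Store = new Map<string, unknown>();
store.set('resolvedToObject', { getHash: () => 'abc' });
store.set('resolvedToNull', null);
store.set('resolvedToUndefined', undefined);

const tokens = ['resolvedToObject', 'resolvedToNull', 'resolvedToUndefined', 'neverResolved'];
const nullishResults = tokens.map((token) => isResolvedNullish(store, token));
const presenceResults = tokens.map((token) => isResolvedByPresence(store, token));

console.log(nullishResults); // [ true, false, false, false ]
console.log(presenceResults); // [ true, true, true, false ]
```

In this model only the presence check separates "resolved to null or undefined" from "never resolved", which is why returning null from the factory would not help with a nullish check.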

There is certainly room for improvement in terms of clarity of the ProxyProviderNotResolvedException. I will look into that, so it actually only throws when the provider has not been resolved. Also, there should be an error when we return a primitive value from a Proxy provider factory.

Papooch • Aug 12 '25 18:08

I updated to the latest version. I hope this helps. I'll keep an eye on it.

Natashkinsasha • Aug 13 '25 16:08

I’ve updated to v6.0.1, but the issue still occurs. In my resolver I always return an object (never null), so the problem is not caused by returning null.

Natashkinsasha • Aug 28 '25 18:08

Sorry for the late reply. I am on vacation and will look into this as soon as I come back.

However, if you could create a minimal reproduction, that would help immensely. This is the only occurrence of such an issue.

Papooch • Sep 10 '25 14:09

I noticed a certain pattern: the ProxyProviderNotResolvedException issue only occurs in Bull workers and only before the HTTP server is started. After the HTTP server is up, this error no longer appears.

Natashkinsasha • Sep 23 '25 17:09

Ah, it makes sense then! Thank you for the investigation.

The library only properly registers all Proxy providers and plugins in onApplicationBootstrap. That is because you can register a Proxy provider or a plugin in any module, so onModuleInit could miss Proxy providers from some modules that have not been initialized yet.

I suspect that your Bull processors start consuming messages before onApplicationBootstrap (likely in onModuleInit; see https://docs.nestjs.com/fundamentals/lifecycle-events), which could cause this issue.

I will need to check the source to confirm this, but as a fix, you could make sure that your bull consumers start after the application has successfully bootstrapped (if it is possible).
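
The suspected ordering problem can be sketched with a small standalone model. This is not nestjs-cls, NestJS or BullMQ code: `ProxyProviderRegistry`, `JobConsumer` and `simulateStartup` are hypothetical names, and the assumption that resolving before registration yields nothing is a simplification of the behavior described above.

```typescript
type StartHook = 'onModuleInit' | 'onApplicationBootstrap';

class ProxyProviderRegistry {
  private registered = false;

  // Models the library registering Proxy providers in onApplicationBootstrap.
  onApplicationBootstrap(): void {
    this.registered = true;
  }

  // Assumption: before registration there is nothing to resolve.
  resolve(): { getHash(): string } | undefined {
    return this.registered ? { getHash: () => 'hash' } : undefined;
  }
}

class JobConsumer {
  readonly outcomes: string[] = [];
  private readonly registry: ProxyProviderRegistry;
  private readonly startHook: StartHook;

  constructor(registry: ProxyProviderRegistry, startHook: StartHook) {
    this.registry = registry;
    this.startHook = startHook;
  }

  onModuleInit(): void {
    if (this.startHook === 'onModuleInit') this.processJob();
  }

  onApplicationBootstrap(): void {
    if (this.startHook === 'onApplicationBootstrap') this.processJob();
  }

  // Stands in for a job handler that resolves and then uses the provider.
  private processJob(): void {
    const provider = this.registry.resolve();
    this.outcomes.push(provider ? `ok:${provider.getHash()}` : 'not resolved');
  }
}

function simulateStartup(startHook: StartHook): string[] {
  const registry = new ProxyProviderRegistry();
  const consumer = new JobConsumer(registry, startHook);
  // Nest runs all onModuleInit hooks before any onApplicationBootstrap hook.
  consumer.onModuleInit();
  // Assumption: the registry's bootstrap hook runs before the consumer's.
  registry.onApplicationBootstrap();
  consumer.onApplicationBootstrap();
  return consumer.outcomes;
}

console.log(simulateStartup('onModuleInit')); // [ 'not resolved' ]
console.log(simulateStartup('onApplicationBootstrap')); // [ 'ok:hash' ]
```

If the workers are created manually, BullMQ's `autorun: false` worker option together with `worker.run()` may be one way to defer consumption until the application has bootstrapped. That has not been verified against the setup in this thread.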

Papooch • Sep 24 '25 07:09

I think the issue can be closed. It would be nice, though, if the error message was different from the standard one.

Natashkinsasha • Sep 25 '25 08:09