cqrs icon indicating copy to clipboard operation
cqrs copied to clipboard

Exceptions thrown from within event handlers crashes the app

Open diego-rangel opened this issue 3 years ago • 28 comments

When starting to deal with a type of application that receives Webhooks from an external service, my application reacts by triggering events based on the type of object received. For example, an event that triggers an email to the customer. So, when dealing with some smtp plugins, I accidentally realized that if any exception is thrown from within EventHandlers, the application completely crashes.

Current behavior

Exception thrown from within EventHandlers craches the app.

Expected behavior

A way to handle and don't crash the app.

Minimal reproduction of the problem with instructions

import { Controller, Get } from '@nestjs/common';
import { EventBus } from '@nestjs/cqrs';

import { HeroKilledDragonEvent } from './events/impl/hero-killed-dragon.event';

@Controller('hero')
export class HeroesGameController {
  constructor(
    private readonly eventBus: EventBus,
  ) {}

  @Get()
  async index(): Promise<any> {
    //publishes an event which throws an exceptions and craches the app
    return this.eventBus.publish(new HeroKilledDragonEvent('123', '432'));
  }

  @Get('health')
  async health(): Promise<any> {
    return 'ok';
  }
}
export class HeroKilledDragonEvent {
  constructor(
    public readonly heroId: string,
    public readonly dragonId: string,
  ) {}
}
import { BadRequestException } from '@nestjs/common';
import { IEventHandler, EventsHandler } from '@nestjs/cqrs';

import { HeroKilledDragonEvent } from '../impl/hero-killed-dragon.event';

@EventsHandler(HeroKilledDragonEvent)
export class HeroKilledDragonHandler implements IEventHandler<HeroKilledDragonEvent> {
  handle(event: HeroKilledDragonEvent) {
    throw new BadRequestException();
  }
}

Environment


Nest version: 7.0.11
This is still an issue in the most recent Nest version
 
For Tooling issues:
- Node version: 12.13.0
- Platform:  Windows

diego-rangel avatar Aug 22 '20 23:08 diego-rangel

Would you like to create a minimal reproduction repository? This would help us target/solve this issue sooner.

kamilmysliwiec avatar Aug 23 '20 14:08 kamilmysliwiec

Hi @kamilmysliwiec , I apologize for the delay in responding. Below is the link to a reproduction repository:

https://github.com/diego-rangel/nestjs-cqrs-crash-reproducer

REPRO STEPS:

  1. HTTP GET http://localhost:3000/hero/health <--- Will respond as 'ok';
  2. HTTP GET http://localhost:3000/hero <--- It dispatches an event which throws a BadRequestException;
  3. HTTP GET http://localhost:3000/hero/health <--- Will not respond anymore;

diego-rangel avatar Aug 28 '20 21:08 diego-rangel

Any news about this issue?

diego-rangel avatar Sep 20 '20 01:09 diego-rangel

I can confirm the issue of a crashing application.

ArjenRev avatar Oct 21 '20 11:10 ArjenRev

Probably here is the problematic part https://github.com/nestjs/cqrs/blob/a75efa4cb2c715cd44fd8b82ac8052410265af90/src/event-bus.ts#L68 We don't have any check if handler.handle(event) doesn't throw any exception. This is very likely possible to crash the app. I can fix it, but @kamilmysliwiec needs to decide the proper execution context for an exception from this place.

It doesn't fit any currently supported context types since from the event bus we don't know if the event was dispatched from an HTTP request or something else.

Maybe there should be a new context type for such a case in which we give the EventHandler class as Handler and the event as Args? Or maybe the event bus should have knowledge about the request context? That can be related to this issue #60

Or maybe exceptions from the event handler should be thrown as let say EventHandlerException to make it possible to have a dedicated ExceptionFilter and make it possible to store all failed events to rerun them. I'm really into this option with a dedicated context for such a case.

Sikora00 avatar Oct 27 '20 20:10 Sikora00

Or maybe a dedicated provider like EVENT_ERROR_HANDLER which will not require a context at all?

Sikora00 avatar Mar 23 '21 09:03 Sikora00

IMHO ExepctionFilter is a too high level for it. The handler execution may occur without any request when an event is thrown by a CRON/lifecycle hook/whatever.

Another problem is we actually may don't know what the right way is to handle the error. Maybe, crashing the application is the only solution. Imagine that an error in the handler may introduce the application into an inconsistent state. The handler creator should decide what happens when the handler fails.

Dyostiq avatar Mar 23 '21 10:03 Dyostiq

Great options @Sikora00 and @Dyostiq Q

On top of the above, there could be a graceful transition to require such handler in every module (i.e. to put it in providers) to first shoot a warning that it is unhandled (like promises) and in future the application won't start if there is no handler provided (or at least warning like currently that some query/command does not have relevant handler).

Just to mention another option (not believing much in this tho but could be a nice addition to aspect-oriented programming) is dedicated decorator which could handle such.

kgajowy avatar Mar 24 '21 07:03 kgajowy

@kgajowy but actually there can be an event without handlers

Dyostiq avatar Mar 24 '21 13:03 Dyostiq

@kgajowy but actually there can be an event without handlers

@Dyostiq Sure thing, we just need to keep in mind that could be a breaking change ("silencing" some error), so the above could be still valid.

kgajowy avatar Mar 24 '21 21:03 kgajowy

Hey guys. I think I'm facing the same problem with notifications about exceptions occuring outside "request/response" scope. I mean, for instance, CRON tasks and CQRS handlers could throws exceptions which BaseExceptionFilter middleware will ignore.

Do you already have a recommended approach to make the error tracking more concise in these situations?

theo-bittencourt avatar Aug 09 '21 16:08 theo-bittencourt

Event handler exceptions causing application shutdown is an unexpected developer experience. Other comparable means of asynchronous processing in the nestjs ecosystem don't follow that behavior.

To preserve backward compatibility, it might be worth introducing a module-level configuration for error handling that allows the community to easily implement global error handling logic or config to turn off the default crash behavior altogether.

collinc777 avatar Sep 10 '21 19:09 collinc777

To preserve backward compatibility, we can also introduce a new optional method onError like in nrwl/angular for effects https://nx.dev/latest/angular/guides/misc-data-persistence#pessimistic-updates

Sikora00 avatar Sep 13 '21 16:09 Sikora00

That sounds like a good approach to me.

We could utilize something like CqrsModule.forFeature({onError: (error) => {...}}) to pass in the config.

collinc777 avatar Sep 14 '21 00:09 collinc777

Actually, I meant a method at the level of each event handler. As often we have to react to an event failure, for example, start a revert mechanism.

Sikora00 avatar Sep 14 '21 09:09 Sikora00

I interpret the problem here as being: Errors within the event handlers cause the nestjs application to shut down

The best way to solve my understanding of the problem and keep the module backward compatible would be to allow for a module-level optional error handler to be passed to the CqrsModule.

I don't see how adding an optional method to individual event handlers allows the developer to solve my understanding of the problem easily. Isn't that virtually the same amount of work as them surrounding the innards of their handler method with a try-catch?

Is my understanding of the problem correct?

collinc777 avatar Sep 14 '21 17:09 collinc777

@collinc777 https://github.com/nestjs/cqrs/issues/409#issuecomment-804792586

Sikora00 avatar Sep 15 '21 12:09 Sikora00

I don't think we should implement an Exception Filter. I don't think exception filters will play nicely with handling errors that occur in the observable subject. I'm suggesting that we provide a config for an optional onEventHandlerError. It will allow the developer to address the crashing issue on a global scale if they choose to do so. It'd be backward compatible, optional, and address the issue at hand.

If the developer wants to handler an exception on a handler level, can't they just implement a try-catch block?

I'll try to get a code example up later this week. Sorry for not providing a pr up to this point!

collinc777 avatar Sep 15 '21 15:09 collinc777

@collinc777 I meant the second part. That we shouldn't handle all errors in a single place because only the event producer knows how to react to such a situation

Sikora00 avatar Sep 15 '21 15:09 Sikora00

I'm not suggesting that we force a global error handler. I suggest that we allow the developer to add one if they choose to via an optional config.

The issue of "exceptions thrown in the handler crash the application" is global by nature. It should have a global solution.

collinc777 avatar Sep 15 '21 16:09 collinc777

@collinc777 I agree with you on this.

There seems to be a disconnect in this thread with some who:

  1. need identify and deal with unexpected errors and not have the application process shutdown
  2. want to come up with an elegant way to deal with expected errors related to a use case

This issue is about (1). The app shouldn't just kill the process unexpectedly. This does not happen for anything else in the app, and we can't use Exception Filters to deal with it, because it's outside of the request pipeline.

xdave avatar Sep 16 '21 04:09 xdave

@Sikora00 any further updates on this? Maybe we can all start using a fork until nestjs supports this officially ?

kodeine avatar Mar 12 '22 01:03 kodeine

I would love to @kodeine. There is a fork under @nestjs-architects but just tell me what is the desired approach xD

Sikora00 avatar Mar 12 '22 02:03 Sikora00

@Sikora00 ill start using cqrs from nestjs-architects, thank you! Also, a quick question, how do you go about rollback of db transactions in events? i'd love to have a chat with you!

kodeine avatar Mar 12 '22 02:03 kodeine

@kodeine you can find my on NestJS's Discord. Especially at DDD or architecture channel.

Sikora00 avatar Mar 12 '22 02:03 Sikora00

Another option https://github.com/Sikora00/nestjs-cqrs-events-error-handling

Sikora00 avatar Mar 20 '22 11:03 Sikora00

I confirm the issue. I thought it's related to node version and to unhandled promise rejections: https://developer.ibm.com/blogs/nodejs-15-release-blog/

So if you do this, at least the application won't crash.

process.on('unhandledRejection', (reason, promise) => { // do something, but don't throw the error });

mciureanu avatar Jun 02 '22 14:06 mciureanu

@mciureanu I'm not sure this is a best practice as described in this article. https://blog.heroku.com/best-practices-nodejs-errors

This could be a workaround for now, but we definitely need a way to handle this kind of error at the module, even handler or sagas level.

Exilz avatar Sep 15 '22 09:09 Exilz

This behavior was introduced in @nestjs/cqrs version 8.0.4. It works as expected on version 8.0.3

iagoleao avatar Nov 29 '22 19:11 iagoleao

So what SHOULD I do if I encounter this issue? I don't want my server freeze every BRE.

dudematthew avatar Dec 12 '22 22:12 dudematthew