aws-sdk-ruby

Putting an event on a nonexistent EventBridge event bus returns a success

Open kylejw2 opened this issue 2 years ago • 10 comments


Describe the bug I'm using the SDK to put events on an event bus that I created. When I instantiate the EventBridge client, I can list all available event buses in my account. I noticed that EventBridge doesn't report errors I would expect it to report. For example, I changed the event bus name to one that didn't appear in the list of event buses I pulled. When I called put_events for that nonexistent event bus, I received a success response and no error. I looked at the source code for put_events and couldn't find any issues with it, so I think this is probably an error in the AWS API. Receiving a success response when I know my event fell off the radar seems like buggy behavior.

Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version aws-sdk-eventbridge

Version of Ruby, OS environment -paste the output of ruby -v ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-darwin19]

To Reproduce (observed behavior)

require 'json' # for detail.to_json below
require 'aws-sdk-eventbridge'

region_name = 'us-east-1'

client = Aws::EventBridge::Client.new(
  region: region_name
)

detail = {
  id: 'test',
  name: 'project.name',
  organization: {
    id: 'project.organization.public_id'
  },
  customer: {
    id: 'customer.public_id'
  },
  projectManager: {
    email: 'projectManager.email'
  }
}

entry = {
  time: Time.now,
  source: 'api.core',
  resources: [],
  detail_type: 'project.created',
  detail: detail.to_json,
  event_bus_name: 'an-event-bus-that-does-not-exist',
  trace_header: nil
}

puts detail.to_json

result = client.put_events({entries: [entry]})

puts result

Expected behavior I expect to see an error when EventBridge doesn't put the event on an event bus.


kylejw2 avatar Feb 11 '22 23:02 kylejw2

Thanks for opening an issue. I think I would agree that this is a service issue. Let me put this in front of the EventBridge team.

mullermp avatar Feb 14 '22 17:02 mullermp

Do you have any request IDs that would help the service team investigate? They should be on the response context. You can also see them in the wire log when setting the client's http_wire_trace: true.

mullermp avatar Feb 14 '22 17:02 mullermp
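
For readers following along, here is a minimal sketch of the request ID lookup suggested above. It assumes the response exposes the raw HTTP headers through `context.http_response.headers` and that EventBridge returns the ID in the `x-amzn-requestid` header. `request_id_from` is a hypothetical helper name, not an SDK method.

```ruby
# Hypothetical helper: pull the request ID out of an SDK response object.
# Assumption: the service returns the ID in the x-amzn-requestid header,
# as AWS JSON-protocol services typically do. Verify against a real response.
def request_id_from(response, header: 'x-amzn-requestid')
  response.context.http_response.headers[header]
end

# Usage against the real service (needs the aws-sdk-eventbridge gem and
# credentials), left as comments so this block runs standalone:
#
#   require 'aws-sdk-eventbridge'
#   client = Aws::EventBridge::Client.new(
#     region: 'us-east-1',
#     http_wire_trace: true # also logs the raw request and response
#   )
#   result = client.put_events(entries: [entry])
#   puts request_id_from(result)
```

With `http_wire_trace: true` the same ID should also be visible in the logged response headers, which may be easier than adding code when reproducing a one-off problem.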

The service team was able to reproduce using the CLI. I will update this thread when a fix has been planned.

mullermp avatar Feb 14 '22 21:02 mullermp

Thank you.

kylejw2 avatar Feb 15 '22 16:02 kylejw2

The service team has identified 3 possible avenues. A fix requires a behavior change. For transparency, the options are:

  1. Do nothing and keep the current behavior. Safe option, but frustrating for customers who expect to be notified when the event bus doesn't exist.
  2. Send a success response but increment FailedEntryCount. Passive solution, but it doesn't indicate why the entry failed.
  3. Fail the entire operation with an exception like EventBusDoesNotExistException. More ideal, but a breaking change for anyone relying on the existing behavior. PutEvents supports more than 1 event, so it would be confusing if 1 of many events fails.

The service team wants to collect some metrics of how many customers have this issue before they make a decision.

Are you currently blocked by this or anything? Or was it just a bug that you encountered and can be avoided?

mullermp avatar Feb 17 '22 18:02 mullermp
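
Until the service behavior changes, one client-side workaround is to confirm the bus exists before publishing. The sketch below is an illustration, not something proposed in this thread: `event_bus_exists?` and `put_events_checked` are hypothetical helpers, and it assumes `event_bus_name` holds a bus name (not an ARN) and that the caller is allowed to call ListEventBuses.

```ruby
# Returns true if a bus with exactly this name is visible to the client.
# Uses ListEventBuses with name_prefix and follows next_token manually.
def event_bus_exists?(client, bus_name)
  next_token = nil
  loop do
    params = { name_prefix: bus_name }
    params[:next_token] = next_token if next_token
    page = client.list_event_buses(params)
    return true if page.event_buses.any? { |bus| bus.name == bus_name }

    next_token = page.next_token
    return false if next_token.nil? || next_token.empty?
  end
end

# Refuses to call PutEvents when any entry names a bus that cannot be found.
# Entries without event_bus_name go to the account's default bus.
def put_events_checked(client, entries)
  bus_names = entries.map { |entry| entry[:event_bus_name] || 'default' }.uniq
  missing = bus_names.reject { |name| event_bus_exists?(client, name) }
  unless missing.empty?
    raise ArgumentError, "event bus not found: #{missing.join(', ')}"
  end

  client.put_events(entries: entries)
end

# Usage with the real client, left as comments so this block runs standalone:
#
#   require 'aws-sdk-eventbridge'
#   client = Aws::EventBridge::Client.new(region: 'us-east-1')
#   put_events_checked(client, [entry])
```

This costs one extra API call per distinct bus and is not atomic (a bus could be deleted between the check and the put), so caching the result or checking once at startup may be a better fit for high-volume publishers.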

I'm not currently blocked by this. As I was exploring the SDK, I noticed this behavior. Is it possible to return a failed entry count when an event is placed on an event bus, but doesn't match any of the rules?

kylejw2 avatar Feb 23 '22 16:02 kylejw2

I also just came across this behavior, though in the .NET SDK. I'm not too opinionated on how it gets resolved on the service side, but the current behavior is a serious violation of the "principle of least surprise", to say the least.

Any information on if/when this behavior change is going to get worked into the service and what the change will be?

jamesbascle avatar Apr 27 '22 13:04 jamesbascle

The service team has not prioritized a change. I can revive that internal ticket citing another customer ran into the issue and perhaps they will further prioritize it.

mullermp avatar Apr 27 '22 18:04 mullermp

@mullermp I'm using the Go v2 SDK and have the same issue. If I don't get a failure when the bus I'm publishing to does not exist, how can I guarantee that any of my events are getting published? I feel like I can't trust EventBridge as a service.

This is a pretty serious issue.

jasongerard avatar Jun 17 '22 15:06 jasongerard

My vote would be to raise something like EventBusDoesNotExistException

metaskills avatar Aug 10 '22 18:08 metaskills

@mullermp We are encountering this issue on a large implementation. We have concerns that our API client code is returning false-positive success and there's no way to detect failure and retry/alarm.

Can you ask the service team to consider the following approach? I believe this would put the EventBridge API closer in line with the SNS API:

  • Your option 2 above, putEvents API sends a success response but increments FailedEntryCount. Similar to SNS:PublishBatch API in that it responds HTTP 200 but the response payload indicates what failed. I think you'd probably need an array of Failed events in the response so that clients can manage those.
  • Introduce a new putEvent API that accepts a single event and responds HTTP 5xx on failure. Similar to the SNS:Publish API

This ensures support for existing client batch behavior while adding support for client-side failure handling.

jwicks avatar Aug 25 '22 13:08 jwicks

I've forwarded that feedback to the service team. I'm sorry that I cannot do much else, as I am not an owner of the EventBridge service.

Please also add your frustrations here: https://github.com/aws/aws-sdk/issues/186

I'm also going to close this ticket in favor of the one in aws-sdk since this issue applies generally and not to the Ruby SDK - I do want to make sure all feedback lands there instead of spread across multiple repos.

mullermp avatar Aug 25 '22 13:08 mullermp
