onload icon indicating copy to clipboard operation
onload copied to clipboard

onload AF_XDP bricks aws ec2

Open lparkersc opened this issue 1 year ago • 2 comments

Environment

  • OS: RHEL 9
  • Kernel: 5.14.0-70
  • Instance Type: m5zn.metal
  • Network Driver: ENA
  • Commit: f8458aa
  • Branch: master (as of this writing)

Issue Description

I am attempting to install OpenOnload configured for XDP on the ENA driver. The setup process runs smoothly until I test the application, at which point my SSH connection is disrupted, and the instance becomes unresponsive.

Steps to Reproduce

  1. Install all required packages.
  2. Execute make.
  3. Run ./onload_install --build-profile cloud as per the documentation.
  4. Successfully load sfc modules with sudo onload_tool reload.
  5. Attach a secondary ENA interface (name: eth1, driver: ena) to my EC2 instance.
  6. Register the new ENA interface with sfc_resource: echo eth1 > /sys/module/sfc_resource/afxdp/register.
  7. Disable SELinux and run my test application: onload python3 client.py [server_ip] [port].
  8. Post-step 7, my SSH connection is disrupted, and I cannot reconnect to the instance.

Additional Information

I cannot post the system logs as the instance becomes unresponsive. I referenced this previous discussion where experiments with OpenOnload on ENA via XDP showed mixed results. However this post is two years out of date which is why I'm oppening a new issue.

I am relatively new to OpenOnload, so any advice on potential missteps or additional configurations would be greatly appreciated.

Thank you for any assistance you can provide.

lparkersc avatar Jun 25 '24 19:06 lparkersc

You ever find a resolution to this @lparkersc ? I just hit the same issue ☠️

maxholloway avatar Jan 12 '25 22:01 maxholloway

I hit the same issue.

  • kernel version : 6.8.0-1024-aws

beanqi avatar Mar 11 '25 06:03 beanqi