exanic-software

Exasock: TX buffer allocation fails beyond 6 TCP connections with custom ExaNIC firmware on ExaNIC X25

Open vbhosale-algo opened this issue 2 months ago • 2 comments

Hello,

I would like to request guidance on an issue I am facing with custom ExaNIC firmware.

When using the stock ExaNIC firmware, 10 concurrent TCP connections initialize successfully. When using a custom firmware where only port 1 has user logic and port 0 is configured as a standard Ethernet interface, the following error appears once the 7th TCP connection is created on port 0:

exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0

The corresponding dmesg error is:

[  250.690291] exanic 0000:01:00.0: exanic0: Failed to allocate TX region of size: 0x01000.

Expected behavior: All TCP connections should initialize successfully as with the stock firmware.

Observed behavior: Connection initialization fails beyond 6 concurrent connections.

Environment Details:
Firmware: Custom firmware (logic only on port 1, port 0 as standard interface)
Hardware: ExaNIC X25
Software: exanic-software 2.7.5
OS: Gentoo

Steps to reproduce:
1. Load custom firmware (logic only on port 1).
2. Establish 10 TCP connections on port 0.
3. Observe failure from the 7th connection onward.

Additional context / Analysis: The error originates in exanic_alloc_tx_region(), invoked via the EXANICCTL_TX_BUFFER_ALLOC ioctl. This ioctl is triggered by exanic_alloc_tx_buffer(), which is called by exanic_acquire_tx_buffer(). All exanic_acquire_tx_buffer() calls show page size = 0, implying the default page size is used.

The failure likely occurs inside exanic_alloc_tx_region() during buffer allocation, possibly due to resource exhaustion or incorrect handling of port mapping in the custom firmware. Kernel debug prints show the difference between stock and custom firmware.

Stock firmware:

[    3.888349] exanic: exanic_alloc_tx_region: port_num=0, size=0x1000, PAGE_SHIFT=12, usable_start_page=0x0, usable_end_page=0x20, num_pages=0x1
[    4.000339] exanic: exanic_alloc_tx_region: port_num=1, size=0x1000, PAGE_SHIFT=12, usable_start_page=0x20, usable_end_page=0x40, num_pages=0x1

Custom firmware:

[    3.944690] exanic: exanic_alloc_tx_region: port_num=0, size=0x1000, PAGE_SHIFT=12, usable_start_page=0x0, usable_end_page=0x8, num_pages=0x1
[    4.056680] exanic: exanic_alloc_tx_region: port_num=1, size=0x1000, PAGE_SHIFT=12, usable_start_page=0x8, usable_end_page=0x10, num_pages=0x1

The discrepancy points to port->tx_region_usable_size inside exanic_alloc_tx_region(). The calculation

    size_t usable_end_page = usable_start_page + (port->tx_region_usable_size >> PAGE_SHIFT);

yields a smaller usable range for each port with the custom firmware: 0x8 pages (32 KiB at PAGE_SHIFT=12) instead of the 0x20 pages (128 KiB) seen with the stock firmware. This reduced tx_region_usable_size limits the number of concurrent TX buffers, causing allocation failures beyond 6 TCP connections.
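To make the comparison concrete, here is a minimal Python sketch that extracts the usable page range from debug lines in the format shown above and converts it to KiB. The line format comes from the debug prints added for this report, not from the stock driver, and the helper name usable_region is hypothetical.

```python
import re

# Matches the ad-hoc debug line format shown in this report (an assumption:
# the stock driver does not necessarily print these fields).
PAGE_RE = re.compile(
    r"port_num=(?P<port>\d+), size=0x[0-9a-fA-F]+, "
    r"PAGE_SHIFT=(?P<shift>\d+), "
    r"usable_start_page=(?P<start>0x[0-9a-fA-F]+), "
    r"usable_end_page=(?P<end>0x[0-9a-fA-F]+)"
)


def usable_region(line):
    """Return (port, usable_pages, usable_kib) for one debug line, or None."""
    m = PAGE_RE.search(line)
    if m is None:
        return None
    shift = int(m.group("shift"))
    # Pages available to this port: end page minus start page.
    pages = int(m.group("end"), 16) - int(m.group("start"), 16)
    # Convert pages to bytes with the page shift, then to KiB.
    return int(m.group("port")), pages, (pages << shift) // 1024


stock = ("[    3.888349] exanic: exanic_alloc_tx_region: port_num=0, size=0x1000, "
         "PAGE_SHIFT=12, usable_start_page=0x0, usable_end_page=0x20, num_pages=0x1")
custom = ("[    3.944690] exanic: exanic_alloc_tx_region: port_num=0, size=0x1000, "
          "PAGE_SHIFT=12, usable_start_page=0x0, usable_end_page=0x8, num_pages=0x1")

print(usable_region(stock))   # (0, 32, 128)
print(usable_region(custom))  # (0, 8, 32)
```

On these two lines the sketch reports 32 pages (128 KiB) for port 0 with the stock firmware and 8 pages (32 KiB) with the custom firmware, a four-fold reduction in the region available for 4 KiB TX buffers.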

Thank you in advance for any help or suggestions.

vbhosale-algo, Oct 23 '25 06:10

Run exanic-config <device> -v to show the TX buffer size.

According to ExaNIC Internals, applications can request parts of the transmit buffer at a minimum granularity of 4 KiB, so the maximum number of TCP connections will be buffer_size / 4 - 1 (with buffer_size in KiB).

Custom firmware uses more on-chip memory, so its TX buffer size will be lower than with the "Function: network interface" firmware.
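As a rough illustration of that rule of thumb, here is a small Python sketch. The function name max_connections_estimate is hypothetical, and the formula is only the estimate given in this comment, not something taken from the driver source.

```python
GRANULE_KIB = 4  # minimum TX buffer allocation granularity quoted above


def max_connections_estimate(tx_buffer_kib, granule_kib=GRANULE_KIB):
    """Estimate max TCP connections as buffer_size / granule - 1.

    This mirrors the rule of thumb in this comment; the "- 1" is taken
    as given there rather than derived from the driver.
    """
    if tx_buffer_kib < granule_kib:
        return 0
    return tx_buffer_kib // granule_kib - 1


print(max_connections_estimate(32))   # 7
print(max_connections_estimate(128))  # 31
```

For a 32 KiB region (8 pages of 4 KiB, as in the custom-firmware debug output above) the estimate is 7, which is close to but not identical with the 6 successful connections reported, so treat it as an approximate upper bound rather than an exact limit.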

comphilip, Oct 23 '25 13:10

If you are paying for the devkit, you can specify a variant that allows for 64k or 128k of TX buffers. Here is the line for the X25 card:

Available FDK variants for platform x25: full full_fastmac full_fastmac_hwtime64 full_fastmac_txdeskew full_hwtime64 full_iprules64 full_macrules64 full_multirate full_multirate_extrarxreg full_txbuf128 full_txbuf16 full_txbuf64 full_txdeskew

sirgajelot, Oct 23 '25 13:10