netxduo
High-speed transfer hang on mutex
In an embedded system sending binary data at about 7 Mbit/s over a TCP socket on a 100BASE-T interface, NetX Duo hangs after a few seconds, waiting on the mutex nx_ip_protection. Added test code showed that nx_ip_protection was acquired in nx_tcp_socket_send_internal(), line 803, during a call to nx_tcp_socket_send() from the user application:
/* Place protection while we check the sequence number for the new TCP packet. */
tx_mutex_get(&(ip_ptr -> nx_ip_protection), TX_WAIT_FOREVER);
/* Determine if the sequence number is the same. */
if (sequence_number != socket_ptr -> nx_tcp_socket_tx_sequence)
{
After line 912, the code calls either _nx_ip_packet_send() or _nx_ipv6_packet_send(), which can apparently re-request ownership of nx_ip_protection. The lines marked >>> below were added to release nx_ip_protection while these routines execute and to reacquire it afterward, so that the code which follows still sees the mutex state it expects:
/* Send the TCP packet to the IP component. */
#ifndef NX_DISABLE_IPV4
if (socket_ptr -> nx_tcp_socket_connect_ip.nxd_ip_version == NX_IP_VERSION_V4)
{
>>> /* Release the protection. */
>>> tx_mutex_put(&(ip_ptr -> nx_ip_protection));
    _nx_ip_packet_send(ip_ptr, send_packet,
                       socket_ptr -> nx_tcp_socket_connect_ip.nxd_ip_address.v4,
                       socket_ptr -> nx_tcp_socket_type_of_service,
                       socket_ptr -> nx_tcp_socket_time_to_live,
                       NX_IP_TCP,
                       socket_ptr -> nx_tcp_socket_fragment_enable,
                       socket_ptr -> nx_tcp_socket_next_hop_address);
>>> /* Reacquire IP structure. */
>>> tx_mutex_get(&(ip_ptr -> nx_ip_protection), TX_WAIT_FOREVER);
}
#endif /* !NX_DISABLE_IPV4 */
#ifdef FEATURE_NX_IPV6
if (socket_ptr -> nx_tcp_socket_connect_ip.nxd_ip_version == NX_IP_VERSION_V6)
{
>>> /* Release the protection. */
>>> tx_mutex_put(&(ip_ptr -> nx_ip_protection));
    /* Ready to send the packet! */
    _nx_ipv6_packet_send(ip_ptr,
                         send_packet,
                         NX_PROTOCOL_TCP,
                         send_packet -> nx_packet_length,
                         ip_ptr -> nx_ipv6_hop_limit,
                         socket_ptr -> nx_tcp_socket_ipv6_addr -> nxd_ipv6_address,
                         socket_ptr -> nx_tcp_socket_connect_ip.nxd_ip_address.v6);
>>> /* Reacquire IP structure. */
>>> tx_mutex_get(&(ip_ptr -> nx_ip_protection), TX_WAIT_FOREVER);
}
#endif /* FEATURE_NX_IPV6 */
These changes do not prevent the hang, however. If the two added tx_mutex_get() calls are removed, the hang no longer occurs.
The changes are patterned after a similar tx_mutex_put()/tx_mutex_get() sequence already present in the code at line 566:
/* Release the protection. */
tx_mutex_put(&(ip_ptr -> nx_ip_protection));
/* Obtain a new segmentation. */
ret = _nx_packet_allocate(pool_ptr, &send_packet,
                          data_offset, wait_option);
if (ret != NX_SUCCESS)
{
    /* Restore preemption? */
    if (preempted == NX_TRUE)
    {
        /*lint -e{644} -e{530} suppress variable might not be initialized, since "old_threshold" was initialized when preempted was set to NX_TRUE. */
        tx_thread_preemption_change(_tx_thread_current_ptr, old_threshold, &old_threshold);
    }
    /* Packet allocate failure. Return.*/
    return(ret);
}
/* Regain exclusive access to IP instance. */
tx_mutex_get(&(ip_ptr -> nx_ip_protection), TX_WAIT_FOREVER);
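One point that may matter when reasoning about the put/get pairs above: a ThreadX mutex can be acquired more than once by the thread that already owns it, and it only becomes available to other threads once it has been released the same number of times. The sketch below is a small stand-alone model of that counting behavior, not ThreadX or NetX Duo code; the names model_mutex, model_get, model_put and model_available_to are hypothetical and exist only for illustration. It shows that a single put inside a nested acquisition does not make the mutex available to another thread. Whether nx_ip_protection is actually held more than once at line 912 has not been verified here.

```c
#include <stdbool.h>

/* Hypothetical model of a mutex with nested ownership (illustration only,
   not the ThreadX implementation). */
typedef struct {
    int      owner;   /* id of the owning thread, or -1 when the mutex is free */
    unsigned count;   /* number of times the owner has acquired the mutex */
} model_mutex;

static void model_init(model_mutex *m)
{
    m->owner = -1;
    m->count = 0;
}

/* Returns true if thread `tid` holds the mutex after the call,
   false if a real thread would have to suspend and wait. */
static bool model_get(model_mutex *m, int tid)
{
    if (m->count == 0) {
        m->owner = tid;
        m->count = 1;
        return true;
    }
    if (m->owner == tid) {
        m->count++;          /* nested acquisition by the current owner */
        return true;
    }
    return false;            /* owned by a different thread */
}

/* Returns true if the put was valid. The mutex becomes free only when
   the ownership count drops to zero. */
static bool model_put(model_mutex *m, int tid)
{
    if (m->count == 0 || m->owner != tid) {
        return false;        /* caller does not own the mutex */
    }
    m->count--;
    if (m->count == 0) {
        m->owner = -1;
    }
    return true;
}

/* True if thread `tid` could acquire the mutex right now without waiting. */
static bool model_available_to(const model_mutex *m, int tid)
{
    return m->count == 0 || m->owner == tid;
}
```

In this model, if thread 1 calls model_get() twice and model_put() once, model_available_to(&m, 2) is still false; it becomes true only after the second model_put(). If the same situation applied to nx_ip_protection, the added tx_mutex_put() would lower the count without letting another thread in, which could be checked on the target by inspecting the mutex's owner and ownership count at the point of the hang.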