
Control SD sync/async behaviour with env var on QNX

Open kheaactua opened this issue 1 year ago • 0 comments

Description

This change is intended for systems like QNX, which generally do not have a netlink-like service monitoring network status that can easily be tied into. It is not set up to be used or enabled on Linux/Android.

This allows us to launch and use a routing manager with local UDS communication before remote networking is available, which simplifies our startup graph and saves time during startup. In contrast, in the current version SD fails to load if networking is not available at process start, leaving the routing manager in an error state.

I have been using this change for 1-2 years now and it has been very beneficial.

Usage

This behaviour can be enabled by exporting the following environment variables:

# Env var that, if it exists, causes SD setup to be performed asynchronously
# and to wait on network availability.  This wait also makes the routing_manager
# wait until a network interface is available before issuing OFFERs
export VSOMEIP_USE_ASYNCHRONOUS_SD=1

# Iff SD is running synchronously, the existence of this env var will cause the
# SD setup to still block on network availability. (mostly a testing scenario)
export VSOMEIP_WAIT_FOR_INTERFACE=1

# The current waiting mechanism is to block until a file (specified by this
# env var) exists
export VSOMEIP_NETWORK_INT_READY_FILE=<file path of file created when network available>

This is implemented by modifying the signature of service_discovery::start() so that it accepts a callback sent from routing_manager_impl.
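
The shape of such a callback-accepting start() might look like the sketch below. It is an assumption for illustration: `sd_sketch`, `ready_callback` and `wait_function` are hypothetical, and the real service_discovery::start() signature in the change may differ.

```cpp
#include <functional>
#include <thread>
#include <utility>

// Hypothetical stand-in for the service discovery object; not vsomeip's class.
class sd_sketch {
public:
    using ready_callback = std::function<void()>;  // e.g. "send pending OFFERs"
    using wait_function = std::function<void()>;   // blocks until network is up

    explicit sd_sketch(wait_function wait_for_network)
        : wait_for_network_(std::move(wait_for_network)) {}

    ~sd_sketch() { join(); }

    // Synchronous: run the callback inline, as if SD setup had just finished.
    // Asynchronous: return immediately; a worker thread waits for the network
    // and then invokes the callback supplied by the routing manager.
    void start(bool asynchronous, ready_callback on_ready) {
        if (!asynchronous) {
            on_ready();
            return;
        }
        join();  // make sure a previous worker has finished before reusing it
        worker_ = std::thread([this, cb = std::move(on_ready)] {
            wait_for_network_();
            cb();
        });
    }

    void join() {
        if (worker_.joinable()) {
            worker_.join();
        }
    }

private:
    wait_function wait_for_network_;
    std::thread worker_;
};
```

Because the callback runs on the worker thread in the asynchronous case, any state it touches in the routing manager needs to be protected against concurrent access.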

Notes:

  • mutexes
    • sd_impl::endpoint_ is now mutex protected
    • rm_impl::pending_sd_offers_mutex_ is now a recursive mutex, because it can now be locked both from its own thread and from the new thread in SD
  • There is no timeout on the waitfor. The original implementation had a configurable timeout, but because timing out left us in an error state anyway, the timeout was removed (raised to numeric_limits<int>::max() = ~45 days, give or take).
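
To show what a recursive mutex buys in the notes above, here is a minimal sketch in which a function that already holds the lock calls another function that locks it again on the same thread. `rm_sketch` and its member functions are hypothetical; only the member name `pending_sd_offers_mutex_` is taken from the notes.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the routing manager; not vsomeip's rm_impl.
class rm_sketch {
public:
    void queue_offer(std::string service) {
        std::lock_guard<std::recursive_mutex> lock(pending_sd_offers_mutex_);
        pending_sd_offers_.push_back(std::move(service));
    }

    // Imagined entry point for the SD-ready callback. It holds the lock and
    // then calls take_pending_offers(), which locks again on the same thread.
    // With a plain std::mutex that second lock would be undefined behaviour.
    std::vector<std::string> on_sd_ready() {
        std::lock_guard<std::recursive_mutex> lock(pending_sd_offers_mutex_);
        return take_pending_offers();
    }

    // Removes and returns every queued offer.
    std::vector<std::string> take_pending_offers() {
        std::lock_guard<std::recursive_mutex> lock(pending_sd_offers_mutex_);
        std::vector<std::string> out;
        out.swap(pending_sd_offers_);
        return out;
    }

private:
    std::recursive_mutex pending_sd_offers_mutex_;
    std::vector<std::string> pending_sd_offers_;
};
```

A recursive mutex still excludes other threads as usual; it only permits re-entry by the thread that already owns it.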

kheaactua May 27 '24 19:05