erpc
erpc copied to clipboard
Arbitrated client stucks -> dead lock
Condition: This bug occurres if a message will be lost or the server is not yet available.
Issue: If the client invokes a procedure, erpc sends the invoke out no matter if there is someone, Right depends on the driver. After that, the arbitrated transport waits forever for a reply but this reply will never received and the process will stuck -> dead lock. As well when the server is coming up.
Solution: Replace the following timeout kWaitForever by a member variable mu32_ReceiveTimeout, which is going to be initialized in the constructor and can be set by a setter method. info->m_sem.get(Semaphore::kWaitForever);
BTW: When are you going to release a new version?
Hello @willmann9527 , thank you for reporting this issue. Agree, this needs to be solved. BTW, what transport layer are you using, is it MU_transport? As for the next planned release it should be available in June.
Hello @MichalPrincNXP, thank you for the rapid reply. I use my own transport layer due to the fact that a similar to IP layer is between eRPC and the physical layer. However, how do you handle a communication from one central point to two communication partners? The thing is, p_client is generated by the erpc generator and must be reassigned before we call an erpc procedure, which results in we have to write wrappers for the generated procedures to set this pointer p_client. I guess you have also a use case where you have two clients in one application and hopefully a smarter solution.
Hello @willmann9527 , if I understand your case correctly, you would like to have one eRPC client to communicate with two independent eRPC servers, correct? Then you have two transports, each for one client-server communication, and there is not necessary to have the transport arbitrator. In fact there are two independent eRPC client instances running on the client side then.
Hello @MichalPrincNXP, Yes, right. How do you implement these two independent eRPC client instances running on the client side as the generated code uses just one client pointer p_client?
Yes, you are right, this is the corner case that is not handled by the generator. Unfortunately, it is up to the user to solve now (by a script or a wrapper like you did it).
Our main use-case requires multiple client-server instances. To achieve this, I added instance versions of all setup functions; they don't use the static global objects. I changed the generator to create a client specific name for p_client. In the setup code, I bind the client to the client manager instance. I'm curious, if you share the same API with two instances, how do you know which instance to bind p_client with? Is there some "higher entity" which knows this? Or do you have two different components, which each happen to use the same API but on different remote connections?
Hi @RamseyTI, we are planning to call a procedure in front of each erpc call, which sets the global client p_client to the proper remote connection. We also considered your solution but we wanted to stay as keep as possible at the erpc repository to facilitate updates from it. Frankly speaking, if the generator is capable of taking a client parameter would be the best solution for us. Then, the setup function uses a list of clients to initialize these clients.
@willmann9527, understood. Thanks for the reply.
Hi @RamseyTI, may I have your tailored generator (solution and/or exe) and the setup files to check if it would be work for me? That would be awesome. Thank you in advance.
Hello @willmann9527, There are three elements to our object binding solution:
- Modify client source template to replace global symbol with module specific name
- Add setup functions which take object references
- Setup code
I modified the following template to replace the global g_Client pointer with a module specific name. I used {$commonHeaderName}_cm as the module specific name.
erpcgen/src/templates/c_client_source.template
- extern ClientManager *g_Client;
+ ClientManager *{$commonHeaderName}_cm;
- RequestContext request = g_client->createRequest(true);
+ RequestContext request = {$commonHeaderName}_cm->createRequest(true);
There are several more places in the template which need the same replacement.
Modify the client setup code to add object based functions which don't reference the internal global objects.
erpc_c/setup/erpc_client_setup.h
void erpc_client_init(erpc_transport_t transport, erpc_mbf_t message_buffer_factory);
+ void erpc_client_init(erpc_transport_t transport, erpc_mbf_t message_buffer_factory,
+ void *cm_obj, void *cf_obj, void *crc_obj);
Make a similar change to all the functions in this header file. I've attached this file for reference, but I also made similar changes to the following files:
erpc_mbf_setup.h erpc_server_setup.cpp erpc_server_setup.h erpc_transport_setup.h
And the custom transport and message buffer factories we wrote.
The client setup code will depend on callback support and transport arbitration. The example code below does both:
static ManuallyConstructed<BasicCodecFactory> lprf_client_api_bcfobj;
static ManuallyConstructed<BasicCodec> lprf_client_api_bcobj;
static ManuallyConstructed<Crc16> lprf_client_api_crcobj;
static ManuallyConstructed<TransportArbitrator> lprf_client_taobj;
static ManuallyConstructed<ArbitratedClientManager> lprf_client_api_acmobj;
static ManuallyConstructed<SimpleServer> lprf_client_cbk_ssobj;
static ManuallyConstructed<BasicCodecFactory> lprf_client_cbk_bcfobj;
static ManuallyConstructed<Crc16> lprf_client_cbk_crcobj;
extern ClientManager *lprf_api_cm;
static void lprf_client_setup(void)
{
erpc_transport_t transport;
erpc_mbf_t msg_buf_fact;
erpc_transport_t arbitrator;
/* create shared transport, arbitrator, and server objects */
transport = erpc_transport_uart_simplelink_init(CONFIG_UART_LPRF, 115200);
msg_buf_fact = erpc_mbf_static_init();
arbitrator = erpc_arbitrated_client_init_ref(transport, msg_buf_fact,
&lprf_client_api_acmobj, &lprf_client_taobj,
&lprf_client_api_bcfobj, &lprf_client_api_bcobj, &lprf_client_api_crcobj);
erpc_server_init_ref(arbitrator, msg_buf_fact,
&lprf_client_cbk_ssobj, &lprf_client_cbk_bcfobj, &lprf_client_cbk_crcobj);
/* bind interfaces to client manager */
lprf_api_cm = lprf_client_api_acmobj;
/* add services to rpc server */
erpc_add_service_to_server_ref(create_BLEmesh_cbk_service(), &lprf_client_cbk_ssobj);
erpc_add_service_to_server_ref(create_BLEstk_cbk_service(), &lprf_client_cbk_ssobj);
}
Here is the complimentary code for the server side:
static ManuallyConstructed<SimpleServer> lprf_server_api_ssobj;
static ManuallyConstructed<BasicCodecFactory> lprf_server_api_bcfobj;
static ManuallyConstructed<Crc16> lprf_server_api_crcobj;
static ManuallyConstructed<BasicCodecFactory> lprf_server_cbk_bcfobj;
static ManuallyConstructed<BasicCodec> lprf_server_cbk_bcobj;
static ManuallyConstructed<Crc16> lprf_server_cbk_crcobj;
static ManuallyConstructed<TransportArbitrator> lprf_server_taobj;
static ManuallyConstructed<ArbitratedClientManager> lprf_server_cbk_acmobj;
extern ClientManager *lprf_cbk_cm;
static void lprf_server_setup(void)
{
erpc_transport_t transport;
erpc_mbf_t msg_buf_fact;
erpc_transport_t arbitrator;
/* create shared transport, arbitrator, and server objects */
transport = erpc_transport_uart_simplelink_init(CONFIG_UART_LPRF, 115200);
msg_buf_fact = erpc_mbf_static_init();
arbitrator = erpc_arbitrated_client_init_ref(transport, msg_buf_fact,
&lprf_server_cbk_acmobj, &lprf_server_taobj,
&lprf_server_cbk_bcfobj, &lprf_server_cbk_bcobj, &lprf_server_cbk_crcobj);
erpc_server_init_ref(arbitrator, msg_buf_fact,
&lprf_server_api_ssobj, &lprf_server_api_bcfobj, &lprf_server_api_crcobj);
/* add api services to rpc server */
erpc_add_service_to_server_ref(create_BLEmesh_api_service(), &lprf_server_api_ssobj);
erpc_add_service_to_server_ref(create_BLEstk_api_service(), &lprf_server_api_ssobj);
/* bind callback interfaces to client manager */
lprf_cbk_cm = lprf_server_cbk_acmobj;
}
This should give you an idea of how we implemented this. We are interested in up-streaming this solution, but have not found the time. Let us know if you are interested.
c_client_source.template.txt erpc_client_setup.h.txt erpc_client_setup.cpp.txt
@willmann9527 For the issue of client getting stuck, in a way a eRPC client timeout for each eRPC call. I am also concerned about this same issue in my implementation. Need a mechanism in client to unblock the execution flow.
Can you share how you attempted this - maybe that sample setter function (method) code and Sem variable handling ?
Talking about this idea - "Solution: Replace the following timeout kWaitForever by a member variable mu32_ReceiveTimeout, which is going to be initialized in the constructor and can be set by a setter method. info->m_sem.get(Semaphore::kWaitForever);"
Thanks.
@anirudhgargi Frankly speaking, in the end I decided against the arbitrater. But it should be not that hard. Just use a member, e.g. mu32_Timeout, and set it in the constructor to Semaphore::kWaitForever. Then set up a setter procedure, for instance SetTimeout(uint32 u32_InTimeout) { mu32_Timeout = u32_InTimeout; }. Last but not least, use the member mu32_Timeout instead of the Semaphore::kWaitForever: info->m_sem.get(mu32_Timeout);. Was that helpful?
@willmann9527 Thanks for the prompt reply. Let me implement and validate. Will share feedback soon.
I think it would be better to release pending clients in erpc_transport_arbitrator::receive with something like
erpc_status_t err = m_sharedTransport->receive(message);
if (err)
{
// if we timeout, we must unblock all pending client(s)
if (err == kErpcStatus_Timeout)
{
PendingClientInfo *client = m_clientList;
for (; client; client = client->m_next)
{
if (client->m_isValid)
{
client->m_sem.put();
}
}
}
return err;
}
Hi @pgu-swir @MichalPrincNXP do you think this jira can be closed based on merged PR?
I would close it if others involved agree too. @willmann9527 @RamseyTI @anirudhgargi any comments?
Closing for inactivity.