lib60870
lib60870 copied to clipboard
Use after free in handleConnection (cs104_connection.c:821)
Hello.
Valgrind says:
==10956== Thread 5:
==10956== Invalid write of size 1
==10956== at 0x5061EA6: handleConnection (cs104_connection.c:821)
==10956== by 0x508F668: start_thread (pthread_create.c:479)
==10956== by 0x4B37322: clone (clone.S:95)
==10956== Address 0x4c5dcd0 is 496 bytes inside a block of size 552 free'd
==10956== at 0x483BA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==10956== by 0x5063C06: Memory_free (lib_memory.c:79)
==10956== by 0x50613D9: CS104_Connection_destroy (cs104_connection.c:431)
==10956== by 0x504FD02: process_meter_connection (lib.c:972)
==10956== by 0x504FE1C: iec_104_fetch_thread (lib.c:995)
==10956== by 0x506397B: destroyAutomaticThread (thread_linux.c:87)
==10956== by 0x508F668: start_thread (pthread_create.c:479)
==10956== by 0x4B37322: clone (clone.S:95)
==10956== Block was alloc'd at
==10956== at 0x483A7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==10956== by 0x5063B5A: Memory_malloc (lib_memory.c:44)
==10956== by 0x5060CF2: createConnection (cs104_connection.c:199)
==10956== by 0x5060E4E: CS104_Connection_create (cs104_connection.c:243)
==10956== by 0x504FD85: iec_104_fetch_thread (lib.c:984)
==10956== by 0x506397B: destroyAutomaticThread (thread_linux.c:87)
==10956== by 0x508F668: start_thread (pthread_create.c:479)
==10956== by 0x4B37322: clone (clone.S:95)
==10956==
This error may occur when connection was closed and user program set flag about this in connectionHandler() and then call CS104_Connection_destroy() on this connection fast enough.
Also, if user calls CS104_Connection_destroy() in connectionHandler() on CONNECTION_CLOSED event - following error occurs before cs104_connection.c:821 :
==13643== Thread 4:
==13643== Invalid read of size 8
==13643== at 0x12AE62: handleConnection (cs104_connection.c:817)
==13643== by 0x488B668: start_thread (pthread_create.c:479)
==13643== by 0x49C7322: clone (clone.S:95)
==13643== Address 0x4a98f28 is 488 bytes inside a block of size 552 free'd
==13643== at 0x483BA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==13643== by 0x12CBD5: Memory_free (lib_memory.c:79)
==13643== by 0x12A3A8: CS104_Connection_destroy (cs104_connection.c:431)
==13643== by 0x118CFA: connectionHandler (lib.c:675)
==13643== by 0x12AE50: handleConnection (cs104_connection.c:804)
==13643== by 0x488B668: start_thread (pthread_create.c:479)
==13643== by 0x49C7322: clone (clone.S:95)
==13643== Block was alloc'd at
==13643== at 0x483A7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==13643== by 0x12CB29: Memory_malloc (lib_memory.c:44)
==13643== by 0x129CC1: createConnection (cs104_connection.c:199)
==13643== by 0x129E1D: CS104_Connection_create (cs104_connection.c:243)
==13643== by 0x119280: iec_104_fetch_thread (lib.c:996)
==13643== by 0x12C94A: destroyAutomaticThread (thread_linux.c:87)
==13643== by 0x488B668: start_thread (pthread_create.c:479)
==13643== by 0x49C7322: clone (clone.S:95)
So, there is 2 memory errors at once (and one of them is 1-byte memory corruption).
I think this is not a problem of the library. It is beause you call the destructor (CS104_Connection_destroy) while being inside of a callback function of the object you want to delete. You shouldn't call CS104_Connection_destroy inside the callback function.
This is a library problem, because there is no thread synchronizing between connection thread and user threads, and so right after flag for calling destructor was set in the callback function it might be read in user thread and destructor will be called while connection thread is still in handleConnection(), and illegal memory access will occur.
What flag are you referring to? AFAICS the CS104_Connection_destroy function is waiting for the connection thread before it releases any memory.
I'm referring for a (for example) bool flag passed as bool* pointer through void *parameter argument to the connectionHandler().
I could not reproduce this problem right now, but sometimes attached simple test program (simplified version of simple_client.c) calls connectionHandler twice for CS104_CONNECTION_CLOSED event:
Connection established
StartDT_con received
RECVD ASDU type: M_ME_TF_1(36) elements: 1 cot SPONTANEOUS(3)
RECVD ASDU type: C_IC_NA_1(100) elements: 1 cot ACTIVATION_CON(7)
RECVD ASDU type: M_BO_NA_1(7) elements: 5 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 28 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 28 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NB_1(11) elements: 18 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 32 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 32 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 32 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 32 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_SP_NA_1(1) elements: 24 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: C_IC_NA_1(100) elements: 1 cot ACTIVATION_TERMINATION(10)
RECVD ASDU type: M_ME_TF_1(36) elements: 1 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 2 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 3 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 1 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 3 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 2 cot SPONTANEOUS(3)
StopDT_con received
Connection closed
Connection closed
And, this looks like the real issue, sorry for misleading.
Just reproduced the issue with exaclty same test program, here is the Valgrind report:
==14536== Memcheck, a memory error detector
==14536== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==14536== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==14536== Command: ./examples/cs104_client/simple_client
==14536==
Connection established
StartDT_con received
RECVD ASDU type: C_IC_NA_1(100) elements: 1 cot ACTIVATION_CON(7)
RECVD ASDU type: M_BO_NA_1(7) elements: 5 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 28 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 28 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NB_1(11) elements: 18 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 32 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 32 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 32 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_ME_NC_1(13) elements: 32 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: M_SP_NA_1(1) elements: 24 cot INTERROGATED_BY_STATION(20)
RECVD ASDU type: C_IC_NA_1(100) elements: 1 cot ACTIVATION_TERMINATION(10)
RECVD ASDU type: M_ME_TF_1(36) elements: 2 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 1 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 1 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 2 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 1 cot SPONTANEOUS(3)
RECVD ASDU type: M_ME_TF_1(36) elements: 2 cot SPONTANEOUS(3)
StopDT_con received
Connection closed
Connection closed
==14536== Thread 2:
==14536== Invalid read of size 8
==14536== at 0x1240C7: handleConnection (cs104_connection.c:818)
==14536== by 0x4879668: start_thread (pthread_create.c:479)
==14536== by 0x49B5322: clone (clone.S:95)
==14536== Address 0x4a87228 is 488 bytes inside a block of size 552 free'd
==14536== at 0x483BA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==14536== by 0x125E51: Memory_free (lib_memory.c:79)
==14536== by 0x1235F9: CS104_Connection_destroy (cs104_connection.c:431)
==14536== by 0x1148A2: main (simple_client.c:53)
==14536== Block was alloc'd at
==14536== at 0x483A7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==14536== by 0x125DA5: Memory_malloc (lib_memory.c:44)
==14536== by 0x122F0F: createConnection (cs104_connection.c:199)
==14536== by 0x12306E: CS104_Connection_create (cs104_connection.c:243)
==14536== by 0x114831: main (simple_client.c:44)
==14536==
==14536==
==14536== HEAP SUMMARY:
==14536== in use at exit: 472 bytes in 3 blocks
==14536== total heap usage: 89 allocs, 86 frees, 25,362 bytes allocated
==14536==
==14536== LEAK SUMMARY:
==14536== definitely lost: 192 bytes in 1 blocks
==14536== indirectly lost: 0 bytes in 0 blocks
==14536== possibly lost: 272 bytes in 1 blocks
==14536== still reachable: 8 bytes in 1 blocks
==14536== suppressed: 0 bytes in 0 blocks
==14536== Rerun with --leak-check=full to see details of leaked memory
==14536==
==14536== For lists of detected and suppressed errors, rerun with: -s
==14536== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Thanks for the example code. I see a problem with calling "CS104_Connection_close" inside the connecton handler (when stop dt is received). It shouldn't be called there because it is the function that waits for the connection handling thread to terminate (and is also called by CS104_Connection_destroy for this purpose). I guess it should be explicitely mentioned what functions can be called inside the callbacks and what functions cannot be called. For sure I could also add some synchronization code to allow the function to be called inside the callback. Yet I don't like the idea to blow up the synchronization code.
I'd applied following patch to the sources of library:
diff --git a/lib60870-C/src/hal/inc/hal_thread.h b/lib60870-C/src/hal/inc/hal_thread.h
index 8af89e9..7cba336 100644
--- a/lib60870-C/src/hal/inc/hal_thread.h
+++ b/lib60870-C/src/hal/inc/hal_thread.h
@@ -80,8 +80,10 @@ Thread_start(Thread thread);
* \brief Destroy a Thread and free all related resources.
*
* \param thread the Thread instance to destroy
+ *
+ * \return was Thread sucessfully destroyed or not
*/
-void
+bool
Thread_destroy(Thread thread);
/**
diff --git a/lib60870-C/src/hal/thread/linux/thread_linux.c b/lib60870-C/src/hal/thread/linux/thread_linux.c
index 72fe875..d5a53ba 100644
--- a/lib60870-C/src/hal/thread/linux/thread_linux.c
+++ b/lib60870-C/src/hal/thread/linux/thread_linux.c
@@ -22,6 +22,9 @@
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>
+#include <errno.h>
+#include <memory.h>
+#include <stdio.h>
#include "hal_thread.h"
#include "lib_memory.h"
@@ -104,14 +107,21 @@ Thread_start(Thread thread)
thread->state = 1;
}
-void
+bool
Thread_destroy(Thread thread)
{
if (thread->state == 1) {
- pthread_join(thread->pthread, NULL);
+ if (pthread_join(thread->pthread, NULL) != 0) {
+ char error_msg[256];
+ int saved_errno = errno;
+ strerror_r(saved_errno, error_msg, sizeof(error_msg));
+ fprintf(stderr, "ERROR: Thread_destroy() failed: %s (%d)\n", error_msg, saved_errno);
+ return false;
+ }
}
GLOBAL_FREEMEM(thread);
+ return true;
}
void
diff --git a/lib60870-C/src/iec60870/cs104/cs104_connection.c b/lib60870-C/src/iec60870/cs104/cs104_connection.c
index 9544a0d..a168da9 100644
--- a/lib60870-C/src/iec60870/cs104/cs104_connection.c
+++ b/lib60870-C/src/iec60870/cs104/cs104_connection.c
@@ -410,7 +410,9 @@ CS104_Connection_close(CS104_Connection self)
#if (CONFIG_USE_THREADS == 1)
if (self->connectionHandlingThread)
{
- Thread_destroy(self->connectionHandlingThread);
+ if (!Thread_destroy(self->connectionHandlingThread)) {
+ return;
+ }
self->connectionHandlingThread = NULL;
}
#endif
And got following output:
StopDT_con received
ERROR: Thread_destroy() failed: Operation now in progress (115)
Connection closed
So, it's possible to notify user about this problem, or even print message about invalid function calls from callbacks if connection thread will set an internal flag ("in_callback", for example) before entering callback (and clear this flag after callback functions returned) and check this flag inside "dangerous" functions, so user will be able to see and fix those issues.
I'd seen some other places where return values of system functions was not properly checked, and i'm think those places potentially can cause such problems too.
Of course, the patch above will cause memory leaks (which is a bit better than memory corruptions or segfaults, i think), so it can't completely replace correct fix for this problem (but can detect the problems).