
Opening a socket with fd>=1024 crashes the VM

akgrant43 opened this issue on Oct 17 '22 · 4 comments

Bug description

The VM can crash with an access violation on Linux when a socket is opened with a file descriptor >= 1024.

To Reproduce

aio.c stores the handlers for async io in three arrays:

static aioHandler rdHandler[FD_SETSIZE];
static aioHandler wrHandler[FD_SETSIZE];
static aioHandler exHandler[FD_SETSIZE];

FD_SETSIZE is typically 1024, so when a socket with an fd of 1024 or greater is opened and a read operation is performed, the read handler is written out of bounds: rdHandler[1024] lands in wrHandler[0] (the arrays are laid out contiguously). wrHandler[0] is the write-handler slot for fd 0, so the next time console IO is pending and aioPoll() is called, the VM crashes because the socket's dataHandler() is incorrectly invoked with NULL client data.

Expected behavior

All IO completes successfully and the VM doesn't crash.

Screenshots

The gdb session describing the crash is included below.

Version information:

  • OS: Ubuntu
  • Version: 22.04
  • Pharo Version: 10

Expected development cost

https://access.redhat.com/solutions/488623 suggests using epoll or libevent instead of select().

The OpenSmalltalk VM has already incorporated this change into its AIO layer (https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/1bb96795fca099baedd8f45e59c847786aae91d6/platforms/unix/vm/aio.c#L730), which would provide a head start for implementing the change here.

Additionally, if epoll() isn't available, an error should be raised when an attempt is made to poll an fd >= FD_SETSIZE (typically 1024).

Additional context

The test harness we set up has Pharo acting as a server with 1500 clients connected to it, responding to their requests and logging activity to the console.

Running Pharo under gdb and triggering the issue results in:

Thread 1 "GlamorousToolki" received signal SIGSEGV, Segmentation fault.
0x00007ffff240c3e8 in dataHandler (fd=0, data=0x0, flags=4) at extracted/plugins/SocketPlugin/src/common/SocketPluginImpl.c:591
591	  logTrace("dataHandler(%d=%d, %p, %d)\n", fd, pss->s, data, flags);
(gdb) 


(gdb) bt
#0  0x00007ffff240c3e8 in dataHandler (fd=0, data=0x0, flags=4)
    at extracted/plugins/SocketPlugin/src/common/SocketPluginImpl.c:591
#1  0x00007ffff712235b in aio_handle_events (microSeconds=50000)
    at extracted/vm/src/unix/aio.c:314
#2  0x00007ffff7121cde in aioPoll (microSeconds=50000)
    at extracted/vm/src/unix/aio.c:216
#3  0x00007ffff711ce25 in ioRelinquishProcessorForMicroseconds (microSeconds=50000)
    at extracted/vm/src/common/heartbeat.c:307
#4  0x00007ffff70effd1 in primitiveRelinquishProcessor ()
    at generated/64/vm/src/cointerp.c:35534
#5  0x000000002c789c76 in  ()
#6  0x00007ffff5701640 in  ()
#7  0x000000002de5b190 in  ()
#8  0x0000000100000000 in  ()
#9  0x000000002e163a50 in  ()
#10 0x00007fffffff6180 in  ()
#11 0x00007ffff70720e0 in interpret ()
    at generated/64/vm/src/cointerp.c:2978


(gdb) list
586	/* read or write data transfer is now possible for the socket. */
587	
588	static void dataHandler(int fd, void *data, int flags)
589	{
590	  privateSocketStruct *pss= (privateSocketStruct *)data;
591	  logTrace("dataHandler(%d=%d, %p, %d)\n", fd, pss->s, data, flags);
592	
593	  if (pss == NULL)
594	    {
595	      logTrace("dataHandler: pss is NULL fd=%d data=%p flags=0x%x\n", fd, data, flags);


(gdb) print data
$1 = (void *) 0x0
(gdb) print pss
$2 = (privateSocketStruct *) 0x0


(gdb) up
#1  0x00007ffff712235b in aio_handle_events (microSeconds=50000)
    at extracted/vm/src/unix/aio.c:314
314	            handler(fd, clientData[fd], AIO_W);
(gdb) list
309	        }
310	        //_DO(AIO_W, wr)
311	        if (FD_ISSET(fd, &wr)) {
312	            handler = wrHandler[fd];
313	            FD_CLR(fd, &wrMask);
314	            handler(fd, clientData[fd], AIO_W);
315	            wrHandler[fd]= undefinedHandler;
316	        }
317	        //_DO(AIO_X, ex)
318	        if (FD_ISSET(fd, &ex)) {


(gdb) print fd
$3 = 0
(gdb) print clientData[fd]
$4 = (void *) 0x0
(gdb) print wrHandler[fd]
$5 = (aioHandler) 0x7ffff240c3a0 <dataHandler>

— akgrant43, Oct 17 '22

Thanks Alistair. Pablo is about to pick up the work he did on sockets, and this will probably also go on the socket improvement todo list. @tesonep

— Ducasse, Oct 17 '22

Great to hear, thanks!

— akgrant43, Oct 17 '22

Fixed in https://github.com/pharo-project/pharo-vm/pull/805

— tesonep, May 15 '24

This is great!!! @tesonep, is that fix only for sockets, or does it affect files as well? Do you remember when I complained about the usage of select() there? Thanks @akgrant43 for the fix.

— noha, May 16 '24

Fixed by: https://github.com/pharo-project/pharo-vm/pull/805

— akgrant43, Jul 09 '24