Skip to content
Snippets Groups Projects
Commit ffe092b6 authored by Rob Swindell's avatar Rob Swindell :speech_balloon:
Browse files

Set initial state of 'data_event' and 'highwater_event' to *not* signaled

Since commit fafed094 (2 months ago), the Synchronet Web Server on
my Windows systems has occasionally gone into a state where every HTTP session
thread was causing 100% CPU utilization and each new HTTP session thread
would just exacerbate the problem eventually leading to complete system
instability/unresponsiveness until the sbbs instance was terminated. This was
more readily reproducible on a Win7-32 VM, but would occasionally, though
much less frequently, happen in a native instance on Win10-64 as well.

Using the VisualStudio debugger, I was able to narrow it down to this:

Each new HTTP thread (eventually, hundreds of such threads) would get stuck
in this loop in http_output_thread():

    while(session->socket!=INVALID_SOCKET) {

		/* Wait for something to output in the RingBuffer */
		if((avail=RingBufFull(obuf))==0) {	/* empty */
			if(WaitForEvent(obuf->data_event, 1000) != WAIT_OBJECT_0)
				continue;
			/* Check for spurious sem post... */
			if((avail=RingBufFull(obuf))==0)
				continue; // <--- data_event signaled, but never cleared
		}

There appears to be a race condition where this logic could be executed
immediately after the output ringbuf was created, but before writebuf() was
ever called (which would have actually placed data in the output buffer),
causing a potential high-utilization infinite loop: the data_event is signaled
but there is no data and the event is never reset and nothing can ever add
data to the ringbuf due to starvation of CPU cycles.

Uses of ringbuf's data_event elsewhere in Synchronet don't seem to be subject
to this issue since they always call RingBufRead after, which will clear the
data_event when the ringbuf is actually empty (no similar loops to this one).

The root cause just appears to be a simple copy/paste issue in the code added
to RingBufInit(): the preexisting 'empty_event' was initialized with a
correct initiate state of 'signaled' (because by default a ringbuf is empty)
but the newly added events (data_event and highwater_event) should *not* be
initially-signaled because... the ringbuf is empty. So I added some parameter
comments to these calls to CreateEvent() to hopefully make that more clear
and prevent similar mistakes in the future.

Relieved to have this one resolved finally.
parent 2e934268
No related branches found
No related tags found
1 merge request!463MRC mods by Codefenix (2024-10-20)
Pipeline #3972 passed
/* ringbuf.c */
/* Synchronet ring buffer routines */
/* $Id: ringbuf.c,v 1.32 2019/08/26 23:37:52 rswindell Exp $ */
/****************************************************************************
* @format.tab-size 4 (Plain Text/Source Code File Header) *
* @format.use-tabs true (see http://www.synchro.net/ptsc_hdr.html) *
......@@ -17,21 +13,9 @@
* See the GNU General Public License for more details: gpl.txt or *
* http://www.fsf.org/copyleft/gpl.html *
* *
* Anonymous FTP access to the most recent released source is available at *
* ftp://vert.synchro.net, ftp://cvs.synchro.net and ftp://ftp.synchro.net *
* *
* Anonymous CVS access to the development source and modification history *
* is available at cvs.synchro.net:/cvsroot/sbbs, example: *
* cvs -d :pserver:anonymous@cvs.synchro.net:/cvsroot/sbbs login *
* (just hit return, no password is necessary) *
* cvs -d :pserver:anonymous@cvs.synchro.net:/cvsroot/sbbs checkout src *
* *
* For Synchronet coding style and modification guidelines, see *
* http://www.synchro.net/source.html *
* *
* You are encouraged to submit any modifications (preferably in Unix diff *
* format) via e-mail to mods@synchro.net *
* *
* Note: If this box doesn't appear square, then you need to fix your tabs. *
****************************************************************************/
......@@ -96,9 +80,9 @@ int RingBufInit( RingBuf* rb, DWORD size
rb->pEnd=rb->pStart+size;
rb->size=size;
#ifdef RINGBUF_EVENT
rb->data_event=CreateEvent(NULL,TRUE,TRUE,NULL);
rb->highwater_event=CreateEvent(NULL,TRUE,TRUE,NULL);
rb->empty_event=CreateEvent(NULL,TRUE,TRUE,NULL);
rb->data_event=CreateEvent(NULL, /* ManualReset: */TRUE, /* InitialState: */FALSE, NULL);
rb->highwater_event=CreateEvent(NULL, /* ManualReset: */TRUE, /* InitialState: */FALSE, NULL);
rb->empty_event=CreateEvent(NULL, /* ManualReset: */TRUE, /* InitialState: */TRUE, NULL);
#endif
#ifdef RINGBUF_MUTEX
pthread_mutex_init(&rb->mutex,NULL);
......
  • Author Owner

    The correct culprit commit was a2aa9631 (the one after fafed...)

0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment