From ca7ab040466b030281a9aacabbb8bddce803b0e3 Mon Sep 17 00:00:00 2001
From: "Rob Swindell (on Windows)" <rob@synchro.net>
Date: Fri, 2 Jun 2023 17:36:15 -0700
Subject: [PATCH] Add a 60-second timeout to sbbs_t::passthru_socket_activate()

Keyop reported an issue via IRC whereby a user who failed to download a file
would leave the node "hung" in "downloading via telnet" node status even
though the user had long since disconnected and the log reflected that the
terminal server was aware of this:

term Node 4 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
term Node 4 <user> sexyz: !zmodem_recv_header TIMEOUT
term Node 4 <user> external Timeout waiting for output buffer to empty
<minutes later>
term Node 4 connection reset by peer on send
term Node 4 !ERROR 32 sending on socket 102
term Node 4 !ERROR 32 sending on socket 102
term Node 4 !ERROR 32 sending on socket 102
term Node 4 !ERROR 32 sending on socket 102
term Node 4 !ERROR 32 sending on socket 102
term Node 4 disconnected
term Node 4 !ERROR 32 sending on socket 102

and

term Node 3 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
term Node 3 <user> sexyz: !zmodem_recv_header TIMEOUT
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
term Node 3 <user> sexyz: !zmodem_recv_header TIMEOUT
term Node 3 <user> external Timeout waiting for output buffer to empty
<minutes later>
term Node 3 connection reset by peer on receive
term Node 3 !ERROR 32 sending on socket 96

These nodes were then locked up in a call to passthru_socket_activate(false),
as reported by gdb.

Looking at the deactivation path of passthru_socket_activate() (called at the
end of external() in this case), it was clear that this loop could spin
forever if the user had disconnected:

    do { // Allow time for the passthru_thread to move any pending socket data to the outbuf
        SLEEP(100); // Before the node_thread starts sending its own data to the outbuf
    } while(RingBufFull(&outbuf));

These flush/purge loops aren't strictly needed if the user has disconnected,
but as the logs above show, the terminal server may not know that yet
(the socket may not indicate disconnect) when passthru_socket_activate()
is called by external().

So, worst case, limit the activation and deactivation buffer flushes and
purges to a total of 60 seconds per call.
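
The bounded-wait pattern can be sketched in isolation as follows. This is a
minimal illustration only, not the Synchronet code: wait_while() and its
parameters are hypothetical names, and it uses std::chrono::steady_clock
where the patch itself compares against time(NULL).

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of Synchronet): sleep for `interval`, then
// poll `busy`, repeating until `busy` returns false or `limit` has elapsed
// since the call began. Returns true if the condition cleared, false if the
// deadline expired while the condition was still true.
bool wait_while(const std::function<bool()>& busy,
                std::chrono::milliseconds limit,
                std::chrono::milliseconds interval)
{
    const auto deadline = std::chrono::steady_clock::now() + limit;
    bool still_busy;
    do {
        // Sleep first, as the original loop does, so the other thread gets
        // a chance to make progress before the first check.
        std::this_thread::sleep_for(interval);
        still_busy = busy();
    } while (still_busy && std::chrono::steady_clock::now() < deadline);
    return !still_busy;
}
```

A caller could pass a lambda that wraps the buffer check and treat a false
return as "give up and carry on", which is what the added timeout does here.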
---
 src/sbbs3/main.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/sbbs3/main.cpp b/src/sbbs3/main.cpp
index 0f4f35e9f8..f0fcf15992 100644
--- a/src/sbbs3/main.cpp
+++ b/src/sbbs3/main.cpp
@@ -2196,9 +2196,11 @@ void input_thread(void *arg)
 // to eliminate any stale data from the previous passthru session
 void sbbs_t::passthru_socket_activate(bool activate)
 {
+	time_t timeout = time(NULL) + 60;
+
 	if(activate) {
 		BOOL rd = FALSE;
-		while(socket_check(client_socket_dup, &rd, /* wr_p */NULL, /* timeout */0) && rd) {
+		while(socket_check(client_socket_dup, &rd, /* wr_p */NULL, /* timeout */0) && rd && time(NULL) < timeout) {
 			char ch;
 			if(recv(client_socket_dup, &ch, sizeof(ch), /* flags: */0) != sizeof(ch))
 				break;
@@ -2215,7 +2217,7 @@ void sbbs_t::passthru_socket_activate(bool activate)
 
 		do { // Allow time for the passthru_thread to move any pending socket data to the outbuf
 			SLEEP(100); // Before the node_thread starts sending its own data to the outbuf
-		} while(RingBufFull(&outbuf));
+		} while(RingBufFull(&outbuf) && time(NULL) < timeout);
 	}
 	passthru_socket_active = activate;
 }
-- 
GitLab