    Add a 60-second timeout to sbbs_t::passthru_socket_activate() · ca7ab040
    Rob Swindell authored
    Keyop reported an issue via IRC whereby a user who failed to download a file
    would leave the node "hung" in "downloading via telnet" node status even
    though the user had long since disconnected and the log reflected that the
    terminal server was aware of this:
    
    term Node 4 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
    term Node 4 <user> sexyz: !zmodem_recv_header TIMEOUT
    term Node 4 <user> external Timeout waiting for output buffer to empty
    <minutes later>
    term Node 4 connection reset by peer on send
    term Node 4 !ERROR 32 sending on socket 102
    term Node 4 !ERROR 32 sending on socket 102
    term Node 4 !ERROR 32 sending on socket 102
    term Node 4 !ERROR 32 sending on socket 102
    term Node 4 !ERROR 32 sending on socket 102
    term Node 4 disconnected
    term Node 4 !ERROR 32 sending on socket 102
    
    and
    
    term Node 3 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
    term Node 3 <user> sexyz: !zmodem_recv_header TIMEOUT
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !Receive timeout (1 seconds)
    term Node 3 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
    term Node 3 <user> sexyz: !zmodem_recv_header TIMEOUT
    term Node 3 <user> external Timeout waiting for output buffer to empty
    <minutes later>
    term Node 3 connection reset by peer on receive
    term Node 3 !ERROR 32 sending on socket 96
    
    These nodes were then locked up in a call to passthru_socket_activate(false),
    as reported by gdb.
    
    Looking at the deactivation path of passthru_socket_activate() (called at
    the end of external() in this case), it was clear that this loop could run
    forever if the user had disconnected:
    
        do { // Allow time for the passthru_thread to move any pending socket data to the outbuf
            SLEEP(100); // Before the node_thread starts sending its own data to the outbuf
        } while(RingBufFull(&outbuf));
    
    These flush/purge loops aren't strictly needed if the user has disconnected,
    but as can be seen by the above logs, the terminal server may not know that
    (the socket may not indicate disconnect) before passthru_socket_activate()
    is called by external().
    
    So, worst case, the activation and deactivation buffer flushes and purges
    now give up after 60 seconds.
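
    A minimal sketch of what such a bounded wait could look like, written as a
    standalone helper rather than as the actual change. The name wait_while and
    its parameters are hypothetical and not part of Synchronet; the predicate
    stands in for a check like RingBufFull(&outbuf), and the real commit may be
    structured differently.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not a Synchronet function): sleep in `interval` steps
// while `still_busy()` returns true, but stop once `timeout` has elapsed.
// Returns true if the condition cleared, false if the wait timed out.
bool wait_while(const std::function<bool()>& still_busy,
                std::chrono::milliseconds interval,
                std::chrono::milliseconds timeout)
{
    assert(interval.count() > 0); // a zero interval would turn this into a busy loop
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    do {
        // Sleep first, like the original do/while, so the other thread gets
        // a chance to move pending data before the condition is checked.
        std::this_thread::sleep_for(interval);
        if (!still_busy())
            return true;
    } while (std::chrono::steady_clock::now() < deadline);
    return false; // gave up: the peer has probably gone away
}
```

    With a 100 ms interval and a 60-second timeout, a call such as
    wait_while(is_full, std::chrono::milliseconds(100), std::chrono::seconds(60))
    would behave like the loop quoted above while the buffer drains normally,
    and return false instead of hanging the node if it never does (is_full
    being a caller-supplied predicate, also hypothetical).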