Most of the 11 TCP states are pretty easy to understand and most programmers know what they mean:
- CLOSED: There is no connection.
- LISTEN: The local end-point is waiting for a connection request from a remote end-point i.e. a passive open was performed.
- SYN-SENT: The first step of the three-way connection handshake was performed. A connection request has been sent to a remote end-point i.e. an active open was performed.
- SYN-RECEIVED: The second step of the three-way connection handshake was performed. An acknowledgement for the received connection request as well as a connection request has been sent to the remote end-point.
- ESTABLISHED: The third step of the three-way connection handshake was performed. The connection is open.
- FIN-WAIT-1: The first step of an active close (four-way handshake) was performed. The local end-point has sent a connection termination request to the remote end-point.
- CLOSE-WAIT: The local end-point has received a connection termination request and acknowledged it, i.e. a passive close has been performed, and the local end-point needs to perform an active close to leave this state.
- FIN-WAIT-2: The remote end-point has sent an acknowledgement for the previously sent connection termination request. The local end-point waits for an active connection termination request from the remote end-point.
- LAST-ACK: The local end-point has performed a passive close and has initiated an active close by sending a connection termination request to the remote end-point.
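- CLOSING: Both end-points sent a connection termination request at about the same time (a simultaneous close). The local end-point has acknowledged the remote request and is waiting for an acknowledgement of its own request before going to the TIME-WAIT state.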
- TIME-WAIT: The local end-point waits for twice the maximum segment lifetime (MSL) to pass before going to CLOSED to be sure that the remote end-point received the acknowledgement.
Most people working with high-level programming languages only really know the states CLOSED, LISTEN and ESTABLISHED. With netstat, chances are that you will not see connections in the SYN_SENT, SYN_RECV, FIN_WAIT_1, LAST_ACK or CLOSING states. A TCP end-point usually stays in these states for only a very short time, and if many connections are stuck in them for longer, something is seriously wrong.
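To check how many connections are in each state without reading netstat output by eye, a small script can tally them. The following is a minimal sketch, assuming a Linux host where `/proc/net/tcp` exposes one connection per line with the state as a hexadecimal code in the fourth column. The function name `count_tcp_states` is made up for this example, and the state names are spelled the way this article spells them.

```python
from collections import Counter

# Linux state codes from the "st" column of /proc/net/tcp,
# mapped to the spelling used in this article.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT_1", "05": "FIN_WAIT_2", "06": "TIME_WAIT",
    "07": "CLOSED", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count connections per TCP state in text formatted like /proc/net/tcp."""
    counts = Counter()
    # The first line is a column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts
```

On a Linux machine it could be fed with `open("/proc/net/tcp").read()`. A large CLOSE_WAIT or FIN_WAIT_2 count is the kind of result worth investigating.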
FIN_WAIT_2, TIME_WAIT and CLOSE_WAIT are more common. They are all related to the connection termination four-way handshake. Here is a short overview of the states involved:
The upper part shows the states on the end-point initiating the termination. The lower part shows the states on the other end-point.
So the initiating end-point (the client in this example) sends a termination request to the server and waits for an acknowledgement in state FIN-WAIT-1. The server sends an acknowledgement and goes into CLOSE-WAIT. The client goes into FIN-WAIT-2 when it receives the acknowledgement and waits for the server to perform its own active close. When the server sends its own termination request, it goes into LAST-ACK and waits for an acknowledgement from the client. When the client receives the termination request from the server, it sends an acknowledgement, goes into TIME-WAIT and, after some time, into CLOSED. The server goes into CLOSED once it receives the acknowledgement from the client.
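The sequence just described can be written down as a small transition table. This is only an illustrative sketch of the normal four-way close, not a full TCP state machine: it leaves out simultaneous close (CLOSING), resets and timeouts, and the event labels and the `walk` helper are invented for this example.

```python
# (current state, event) -> next state, for the normal four-way close only.
CLOSE_TRANSITIONS = {
    # End-point that initiates the termination (active close)
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN, send ACK"): "TIME-WAIT",
    ("TIME-WAIT", "2*MSL elapsed"): "CLOSED",
    # Other end-point (passive close, then its own active close)
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE-WAIT",
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
}


def walk(start, events):
    """Return the list of states visited when applying events in order."""
    states = [start]
    for event in events:
        states.append(CLOSE_TRANSITIONS[(states[-1], event)])
    return states


client_path = walk("ESTABLISHED",
                   ["send FIN", "recv ACK", "recv FIN, send ACK", "2*MSL elapsed"])
server_path = walk("ESTABLISHED",
                   ["recv FIN, send ACK", "send FIN", "recv ACK"])
```

Note that the only step in the table not driven by the network or a timer is `("CLOSE-WAIT", "send FIN")`: it happens only when the application on that side closes its socket.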
If many sockets that were connected to a specific remote application end up stuck in the FIN_WAIT_2 state, it usually indicates that the remote application either always dies unexpectedly while in the CLOSE_WAIT state or simply fails to perform an active close after the passive close.
The timeout for sockets in the FIN-WAIT-2 state is defined by the parameter tcp_fin_timeout. You should set it to a value high enough that the remote end-point has time to perform its active close if it is going to do so. On the other hand, sockets in this state do use some memory (even though not much), and this could exhaust memory if too many sockets are stuck in this state for too long.
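Before changing the timeout it helps to know its current value. The following is a minimal sketch, assuming a Linux host where the parameter is exposed as a number of seconds in `/proc/sys/net/ipv4/tcp_fin_timeout`. The helper names are made up for this example.

```python
FIN_TIMEOUT_PATH = "/proc/sys/net/ipv4/tcp_fin_timeout"  # Linux-specific path


def parse_fin_timeout(text):
    """Convert the file content (e.g. "60\n") into a number of seconds."""
    return int(text.strip())


def read_fin_timeout(path=FIN_TIMEOUT_PATH):
    """Read the current FIN-WAIT-2 timeout in seconds from the given file."""
    with open(path) as f:
        return parse_fin_timeout(f.read())
```

Calling `read_fin_timeout()` on such a host returns the value currently in effect. Changing it requires administrator rights and is not shown here.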
The TIME-WAIT state means that, from the local end-point’s point of view, the connection is closed, but we’re still waiting before accepting a new connection so that delayed duplicate packets from the previous connection cannot be accepted by the new one.
In this state, TCP blocks any second connection between these address/port pairs until the TIME_WAIT state is exited after waiting for twice the maximum segment lifetime (MSL).
In most cases, seeing many TIME_WAIT connections does not indicate a problem. You only have to start worrying when the number of TIME_WAIT connections causes performance problems or exhausts memory.
If you see that connections related to a given process tend to always end up in the CLOSE_WAIT state, it means that this process does not perform an active close after the passive close. When you write a program communicating over TCP, you should detect when the connection was closed by the remote host and close the socket appropriately. If you fail to do this, the socket will stay in the CLOSE_WAIT state until the process itself disappears.
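In most socket APIs, the remote close shows up as a read that returns no data. The following is a minimal sketch in Python of detecting that and closing the local socket so it does not linger in CLOSE_WAIT. The function name `drain_until_remote_close` is invented for this example, and a real application would also need to handle timeouts and errors.

```python
import socket


def drain_until_remote_close(sock, bufsize=4096):
    """Read until the peer closes its side, then close ours. Returns the data read."""
    chunks = []
    try:
        while True:
            data = sock.recv(bufsize)
            if not data:
                # An empty read means the remote end-point has closed
                # (the passive close has happened on our side).
                break
            chunks.append(data)
    finally:
        # Our own active close: without it the socket would stay in CLOSE_WAIT.
        sock.close()
    return b"".join(chunks)
```

The important part is the `finally` block: whatever happens while reading, the local side releases the socket instead of leaving it open after the peer has gone away.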
So basically, CLOSE_WAIT means the operating system knows that the remote application has closed the connection and is waiting for the local application to do the same. So you shouldn’t try to tune the TCP parameters to solve this, but instead check the application owning the connection on the local host. Since there is no CLOSE_WAIT timeout, a connection can stay in this state forever (or at least until the program eventually closes the connection or the process exits or is killed).
If you cannot fix the application or have it fixed, the solution is to kill the process holding the connection open. Of course, there is still a risk of losing data, since the local end-point may still have unsent data in a buffer. Also, if many applications run in the same process (as is the case for Java Enterprise applications), killing the owning process is not always an option.
I have never tried to force a CLOSE_WAIT connection closed using tcpkill, killcx or cutter, but if you can’t kill or restart the process holding the connection, it might be an option.