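Solutions for the CLOSE_WAIT State
I had never paid much attention to this problem until it came up in recent job interviews. I interviewed at two companies and both asked about it, so I had to start taking it seriously. If you are not familiar with the CLOSE_WAIT state, take a look at the TCP state transition diagram first.
Closing a socket can be an active close or a passive close. An active close is initiated by the local host. A passive close happens when the local host detects that the remote host has started closing and responds, which eventually closes the whole connection. Extracting the closing part of the state transitions gives the following picture: on an active close the socket moves ESTABLISHED -> FIN_WAIT_1 -> FIN_WAIT_2 -> TIME_WAIT -> CLOSED, and on a passive close it moves ESTABLISHED -> CLOSE_WAIT -> LAST_ACK -> CLOSED.
Causes
Based on these transitions, when is a connection in the CLOSE_WAIT state?
During a passive close, the connection is in CLOSE_WAIT from the moment it has received the peer's FIN until it sends its own FIN.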
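The passive-close path described above can be written down as a tiny state machine. The enum and function names below are made up for illustration and are not kernel identifiers; this is a sketch of the transitions only, not of a real TCP implementation.

```c
/* States on the passive-close path (names follow the usual TCP state diagram). */
enum tcp_state { ESTABLISHED, CLOSE_WAIT, LAST_ACK, CLOSED };

/* Events that drive the passive-close path. */
enum tcp_event { RECV_FIN, APP_CLOSE, RECV_ACK_OF_FIN };

/* Return the next state; an event that does not apply leaves the state unchanged. */
enum tcp_state passive_close_next(enum tcp_state s, enum tcp_event e)
{
    if (s == ESTABLISHED && e == RECV_FIN)
        return CLOSE_WAIT;              /* the peer closed first */
    if (s == CLOSE_WAIT && e == APP_CLOSE)
        return LAST_ACK;                /* the application calls close(), our FIN goes out */
    if (s == LAST_ACK && e == RECV_ACK_OF_FIN)
        return CLOSED;                  /* the peer acknowledged our FIN */
    return s;
}
```

The point to notice is that the only way out of CLOSE_WAIT is APP_CLOSE: nothing the peer does moves the connection forward until the local application closes the socket.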
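Normally the CLOSE_WAIT state should last only briefly, just like SYN_RCVD. In some situations, however, connections stay in CLOSE_WAIT for a long time.
A large number of CLOSE_WAIT connections usually means that the peer has closed the socket while our side, busy reading or writing, never closed its end. The code must check the result of every read: if read returns 0, close the connection; if read returns a negative value, check errno and close the connection unless it is EAGAIN.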
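The rule above might look like the following in code. This is a minimal sketch: the helper name read_or_close is hypothetical, and treating EWOULDBLOCK and EINTR the same way as EAGAIN is an assumption added here, not something the rule itself states.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read once from fd and report whether the caller
 * should keep the connection (1) or has had it closed (0).
 * The raw read() result is stored in *nread. */
int read_or_close(int fd, char *buf, size_t len, ssize_t *nread)
{
    ssize_t n = read(fd, buf, len);
    *nread = n;

    if (n > 0)
        return 1;               /* got data, keep the connection */
    if (n == 0) {
        close(fd);              /* peer sent FIN: close our side too */
        return 0;
    }
    /* n < 0: transient conditions are not a reason to close
     * (EWOULDBLOCK and EINTR are an assumption of this sketch) */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return 1;
    close(fd);                  /* real error: drop the connection */
    return 0;
}
```

A caller would stop using the descriptor as soon as the helper returns 0, since it has already been closed at that point.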
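Reference 4 describes producing CLOSE_WAIT connections by sending SYN-FIN segments. I have not tested this myself. Personally I think the protocol stack would drop such an illegal segment; if you are interested, try it and let me know the result ;-)
To illustrate the problem more clearly, let's write a test program. Note that this program is deliberately flawed.
All we need is a situation in which the peer has closed the socket while we are still reading, or in which we simply never close the socket.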
server.c:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MAXLINE 80
#define SERV_PORT 8000

int main(void)
{
    struct sockaddr_in servaddr, cliaddr;
    socklen_t cliaddr_len;
    int listenfd, connfd;
    char buf[MAXLINE];
    char str[INET_ADDRSTRLEN];
    int i, n;

    listenfd = socket(AF_INET, SOCK_STREAM, 0);

    /* Allow quick restarts of the test server on the same port */
    int opt = 1;
    setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(SERV_PORT);

    bind(listenfd, (struct sockaddr *)&servaddr, sizeof(servaddr));
    listen(listenfd, 20);

    printf("Accepting connections ...\n");
    while (1) {
        cliaddr_len = sizeof(cliaddr);
        connfd = accept(listenfd, (struct sockaddr *)&cliaddr, &cliaddr_len);
        //while (1)
        {
            /* Only one read per connection, so the peer's FIN is never seen */
            n = read(connfd, buf, MAXLINE);
            if (n == 0) {
                printf("the other side has been closed.\n");
                break;
            }
            printf("received from %s at PORT %d\n",
                   inet_ntop(AF_INET, &cliaddr.sin_addr, str, sizeof(str)),
                   ntohs(cliaddr.sin_port));

            for (i = 0; i < n; i++)
                buf[i] = toupper((unsigned char)buf[i]);
            write(connfd, buf, n);
        }
        // Deliberately do not close the socket here; adding a sleep before close() also works
        //sleep(5);
        //close(connfd);
    }
}
client.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MAXLINE 80
#define SERV_PORT 8000

int main(int argc, char *argv[])
{
    struct sockaddr_in servaddr;
    char buf[MAXLINE];
    int sockfd, n;
    char *str;

    if (argc != 2) {
        fputs("usage: ./client message\n", stderr);
        exit(1);
    }
    str = argv[1];

    sockfd = socket(AF_INET, SOCK_STREAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    inet_pton(AF_INET, "127.0.0.1", &servaddr.sin_addr);
    servaddr.sin_port = htons(SERV_PORT);

    connect(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr));

    /* Send the message, print the server's reply, then close (this sends our FIN) */
    write(sockfd, str, strlen(str));

    n = read(sockfd, buf, MAXLINE);
    printf("Response from server:\n");
    write(STDOUT_FILENO, buf, n);
    write(STDOUT_FILENO, "\n", 1);

    close(sockfd);
    return 0;
}
The result is as follows:
debian-wangyao:~$ ./client a
Response from server:
A
debian-wangyao:~$ ./client b
Response from server:
B
debian-wangyao:~$ ./client c
Response from server:
C
debian-wangyao:~$ netstat -antp | grep CLOSE_WAIT
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        1      0 127.0.0.1:8000          127.0.0.1:58309         CLOSE_WAIT  6979/server
tcp        1      0 127.0.0.1:8000          127.0.0.1:58308         CLOSE_WAIT  6979/server
tcp        1      0 127.0.0.1:8000          127.0.0.1:58307         CLOSE_WAIT  6979/server
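The same check can be done from a program instead of with netstat. The sketch below assumes the Linux /proc/net/tcp layout, where the fourth column ("st") holds the connection state in hexadecimal and 08 means CLOSE_WAIT. The function name count_close_wait and the macro are hypothetical, and IPv6 sockets (listed in /proc/net/tcp6) are not covered.

```c
#include <stdio.h>

#define TCP_STATE_CLOSE_WAIT 0x08   /* value of the "st" column for CLOSE_WAIT */

/* Hypothetical helper: count CLOSE_WAIT entries in a stream laid out like
 * /proc/net/tcp. Lines whose fourth field is not a hex number (such as the
 * header line) are skipped. */
int count_close_wait(FILE *fp)
{
    char line[512];
    unsigned int st;
    int count = 0;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Fields: "sl:" local_address rem_address st ... */
        if (sscanf(line, "%*s %*s %*s %x", &st) == 1 &&
            st == TCP_STATE_CLOSE_WAIT)
            count++;
    }
    return count;
}
```

To use it, open /proc/net/tcp with fopen and pass the stream to the function; a count that keeps growing while the server runs points at sockets that are never closed.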
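Solutions
The basic idea is to detect sockets that the peer has already closed, and then close them.
1. The code must check the socket: as soon as read returns 0, close the connection; if read returns a negative value, check errno and close the connection unless it is EAGAIN. (Note: Figure 7.6 in Section 7.5 of UNP shows that select can detect that the peer has sent a FIN; combined with this rule, CLOSE_WAIT connections can be handled.)
2. Give every socket a timestamp, last_update, and set it to the current time whenever data is successfully received or sent. Periodically check all timestamps; if the difference between a timestamp and the current time exceeds a threshold, close that socket.
3. Use a heartbeat thread that periodically sends a heartbeat packet in an agreed format on each socket. If an RST segment comes back from the peer, the peer has already closed the socket, so we close ours as well.
4. Set the SO_KEEPALIVE option and adjust the kernel parameters.
The prerequisite is to enable the keepalive mechanism on the socket:
// Enable keepalive on the socket connection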
int iKeepAlive = 1;
setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, (void *)&iKeepAlive, sizeof(iKeepAlive));
tcp_keepalive_intvl (integer; default: 75; since Linux 2.4)
The number of seconds between TCP keep-alive probes.
tcp_keepalive_probes (integer; default: 9; since Linux 2.2)
The maximum number of TCP keep-alive probes to send before giving up and killing the connection if no response is obtained from the other end.
tcp_keepalive_time (integer; default: 7200; since Linux 2.2)
The number of seconds a connection needs to be idle before TCP begins sending out keep-alive probes. Keep-alives are only sent when the SO_KEEPALIVE socket option is enabled. The default value is 7200 seconds (2 hours). An idle connection is terminated after approximately an additional 11 minutes (9 probes an interval of 75 seconds apart) when keep-alive is enabled.
echo 120 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 2 > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo 1 > /proc/sys/net/ipv4/tcp_keepalive_probes
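Besides changing the kernel parameters, you can set these values per socket with setsockopt; see man 7 tcp for the TCP_KEEP* options (SO_KEEPALIVE itself is documented in man 7 socket).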
int KeepAliveProbes = 1;
int KeepAliveIntvl = 2;
int KeepAliveTime = 120;
setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, (void *)&KeepAliveProbes, sizeof(KeepAliveProbes));
setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, (void *)&KeepAliveTime, sizeof(KeepAliveTime));
setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, (void *)&KeepAliveIntvl, sizeof(KeepAliveIntvl));
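Method 2 from the list above (the per-socket timestamp) can be sketched as follows. Only the field name last_update comes from the description; the struct conn layout, the function names, and the convention that fd < 0 marks a free slot are assumptions made for this illustration.

```c
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical per-connection record; fd < 0 marks an unused slot. */
struct conn {
    int    fd;
    time_t last_update;   /* time of the last successful read or write */
};

/* Call after every successful read() or write() on the connection. */
void conn_touch(struct conn *c, time_t now)
{
    c->last_update = now;
}

/* Close every connection that has been idle for more than max_idle seconds.
 * Returns the number of connections closed. */
size_t conn_sweep(struct conn *conns, size_t n, time_t now, time_t max_idle)
{
    size_t closed = 0;

    for (size_t i = 0; i < n; i++) {
        if (conns[i].fd < 0)
            continue;                       /* free slot */
        if (now - conns[i].last_update > max_idle) {
            close(conns[i].fd);             /* idle too long: release the socket */
            conns[i].fd = -1;
            closed++;
        }
    }
    return closed;
}
```

A server would call conn_sweep from a timer or once per event-loop iteration, passing time(NULL) as now. The threshold is a trade-off: too small and slow but healthy clients get dropped, too large and CLOSE_WAIT sockets linger.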