digitalmars.D.learn - std.socket - problems closing socket

simendsjo <simendsjo gmail.com> writes:
Not sure if this is a problem with std.socket, nginx or my knowledge of 
sockets. I'm pretty sure it's the last one.

I'm experimenting with fastcgi on nginx, and the socket stays in 
TIME_WAIT even after I call
   socket.shutdown(SocketShutdown.BOTH);
   socket.close();

(Crossposted from SO: 
http://stackoverflow.com/questions/7616601/nginx-fastcgi-and-open-sockets)
Sep 30 2011
"Regan Heath" <regan netmail.co.nz> writes:
On Sat, 01 Oct 2011 00:26:35 +0100, simendsjo <simendsjo gmail.com> wrote:

 Not sure if this is a problem with std.socket, nginx or my knowledge of  
 sockets. I'm pretty sure it's the last one.

 I'm experimenting with fastcgi on nginx, and the socket stays in  
 TIME_WAIT even after I call
    socket.shutdown(SocketShutdown.BOTH);
    socket.close();

 (Crossposted from SO:  
 http://stackoverflow.com/questions/7616601/nginx-fastcgi-and-open-sockets)
For a "graceful" close you're supposed to ensure there is no data pending. To do that you:

   shutdown(SD_SEND);  // send only, not recv
   <enter a loop reading all data remaining on the socket>
   close();

The loop should read until recv returns 0. If recv returns -1 and the socket is blocking it should error/exit. If recv returns -1 and the socket is non-blocking it should check for [WSA]EWOULDBLOCK (and select/sleep + loop) or error/exit.

The reason to do this is to flush all the data from the socket buffers on the remote and local ends, otherwise a close can cause remote buffered data to cause a "connection broken" error on the remote end, and/or (I am guessing a little here) may cause the socket to close while negotiating a graceful close, and/or remain in a TIME_WAIT state due to buffered data or data "in flight".

... are you setting any close options/timeouts i.e. LINGER?
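The shutdown/drain/close sequence described above might look roughly like this in D with std.socket. This is only a sketch: it assumes a blocking socket, and `gracefulClose` is a made-up helper name, not something Phobos provides.

```d
import std.socket;

/// Half-closes the sending side, drains whatever the peer still sends,
/// then closes the socket. Returns true if the peer ended with an orderly
/// close (receive returned 0), false if receive reported an error.
/// Hypothetical helper; assumes `sock` is in blocking mode.
bool gracefulClose(Socket sock)
{
    // Signal that we are done sending; receiving stays open.
    sock.shutdown(SocketShutdown.SEND);

    ubyte[4096] buf;
    bool orderly = false;
    for (;;)
    {
        auto n = sock.receive(buf[]);
        if (n == 0)
        {
            orderly = true;   // peer closed its side, nothing left to read
            break;
        }
        if (n == Socket.ERROR)
            break;            // blocking socket: treat as a hard error
        // n > 0: leftover data, discarded in this sketch
    }
    sock.close();
    return orderly;
}
```

A caller would send its last bytes first and then call `gracefulClose(socket)` instead of `shutdown(SocketShutdown.BOTH)` followed by `close()`.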
Oct 03 2011
simendsjo <simendsjo gmail.com> writes:
On 03.10.2011 11:36, Regan Heath wrote:
 For a "graceful" close you're supposed to ensure there is no data
 pending.  To do that you:

 shutdown(SD_SEND);  // send only, not recv
 <enter a loop reading all data remaining on the socket>
 close();

 The loop should read until recv returns 0.  If recv returns -1 and the
 socket is blocking it should error/exit.  If recv returns -1 and the
 socket is non-blocking it should check for [WSA]EWOULDBLOCK (and
 select/sleep + loop) or error/exit.

 The reason to do this is to flush all the data from the socket buffers
 on the remote and local ends, otherwise a close can cause remote
 buffered data to cause a "connection broken" error on the remote end,
 and/or (I am guessing a little here) may cause the socket to close while
 negotiating a graceful close, and/or remain in a TIME_WAIT state due to
 buffered data or data "in flight".

 ... are you setting any close options/timeouts i.e. LINGER?
Thanks. recv returns -1 for many requests. The errors are only WSAECONNABORTED and WSAECONNRESET as described here:
http://msdn.microsoft.com/en-us/library/ms740668.aspx

I'm doing socket.shutdown(SocketShutdown.SEND) now after sending all my data and reading until I receive 0 or -1. (doesn't really matter as sending the FastCGI EndRequest makes the server shut it down as it doesn't handle multiplexing)

I have tried with linger too, but it doesn't help:
   socket.setOption(SocketOptionLevel.SOCKET, SocketOption.LINGER, std.socket.linger(1, 30));

Could this be caused by some bad settings on the webserver?

PS: Seems my computer can handle about 16000 TIME_WAIT before it starts "hanging".
Oct 03 2011
"Regan Heath" <regan netmail.co.nz> writes:
On Mon, 03 Oct 2011 12:57:56 +0100, simendsjo <simendsjo gmail.com> wrote:

 On 03.10.2011 11:36, Regan Heath wrote:
 For a "graceful" close you're supposed to ensure there is no data
 pending.  To do that you:

 shutdown(SD_SEND);  // send only, not recv
 <enter a loop reading all data remaining on the socket>
 close();

 The loop should read until recv returns 0.  If recv returns -1 and the
 socket is blocking it should error/exit.  If recv returns -1 and the
 socket is non-blocking it should check for [WSA]EWOULDBLOCK (and
 select/sleep + loop) or error/exit.

 The reason to do this is to flush all the data from the socket buffers
 on the remote and local ends, otherwise a close can cause remote
 buffered data to cause a "connection broken" error on the remote end,
 and/or (I am guessing a little here) may cause the socket to close while
 negotiating a graceful close, and/or remain in a TIME_WAIT state due to
 buffered data or data "in flight".

 ... are you setting any close options/timeouts i.e. LINGER?
 Thanks.
:)
 recv returns -1 for many requests. The errors are only WSAECONNABORTED  
 and WSAECONNRESET as described here:  
 http://msdn.microsoft.com/en-us/library/ms740668.aspx
To help me understand (I know nothing about fastcgi or nginx) can you clarify...
1. Your D code is the client side, connecting to the web server and sending GET/POST style requests?
2. You get these ABORTED and RESET errors on the client side?

If yes to all the above, then it sounds like the web server/fastcgi is closing the socket without reading all the data you're sending, which probably means you're sending something it's not expecting. I would start by verifying exactly what data you're sending, and that it's all expected by the remote end.
 I'm doing socket.shutdown(SocketShutdown.SEND) now after sending all my  
 data and reading until I receive 0 or -1. (doesn't really matter as  
 sending the FastCGI EndRequest makes the server shut it down as it  
 doesn't handle multiplexing)
So, the socket closure is initiated by fastcgi/the web server. This supports the theory that it's not reading some of your data, because it's not expecting it, and this is likely the cause of the ABORT/RESET errors you're seeing.
 I have tried with linger too, but it doesn't help:
 socket.setOption(SocketOptionLevel.SOCKET, SocketOption.LINGER,  
 std.socket.linger(1, 30));
The default LINGER options should be fine, as-is. But, double check the D socket code just in case it is setting different LINGER options by default (I haven't used it, or looked myself, sorry).
 Could this be caused by some bad settings on the webserver?
It is possible, but I would double check your requests first. There may be a setting, or settings for aborting connections which take too long, or fail to send certain data, or connect from the wrong IP, or... If your requests are otherwise working, then I suspect you're sending some 'extra' data which is not being read.
 PS: Seems my computer can handle about 16000 TIME_WAIT before it starts  
 "hanging".
You'll be running out of operating system handles or similar at that point :p
Oct 03 2011
simendsjo <simendsjo gmail.com> writes:
On 03.10.2011 16:16, Regan Heath wrote:
 On Mon, 03 Oct 2011 12:57:56 +0100, simendsjo <simendsjo gmail.com> wrote:
(...)
 To help me understand (I know nothing about fastcgi or nginx) can you
 clarify...
 1. Your D code is the client side, connecting to the web server and
 sending GET/POST style requests?
 2. You get these ABORTED and RESET errors on the client side?


 If yes to all the above, then it sounds like the web server/fastcgi is
 closing the socket without reading all the data you're sending, which
 probably means you're sending something it's not expecting. I would
 start by verifying exactly what data you're sending, and that it's all
 expected by the remote end.
Yes. I've coded the client as follows:
1) start listening socket
2) wait for incoming connections or incoming data
3) receive(). If a socket returns 0 or -1, close it and process next with data
4) read fastcgi request from server
5) write fastcgi response
6) write fastcgi EndRequest (the server should now end the request)
7) if the application should close the request, send shutdown(send)
8) accept incoming connection
9) back to 2)

FastCGI connections work in one of two ways: the server is responsible for closing the connections (supports multiplexing) or the application should close the connection after a request has been sent.
For the latter I send SocketShutdown.SEND after writing EndRequest in step 6), but it doesn't really matter as nginx doesn't support multiplexing. It closes the connection after each request anyway. I see the same result no matter what option I use.

I'm running the exact same request and writing the exact same response for all queries, so there shouldn't be any unknown fields.
I also only get an error on <1/5 of the requests, and even when the error occurs, the response has been written completely to the browser.
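For readers following along, one pass of the numbered loop above could be sketched like this with std.socket's `SocketSet` and `Socket.select`. This is not the actual code from this post: `pollOnce` and the `handle` callback are invented names, and the FastCGI parsing and response writing (steps 4 to 6) are left to that callback.

```d
import std.socket;

/// One pass of a select-based loop (steps 2, 3, 7 and 8 above).
/// `handle` is a hypothetical callback that reads the FastCGI request from
/// the given bytes and writes the response plus EndRequest; it returns true
/// when the application is the side responsible for closing the connection.
/// Returns the connections that are still open after this pass.
Socket[] pollOnce(Socket listener, Socket[] clients,
                  bool delegate(Socket, const(ubyte)[]) handle)
{
    auto readSet = new SocketSet();
    readSet.add(listener);
    foreach (c; clients)
        readSet.add(c);

    // Step 2: block until the listener or some client is readable.
    Socket.select(readSet, null, null);

    Socket[] keep;
    ubyte[8192] buf;
    foreach (c; clients)
    {
        if (!readSet.isSet(c))
        {
            keep ~= c;
            continue;
        }
        // Step 3: 0 means the peer closed, Socket.ERROR means failure.
        auto n = c.receive(buf[]);
        if (n == 0 || n == Socket.ERROR)
        {
            c.close();
            continue;
        }
        // Steps 4-6 happen inside the callback.
        if (handle(c, buf[0 .. n]))
            c.shutdown(SocketShutdown.SEND);   // step 7
        keep ~= c;
    }

    // Step 8: pick up a pending connection, if there is one.
    if (readSet.isSet(listener))
        keep ~= listener.accept();
    return keep;
}
```

A connection that was half-closed in step 7 stays in the returned list until a later pass sees receive return 0, at which point it is closed.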
 I'm doing socket.shutdown(SocketShutdown.SEND) now after sending all
 my data and reading until I receive 0 or -1. (doesn't really matter as
 sending the FastCGI EndRequest makes the server shut it down as it
 doesn't handle multiplexing)
 So, the socket closure is initiated by fastcgi/the web server. This supports the theory that it's not reading some of your data, because it's not expecting it, and this is likely the cause of the ABORT/RESET errors you're seeing.
 I have tried with linger too, but it doesn't help:
 socket.setOption(SocketOptionLevel.SOCKET, SocketOption.LINGER,
 std.socket.linger(1, 30));
 The default LINGER options should be fine, as-is. But, double check the D socket code just in case it is setting different LINGER options by default (I haven't used it, or looked myself, sorry).
Linger is off by default, and turning it on doesn't help. RCVTIMEO/SNDTIMEO are also set to 0.
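One way to check and change the linger setting from D, as a rough sketch. It assumes a recent Phobos, where the option struct is spelled `Linger` (the snippet quoted above uses the older `std.socket.linger`); `lingerEnabled` and `enableLinger` are invented helper names.

```d
import std.socket;

/// Returns true if SO_LINGER is currently switched on for `sock`.
/// Hypothetical helper built on Socket.getOption.
bool lingerEnabled(Socket sock)
{
    Linger current;
    sock.getOption(SocketOptionLevel.SOCKET, SocketOption.LINGER, current);
    return current.on != 0;
}

/// Switches SO_LINGER on with a 30 second timeout, mirroring the
/// linger(1, 30) values used earlier in the thread.
void enableLinger(Socket sock)
{
    Linger l;
    l.on = 1;
    l.time = 30;
    sock.setOption(SocketOptionLevel.SOCKET, SocketOption.LINGER, l);
}
```

Calling `lingerEnabled` on a freshly created `TcpSocket` is a quick way to confirm what the default is on a given platform.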
 Could this be caused by some bad settings on the webserver?
 It is possible, but I would double check your requests first. There may be a setting, or settings for aborting connections which take too long, or fail to send certain data, or connect from the wrong IP, or... If your requests are otherwise working, then I suspect you're sending some 'extra' data which is not being read.
The requests are handled in ~1msec, so there shouldn't be any timeouts. The default timeout on nginx for fastcgi is 60 seconds too. I can easily process ~200 requests per second (and nginx and my server don't break a sweat, it's my curl spammers that are using all the cpu)
 PS: Seems my computer can handle about 16000 TIME_WAIT before it
 starts "hanging".
 You'll be running out of operating system handles or similar at that point :p
Yup. I'll probably never have that problem in a production environment though :)
Oct 03 2011
"Regan Heath" <regan netmail.co.nz> writes:
On Mon, 03 Oct 2011 17:33:57 +0100, simendsjo <simendsjo gmail.com> wrote:
 Yes. I've coded the client as follows:
 1) start listening socket
 2) wait for incoming connections or incoming data
 3) receive(). If a socket returns 0 or -1, close it and process next  
 with data
 4) read fastcgi request from server
 5) write fastcgi response
 6) write fastcgi EndRequest (the server should now end the request)
 7) if the application should close the request, send shutdown(send)
 8) accept incoming connection
 9) back to 2)

 FastCGI connections work in one of two ways: the server is responsible  
 for closing the connections (supports multiplexing) or the application  
 should close the connection after a request has been sent.
 For the latter I send SocketShutdown.SEND after writing EndRequest in  
 step 6), but it doesn't really matter as nginx doesn't support  
 multiplexing. It closes the connection after each request anyway. I see  
 the same result no matter what option I use.
Ok, so your "client" (that you have coded) is also the "application" you refer to in the bit about FastCGI above? Or are there 2 components here, and are both written in D?

Does the fastcgi "EndRequest" close the socket/connection? If so, doing a socket.Shutdown /after/ this is not going to work as the socket has already been closed (which implicitly does a shutdown(BOTH)). In that case, try doing the shutdown /before/ the EndRequest, and make sure you also read any/all data remaining on the socket before doing the EndRequest/close.

The key question seems to be, at which point does nginx close the connection? and therefore, is there any unread data on the socket (at either end) when it does. If, for example, it flushes the response to the other end, but does not wait for it to be read, and closes the socket, you will get CONNRESET/ABORTED errors on the other end.
 I'm running the exact same request and writing the exact same response  
 for all queries, so there shouldn't be any unknown fields.
I didn't mean unknown "field", I meant extra data of any kind, but I suspect you're using an API to form the requests etc so this is probably not the case.
 I also only get an error on <1/5 of the requests, and even when the  
 error occurs, the response has been written completely to the browser.
Ahh, ok, I believe the problem is simply the timing of the 'close/EndRequest'. Sometimes it happens /before/ the data has been completely read (1/5), other times after (4/5).

R
Oct 03 2011
simendsjo <simendsjo gmail.com> writes:
On 03.10.2011 20:02, Regan Heath wrote:
 On Mon, 03 Oct 2011 17:33:57 +0100, simendsjo <simendsjo gmail.com> wrote:
 Yes. I've coded the client as follows:
 1) start listening socket
 2) wait for incoming connections or incoming data
 3) receive(). If a socket returns 0 or -1, close it and process next
 with data
 4) read fastcgi request from server
 5) write fastcgi response
 6) write fastcgi EndRequest (the server should now end the request)
 7) if the application should close the request, send shutdown(send)
 8) accept incoming connection
 9) back to 2)

 FastCGI connections work in one of two ways: the server is
 responsible for closing the connections (supports multiplexing) or the
 application should close the connection after a request has been sent.
 For the latter I send SocketShutdown.SEND after writing EndRequest in
 step 6), but it doesn't really matter as nginx doesn't support
 multiplexing. It closes the connection after each request anyway. I
 see the same result no matter what option I use.
 Ok, so your "client" (that you have coded) is also the "application" you refer to in the bit about FastCGI above? Or are there 2 components here, and are both written in D?
It's just one component to handle FastCGI requests. I didn't want to rely on the external libfcgi.
 Does the fastcgi "EndRequest" close the socket/connection? If so, doing
 a socket.Shutdown /after/ this is not going to work as the socket has
 already been closed (which implicitly does a shutdown(BOTH)). In that
 case, try doing the shutdown /before/ the EndRequest, and make sure you
 also read any/all data remaining on the socket before doing the
 EndRequest/close.
EndRequest doesn't really close the socket, it's just a message telling the server that the full response has been written (request handled). If the server (nginx) is responsible for closing, it can reuse the connection for other requests. If the server says that the application is responsible, shutdown(send) is called. This is part of the specification.
 The key question seems to be, at which point does nginx close the
 connection? and therefore, is there any unread data on the socket (at
 either end) when it does. If, for example, it flushes the response to
 the other end, but does not wait for it to be read, and closes the
 socket, you will get CONNRESET/ABORTED errors on the other end.

 I'm running the exact same request and writing the exact same response
 for all queries, so there shouldn't be any unknown fields.
 I didn't mean unknown "field" I mean extra data of any kind, but I suspect you're using an API to form the requests etc so this is probably not the case.
 I also only get an error on <1/5 of the requests, and even when the
 error occurs, the response has been written completely to the browser.
 Ahh, ok, I believe the problem is simply the timing of the 'close/EndRequest'. Sometimes it happens /before/ the data has been completely read (1/5), other times after (4/5). R
It seems nginx is to blame here, and not me. I tried lighttpd and it works. It gives several EWOULDBLOCK, but I can just handle these again with no problem. I should have tried this sooner... I've spent a lot of time trying to track down these problems :|

Thanks for all your help - I'll update this thread if I find a solution to the nginx issue.
Oct 03 2011
simendsjo <simendsjo gmail.com> writes:
On 03.10.2011 20:41, simendsjo wrote:
 It seems nginx is to blame here, and not me. I tried lighttpd and it
 works. It gives several EWOULDBLOCK, but I can just handle these again
 with no problem. I should have tried this sooner... I've used a lot of
 time trying to track down these problems :|

 Thanks for all your help - I'll update this thread if I find a solution
 to the nginx issue.
Well, that was quick... Seems I was running a development version of nginx. I downloaded the stable version, and things work as expected - I can finally try to get some actual coding done :)
Oct 03 2011
"Regan Heath" <regan netmail.co.nz> writes:
On Mon, 03 Oct 2011 19:41:14 +0100, simendsjo <simendsjo gmail.com> wrote:
 It seems nginx is to blame here, and not me. I tried lighttpd and it  
 works. It gives several EWOULDBLOCK, but I can just handle these again  
 with no problem. I should have tried this sooner... I've used a lot of  
 time trying to track down these problems :|
EWOULDBLOCK is to be expected, it simply means you've tried to read when there is no data available, before the close/shutdown(SEND) from the other end. :)
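To illustrate, here is a small sketch of telling "no data yet" apart from a real failure on a non-blocking socket, using std.socket's `wouldHaveBlocked()`. The `ReadResult` enum and `tryRead` function are names invented for this example.

```d
import std.socket;

/// Possible outcomes of a single non-blocking read (hypothetical type).
enum ReadResult { data, wouldBlock, closed, failed }

/// Classifies one receive() call on a non-blocking socket.
/// On ReadResult.data, `got` holds the number of bytes placed in `buf`.
ReadResult tryRead(Socket sock, ubyte[] buf, out size_t got)
{
    auto n = sock.receive(buf);
    if (n > 0)
    {
        got = n;
        return ReadResult.data;
    }
    if (n == 0)
        return ReadResult.closed;   // peer shut down its sending side
    // n == Socket.ERROR: inspect the error right away, before any other
    // socket call can overwrite it.
    return wouldHaveBlocked() ? ReadResult.wouldBlock : ReadResult.failed;
}
```

On `wouldBlock` the caller would go back to select (or sleep) and try again later; only `failed` needs to be treated as an error.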
Oct 04 2011
"Regan Heath" <regan netmail.co.nz> writes:
This might be a useful read..
http://msdn.microsoft.com/en-us/library/windows/desktop/ms738547(v=vs.85).aspx
Oct 03 2011