It’s also
important to know that this client and server engage in a
proprietary sort of discussion, and so use the port number 50007 outside
the range reserved for standard protocols (0 to 1023). There’s nothing
preventing a client from opening a socket on one of these special ports,
however. For instance, the following client-side code connects to
programs listening on the standard email, FTP, and HTTP web server ports
on three different server machines:
C:\...\PP4E\Internet\Sockets> python
>>> from socket import *
>>> sock = socket(AF_INET, SOCK_STREAM)
>>> sock.connect(('pop.secureserver.net', 110))    # talk to POP email server
>>> print(sock.recv(70))
b'+OK <[email protected]>\r\n'
>>> sock.close()

>>> sock = socket(AF_INET, SOCK_STREAM)
>>> sock.connect(('learning-python.com', 21))      # talk to FTP server
>>> print(sock.recv(70))
b'220---------- Welcome to Pure-FTPd [privsep] [TLS] ----------\r\n220-You'
>>> sock.close()

>>> sock = socket(AF_INET, SOCK_STREAM)
>>> sock.connect(('www.python.net', 80))           # talk to Python's HTTP server
>>> sock.send(b'GET /\r\n')                        # fetch root page reply
7
>>> sock.recv(70)
b'[...]
>>> sock.recv(70)
b'www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\r\n[...]
[...]
  File "", line 1, in bind
socket.error: (13, 'Permission denied')
Even if run by a user with the required permission, you’ll get
the different exception we saw earlier if the port is already being
used by a real web server. On computers being used as general servers,
these ports really are reserved. This is one reason we’ll run a web
server of our own locally for testing when we start writing
server-side scripts later in this book—the above code works on a
Windows PC, which allows us to experiment with websites locally, on a
self-contained machine:
C:\...\PP4E\Internet\Sockets> python
>>> from socket import *
>>> sock = socket(AF_INET, SOCK_STREAM)    # can bind port 80 on Windows
>>> sock.bind(('', 80))                    # allows running server on localhost
>>>
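The two failure modes just described (no permission to bind a reserved port, and a port already taken by another server) can be told apart in code by inspecting the exception's error number. The following is only a minimal sketch, not code from this book's examples: the try_bind helper is a hypothetical name, and the error numbers are assumed to be reported the way Unix-like systems report them (other platforms may differ).

```python
import errno
import socket

def try_bind(port, host=''):
    """Hypothetical helper: try to bind a TCP socket and report what happened."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError as exc:                     # socket.error is OSError in recent Pythons
        if exc.errno == errno.EACCES:
            return 'permission denied'         # reserved port, insufficient privilege
        if exc.errno == errno.EADDRINUSE:
            return 'already in use'            # another socket owns this port
        return 'other error: %s' % exc
    else:
        return 'ok'
    finally:
        sock.close()                           # always release the test socket

if __name__ == '__main__':
    for port in (80, 50007):
        print(port, '=>', try_bind(port))
```

What this prints depends entirely on the machine and account it runs under, which is the point: the same bind call may succeed, be refused, or collide with a real server.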
We’ll learn more about installing web servers later in Chapter 15. For
the purposes of this chapter, we need to get realistic about how our
socket servers handle their clients.
[45] You might be interested to know that the last part of this
example, talking to port 80, is exactly what your web browser does
as you surf the Web: followed links direct it to download web pages
over this port. In fact, this lowly port is the primary basis of the
Web. In Chapter 15, we will meet an entire application environment
based upon sending formatted data over port 80: CGI server-side
scripting. At the bottom, though, the Web is just bytes over sockets,
with a user interface. The wizard behind the curtain is not as
impressive as he may seem!
The echo client and server programs shown previously serve to illustrate
socket fundamentals. But the server model used suffers from a fairly major flaw.
As described earlier, if multiple clients try to connect to the server,
and it takes a long time to process a given client’s request, the server
will fail. More accurately, if the cost of handling a given request
prevents the server from returning to the code that checks for new clients
in a timely manner, it won’t be able to keep up with all the requests, and
some clients will eventually be denied connections.
In real-world client/server programs, it’s far more typical to code
a server so as to avoid blocking new requests while handling a current
client’s request. Perhaps the easiest way to do so is to service each
client’s request in parallel—in a new process, in a new thread, or by
manually switching (multiplexing) between clients in an event loop. This
isn’t a socket issue per se, and we already learned how to start processes
and threads in Chapter 5. But since these schemes are so typical of socket
server programming, let’s explore all three ways to handle client requests
in parallel here.
The script in Example 12-4 works like the original echo server, but
instead forks a new process to handle each new client connection.
Because the handleClient function runs in a new process, the dispatcher
function can immediately resume its main loop in order to detect and
service a new incoming request.
Example 12-4. PP4E\Internet\Sockets\fork-server.py
"""
Server side: open a socket on a port, listen for a message from a client,
and send an echo reply; forks a process to handle each client connection;
child processes share parent's socket descriptors; fork is less portable
than threads--not yet on Windows, unless Cygwin or similar installed;
"""

import os, time, sys
from socket import *                             # get socket constructor and constants
myHost = ''                                      # server machine, '' means local host
myPort = 50007                                   # listen on a non-reserved port number

sockobj = socket(AF_INET, SOCK_STREAM)           # make a TCP socket object
sockobj.bind((myHost, myPort))                   # bind it to server port number
sockobj.listen(5)                                # allow 5 pending connects

def now():                                       # current time on server
    return time.ctime(time.time())

activeChildren = []
def reapChildren():                              # reap any dead child processes
    while activeChildren:                        # else may fill up system table
        pid, stat = os.waitpid(0, os.WNOHANG)    # don't hang if no child exited
        if not pid: break
        activeChildren.remove(pid)

def handleClient(connection):                    # child process: reply, exit
    time.sleep(5)                                # simulate a blocking activity
    while True:                                  # read, write a client socket
        data = connection.recv(1024)             # till eof when socket closed
        if not data: break
        reply = 'Echo=>%s at %s' % (data, now())
        connection.send(reply.encode())
    connection.close()
    os._exit(0)

def dispatcher():                                # listen until process killed
    while True:                                  # wait for next connection,
        connection, address = sockobj.accept()   # pass to process for service
        print('Server connected by', address, end=' ')
        print('at', now())
        reapChildren()                           # clean up exited children now
        childPid = os.fork()                     # copy this process
        if childPid == 0:                        # if in child process: handle
            handleClient(connection)
        else:                                    # else: go accept next connect
            activeChildren.append(childPid)      # add to active child pid list

dispatcher()
Parts of this script are a bit tricky, and most of its library
calls work only on Unix-like platforms. Crucially, it runs on
Cygwin Python on Windows, but not standard Windows
Python. Before we get into too many forking details, though, let’s
focus on how this server arranges to handle multiple client
requests.
First, notice that to simulate a long-running operation (e.g.,
database updates, other network traffic), this server adds a
five-second time.sleep delay in its client handler function,
handleClient. After the delay, the original
echo reply action is performed. That means that when we run a server
and clients this time, clients won’t receive the echo reply until five
seconds after they’ve sent their requests to the server.
To help keep track of requests and replies, the server prints
its system time each time a client connect request is received, and
adds its system time to the reply. Clients print the reply time sent
back from the server, not their own—clocks on the server and client
may differ radically, so to compare apples to apples, all times are
server times. Because of the simulated delays, we also must usually
start each client in its own console window on Windows (clients will
hang in a blocked state while waiting for their reply).
But the grander story here is that this script runs one main
parent process on the server machine, which does nothing but watch for
connections (in dispatcher), plus one child process per active client
connection, running in parallel with both the main parent process and
the other client processes (in handleClient). In principle, the
server can handle any number of clients without bogging down.
To test, let’s first start the server remotely in an SSH or
Telnet window, and start three clients locally in three distinct
console windows. As we’ll see in a moment, this server can also be run
under Cygwin locally if you have Cygwin but don’t have a remote server
account like the one on learning-python.com used here:
[server window (SSH or Telnet)]
[...]$ uname -p -o
i686 GNU/Linux
[...]$ python fork-server.py
Server connected by ('72.236.109.185', 58395) at Sat Apr 24 06:46:45 2010
Server connected by ('72.236.109.185', 58396) at Sat Apr 24 06:46:49 2010
Server connected by ('72.236.109.185', 58397) at Sat Apr 24 06:46:51 2010

[client window 1]
C:\...\PP4E\Internet\Sockets> python echo-client.py learning-python.com
Client received: b"Echo=>b'Hello network world' at Sat Apr 24 06:46:50 2010"

[client window 2]
C:\...\PP4E\Internet\Sockets> python echo-client.py learning-python.com Bruce
Client received: b"Echo=>b'Bruce' at Sat Apr 24 06:46:54 2010"

[client window 3]
C:\...\Sockets> python echo-client.py learning-python.com The Meaning of Life
Client received: b"Echo=>b'The' at Sat Apr 24 06:46:56 2010"
Client received: b"Echo=>b'Meaning' at Sat Apr 24 06:46:56 2010"
Client received: b"Echo=>b'of' at Sat Apr 24 06:46:56 2010"
Client received: b"Echo=>b'Life' at Sat Apr 24 06:46:57 2010"
Again, all times here are on the server machine. This may be a
little confusing because four windows are involved. In plain English,
the test proceeds as follows:
1. The server starts running remotely.
2. All three clients are started and connect to the server a
   few seconds apart.
3. On the server, the client requests trigger three forked
   child processes, which all immediately go to sleep for five
   seconds (to simulate being busy doing something useful).
4. Each client waits until the server replies, which happens
   five seconds after their initial requests.
In other words, clients are serviced at the same time by forked
processes, while the main parent process continues listening for new
client requests. If clients were not handled in parallel like this, no
client could connect until the currently connected client’s
five-second delay expired.
In a more realistic application, that delay could be fatal if
many clients were trying to connect at once: the server would be stuck
in the action we’re simulating with time.sleep, and not get back to the
main loop to accept new client requests. With process forks per request,
clients can be serviced in parallel.
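To make the timing argument concrete, here is a small sketch that leaves sockets out entirely and only measures the effect of forking. It is an illustration under assumptions rather than part of the server: serve_in_parallel is a hypothetical name, the "clients" are just sleeping child processes, and like the server itself it runs only where os.fork is available.

```python
import os
import time

def serve_in_parallel(num_clients, delay):
    """Fork one child per simulated client; return total elapsed seconds."""
    start = time.monotonic()
    children = []
    for _ in range(num_clients):
        pid = os.fork()
        if pid == 0:                 # child: simulate a busy handler, then exit
            time.sleep(delay)
            os._exit(0)
        children.append(pid)         # parent: immediately ready for the next client
    for pid in children:             # wait for every child so no zombies are left
        os.waitpid(pid, 0)
    return time.monotonic() - start

if __name__ == '__main__':
    elapsed = serve_in_parallel(3, 0.5)
    print('3 clients, 0.5s of work each: %.2fs total' % elapsed)
```

Because the children sleep at the same time, the total should stay close to one delay rather than growing to the sum of all three, which is what a server that handled its clients one after another would cost.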
Notice that we’re using the same client script here (echo-client.py,
from Example 12-2), just a different server; clients simply send and
receive data to a machine and port and don’t care how their requests
are handled on the server. The result displayed shows a byte string
within a byte string, because the client sends one to the server and
the server sends one back; because the server uses string formatting
and manual encoding instead of byte string concatenation, the client’s
message is shown as a byte string explicitly here.
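The nested byte string is easy to reproduce without any network at all. This short sketch contrasts the server's formatting approach with plain bytes concatenation; the variable names are made up for the illustration, and the timestamp is left out to keep the output fixed.

```python
data = b'Hello network world'          # what the client sent

# str formatting inserts the bytes object's printed form, b'...' prefix included
reply_formatted = ('Echo=>%s' % data).encode()

# bytes concatenation keeps the raw payload instead
reply_concat = b'Echo=>' + data

print(reply_formatted)   # b"Echo=>b'Hello network world'"
print(reply_concat)      # b'Echo=>Hello network world'
```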
Also note that the server is running remotely on a Linux machine
in the preceding section. As we learned in Chapter 5, the fork
call is not supported on Windows in standard Python at the time this
book was written. It does run on Cygwin Python, though, which allows us
to start this server locally on localhost, on the same machine as
its clients:
[Cygwin shell window]
[C:\...\PP4E\Internet\Sockets]$ python fork-server.py
Server connected by ('127.0.0.1', 58258) at Sat Apr 24 07:50:15 2010
Server connected by ('127.0.0.1', 58259) at Sat Apr 24 07:50:17 2010

[Windows console, same machine]
C:\...\PP4E\Internet\Sockets> python echo-client.py localhost bright side of life
Client received: b"Echo=>b'bright' at Sat Apr 24 07:50:20 2010"
Client received: b"Echo=>b'side' at Sat Apr 24 07:50:20 2010"
Client received: b"Echo=>b'of' at Sat Apr 24 07:50:20 2010"
Client received: b"Echo=>b'life' at Sat Apr 24 07:50:20 2010"

[Windows console, same machine]
C:\...\PP4E\Internet\Sockets> python echo-client.py
Client received: b"Echo=>b'Hello network world' at Sat Apr 24 07:50:22 2010"
We can also run this test on the remote Linux server entirely,
with two SSH or Telnet windows. It works about the same as when
clients are started locally, in a DOS console window, but here “local”
actually means a remote machine you’re using locally. Just for fun,
let’s also contact the remote server from a locally running client to
show how the server is also available to the Internet at large—when
servers are coded with sockets and forks this way, clients can connect
from arbitrary machines, and can overlap arbitrarily in time:
[one SSH (or Telnet) window]
[...]$ python fork-server.py
Server connected by ('127.0.0.1', 55743) at Sat Apr 24 07:15:14 2010
Server connected by ('127.0.0.1', 55854) at Sat Apr 24 07:15:26 2010
Server connected by ('127.0.0.1', 55950) at Sat Apr 24 07:15:36 2010
Server connected by ('72.236.109.185', 58414) at Sat Apr 24 07:19:50 2010

[another SSH window, same machine]
[...]$ python echo-client.py
Client received: b"Echo=>b'Hello network world' at Sat Apr 24 07:15:19 2010"
[...]$ python echo-client.py localhost niNiNI!
Client received: b"Echo=>b'niNiNI!' at Sat Apr 24 07:15:31 2010"
[...]$ python echo-client.py localhost Say no more!
Client received: b"Echo=>b'Say' at Sat Apr 24 07:15:41 2010"
Client received: b"Echo=>b'no' at Sat Apr 24 07:15:41 2010"
Client received: b"Echo=>b'more!' at Sat Apr 24 07:15:41 2010"

[Windows console, local machine]
C:\...\Internet\Sockets> python echo-client.py learning-python.com Blue, no yellow!
Client received: b"Echo=>b'Blue,' at Sat Apr 24 07:19:55 2010"
Client received: b"Echo=>b'no' at Sat Apr 24 07:19:55 2010"
Client received: b"Echo=>b'yellow!' at Sat Apr 24 07:19:55 2010"
Now that we have a handle on the basic model, let’s move on to
the tricky bits. This server script is fairly straightforward as
forking code goes, but a few words about the library tools it employs
are in order.
We met os.fork in Chapter 5, but recall that forked processes are
essentially a copy of the process that forks them, and so they inherit
file and socket descriptors from their parent process. As a result, the
new child process that runs the handleClient function has access to the
connection socket created in the parent process. Really, this is why
the child process works at all: when conversing on the connected socket,
it’s using the same socket that the parent’s accept call returns.
Programs know they are in a forked child process if the fork call
returns 0; otherwise, the original parent process gets back the new
child’s ID.
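Both points, descriptor inheritance and the fork return value, can be seen in a few lines. In this sketch a socket.socketpair stands in for the connection that accept would return in the real server; that substitution is an assumption made to keep the example self-contained, and the code runs only on platforms that support os.fork.

```python
import os
import socket

# A connected pair of sockets, created before the fork so the child inherits both
parent_end, child_end = socket.socketpair()

pid = os.fork()
if pid == 0:                                   # fork returned 0: we are the child
    parent_end.close()
    child_end.sendall(b'hello from child')     # talk over the inherited socket
    child_end.close()
    os._exit(0)                                # leave without running parent code
else:                                          # fork returned the child's process ID
    child_end.close()
    message = parent_end.recv(1024)            # a short message arrives in one piece here
    parent_end.close()
    os.waitpid(pid, 0)                         # collect the child's exit status
    print('parent got', message, 'from child', pid)
```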
In earlier fork examples, child processes usually call one of the exec
variants to start a new program in the child process. Here, instead,
the child process simply calls a function in the same program and exits
with os._exit. It’s imperative to call os._exit here: if we did not,
each child would live on after handleClient returns, and compete for
accepting new client requests.
In fact, without the exit call, we’d wind up with as many
perpetual server processes as requests served; remove the exit call and
do a ps shell command after running a few clients, and you’ll see what
I mean. With the call, only the single parent process listens for new
requests. os._exit is like sys.exit, but it exits the calling process
immediately without cleanup actions. It’s normally used only in child
processes, and sys.exit is used everywhere else.
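The difference between the two exit calls can be observed directly. The sketch below is an illustration only: cleanup_seen is a hypothetical helper that forks a child, has it exit through the function passed in, and reports whether the child's finally block (standing in for "cleanup actions") got a chance to run. It assumes a platform with os.fork.

```python
import os
import sys

def cleanup_seen(exit_func):
    """Fork a child that exits via exit_func; return what its finally block wrote."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                               # child process
        os.close(read_fd)
        try:
            exit_func(0)                       # sys.exit raises SystemExit; os._exit never returns
        finally:
            os.write(write_fd, b'cleanup ran') # reached only if the exit unwound the stack
            os._exit(0)                        # make sure the child never returns to the caller
    os.close(write_fd)                         # parent: keep only the read end
    report = os.read(read_fd, 100)             # b'' means the child wrote nothing
    os.close(read_fd)
    os.waitpid(pid, 0)                         # reap the child
    return report

if __name__ == '__main__':
    print('sys.exit:', cleanup_seen(sys.exit))   # b'cleanup ran'
    print('os._exit:', cleanup_seen(os._exit))   # b''
```

sys.exit works by raising an exception, so enclosing handlers still run; os._exit ends the process on the spot, which is why it suits a forked child that must not fall back into code meant for the parent.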
Note, however, that it’s not quite enough to make sure that child
processes exit and die. On systems like Linux, though not on Cygwin,
parents must also be sure to issue a wait system call to remove the
entries for dead child processes from the system’s process table. If we
don’t do this, the child processes will no longer run, but they will
consume an entry in the system process table. For long-running servers,
these bogus entries may become problematic.
It’s common to call such dead-but-listed child processes zombies: they
continue to use system resources even though they’ve already passed
over to the great operating system beyond. To clean up after child
processes are gone, this server keeps a list, activeChildren, of the
process IDs of all child processes it spawns. Whenever a new incoming
client request is received, the server runs its reapChildren function,
which waits for any dead children by issuing the standard Python
os.waitpid(0, os.WNOHANG) call.
The os.waitpid call attempts to wait for a child process to exit and
returns its process ID and exit status. With a 0 for its first
argument, it waits for any child process (strictly, any child in the
caller’s process group). With the WNOHANG parameter for its second, it
does nothing if no child process has exited (i.e., it does not block or
pause the caller). The net effect is that this call simply asks the
operating system for the process ID of any child that has exited. If
any have, the process ID returned is removed both from the system
process table and from this script’s activeChildren list.
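The non-blocking behavior is easiest to appreciate by polling a child that has not finished yet. The following sketch is an assumption-laden illustration, not the server's code: it passes the specific child's process ID rather than 0 so that the result is predictable, and it needs a platform with os.fork.

```python
import os
import time

pid = os.fork()
if pid == 0:                      # child: stay alive briefly, then exit
    time.sleep(0.5)
    os._exit(0)

# The child is still sleeping, so a non-blocking wait returns (0, 0) right away
first_poll = os.waitpid(pid, os.WNOHANG)

# Keep polling; once the child has exited, waitpid returns its pid and status
while True:
    reaped_pid, status = os.waitpid(pid, os.WNOHANG)
    if reaped_pid:
        break
    time.sleep(0.05)

print('first poll:', first_poll)
print('reaped child', reaped_pid, 'with exit code', os.WEXITSTATUS(status))
```

The server does not poll in a loop like this; it simply makes the same kind of call once per pass through reapChildren, each time a new client arrives.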
To see why all this complexity is needed, comment out the reapChildren
call in this script, run it on a platform where this is an issue, and
then run a few clients. On my Linux server, a ps -f full process
listing command shows that all the dead child processes stay in the
system process table (shown as <defunct>):
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094   9990 30778  0 04:34 pts/0    00:00:00 python fork-server.py
5693094  10844  9990  0 04:35 pts/0    00:00:00 [python] <defunct>
5693094  10869  9990  0 04:35 pts/0    00:00:00 [python] <defunct>
5693094  11130  9990  0 04:36 pts/0    00:00:00 [python] <defunct>
5693094  11151  9990  0 04:36 pts/0    00:00:00 [python] <defunct>
5693094  11482 30778  0 04:36 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
When the reapChildren call is reactivated, dead child zombie entries
are cleaned up each time the server gets a new client connection
request, by calling the Python os.waitpid function. A few zombies
may accumulate if the server is heavily loaded, but they will remain
only until the next client connection is received (you get only as
many zombies as processes served in parallel since the last accept):
[...]$ python fork-server.py &
[1] 20515
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094  20515 30778  0 04:43 pts/0    00:00:00 python fork-server.py
5693094  20777 30778  0 04:43 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
[...]$
Server connected by ('72.236.109.185', 58672) at Sun Apr 25 04:43:51 2010
Server connected by ('72.236.109.185', 58673) at Sun Apr 25 04:43:54 2010
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094  20515 30778  0 04:43 pts/0    00:00:00 python fork-server.py
5693094  21339 20515  0 04:43 pts/0    00:00:00 [python] <defunct>
5693094  21398 20515  0 04:43 pts/0    00:00:00 [python] <defunct>
5693094  21573 30778  0 04:44 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
[...]$
Server connected by ('72.236.109.185', 58674) at Sun Apr 25 04:44:07 2010
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094  20515 30778  0 04:43 pts/0    00:00:00 python fork-server.py
5693094  21646 20515  0 04:44 pts/0    00:00:00 [python] <defunct>
5693094  21813 30778  0 04:44 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
In fact, if you type fast enough, you can actually see a child
process morph from a real running program into a zombie. Here, for
example, a child spawned to handle a new request changes to
<defunct> on exit. Its connection cleans up lingering zombies, and its
own process entry will be removed completely when the next request is
received:
[...]$
Server connected by ('72.236.109.185', 58676) at Sun Apr 25 04:48:22 2010
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094  20515 30778  0 04:43 pts/0    00:00:00 python fork-server.py
5693094  27120 20515  0 04:48 pts/0    00:00:00 python fork-server.py
5693094  27174 30778  0 04:48 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094  20515 30778  0 04:43 pts/0    00:00:00 python fork-server.py
5693094  27120 20515  0 04:48 pts/0    00:00:00 [python] <defunct>
5693094  27234 30778  0 04:48 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash