[17] We will clarify the notions of “client” and “server” in the
Internet programming part of this book. There, we’ll communicate
with sockets (which we’ll see later in this chapter are roughly
like bidirectional pipes for programs running both across networks
and on the same machine), but the overall conversation model is
similar. Named pipes (fifos), described ahead, are also a better
match to the client/server model because they can be accessed by
arbitrary, unrelated processes (no forks are required). But as
we’ll see, the socket port model is generally used by most
Internet scripting protocols—email, for instance, is mostly just
formatted strings shipped over sockets between programs on
standard port numbers reserved for the email protocol.
Now that you
know about IPC alternatives and have had a chance to explore
processes, threads, and both process nonportability and thread GIL
limitations, it turns out that there is another alternative, which aims to
provide the best of both worlds. As mentioned earlier, Python’s
standard library multiprocessing package allows scripts to spawn
processes using an API very similar to the threading module.
This relatively new package works on both Unix and Windows, unlike
low-level process forks. It supports a process spawning model which is
largely platform-neutral, and provides tools for related goals, such as
IPC, including locks, pipes, and queues. In addition, because it uses
processes instead of threads to run code in parallel, it effectively works
around the limitations of the thread GIL. Hence, multiprocessing
allows the programmer to
leverage the capacity of multiple processors for parallel tasks, while
retaining much of the simplicity and portability of the threading
model.
So why learn yet another parallel processing paradigm and toolkit,
when we already have the threads, processes, and IPC tools we’ve studied
so far, such as sockets, pipes, and thread queues? Before we get into
the details, I want to begin with a few words about why you may (or may
not) care about this package. In more specific terms, although its
performance may not compete with that of pure threads or
process forks for some applications, this module offers a compelling
solution for many:
Compared to raw process forks, you gain cross-platform
portability and powerful IPC tools.
Compared to threads, you essentially trade some potential and
platform-dependent extra task start-up time for
the ability to run tasks in truly parallel fashion on multi-core or
multi-CPU machines.
On the other hand, this module imposes some constraints and
tradeoffs that threads do not:
Since objects are copied across process boundaries, shared
mutable state does not work as it does for threads—changes in one
process are not generally noticed in the other. Really, freely
shared state may be the most compelling reason to use threads; its
absence in this module may prove limiting in some threading
contexts.
Because this module requires pickleability both for its
processes on Windows and for some of its IPC tools in general,
some coding paradigms are difficult or nonportable—especially if
they use bound methods or pass unpickleable objects such as sockets
to spawned processes.
For instance, common coding patterns with lambda that work for the threading
module cannot be used as
process target callables in this module on Windows, because they cannot
be pickled. Similarly, because bound object methods are also not
pickleable, a threaded program may require a more indirect design if it
either runs bound methods in its threads or implements thread exit
actions by posting arbitrary callables (possibly including bound
methods) on shared queues. The in-process model of threads supports such
direct lambda and bound method use, but the separate processes of
multiprocessing do not.
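To make this constraint concrete, the following brief sketch (not part of this book’s examples package; the names here are purely illustrative) uses the pickle module directly to show why: a module-level function pickles fine because it can be recorded by name, but lambdas and nested functions have no importable name, so pickling them fails.

# Illustrative sketch: why some callables can't cross the process boundary
import pickle

def plain(x):                       # module-level function: pickled by name only
    return x * 2

def make_nested():
    def nested(x):                  # defined inside a function: no importable name
        return x * 2
    return nested

if __name__ == '__main__':
    pickle.dumps(plain)             # works: serialized as a module/name reference
    print('plain function pickles fine')

    for label, func in [('lambda', lambda x: x * 2), ('nested def', make_nested())]:
        try:
            pickle.dumps(func)
        except Exception as exc:    # PicklingError or AttributeError, per version
            print(label, 'is not pickleable:', exc)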
In fact, we’ll write a thread manager for GUIs in
Chapter 10 that relies on queueing in-process callables this way to implement
thread exit actions—the callables are queued by worker threads, and
fetched and dispatched by the main thread. Because the threaded
PyMailGUI program we’ll code in Chapter 14 both uses this manager to queue
bound methods for thread exit actions and runs bound methods as the main
action of a thread itself, it could not be directly translated to the
separate process model implied by multiprocessing.
Without getting into too many details here, to use multiprocessing,
PyMailGUI’s actions might
have to be coded as simple functions or complete process subclasses for
pickleability. Worse, they may have to be implemented as simpler action
identifiers dispatched in the main process, if they update either the
GUI itself or object state in general—pickling results in an object
copy in the receiving process, not a reference to the original, and
forks on Unix essentially copy an entire process. Updating the state of
a mutable message cache copied by pickling it to pass to a new process,
for example, has no effect on the original.
The pickleability constraints for process arguments on Windows can
limit multiprocessing’s scope in other contexts as
well. For instance, in Chapter 12, we’ll find
that this module doesn’t directly solve the lack of portability for the os.fork
call for traditionally coded socket servers on Windows, because connected
sockets are not pickled correctly when passed into a new process created
by this module to converse with a client. In this context, threads
provide a more portable and likely more efficient solution.
Applications that pass simpler types of messages, of course, may
fare better. Message constraints are easier to accommodate when they are
part of an initial process-based design. Moreover, other tools in this
module, such as its managers and shared memory API, while narrowly
focused and not as general as shared thread state, offer additional
mutable state options for some programs.
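As a quick preview of the manager tools just mentioned, here is a minimal sketch (not one of this book’s example files; the function and variable names are illustrative). A Manager starts a server process that holds objects such as dictionaries and hands out proxies; changes made through a proxy in one process are visible to the others.

# Minimal Manager sketch: a server process holds the dict, children get proxies
from multiprocessing import Process, Manager

def record(shared, i):
    shared[i] = i * 10                          # updates through the proxy are seen by all

if __name__ == '__main__':
    with Manager() as manager:
        shared = manager.dict()                 # proxy to a dict held by the manager
        ps = [Process(target=record, args=(shared, i)) for i in range(3)]
        for p in ps: p.start()
        for p in ps: p.join()
        print(dict(shared))                     # {0: 0, 1: 10, 2: 20}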
Fundamentally, though, because multiprocessing is based on separate
processes, it may be best geared for tasks which are relatively
independent, do not share mutable object state freely, and can make do
with the message passing and shared memory tools provided by this
module. This includes many applications, but this module is not
necessarily a direct replacement for every threaded program, and it is
not an alternative to process forks in all contexts.
To truly understand both this package’s benefits and its tradeoffs,
let’s turn to a first example and explore the
package’s implementation along the
way.
We don’t have
space to do full justice to this sophisticated module in
this book; see its coverage in the Python library manual for the full
story. But as a brief introduction, by design most of this module’s
interfaces mirror the threading and queue modules we’ve already met, so
they should already seem familiar. For example, the multiprocessing
module’s Process class is
intended to mimic the threading module’s Thread class we met
earlier—it allows us to launch a function call in parallel with the
calling script; with this module, though, the function runs in a process
instead of a thread.
Example 5-29
illustrates these basics in action:
Example 5-29. PP4E\System\Processes\multi1.py
"""
multiprocess basics: Process works like threading.Thread, but
runs function call in parallel in a process instead of a thread;
locks can be used to synchronize, e.g. prints on some platforms;
starts new interpreter on windows, forks a new process on unix;
"""
import os
from multiprocessing import Process, Lock
def whoami(label, lock):
msg = '%s: name:%s, pid:%s'
with lock:
print(msg % (label, __name__, os.getpid()))
if __name__ == '__main__':
lock = Lock()
whoami('function call', lock)
p = Process(target=whoami, args=('spawned child', lock))
p.start()
p.join()
for i in range(5):
Process(target=whoami, args=(('run process %s' % i), lock)).start()
with lock:
print('Main process exit.')
When run, this script first calls a function directly and
in-process; then launches a call to that function in a new process and
waits for it to exit; and finally spawns five function call processes in
parallel in a loop—all using an API identical to that of the
threading.Thread model we studied earlier in this
chapter. Here’s this script’s output on Windows; notice how the five
child processes spawned at the end of this script outlive their parent,
as is the usual case for processes:
C:\...\PP4E\System\Processes>multi1.py
function call: name:__main__, pid:8752
spawned child: name:__main__, pid:9268
Main process exit.
run process 3: name:__main__, pid:9296
run process 1: name:__main__, pid:8792
run process 4: name:__main__, pid:2224
run process 2: name:__main__, pid:8716
run process 0: name:__main__, pid:6936
Just like the threading.Thread
class we met earlier, the multiprocessing.Process object can either be
passed a target with arguments (as
done here) or subclassed to redefine its run
action method. Its start
method invokes its run
method in a new process, and the default run simply calls the passed-in
target. Also like threading, a join method waits for child process
exit, and a Lock object is provided
as one of a handful of process synchronization tools; it’s used here to
ensure that prints don’t overlap among processes on platforms where this
might matter (it may not on Windows).
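As a quick preview of the subclassing alternative (a minimal sketch only, with illustrative names; Example 5-32 ahead codes a fuller version), the redefined run method is what start launches in the new process:

# Minimal Process subclass sketch: start() runs the redefined run() in a new process
import os
from multiprocessing import Process

class Worker(Process):
    def __init__(self, label):
        Process.__init__(self)              # don't forget the superclass constructor
        self.label = label                  # per-instance state copied to the child

    def run(self):                          # invoked in the child by start()
        print(self.label, 'running in pid', os.getpid())

if __name__ == '__main__':
    w = Worker('demo')
    w.start()                               # spawn/fork child, call w.run() there
    w.join()                                # wait for the child to exit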
Technically, to achieve
its portability, this module currently works by
selecting from platform-specific alternatives:
On Unix, it forks a new child process and invokes the Process
object’s run method in the new child.
On Windows, it spawns a new interpreter by using
Windows-specific process creation tools, passing the pickled Process
object in to the new
process over a pipe, and starting a “python -c” command line in
the new process, which runs a special Python-coded function in
this package that reads and unpickles the Process
and invokes its run method.
We met pickling briefly in Chapter 1,
and we will study it further later in this book. The implementation is
a bit more complex than this, and is prone to change over time, of
course, but it’s really quite an amazing trick. While the portable API
generally hides these details from your code, its basic structure can
still have subtle impacts on the way you’re allowed to use it. For
instance:
On Windows, the main process’s logic should generally be
nested under a __name__ == '__main__' test as done here when using this module, so it
can be imported freely by a new interpreter without side effects.
As we’ll learn in more detail in
Chapter 17, unpickling classes and
functions requires an import of their enclosing module, and this
is the root of this requirement.
Moreover, when globals are accessed in child processes on
Windows, their values may not be the same as those in the parent at start
time, because their
module will be imported into a new process.
Also on Windows, all arguments to Process
must be pickleable. Because this
includes target, targets should
be simple functions so they can be pickled; they cannot be bound
or unbound object methods and cannot be
functions created with a lambda. See pickle
in Python’s library manual for
more on pickleability rules; nearly every object type works, but
callables like functions and classes must be importable—they are
pickled by name only, and later imported to re-create bytecode (a short
sketch after this list demonstrates the by-name pickling). On
Windows, objects with system state, such as connected sockets,
won’t generally work as arguments to a process target either,
because they are not pickleable.
Similarly, instances of custom Process
subclasses must be pickleable on
Windows as well. This includes all their attribute values. Objects
available in this package (e.g., Lock in
Example 5-29) are pickleable, and
so may be used as both Process
constructor arguments and subclass attributes.
IPC objects in this package that appear in later examples,
like Pipe and Queue, accept only pickleable objects,
because of their implementation (more on this in the next
section).
On Unix, although a child process can make use of a shared
global item created in the parent, it’s better to pass the object
as an argument to the child process’s constructor, both for
portability to Windows and to avoid potential problems if such
objects were garbage collected in the parent.
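The by-name pickling of callables noted in the list above is easy to see directly. In this small sketch (illustrative only, not one of the book’s files), the pickled bytes record just the module and function names, and unpickling simply re-imports the same object:

# Sketch: functions and classes are pickled as module/name references only
import pickle, os

def job():
    return os.getpid()

if __name__ == '__main__':
    data = pickle.dumps(job)
    print(data)                        # bytes name '__main__' and 'job', not bytecode
    print(pickle.loads(data) is job)   # True: unpickling re-imports the same object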
There are additional rules documented in the library manual. In
general, though, if you stick to passing in shared objects to
processes and using the synchronization and
communication
tools provided by this
package, your code will usually be portable and correct. Let’s look
next at a few of those tools in
action.
While the processes
created by this package can always communicate using
general system-wide tools like the sockets and fifo files we met
earlier, the multiprocessing module
also provides portable message passing tools specifically geared to this
purpose for the processes it spawns:
Its Pipe object
provides an anonymous pipe, which serves as a
connection between two processes. When called, Pipe returns
two Connection
objects that represent the ends of the pipe. Pipes are bidirectional
by default, and allow arbitrary pickleable Python objects to be sent
and received. On Unix they are implemented internally today with
either a connected socket pair or the os.pipe
call we met earlier, and on
Windows with named pipes specific to that platform. Much like the Process
object described earlier,
though, the Pipe object’s
portable API spares callers from such things.
Its Value and Array objects
implement shared process/thread-safe memory for
communication between processes. These calls return scalar and array
objects based in the ctypes module and
created in shared memory, with access synchronized by
default.
Its Queue object
serves as a FIFO list of Python objects, which allows
multiple producers and consumers. A queue is essentially a pipe with
extra locking mechanisms to coordinate more arbitrary accesses, and
inherits the pickleability constraints of Pipe.
Because these devices are safe to use across multiple processes,
they can often serve to synchronize points of communication and obviate
lower-level tools like locks, much the same as the thread queues we met
earlier. As usual, a pipe (or a pair of them) may be used to implement a
request/reply model. Queues support more flexible models; in fact, a GUI
that wishes to avoid the limitations of the GIL might use the
multiprocessing module’s Process and Queue
to spawn long-running tasks that post
results, rather than threads. As mentioned, although this may incur
extra start-up overhead on some platforms, unlike threads today, tasks
coded this way can be as truly parallel as the underlying platform
allows.
One constraint worth noting here: this package’s pipes (and by
proxy, queues)
pickle
the objects passed through
them, so that they can be reconstructed in the receiving process (as
we’ve seen, on Windows the receiver process may be a fully independent
Python interpreter). Because of that, they do not support unpickleable
objects; as suggested earlier, this includes some callables like bound
methods and lambda functions (see file
multi-badq.py
in the book examples package
for a demonstration of code that violates this constraint). Objects with
system state, such as sockets, may fail as well. Most other Python
object types, including classes and simple functions, work fine on pipes
and queues.
Also keep in mind that because they are pickled, objects
transferred this way are effectively
copied
in the
receiving process; direct in-place changes to mutable objects’ state
won’t be noticed in the sender. This makes sense if you remember that
this package runs independent processes with their own memory spaces;
state cannot be as freely shared as in threading, regardless of which
IPC tools you use.
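This copy effect is easy to demonstrate with one of this package’s pipes. In the following minimal sketch (illustrative only, not one of the book’s example files), the child changes the list it receives, but the parent’s original is untouched because the child worked on a pickled copy:

# Objects sent across process boundaries are pickled copies, not shared references
from multiprocessing import Process, Pipe

def mutate(pipe):
    data = pipe.recv()              # a copy, rebuilt by unpickling in this process
    data.append('changed')          # in-place change affects only the child's copy
    pipe.send(data)                 # must be sent back explicitly to be seen

if __name__ == '__main__':
    parentEnd, childEnd = Pipe()
    original = ['spam']
    child = Process(target=mutate, args=(childEnd,))
    child.start()
    parentEnd.send(original)
    print('child sent back:', parentEnd.recv())   # ['spam', 'changed']
    print('parent original:', original)           # ['spam'] -- unchanged
    child.join()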
To demonstrate
the IPC tools listed above, the next three examples
implement three flavors of communication between parent and child
processes.
Example 5-30
uses a
simple shared pipe object to send and receive data between parent and
child processes.
Example 5-30. PP4E\System\Processes\multi2.py
"""
Use multiprocess anonymous pipes to communicate. Returns 2 connection
object representing ends of the pipe: objects are sent on one end and
received on the other, though pipes are bidirectional by default
"""
import os
from multiprocessing import Process, Pipe
def sender(pipe):
"""
send object to parent on anonymous pipe
"""
pipe.send(['spam'] + [42, 'eggs'])
pipe.close()
def talker(pipe):
"""
send and receive objects on a pipe
"""
pipe.send(dict(name='Bob', spam=42))
reply = pipe.recv()
print('talker got:', reply)
if __name__ == '__main__':
(parentEnd, childEnd) = Pipe()
Process(target=sender, args=(childEnd,)).start() # spawn child with pipe
print('parent got:', parentEnd.recv()) # receive from child
parentEnd.close() # or auto-closed on gc
(parentEnd, childEnd) = Pipe()
child = Process(target=talker, args=(childEnd,))
child.start()
print('parent got:', parentEnd.recv()) # receieve from child
parentEnd.send({x * 2 for x in 'spam'}) # send to child
child.join() # wait for child exit
print('parent exit')
When run on Windows, here’s this script’s output—one child
passes an object to the parent, and the other both sends and receives
on the same pipe:
C:\...\PP4E\System\Processes>multi2.py
parent got: ['spam', 42, 'eggs']
parent got: {'name': 'Bob', 'spam': 42}
talker got: {'ss', 'aa', 'pp', 'mm'}
parent exit
This module’s pipe objects make communication between two
processes portable (and nearly trivial).
Example 5-31
uses shared
memory to serve as both inputs and outputs of spawned
processes. To make this work portably, we must create objects defined
by the package and pass them to Process
constructors. The last test in this
demo (“loop4”) probably represents the most common use case for shared
memory—that of distributing computation work to multiple parallel
processes.
Example 5-31. PP4E\System\Processes\multi3.py
"""
Use multiprocess shared memory objects to communicate.
Passed objects are shared, but globals are not on Windows.
Last test here reflects common use case: distributing work.
"""
import os
from multiprocessing import Process, Value, Array
procs = 3
count = 0 # per-process globals, not shared
def showdata(label, val, arr):
"""
print data values in this process
"""
msg = '%-12s: pid:%4s, global:%s, value:%s, array:%s'
print(msg % (label, os.getpid(), count, val.value, list(arr)))
def updater(val, arr):
"""
communicate via shared memory
"""
global count
count += 1 # global count not shared
val.value += 1 # passed in objects are
for i in range(3): arr[i] += 1
if __name__ == '__main__':
scalar = Value('i', 0) # shared memory: process/thread safe
vector = Array('d', procs) # type codes from ctypes: int, double
# show start value in parent process
showdata('parent start', scalar, vector)
# spawn child, pass in shared memory
p = Process(target=showdata, args=('child ', scalar, vector))
p.start(); p.join()
# pass in shared memory updated in parent, wait for each to finish
# each child sees updates in parent so far for args (but not global)
print('\nloop1 (updates in parent, serial children)...')
for i in range(procs):
count += 1
scalar.value += 1
vector[i] += 1
p = Process(target=showdata, args=(('process %s' % i), scalar, vector))
p.start(); p.join()
# same as prior, but allow children to run in parallel
# all see the last iteration's result because all share objects
print('\nloop2 (updates in parent, parallel children)...')
ps = []
for i in range(procs):
count += 1
scalar.value += 1
vector[i] += 1
p = Process(target=showdata, args=(('process %s' % i), scalar, vector))
p.start()
ps.append(p)
for p in ps: p.join()
# shared memory updated in spawned children, wait for each
print('\nloop3 (updates in serial children)...')
for i in range(procs):
p = Process(target=updater, args=(scalar, vector))
p.start()
p.join()
showdata('parent temp', scalar, vector)
# same, but allow children to update in parallel
ps = []
print('\nloop4 (updates in parallel children)...')
for i in range(procs):
p = Process(target=updater, args=(scalar, vector))
p.start()
ps.append(p)
for p in ps: p.join()
# global count=6 in parent only
# show final results here # scalar=12: +6 parent, +6 in 6 children
showdata('parent end', scalar, vector) # array[i]=8: +2 parent, +6 in 6 children
The following is this script’s output on Windows. Trace through
this and the code to see how it runs; notice how the changed value of
the global variable is not shared by the spawned processes on Windows,
but passed-in Value and Array
objects are. The final output line
reflects changes made to shared memory in both the parent and spawned
children—the array’s final values are all 8.0, because they were
incremented twice in the parent, and once in each of six spawned
children; the scalar value similarly reflects changes made by both
parent and child; but unlike for threads, the global is per-process
data on Windows:
C:\...\PP4E\System\Processes>multi3.py
parent start: pid:6204, global:0, value:0, array:[0.0, 0.0, 0.0]
child : pid:9660, global:0, value:0, array:[0.0, 0.0, 0.0]
loop1 (updates in parent, serial children)...
process 0 : pid:3900, global:0, value:1, array:[1.0, 0.0, 0.0]
process 1 : pid:5072, global:0, value:2, array:[1.0, 1.0, 0.0]
process 2 : pid:9472, global:0, value:3, array:[1.0, 1.0, 1.0]
loop2 (updates in parent, parallel children)...
process 1 : pid:9468, global:0, value:6, array:[2.0, 2.0, 2.0]
process 2 : pid:9036, global:0, value:6, array:[2.0, 2.0, 2.0]
process 0 : pid:9548, global:0, value:6, array:[2.0, 2.0, 2.0]
loop3 (updates in serial children)...
parent temp : pid:6204, global:6, value:9, array:[5.0, 5.0, 5.0]
loop4 (updates in parallel children)...
parent end : pid:6204, global:6, value:12, array:[8.0, 8.0, 8.0]
If you imagine the last test here run with a much larger array
and many more parallel children, you might begin to sense some of the
power of this package for
distributing work.
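One possible variation along these lines (a sketch only, not one of the book’s files; the sizes and names here are illustrative) gives each child its own slice of a larger shared Array to update, so the children run in parallel without overwriting each other’s results:

# Sketch: distribute disjoint slices of a shared Array to parallel workers
from multiprocessing import Process, Array

def work(arr, start, stop):
    for i in range(start, stop):              # each child updates only its own slice
        arr[i] = arr[i] * 2 + 1

if __name__ == '__main__':
    size, procs = 16, 4
    shared = Array('d', range(size))          # doubles 0.0..15.0 in shared memory
    chunk = size // procs
    ps = [Process(target=work, args=(shared, i * chunk, (i + 1) * chunk))
          for i in range(procs)]
    for p in ps: p.start()
    for p in ps: p.join()
    print(list(shared))                       # every slice transformed in parallel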
Finally, besides
basic spawning and IPC tools, the multiprocessing module also:
Allows its Process class
to be subclassed to provide structure and state retention (much
like threading.Thread, but for
processes).
Implements a process-safe Queue
object which may be shared by any
number of processes for more general communication needs (much
like queue.Queue, but for
processes).
Queues support a more flexible multiple client/server model.
Example 5-32, for instance,
spawns three producer processes to post to a shared queue and repeatedly
polls for results to appear—in much the same fashion that a GUI might
collect results in parallel with the display itself, though here the
concurrency is achieved with processes instead of threads.
Example 5-32. PP4E\System\Processes\multi4.py
"""
Process class can also be subclassed just like threading.Thread;
Queue works like queue.Queue but for cross-process, not cross-thread
"""
import os, time, queue
from multiprocessing import Process, Queue # process-safe shared queue
# queue is a pipe + locks/semas
class Counter(Process):
label = ' @'
def __init__(self, start, queue): # retain state for use in run
self.state = start
self.post = queue
Process.__init__(self)
def run(self): # run in newprocess on start()
for i in range(3):
time.sleep(1)
self.state += 1
print(self.label ,self.pid, self.state) # self.pid is this child's pid
self.post.put([self.pid, self.state]) # stdout file is shared by all
print(self.label, self.pid, '-')
if __name__ == '__main__':
print('start', os.getpid())
expected = 9
post = Queue()
p = Counter(0, post) # start 3 processes sharing queue
q = Counter(100, post) # children are producers
r = Counter(1000, post)
p.start(); q.start(); r.start()
while expected: # parent consumes data on queue
time.sleep(0.5) # this is essentially like a GUI,
try: # though GUIs often use threads
data = post.get(block=False)
except queue.Empty:
print('no data...')
else:
print('posted:', data)
expected -= 1
p.join(); q.join(); r.join() # must get before join putter
print('finish', os.getpid(), r.exitcode) # exitcode is child exit status
Notice in this code how:
The time.sleep calls in
this code’s producer simulate long-running tasks.
All four processes share the same output stream; print
calls go the same place and don’t
overlap badly on Windows (as we saw earlier, the multiprocessing
module also has a
shareable Lock object to
synchronize access if required).
The exit status of child processes is available after they
finish in their exitcode attribute.
When run, the output of the main consumer process traces its
queue fetches, and the (indented) output of spawned child producer
processes gives process IDs and state.
C:\...\PP4E\System\Processes>multi4.py
start 6296
no data...
no data...
@ 8008 101
posted: [8008, 101]
@ 6068 1
@ 3760 1001
posted: [6068, 1]
@ 8008 102
posted: [3760, 1001]
@ 6068 2
@ 3760 1002
posted: [8008, 102]
@ 8008 103
@ 8008 -
posted: [6068, 2]
@ 6068 3
@ 6068 -
@ 3760 1003
@ 3760 -
posted: [3760, 1002]
posted: [8008, 103]
posted: [6068, 3]
posted: [3760, 1003]
finish 6296 0
If you imagine the “@” lines here as results of long-running
operations and the others as a main GUI thread, the wide relevance of
this package may become more
apparent.