As we learned earlier, independent programs generally communicate with
system-global tools such as sockets and the fifo files we studied
earlier. Although processes spawned by multiprocessing can leverage
these tools, too, their closer relationship affords them the host of
additional IPC devices provided by this module.
Like threads, multiprocessing is designed to run function calls in
parallel, not to start entirely separate programs directly. Spawned
functions might use tools like os.system, os.popen, and subprocess to
start a program if such an operation might block the caller, but there’s
otherwise often no point in starting a process that just starts a program
(you might as well start the program and skip a step). In fact, on
Windows, multiprocessing today uses the same process creation call as
subprocess, so there’s little point in starting two processes to run one.
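As a rough sketch of the "skip a step" option, a parent can launch child programs directly with subprocess.Popen, which returns without waiting for the child. The inline CHILD_CODE string below is only a stand-in for a script file such as child.py, and the launch helper is a name invented for this illustration.

```python
import subprocess
import sys

# Stand-in for a script file like child.py (an assumption for this sketch)
CHILD_CODE = "import os, sys; print('Hello from child', os.getpid(), sys.argv[1])"

def launch(arg):
    """Start one child program directly and return without waiting for it."""
    return subprocess.Popen(
        [sys.executable, '-c', CHILD_CODE, str(arg)],   # same interpreter as the parent
        stdout=subprocess.PIPE, text=True)

if __name__ == '__main__':
    children = [launch(i) for i in range(5)]    # all five are started before any is awaited
    print('parent continues')
    for child in children:
        output, _ = child.communicate()         # collect output and reap the child
        print(output, end='')
```

Using sys.executable rather than the string 'python' avoids depending on the system search path, at the cost of always running the same interpreter as the parent.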
It is, however, possible to start new programs in the child processes
spawned, using tools like the os.exec* calls we met earlier. By spawning
a process portably with multiprocessing and overlaying it with a new
program this way, we start a new independent program, and effectively
work around the lack of the os.fork call in standard Windows Python.
This generally assumes that the new program doesn’t require any
resources passed in by the Process API, of course (once a new program
starts, it erases that which was running), but it offers a portable
equivalent to the fork/exec combination on Unix. Furthermore, programs
started this way can still make use of more traditional IPC tools, such
as the sockets and fifos we met earlier in this chapter.
Example 5-33 illustrates the technique.
Example 5-33. PP4E\System\Processes\multi5.py
"Use multiprocessing to start independent programs, os.fork or not"
import os
from multiprocessing import Process

def runprogram(arg):
    os.execlp('python', 'python', 'child.py', str(arg))

if __name__ == '__main__':
    for i in range(5):
        Process(target=runprogram, args=(i,)).start()
    print('parent exit')
This script starts 5 instances of the child.py script we wrote in
Example 5-4 as independent processes, without waiting for them to finish.
Here’s this script at work on Windows, after deleting a superfluous system
prompt that shows up arbitrarily in the middle of its output (it runs the
same on Cygwin, but the output is not interleaved there):
C:\...\PP4E\System\Processes>type child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])
C:\...\PP4E\System\Processes>multi5.py
parent exit
Hello from child 9844 2
Hello from child 8696 4
Hello from child 1840 0
Hello from child 6724 1
Hello from child 9368 3
This technique isn’t possible with threads, because all threads
run in the same process; overlaying it with a new program would kill all
its threads. Though this is unlikely to be as fast as a fork/exec
combination on Unix, it at least provides similar and portable
functionality on Windows when required.
Finally, multiprocessing provides many more tools than these examples
deploy, including condition, event, and semaphore synchronization tools,
and local and remote managers that implement servers for shared objects.
For instance, Example 5-34 demonstrates its support for pools: spawned
children that work in concert on a given task.
Example 5-34. PP4E\System\Processes\multi6.py
"Plus much more: process pools, managers, locks, condition,..."
import os
from multiprocessing import Pool

def powers(x):
    #print(os.getpid())               # enable to watch children
    return 2 ** x

if __name__ == '__main__':
    workers = Pool(processes=5)

    results = workers.map(powers, [2]*100)
    print(results[:16])
    print(results[-2:])

    results = workers.map(powers, range(100))
    print(results[:16])
    print(results[-2:])
When run, Python arranges to delegate portions of the task to
workers run in parallel:
C:\...\PP4E\System\Processes>multi6.py
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
[4, 4]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768]
[316912650057057350374175801344, 633825300114114700748351602688]
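Pools offer other mapping calls besides map. As a minimal sketch, and not part of the book's example, the following uses the pool as a context manager and hands each worker a tuple of arguments with starmap. The make_tasks helper is a name made up for this illustration, and the built-in pow stands in for the powers function.

```python
from multiprocessing import Pool

def make_tasks(count):
    """Build (base, exponent) argument tuples: one pow(2, x) call per task."""
    return [(2, x) for x in range(count)]

if __name__ == '__main__':
    # Leaving the with block shuts down the pool's worker processes
    with Pool(processes=5) as workers:
        results = workers.starmap(pow, make_tasks(16))   # each tuple is unpacked into pow(2, x)
    print(results)
```

Because pow is a built-in, it is pickled by name, which sidesteps questions about where a worker function is defined.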
To be fair, besides such additional features and tools, multiprocessing
also comes with additional constraints beyond those we’ve already covered
(pickleability, mutable state, and so on). For example, consider the
following sort of code:
from multiprocessing import Process

def action(arg1, arg2):
    print(arg1, arg2)

if __name__ == '__main__':
    Process(target=action, args=('spam', 'eggs')).start()   # shell waits for child
This works as expected, but if we change the last line to the following
it fails on Windows because lambdas are not pickleable (really, not
importable):
Process(target=(lambda: action('spam', 'eggs'))).start() # fails!-not pickleable
This precludes a common coding pattern that uses lambda to add data to
calls, which we’ll use often for callbacks in the GUI part of this book.
Moreover, this differs from the threading module that is the model for
this package: calls like the following, which work for threads, must be
recoded as a callable plus separate arguments to work with
multiprocessing:
threading.Thread(target=(lambda: action(2, 4))).start() # but lambdas work here
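One common way to keep the "add data to a call" pattern while still satisfying the pickling requirement is functools.partial. The sketch below is an illustration rather than the book's code: it only checks whether each kind of callable can be pickled, is_picklable is a helper invented here, and the built-in print stands in for the action function.

```python
import functools
import pickle

def is_picklable(obj):
    """Return True if pickle.dumps accepts obj, as process arguments must when pickled."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

with_lambda = lambda: print('spam', 'eggs')                # data bound by a lambda
with_partial = functools.partial(print, 'spam', 'eggs')    # same data, bound by partial

if __name__ == '__main__':
    print('lambda picklable: ', is_picklable(with_lambda))
    print('partial picklable:', is_picklable(with_partial))
    with_partial()                                         # calls print('spam', 'eggs')
```

A partial object pickles as long as the function it wraps can be found by name, so wrapping a module-level function this way is a plausible substitute for the lambda form when arguments must travel with the callable.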
Conversely, some behavior of the threading module is mimicked by
multiprocessing, whether you wish it did or not. Because programs using
this package wait for child processes to end by default, we must mark
processes as daemon if we don’t want to block the shell where the
following sort of code is run (technically, parents attempt to terminate
daemonic children on exit, which means that the program can exit when
only daemonic children remain, much like threading):
import time
from multiprocessing import Process

def action(arg1, arg2):
    print(arg1, arg2)
    time.sleep(5)             # normally prevents the parent from exiting

if __name__ == '__main__':
    p = Process(target=action, args=('spam', 'eggs'))
    p.daemon = True           # don't wait for it
    p.start()
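In more recent Python 3.X releases, the daemon flag can also be passed to the Process constructor as a keyword argument, and join accepts a timeout for a bounded wait. The following is a small sketch under those assumptions. The make_worker helper is a hypothetical name, and time.sleep stands in for a longer-running action.

```python
import time
from multiprocessing import Process

def make_worker(seconds, daemon):
    """Create (but do not start) a child that sleeps; time.sleep stands in for action."""
    return Process(target=time.sleep, args=(seconds,), daemon=daemon)

if __name__ == '__main__':
    p = make_worker(5, daemon=True)
    p.start()
    p.join(timeout=0.5)                      # wait briefly, not for the full 5 seconds
    print('still running:', p.is_alive())
    # The parent exits here; the daemonic child is terminated rather than awaited
```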
There’s more on some of these issues in the Python library manual; they
are not show-stoppers by any stretch, but special cases and potential
pitfalls to some. We’ll revisit the lambda and daemon issues in a more
realistic context in Chapter 8, where we’ll use multiprocessing to launch
GUI demos independently.
As this section’s examples suggest, multiprocessing provides a powerful
alternative that aims to combine the portability and much of the utility
of threads with the fully parallel potential of processes, and offers
additional solutions to IPC, exit status, and other parallel processing
goals.
Hopefully, this section has also given you a better understanding of
this module’s tradeoffs discussed at its beginning. In particular, its
separate process model precludes the freely shared mutable state of
threads, and bound methods and lambdas are prohibited both by the
pickleability requirements of its IPC pipes and queues and by its process
action implementation on Windows. Moreover, its requirement of
pickleability for process arguments on Windows also precludes it as an
option for conversing with clients in socket servers portably.
While not a replacement for threading in all applications, though,
multiprocessing offers compelling solutions for many. Especially for
parallel-programming tasks that can be designed to avoid its limitations,
this module can offer both performance and portability that Python’s more
direct multitasking tools cannot.
Unfortunately, beyond this brief introduction, we don’t have space
for a more complete treatment of this module in this book. For more
details, refer to the Python library manual. Here, we turn next to a
handful of additional program launching tools and a wrap up of this
chapter.
We’ve seen a variety of ways to launch programs in this book so far:
from the os.fork/exec combination on Unix, to portable shell command-line
launchers like os.system, os.popen, and subprocess, to the portable
multiprocessing module options of the last section. There are still other
ways to start programs in the Python standard library, some of which are
more platform neutral or obscure than others. This section wraps up this
chapter with a quick tour through this set.
The os.spawnv and os.spawnve calls were originally introduced to launch
programs on Windows, much like a fork/exec call combination on Unix-like
platforms. Today, these calls work on both Windows and Unix-like systems,
and additional variants have been added to parrot os.exec.
In recent versions of Python, the portable subprocess module has started
to supersede these calls. In fact, Python’s library manual includes a
note stating that this module has more powerful and equivalent tools and
should be preferred to os.spawn calls. Moreover, the newer
multiprocessing module can achieve similarly portable results today when
combined with os.exec calls, as we saw earlier. Still, the os.spawn calls
continue to work as advertised and may appear in Python code you
encounter.
The os.spawn family of calls execute a program named by a command line
in a new process, on both Windows and Unix-like systems. In basic
operation, they are similar to the fork/exec call combination on Unix and
can be used as alternatives to the system and popen calls we’ve already
learned. In the following interaction, for instance, we start a Python
program with a command line in two traditional ways (the second also
reads its output):
C:\...\PP4E\System\Processes>python
>>> print(open('makewords.py').read())
print('spam')
print('eggs')
print('ham')
>>> import os
>>> os.system('python makewords.py')
spam
eggs
ham
0
>>> result = os.popen('python makewords.py').read()
>>> print(result)
spam
eggs
ham
The equivalent os.spawn calls achieve the same effect, with a slightly
more complex call signature that provides more control over the way the
program is launched:
>>> os.spawnv(os.P_WAIT, r'C:\Python31\python', ('python', 'makewords.py'))
spam
eggs
ham
0
>>> os.spawnl(os.P_NOWAIT, r'C:\Python31\python', 'python', 'makewords.py')
1820
>>> spam
eggs
ham
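For comparison, here is roughly how the same two launch styles might look with the subprocess module that the library manual recommends. This is a sketch, not a drop-in replacement: the inline CODE string stands in for the makewords.py file, and run_and_wait and run_no_wait are names invented for this illustration.

```python
import subprocess
import sys

# Stand-in for the makewords.py script (an assumption for this sketch)
CODE = "print('spam'); print('eggs'); print('ham')"
COMMAND = [sys.executable, '-c', CODE]

def run_and_wait():
    """Roughly like os.P_WAIT: block until the program ends, return its exit status."""
    return subprocess.call(COMMAND)

def run_no_wait():
    """Roughly like os.P_NOWAIT: return at once; the Popen object carries the process ID."""
    return subprocess.Popen(COMMAND)

if __name__ == '__main__':
    print(run_and_wait())
    child = run_no_wait()
    print(child.pid)
    child.wait()              # reap the child before the parent exits
```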
The spawn calls are also much like forking programs in Unix. They don’t
actually copy the calling process (so shared descriptor operations won’t
work), but they can be used to start a program running completely
independent of the calling program, even on Windows. The script in
Example 5-35 makes the similarity to Unix programming patterns more
obvious. It launches a program with a fork/exec combination on Unix-like
platforms (including Cygwin), or an os.spawnv call on Windows.
Example 5-35. PP4E\System\Processes\spawnv.py
"""
start up 10 copies of child.py running in parallel;
use spawnv to launch a program on Windows (like fork+exec);
P_OVERLAY replaces, P_DETACH makes child stdout go nowhere;
or use portable subprocess or multiprocessing options today!
"""
import os, sys

for i in range(10):
    if sys.platform[:3] == 'win':
        pypath = sys.executable
        os.spawnv(os.P_NOWAIT, pypath, ('python', 'child.py', str(i)))
    else:
        pid = os.fork()
        if pid != 0:
            print('Process %d spawned' % pid)
        else:
            os.execlp('python', 'python', 'child.py', str(i))
print('Main process exiting.')
To make sense of these examples, you have to understand the arguments
being passed to the spawn calls. In this script, we call os.spawnv with a
process mode flag, the full directory path to the Python interpreter, and
a tuple of strings representing the shell command line with which to
start a new program. The path to the Python interpreter executable
program running a script is available as sys.executable. In general, the
process mode flag is taken from these predefined values:
os.P_NOWAIT and os.P_NOWAITO
    The spawn functions will return as soon as the new process has been
    created, with the process ID as the return value. Available on Unix
    and Windows.
os.P_WAIT
    The spawn functions will not return until the new process has run to
    completion and will return the exit code of the process if the run is
    successful or “-signal” if a signal kills the process. Available on
    Unix and Windows.
os.P_DETACH and os.P_OVERLAY
    P_DETACH is similar to P_NOWAIT, but the new process is detached from
    the console of the calling process. If P_OVERLAY is used, the current
    program will be replaced (much like os.exec). Available on Windows.
In fact, there are eight different calls in the spawn family, which all
start a program but vary slightly in their call signatures. In their
names, an “l” means you list arguments individually, “p” means the
executable file is looked up on the system path, and “e” means a
dictionary is passed in to provide the shell environment of the spawned
program. The os.spawnve call, for example, works the same way as
os.spawnv but accepts an extra fourth dictionary argument to specify a
different shell environment for the spawned program (which, by default,
inherits all of the parent’s settings):
os.spawnl(mode, path, ...)
os.spawnle(mode, path, ..., env)
os.spawnlp(mode, file, ...) # Unix only
os.spawnlpe(mode, file, ..., env) # Unix only
os.spawnv(mode, path, args)
os.spawnve(mode, path, args, env)
os.spawnvp(mode, file, args) # Unix only
os.spawnvpe(mode, file, args, env) # Unix only
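As a hedged illustration of the “e” forms in this list, the sketch below passes a modified copy of the parent’s environment to os.spawnve and waits for the child’s exit status. PP4E_DEMO is a made-up variable name, spawn_with_env is a hypothetical helper, and the inline child code is a stand-in for a real script. On Windows, arguments that contain spaces may need extra quoting, so treat this as a Unix-leaning sketch.

```python
import os
import sys

# The child exits with status 0 only if it sees the variable the parent set.
# PP4E_DEMO is a made-up variable name used only for this sketch.
CHILD_CODE = "import os, sys; sys.exit(0 if os.environ.get('PP4E_DEMO') == 'spam' else 1)"

def spawn_with_env(value):
    """Run a child with one extra environment setting; return its exit status."""
    env = dict(os.environ)            # start from the parent's settings
    env['PP4E_DEMO'] = value          # then add or override one entry
    args = [sys.executable, '-c', CHILD_CODE]
    return os.spawnve(os.P_WAIT, sys.executable, args, env)

if __name__ == '__main__':
    print(spawn_with_env('spam'))     # child finds the expected value
    print(spawn_with_env('eggs'))     # child finds a different value
```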
Because these calls mimic the names and call signatures of the os.exec
variants, see earlier in this chapter for more details on the differences
between these call forms. Unlike the os.exec calls, only half of the
os.spawn forms (those without system path checking, and hence without a
“p” in their names) are currently implemented on Windows. All the process
mode flags are supported on Windows, but detach and overlay modes are not
available on Unix. Because this sort of detail may be prone to change, to
verify which are present, be sure to see the library manual or run a dir
built-in function call on the os module after an import.
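The check suggested here can also be scripted. The following sketch simply asks the os module which of the names mentioned in this section it defines on the current platform. The available helper and the two name lists are written out for this illustration only.

```python
import os

SPAWN_NAMES = ['spawnl', 'spawnle', 'spawnlp', 'spawnlpe',
               'spawnv', 'spawnve', 'spawnvp', 'spawnvpe']
MODE_NAMES = ['P_NOWAIT', 'P_NOWAITO', 'P_WAIT', 'P_DETACH', 'P_OVERLAY']

def available(names):
    """Return the subset of names that the os module defines on this platform."""
    return [name for name in names if hasattr(os, name)]

if __name__ == '__main__':
    print('calls:', available(SPAWN_NAMES))
    print('modes:', available(MODE_NAMES))
```

The results differ between Windows and Unix-like systems, which is the point: the script reports what this installation supports rather than what a manual says.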
Here is the script in Example 5-35 at work on Windows, spawning 10
independent copies of the child.py Python program we met earlier in this
chapter:
C:\...\PP4E\System\Processes>type child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])
C:\...\PP4E\System\Processes>python spawnv.py
Hello from child -583587 0
Hello from child -558199 2
Hello from child -586755 1
Hello from child -562171 3
Main process exiting.
Hello from child -581867 6
Hello from child -588651 5
Hello from child -568247 4
Hello from child -563527 7
Hello from child -543163 9
Hello from child -587083 8
Notice that the copies print their output in random order, and the
parent program exits before all children do; all of these programs are
really running in parallel on Windows. Also observe that the child
program’s output shows up in the console box where spawnv.py was run;
when using P_NOWAIT, standard output comes to the parent’s console, but
it seems to go nowhere when using P_DETACH (which is most likely a
feature when spawning GUI programs).
But having shown you this call, I need to again point out that both the
subprocess and multiprocessing modules offer more portable alternatives
for spawning programs with command lines today. In fact, unless os.spawn
calls provide unique behavior you can’t live without (e.g., control of
shell window pop ups on Windows), the platform-specific alternatives code
of Example 5-35 can be replaced altogether with the portable
multiprocessing code in Example 5-33.
Although os.spawn calls may be largely superfluous today, there are
other tools that can still make a strong case for themselves. For
instance, the os.system call can be used on Windows to launch a DOS start
command, which opens (i.e., runs) a file independently based on its
Windows filename associations, as though it were clicked. os.startfile
makes this even simpler in recent Python releases, and it can avoid
blocking its caller, unlike some other tools.
To understand why, first you need to know how the DOS start command
works in general. Roughly, a DOS command line of the form start command
works as if command were typed in the Windows Run dialog box available in
the Start button menu. If command is a filename, it is opened exactly as
if its name was double-clicked in the Windows Explorer file selector GUI.
For instance, the following three DOS commands automatically
start Internet Explorer, my registered image viewer program, and my
sound media player program on the files named in the commands. Windows
simply opens the file with whatever program is associated to handle
filenames of that form. Moreover, all three of these programs run
independently of the DOS console box where the command is
typed:
C:\...\PP4E\System\Media>start lp4e-preface-preview.html
C:\...\PP4E\System\Media>start ora-lp4e.jpg
C:\...\PP4E\System\Media>start sousa.au
Because the start command can run any file and command line, there is no
reason it cannot also be used to start an independently running Python
program:
C:\...\PP4E\System\Processes>start child.py 1
This works because Python is registered to open names ending in .py when
it is installed. The script child.py is launched independently of the DOS
console window even though we didn’t provide the name or path of the
Python interpreter program. Because child.py simply prints a message and
exits, though, the result isn’t exactly satisfying: a new DOS window pops
up to serve as the script’s standard output, and it immediately goes away
when the child exits. To do better, add an input call at the bottom of
the program file to wait for a key press before exiting:
C:\...\PP4E\System\Processes>type child-wait.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])
input("Press") # don't flash on Windows
C:\...\PP4E\System\Processes>start child-wait.py 2
Now the child’s DOS window pops up and stays up after the start command
has returned. Pressing the Enter key in the pop-up DOS window makes it go
away.
Since we know that Python’s os.system and os.popen can be called by a
script to run any command line that can be typed at a DOS shell prompt,
we can also start independently running programs from a Python script by
simply running a DOS start command line. For instance:
C:\...\PP4E\System\Media>python
>>> import os
>>> cmd = 'start lp4e-preface-preview.html'     # start IE browser
>>> os.system(cmd)                              # runs independent
0
The Python os.system call here starts whatever web page browser is
registered on your machine to open .html files (unless that program is
already running). The launched program runs completely independent of the
Python session: when running a DOS start command, os.system does not wait
for the spawned program to exit.
In fact, start is so useful that recent Python releases also include an
os.startfile call, which is essentially the same as spawning a DOS start
command with os.system and works as though the named file were
double-clicked. The following calls, for instance, have a similar effect:
>>> os.startfile('lp-code-readme.txt')
>>> os.system('start lp-code-readme.txt')
Both pop up the text file in Notepad on my Windows computer. Unlike the
second of these calls, though, os.startfile provides no option to wait
for the application to close (the DOS start command’s /WAIT option does)
and no way to retrieve the application’s exit status (returned from
os.system).
On recent versions of Windows, the following has a similar effect, too,
because the registry is used at the command line (though this form pauses
until the file’s viewer is closed, like using start /WAIT):
>>> os.system('lp-code-readme.txt')             # 'start' is optional today
This is a convenient way to open arbitrary document and media files, but
keep in mind that the os.startfile call works only on Windows, because it
uses the Windows registry to know how to open a file. In fact, there are
even more obscure and nonportable ways to launch programs, including
Windows-specific options in the PyWin32 package, which we’ll finesse
here. If you want to be more platform neutral, consider using one of the
other many program launcher tools we’ve seen, such as os.popen or
os.spawnv. Or better yet, write a module to hide the details, as the next
and final section demonstrates.
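To give a feel for what hiding the details can mean, here is a deliberately small sketch that picks a file-opening mechanism per platform. It is not the module the next section presents. The function names are invented for this illustration, and the use of the open command on Mac OS X and xdg-open on Linux desktops is an assumption about what those systems provide.

```python
import os
import subprocess
import sys

def opener_command(filename, platform=sys.platform):
    """Return the command list used to open filename, or None where os.startfile applies."""
    if platform.startswith('win'):
        return None                          # Windows: use os.startfile instead
    if platform == 'darwin':
        return ['open', filename]            # Mac OS X launcher command
    return ['xdg-open', filename]            # common on Linux desktops (an assumption)

def open_document(filename):
    """Open filename with its associated program, without waiting for it to close."""
    command = opener_command(filename)
    if command is None:
        os.startfile(filename)               # available on Windows only
    else:
        subprocess.Popen(command)            # returns at once; the viewer runs independently
```

Keeping the platform test in one small function means callers never mention os.startfile or a launcher command directly, which is the main appeal of wrapping such details in a module.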