Authors: Mark Lutz

Tags: #COMPUTERS / Programming Languages / Python

Programming Python (16 page)

Other os Module Exports

That’s as much of a tour around os as we have space for here. Since
most other os module tools are even more difficult to appreciate
outside the context of larger application topics, we’ll postpone a
deeper look at them until later chapters. But to let you sample the
flavor of this module, here is a quick preview for reference. Among the
os module’s other weapons are these:

os.environ: Fetches and sets shell environment variables
os.fork: Spawns a new child process on Unix-like systems
os.pipe: Communicates between programs
os.execlp: Starts new programs
os.spawnv: Starts new programs with lower-level control
os.open: Opens a low-level descriptor-based file
os.mkdir: Creates a new directory
os.mkfifo: Creates a new named pipe
os.stat: Fetches low-level file information
os.remove: Deletes a file by its pathname
os.walk: Applies a function or loop body to all parts of an entire directory tree
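
To give a feel for how a few of these calls fit together, here is a minimal sketch that works inside a throwaway temporary directory. The names workdir, demo.txt, and DEMO_SETTING are made up for illustration and are not part of the book's examples.

```python
import os
import tempfile

# Work inside a scratch directory so nothing real is touched.
base = tempfile.mkdtemp()
workdir = os.path.join(base, 'workdir')      # hypothetical directory name
os.mkdir(workdir)                            # create a new directory

path = os.path.join(workdir, 'demo.txt')     # hypothetical file name
with open(path, 'w') as f:
    f.write('spam')

size = os.stat(path).st_size                 # low-level file information
os.environ['DEMO_SETTING'] = 'on'            # hypothetical variable name

# os.walk yields a (dirpath, subdirs, files) tuple per directory in the tree.
found = []
for dirpath, subdirs, files in os.walk(base):
    for name in files:
        found.append(os.path.join(dirpath, name))

os.remove(path)                              # delete the file by pathname
still_there = os.path.exists(path)

# Clean up the now-empty scratch directories.
os.rmdir(workdir)
os.rmdir(base)
```

The process-oriented calls in the list (os.fork, os.pipe, os.execlp, os.spawnv, os.mkfifo) are left out here because they are platform-dependent.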

And so on. One caution up front: the os module provides a set of file
open, read, and write calls, but all of these deal with low-level file
access and are entirely distinct from Python’s built-in stdio file
objects that we create with the built-in open function. You should
normally use the built-in open function, not the os module, for all but
very special file-processing needs (e.g., opening with exclusive access
file locking).
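
The difference between the two styles can be sketched as follows. This is only an illustration under assumptions: the file name lowlevel.txt is invented, and os.O_EXCL is used here as one example of a flag that requests exclusive creation, which may or may not be the kind of special need you have.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'lowlevel.txt')   # hypothetical file name

# Descriptor-based access: os.open returns an integer descriptor and
# os.write takes bytes. O_EXCL makes creation fail if the file exists.
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
os.write(fd, b'hello\n')
os.close(fd)

# The usual route: the built-in open returns a file object with
# text decoding, buffering, and line iteration already in place.
with open(path) as f:
    text = f.read()
```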

In the next chapter we will apply sys and os tools such as those we’ve
introduced here to implement common system-level tasks, but this book
doesn’t have space to provide an exhaustive list of the contents of
modules we will meet along the way. Again, if you have not already done
so, you should become acquainted with the contents of modules such as
os and sys using the resources described earlier. For now, let’s move
on to explore additional system tools in the context of broader system
programming concepts: the context surrounding a running script.

subprocess, os.popen, and Iterators

In Chapter 4, we’ll explore file iterators, but you’ve probably already
studied the basics prior to picking up this book. Because os.popen
objects have an iterator that reads one line at a time, their readlines
method call is usually superfluous. For example, the following steps
through lines produced by another program without any explicit reads:

>>> import os
>>> for line in os.popen('dir /B *.py'): print(line, end='')
...
helloshell.py
more.py
__init__.py

Interestingly, Python 3.1 implements os.popen using the
subprocess.Popen object that we studied in this chapter. You can see
this for yourself in file os.py in the Python standard library on your
machine (see C:\Python31\Lib on Windows); the os.popen result is an
object that manages the Popen object and its piped stream:

>>> I = os.popen('dir /B *.py')
>>> I

Because this pipe wrapper object defines an __iter__ method, it
supports line iteration, both automatic (e.g., the for loop above) and
manual. Curiously, although the pipe wrapper object supports direct
__next__ method calls as though it were its own iterator (just like
simple files), it does not support the next built-in function, even
though the latter is supposed to simply call the former:

>>> I = os.popen('dir /B *.py')
>>> I.__next__()
'helloshell.py\n'
>>> I = os.popen('dir /B *.py')
>>> next(I)
TypeError: _wrap_close object is not an iterator

The reason for this is subtle: direct __next__ calls are intercepted by
a __getattr__ defined in the pipe wrapper object, and are properly
delegated to the wrapped object; but next function calls invoke
Python’s operator overloading machinery, which in 3.X bypasses the
wrapper’s __getattr__ for special method names like __next__. Since the
pipe wrapper object doesn’t define a __next__ of its own, the call is
not caught and delegated, and the next built-in fails. As explained in
full in the book Learning Python, the wrapper’s __getattr__ isn’t tried
because 3.X begins such searches at the class, not the instance.
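
The same effect can be reproduced without any pipes. The following is a minimal sketch using a made-up Wrapper class; it is not the standard library's actual pipe wrapper, only a stand-in with the same shape (a __getattr__ that delegates, an __iter__ of its own, and no __next__).

```python
class Wrapper:
    """Hypothetical delegating wrapper, for illustration only."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Runs only for attribute names not found by normal lookup.
        return getattr(self._wrapped, name)

    def __iter__(self):
        # Hand back the wrapped object's real iterator.
        return iter(self._wrapped)


w = Wrapper(iter(['a\n', 'b\n']))

explicit = w.__next__()        # found through __getattr__, so delegated

try:
    next(w)                    # looks for __next__ on the class: not there
    builtin_worked = True
except TypeError:
    builtin_worked = False

real_iterator = iter(w)        # Wrapper.__iter__ returns the wrapped iterator
after_iter = next(real_iterator)
```

The explicit call succeeds because it is an ordinary attribute fetch on the instance, while the next built-in skips the instance and inspects the class.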

This behavior may or may not have been anticipated, and you don’t need
to care if you iterate over pipe lines automatically with for loops,
comprehensions, and other tools. To code manual iterations robustly,
though, be sure to call the iter built-in first. This invokes the
__iter__ defined in the pipe wrapper object itself, to correctly
support both flavors of advancement:

>>> I = os.popen('dir /B *.py')
>>> I = iter(I)                            # what for loops do
>>> I.__next__()                           # now both forms work
'helloshell.py\n'
>>> next(I)
'more.py\n'
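
For readers not on Windows, here is a rough equivalent as a script. It assumes only that the system shell understands echo and &&, which holds for both the Windows command prompt and common Unix shells; the command itself is a made-up stand-in for dir /B *.py.

```python
import os

# Assumed command: any program that prints lines would do.
pipe = os.popen('echo one && echo two')

lines = iter(pipe)            # what for loops do implicitly
first = next(lines)           # the built-in works on the real iterator
second = lines.__next__()     # and so does the direct method call
pipe.close()

# Strip line ends and any padding the shell adds around echoed text.
first, second = first.strip(), second.strip()
```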

[6] The Python code exec(open(file).read()) also runs a program file’s
code, but within the same process that called it. It’s similar to an
import in that regard, but it works more as if the file’s text had been
pasted into the calling program at the place where the exec call
appears (unless explicit global or local namespace dictionaries are
passed). Unlike imports, such an exec unconditionally reads and
executes a file’s code (it may be run more than once per process), no
module object is generated by the file’s execution, and unless optional
namespace dictionaries are passed in, assignments in the file’s code
may overwrite variables in the scope where the exec appears; see other
resources or the Python library manual for more details.

Chapter 3. Script Execution Context

“I’d Like to Have an Argument, Please”

Python scripts don’t run in a vacuum (despite what you may have
heard). Depending on platforms and startup procedures, Python programs may
have all sorts of enclosing context—information automatically passed in to
the program by the operating system when the program starts up. For
instance, scripts have access to the following sorts of system-level
inputs and interfaces:

Current working directory: os.getcwd gives access to the directory from
which a script is started, and many file tools use its value implicitly.
Command-line arguments: sys.argv gives access to words typed on the
command line that are used to start the program and that serve as
script inputs.
Shell variables: os.environ provides an interface to names assigned in
the enclosing shell (or a parent program) and passed in to the script.
Standard streams: sys.stdin, stdout, and stderr export the three
input/output streams that are at the heart of command-line shell tools,
and can be leveraged by scripts with print options, the os.popen call
and subprocess module introduced in Chapter 2, the io.StringIO class,
and more.
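
As a quick orientation, the four kinds of context in this list can be gathered in a few lines. This is a minimal sketch; the function name script_context and the dictionary keys are invented for illustration.

```python
import io
import os
import sys

def script_context():
    """Collect the four kinds of context described in the list above."""
    return {
        'cwd': os.getcwd(),                          # current working directory
        'argv': list(sys.argv),                      # command-line arguments
        'pythonpath': os.environ.get('PYTHONPATH'),  # one shell variable, or None
        'streams': (sys.stdin, sys.stdout, sys.stderr),
    }

context = script_context()

# print can be pointed at any file-like object, including io.StringIO.
buffer = io.StringIO()
print(context['cwd'], file=buffer)
captured = buffer.getvalue()
```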

Such tools can serve as inputs to scripts, configuration parameters,
and so on. In this chapter, we will explore all four of these context
tools, covering both their Python interfaces and their typical roles.

Current Working Directory

The notion of the current working directory (CWD) turns out to be a key
concept in some scripts’ execution: it’s always the implicit place where
files processed by the script are assumed to reside unless their names
have absolute directory paths. As we saw earlier, os.getcwd lets a
script fetch the CWD name explicitly, and os.chdir allows a script to
move to a new CWD.
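
A minimal sketch of both calls, assuming a temporary directory as a stand-in for "some other place" and an invented file name, note.txt:

```python
import os
import tempfile

start = os.getcwd()                   # fetch the CWD name explicitly
scratch = tempfile.mkdtemp()          # stand-in for some other directory

os.chdir(scratch)                     # move to a new CWD
with open('note.txt', 'w') as f:      # relative name, so it maps to the CWD
    f.write('created relative to the CWD')

landed_in_scratch = os.path.exists(os.path.join(scratch, 'note.txt'))

os.chdir(start)                       # go back to where we started
```

The file lands in the scratch directory even though the open call never names that directory.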

Keep in mind, though, that filenames without full pathnames map to the
CWD and have nothing to do with your PYTHONPATH setting. Technically, a
script is always launched from the CWD, not the directory containing
the script file. Conversely, imports always first search the directory
containing the script, not the CWD (unless the script happens to also
be located in the CWD). Since this distinction is subtle and tends to
trip up beginners, let’s explore it in a bit more detail.

CWD, Files, and Import Paths

When you run a Python script by typing a shell command line such as
python dir1\dir2\file.py, the CWD is the directory you were in when you
typed this command, not dir1\dir2. On the other hand, Python
automatically adds the identity of the script’s home directory to the
front of the module search path such that file.py can always import
other files in dir1\dir2 no matter where it is run from. To illustrate,
let’s write a simple script to echo both its CWD and its module search
path:

C:\...\PP4E\System> type whereami.py
import os, sys
print('my os.getcwd =>', os.getcwd())           # show my cwd execution dir
print('my sys.path  =>', sys.path[:6])          # show first 6 import paths
input()                                         # wait for keypress if clicked

Now, running this script in the directory in which it resides sets the
CWD as expected and adds it to the front of the module import search
path. We met the sys.path module search path earlier; its first entry
might also be the empty string to designate CWD when you’re working
interactively, and most of the CWD has been truncated to “...” here for
display:

C:\...\PP4E\System> set PYTHONPATH=C:\PP4thEd\Examples
C:\...\PP4E\System> python whereami.py
my os.getcwd => C:\...\PP4E\System
my sys.path  => ['C:\\...\\PP4E\\System', 'C:\\PP4thEd\\Examples', ...more...]

But if we run this script from other places, the CWD moves with us
(it’s the directory where we type commands), and Python adds a directory
to the front of the module search path that allows the script to still
see files in its own home directory. For instance, when running from one
level up (..), the System name added to the front of sys.path will be
the first directory that Python searches for imports within
whereami.py; it points imports back to the directory containing the
script that was run. Filenames without complete paths, though, will be
mapped to the CWD (C:\PP4thEd\Examples\PP4E), not the System
subdirectory nested there:

C:\...\PP4E\System> cd ..
C:\...\PP4E> python System\whereami.py
my os.getcwd => C:\...\PP4E
my sys.path  => ['C:\\...\\PP4E\\System', 'C:\\PP4thEd\\Examples', ...more...]

C:\...\PP4E> cd System\temp
C:\...\PP4E\System\temp> python ..\whereami.py
my os.getcwd => C:\...\PP4E\System\temp
my sys.path  => ['C:\\...\\PP4E\\System', 'C:\\PP4thEd\\Examples', ...]

The net effect is that filenames without directory paths in a script
will be mapped to the place where the command was typed (os.getcwd),
but imports still have access to the directory of the script being run
(via the front of sys.path). Finally, when a file is launched by
clicking its icon, the CWD is just the directory that contains the
clicked file. The following output, for example, appears in a new DOS
console box when whereami.py is double-clicked in Windows Explorer:

my os.getcwd => C:\...\PP4E\System
my sys.path  => ['C:\\...\\PP4E\\System', ...more...]

In this case, both the CWD used for filenames and the first import
search directory are the directory containing the script file. This all
usually works out just as you expect, but there are two pitfalls to
avoid:

Filenames might need to include complete directory paths if scripts
cannot be sure from where they will be run.
Command-line scripts cannot always rely on the CWD to gain import
visibility to files that are not in their own directories; instead, use
PYTHONPATH settings and package import paths to access modules in other
directories.
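
One common way to sidestep the first pitfall is to build complete paths from the script's own location rather than from the CWD. The following is a sketch of that idea, not a technique taken from this chapter; the helper name beside_script and the file name settings.txt are invented for illustration.

```python
import os

def beside_script(script_path, filename):
    """Return a complete path to filename in script_path's directory."""
    home = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(home, filename)

# In a real script you would typically pass __file__ here; a made-up
# relative script path stands in for it in this sketch.
config_path = beside_script(os.path.join('System', 'whereami.py'),
                            'settings.txt')
```

Because os.path.abspath resolves relative names against the CWD at the time of the call, a helper like this is best called before the script changes directories.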

For example, scripts in this book, regardless of how they are run, can
always import other files in their own home directories without package
path imports (import filehere), but must go through the PP4E package
root to find files anywhere else in the examples tree (from
PP4E.dir1.dir2 import filethere), even if they are run from the
directory containing the desired external module. As usual for modules,
the PP4E\dir1\dir2 directory name could also be added to PYTHONPATH to
make files there visible everywhere without package path imports (though
adding more directories to PYTHONPATH increases the likelihood of name
clashes). In either case, though, imports are always resolved to the
script’s home directory or other Python search path settings, not to
the CWD.
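
To see that imports follow the search path rather than the CWD, consider this minimal sketch. It assumes two temporary directories and an invented module name, filethere_demo; inserting a directory into sys.path at run time is used here as a stand-in for listing it in PYTHONPATH before startup.

```python
import importlib
import os
import sys
import tempfile

libdir = tempfile.mkdtemp()            # stand-in for a directory of modules
with open(os.path.join(libdir, 'filethere_demo.py'), 'w') as f:
    f.write('answer = 42\n')           # hypothetical module contents

elsewhere = tempfile.mkdtemp()         # a CWD that does not hold the module
start = os.getcwd()

sys.path.insert(0, libdir)             # make libdir part of the search path
os.chdir(elsewhere)
try:
    importlib.invalidate_caches()      # notice the file we just created
    module = importlib.import_module('filethere_demo')
finally:
    os.chdir(start)                    # restore the CWD and the search path
    sys.path.remove(libdir)
```

The import succeeds even though the CWD at import time is an unrelated, empty directory.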

CWD and Command Lines

This distinction
between the CWD and import search paths explains why many
scripts in this book designed to operate in the current working
directory (instead of one whose name is passed in) are run with command
lines such as this one:

C:\temp> python C:\...\PP4E\Tools\cleanpyc.py                 process cwd

In this example, the Python script file itself lives in the directory
C:\...\PP4E\Tools, but because it is run from C:\temp, it processes the
files located in C:\temp (i.e., in the CWD, not in the script’s home
directory). To process files elsewhere with such a script, simply cd to
the directory to be processed to change the CWD:

C:\temp> cd C:\PP4thEd\Examples
C:\PP4thEd\Examples> python C:\...\PP4E\Tools\cleanpyc.py     process cwd

Because the CWD is always implied, a cd command tells the script which
directory to process in no less certain terms than passing a directory
name to the script explicitly, like this (portability note: you may
need to add quotes around the *.py in this and other command-line
examples to prevent it from being expanded in some Unix shells):

C:\...\PP4E\Tools> python find.py *.py C:\temp                process named dir

In this command line, the CWD is the directory containing the script to
be run (notice that the script filename has no directory path prefix);
but since this script processes a directory named explicitly on the
command line (C:\temp), the CWD is irrelevant. Finally, if we want to
run such a script located in some other directory in order to process
files located in yet another directory, we can simply give directory
paths to both:

C:\temp> python C:\...\PP4E\Tools\find.py *.cxx C:\PP4thEd\Examples\PP4E

Here, the script has import visibility to files in its PP4E\Tools home
directory and processes files in the directory named on the command
line, but the CWD is something else entirely (C:\temp). This last form
is more to type, of course, but watch for a variety of CWD and explicit
script-path command lines like these in this book.
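
A script can support both styles, processing the CWD by default and a named directory when one is given. The sketch below shows one way to do this; it is not how cleanpyc.py or find.py are coded, and the function name pick_target and the sample argument lists are made up for illustration.

```python
import os

def pick_target(argv):
    """Return the directory to process: argv[1] if given, else the CWD."""
    return argv[1] if len(argv) > 1 else os.getcwd()

# Simulated command lines; a real script would pass sys.argv instead.
cwd_choice = pick_target(['script.py'])
named_choice = pick_target(['script.py', os.path.join('some', 'dir')])
```

With no extra argument the CWD is implied, so a cd before the command selects the directory; with an argument, the CWD no longer matters.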
