Read Programming Python Online

Authors: Mark Lutz

Other os Module Exports

That’s as much of a tour around os as we have space for here. Since most
other os module tools are even more difficult to appreciate outside the
context of larger application topics, we’ll postpone a deeper look at
them until later chapters. But to let you sample the flavor of this
module, here is a quick preview for reference. Among the os module’s
other weapons are these:

os.environ

Fetches and sets shell environment variables

os.fork

Spawns a new child process on Unix-like systems

os.pipe

Communicates between programs

os.execlp

Starts new programs

os.spawnv

Starts new programs with lower-level control

os.open

Opens a low-level descriptor-based file

os.mkdir

Creates a new directory

os.mkfifo

Creates a new named pipe

os.stat

Fetches low-level file information

os.remove

Deletes a file by its pathname

os.walk

Applies a function or loop body to all parts of an entire directory tree

And so on. One caution up front: the os module provides a set of file
open, read, and write calls, but all of these deal with low-level file
access and are entirely distinct from Python’s built-in stdio file
objects that we create with the built-in open function. You should
normally use the built-in open function, not the os module, for all but
very special file-processing needs (e.g., opening with exclusive access
file locking).
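To make the distinction concrete, here is a minimal sketch that contrasts the two routes. The scratch file name spam.txt is hypothetical, and the O_EXCL exclusive-creation flag is used only as one illustration of a special need; it is not necessarily the kind of locking the text has in mind.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'spam.txt')

# Low-level route: os.open returns an integer descriptor, not a file object.
# With O_CREAT, the O_EXCL flag makes the call fail if the file already exists.
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
os.write(fd, b'hello\n')           # descriptor calls deal in bytes
os.close(fd)

# Usual route: the built-in open returns a file object with text handling.
with open(path) as f:
    text = f.read()

# os.fdopen wraps a descriptor in a file object when both levels are needed.
fd = os.open(path, os.O_RDONLY)
with os.fdopen(fd) as f:
    wrapped_text = f.read()

print(repr(text), repr(wrapped_text))
```

Unless you need descriptor-level flags like these, the built-in open is normally the simpler choice.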

In the next chapter we will apply sys and os tools such as those we’ve
introduced here to implement common system-level tasks, but this book
doesn’t have space to provide an exhaustive list of the contents of
modules we will meet along the way. Again, if you have not already done
so, you should become acquainted with the contents of modules such as os
and sys using the resources described earlier. For now, let’s move on to
explore additional system tools in the context of broader system
programming concepts: the context surrounding a running script.

subprocess, os.popen, and Iterators

In Chapter 4, we’ll explore file iterators, but you’ve probably already
studied the basics prior to picking up this book. Because os.popen
objects have an iterator that reads one line at a time, their readlines
method call is usually superfluous. For example, the following steps
through lines produced by another program without any explicit reads:

>>> import os
>>> for line in os.popen('dir /B *.py'): print(line, end='')
...
helloshell.py
more.py
__init__.py

Interestingly, Python 3.1 implements os.popen using the subprocess.Popen
object that we studied in this chapter. You can see this for yourself in
file os.py in the Python standard library on your machine (see
C:\Python31\Lib on Windows); the os.popen result is an object that
manages the Popen object and its piped stream:

>>> I = os.popen('dir /B *.py')
>>> I

Because this pipe wrapper object defines an __iter__ method, it supports
line iteration, both automatic (e.g., the for loop above) and manual.
Curiously, although the pipe wrapper object supports direct __next__
method calls as though it were its own iterator (just like simple
files), it does not support the next built-in function, even though the
latter is supposed to simply call the former:

>>> I = os.popen('dir /B *.py')
>>> I.__next__()
'helloshell.py\n'
>>> I = os.popen('dir /B *.py')
>>> next(I)
TypeError: _wrap_close object is not an iterator

The reason for this is subtle: direct __next__ calls are intercepted by
a __getattr__ defined in the pipe wrapper object, and are properly
delegated to the wrapped object; but next function calls invoke Python’s
operator overloading machinery, which in 3.X bypasses the wrapper’s
__getattr__ for special method names like __next__. Since the pipe
wrapper object doesn’t define a __next__ of its own, the call is not
caught and delegated, and the next built-in fails. As explained in full
in the book Learning Python, the wrapper’s __getattr__ isn’t tried
because 3.X begins such searches at the class, not the instance.
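The same effect can be reproduced without any pipes at all. The following is a small sketch using a hypothetical Wrapper class that stands in for the pipe wrapper object; it is not the code in os.py, only an illustration of the lookup rule just described.

```python
class Wrapper:
    """Hypothetical stand-in for the pipe wrapper: delegates via __getattr__."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Runs only for names that normal attribute lookup does not find
        return getattr(self._wrapped, name)

    def __iter__(self):
        # Defined in the class itself, so the iter built-in can find it
        return iter(self._wrapped)


lines = iter(['helloshell.py\n', 'more.py\n'])
w = Wrapper(lines)

# Explicit attribute fetch: not found on w, so routed through __getattr__
first = w.__next__()

# Built-in call: looks for __next__ in the class only and skips __getattr__
try:
    next(w)
    builtin_failed = False
except TypeError:
    builtin_failed = True

# Calling iter first returns the wrapped iterator, which next accepts
second = next(iter(w))

print(repr(first), builtin_failed, repr(second))
```

The explicit __next__ call and the call made after iter both succeed, while the bare next call on the wrapper raises TypeError.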

This behavior may or may not have been anticipated, and you don’t need
to care if you iterate over pipe lines automatically with for loops,
comprehensions, and other tools. To code manual iterations robustly,
though, be sure to call the iter built-in first; this invokes the
__iter__ defined in the pipe wrapper object itself, to correctly support
both flavors of advancement:

>>> I = os.popen('dir /B *.py')
>>> I = iter(I)                  # what for loops do
>>> I.__next__()                 # now both forms work
'helloshell.py\n'
>>> next(I)
'more.py\n'

[6] The Python code exec(open(file).read()) also runs a program file’s
code, but within the same process that called it. It’s similar to an
import in that regard, but it works more as if the file’s text had been
pasted into the calling program at the place where the exec call appears
(unless explicit global or local namespace dictionaries are passed).
Unlike imports, such an exec unconditionally reads and executes a file’s
code (it may be run more than once per process), no module object is
generated by the file’s execution, and unless optional namespace
dictionaries are passed in, assignments in the file’s code may overwrite
variables in the scope where the exec appears; see other resources or
the Python library manual for more details.
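As a rough illustration of this footnote, the sketch below runs a tiny file twice with exec and passes an explicit namespace dictionary so that the file’s assignments stay out of the caller’s scope. The file name script.py and the variable name count are hypothetical.

```python
import os
import tempfile

# Hypothetical one-line program file created just for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'script.py')
with open(path, 'w') as f:
    f.write('count = count + 1\n')

# An explicit namespace keeps the file's assignments out of our own scope.
namespace = {'count': 0}

# Unlike an import, each exec reads and runs the file's code again.
for _ in range(2):
    with open(path) as f:
        exec(f.read(), namespace)

result = namespace['count']
print(result)
```

Because the file’s code ran on both passes, the counter in the namespace dictionary was incremented twice.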

Chapter 3. Script Execution Context

“I’d Like to Have an Argument, Please”

Python scripts don’t run in a vacuum (despite what you may have
heard). Depending on platforms and startup procedures, Python programs may
have all sorts of enclosing context—information automatically passed in to
the program by the operating system when the program starts up. For
instance, scripts have access to the following sorts of system-level
inputs and interfaces:

Current working directory

os.getcwd gives access to the directory from which a script is started,
and many file tools use its value implicitly.

Command-line arguments

sys.argv gives access to words typed on the command line that are used
to start the program and that serve as script inputs.

Shell variables

os.environ provides an interface to names assigned in the enclosing
shell (or a parent program) and passed in to the script.

Standard streams

sys.stdin, stdout, and stderr export the three input/output streams that
are at the heart of command-line shell tools, and can be leveraged by
scripts with print options, the os.popen call and subprocess module
introduced in Chapter 2, the io.StringIO class, and more.

Such tools can serve as inputs to scripts, configuration parameters,
and so on. In this chapter, we will explore the tools for all four of
these contexts, both their Python interfaces and their typical roles.
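As a quick preview, the four kinds of context listed above can be gathered in a few lines. This is only a sketch; the execution_context function and its dictionary keys are names invented here for illustration.

```python
import os
import sys


def execution_context():
    """Collect the four kinds of context in one dictionary (illustrative)."""
    return {
        'cwd': os.getcwd(),                              # current working directory
        'argv': list(sys.argv),                          # command-line arguments
        'environ': dict(os.environ),                     # shell variables
        'streams': (sys.stdin, sys.stdout, sys.stderr),  # standard streams
    }


context = execution_context()
print('cwd  =>', context['cwd'])
print('argv =>', context['argv'])
print('vars =>', len(context['environ']), 'shell variables')
```

The values printed depend entirely on where and how the script is started, which is the point of this chapter.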

Current Working Directory

The notion of the current working directory (CWD) turns out to be a key
concept in some scripts’ execution: it’s always the implicit place where
files processed by the script are assumed to reside unless their names
have absolute directory paths. As we saw earlier, os.getcwd lets a
script fetch the CWD name explicitly, and os.chdir allows a script to
move to a new CWD.
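A short sketch of these two calls at work follows. It assumes a scratch directory made with tempfile and a hypothetical file name, data.txt, and shows that a filename without a directory path lands in whatever the CWD is at the time.

```python
import os
import tempfile

start = os.getcwd()              # remember where we began
scratch = tempfile.mkdtemp()     # hypothetical scratch directory

os.chdir(scratch)                # move to a new CWD
try:
    # No directory path in the name, so the file is created in the CWD
    with open('data.txt', 'w') as f:
        f.write('spam')
    created_in_cwd = os.path.exists(os.path.join(scratch, 'data.txt'))
finally:
    os.chdir(start)              # restore the original CWD

print(created_in_cwd)
```

Restoring the original CWD in a finally clause is a design choice made here so that the rest of the program is unaffected by the temporary move.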

Keep in mind, though, that filenames without full pathnames map to the
CWD and have nothing to do with your PYTHONPATH setting. Technically, a
script is always launched from the CWD, not the directory containing the
script file. Conversely, imports always first search the directory
containing the script, not the CWD (unless the script happens to also be
located in the CWD). Since this distinction is subtle and tends to trip
up beginners, let’s explore it in a bit more detail.

CWD, Files, and Import Paths

When you run a Python script by typing a shell command line such as
python dir1\dir2\file.py, the CWD is the directory you were in when you
typed this command, not dir1\dir2. On the other hand, Python
automatically adds the identity of the script’s home directory to the
front of the module search path such that file.py can always import
other files in dir1\dir2 no matter where it is run from. To illustrate,
let’s write a simple script to echo both its CWD and its module search
path:

C:\...\PP4E\System> type whereami.py
import os, sys
print('my os.getcwd =>', os.getcwd()) # show my cwd execution dir
print('my sys.path =>', sys.path[:6]) # show first 6 import paths
input() # wait for keypress if clicked

Now, running this script in the directory in which it resides sets the
CWD as expected and adds it to the front of the module import search
path. We met the sys.path module search path earlier; its first entry
might also be the empty string to designate CWD when you’re working
interactively, and most of the CWD has been truncated to “...” here for
display:

C:\...\PP4E\System> set PYTHONPATH=C:\PP4thEd\Examples
C:\...\PP4E\System> python whereami.py
my os.getcwd => C:\...\PP4E\System
my sys.path => ['C:\\...\\PP4E\\System', 'C:\\PP4thEd\\Examples', ...more...]

But if we run this script from other places, the CWD moves with us (it’s
the directory where we type commands), and Python adds a directory to
the front of the module search path that allows the script to still see
files in its own home directory. For instance, when running from one
level up (..), the System name added to the front of sys.path will be
the first directory that Python searches for imports within whereami.py;
it points imports back to the directory containing the script that was
run. Filenames without complete paths, though, will be mapped to the CWD
(C:\PP4thEd\Examples\PP4E), not the System subdirectory nested there:

C:\...\PP4E\System> cd ..
C:\...\PP4E> python System\whereami.py
my os.getcwd => C:\...\PP4E
my sys.path => ['C:\\...\\PP4E\\System', 'C:\\PP4thEd\\Examples', ...more...]
C:\...\PP4E> cd System\temp
C:\...\PP4E\System\temp> python ..\whereami.py
my os.getcwd => C:\...\PP4E\System\temp
my sys.path => ['C:\\...\\PP4E\\System', 'C:\\PP4thEd\\Examples', ...]

The net effect is that filenames without directory paths in a script
will be mapped to the place where the command was typed (os.getcwd), but
imports still have access to the directory of the script being run (via
the front of sys.path). Finally, when a file is launched by clicking its
icon, the CWD is just the directory that contains the clicked file. The
following output, for example, appears in a new DOS console box when
whereami.py is double-clicked in Windows Explorer:

my os.getcwd => C:\...\PP4E\System
my sys.path => ['C:\\...\\PP4E\\System', ...more...]

In this case, both the CWD used for filenames and the first import
search directory are the directory containing the script file. This all
usually works out just as you expect, but there are two pitfalls to
avoid:

  • Filenames might need to include complete directory paths if
    scripts cannot be sure from where they will be run.

  • Command-line scripts cannot always rely on the CWD to gain
    import visibility to files that are not in their own directories;
    instead, use PYTHONPATH settings and package import paths to access
    modules in other directories.
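One common way to sidestep the first pitfall is to build complete paths from the location of the script itself rather than from the CWD. The following is a minimal sketch of that idea; the helper names script_home and beside_script are hypothetical, and the use of sys.argv[0] assumes the script was started from a command line that named its file.

```python
import os
import sys


def script_home(script_path):
    """Return the absolute directory that contains script_path."""
    return os.path.dirname(os.path.abspath(script_path))


def beside_script(filename, script_path=None):
    """Build a full path to filename in the script's own directory."""
    if script_path is None:
        script_path = sys.argv[0]    # path used to launch the script
    return os.path.join(script_home(script_path), filename)


# Illustration: a script launched as System\whereami.py from one level up
example = beside_script('data.txt', os.path.join('System', 'whereami.py'))
print(example)
```

A path built this way names the same file no matter which directory the command is typed in.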

For example, scripts in this book, regardless of how they are run, can
always import other files in their own home directories without package
path imports (import filehere), but must go through the PP4E package
root to find files anywhere else in the examples tree (from
PP4E.dir1.dir2 import filethere), even if they are run from the
directory containing the desired external module. As usual for modules,
the PP4E\dir1\dir2 directory name could also be added to PYTHONPATH to
make files there visible everywhere without package path imports (though
adding more directories to PYTHONPATH increases the likelihood of name
clashes). In either case, though, imports are always resolved to the
script’s home directory or other Python search path settings, not to the
CWD.
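The following sketch illustrates the last point under some assumptions: two temporary directories stand in for a module’s home and for the CWD, the module name filethere_demo is hypothetical, and sys.path is changed at runtime in place of a PYTHONPATH setting (both end up as entries on the same search path).

```python
import importlib
import os
import sys
import tempfile

home = tempfile.mkdtemp()         # stands in for a directory such as PP4E\dir1\dir2
elsewhere = tempfile.mkdtemp()    # stands in for the CWD

# Hypothetical module file that lives in the home directory only
with open(os.path.join(home, 'filethere_demo.py'), 'w') as f:
    f.write("message = 'found by search path'\n")

start = os.getcwd()
os.chdir(elsewhere)               # the CWD does not contain the module
try:
    sys.path.insert(0, home)      # runtime analog of a PYTHONPATH entry
    importlib.invalidate_caches()
    module = importlib.import_module('filethere_demo')
finally:
    sys.path.remove(home)
    os.chdir(start)

print(module.message)
```

The import succeeds because the module’s directory is on the search path, not because of where the CWD happens to be.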

CWD and Command Lines

This distinction
between the CWD and import search paths explains why many
scripts in this book designed to operate in the current working
directory (instead of one whose name is passed in) are run with command
lines such as this one:

C:\temp> python C:\...\PP4E\Tools\cleanpyc.py            process cwd

In this example, the Python script file itself lives in the directory
C:\...\PP4E\Tools, but because it is run from C:\temp, it processes the
files located in C:\temp (i.e., in the CWD, not in the script’s home
directory). To process files elsewhere with such a script, simply cd to
the directory to be processed to change the CWD:

C:\temp> cd C:\PP4thEd\Examples
C:\PP4thEd\Examples> python C:\...\PP4E\Tools\cleanpyc.py            process cwd

Because the CWD is always implied, a cd command tells the script which
directory to process in no less certain terms than passing a directory
name to the script explicitly, like this (portability note: you may need
to add quotes around the *.py in this and other command-line examples to
prevent it from being expanded in some Unix shells):

C:\...\PP4E\Tools> python find.py *.py C:\temp            process named dir

In this command line, the CWD is the directory containing the script to
be run (notice that the script filename has no directory path prefix);
but since this script processes a directory named explicitly on the
command line (C:\temp), the CWD is irrelevant. Finally, if we want to
run such a script located in some other directory in order to process
files located in yet another directory, we can simply give directory
paths to both:

C:\temp> python C:\...\PP4E\Tools\find.py *.cxx C:\PP4thEd\Examples\PP4E

Here, the script has import visibility to files in its PP4E\Tools home
directory and processes files in the directory named on the command
line, but the CWD is something else entirely (C:\temp). This last form
is more to type, of course, but watch for a variety of CWD and explicit
script-path command lines like these in this book.
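Both command-line styles can be supported by one script with a small amount of logic. The sketch below is not the code of cleanpyc.py or find.py; the function names target_directory and list_python_files are hypothetical, and it simply assumes that an optional first argument names the directory to process.

```python
import os


def target_directory(argv):
    """Return the directory named in argv[1] if present, else the CWD."""
    return argv[1] if len(argv) > 1 else os.getcwd()


def list_python_files(dirname):
    """Return the sorted names of .py files directly inside dirname."""
    return sorted(name for name in os.listdir(dirname) if name.endswith('.py'))


# No directory argument: the script falls back on the CWD
print(target_directory(['cleanpyc.py']))

# Directory named explicitly: the CWD is irrelevant
print(target_directory(['find.py', os.curdir]))
```

In a real script, the list passed in would normally be sys.argv, and the chosen directory would then be handed to a function such as list_python_files.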
