Mistakes happen. As we’ve seen, Python provides interfaces to a variety of
system services, along with tools for adding others. Example 6-9 shows some
of the more commonly used system tools in action. It implements a simple
regression test system for Python scripts—it runs each script in a directory
of Python scripts with provided input and command-line arguments, and compares
the output of each run to the prior run’s results. As such, this script can be
used as an automated testing system to catch errors introduced by changes in
program source files; in a big system, you might not otherwise know when a fix
is really a bug in disguise.
Example 6-9. PP4E\System\Tester\tester.py
"""
################################################################################
Test a directory of Python scripts, passing command-line arguments,
piping in stdin, and capturing stdout, stderr, and exit status to
detect failures and regressions from prior run outputs. The subprocess
module spawns and controls streams (much like os.popen3 in Python 2.X),
and is cross-platform. Streams are always binary bytes in subprocess.
Test inputs, args, outputs, and errors map to files in subdirectories.

This is a command-line script, using command-line arguments for
optional test directory name, and force-generation flag. While we
could package it as a callable function, the fact that its results
are messages and output files makes a call/return model less useful.

Suggested enhancement: could be extended to allow multiple sets
of command-line arguments and/or inputs per test script, to run a
script multiple times (glob for multiple ".in*" files in Inputs?).
Might also seem simpler to store all test files in same directory
with different extensions, but this could grow large over time.
Could also save both stderr and stdout to Errors on failures, but
I prefer to have expected/actual output in Outputs on regressions.
################################################################################
"""
import os, sys, glob, time
from subprocess import Popen, PIPE
# configuration args
testdir = sys.argv[1] if len(sys.argv) > 1 else os.curdir
forcegen = len(sys.argv) > 2
print('Start tester:', time.asctime())
print('in', os.path.abspath(testdir))
def verbose(*args):
    print('-'*80)
    for arg in args: print(arg)

def quiet(*args): pass
trace = quiet
# glob scripts to be tested
testpatt = os.path.join(testdir, 'Scripts', '*.py')
testfiles = glob.glob(testpatt)
testfiles.sort()
trace(os.getcwd(), *testfiles)
numfail = 0
for testpath in testfiles:                         # run all tests in dir
    testname = os.path.basename(testpath)          # strip directory path

    # get input and args
    infile = testname.replace('.py', '.in')
    inpath = os.path.join(testdir, 'Inputs', infile)
    indata = open(inpath, 'rb').read() if os.path.exists(inpath) else b''

    argfile = testname.replace('.py', '.args')
    argpath = os.path.join(testdir, 'Args', argfile)
    argdata = open(argpath).read() if os.path.exists(argpath) else ''

    # locate output and error, scrub prior results
    outfile = testname.replace('.py', '.out')
    outpath = os.path.join(testdir, 'Outputs', outfile)
    outpathbad = outpath + '.bad'
    if os.path.exists(outpathbad): os.remove(outpathbad)

    errfile = testname.replace('.py', '.err')
    errpath = os.path.join(testdir, 'Errors', errfile)
    if os.path.exists(errpath): os.remove(errpath)

    # run test with redirected streams
    pypath  = sys.executable
    command = '%s %s %s' % (pypath, testpath, argdata)
    trace(command, indata)

    process = Popen(command, shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    process.stdin.write(indata)
    process.stdin.close()
    outdata = process.stdout.read()
    errdata = process.stderr.read()                # data are bytes
    exitstatus = process.wait()                    # requires binary files
    trace(outdata, errdata, exitstatus)

    # analyze results
    if exitstatus != 0:
        print('ERROR status:', testname, exitstatus)   # status and/or stderr
    if errdata:
        print('ERROR stream:', testname, errpath)      # save error text
        open(errpath, 'wb').write(errdata)

    if exitstatus or errdata:                          # consider both failure
        numfail += 1                                   # can get status+stderr
        open(outpathbad, 'wb').write(outdata)          # save output to view
    elif not os.path.exists(outpath) or forcegen:
        print('generating:', outpath)                  # create first output
        open(outpath, 'wb').write(outdata)
    else:
        priorout = open(outpath, 'rb').read()          # or compare to prior
        if priorout == outdata:
            print('passed:', testname)
        else:
            numfail += 1
            print('FAILED output:', testname, outpathbad)
            open(outpathbad, 'wb').write(outdata)

print('Finished:', time.asctime())
print('%s tests were run, %s tests failed.' % (len(testfiles), numfail))
We’ve seen the tools used by this script earlier in this part of the
book—subprocess, os.path, glob, files, and the like. This example largely
just pulls these tools together for a useful purpose. Its core operation is
comparing new outputs to old, in order to spot changes (“regressions”).
Along the way, it also manages command-line arguments, error messages,
status codes, and files.
This script is also larger than most we’ve seen so far, but it’s a
realistic and representative system administration tool (in fact, it’s
derived from a similar tool I actually used in the past to detect changes
in a compiler). Probably the best way to understand how it works is to
demonstrate what it does. The next section steps through a testing session
to be read in conjunction with studying the test script’s code.
Much of the magic behind the test driver script in
Example 6-9
has to do with its
directory structure. When you run it for the first time in a test
directory (or force it to start from scratch there by passing a second
command-line argument), it:
- Collects scripts to be run in the Scripts subdirectory
- Fetches any associated script input and command-line arguments from the Inputs and Args subdirectories
- Generates initial stdout output files for tests that exit normally in the Outputs subdirectory
- Reports tests that fail either by exit status code or by error messages appearing in stderr
On all failures, the script also saves any stderr error message text, as
well as any stdout data generated up to the point of failure; standard
error text is saved to a file in the Errors subdirectory, and standard
output of failed tests is saved with a special “.bad” filename extension
in Outputs (saving this normally in the Outputs subdirectory would trigger
a failure when the test is later fixed!). Here’s a first run:
C:\...\PP4E\System\Tester>python tester.py . 1
Start tester: Mon Feb 22 22:13:38 2010
in C:\Users\mark\Stuff\Books\4E\PP4E\dev\Examples\PP4E\System\Tester
generating: .\Outputs\test-basic-args.out
generating: .\Outputs\test-basic-stdout.out
generating: .\Outputs\test-basic-streams.out
generating: .\Outputs\test-basic-this.out
ERROR status: test-errors-runtime.py 1
ERROR stream: test-errors-runtime.py .\Errors\test-errors-runtime.err
ERROR status: test-errors-syntax.py 1
ERROR stream: test-errors-syntax.py .\Errors\test-errors-syntax.err
ERROR status: test-status-bad.py 42
generating: .\Outputs\test-status-good.out
Finished: Mon Feb 22 22:13:41 2010
8 tests were run, 3 tests failed.
To run each script, the tester configures any preset command-line
arguments provided, pipes in fetched canned input (if any), and captures
the script’s standard output and error streams, along with its exit
status code (a standalone sketch of this spawn-and-capture step appears
after the file listings below). When I ran this example, there were 8 test
scripts, along with a variety of inputs and outputs. Since the directory
and file naming structures are the key to this example, here is a listing
of the test directory I used—the Scripts directory is primary, because
that’s where tests to be run are collected:
C:\...\PP4E\System\Tester>dir /B
Args
Errors
Inputs
Outputs
Scripts
tester.py
xxold
C:\...\PP4E\System\Tester>dir /B Scripts
test-basic-args.py
test-basic-stdout.py
test-basic-streams.py
test-basic-this.py
test-errors-runtime.py
test-errors-syntax.py
test-status-bad.py
test-status-good.py
The other subdirectories contain any required inputs and any
generated outputs associated with scripts to be tested:
C:\...\PP4E\System\Tester>dir /B Args
test-basic-args.args
test-status-good.args
C:\...\PP4E\System\Tester>dir /B Inputs
test-basic-args.in
test-basic-streams.in
C:\...\PP4E\System\Tester>dir /B Outputs
test-basic-args.out
test-basic-stdout.out
test-basic-streams.out
test-basic-this.out
test-errors-runtime.out.bad
test-errors-syntax.out.bad
test-status-bad.out.bad
test-status-good.out
C:\...\PP4E\System\Tester>dir /B Errors
test-errors-runtime.err
test-errors-syntax.err
I won’t list all these files here (as you can see, there are many, and
all are available in the book examples distribution package), but to give
you the general flavor, here are the files associated with the test script
test-basic-args.py:
C:\...\PP4E\System\Tester>type Scripts\test-basic-args.py
# test args, streams
import sys, os
print(os.getcwd()) # to Outputs
print(sys.path[0])
print('[argv]')
for arg in sys.argv: # from Args
print(arg) # to Outputs
print('[interaction]') # to Outputs
text = input('Enter text:') # from Inputs
rept = sys.stdin.readline() # from Inputs
sys.stdout.write(text * int(rept)) # to Outputs
C:\...\PP4E\System\Tester>type Args\test-basic-args.args
-command -line --stuff
C:\...\PP4E\System\Tester>type Inputs\test-basic-args.in
Eggs
10
C:\...\PP4E\System\Tester>type Outputs\test-basic-args.out
C:\Users\mark\Stuff\Books\4E\PP4E\dev\Examples\PP4E\System\Tester
C:\Users\mark\Stuff\Books\4E\PP4E\dev\Examples\PP4E\System\Tester\Scripts
[argv]
.\Scripts\test-basic-args.py
-command
-line
--stuff
[interaction]
Enter text:EggsEggsEggsEggsEggsEggsEggsEggsEggsEggs
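Given the argument and input files just shown, the spawn-and-capture step the tester performs for this one script looks roughly like the following when written out in isolation—a sketch only, not part of tester.py, with the paths and canned data taken from the listings above:

import sys
from subprocess import Popen, PIPE

# run one test script with its canned args and stdin, capturing everything
command = '%s %s %s' % (sys.executable, r'.\Scripts\test-basic-args.py',
                        '-command -line --stuff')
process = Popen(command, shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
process.stdin.write(b'Eggs\n10\n')            # canned input, always bytes
process.stdin.close()
outdata    = process.stdout.read()            # captured stdout bytes
errdata    = process.stderr.read()            # captured stderr bytes
exitstatus = process.wait()                   # the script's exit status code
print(exitstatus, errdata)
print(outdata.decode())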
And here are two files related to one of the detected errors—the first
is its captured stderr, and the second is its stdout generated up to the
point where the error occurred; these are for human (or other tools’)
inspection, and are automatically removed the next time the tester script
runs:
C:\...\PP4E\System\Tester>type Errors\test-errors-runtime.err
Traceback (most recent call last):
File ".\Scripts\test-errors-runtime.py", line 3, in
print(1 / 0)
ZeroDivisionError: int division or modulo by zero
C:\...\PP4E\System\Tester>type Outputs\test-errors-runtime.out.bad
starting
Now, when run again without making any changes to the tests, the
test driver script compares saved prior outputs to new ones and detects
no regressions; failures designated by exit status and
stderr
messages are still reported as before,
but there are no deviations from other tests’ saved expected
output:
C:\...\PP4E\System\Tester>python tester.py
Start tester: Mon Feb 22 22:26:41 2010
in C:\Users\mark\Stuff\Books\4E\PP4E\dev\Examples\PP4E\System\Tester
passed: test-basic-args.py
passed: test-basic-stdout.py
passed: test-basic-streams.py
passed: test-basic-this.py
ERROR status: test-errors-runtime.py 1
ERROR stream: test-errors-runtime.py .\Errors\test-errors-runtime.err
ERROR status: test-errors-syntax.py 1
ERROR stream: test-errors-syntax.py .\Errors\test-errors-syntax.err
ERROR status: test-status-bad.py 42
passed: test-status-good.py
Finished: Mon Feb 22 22:26:43 2010
8 tests were run, 3 tests failed.
But when I make a change in one of the test scripts that will produce
different output (I changed a loop counter to print fewer lines), the
regression is caught and reported; the new and different output of the
script is reported as a failure, and saved in Outputs as a “.bad” file for
later viewing:
C:\...\PP4E\System\Tester>python tester.py
Start tester: Mon Feb 22 22:28:35 2010
in C:\Users\mark\Stuff\Books\4E\PP4E\dev\Examples\PP4E\System\Tester
passed: test-basic-args.py
FAILED output: test-basic-stdout.py .\Outputs\test-basic-stdout.out.bad
passed: test-basic-streams.py
passed: test-basic-this.py
ERROR status: test-errors-runtime.py 1
ERROR stream: test-errors-runtime.py .\Errors\test-errors-runtime.err
ERROR status: test-errors-syntax.py 1
ERROR stream: test-errors-syntax.py .\Errors\test-errors-syntax.err
ERROR status: test-status-bad.py 42
passed: test-status-good.py
Finished: Mon Feb 22 22:28:38 2010
8 tests were run, 4 tests failed.
C:\...\PP4E\System\Tester>type Outputs\test-basic-stdout.out.bad
begin
Spam!
Spam!Spam!
Spam!Spam!Spam!
Spam!Spam!Spam!Spam!
end
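The “.bad” file saved here is just the test’s new standard output, so any file-comparison tool can be used to see exactly what changed. As a small illustration (not part of tester.py), the standard library’s difflib module could display the regression as a unified diff:

import difflib

expected = open(r'.\Outputs\test-basic-stdout.out', 'rb').read().decode()
actual   = open(r'.\Outputs\test-basic-stdout.out.bad', 'rb').read().decode()
for line in difflib.unified_diff(expected.splitlines(), actual.splitlines(),
                                 'expected', 'actual', lineterm=''):
    print(line)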
One last usage note: if you change the trace variable in this script to
be verbose, you’ll get much more output designed to help you trace the
program’s operation (but probably too much for real testing runs):
C:\...\PP4E\System\Tester>tester.py
Start tester: Mon Feb 22 22:34:51 2010
in C:\Users\mark\Stuff\Books\4E\PP4E\dev\Examples\PP4E\System\Tester
--------------------------------------------------------------------------------
C:\Users\mark\Stuff\Books\4E\PP4E\dev\Examples\PP4E\System\Tester
.\Scripts\test-basic-args.py
.\Scripts\test-basic-stdout.py
.\Scripts\test-basic-streams.py
.\Scripts\test-basic-this.py
.\Scripts\test-errors-runtime.py
.\Scripts\test-errors-syntax.py
.\Scripts\test-status-bad.py
.\Scripts\test-status-good.py
--------------------------------------------------------------------------------
C:\Python31\python.exe .\Scripts\test-basic-args.py -command -line --stuff
b'Eggs\r\n10\r\n'
--------------------------------------------------------------------------------
b'C:\\Users\\mark\\Stuff\\Books\\4E\\PP4E\\dev\\Examples\\PP4E\\System\\Tester\r
\nC:\\Users\\mark\\Stuff\\Books\\4E\\PP4E\\dev\\Examples\\PP4E\\System\\Tester\\
Scripts\r\n[argv]\r\n.\\Scripts\\test-basic-args.py\r\n-command\r\n-line\r\n--st
uff\r\n[interaction]\r\nEnter text:EggsEggsEggsEggsEggsEggsEggsEggsEggsEggs'
b''
0
passed: test-basic-args.py
...more lines deleted...
Study the test driver’s code for more details. Naturally, there is much
more to the general testing story than we have space for here. For example,
in-process tests don’t need to spawn programs and can generally make do
with importing modules and testing them in try statements that catch
exceptions (a tiny sketch of this appears after the framework summaries
below). There is also ample room for expansion and customization in our
testing script (see its docstring for starters). Moreover, Python comes
with two testing frameworks, doctest and unittest (a.k.a. PyUnit), which
provide techniques and structures for coding regression and unit tests:
unittest
An object-oriented framework that specifies test cases, expected results,
and test suites. Subclasses provide test methods and use inherited
assertion calls to specify expected results.

doctest
Parses out and reruns tests from an interactive session log that is pasted
into a module’s docstrings. The logs give test calls and expected results;
doctest essentially reruns the interactive session.
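To give a sense of their flavor, here are two minimal sketches; the class, function, and expected values are invented purely for illustration, and each file runs its own tests when executed as a script:

# unittest sketch: a test case class with one test method and an inherited assertion
import unittest

class ConverterTest(unittest.TestCase):
    def test_upper(self):
        self.assertEqual('spam'.upper(), 'SPAM')      # expected result

if __name__ == '__main__':
    unittest.main()                                   # run test methods, report results

# doctest sketch: the docstring holds a pasted interactive session to be rerun
def double(x):
    """
    >>> double(2)
    4
    >>> double('-')
    '--'
    """
    return x * 2

if __name__ == '__main__':
    import doctest
    doctest.testmod()                                 # rerun the session, report mismatches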
See the Python library manual, the PyPI website, and your favorite
Web search engine for additional testing toolkits in both Python itself
and the third-party domain.
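As for the in-process alternative mentioned a moment ago, it can be as simple as importing the code to be tested and trapping exceptions directly—a hypothetical sketch, with the module and expected value invented here:

import mymodule                                       # hypothetical module under test

failures = 0
try:
    assert mymodule.convert('spam') == 'SPAM'         # expected result
except Exception as exc:
    failures += 1
    print('in-process test failed:', exc)
print('failures:', failures)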
For automated testing of Python command-line scripts that run as
independent programs and tap into standard script execution context,
though, our tester does the job. Because the test driver is fully
independent of the scripts it tests, we can drop in new test cases
without having to update the driver’s code. And because it is written in
Python, it’s quick and easy to change as our testing needs evolve. As
we’ll see again in the next section, this “scriptability” that Python
provides can be a decided advantage for real
tasks.
Testing Gone Bad?
Once we learn about sending email from Python scripts in
Chapter 13, you might also want to augment this script to automatically
send out email when regularly run tests fail (e.g., when run from a cron
job on Unix). That way, you don’t even need to remember to check
results. Of course, you could go further still.
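As for the email idea itself, Chapter 13 covers the details, but a rough, hypothetical sketch (the server name and addresses here are placeholders) might have the tester call something like this after its final report:

import smtplib

def mail_failures(numfail, total):
    # hypothetical helper: notify developers when regularly run tests fail
    if not numfail:
        return
    body = ('From: tester@example.com\r\n'
            'To: dev@example.com\r\n'
            'Subject: %s of %s tests failed\r\n'
            '\r\n'
            'See Outputs\\*.bad and Errors\\*.err for details.\r\n'
            % (numfail, total))
    server = smtplib.SMTP('localhost')                # assumes a reachable mail server
    server.sendmail('tester@example.com', ['dev@example.com'], body)
    server.quit()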
One company I worked for added sound effects to compiler test
scripts; you got an audible round of applause if no regressions were
found and an entirely different noise otherwise. (See
playfile.py
at the end of this chapter for
hints.)
Another company in my development past ran a nightly test script
that automatically isolated the source code file check-in that
triggered a test regression and sent a nasty email to the guilty party
(and his or her supervisor). Nobody expects the Spanish Inquisition!