Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man
Authors: Mark Changizi
Figure 1. (a) The brain was shaped by natural selection for nature, and culture was shaped by cultural selection for the brain. (b) By shaping culture to look like nature, culture will tend to end up shaped well for the brain. And, importantly, we scientists can hope to get a handle on this without having to understand the detailed brain mechanisms. The arrow cutting through the brain and going from culture to nature is meant to symbolize my nature-harnessing theoretical approach, which drives this book. It means that my theory will pretend there is a single arrow like this, where culture has been selected to be shaped like nature. This is a simplification of the more detailed picture in (a), and the greater simplicity is a boon to a scientist because the most complicated object in the universe—the brain—has been removed from the “equation.”
We have to be more careful, however, because brains optimized for nature can sometimes like nonnatural things as well. Our mechanisms have been selected for because they work very well on the inputs our ancestors would have experienced. When those natural stimuli are the input, our mechanisms work as they are supposed to—it’s their purpose (or “purp”).
But those same mechanisms don’t typically just sit quietly when nonnatural stimuli are inputted into them. They do something. And what they do depends entirely on the implementation details of the mechanism. Because the mechanism wasn’t designed to handle that kind of input, who knows what the mechanism might do? Mechanisms have quirks. For example, it is presumably a quirk that certain flashing lights have a propensity to induce seizure.
Brains were selected for their purps, but they end up with lots of quirks as well. When language and music culturally evolved to be structured for our brains, it didn’t matter whether it was the purps or the quirks that were harnessed, so long as the process worked. But if language and music actually came to harness our quirks more than our purps, then the strategy that culture uses would not be nature-harnessing so much as quirk-harnessing. And if that were the case, I wouldn’t have much of a book left! That is, in this book I am claiming that the principal strategy culture used to harness our brains for language and music is not quirk-harnessing, but purp-harnessing . . . and that culture did its purp-harnessing by mimicking nature, just the thing to ensure that our brain mechanisms run as “purposely” designed.
So, is harnessing about the purps or the quirks? Does culture harness the brain by looking and sounding like nature and thus making the brain function as intended, or does it harness the brain by shaping itself in a way that elicits the brain to function in some quirky accidental manner? Because, as I just said, cultural evolution doesn’t care what it harnesses so long as it works, purps and quirks are surely both part of the full story of how language and music fit themselves to us. There’s no reason, then, to expect that the quirks should completely dominate the story of harnessing. And if that’s the case, then there’s a role for the purps, and for nature-harnessing. Whew!
Actually, I can say more than just that nature-harnessing is unlikely to be completely useless for understanding harnessing. On the contrary, I expect nature-harnessing to be the key to how cultural evolution harnessed us, and quirks to be just a small side story. There are two reasons why I don’t think the quirks are the main driver. The first reason is that quirks are not smart enough, and the second reason is that I am not smart enough.
Stupid quirks first. If I were to open up the “V” of a stapler, hold one end in my hand, and try to hit you with the swinging end, then I would have created a hitting device (and lost a reader). I would thereby have harnessed the stapler for a new function. But I would have harnessed a quirk of the stapler, not a purp. Staplers are not designed to be weapons, or to be swung around like that, at all. They are, accordingly, unlikely to be any good at it; at best, they’ll be nowhere near as efficient as tools designed for hitting. My stapler hitting device is, in essence, the worst pair of nunchucks ever devised. You don’t get powerful functionality by accident. If, instead, I were to use the stapler to fasten a pile of leaves together, that would be a case where I have harnessed the purp. Staplers may not be for stapling leaves, but leaves clearly resemble paper (in the respects relevant for staplers), which is just what staplers were designed for. So, the first reason why quirk-harnessing will be a minimal part of the story of harnessing is that cultural selection will favor the bits of us that are highly engineered masterpieces, not accidental side effects.
Quirks may be stupid, but cultural evolution may sometimes tap into them anyway. After all, who hasn’t tried to remove a staple with a pen tip, or tried to bang a nail in with the handle of a screwdriver? And this leads naturally to the second problem with quirks, which is that I’m not smart enough to figure them out. First, there’s no general characterization of the quirks. A quirk occurs whenever the brain is confronted with a nonnatural stimulus, and although there may be a “hard core” for natural stimuli, there are no core ways of being unnatural. For example, pens can be used for stabbing, picking your teeth, scratching an itch, eyeliner, penny flicking, donut-hole making . . . clearly this list has no end. But the list of what pens are for is short: pens are for writing on paper.
And not only are there piles of quirky ways to use a mechanism, but there will typically be no simple characterization of how the mechanism will react in any specific case. Whereas the proper function of a pen can be activated by a mechanism characterized by a description something like this—“a hand holding the pen and lightly moving on the surface of the paper, leaving ink”—the mechanistic descriptions for different quirks will tend to differ wildly, and to refer to physical aspects of the pen that are not part of any description of writing. For example, good penny flicking depends on a pen’s rigidity being in the right range. And the pen I’m holding right now could serve as a container for sand, which depends on how the pen fits together so that it has room left over on the inside. These and many other peculiar characteristics of the mechanism aren’t relevant for understanding the proper function of the pen. And when it comes to the brain, we are woefully ignorant of its mechanisms, and so it is immensely difficult to determine which characteristics are central to its natural operation and which are not. The quirks are difficult to comprehend, but the purps are comparatively simple. I have a hope of wrapping my head around the fundamental core regularities found in nature and characterizing the brain’s likely response (the purps), but practically no hope of doing so for the quirks.
To sum up, there’s no reason to believe that harnessing is completely dominated by the quirks. On the contrary, because most quirks are not truly useful for anything, whereas focused usefulness is the very essence of purps, purps are far more likely to be harnessed. There will, inevitably, be some facets of language and music that are not mimicking nature, but are, rather, shaping themselves to fit the quirks. But in this book I’ll ignore these quirks, for the reasons I just went over. To the extent that language and music have come to harness quirks despite their deficiencies, I’ll leave that to future scientists to unravel, because it is far above my pay grade.
Now, with quirks out of the way, the fundamental argument structure of nature-harnessing can be illustrated by Figure 1b. If the brain in the story “from nature to brain to culture” is covered over, that leaves only nature and culture, highlighting the hypothesis that culture mimics nature.
Figure 2, on the following page, shows the three cases of nature-harnessing I have examined in my research: writing, speech, and music. It shows the mechanisms in the brain each harnesses, and also the natural stimuli the brain mechanisms were selected to process. Writing was covered in The Vision Revolution. The other two rows in Figure 2 are for speech and music, the cultural artifacts taken up in this book, with nature-harnessing as the overarching theme.
Figure 2. The structure of the book’s argument. For example, for the first row, writing shaped itself (via cultural selection) for our visual object recognition mechanisms in the brain, and these mechanisms were, in turn, shaped (via natural selection) for recognizing three-dimensional scenes with opaque objects strewn about. Supposing that writing shaped itself mostly for the brain’s “purps” and not the quirks, then writing is expected to principally shape itself to look like three-dimensional scenes with opaque objects. The next two rows are the main topics of this book.
And now we’re ready for the meat of this book. In Chapter 2, I describe how speech sounds like solid-object physical events, and in Chapters 3, 4, and the Encore (at the end of the book), I describe how music sounds like people moving. In the fourth chapter of my previous book, The Vision Revolution, I described how writing looks like 3-D scenes with opaque objects. With these three cases made, the conclusion I would like the reader to draw is that nature-harnessing—not instinct, and not a general-purpose brain—is the general mechanism by which we came to have these powers.
Chapter 2
Speech Events
Grasshopper
In M. Night Shyamalan’s movie The Village, a young woman, Ivy, sets off on a journey into an unknown forest. She has persuaded the elders of her tribe to let her find other people on the far side of the forest, get medicine, and return to save the life of her sick lover. She has no knowledge of anything beyond the several acres of her village, except that beyond their meadow and inside the forest are chilling, otherworldly beasts that occasionally invade the village and carve up one of the pets.
As if this quest were not harrowing enough, there’s an important fact I left out: she is blind. Now, the village leaders know the truth about what’s beyond their meadow—no beasts (but the costumed elders themselves), just woods, and then modern civilization, from which they’ve sheltered their children. That’s why they allow her to go into the forest. But no one of Ivy’s generation knows this. And neither do we, the moviegoers. We’re terrified for her. As it turns out, terrifying things do happen to her in that forest, because a monster (really a man from the village in a monster costume) secretly follows her, and eventually attacks her.
The movie would be considerably less dramatic if our female heroine were deaf, rather than blind. Instead of a woman waving her arms and tramping about through the thorny tangles, we’d be watching a woman walking normally through the forest, keeping to deer trails. In fact, many of us regularly do just this, wearing headphones and blasting music as we deafly, yet deftly, jog through our local park. This would not quite elicit the thrill Shyamalan had in mind. A deaf person on a forest quest does not make a good movie. Being deaf just doesn’t seem like much of a big deal compared to blindness. If not for the inability to hear speech, we might hardly miss our auditory systems if they fell out through our ears.
Then again, there’s another twist to the story that may change one’s feeling about audition: our young blind heroine defeats her attacker. She kills him, in fact. She may look out of sorts crashing into trees, but her hearing makes it impossible for her attacker to sneak up on her. Especially in the forest. Had she been deaf, not blind, her attacker could have whistled “Dixie” with an accordion accompaniment while following her through the woods and still taken her completely by surprise.
If deaf-maiden-alone-in-the-forest is not spine-tingling to movie audiences, it is only because we tend not to appreciate all that our ears do for us beyond language. Providing a sneakproof alert system is just one of the many powers of audition.
The greatest respect for our ears is found among blind kung fu masters. Every “Grasshopper” learns from his old blind master that by attending to and dissecting the ambient sounds around oneself, it is possible to sense how many attackers surround one, their locations, stances, weapons, intent, confidence level, and which one is the enemy mastermind. I once saw, in an old movie, one of these scrawny geezers defeat six men using only a baseball bat wielded upside down. But you don’t have to be a fictional blind kung fu master to have a mastery of audition and know how to sense the world with it. We all do; we just don’t get all “Grasshopper” about it. Our brains have a mastery of it even if we’ve never thought about it.
In fact, when I first began pondering whether speech might sound like natural events, I had great difficulty thinking of any important natural-event sounds. I was initially dumbfounded: what is so useful about having ears that nearly all vertebrates have them? It seemed to me that I primarily use my ears for listening to speech, and that obviously cannot explain why all those other vertebrates have ears as well. Sure, it is difficult to sneak up on me, but one hardly needs such a fine-tuned ear and auditory system for a simple alarm.
After some months of contemplation, however, I came to consciously appreciate my ability to use sound to recognize the world and what’s happening around me. I began to notice every tap, clink, rub, burble, and skid. And I noticed how difficult it was for me to do anything without making a sound that gave away what I was doing, like eating from my daughter’s Halloween stash. When you’re next at home and your family is active around you, close your eyes and listen. You will hear sounds such as the plink of a spoon in a coffee mug, the scrape of a drawer opening, or the scratch of crayons on drywall. It will typically take some time before you hear an event that you cannot recognize. In the late 1980s, the psychologist William Gaver played environmental sounds to listeners, and asked them to identify what they heard. He found that people are impressive at this: most are capable, for example, of distinguishing running upstairs from running downstairs. Research following in the tradition of work done by the psychologist William H. Warren in the mid-1980s has shown that people are even able to use sound to sense the shapes and textures of some objects.
Our ears and auditory systems are, then, highly designed for and competent at sensing and recognizing what is happening around us. Our auditory systems are priceless pieces of machinery, just the kind of hardware that cultural evolution shouldn’t let go to waste, perfect for harnessing. In this chapter, I sift through the sounds of nature and distill a host of regularities found there, regularities that apply nearly anywhere—in the jungle, on the tundra, or in a modern city. The idea is that our auditory system, having evolved in the presence of these regularities for hundreds of millions of years, will have evolutionarily “internalized” them; our auditory system will therefore work best when incoming sounds conform to these regularities. I will then ask whether the sounds of speech across human languages tend to respect these regularities. That’s what we expect if language harnesses us.
Over Hear
It can be difficult for students to attract my attention when I am lecturing. My occasional glances in their direction aren’t likely to notice a static arm raised in the standing-room-only lecture hall, and so they are reduced to jumping and gesturing wildly in the hope of catching my eye. And that’s why, whenever possible, I keep the house lights turned off. There are, then, three reasons why my students have trouble visually signaling me: (i) they tend to be behind my head as I write on the chalkboard, (ii) many are occluded by other people, are listening from behind pillars, or are craning their necks out in the hallway, and (iii) they’re literally in the dark.
These three reasons are also the first ones that come to mind for why languages everywhere employ audition (with the secondary exceptions of writing and signed languages for the deaf) rather than vision. We cannot see behind us, through occlusions, or in the dark; but we can hear behind us, through occlusions, and in the dark. In situations where one or more of these—(i), (ii), and (iii) above—apply, vision fails, but audition is ideal. Between me and the students in my course lectures, all three of these conditions apply, and so vision is all but useless as a route to my attention. In such a scenario a student could develop a firsthand appreciation of the value of speech for orienting a listener. And if it weren’t for the fact that I wear headphones blasting Beethoven when I lecture, my students might actually learn this lesson.
The three reasons for vision’s failure mentioned above are good reasons why audition might be favored for language communication, but there is a much more fundamental reason, one that would apply to us even if we had eyes in the backs of our heads and lived on wide-open prairies in a magical realm of sunlit nights. To understand this reason, we must investigate what vision and audition are each good at.
Vision excels at answering the questions “What is it?” and “Where is it?” but not “What happened?” Each glance cannot help but inform you about what objects are around you, and where. But nearly everything you see isn’t doing anything. Mostly you just see nature’s set pieces, currently not participating in any event—and yet each one is visually screaming, “I’m here! I’m here!” There’s a simple reason for this: light is reflecting off all parts of the scene, whether or not the parts have anything interesting to say. Not only are all parts of a scene sending light toward you even when they are not involved in any event, but the visual stimulus often changes in dramatic ways even when the objects out there are not moving. In particular, this happens whenever we move. As we change position, objects in our visual field dynamically shift: their shapes distort, nearer objects move more quickly, and objects shift from visible to occluded and vice versa. Visual movement and change are not, therefore, surefire signals that an event has occurred. In sum, vision is not ideal for sensing events because events have trouble visually outshouting all the showy nonevents.
If visual nature is the loquacious coworker you avoid eye contact with, auditory nature is (ironically) the silent fellow who speaks up only to say, “Piano falling.” Audition excels at the “What’s happening?” question, sensing a signal only when there’s an event. Audition not only captures events we cannot see—like my (fictional) gesticulating students—but serves to alert us to events occurring even within our view. Nonevents may be screaming visually, but they are not actually making any noise, and so audition has unobstructed access to events—for the simple reason that sound waves are cast only when there is an event.
That’s why audition, but not vision, is intrinsically about “what’s happening.” Audition excels at event perception. And this is crucial to why audition, but not vision, is best suited for everyday language communication. Communication is a kind of event, and thus is a natural for audition. That is, everyday person-to-person language interactions are acute events intended to be comprehended at that moment. Writing is not like this; it is a longer-term record of our thoughts. And when writing does try to be an acute person-to-person means of communication, it tends to take measures to ensure that the receiver gets the message now—and often this is done via an auditory signal, such as when one’s e-mail or text messaging beeps an alert that there is a new message.
That language is auditory and not visual is, in the broadest sense, a case of harnessing, or being like nature for the purpose of best utilizing our hardware. Language was culturally selected to utilize the auditory modality because sound is nature’s modality of event communication.
That’s nice as far as it goes, but it does not take us very far. The Morse code for electric telegraphy utilizes sound (dots and dashes), and even the world-record Morse code reader, Ted McElroy, could only handle reading 75.2 Morse code words per minute (a record set in 1939), whereas we can all comprehend speech comfortably at around 150 words per minute—and with effort, at rates approaching 750 words per minute. Fax machines and modems also communicate by sound, but no human language asks us to squeal and bleep like that. Clearly, not just any auditory communication will do. And that brings us to the main aim of this chapter: to say what auditory communication should sound like in order to best harness our auditory system. We move next to the first step in this project: searching for the atoms of natural sounds, akin to the contours in natural scenes on the visual side.
Nature’s Phonemes
By understanding the different evolutionary roles for vision and audition, we just saw that audition is the appropriate modality to harness for language: sound is nature’s standard event stream, and language therefore wants to utilize sound to make sure language utterances get received. But what kinds of sounds, more specifically, should language use to best harness our brains? The sounds of nature, of course. But the natural world has a large portfolio of sounds it can make, and people are good at mimicking a fair share of these sounds, mostly with their mouths, but sometimes with the help of their hands and underarms. Saying that a well-designed language will use sounds from nature is like saying one had “a sandwich” in a deli. Which sounds from nature? Wind blowing, water splashing, trees falling (when someone is around), leaves rustling, thunder, animal vocalizations, knuckle cracks, eggs breaking? Where is language to begin?
Although nature’s sounds are all over the map, there’s order to the cacophony. Most events we hear are built out of just three fundamental building blocks: hits, slides, and rings.
Hits happen whenever a solid object bumps into another object. When you walk, your feet hit the ground. When you knock, your knuckles hit the door. A tennis match is a game of hits—ball hits racket, ball hits net, ball hits ground. Hits make a distinctive sound. They happen suddenly, and the auditory signal consists of an almost instantaneous explosive burst of energy emanating from the impact.
Slides are the other common kind of physical interaction between solid objects. Slides occur whenever there is a long duration of friction contact between surfaces. If you drag your finger down the page of this book, you’re making a slide. If you push a box along the floor, that’s a slide. The auditory structure of slides differs from that of hits: Rather than a nearly instantaneous release of energy, slides have a non-sudden start and a white-noise-like sound that can last for a more extended period of time. Slides are less common than hits. First, they require a special circumstance, the extended interaction of two surfaces; hits, on the other hand, are what perception scientists call “generic,” because no special coincidences are needed to carry off a hit. Second, when slides do happen their friction tends to significantly lower the energy in the event, and therefore they commonly occur at the tail ends of events. Third, whereas a long sequence of hits is possible (with intervening rings, as discussed in a moment)—as when a ping pong ball bounces lower and lower, for instance—a long sequence of distinct slides is not typically possible; something would have to stop one slide to allow another one to start, but any such interference with a slide is likely to involve a hit.
Hits and slides are the only physical interactions among solid objects that we regularly experience, and they are certainly the primary ones our ancestors would have experienced. We are land mammals. Splashes, involving a solid and a liquid, are neither hits nor slides, and although they could shape the auditory system of otters, seals, and whales, they’re unlikely to be of central significance to our auditory system.