GT could not work without a very large preexisting corpus of translations. It is built upon the millions of hours of labor of human translators who produced the texts that GT scours. Google’s own promotional video doesn’t dwell on this at all. At present it offers two-way translation between fifty-eight languages, that is to say, 3,306 separate translation services, more than have ever existed in all human history to date. Most of these translation relations—Icelandic → Farsi, Yiddish → Vietnamese, and dozens more—are the newborn offspring of GT: there is no history of translation between them, and therefore no paired texts, on the Web or anywhere else. Google’s presentation of its service points out that given the huge variations among languages in the amount of material its program can scan to find solutions, translation quality varies according to the language pair involved.⁵
What it does not highlight is that GT is as much the prisoner of global flows in translation as we all are. Its admirably smart probabilistic computational system can offer 3,306 translation directions only by using the same device as has always assisted intercultural communication: pivots, or intermediary languages. It’s not because Google is based in California that English is the main pivot. If you use statistical methods to compute the most likely match between languages that have never been matched directly before, you must use the pivot that can provide matches with both target and source.
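The figure of 3,306 is simple arithmetic, and it can be checked in a few lines. The Python sketch below uses function names of our own invention; the second function rests on an assumption that Google has not stated, namely that a single pivot needs one model into it and one out of it for every other language.

```python
def translation_directions(n_languages: int) -> int:
    """Count ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)


def directions_via_single_pivot(n_languages: int) -> int:
    """Count direct models needed if every pair is routed through one pivot.

    Assumption for illustration only: one model into the pivot and one
    out of it for each of the other languages.
    """
    return 2 * (n_languages - 1)


print(translation_directions(58))       # 3306
print(directions_via_single_pivot(58))  # 114
```

On that assumption, a pivot turns a problem of thousands of language pairs into one of little more than a hundred, which is why the most widely translated language is so hard to dislodge from the center.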
The service that Google offers appears to flatten and diversify interlanguage relations beyond the wildest dreams of even the EU’s most enthusiastic language-parity proponents. But it is able to do so only by exploiting, confirming, and increasing the central role played by the most widely translated language in the world’s electronic databank of translated texts, which can only be the most consistently translated language in all other media, too.
A good number of English-language detective novels, for example, have probably been translated into both Icelandic and Farsi. They thus provide ample material for finding matches between sentences in the two foreign languages; whereas Persian classics translated into Icelandic are surely far fewer, even including those works that have themselves made the journey by way of a pivot such as French or German. This means that John Grisham makes a bigger contribution to the quality of GT’s Icelandic–Farsi translation device than Halldór Laxness or Rumi ever will. And the real wizardry of Harry Potter may well lie in his hidden power to support translation from Hebrew into Chinese.
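How matches in two foreign languages can be found by way of English may be easier to see in miniature. The following Python sketch is a toy, not a description of GT's actual machinery: the two small tables, their transliterated Farsi entries, and every probability in them are invented for illustration, and the function names are ours. It combines an Icelandic-to-English table with an English-to-Farsi table by summing over each English phrase the two have in common.

```python
from collections import defaultdict

# Hypothetical toy tables: P(english | icelandic) and P(farsi | english).
# Entries and probabilities are invented; Farsi is transliterated.
IS_TO_EN = {
    "bók": {"book": 0.9, "volume": 0.1},
}
EN_TO_FA = {
    "book": {"ketab": 0.95, "daftar": 0.05},
    "volume": {"jeld": 0.7, "ketab": 0.3},
}


def pivot_table(src_to_pivot, pivot_to_tgt):
    """Compose two tables by summing over every shared pivot phrase."""
    composed = {}
    for src, pivots in src_to_pivot.items():
        scores = defaultdict(float)
        for pivot, p_pivot in pivots.items():
            for tgt, p_tgt in pivot_to_tgt.get(pivot, {}).items():
                scores[tgt] += p_pivot * p_tgt
        composed[src] = dict(scores)
    return composed


def best_match(table, phrase):
    """Return the highest-scoring target phrase for a source phrase."""
    candidates = table[phrase]
    return max(candidates, key=candidates.get)


IS_TO_FA = pivot_table(IS_TO_EN, EN_TO_FA)
print(best_match(IS_TO_FA, "bók"))  # ketab
```

The more English texts exist in both Icelandic and Farsi versions, the better filled such tables would be, which is the sense in which a thriller can contribute more than a classic.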
GT-generated translations themselves go up on the Web and become part of the corpus that GT scans, producing a feedback loop that reinforces the probability that the original GT translation was acceptable. But it also feeds on human translators, since it always asks users to suggest a better translation than the one it provides—a loop pulling in the opposite direction, toward greater refinement. It’s an extraordinarily clever device. I’ve used it myself to check I had understood a Swedish sentence more or less correctly, for example, and it is used automatically as a Webpage translator whenever you use a search engine. Of course, it may also produce nonsense. However, the kind of nonsense a translation machine produces is usually less dangerous than human-sourced bloopers. You can usually see instantly when GT has failed to get it right, because the output makes no sense, and so you disregard it. (This is why you should never use GT to translate into a language you do not know very well. Use it only to translate into a language in which you are sure you can recognize nonsense.) Human translators, on the other hand, produce characteristically fluent and meaningful output, and you really can’t tell if they are wrong unless you also understand the source—in which case you don’t need the translation at all.
If you remain attached to the idea that a language really does consist of words and rules and that meaning has a computable relationship to them (a fantasy to which many philosophers still cling), then GT is not a translation device. It’s just a trick performed by an electronic bulldozer allowed to steal other people’s work. But if you have a more open mind, GT suggests something else.
Conference interpreters can often guess ahead of what a speaker is saying because speakers at international conferences repeatedly use the same formulaic expressions. Similarly, an experienced translator working in a familiar domain knows without thinking that certain chunks of text have standard translations that she can slot in. At an even more basic level, any translator knows that there are some regular transpositions between the two languages she is working with—the French impersonal pronoun on, for example, will almost always require the English sentence to be in the passive; adjectives following a noun in French will need to be put in front of the equivalent English noun; and so on. These automatisms come from practice and experience. Translators don’t reinvent hot water every day, and they don’t recalculate the transformation “French on → English passive construction” each time it occurs. They behave more like GT—scanning their own memories in double-quick time for the most probable solution to the issue at hand. GT’s basic mode of operation is much more like professional translation than is the slow descent into the “great basement” of pure meaning that early machine-translation developers imagined.
GT is also a splendidly cheeky response to one of the great myths of modern language studies. It was claimed, and for decades it was barely disputed, that what was so special about a natural language was that its underlying structure allowed an infinite number of different sentences to be generated by a finite set of words and rules. A few wits pointed out that this was no different from a British motorcar plant capable of producing an infinite number of vehicles each one of which had something different wrong with it—but the objection didn’t make much impact outside Oxford. GT deals with translation on the basis not that every sentence is different but that anything submitted to it has probably been said before. Whatever a language may be in principle, in practice it is used most commonly to say the same things over and over again. There is a good reason for that. In the great basement that is the foundation of all human activities, including language behavior, we find not anything as abstract as “pure meaning” but common human needs and desires. All languages serve those same needs, and serve them equally well. If we do say the same things over and over again, it is because we encounter the same needs and feel the same fears, desires, and sensations at every turn. The skills of translators and the basic design of GT are, in their different ways, parallel reflections of our common humanity.
In September 2009, the new administration in the White House issued a science policy road map, titled A Strategy for American Innovation. The last section of this document calls for science and technology to be harnessed to address the “‘Grand Challenges’ of the 21st Century,” of which it gives half a dozen examples, such as solar cells “as cheap as paint” and intelligent prosthetics. The last line of the whole strategy puts among these long-range targets for national science policy the development of “automatic, highly accurate and real-time translation between the major languages of the world—greatly lowering the barriers to international commerce and collaboration.”⁶
Not every science policy target is achieved, but with serious backing from the U.S. administration now in place for the first time since 1960, machine translation is likely to advance far beyond the state in which we currently know it.
A Fish in Your Ear: The Short History of Simultaneous Interpreting
Speech predates writing by eons, and oral translation is far, far older than the written kind. Because speech is such an ephemeral thing—it’s gone in a puff of warm air, which is all it is in the material sense—nothing can be known directly about speech translation for almost the entire duration of its history. Two things caused a huge change in the twentieth century: the invention of the telephone by Alexander Graham Bell in 1876, and a political need of the most pressing kind.
The Nuremberg Trials of Nazi war criminals in 1945 was one of the most important courts of law in modern history and also an unprecedented event in the history of translation. The panel of judges and the prosecuting teams came from the four Allied powers—the United States, Great Britain, France, and the Soviet Union—speaking three different languages, and the defendants spoke a fourth language, German. Nothing like this had ever happened before. In courts located in a national jurisdiction, interpreters work consecutively, repeating in the language of the court what the foreign defendant has just said, and then repeating what the court says to the defendant (when the client is not being addressed directly, it may be done at low volume in a “whisper translation,” or chuchotage). Two-way oral translation of this normal kind obviously slows down the proceedings. But four-way translation? In twelve directions? Consecutive interpreting would have so lengthened the International Military Tribunal’s case that everyone might have lost the thread. For the Nuremberg Trials, something new was needed.
Technology for speeding up multilingual interaction already existed. The Filene-Finlay Speech Translator had been tried out a few times in the 1920s by the International Labour Organization in Geneva. Users of the system had a telephone in front of them, and when a delegate could not understand what was being said she picked up the handset, dialed in to the exchange, and heard the speech in a different language (only two—French and English—were involved at that time). The translators sat at the back listening to the speech and speaking their translation of it into a soundproof awning called a Hushaphone, connected directly to the telephone exchange. The original Speech Translator was also used in 1934 for Adolf Hitler’s address to a Nazi Party rally in Nuremberg for live broadcast on French radio.¹
The Speech Translator was designed and promoted not for rapid two-way interaction in multiple languages but for speeches read aloud from prepared written text—what Germans call gesprochene Schreibsprache, “spoken written language,” the standard genre of politicians and public figures the world over. The Filene-Finlay device was acquired by IBM in the 1930s, and the company offered a complete set of partly secondhand but much enhanced and extended equipment for free to the International Military Tribunal in Nuremberg. This act of generosity was to prove an epochal event in the way in which we now conceive the possibility of international communication.
Members of the court, including the defendants, were equipped with headphones and microphones, from which wires trailed over the courtroom floor to the exchange. Wires ran from the exchange to four separate translation teams in different compartments. That made for a lot of complicated wiring, but the real magic was what happened in the interpreters’ booths.
Members of the court had switch dials to select which language channel they wished to listen to. The output was produced by four teams of three interpreters each. The English team had a German interpreter, a Russian interpreter, and a French interpreter sitting side by side, listening on headphones, and repeating in English what was said in the other languages; the setup was the same in the three other booths. Altogether, thirty-six interpreters were recruited from among the three hundred language professionals hired by the court and the prosecution and defense teams to work at this brand-new and not obviously manageable task of instantaneous oral translation. Each of the twelve-strong teams worked eighty-five-minute shifts on two days out of three and was expected to rest in between. From the very start of the new profession, simultaneous interpreting was recognized as being one of the most exhausting things you can do with a human brain.
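The staffing figures follow directly from the number of languages. As a minimal Python sketch (the function and its parameter names are ours, and the count of three rotating teams is inferred from the totals given above rather than stated outright), the sums run as follows:

```python
def booth_staffing(n_languages: int, n_teams: int) -> dict:
    """Work out interpreter numbers when each booth serves one target language."""
    directions = n_languages * (n_languages - 1)  # each source into each other language
    per_booth = n_languages - 1                   # one interpreter per foreign source
    per_team = n_languages * per_booth            # one booth per target language
    return {
        "directions": directions,
        "per_booth": per_booth,
        "per_team": per_team,
        "total": per_team * n_teams,
    }


# Four court languages; three teams is an inference from 36 / 12.
print(booth_staffing(4, 3))
# {'directions': 12, 'per_booth': 3, 'per_team': 12, 'total': 36}
```

Because the interpreters needed grow with the square of the number of languages, each additional language costs far more than the last.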
The difficulty is not only high-speed language transfer. The difficulty is that the sound of your own voice diminishes your ability to hear what the other person is saying. That’s why we take turns in conversation and speak over someone else only when we really do not want to hear what he has to say. A simultaneous interpreter must learn to overrule the natural tendency not to listen when talking, and not to talk when listening. Simultaneous interpreting exists only because some very adept people can train themselves to do such an unnatural thing. Try it yourself: switch on a TV news broadcast and repeat at your own normal speaking volume exactly what the newscaster says. If you can keep that up without losing a sentence for ten minutes or more, then maybe you, too, could be a simultaneous interpreter—provided you know another two languages extremely well. Millions of people know three languages well enough to be interpreters, but only a small proportion of them can manage the exhausting trick of dividing attention between what you are saying and what you are hearing—without missing a word.
The trickiest part of high-speed language transfer is that politicians and diplomats do not characteristically use short, simple sentences without subordinate clauses, or leave long gaps between them. They tend to drone on with sausagelike strings of evasive circumlocutions: “I am instructed by my ambassador to inform this august assembly that contrary to rumors reported in one of the organs of the capitalist press no authorized agent of the state has knowingly exported to any other country any materials covered by the international convention on …” Unfortunately, there is no convention on the export of long-windedness, and so interpreters have to begin reformulating sentences of this kind without knowing for sure where they will go, what their real point is, or what alteration to the structure of the starting point the end of the sentence will bring. Extremely sophisticated mental skills are required to “hold” features of meaning in provisional formulations until the real topic of the sentence is finally let out of the bag. An interpreter who has to repair a sentence after it has begun (as we all do in normal speech) loses valuable time. The ability to pick the right formulae in a flash and to keep the sentence loose enough to cope with what may crop up next is acquired by experience and practice—together with an uncommonly developed capacity for finding instant matches between sentence patterns that are grammatically and stylistically far apart.
Most of the people involved in preparing the Nuremberg Trials doubted this newfangled setup would work. We owe the modern world of conference interpreting more to the can-do attitude of the victorious U.S. Army than to the considered judgment of prosecutors, judges, and language professionals. Chief doubter among them was Richard Sonnenfeldt, the head of the U.S. prosecution team’s translation service. He’d been picked from a motor pool in Salzburg by General “Wild Bill” Donovan to serve as translator in the long interrogations of the defendants that preceded the trials. He’d interrogated the Nazi top brass on behalf of four-star generals and was asked to take charge of the simultaneous-interpreting team during the trials. Sonnenfeldt turned the job down because he was intimidated by the speed requirement and by his own lack of familiarity with legal terminology. But the main reason he backed off from running the world’s first simultaneous-interpreting service was his professional opinion that either the people or the system, or both, would break down.²
He was right about the glitches. Microphones and headsets went on the blink; lawyers and witnesses (including the chief U.S. attorney, Robert H. Jackson) spoke too fast; on more than one occasion, an interpreter burst into tears on hearing testimony from Rudolf Höss, the ice-cold commandant of Auschwitz. But, despite the obstacles, the system worked. Hermann Göring is said to have remarked to Stefan Hörn, one of the court translators, “Your system is very efficient, but it will also shorten my life!”³
The speech-translation system inaugurated at the Nuremberg Trials launched a new era in international communication. The interpreters’ achievements not only created a new skill and a new profession but had an immediate and far-reaching effect on world affairs. First of all, every new international agency wanted a simultaneous-translation system straightaway and thought it could just be bought at the store. In February 1946, when the Nuremberg speech-translation system was barely run in, the first General Assembly of the newborn United Nations Organization adopted as its second resolution that “speeches made in any of the six languages of the Security Council shall be interpreted into the other five languages.”⁴
Thereafter all the dependent agencies—from the International Labour Organization to the Food and Agriculture Organization, from UNESCO to the World Bank—acquired the equipment and sought to recruit the personnel to produce the magical illusion that every delegate would always be able to understand what any other delegate was saying as he or she was in the process of saying it.
This led outsiders to take for granted that the diversity of languages was no longer an impediment to collective international action and world harmony. Insiders—diplomats and negotiators in all the new bodies set up by the UN—were under no such illusion. As one student of international law points out, texts and speeches produced in multilingual form at high speed may be grammatically correct, but they are never quite coherent. The small deviations that arise, over which delegates argue for hours on end, “intensify the collective awareness of the importance of translation.”⁵
But the early years of simultaneous interpreting were also years of great hope for a new world order ruled by “jaw-jaw” in place of the preceding decades of “war-war.” In those circumstances, the general public easily forgot just what a fragile and mysterious feat was being accomplished by a very small group of language gymnasts in the glass boxes in the rear of the assembly hall.
It hardly needs explaining why simultaneity in translation is an illusion. You cannot translate anything until you have heard what it is: translation is always a “speaking after.” The impression of simultaneity is created by a bag of impressive language tricks. First, many speeches are read out from a prepared text. Diplomats sometimes provide the translation teams with the text in advance of the meeting—often only just in advance, but even a few minutes’ head start takes away a lot of stress. Second, international meetings are dominated by speeches of a fairly predictable kind. Once you acquire experience of the kind of business being conducted and of the formulaic language it uses, you can run ahead of what is actually said and give yourself a little brain space to listen for the all-important variations that the speaker might introduce. Contraction and change of orientation are also used for nonformulaic digressions: “The Soviet delegate has just made a joke” can replace the telling of a long Russian shaggy-dog tale. But, even so, the skill of the “conference interpreter” (the term that has come to replace oral translator, simultaneous translator, and speech translator) calls for high levels of concentration and mental agility. There are few people who can do it at all, and even fewer who want to do it day in and day out.
Sixty years of experience have not made it any easier to predict whether an individual can be turned into a conference interpreter or not. Even now, between half and three quarters of all students admitted to interpreter training courses fail to enter the profession.⁶
At the beginning, in the aftermath of the Second World War, the disastrous history of the twentieth century had produced many thousands of people with outstanding language skills in several of the six official international languages (Spanish, English, French, Chinese, Russian, and Arabic)—children of refugees from the Russian Revolution brought up in Shanghai and educated at the Lycée Français, where they learned English, young refugees from German-occupied France who had spent months or years in Cuba or Mexico awaiting a U.S. visa before going to college in New York, and so on. The first generation of the elite of the translating professions consisted mostly of young people from backgrounds of that kind, who remained in post for thirty years and more. These founding mothers and fathers of the conference-interpreting community have now retired, and it has proved difficult to replace them. The lack of personnel is particularly acute for the two most-needed languages in world affairs today—Arabic and Chinese. Even the Russian- and French-into-English booths are getting harder to fill.
The structure of conference interpreting at the UN and its agencies and at most other international gatherings that can afford it is not now quite as it was at the Nuremberg Trials. The rules invented for that first experiment were that all interpreters should work only into their “native” language (now called their A language, “A” standing for “active”), and that all interpreting should be done from the “original.” With six UN languages currently in operation, that would require six teams of five translators, or thirty people in all, to service a single meeting. The job is now reckoned to be as stressful as the work of air traffic controllers; the eighty-five-minute slots used at Nuremberg have been replaced with a routine of alternating thirty-minute shifts (the Chinese and Arabic booths change over every twenty minutes) through a normal (short) working day—so that in fact you would need sixty people, not thirty, to service an international meeting if the original rules were still applied. There just aren’t sixty people with those high-level and variegated skills that can be gathered at any one time in any one place in the world, not even in New York City. The following schema allows the illusion of seamless language transfer to be achieved with a team of just fourteen members: