The utility of effective theories was recognized in quantum mechanics early on. For example, in an atomic analogy to the center-of-mass motion of the ball I just discussed, one of the classic methods of understanding the behavior of molecules in quantum mechanics—which goes back to at least the 1920s—is to separate molecules into “fast” and “slow” degrees of freedom. Since the nuclei in molecules are very heavy, their response to molecular forces will involve a smaller, and slower, variation than, say, the electrons speedily orbiting them. Thus one might follow a procedure such as this to predict their properties. First, imagine the nuclei are fixed and unchanging and then calculate the motion of the electrons about these fixed objects. Then, as long as the nuclei are slowly moving, one would not expect this motion significantly to affect the electrons’ configuration. The combined set of electrons will just smoothly track the motion of the nuclei, which will in turn be affected only by the average electron configuration. The effect of the individual electrons thus “decouples” from the motion of the nuclei. One can then describe an effective theory of the nuclear motion, keeping track only of the nuclear degrees of freedom explicitly and replacing all the individual electrons by some single quantity representing the average charge configuration. This classic approximation in quantum mechanics is called the Born-Oppenheimer approximation, after the two well-known physicists who first proposed it, Max Born and Robert Oppenheimer. This is just like describing the motion of the ball by merely keeping track of the center of mass of the ball, plus perhaps also the collective motion of all the atoms about the center of mass—namely, the way the ball spins.
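For readers who enjoy seeing the bookkeeping, the logic of the approximation can be compressed into a schematic line. The notation is standard textbook shorthand, introduced here purely for illustration: r stands for the electron coordinates, R for the nuclear ones, and the full molecular wavefunction is split into an electronic piece solved with the nuclei held fixed and a nuclear piece that moves in the resulting average potential:

\[
\Psi(r, R) \approx \psi_R(r)\,\chi(R), \qquad
H_{\mathrm{el}}\,\psi_R(r) = E_{\mathrm{el}}(R)\,\psi_R(r), \qquad
\left[\, T_{\mathrm{nuc}} + E_{\mathrm{el}}(R) \,\right]\chi(R) = E\,\chi(R).
\]

Here H_el is the electrons’ energy operator with the nuclei clamped at positions R, T_nuc is the nuclear kinetic energy, and the electronic energy E_el(R) then plays the role of the effective force law governing the slow nuclear motion, which is exactly the “decoupling” described above.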
Take another, more recent example, related to superconductivity. I have described how, in a superconductor, electron pairs bind together into a coherent configuration. In such a state, one need not describe the material by keeping track of all the electrons individually. Because it takes so much energy to cause an individual electron to deviate from the collective pattern, one can effectively ignore the individual particles. Instead, one can build an effective theory in terms of just a single quantity describing the coherent configuration. This theory, proposed by London in the 1930s and further developed by the Soviet physicists Landau and Ginzburg in 1950, correctly reproduces all the major macroscopic features of superconducting materials—including the all-important Meissner effect, which causes photons to behave like massive objects inside superconductors.
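Purely as an illustrative sketch, using the standard symbols of the London description rather than anything introduced in this book: the theory’s central statement is that a magnetic field applied at the surface of a superconductor dies away exponentially inside it,

\[
\nabla^2 \mathbf{B} = \frac{\mathbf{B}}{\lambda_L^2}, \qquad
B(x) \approx B(0)\, e^{-x/\lambda_L}, \qquad
\lambda_L = \sqrt{\frac{m}{\mu_0\, n_s e^2}},
\]

where m and e are the electron’s mass and charge, μ_0 is the usual magnetic constant, n_s is the density of superconducting electrons, and the penetration depth λ_L typically comes out to some tens to hundreds of nanometers. Saying that the field cannot reach deeper than λ_L is mathematically equivalent to saying that, inside the material, the photon acquires an effective mass of order ħ/(λ_L c), which is one way of stating the Meissner effect mentioned above.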
I have already pointed out that separating a problem into relevant and irrelevant variables is not itself a new technique. What the integration of quantum mechanics and relativity has done, however, is to require the elimination of irrelevant variables. In order to calculate the results of any measurable microscopic physical process, we must ignore not just a few, but an infinite number of quantities. Thankfully, the procedure begun by Feynman and others has demonstrated that these can be ignored with impunity.
Let me try to describe this key point in a more concrete context. Consider the “collision” of two electrons. Classical electromagnetism tells us that the electrons will repel each other. If the electrons are initially moving very slowly, they will never get close together, and classical arguments may be all that is necessary to determine correctly their final behavior. But if they are moving fast enough initially to get close together, on an atomic scale, quantum-mechanical arguments become essential.
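A rough feel for the numbers may help; the arithmetic below uses ordinary textbook constants (the electrostatic energy e²/4πε₀ ≈ 1.44 eV·nm at one nanometer, and the quantum wavelength λ = h/p) and is meant only to make “close together, on an atomic scale” concrete. Two electrons approaching head-on with a combined energy of about 15 electron volts reach a closest separation of roughly an atomic diameter, and at such energies each electron’s quantum wavelength is comparable to that separation:

\[
d_{\min} \approx \frac{e^2}{4\pi\varepsilon_0\,E} \approx \frac{1.44\ \mathrm{eV\cdot nm}}{15\ \mathrm{eV}} \approx 0.1\ \mathrm{nm},
\qquad
\lambda = \frac{h}{\sqrt{2 m_e E_1}} \approx 0.45\ \mathrm{nm} \quad (E_1 \approx 7.5\ \mathrm{eV\ per\ electron}).
\]

Once the wavelength is as large as the distance of closest approach, the classical picture of well-defined trajectories is no longer adequate, and quantum-mechanical arguments must take over.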
What does an electron “see” when it reacts to the electric field of another electron? Because of the existence of virtual pairs of particles and antiparticles burping out of the vacuum, each electron carries a lot of baggage. Those positive particles that momentarily pop out of the vacuum will be attracted to the electron, while their negative partners will be repelled. Thus, the electron in some sense carries a “cloud” of virtual particles around with it. Since most of these particles pop out of the vacuum for an extremely short time and travel an extremely small distance, this cloud is for the most part pretty small. At large distances we can lump together the effect of all the virtual particles by simply “measuring” the charge on an electron. In so doing, we are then lumping into a single number the potentially complicated facets of the electric field due to each of the virtual particles that may surround an electron. This “defines” the charge on an electron that we see written down in the textbooks. This is the effective charge that we measure at large distances in an apparatus in the laboratory, by examining the motion of an electron in, say, a TV set, when an external field is applied. Thus, the charge on the electron is a fundamental quantity only to the extent that it describes the electron as measured on a certain scale!
If we send another electron closer to the first electron, it can spend time inside the outskirts of this cloud of virtual particles and effectively probe a different charge inside. In principle, this is an example of the same kind of effect as the Lamb shift. Virtual particles can affect the measured properties of real particles. What is essential here is that they affect properties such as the electron’s charge differently depending upon the scale at which you measure it.
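This scale dependence can in fact be written down explicitly. The formula below is the standard lowest-order QED result, quoted purely as an illustration of what a charge that depends on scale means; α ≈ 1/137 is the usual fine-structure constant measured at large distances, m_e c² ≈ 0.5 MeV is the electron’s rest energy, and Q is the energy with which one electron probes the other (larger Q means shorter distances):

\[
\alpha_{\mathrm{eff}}(Q) \;\approx\; \frac{\alpha}{1 - \frac{2\alpha}{3\pi}\ln\!\left(\frac{Q}{m_e c^2}\right)}, \qquad Q \gg m_e c^2 .
\]

The logarithm grows slowly as Q grows, so a probe that penetrates deep inside the cloud of virtual pairs sees a slightly larger charge than the familiar textbook value measured far away, exactly as the screening picture suggests.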
If we ask questions appropriate to experiments performed on a certain length scale or larger, involving electrons moving with energies that are smaller than a certain amount, then we can write down a complete effective theory that will predict every measurement. This theory will be QED, with its free parameters, the charge on the electron and so on, now fixed to be those appropriate to the scale of the experiment, as determined by the results of the experiment. All such calculations, however, of necessity effectively discard an infinite amount of information—that is, the information about virtual processes acting on scales smaller than our measurements can probe.
It may seem like a miracle that we can be so blasé about throwing out so much information, and for a while it seemed that way even to the physicists who invented the procedure. But upon reflection, if physics is to be possible at all, it has to work out this way. After all, the information we are discarding need not be correct! Every measurement of the world involves some scale of length or energy. Our theories too are defined by the scale of the physical phenomena we can probe. These theories may predict an infinite range of things at scales beyond our reach at any time, but why should we believe any of them until we measure them? It would be remarkable if a theory designed to explain the interactions of electrons with light were to be absolutely correct in all its predictions, down to scales that are orders and orders of magnitude smaller than anything we currently know about. This might even be the case, but either way, should we expect the correctness of the theory at the scales we can now probe to be held hostage by its potential ability to explain everything on smaller scales? Certainly not. But in this case, all the exotic processes predicted by the theory to occur on much smaller scales than we can now probe had better be irrelevant to its predictions for the comparison with present experiments, precisely because we have every reason to believe that these exotic processes could easily be imaginary remnants of pushing a theory beyond its domain of applicability. If a theory had to be correct on all scales to answer a question about what happens at some fixed scale, we would have to know The Theory of Everything before we could ever develop A Theory of Something.
Faced with this situation, how can we know whether a theory is “fundamental”—that is, whether it has a hope of being true on all scales? Well, we can’t. All physical theories we know of must be viewed as effective theories precisely because we have to ignore the effects of possible new quantum phenomena that might occur on very small scales in order to perform calculations to determine what a theory predicts on larger, currently measurable, scales.
But as is often the case, this apparent deficiency is really a blessing. Just as we could predict, at the very beginning of this book, what the properties of a supercow should be by scaling up from the properties of known cows, so the fact that our physical laws are scale-dependent suggests that we might be able to predict how they, too, evolve as we explore ever smaller scales in nature. The physics of today then can give a clear signpost for the physics of tomorrow! In fact, we can even predict in advance when a new discovery is required.
Whenever a physical theory either predicts nonsense or else becomes mathematically unmanageable as the effects of smaller and smaller scale virtual quantum-mechanical processes are taken into account, we believe that some new physical processes must enter in at some scale to “cure” this behavior. The development of the modern theory of the weak interaction is a case in point. Enrico Fermi wrote down in 1934 a theory describing the “beta” decay of the neutron into a proton, electron, and neutrino—the prototypical weak decay. Fermi’s theory was based on experiment and agreed with all the known data. However, the “effective” interaction that was written down to account for a neutron decaying into three other particles was otherwise ad hoc, in that it was not based on any other underlying physical principles beyond its agreement with experiment.
Once quantum electrodynamics had been understood, it soon became clear that Fermi’s weak interaction differed fundamentally in nature from QED. When one went beyond simple beta decay, to explore what the theory might predict to occur at smaller scales, one encountered problems. Virtual processes that could take place at scales hundreds of times smaller than the size of a neutron would render the predictions of the theory unmanageable once one tried to predict the results of possible experiments at such scales.
This was not an immediate problem, since no experiments would be directly capable of exploring such scales for over fifty years after Fermi invented his model. Nevertheless, well before this, theorists began to explore possible ways to extend Fermi’s model to cure its illnesses. The first step to take around this problem was clear. One could calculate the distance scale at which problems for the predictions of the theory would begin to be severe. This scale was about one hundred times smaller than the size of a neutron—much smaller than that accessible at any then-existing facility. The simplest way to cure the problem was then to suppose that some new physical processes, not predicted in the context of Fermi’s theory alone, could become significant at this scale (and not larger), and could somehow counter the bad behavior of the virtual processes in Fermi’s theory. The most direct possibility was to introduce new virtual particles with masses about 100 times the mass of the neutron, which would make the theory better behaved. Because these particles were so massive, they could be produced as virtual particles only for very short times, and could thus move over only very small distance scales. This new theory would give identical results to Fermi’s theory, as long as experiments were performed at scales that would not probe the structure of the interaction, that is, at scales greater than the distance traveled by the massive virtual particles.
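One can get a feel for where these numbers come from with a little dimensional arithmetic; the values below are the modern ones, quoted only to show that the estimate hangs together, and the relation between Fermi’s constant and the W mass comes from the later, full electroweak theory rather than anything available in 1934. Fermi’s interaction is characterized by a single constant, G_F, whose units are an inverse energy squared, so the theory carries a built-in energy scale at which it must fail:

\[
G_F \approx 1.2 \times 10^{-5}\ \mathrm{GeV}^{-2}
\;\Rightarrow\;
G_F^{-1/2} \approx 300\ \mathrm{GeV},
\qquad
\frac{G_F}{\sqrt{2}} = \frac{g^2}{8 M_W^2}
\;\Rightarrow\;
M_W \approx 80\ \mathrm{GeV} \approx 85\, m_{\mathrm{neutron}} .
\]

The first relation says that Fermi’s description cannot be trusted beyond a few hundred GeV; the second says that a new particle weighing roughly a hundred neutron masses, exchanged with a coupling strength g not very different from that of electromagnetism, is precisely the kind of ingredient that cures the bad behavior.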
We have seen that positrons, which were predicted to exist as part of virtual pairs of particles in QED, also exist as real measurable particles, if only you have enough energy to create them. So too for the new superheavy virtual particles predicted to exist in order to cure Fermi’s theory. And it is these superheavy W and Z particles that were finally directly detected as real objects in a very high energy particle accelerator built in Geneva in 1983, about twenty-five years after they were first proposed on theoretical grounds.
As I have described, the W and Z particles make up part of what is now known as the Standard Model of particle physics describing the three nongravitational forces in nature: the strong, weak, and electromagnetic forces. This theory is a candidate “fundamental” theory, in that nothing about the possible virtual processes at very small length scales that are predicted to occur in the theory directly requires new processes beyond those predicted in the theory at these scales. Thus, while no one actually believes it, this theory could in this sense be complete. By the same token, nothing precludes the existence of new physics operating at very small length scales. Indeed, there are other strong theoretical arguments that suggest that this is the case, as I shall describe.
While theories like Fermi’s, which are “sick,” give clear evidence for the need for new physics, theories like the Standard Model, which are not, can also do so simply because their formulation is scale-dependent—namely, they depend intrinsically on the scale at which we perform the experiments used to measure their fundamental parameters. As processes involving virtual particles acting on ever smaller scales are incorporated in order to compare with the results of ever more sensitive experiments, the value of these parameters is predicted to change, in a predictable way! For this reason, the properties of the electron that participates in atomic processes on atomic scales are not exactly the same as those of an electron that interacts with the nucleus of an atom on much smaller, nuclear scales. But most important, the difference is calculable!
This is a remarkable result. While we must give up the idea that the Standard Model is a single, inviolable theory, appropriate at all scales, we gain a continuum of effective theories, each appropriate at a different scale and all connected in a calculable way. For a well-behaved theory like the Standard Model, then, we can literally determine how the laws of physics should change with scale!
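To attach one concrete number to this claim: the fine-structure constant that sets the strength of electromagnetism is about 1/137 when measured in low-energy, atomic-scale experiments, but at the far shorter distances probed at the energy of the Z particle its measured value has grown to about 1/128, in agreement with the calculated change (the figures are the standard quoted values, rounded):

\[
\alpha(\mathrm{atomic\ scales}) \approx \tfrac{1}{137},
\qquad
\alpha(Q \approx M_Z \approx 91\ \mathrm{GeV}) \approx \tfrac{1}{128}.
\]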
This remarkable insight about the scale-dependence of physics as I have described it is largely the work of Ken Wilson, dating only from the late 1960s and early 1970s, and for which he was awarded the Nobel Prize. It originated as much from the study of the physics of materials as it did from particle physics. Recall that the scaling behavior of materials is the crucial feature that allows us to determine their properties near a phase transition. For example, my discussion of what happens when water boils was based on how the description of the material changed as we changed the scale on which we observed it. When water is in its liquid form, we may, if we observe on very small scales, detect fluctuations that locally make its density equivalent to its value in a gaseous state. However, averaged over ever larger and larger scales, the density settles down to be its liquid value once we reach some characteristic scale. What are we doing when we average over larger scales? We average out the effect of the smaller-scale phenomena, whose details we can ignore if all we are interested in is the macroscopic properties of liquid water. However, if we have a fundamental theory of water—one that can incorporate the small-scale behavior—we can try to calculate exactly how our observations should vary with scale as the effects of fluctuations on smaller scales are incorporated. In this way, one can calculate all the properties of materials near a critical point where, as I have said, the scaling behavior of the material becomes all-important. The same techniques applied to normal materials apply to the description of the fundamental forces in nature. Theories like QED contain the seeds of their own scale dependence.
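Wilson’s idea can be compressed into one schematic equation, given here only to show the shape of the argument; g stands for any parameter of an effective theory (a charge, a coupling strength), μ for the scale at which we choose to examine the system, and β for a function that the theory itself allows us to compute:

\[
\mu \frac{dg}{d\mu} = \beta(g).
\]

Following this flow as μ is changed is what it means, in practice, to average out smaller-scale fluctuations and watch the effective description drift; the critical behavior of boiling water and the scale dependence of the charge in QED are both governed by equations of exactly this form.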