Copyright (c) 2002 by Gary Felder
When Einstein formulated the general theory of relativity, he found that it was incompatible with a static universe; the equations predicted that the universe must either be expanding or shrinking. The prevalent bias against this conclusion was so strong that Einstein altered the equations of relativity in order to allow for a static solution. When Edwin Hubble found that the universe was indeed expanding, Einstein retracted this alteration, calling it the biggest blunder of his life. From that point forward the prevailing scientific viewpoint has been that of an expanding universe that at earlier times was much hotter and denser than it is today. Extrapolating this expansion backwards, we find that at a specific time in the past the universe would have been infinitely dense. This time, the beginning of the universe's expansion, has come to be known as the "big bang."
The big bang model has been extremely successful at explaining known aspects of the universe and correctly predicting new observations. Nonetheless, there are certain problems with the model. There are several features of our current universe that seem to emerge as strange coincidences in big bang theory. Even worse, there are some predictions of the theory that are in contradiction with observation. These problems have motivated people to look for ways to extend or modify the theory without losing all of the successful predictions it has made. In 1980 a theory was developed that solved many of the problems plaguing the big bang model while leaving intact its basic structure. More specifically, this new theory modified our picture of what happened in the first fraction of a second of the universe's expansion. This change in our view of that first fraction of a second has proven to have profound influences on our view of the universe and the big bang itself. This new theory is called inflation.
The rest of this paper is divided into two main sections. In the first I give a brief overview of the big bang model and discuss the various problems that led people to seek to modify it. If you are not already familiar with these ideas you might want to start by reading my previous paper "The Expanding Universe," which reviews in more detail the big bang model in general and in particular what it means to say the universe is expanding. In the second section I discuss the theory of inflation, explain how it resolves some of the problems of the big bang model, and talk about some of its implications for our understanding of the universe.
This paper assumes no prior knowledge of math or physics except for the basic "Big Bang" ideas found in my first paper. Occasionally I include footnotes with additional information for people with some background in math or physics, but nothing in the body of the text requires any information in the footnotes. I will at times ask you to take my word for certain results that arise from the mathematics of general relativity, but I will clearly identify all such cases and explain what the results mean and what their consequences are.
This model is in perfect accord with the theory of general relativity, which predicts that a homogeneous universe would expand and cool in exactly that way. Moreover, there have been many observational confirmations of the big bang model. These confirmations include the apparent motions of distant objects relative to us, the microwave radiation left over from the early universe, and the abundances of light elements formed in the first few minutes after the big bang.
I want to focus momentarily on the last of these. An element is defined by the number of protons in a nucleus: one for hydrogen, two for helium, and so on. For roughly three minutes after the big bang the temperature of the universe was so high that protons and neutrons couldn't bind together into nuclei; the particles all had so much energy that the forces that hold nuclei together were too weak to make them stick to each other. Thus for those first three minutes the only element in the universe was hydrogen, i.e. single protons not bound to anything else. (A neutron with no proton is not considered an element.) As the universe expanded and cooled it eventually reached a temperature where the protons and neutrons could bind together, and different elements were formed. The formation of these nuclei from their constituent particles (i.e. protons and neutrons) is known as nucleosynthesis.
Nuclear theory is well tested and understood. By applying it to a homogeneous, expanding medium at high temperature we can predict what relative abundances of different elements should have emerged when these nuclei were formed in the early universe. It turns out that only the three lightest elements, hydrogen, helium, and lithium, would have been able to form at that time. All of the heavier elements were formed much later in stars, and currently make up a tiny percentage of the matter we see in the universe. The predictions of the relative abundances of these light elements accurately match the observational data. This match is particularly important because it strongly suggests that the big bang model is an accurate description of the universe at least as far back as nucleosynthesis, i.e. three minutes after the predicted moment of the big bang. All of the other evidence for the theory, such as the microwave background and the motions of distant galaxies, relates to the universe at much later times, so we have no direct evidence for the accuracy of the big bang model before nucleosynthesis.
Despite this lack of direct evidence, it would be tempting to extrapolate further backwards and assume the big bang model to be an accurate description of the universe all the way back to the big bang. Such a complete extrapolation of the theory is not possible, however, because of certain limitations of our theories of high energy physics. When we talk about extrapolating backwards in the big bang model we are referring to running the equations of general relativity backwards to earlier times and higher densities. We know, however, that general relativity ceases to be valid when we try to describe a region of spacetime whose density exceeds a certain value known as the Planck density, roughly 10^93 g/cm^3. If we try to consistently apply quantum mechanics and general relativity at such a density we find that quantum fluctuations of spacetime become important, and we have no theory that describes such a situation.
The Planck density is enormous. It corresponds to the mass of 100 billion galaxies being squeezed into a space the size of an atomic nucleus. If we could extrapolate general relativity all the way back to the big bang the universe would have gone from infinite density to the Planck density in roughly 10^-43 seconds. So saying something happened, say, three minutes after the big bang is equivalent to saying it happened three minutes after the time the universe was at Planck density.
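As a rough check on the two figures quoted above, the Planck density and the Planck time can be computed from three physical constants. The sketch below assumes the standard definitions, density = c^5 / (hbar G^2) and time = sqrt(hbar G / c^5), which are not given in this paper, and uses approximate CODATA values for the constants. The function names are illustrative only, and the results should be read as orders of magnitude.

```python
import math

# Approximate CODATA values in SI units (assumed inputs, not from the paper).
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
G = 6.67430e-11          # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m/s


def planck_density_g_per_cm3():
    """Planck density c^5 / (hbar * G^2), converted from kg/m^3 to g/cm^3."""
    rho_si = C**5 / (HBAR * G**2)   # kg/m^3
    return rho_si * 1e-3            # 1 kg/m^3 equals 1e-3 g/cm^3


def planck_time_s():
    """Planck time sqrt(hbar * G / c^5), in seconds."""
    return math.sqrt(HBAR * G / C**5)


log10_density = math.log10(planck_density_g_per_cm3())
log10_time = math.log10(planck_time_s())
print(f"Planck density is about 10^{log10_density:.1f} g/cm^3")
print(f"Planck time is about 10^{log10_time:.1f} s")
```

Both results land within a factor of a few of the rounded values used in the text, which is all that an order-of-magnitude statement like "roughly 10^93 g/cm^3" claims.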
Since we can't describe physics at densities higher than the Planck density, the best we can do in using the big bang model to describe the very early universe is the following: At some point in the past the density of the universe was above the Planck density. We don't know what physics governs such a case so we can make no predictions based on it. Somehow this super-Planckian state (sometimes called spacetime foam) gave rise to at least one region of sub-Planckian density with the right initial conditions to produce the universe we currently see. When I refer to the "initial conditions" for our universe, I will mean the state that the observable part of the universe was in after the density first became sub-Planckian and the universe (or at least this region of it) could thus be described by known theories of physics.
Even in this more limited sense, however, there are certain problems with extending the big bang model back to the beginning of the universe. These problems can be categorized as "initial condition problems" and "relic problems." I define these categories in the following two sections.
The early universe was nearly homogeneous, i.e. the same in different places. Our most direct measure of this uniformity comes from observing the microwave background radiation that was emitted when the universe was roughly 300,000 years old. The intensity of this radiation is a direct measure of how dense the universe was at that time. Looking at this background radiation coming to us from different directions shows that the largest density differences from one point to another were about one part in 100,000. If the universe had been less homogeneous it would not have given rise to the smooth distribution of galaxies we see filling the sky. If it had been exactly homogeneous, however, then clumps of matter like galaxies would never have emerged at all. The big bang model offers no explanation for why the universe emerged in this nearly (but not perfectly) homogeneous state.
The second initial condition has to do with something called curvature. General relativity says that the universe can be closed, i.e. curved inward like the surface of a ball, open, meaning curved outward like the surface of a saddle, or flat, meaning it has no curvature. These different kinds of curvature cause the universe to evolve in different ways; a closed universe will eventually stop expanding and recollapse, while an open universe will tend to fly apart more and more quickly. For the big bang model to work the universe at the time of Planck density must have been almost precisely flat; the curvature couldn't have exceeded one part in 10^59! If it were slightly more curved than this (closed) it would have recollapsed long ago and if it were slightly less so (open) it would have flown apart so quickly galaxies would never have formed. This apparent coincidence, the universe initially having exactly the curvature required to survive to later times and form galaxies, is known as the flatness problem.
It would be wonderful if a theory existed that with a minimum of assumptions could explain the initial conditions such as flatness and homogeneity, eliminate all high energy relic particles, and then segue into the big bang model itself by the time of nucleosynthesis. In 1980 Alan Guth proposed such a theory, known as inflation.[1]
Exponential growth can be much faster than power-law growth. In the simplest models of inflation the universe would have expanded by a factor of over ten to the ten million in a fraction of a second. There are two obvious questions raised by this idea: What mechanism would cause such an expansion to occur and what would be the consequences if it did?
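To get a feel for how different the two kinds of growth are, the following sketch compares them over the same stretch of time. It works with base-ten logarithms of the growth factors, because a number like ten to the ten million is far too large to store directly. The exponent 2/3 is the standard value for a universe dominated by matter and is an assumption here, as is the choice of 100 million doubling times; neither figure comes from this paper.

```python
import math


def log10_exponential_growth(elapsed, doubling_time):
    """log10 of the growth factor when the doubling time stays fixed."""
    return (elapsed / doubling_time) * math.log10(2)


def log10_power_law_growth(t_start, t_end, exponent):
    """log10 of the growth factor when size scales as time**exponent."""
    return exponent * math.log10(t_end / t_start)


# Time is measured in units of the initial doubling time (illustrative choice).
exp_log10 = log10_exponential_growth(elapsed=1e8, doubling_time=1.0)
pow_log10 = log10_power_law_growth(t_start=1.0, t_end=1e8, exponent=2 / 3)

print(f"exponential: distances grow by about 10^{exp_log10:.3g}")
print(f"power law:   distances grow by about 10^{pow_log10:.3g}")
```

Under these assumptions the exponential case grows by more than ten to the ten million, while the power-law case over a comparable interval grows by a factor of only a few hundred thousand.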
In the next section I will discuss the cause of inflation. In the following section I will describe some of the basic consequences of inflation, including the resolution of all the problems raised in the previous section on the big bang model.
In general the expansion rate slows down as the universe expands because the average density decreases. If there are 1000 galaxies in some region of space and all distances double then the volume of space occupied by those galaxies will increase eight times. Since the galaxies have the same total mass as before their density will decrease by eight times. If the mass of galaxies were the only form of energy in the universe then every time distances doubled the doubling time would increase by a factor of the square root of eight. In short a universe whose energy consists entirely of mass will experience power law expansion.
It turns out, however, that other forms of energy behave differently as the universe expands. For example, the energy density contained in light (which is a form of electromagnetic radiation) decreases faster than the energy density of mass. Every time the universe expands by a factor of two the energy density of light decreases not by eight times, but actually by sixteen.[3] So if there is a lot of light energy in the universe the doubling time increases faster than it would for a universe with only mass energy.
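The numbers in the last two paragraphs can be reproduced in a few lines. The sketch assumes that the expansion rate is proportional to the square root of the energy density, which is what the relation in footnote 5 gives when curvature is neglected, so the doubling time is proportional to one over that square root. The function name is hypothetical.

```python
import math


def doubling_time_factor(density_drop):
    """Factor by which the doubling time grows when the energy density
    falls by `density_drop`, assuming expansion rate ~ sqrt(density)."""
    return math.sqrt(density_drop)


# Each time distances double:
matter = doubling_time_factor(2**3)      # mass density falls by 8
radiation = doubling_time_factor(2**4)   # light energy density falls by 16
constant = doubling_time_factor(1)       # a density that does not fall at all

print(f"matter:    doubling time grows by {matter:.3f}")
print(f"radiation: doubling time grows by {radiation:.3f}")
print(f"constant:  doubling time grows by {constant:.3f}")
```

The matter case gives the square root of eight, the radiation case gives four, and an energy density that stayed constant would leave the doubling time unchanged, which is the defining feature of exponential expansion.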
One problem that might occur to you with this fact is that the total energy of the universe is apparently not conserved. If a region of space doubled in radius and the energy density in that region did anything other than decrease by a factor of eight then the total energy would change. The resolution of this problem is a somewhat subtle issue in general relativity and involves a kind of gravitational energy, which we cannot directly observe, associated with the expansion of the universe. I'm not going to get into this issue in any detail here. Suffice it to say that while total energy including gravitational potential energy is still conserved, the amount of energy that we can observe in the universe can change as the universe expands. This gravitational energy is not included in the energy density that determines the expansion rate, and from here on when I refer to the energy density of the universe I will be referring to observable energy, whose density can change in a variety of ways as the universe expands.
What kind of energy would we need to have inflation? During inflation the expansion was exponential, or at least nearly so, meaning the doubling time during inflation didn't change much as the universe expanded. This in turn means that the energy density must have been changing very slowly. We know, however, that inflation did not last forever. Thus to explain inflation we would need to find a form of energy that changes very slowly for some period of time, but then begins decreasing rapidly.
We have never observed a kind of energy that acts like this, but according to our current theories of physics there is one. This kind of energy is in the form of a field, so to explain how this works I have to first describe what a field is. The most commonly known example of a field is a magnetic field. A magnetic field has some value everywhere in space. You can test what that value is by placing a magnet, e.g. a compass, in that spot and seeing how it reacts. Even in the absence of any objects for it to act on, though, a magnetic field by itself has a certain amount of energy. In addition to magnetic fields there are other types such as electric fields, gravitational fields, and so on. In general any field is defined by having some measurable value at every point in space and having an energy density that depends on that value.
Different kinds of fields react differently to the expansion of the universe. For example, I noted above that the energy density of electromagnetic radiation decreases faster than that of ordinary matter. It turns out that there is one particular kind of field with the property that when its energy density is very large that density decreases very slowly as the universe expands. When its energy density decreases past a certain point it stops behaving this way and starts decreasing at the same rate as ordinary matter. Such a field is called a scalar field. I'm not going to explain here what one is or why it behaves in this way; this is just one of those things that you will have to take my word about.[4]
Inflation doesn't require precise exponential expansion. Rather there is a set of mathematical criteria for how close to exponential the expansion needs to be during inflation, i.e. how much the doubling time can change each time distances double, in order for inflation to still have the consequences described below. Given a scalar field with a high enough energy density these conditions will be met and the expansion of the universe can be considered quasi-exponential. In general, however, the energy density of a scalar field is not perfectly constant as the universe expands. Rather it decreases more rapidly the smaller it is, such that eventually when it becomes small enough the universe enters a stage of power-law expansion.
So in order for inflation to have occurred it suffices that some scalar field exists and at some point in the past it had a very large energy density. It's true that we have never to date observed a scalar field, but physicists believe for a variety of theoretical reasons that many of them probably do exist and that we will start to see them in our next generation of particle accelerators. The second requirement, however, requires some thought. Having a scalar field with a large energy density is in many ways like having a very strong magnetic field. If I told you that the early universe was filled with strong magnetic fields you would be justified in wondering why that was so. Recall that the "initial conditions" for our universe were set by physics that we don't know occurring above the Planck scale. So it might be that somewhere in the universe a region emerged with a large value of a scalar field, but why would this have happened simultaneously throughout the whole universe?
The answer is that it wouldn't need to. Suppose that when the universe first started to have sub-Planckian density it was filled with many different regions in which all the fields had very different values. All we require for inflation is that somewhere there was one region, no matter how small, where the largest contribution to the energy came from a high-energy scalar field. If that happened then that small region would inflate, almost instantly growing much larger than all the other regions around it. Very soon this inflationary region would occupy nearly 100% of the total volume.
This point is possibly the most important one in the paper. For that reason it bears repeating, and I urge you to think about it carefully. In the standard big bang model the entire universe started expanding uniformly at the same moment. In the inflationary scenario I am describing this expansion began with an exponential growth in only one small part of the universe while the rest of it either grew in a power law or started shrinking. However this one inflationary region became so big that everything we can see, or will probably ever be able to see, lies within it. Thus the universe appears to us to be uniform, even though on much larger scales it is not.
To understand the impact of inflation on the homogeneity problem described above, consider the universe to be like a piece of rubber being stretched out. Imagine that before you stretch it the surface of the rubber is uneven, i.e. it contains hills and valleys. As you stretch out the rubber the height of these hills and valleys doesn't change, while their width increases dramatically. (On a real piece of rubber the height would decrease because of the tension in the rubber, but that's just a limitation of the analogy. Pretend it wouldn't.) The consequence of the expansion is that what looked like a steep hill beforehand ends up flat. If the sheet started out with a hill one foot wide and one foot tall and then you stretched out its width by a factor of several thousand, you wouldn't even be able to see that there was a hill there at all.
In our universe the hills and valleys are differences in energy density from one place to another and the stretching is the expansion of the universe. Suppose that before inflation there was a region one foot wide in which the energy density varied smoothly such that it was twice as great on one side as the other. During inflation the width of this region would expand by perhaps ten to the ten million times, becoming so vast that we could never hope to see from one end of it to the other. Suppose the region of the universe that we observe, which is roughly thirty billion light years across, were somewhere in the middle of this inflated region. The difference in energy density we would see from one side of the observable universe to the other would be far too small for us to measure.
Inflation also solves the flatness problem. It turns out that in exponential expansion the curvature of the universe decreases. On an intuitive level you can once again think of this in terms of rapidly stretching out a piece of rubber. For a closed universe the rubber is like the surface of a sphere and for an open one it is like a saddle shape, but in either case once you stretch it out enough it looks locally flat. This geometric picture is a somewhat simplified account of how inflation solves the flatness problem. The detailed solution requires solving the equations of general relativity, but when you do you find that after inflation the curvature of the universe will be far too small for us to measure.[5]
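For readers who want to see the scaling argument in numbers, here is a minimal sketch based on the relation quoted in footnote 5. It tracks how the ratio of the curvature term to the energy density term changes as the scale factor grows, again using base-ten logarithms. It assumes the energy density scales as a simple power of the scale factor, and it treats the density during inflation as exactly constant, which is an idealization of the quasi-exponential case. The function name is hypothetical.

```python
def log10_curvature_ratio_change(log10_growth, density_power):
    """Change in log10 of (k/a^2) / (C*r) when a grows by 10**log10_growth.

    Assumes the energy density r scales as 1/a**density_power, so the
    ratio of the curvature term to the density term scales as
    a**(density_power - 2).
    """
    return (density_power - 2) * log10_growth


# Inflation: constant density, scale factor grows by ten to the ten million.
inflation_change = log10_curvature_ratio_change(1e7, 0)
# Matter (density ~ 1/a^3) and light (density ~ 1/a^4), growth of 1000x each.
matter_change = log10_curvature_ratio_change(3, 3)
radiation_change = log10_curvature_ratio_change(3, 4)

print(f"inflation: ratio changes by 10^{inflation_change:.3g}")
print(f"matter:    ratio changes by 10^{matter_change:.3g}")
print(f"radiation: ratio changes by 10^{radiation_change:.3g}")
```

A negative result means curvature is becoming less important. Only the constant-density case drives the ratio down; with matter or light the curvature term gains ground as the universe expands, which is why the big bang model on its own needs such a finely tuned starting value.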
Perhaps more importantly, inflation solves the problem of relic particles. Our current theories of particle physics predict that in the hot, dense conditions that existed before inflation and even during its early stages various kinds of particles would be produced that we don't observe. There are numerous examples of such particles, with names like magnetic monopoles and gravitinos, and many of them presumably were created before inflation began. However, any particles that may have existed in this inflationary region before it began inflating are reduced to a density of essentially zero. In the example of the previous paragraph, if a one foot wide region contained ten to the thirty monopoles before inflation, the odds of a single one ending up within thirty billion light years of us are virtually zero. The same logic applies to particles produced during the early stages of inflation. At some point during inflation the energy density would have dropped enough that such particles could no longer be produced, which explains why we don't see any of them today.
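The monopole example can be turned into a quick estimate. The sketch below takes the numbers from the text (a region one foot wide, ten to the thirty monopoles, a stretch of ten to the ten million, and an observable region thirty billion light years across) and asks how many monopoles we should expect to find in the part we can see. It assumes the monopoles stay evenly spread through the inflated region, and it ignores all expansion after inflation, which would only make the answer smaller. The conversion factor and function name are my own additions for illustration.

```python
import math

# Metres per light year divided by metres per foot (approximate values).
FEET_PER_LIGHT_YEAR = 9.4607e15 / 0.3048


def log10_expected_relics(log10_count, log10_stretch,
                          initial_width_ft, observed_width_ly):
    """log10 of the expected number of relic particles in the observed region.

    Everything is kept in log10 because the stretch factor itself is far
    too large to represent as an ordinary floating point number.
    """
    inflated_log10_width_ft = math.log10(initial_width_ft) + log10_stretch
    observed_log10_width_ft = math.log10(observed_width_ly * FEET_PER_LIGHT_YEAR)
    # Volumes scale as the cube of the width, hence the factor of 3.
    return log10_count + 3 * (observed_log10_width_ft - inflated_log10_width_ft)


log10_monopoles = log10_expected_relics(
    log10_count=30,        # ten to the thirty monopoles
    log10_stretch=1e7,     # widths grow by ten to the ten million
    initial_width_ft=1.0,
    observed_width_ly=30e9,
)
print(f"expected monopoles in view: about 10^{log10_monopoles:.4g}")
```

The exponent comes out close to minus thirty million, so starting with ten to the thirty monopoles, or even vastly more, makes no practical difference to the conclusion.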
All of these results of inflation are theoretically attractive as explanations of the features we observe in the universe at large scales. Probably the most important success of inflation, however, is its explanation of the origin of inhomogeneities in the universe. I noted in the rubber sheet analogy above that any fluctuations existing before inflation get stretched out so much that we can no longer detect them. However quantum mechanics predicts that some fluctuations will always be produced at small scales. During inflation quantum mechanics causes microscopic fluctuations to be generated and inflation stretches these fluctuations out to large distances. The quantum fluctuations produced early in inflation get stretched out so much that we can't see them. As before it's as if we were standing on a hill so wide it looks flat to us. Fluctuations generated close to the end of inflation, however, produce "hills" whose width is still small enough today for us to see from one end of them to the other. The height of these hills, i.e. the difference in energy density from one place to another, is small because the quantum fluctuations that produced them represented small perturbations in the energy density.
When inflationary theory was developed in the 1980s people used this theory of stretched out quantum fluctuations to predict the differences that should be seen in the microwave background radiation from one part of the universe to another. These differences are so small that they weren't detected at all until the 1990s. More than just predicting the existence of these fluctuations, inflation predicts in some detail the shape that they will have. Only in the last few years of the 1990s were measurements taken with enough accuracy to test the detailed predictions of inflation regarding these fluctuations, and to date the data match these predictions perfectly.[6]
It is primarily the successful prediction of the form of the microwave background fluctuations that has caused inflation to be generally accepted today by most early universe physicists. Since these small inhomogeneities are what later gave rise to galaxies, stars, and us, all of the structure we see in the universe arose because of quantum fluctuations.
Recall that inflation occurs when in some region of space the largest contribution to the energy density comes from a high energy scalar field. In the tradition of naming particles and fields with the suffix on (proton, neutron, photon, ...) the scalar field that caused inflation is called the inflaton. During inflation the universe expands exponentially and the energy density of everything else drops to essentially zero, while the energy density of the inflaton field decreases only very slowly. Inflation ends when this field reaches a low enough energy density that it starts behaving like matter, i.e. when the universe starts experiencing power law expansion. So at the end of inflation essentially all of the energy of the universe is contained in this one, nearly homogeneous field.
Many fields and particles in the universe are unstable, meaning that after a short time they decay into other forms of energy. For example some types of uranium can decay into other elements. In fact most types of energy are unstable in this sense. The only reason we don't see decays like this happening all the time is that most of the unstable forms of energy in the universe decayed long ago, and the things we have left are generally the particles that either don't decay or do so very slowly. Once its energy density becomes small enough to allow power law expansion, the inflaton field becomes highly unstable. (This is another fact about scalar fields that can be derived from field theory, but which you will have to take my word about.) After inflation the energy in the inflaton field would have quickly decayed into other particles and fields until eventually the universe consisted mainly of long-lived forms of energy such as protons, neutrons, electrons, and electromagnetic radiation.
It may occur to you to wonder about relic particles in the context of reheating. The whole problem with these relics is that they can be produced in the hot, dense conditions of the early universe, but we don't see them today. The reason is that after inflation the density and temperature were too small to produce these particles. Any monopoles and gravitinos produced before inflation or during its early stages were spread out too thin for us to find, and by the time of reheating the density and temperature had dropped too much for them to be produced.
How long did it take after inflation for the inflaton to decay into the particles we see today? We don't know, but what we can say is that it had to be finished by the time of nucleosynthesis, i.e. about three minutes after the end of inflation. Assuming that to be true, the universe at the time of nucleosynthesis would have looked exactly like it does in the big bang model (minus particles like monopoles). The key difference is that inflation explains why the universe had many of the features it did at that time.
Will the theory of inflation survive those changes? I believe it's too early to answer that question with any confidence. As a model it has great appeal for a number of reasons. In particular, it explains a lot of features of the universe in a simple way with relatively few assumptions, and it seems to arise naturally in the context of our current theories of physics. In other words it seems highly likely that inflation would have occurred in the early universe, and if it did it would give rise to a universe much like the one we see. Moreover no other known theory can explain these features. Andrei Linde, one of the leading experts in inflation, once told me "Inflation hasn't won the race, but so far it's the only horse." My personal suspicion is that if in a hundred years the theory of inflation isn't part of our understanding of the early universe then it will have to have been replaced by something very similar to it.
In the meantime we can look forward to a lot of good tests of early universe physics in the next couple of decades. High sensitivity probes of the microwave background, searches for waves of gravity surviving from the early universe, and many other experiments are going to give us excellent tests, not only of inflation, but of our understanding of the universe in general.
1. Some of the key elements of inflation were discovered a few years earlier by Alexei Starobinsky. He failed to realize the significance of this discovery for the problems discussed here, however. Alan Guth coined the term "inflation" for his 1980 model and discussed how it solved the problems of the standard big bang model. Guth was probably unaware of Starobinsky's work at the time as communication was limited between the U.S. and the Soviet Union.
2. There's some evidence that the universe may currently be entering a new stage of exponential expansion, in which case the prediction that distances will double from their current values in about 30 billion years would be incorrect. That possibility, while very interesting, wouldn't change any of what this paper says about inflation and the early universe. In between the time of inflation and now the universe was almost certainly undergoing power law expansion.
3. For the more technical reader, the reason this occurs is that the frequency of the light is redshifted. Thus the density of photons (light particles) decreases by a factor of eight while the energy of each photon decreases by a factor of two. If you don't know what these terms mean I assure you this won't be necessary for anything else in the paper.
4. Technically, not all scalar fields would act this way. Every field is defined in part by a potential function that determines its behavior. If a particular scalar field has a potential function that satisfies certain conditions then it will behave in the way I'm describing. Whenever I refer to the scalar field responsible for inflation I will be implicitly assuming that it satisfies these criteria.
5. For readers comfortable with equations I can offer a more complete explanation of how inflation solves the flatness problem. The relevant equation is H^2 = C r - k/a^2, where H is the expansion rate (roughly one over the doubling time), C is just a constant, r is the energy density of the universe, k is a measure of the curvature of the universe, and a is something called the scale factor. The scale factor is a measure of the growth of the universe. You can think of it as the distance between any two arbitrarily chosen points, so that when distances double the scale factor doubles. For a matter dominated universe the energy density scales as 1/a^3 and the curvature term, which scales as 1/a^2, becomes relatively more important as the universe expands. (Note that a closed universe (k positive) will therefore rapidly reach the state H = 0, meaning the expansion stops and the universe recollapses.) During inflation, however, the energy density is constant and thus the curvature term on the right hand side of the equation becomes relatively unimportant. If the scale factor increased by ten to the ten million times during inflation then the k/a^2 term would have become so small that we wouldn't detect any curvature today, which is precisely what we observe.
6. I'm glossing over this point a bit. What inflation actually predicts is the spectrum of the fluctuations, which in this context means how much the intensity of the radiation differs as a function of distance. In other words inflation can tell you what the average difference in radiation intensity was between points separated by, say, one light year in the early universe.