Science and Reason: The Big Bang
Introduction
The big bang history of the universe
Cosmological models
The cosmological constant
Nucleosynthesis
Evidence for the big bang theory
Open questions
Recommended references: Web sites
Recommended references: Magazine/journal articles
Recommended references: Books
Introduction
The big bang theory of the initial state and subsequent development of
the universe was originated around 1931 by
Georges Lemaître and further developed in the 1940s by
George Gamow and his students Ralph Alpher and Robert Herman. Other
theories, especially Fred Hoyle's "steady-state" theory, have been
proposed to deal with the origin (if there was one) and evolution of
the universe, but only the big bang theory comes close to fitting a number of
key observations.
So, what exactly is the big bang theory? One way to answer that is to
consider the primary observational fact which suggested the theory in
the first place. The fact is that the universe appears to be expanding
in a uniform way.
This fact was established by Edwin Hubble in the late 1920s. He did this
by comparing two sorts of observations – the "redshifts" in
light from distant galaxies as determined by examining the spectra
of light from those galaxies, and the distances of those galaxies from us
as determined in a way that Hubble pioneered.
Let's consider redshift first. Star light, whether from our Sun or stars
in distant galaxies, contains distinct features in its spectrum. In
addition to the familiar continuous progression of colors from red to
violet, the spectrum contains dark lines at specific wavelengths, called
"absorption lines" because they are the result of specific wavelengths of
light absorbed by cooler gas in the star's atmosphere. These lines
correspond to energy level transitions in orbital electrons of specific
elements found in stellar atmospheres.
We know, from laboratory work, at exactly which wavelengths such lines
should occur. The lines, moreover, occur in a pattern of recognizable
groupings. What we observe in light from distant galaxies is that the
pattern of all lines we can identify is uniformly shifted to longer
wavelengths, i.e. toward the red end of the spectrum. By "uniformly",
we mean that every single line is observed at a wavelength which is the same
constant multiple of what it should be. For example, sodium has an
absorption line at a wavelength of 588.9 nm (nm = nanometers =
$10^{-9}$ meters), in the yellow part of the spectrum. In light from a distant
galaxy we might instead observe this line at twice the expected
wavelength (1177.8 nm, which is in the infrared part of the spectrum, so
no longer visible to the eye). The redshift is defined as a number z
which is 1 less than the factor by which wavelength is expanded. In other
words,
(observed wavelength) / (emitted wavelength) = z + 1
A simple algebraic rearrangement gives the equivalent formula:
z = [(observed wavelength) - (emitted wavelength)]/(emitted wavelength)
Thus, in the given example, z = 1. The definition is arranged so that
z is 0 for no redshift at all, and z has a positive value for any other
redshift. (If there is a blueshift, with absorption lines shifted
towards blue, then z will be negative.)
z has another nice property for small enough values of z
(z ≤ 100, say, and certainly for z ≤ 8,
which is about the largest redshift of any object that has been
observed). Specifically,
there is a relationship between redshift and the age of the universe when
the light was emitted, given by
$$\frac{1}{(z+1)^{3/2}} = \frac{t_e}{t_{\mathrm{now}}},$$
where $t_e$ is the period of time after the
big bang when the light was emitted and
$t_{\mathrm{now}}$ is the present time, about 13.7 billion
years. Hence for z = 8, the light was emitted when the universe
was about one 27th of its present age,
or about 500 million years after the big bang.
On the other hand, although the distance of an object increases
with its redshift, the precise relationship between them is not
quite so simple unless z is small (not much greater than 1).
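The redshift definition and the age relation described above can be tried out numerically. The following is a minimal sketch, not a cosmology library: the function names are made up for illustration, and the age relation is used exactly as stated in the text, so it inherits the same range of validity.

```python
def redshift(observed_nm: float, emitted_nm: float) -> float:
    """Redshift z = (observed - emitted) / emitted, as defined in the text."""
    return (observed_nm - emitted_nm) / emitted_nm


def emission_time_fraction(z: float) -> float:
    """Fraction t_e / t_now of the present age at which the light was emitted.

    Uses the relation 1 / (z + 1)^(3/2) = t_e / t_now quoted in the text.
    """
    return (1.0 + z) ** -1.5


SODIUM_LINE_NM = 588.9   # laboratory wavelength of the sodium line
AGE_NOW_YEARS = 13.7e9   # present age of the universe used in the text

# The sodium line observed at twice its laboratory wavelength gives z = 1.
z_example = redshift(2 * SODIUM_LINE_NM, SODIUM_LINE_NM)
print(f"z for the sodium example: {z_example:.1f}")

# For z = 8 the light left its source when the universe was 1/27 of its age.
fraction = emission_time_fraction(8.0)
print(f"emitted about {fraction * AGE_NOW_YEARS / 1e6:.0f} million years after the big bang")
```

A line shifted toward the blue gives a negative value from `redshift`, matching the sign convention in the text.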
Redshifts in the spectra of stars and galaxies had been studied
since Vesto Slipher began pioneering
this area about 1912. By 1925 he had measured the spectral redshifts
of about 40 galaxies. The natural assumption as to the cause of
these redshifts was that they represented the Doppler effect due to
motion away from us of stars and galaxies. For nearby objects, this
explanation is entirely correct. In fact, we can deduce that nearby galaxies
are rotating by observing that the spectrum of light from one side
of the galaxy is redshifted, while the spectrum from the other side
is blueshifted (since it is moving towards us). As it turns out,
the redshift of distant galaxies is actually due to the expansion
of space itself between us and the distant galaxy, rather than
relative motion in the usual sense, so the redshift is not quite
the same as a Doppler shift.
Now let's consider measuring distances to galaxies. This is the
area Hubble was pioneering in the 1920s.
Think first of determining how far away a single star is.
If you know how bright a particular star is intrinsically
and you know its apparent brightness
as seen through a telescope, then the
distance can be computed because the ratio of intrinsic to
apparent brightness is proportional to the square of the distance.
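As a rough illustration of this inverse-square reasoning, the sketch below turns an intrinsic brightness (luminosity) and an apparent brightness (received flux) into a distance. It assumes the star radiates equally in all directions, so that flux = luminosity / (4π × distance²); the function name and the solar figures in the example are illustrative choices, not part of the original discussion.

```python
import math


def distance_from_brightness(luminosity_watts: float, flux_watts_per_m2: float) -> float:
    """Distance in meters from intrinsic luminosity and apparent flux.

    Assumes isotropic emission: flux = luminosity / (4 * pi * distance**2).
    """
    return math.sqrt(luminosity_watts / (4.0 * math.pi * flux_watts_per_m2))


# Sanity check with the Sun (approximate values, used only as an example):
SOLAR_LUMINOSITY_W = 3.828e26   # total power output of the Sun
SOLAR_FLUX_AT_EARTH = 1361.0    # sunlight received at Earth, in W per square meter

d_sun = distance_from_brightness(SOLAR_LUMINOSITY_W, SOLAR_FLUX_AT_EARTH)
print(f"distance to the Sun: {d_sun:.3e} m")  # roughly 1.5e11 m

# A star of the same luminosity that looks 4 times fainter is twice as far away.
d_far = distance_from_brightness(SOLAR_LUMINOSITY_W, SOLAR_FLUX_AT_EARTH / 4.0)
print(f"ratio of distances: {d_far / d_sun:.1f}")
```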
The problem is being able to determine the intrinsic brightness of a given
star. There's no good way to do this in general. But there is a certain
type of star, called a Cepheid variable, whose brightness fluctuates
periodically. The intrinsic brightness is closely related to
the length of the star's
period: the longer the period, the brighter the star. The distance to nearby Cepheids can be measured by the method of
parallax (applying trigonometry to the difference in the star's apparent
position when the Earth is at opposite ends of its orbit). This makes it
possible to compute the intrinsic brightness of nearby Cepheids.
Hubble was the first to observe a Cepheid in a galaxy other than the Milky
Way – the Andromeda Galaxy, M31 – which enabled him to
deduce that the galaxy was located well outside our own. This settled a
very important open question of the time – the fact that spiral
galaxies like M31 are huge aggregates of stars much like our own
galaxy, instead of cloudlike "nebulae" inside our galaxy. Hence our
galaxy isn't unique, but is only one of many. Hubble went on to find
Cepheids in more distant galaxies.
Armed with distance data provided by the Cepheids, Hubble was then
able to consider a new question: whether there is any relation between
the amount of redshift associated with a galaxy and the distance of
the galaxy. Hubble found that there was indeed a simple
relationship – which is now known as Hubble's law.
Hubble's law
When Hubble compared the amount of redshift of a galaxy to its apparent
distance, he found a striking relationship. This relationship did not
exist for the galaxies closest to us, such as the Andromeda Galaxy,
which is about 2.5 million light years away and actually moving towards
us, so its light is blueshifted. But for galaxies substantially farther
away than M31, the amount of redshift was directly proportional to
the distance of the galaxy. Interpreting redshift as if it were a
Doppler shift, this means that the farther away an object is, the faster
it is receding from us. This is exactly the effect one would expect to
see in a three dimensional volume that is expanding at a uniform rate:
points close together move apart relatively slowly, while widely
separated points move apart more quickly. In fact, their rate of
separation is proportional to their distance from each other.
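The claim that uniform expansion makes separation speed proportional to distance, whichever point one observes from, can be checked with a toy model. The sketch below is an illustration under simple assumptions: galaxies sit on a line, and a galaxy at coordinate x moves at `expansion_rate * x` in some arbitrary frame. All names and numbers are invented for the example.

```python
def view_from(positions, observer_index, expansion_rate):
    """Return (distance, velocity) of every other galaxy relative to one observer.

    Assumes a uniform expansion in which a galaxy at coordinate x moves with
    velocity expansion_rate * x in an arbitrary reference frame.
    """
    x0 = positions[observer_index]
    v0 = expansion_rate * x0
    return [
        (x - x0, expansion_rate * x - v0)
        for i, x in enumerate(positions)
        if i != observer_index
    ]


def hubble_ratios(positions, observer_index, expansion_rate):
    """Velocity divided by distance for each galaxy seen by the chosen observer."""
    return [v / d for d, v in view_from(positions, observer_index, expansion_rate)]


galaxies = [0.0, 1.0, 2.5, 4.0, 7.5]   # arbitrary positions along a line
RATE = 0.07                            # arbitrary expansion rate

# Every observer measures the same velocity-to-distance ratio for every galaxy.
for observer in range(len(galaxies)):
    ratios = hubble_ratios(galaxies, observer, RATE)
    print(observer, [round(r, 6) for r in ratios])
```

No observer is singled out: each one sees all the others receding at a speed proportional to their distance, with the same constant of proportionality.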
From these observations, Hubble stated the law that is named after
him: the velocity of recession of
a distant object is proportional to its distance, with constant of
proportionality denoted by H. That is:
recession velocity = H × distance
H is often referred to as the "Hubble constant". Actually, H varies with
time, as we will find later. And because the speed of light is finite,
the farther away from us something is, the earlier is the time in its
life that we are seeing it. Therefore, H itself depends (in a
complicated way) on the distance of the object from which it is
inferred. So it is somewhat more correct to refer to it as the
Hubble parameter. But for relatively nearby galaxies (except the
very closest ones), and especially all galaxies for which we could
obtain accurate distance estimates until very recently, the Hubble
parameter is nearly constant, so that there is a linear relationship
between recession velocity (inferred from redshift) and distance.
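For relatively nearby galaxies, Hubble's law can be used to turn a measured redshift into a rough distance. The sketch below rests on two assumptions that are not stated in the text: the recession velocity is approximated as c × z, which is only reasonable for small z, and H is taken to be 70 km/s per megaparsec, a round illustrative value.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in km/s
HUBBLE_KM_S_PER_MPC = 70.0          # assumed value of H, for illustration only


def recession_velocity_km_s(z: float) -> float:
    """Approximate recession velocity for small redshift: v = c * z."""
    return SPEED_OF_LIGHT_KM_S * z


def distance_mpc(z: float, hubble: float = HUBBLE_KM_S_PER_MPC) -> float:
    """Distance in megaparsecs from Hubble's law: distance = velocity / H."""
    return recession_velocity_km_s(z) / hubble


z = 0.01
print(f"v = {recession_velocity_km_s(z):.0f} km/s, distance = {distance_mpc(z):.1f} Mpc")
```

Doubling the redshift doubles the inferred distance, which is the linear relationship described above.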
Although other explanations for this redshift are conceivable, the one
that eventually seemed compelling, in light of all the evidence, is that
it is because the universe is expanding.
If the universe is expanding homogeneously (i.e., at the same rate
as measured from any point in the universe), the distant objects
will appear to be receding from us at a rate that is proportional
to the distance. To repeat, the redshift is not due to a traditional
Doppler effect, but rather because the expansion of the universe
causes the wavelength of each photon of light to be stretched.
As noted, the end result is similar to the Doppler effect, though not quite
the same. (And incidentally, objects held together by chemical or
gravitational forces, such as molecules, planets, and galaxies, are not subject
to this expansion, even though photons are.)
Of course, we are not at any special place in the universe. This is what
is known as the cosmological principle – the universe looks
the same no matter where it is observed from. Everything is receding
from everything else according to the same Hubble law.
From this, we can draw a very interesting and rather disquieting
conclusion, which leads directly to the "big bang" idea.
The conclusion is that,
if we run the evolution of the universe backwards, there should
be a time in the distant past when everything we can now see was
a heck of a lot closer to everything else. It would have been very
crowded at some point in time. While this does not prove that everything
was bunched together in a state of very high density at some time long ago,
it certainly suggests the possibility. Physicists like George Gamow who
thought about such things made this and other inferences which,
eventually, led to... the theory of the big bang – the notion that
at some point in time about 14 billion years ago (by current reckoning),
all matter in the observable universe was in a state of extremely high
density and temperature, and subsequently "exploded", resulting in the
expansion we still see today. ("Explosion" is used metaphorically.
Exactly what did happen is still a major open question.)
How long ago, one can ask, might this event have occurred?
Well, notice that if H is the Hubble parameter, then 1/H has the dimensions
of time. We can rewrite Hubble's law as
distance = (1/H) × (recession velocity)
Though this is just a rough approximation, if the law holds all the way
back and if the rate of expansion has remained constant since the
beginning, 1/H should be about the amount of time required for the most distant
objects we can see in the
universe to now be at the distance they appear to be.
For many years there was a serious problem with that interpretation.
It turns out that, because of the
great difficulty of measuring cosmic distances, Hubble initially underestimated
distances by nearly an order of magnitude. Since there was little
uncertainty about the redshift, and hence the recession velocity, this
made the "age" of the universe about a factor of 10 too small.
Like, about a mere 2 or 3 billion years. Other estimates of the
age of the solar system and the Earth pointed to an age of 4 or 5 billion
years for our home planet. Oops.
Since Hubble's distance estimates were way off,
this paradox was only apparent.
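The rough age estimate 1/H is easy to work out once the units are untangled. The sketch below converts a Hubble parameter given in km/s per megaparsec into a time in years. The two values of H are assumptions chosen for illustration: 70 is a round modern figure, and 500 is roughly what Hubble's original, underestimated distances implied.

```python
KM_PER_MPC = 3.0857e19        # kilometers in one megaparsec
SECONDS_PER_YEAR = 3.15576e7  # seconds in a year of 365.25 days


def hubble_time_years(hubble_km_s_per_mpc: float) -> float:
    """Return 1/H in years, for H given in km/s per megaparsec."""
    hubble_per_second = hubble_km_s_per_mpc / KM_PER_MPC   # H in units of 1/s
    return 1.0 / hubble_per_second / SECONDS_PER_YEAR


for label, h in [("modern estimate (assumed 70)", 70.0),
                 ("Hubble's early value (assumed 500)", 500.0)]:
    print(f"{label}: 1/H is about {hubble_time_years(h) / 1e9:.1f} billion years")
```

With the larger value of H the result comes out near 2 billion years, less than the age of the Earth, which is the apparent paradox described above.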
In all fairness, measuring large cosmic distances, until recently, was
pretty difficult. The process is, essentially, to take the actual
luminosity of some object and compare it to the observed luminosity.
Knowing that luminosity should fall off by the inverse square law,
we can compute what the distance must be. However,
stars and galaxies come in a wide range of brightnesses.
Astronomers must rely on indirect means to estimate the
actual luminosity of a particular object. In Hubble's case this
involved measuring the periods of Cepheid variables in order to infer
their luminosity.
The problem was that there are actually two different types of Cepheids.
Hubble assumed a period-luminosity relationship appropriate for one
type, but most of the Cepheids observed in distant galaxies were of the
other type. Eventually, as these stars became better understood, and
telescopes became more powerful, the estimated distances of remote
galaxies were increased significantly. But even up until the 1990s, when
Type Ia supernovae could be used to judge distances, there was a lot of
uncertainty in the value of the Hubble parameter.
It might seem that a theory which held that the universe was in a process
of rapid expansion from a dense lump of matter for no obvious
reason might be a little
hard for astronomers and others to swallow... except for one additional fact.
Which is, that Einstein's fairly new, and even more newly verified, theory
of general relativity predicted exactly this state of affairs. Even so,
it was quite a stretch, which most cosmologists were reluctant to make.
The idea of being able to extrapolate features of the whole universe
from simple observations and principles seemed almost too good to
be true, so for a long time astronomers and cosmologists didn't
take their own predictions too seriously. This was true of the
prediction of general relativity that the universe was not static,
and there have been other important cases of the same thing. (In the
case of "cosmic microwave background" radiation, for example.)
Einstein himself realized that his theory predicted the universe should not
simply be static. It might be collapsing under the gravitational attraction
of everything for everything else, or (for reasons less apparent) expanding.
But it shouldn't just sit there, going nowhere, or levitating calmly
in midair. Yet in 1915, when the
theory was published, and a few years before Hubble came along, just
sitting still quietly was what everyone assumed the universe was doing.
To make this mathematically plausible, Einstein even added a new term
to his equation of general relativity, the famous "cosmological constant".
Then, around 1929 when Hubble showed that expansion was a fact, Einstein
decided his cosmological constant was just a big mistake, and he scrapped it.
(That may have been premature, as will be seen later.)
After Sir Arthur Eddington in 1919 confirmed a key prediction of the
general theory of relativity, namely the bending of light by gravity,
the overall theory began to be taken pretty seriously, even as unconventional
as it was. Various people began coming up with solutions of Einstein's
equation. Among them was an obscure Russian meteorologist
named Alexander Friedmann.
In the early 1920s he published solutions of the equation which
implied several forms of an expanding universe, exactly consistent
with Hubble's observations. Unfortunately, Friedmann's work was
at first ignored, and later disputed, even by Einstein himself. Although
Einstein quickly withdrew his objections, theorists remained reluctant
to take such seemingly outrageous models entirely seriously.
The big bang hypothesis
One of the very few who did take Friedmann's solution seriously was
Georges Lemaître, a Belgian cleric. He reasoned, as above,
that the universe, at some time long ago, must have been in a very
hot dense state, from which it has rapidly expanded until the present.
This was the first noteworthy appearance of the big bang hypothesis.
However, Lemaître was not able to make any testable predictions
from this idea.
And so, between the large uncertainties in Hubble's observations and the
apparently unrealistic nature of Friedmann's models, little progress was
made through the 1930s and early 1940s. Although the main ideas of the
big bang were sitting there just waiting to be developed, few scientists
– including Einstein –
showed much interest. Until George Gamow came along.
What interested Gamow was the problem of nucleosynthesis. That is,
the question of how nuclei of all the chemical elements came to
be formed. This accompanied the question of how energy was generated
inside of stars, because it became clear that the process of nuclear
fusion – in which heavier elements are built up, ultimately, by protons
(hydrogen nuclei) fusing together somehow – could explain the source
of stellar energy. Gamow and his students made substantial contributions
to both issues.
In particular, they made calculations in the late 1940s
of what might come out of
a very hot, dense "gas" of neutrons and protons – exactly the form
that matter would take at a certain point in a big bang scenario.
They succeeded in coming up with reasonable estimates, that only
hydrogen and helium would be produced in substantial amounts, and
more of the former than the latter.
Although these calculations were largely
ignored or forgotten for some time by the astrophysical community, they
have more recently been much refined, to the point that they produce
an excellent agreement with observation – and provide one of the key
lines of evidence in support of the big bang model.
Gamow and his students also calculated that
the ultrahot, ultradense state of matter in the early universe should
have produced radiation which might still be observable today, at an
apparent temperature of perhaps several tens of
degrees above absolute zero. (These estimates, as well as the
ones for nucleosynthesis, were somewhat off, but still remarkable.)
Yet these calculations predicting a microwave background radiation
were taken even less seriously. Even Gamow himself didn't pursue
the idea very vigorously, when it seemed it would be very difficult to
observe such background radiation.
Nevertheless, Gamow pushed on to fill out the general scenario of the
evolution of the universe according to the big bang model. Throughout
the 1950s and early 1960s it was locked in close competition with
the "steady state" theory of Fred Hoyle and others as a model
for the universe's evolution over time. The steady state theory was
equally radical in some respects. For instance, it postulated the ongoing
continuous creation of matter out of nothing. But there didn't seem to
be enough observational evidence to tip the scales decisively towards
either theory.
And then in 1964 Arno Penzias and Robert Wilson discovered, quite by
accident and without looking for it, the cosmic microwave background
radiation, largely as predicted by Gamow. (Penzias and Wilson got
a Nobel Prize for this; Gamow never received the prize.)
The rest is history (which is, basically, what the "steady state" theory
became).
The big bang history of the universe
Let's eliminate any suspense, and simply start out with a table of
the major events and time periods that make up the big bang model.
Each line in the table is one of the landmark events that took place
as the universe expanded. The second column contains the approximate
time after the big bang itself that the event occurred. The third
column is the approximate temperature of the universe at the time
of the event. (The temperature is equivalent to an average kinetic
energy that particles would have possessed.)
Following the table, we will explain in more detail what was going on
as the universe expanded and cooled.
Cosmological timeline
| Name | Time | Temperature | Notes |
|---|---|---|---|
| Planck era | $10^{-43}$ sec. | $10^{32}$ K | Equality of all forces |
| Inflation begins | $10^{-35}$ sec. | $10^{28}$ K | GUT symmetry breaks |
| Inflation ends | $10^{-32}$ sec. | $10^{27}$ K | Strong & electroweak forces are separate |
| Electroweak symmetry breaks | $10^{-12}$ sec. | $10^{15}$ K | Separation of weak & electromagnetic force |
| Baryogenesis | $10^{-6}$ sec. | $10^{13}$ K | Quarks confined in hadrons, ending era of quark-gluon plasma |
| Quark confinement complete | $10^{-5}$ sec. | $3\times10^{12}$ K | Only lightest hadrons remain (protons, neutrons); antimatter annihilated |
| Nuclear binding | 1 sec. | $10^{10}$ K | Nuclear binding energy exceeds photon energy; neutrinos decouple |
| Nucleosynthesis begins | 300 sec. | $9\times10^{8}$ K | $^{2}$H, $^{3}$H, $^{3}$He, $^{4}$He form |
| Nucleosynthesis ends | 35 min. | $3\times10^{8}$ K | Mostly H and $^{4}$He left |
| Matter dominance | 47,000 yrs. | 8000 K | Matter density exceeds radiation density |
| Recombination begins | 240,000 yrs. | 3700 K | Electrons begin to bind into atoms of H and $^{4}$He |
| Last photon scattering | 350,000 yrs. | 3000 K | Universe becomes fully transparent, no matter/photon interactions |
| Reionization | ∼$10^{8}$ yrs. | 30 K | First stars |
| First galaxies | ∼$5\times10^{8}$ yrs. | 10 K | Corresponds to redshift z = 8 |
| Present | $13.7\times10^{9}$ yrs. | 2.725 K | Internet, Chinese food, etc. |
In the beginning: the Planck era
At the very beginning, cosmology and high energy particle physics
overlap completely, because of the extremely high energies (i.e.
high temperatures) that were characteristic of this initial period.
You may want to refer to our discussion of the
standard model of particle physics for
an overview of the relevant concepts of particle physics that we will
(mostly) not repeat here.
One of the basic ideas of the standard model of particle physics is
that there are four fundamental forces: gravity, the strong force,
the weak force, and the electromagnetic force. However, at the very
earliest time in the existence of the universe, all four forces
were "unified" and indistinguishable. This is similar to
the way we regard electromagnetism as a single "unified" force,
even though at one time (before the work of James Clerk Maxwell)
physicists regarded electric and magnetic force as distinct.
Contemporary physics is able to say almost nothing about the very first
instants of the universe. This is because the strength of the gravitational
force at the earliest times was the same as the strengths of the
other physical forces. Therefore, both quantum mechanics and general
relativity must be applied to describe the physics of this period.
But unfortunately, this is not possible, since we do not currently have a
workable theory of quantum gravity that would merge these two fundamental
theories.
About all physicists can do is make certain educated guesses. As an example
from more than 100 years ago, one very inspired
insight was due to the German physicist Max Planck. His insight provided
the foundations for quantum mechanics, and it also leads to a natural
scale for the time and energy level of the universe during the brief
instant when all four forces were unified.
Basic to all physical theories
are certain fundamental quantities: mass, length, and time. Almost
everything in physics can be described in terms of certain combinations
of these units. For instance, speed is length divided by time (e.g.
meters per second). If we denote these basic quantities with the letters M,
L, T, then speed can be represented as L/T. Other physical quantities
can be described as products of powers (including fractional and
negative powers) of these basic ones.
In 1899 Planck was working on the theory of blackbody radiation, that is,
the theory of how very hot objects such as molten rock or metal or
the filament of an incandescent light give
off visible (as well as invisible) light – or more generally,
electromagnetic radiation. The theory was in big trouble, since
as it existed, it predicted that the total amount of energy given off
by a hot object could be infinite. Planck's insight, for which he won
a Nobel Prize in 1918, was that the theory could be salvaged if
energy did not occur in continuous increments, but necessarily
occurred only in discrete units or quanta. This inspired
guess is also the basis on which Planck is considered the inventor
of quantum mechanics.
As part of his new theory, Planck introduced a new fundamental constant,
which of course is called Planck's constant, denoted by
the symbol ℎ. In symbols, Planck's result is that the energy of
some amount of monochromatic light can be expressed as
E = n × ℎ × ν
where E is energy, ν is frequency in cycles per second of the
monochromatic light, and n is some non-negative integer,
which represents the number of energy quanta.
Since ν has units of 1/T, and energy has units of
$M(L/T)^2$, it follows that ℎ has
units of $ML^2/T$.
Note that the actual value of the energy E depends on the units
of measure in which the various quantities are expressed.
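To get a feel for the size of a single quantum, the sketch below evaluates E = n × ℎ × ν for yellow sodium light. It assumes the standard relation ν = c/λ between frequency and wavelength, which the text does not spell out, and it reports the result in electron-volts. The function name is invented for the example.

```python
PLANCK_H = 6.62607015e-34        # Planck's constant, in joule-seconds
SPEED_OF_LIGHT = 299_792_458.0   # speed of light, in meters per second
JOULES_PER_EV = 1.602176634e-19  # one electron-volt expressed in joules


def light_energy_ev(wavelength_nm: float, n_quanta: int = 1) -> float:
    """Energy in eV of n quanta of monochromatic light: E = n * h * nu.

    The frequency nu is obtained from the wavelength via nu = c / wavelength.
    """
    frequency_hz = SPEED_OF_LIGHT / (wavelength_nm * 1e-9)
    return n_quanta * PLANCK_H * frequency_hz / JOULES_PER_EV


# One quantum of the 588.9 nm sodium line carries a little over 2 eV.
print(f"{light_energy_ev(588.9):.3f} eV")
# Energy comes only in whole multiples of that amount.
print(f"{light_energy_ev(588.9, n_quanta=3):.3f} eV")
```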
Another brilliant idea Planck had was to use certain well known
fundamental constants to derive what he considered to be "natural"
units for quantities such as mass, length, time, energy, etc. – all
relative to his constant ℎ.
Two obvious fundamental constants were the speed of light, c (which
has units L/T), and Newton's gravitational constant G (which has
units $L^3/(T^2 M)$).
Planck also threw in a factor of 2π (so that frequency could be
expressed in radians per second), to make the modified constant
ℏ = ℎ/2π, and noted that the
expression $\hbar G/c^5$ has units
$$(ML^2/T)\,(L^3/(T^2 M))\,(T^5/L^5) = T^2.$$
Therefore the quantity $(\hbar G/c^5)^{1/2}$
has units of time. This unit is called the Planck time unit, and it
has a value of about $5.39\times10^{-44}$ seconds.
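Both halves of this argument, the bookkeeping of units and the numerical value, can be checked mechanically. In the sketch below, each quantity's dimensions are stored as a tuple of exponents of (M, L, T), so multiplying quantities means adding exponents. The helper names are invented for illustration, and the constants are standard SI values rather than anything taken from the text.

```python
import math

# Dimensions as exponents of (mass M, length L, time T).
HBAR_DIM = (1, 2, -1)    # M L^2 / T
G_DIM = (-1, 3, -2)      # L^3 / (T^2 M)
C_DIM = (0, 1, -1)       # L / T


def dim_mul(a, b):
    """Dimensions of a product: add the exponents."""
    return tuple(x + y for x, y in zip(a, b))


def dim_pow(a, n):
    """Dimensions of a quantity raised to the power n: scale the exponents."""
    return tuple(x * n for x in a)


# Dimensions of hbar * G / c^5.
combo_dim = dim_mul(dim_mul(HBAR_DIM, G_DIM), dim_pow(C_DIM, -5))
print("exponents of (M, L, T):", combo_dim)   # only T survives, squared

# Numerical value of the Planck time in SI units.
HBAR = 1.054571817e-34   # J s
G = 6.67430e-11          # m^3 / (kg s^2)
C = 299_792_458.0        # m / s

planck_time = math.sqrt(HBAR * G / C**5)
print(f"Planck time: {planck_time:.3e} s")
```

The exponent tuple comes out as (0, 0, 2), i.e. T², so the square root has the dimensions of time, as the text argues.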
When he did this, Planck had no idea of what quantum mechanics would
eventually become and the important role that his constant ℏ would
ultimately play in the theory. Nevertheless, it turned out that the
Planck time unit was the smallest quantity of time that was meaningful
in the theory. If you multiply that unit by the speed of light
(expressed in whatever units were originally used in defining
ℏ), the result is the distance light can travel in one Planck
time unit. This distance is known as the Planck length.
It is about $1.616\times10^{-35}$ meters, and
is the smallest unit of length that is meaningful to use – which at
about 20 orders of magnitude smaller than the size of
a proton is pretty darn small.
There is also a Planck energy, defined as
$(\hbar c^5/G)^{1/2}$.
Expressed in terms of temperature it is about
$1.417\times10^{32}$ K.
(K stands for kelvins. The Kelvin scale uses the same size of degree as
the usual Celsius scale, which has 100 degrees between the freezing and
boiling points of water, but its zero point is absolute zero rather than
the freezing point of water. Water freezes at 273.15 K, so
"absolute zero", or 0 K, is
-273.15 °C.)
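The Planck length, energy, and temperature quoted above follow from the same three constants. The sketch below recomputes them in SI units as a cross-check. Converting the energy to a temperature uses the Boltzmann constant, which the article introduces a little further on; the variable names are illustrative.

```python
import math

HBAR = 1.054571817e-34           # reduced Planck constant, J s
G = 6.67430e-11                  # gravitational constant, m^3 / (kg s^2)
C = 299_792_458.0                # speed of light, m / s
K_B = 1.380649e-23               # Boltzmann constant, J / K
JOULES_PER_GEV = 1.602176634e-10 # one GeV expressed in joules

planck_time = math.sqrt(HBAR * G / C**5)         # seconds
planck_length = C * planck_time                  # distance light covers in one Planck time
planck_energy = math.sqrt(HBAR * C**5 / G)       # joules
planck_energy_gev = planck_energy / JOULES_PER_GEV
planck_temperature = planck_energy / K_B         # kelvins

print(f"Planck length:      {planck_length:.3e} m")
print(f"Planck energy:      {planck_energy_gev:.3e} GeV")
print(f"Planck temperature: {planck_temperature:.3e} K")
```

The results land close to the figures in the text: about 1.6 × 10⁻³⁵ m, 1.2 × 10¹⁹ GeV, and 1.4 × 10³² K.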
The Planck time is the smallest amount of time that can meaningfully be
discussed in quantum mechanics, so it is used to define the Planck
era as the first instant after the big bang that we can hope to apply
any laws of physics to the universe. That is the era in which all the
fundamental forces – gravity, electromagnetism, the weak nuclear
force, and the strong nuclear force – were essentially the same.
As indicated above, the general theory of relativity cannot actually
describe gravity at that time, and we don't know what laws governed
the other forces either. All this is part of the unknown,
hypothetical "theory of everything".
What is known is that at a certain distance scale an order of magnitude
or two longer than the Planck length and hence (by Heisenberg's
uncertainty principle) at a similar proportion less than the Planck
energy, the gravitational force must have become distinct from, and
much weaker than, the other three forces, which remained unified.
Once gravitation becomes distinct from the other (still unified)
forces, it can be described by laws separate from those of quantum
mechanics. At that point, the theory of relativity and quantum mechanics
are able to coexist, so physicists have at least the possibility
of describing the universe with well-tested theories.
This possibility has not yet been achieved, however. There is still
no consistent, unified theory of the strong force, the weak force,
and electromagnetism. Physicists have tried very hard to construct
such a theory, which is generally called a Grand Unified Theory, or
GUT, even though it leaves out gravity.
Nevertheless, by extrapolating the way the strong, weak, and
electromagnetic forces grow with increasing energy levels, it is
possible to predict that all three forces should have equal strength
at a certain energy level three or four orders of magnitude less
than the Planck scale.
There are various equivalent ways to specify energy levels. One is
in terms of temperature.
Although "temperature" is a concept from thermodynamics it can be
related to energy levels, because a "black body" at a given temperature
radiates photons over a spectrum of energies, but the maximum is at
one specific energy level. This energy is taken to be representative
of the "typical" photon energy emitted by the black body.
One common unit of energy is the
electron-volt (eV), the amount of energy required to move an
electron across a one volt difference of electric potential.
(A million electron volts is "MeV", and a billion is "GeV".)
A conversion factor called the Boltzmann
constant, denoted by $k_B$,
relates temperature to the associated energy level. This constant is
$8.617\times10^{-5}$ eV per kelvin.
$10^{-4}$ eV per kelvin is a good
approximation, hence
$10^{-13}$ GeV per kelvin.
For example, the Planck temperature of
$1.417\times10^{32}$ K corresponds to
the Planck energy of $1.221\times10^{28}$ eV =
$1.221\times10^{19}$ GeV.
Anyhow, three or four orders of magnitude below the Planck scale –
where extrapolation indicates the strong, weak, and electromagnetic
forces have equal strength – is
$10^{15}$ GeV to $10^{16}$
GeV. That's about as close as physicists can estimate, because without
an actual consistent GUT, there's no way to compute the value
theoretically.
We refer to this energy level as the GUT energy scale.
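Converting between temperatures and energies in this way is a one-line calculation. The sketch below does it in both directions, using the standard value of the Boltzmann constant in eV per kelvin; the function names are invented for the example.

```python
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV per kelvin (standard value)
EV_PER_GEV = 1.0e9


def temperature_to_ev(temperature_k: float) -> float:
    """Typical energy in eV associated with a temperature in kelvins."""
    return K_B_EV_PER_K * temperature_k


def ev_to_temperature(energy_ev: float) -> float:
    """Temperature in kelvins associated with an energy in eV."""
    return energy_ev / K_B_EV_PER_K


# Planck temperature to Planck energy.
print(f"{temperature_to_ev(1.417e32) / EV_PER_GEV:.3e} GeV")

# Lower end of the GUT energy scale, 10^15 GeV, expressed as a temperature.
print(f"{ev_to_temperature(1e15 * EV_PER_GEV):.2e} K")

# The 10^10 K entry in the timeline, expressed in MeV.
print(f"{temperature_to_ev(1e10) / 1e6:.2f} MeV")
```

The GUT-scale energy comes out at a little over 10²⁸ K, in line with the temperature the timeline gives for the start of inflation.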
One might expect that "something" significant should happen as the
temperature of the universe falls below this point, as it must, given
that the universe is expanding rapidly. For one thing, the strong
force separates off from the remaining weak and electromagnetic
forces. (The latter two forces remain unified for a relatively much
longer time, all the way down to an energy level of about 100 GeV,
in fact. This unified force is called the electroweak force.)
The era of inflation
It appears that this transition point was anything but uneventful.
Although an adequate theory to explain or describe this process does not
exist, it is thought that a very abrupt change in the universe occurred,
in which the size of the universe began to expand at a (literally)
exponential rate. This marks the beginning of the era of cosmic
inflation. The universe underwent a dramatic "phase change",
analogous to the abrupt freezing of water supercooled below 0 °C.
Another name for this process is the "collapse of the false vacuum",
because although the universe had been in a temporarily stable state,
it was not truly a state of minimal energy. What had appeared to be
empty vacuum actually contained an enormous amount of latent energy
which was released when the temperature dropped below a critical
point, much as water releases heat energy when it freezes.
(In the opposite direction, to melt ice, it is necessary to
add heat.)
Further, in both cases a state of relatively high symmetry
undergoes an abrupt transition to a state of lower symmetry.
In one case, the symmetry between the strong and electroweak forces
was broken. In the other case, when water freezes, it goes from
being an isotropic substance (in which all directions are equivalent)
to one in which there are particular preferred directions: the
axes of the ice crystals.
There are some very good reasons why this dramatic phase change
followed by exponential inflation is
thought to have occurred, even though the physical principles that explain
it are still unknown. The idea of inflation was originated by
Alan Guth in December 1979. (Anticipating the importance of the
idea at the time, he described it as a "spectacular realization".)
In 1979, although the big bang theory
was widely favored over alternatives because of its success in
predicting the existence of a uniform presence of microwave radiation
at a blackbody temperature of about 2.725 K, cosmologists recognized
that it had several troublesome problems:
An equation known as the Friedmann equation – which we will
later explain in detail – is used to describe how the universe
expands with time. One of the important parameters in the equation is a
constant that describes how much space itself is curved. In the form of
the equation most often studied at the time, if the curvature is
positive, the universe eventually stops expanding and begins to
collapse. If the curvature is zero or negative, the universe continues
expanding indefinitely. This was considered to be a very important
distinction, as it predicted the ultimate "fate" of the universe.
Curvature is related to another quantity, denoted by Ω, which
reflects the density of matter and energy in the universe. The
curvature is positive, zero, or negative according
as Ω is greater than 1, exactly 1, or less than 1. Unlike
curvature, Ω varies over the life of the universe. In 1980 the
density of matter in the universe, and hence Ω, could be measured
only crudely, but the best estimates were Ω ≈ .01. However,
this was a serious problem, because the Friedmann equation predicted
that because the universe had expanded so much since the big bang,
Ω would need to have been extremely close to 1 during the earliest
moments (less than a millisecond after the big bang, say). Since as far
as anyone knew, Ω could have any value at all, it seemed to be an
extraordinary coincidence if Ω were extremely close to 1 at very
early times. An explanation for this coincidence was needed. This was
called the flatness problem, since Ω ≈ 1 is
equivalent to having zero curvature – being completely flat, in
other words.
A second problem was that, observationally, the universe appears to
look very much the same in all directions. The number of galaxies seems
to be the same in all directions, as do the galactic redshifts which can
be measured, when correlated with apparent distance. Since it would be a
remarkable coincidence if this property, known as isotropy, were
unique to our vantage point, it should be expected to exist around any
vantage point. And in that case, the universe should actually be
homogeneous – essentially the same everywhere, on a
sufficiently large scale. The assumption of homogeneity wasn't merely
attractive philosophically. It was in fact an essential ingredient (as
we shall see) in deriving the Friedmann equation used to describe the
evolution of the universe. However, homogeneity is also a big problem.
The reason is that parts of the universe that we can now observe on
opposite sides of the sky were, during the first instants after the big
bang, too far apart from each other for light or any kind of information
to travel between them. Therefore, it is very hard to understand how
different parts of the universe which were not "causally connected" at
very early times could have turned out to have identical statistical
properties. This puzzle is known as the horizon problem, because
at very early times most parts of the universe would have been "beyond the
horizon" from each other.
There was one additional problem, and it arises from most theories
of the GUT type that attempt to describe the unification of the strong
and electroweak forces. It turns out that in most such theories
some freakish things are produced when the symmetry breaks. When water
freezes into ice as the temperature falls below 0 C, ice crystals
begin to form in many different regions at the same time. Eventually
these all join together, but the ice crystals that form in each
region all have different orientations. Therefore, discontinuities
occur when ice from different regions comes into contact. The same
sort of thing should have occurred in the symmetry breaking predicted
by GUT theories, and the resulting discontinuities of space itself
are called topological defects. One particular type of
topological defect is a magnetic monopole – a minuscule
bit of magnetic material that has just one pole (north or south),
unlike all magnets anyone has ever observed. Most GUT theories
predict that magnetic monopoles should have been produced in vast
quantities. However, magnetic monopoles have never been observed
despite numerous attempts to find them. This is known as the
monopole problem.
Alan Guth was working in high-energy physics rather than cosmology
in 1979, so he was especially interested in the monopole problem,
but he became aware of the big bang's other problems as well.
The inspiration occurred to him that all three of the problems
listed could be solved if the universe had undergone a dramatic
bout of expansion, far in excess of the expansion that was a part of
the big bang itself, just after the GUT phase transition (and presumably
as some consequence of it).
How much expansion? It must have been (literally) at an exponential
rate, that is, with the size of the universe, as a function of time t,
proportional to 10^(Ht), where H is a constant – essentially the
Hubble parameter at that instant. The process can be thought of as a certain
number N of cycles in each of which the universe expands by a factor
of 10, so the total expansion would be
a factor of 10^N. If the whole period of inflation
lasted Δt seconds, then N = HΔt. If N = 100, N cycles
would have inflated the universe by a factor of
10^100. Suppose inflation began
at 10^-35 seconds after the big bang, and
each cycle was that length of time. Then Δt =
10^-33 seconds, so that is about when the
inflationary expansion would have ended. The Hubble parameter during this
period would have been H = N/Δt = 10^35
(sec)^-1.
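The arithmetic in this illustration is easy to check. The following is only a sketch of the bookkeeping, using the same guessed numbers (N = 100 tenfold cycles, each lasting 10^-35 seconds); the function name and its parameters are our own and carry no physical authority.

```python
def inflation_numbers(n_cycles: int, cycle_seconds: float):
    """Bookkeeping for n_cycles successive tenfold expansions.

    Returns (log10 of the total expansion factor, duration in seconds,
    Hubble parameter H in 1/s), using N = H * dt from the text.
    """
    duration = n_cycles * cycle_seconds   # dt = N * (time per tenfold cycle)
    hubble = n_cycles / duration          # H = N / dt
    log10_factor = n_cycles               # total growth is 10^N
    return log10_factor, duration, hubble


# Illustrative values only: 100 cycles of 1e-35 seconds each.
log10_factor, duration, hubble = inflation_numbers(100, 1e-35)
print(f"expansion factor: 10^{log10_factor}")
print(f"duration of inflation: {duration:.1e} s")
print(f"Hubble parameter: {hubble:.1e} 1/s")
```

The expansion factor is kept as a power of ten rather than computed outright, since only the exponent matters for the argument.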
Although we can only guess at what the actual numbers might have been,
an expansion factor of 10^100 would certainly
have solved all of the problems listed above:
The flatness problem is solved, because if you look at a curved
piece of space at a magnification of 10^100,
the piece looks very flat, just as the surface of the Earth does to
us living on it.
The horizon problem is solved, because all parts of the universe
that we can see today would have originated, before inflation, in a
single tiny speck of space that was causally connected. That is,
every part of that speck could have communicated with every other
part during the instant before inflation.
The monopole problem is solved, because all the monopoles that
could have been created in the GUT era would have become so diluted
that it would be extremely unlikely that even one of them could
exist in the whole universe observable today.
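To put rough numbers on these three solutions, here is a small sketch. It rests on two assumptions that are not spelled out above: that the departure of Ω from 1 shrinks as the inverse square of the expansion factor while the expansion rate stays roughly constant, and that the number of monopoles per unit volume falls as the inverse cube of the expansion factor. The function name is hypothetical.

```python
def inflation_suppression(n_decades: int):
    """Return powers of ten by which inflation suppresses two quantities.

    Assumes (not derived here) that |Omega - 1| scales as 1/a^2 during
    inflation and that number densities scale as 1/a^3, where the size
    a grows by a factor of 10^n_decades.
    """
    log10_curvature = -2 * n_decades   # |Omega - 1| shrinks by 10^(-2N)
    log10_density = -3 * n_decades     # monopoles per volume shrink by 10^(-3N)
    return log10_curvature, log10_density


log10_curvature, log10_density = inflation_suppression(100)
print(f"|Omega - 1| reduced by a factor of 10^{log10_curvature}")
print(f"monopole density reduced by a factor of 10^{log10_density}")
```

Working with exponents avoids numbers too small for ordinary floating-point arithmetic.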
We will discuss inflation in much more detail
elsewhere, and consider what sort of
processes could have been responsible for such
an immense expansion of space.
For present purposes, we just want to note an important byproduct of
this drastic event. At the GUT scale, just before the symmetry was
broken, there was no essential distinction between one class of
particles known as quarks, and another class, known as leptons.
Quarks are the particles which combine in threesomes to make
nucleons (protons and neutrons). Leptons are particles such as
electrons and neutrinos. Each of these particle types comes in
three distinct subtypes known whimsically as "flavors". The flavors
are somewhat mutable, under the action of the weak force, but in
our universe as it is presently, there is no way that a
quark can turn into a lepton, or vice versa.
But this wasn't the case at the GUT scale. Since all the forces
(except gravity) were equivalent, there were particle reactions
which turned leptons into quarks and quarks into leptons.
Fundamental forces are said to be "mediated" by a special kind of
particle known as a boson. The effect of any force on
a particle like a quark or a lepton is caused by an interaction
between the particles in which the boson that mediates the force
is exchanged.
For example, the boson which mediates (or "carries") the electromagnetic
force is simply the quantum of the electromagnetic field, namely the
photon. The force of attraction (or repulsion) between charged particles
of opposite (or the same) electric charge can be understood as an
interaction between the particles and photons.
Similarly, at the GUT scale, before the GUT symmetry was broken,
it is hypothesized that there were very massive bosons, which are usually
known generically as X particles. When a quark interacts with an X
it becomes a lepton, and when a lepton interacts with an X it
becomes a quark. X particles must have been very massive, having
mass-energy close to the GUT energy level of perhaps
10^15 GeV.
(Technically, since E = mc^2,
the proper units for expressing mass are energy units
divided by the square of the speed of light. So to be correct,
masses should be expressed in terms of
eV/c^2, GeV/c^2,
and so forth. For simplicity, we will generally omit the
c^2.)
Another way to think of the
breakdown of GUT symmetry is to realize that when the universe cooled
below a temperature equivalent to 10^15 GeV,
there would quickly cease to be any more X bosons around. This is for the
simple reason that they would be unstable, and they could no longer be
created by the pair production process out of photons (or any
other particle), simply because there were no photons or other
particles with a sufficiently large amount of energy. Since all X
bosons would quickly disappear, there was eventually no way for quarks
and leptons to change into each other. And thus the GUT symmetry was
lost.
There are in fact a number of theoretical ideas about how inflation
actually worked, but all produce roughly the same result. We will
discuss some of them elsewhere. It is worth noting, however, that
in many cases there will be subtle differences in the results of
inflation that can potentially be used to discriminate between
different possible inflationary mechanisms.
One of the differences between different inflationary theories is
in the mechanism that brings inflation to an end and in how the
universe is reheated to almost the temperature it had before
inflation began, in spite of the vast amount of expansion.
Although theories differ as to what ended inflation, it clearly did
end. The result at the end of inflation is a universe which is
potentially much easier to understand from current knowledge of
particle physics. This is simply because the strong force is no longer
unified with the electroweak force, so that physicists can work with two
different forces (strong and electroweak). And this, in turn, is
actually an advantage, because these two forces are somewhat understood.
Since each of these forces is within the range
of existing accelerators, actual experiments can be conducted
that guide the development of the theory and allow physicists to
separate fact from conjecture.
As exotic as it may be, the post-inflation era of the universe is a much
more familiar environment than it was before inflation, though some very
notable mysteries remain.
The era of particles and forces
Because of the variety of theories, the duration of the inflationary
period and hence the time that it ended is hard to estimate,
but it must have been over by about 10^-32
seconds after the big bang. At that time, there could have been
a vast number of types of different particles. Almost
all of them were unstable – it seems to be a general rule
that any particle is unstable unless there's a very good reason
it shouldn't be. For example, there are two other leptons (muons and
tauons) that are very like electrons, except for being significantly
more massive. Muons and tauons decay into electrons and neutrinos (via
the weak force). But electrons appear to be stable, since there are no
lighter leptons into which an electron could decay, and no mechanism for
an electron to become anything but another type of lepton.
(Neutrinos are leptons too, but conversions between electrons and
neutrinos are also ruled out.)
For reasons we will explain shortly, there are no ways to produce
new particles that have more mass-energy than is available given
the prevailing temperature of the universe at a particular time. So,
given that most particles ultimately decay, the diversity of particle
species continually decreases. Although a vast menagerie of particles
may have remained after inflation ended, we have little way to
conjecture what most might be. For most purposes, at least as far
as cosmology is concerned, such particles may as well have never
existed.
However, there are a few apparent facts about the universe as we know
it today that we have no good ideas how to explain. These facts may be
effects left by particles that once existed but are no more, or that,
at least, do not any longer interact significantly with the "ordinary"
matter that makes up galaxies and stars and planets. The list of such
mysterious facts includes:
On the basis of well-understood physics, at the present time there
should be roughly as many baryons (particles, such as protons and
neutrons, made up of quarks) in the universe as there are photons. There
are ways to estimate, from certain observations, how much matter does
exist in the form of baryons. But instead of roughly equal numbers,
there appear to be almost two billion times as many photons as baryons. The
process through which this particular relative abundance of baryons came
about is known as baryogenesis.
There is a symmetry in physics called charge symmetry that relates
particles to antiparticles. It implies that there should be exactly as
many particles of each kind as corresponding antiparticles. Plainly this
couldn't have been true, and there must have been some asymmetry between
matter and antimatter, because otherwise every particle would have
annihilated with one of its antiparticles, and there would be no baryons
(or leptons either) remaining. The charge symmetry must have been broken
very slightly in order to leave just a sufficient excess of particles
over antiparticles to account for the baryons that are present now, in a
ratio of about one baryon per 1.8 billion photons. Physicists have ideas
about why this could be, but the details are very hazy. However, there
might be some relation to the dark matter issue.
All visible matter, which makes up gas clouds, planets, stars, and
galaxies, is composed of baryons. There must be some invisible dark matter
that is also made of baryons, but observations strongly imply that there
is much more matter in the form of nonbaryonic dark matter than exists
in the form of both visible and dark "ordinary" baryonic matter. The
ratio of nonbaryonic matter to baryonic matter, by mass, appears to be
about five to one.
The universe also appears to be full of something now called
dark energy that plays a role in the evolution of the universe
through a part of Einstein's equations of general relativity called
the cosmological constant. This dark energy, in fact, has an effect
on the universe that is more than twice as large as the effect of all
matter, both dark and ordinary. The dark energy definitely seems to
be around, but there are no good ideas why.
After inflation ended at about 10^-32 seconds
following the big bang and the universe had reheated,
conditions were
utterly unlike those of any other time in its history,
and yet really very simple. To understand those conditions, we have
to refer again to some basic concepts from the
standard model of particle physics.
As already mentioned, this model consists of fundamental forces,
a variety of different types of elementary particles, and a mathematical
description of the behavior of those forces and particles. The forces,
again, are gravity, the strong force, the weak force, and the electromagnetic
force. At the very highest energy levels we can conceive of – at
the Planck scale – all of the forces are "unified", which means
they all have the same strength and obey the same equations. As the
temperature of the universe decreases along with its expansion, this
high state of symmetry is broken as the characteristics of the forces
become more dissimilar. Gravity separates from the other three forces even
before inflation begins. The subsequent separation of the strong force
is thought to be associated with the phenomenon of inflation.
So after inflation is done, there are three distinct forces: gravity,
the strong force, and the electroweak force – which comprises
the still undifferentiated weak force and electromagnetic force.
There are also a wide variety of elementary particles present. We'll
say more about them in a moment, but they include photons, electrons,
quarks, neutrinos, and far more exotic and massive particles as well.
We can associate an energy level with the universe at this time. There are
several ways to express this energy level, as explained above.
The temperature of the universe at the end of
inflation is estimated to be about 10^27 K.
Using Boltzmann's constant, this can also be expressed as
10^14 GeV.
However expressed, that is still a huge amount. For comparison, the
rest mass of a proton is .938 GeV – 14 orders of magnitude
smaller. Even the heaviest particle that can be produced in present
accelerators, the top quark, has a mass of only about 173 GeV.
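The conversion between temperature and energy is just multiplication by Boltzmann's constant, E = kT. A minimal sketch follows; the value of the constant is the standard one in eV per kelvin, and the function name is our own.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant, eV per kelvin


def temperature_to_gev(kelvin: float) -> float:
    """Characteristic thermal energy k*T, expressed in GeV."""
    return BOLTZMANN_EV_PER_K * kelvin / 1e9


end_of_inflation_gev = temperature_to_gev(1e27)   # temperature quoted in the text
proton_rest_mass_gev = 0.938                      # proton rest mass quoted in the text

orders_above_proton = math.log10(end_of_inflation_gev / proton_rest_mass_gev)
print(f"k*T at 1e27 K: {end_of_inflation_gev:.2e} GeV")
print(f"orders of magnitude above the proton rest mass: {orders_above_proton:.1f}")
```

The result is a little under 10^14 GeV, which is why the figures in the text are quoted only to the nearest power of ten.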
The importance of these energy levels is that we don't have the
experimental tools to even come close to being able to study directly
what physics is like at the time just after inflation ended. The
gap is enormous. Not too surprisingly, physicists can't say very well
what might actually be going on at such energy levels.
Lacking experimental evidence, it's tempting to suppose there
isn't a lot going on that really matters. For example,
the electroweak force does not split into a separate
weak force and electromagnetic force until the level of about 100
GeV, which is well within the energy range of present accelerators.
And according to the still untested idea of supersymmetry, the lightest
"supersymmetric" particles should weigh in under 1000 GeV (possibly a
lot less). So there
is an energy range of about 11 orders of magnitude above 1000 GeV
which we know very little about, except that it seems to have relatively
little effect on the physics we can actually study experimentally.
Particle physicists sometimes call this energy range the "desert",
since it doesn't seem to be all that interesting.
However, it's probably rather cavalier to suppose little important
is going on at energy ranges above 1000 GeV, even though we can't
presently study that range directly. And yet it's quite true that
in the range of energies below 1000 GeV (i.e., 1 TeV)
where experimentation is
currently possible, the standard model has an extremely good fit
with experimental results. It has proven extremely difficult to
find physical effects which require "new physics" at high energies
to explain.
Although we cannot presently perform experiments at very high energies
– and probably won't ever be able to at the GUT scale –
cosmology itself should be able to give us clues about what goes
on, if we could only recognize and understand them.
For example, while we can postulate many possible mechanisms by
which inflation may have occurred, without understanding physics
at the GUT scale, we can't determine what actual mechanism may
have been involved. If inflation is in fact a valid hypothesis and
if we can ever learn more about its effects, we may be able to
place limits on what GUT-scale physics must be like.
Similarly, the other "mysteries" listed above certainly have their
explanations rooted in physics that occurs above 1 TeV (trillion
electron volts). For example:
The facts that there does not appear to be any significant amount of
antimatter left in the universe, that baryonic matter does exist, but
that baryons are outnumbered by photons in a ratio of something like
two billion to
one are all related. The facts probably result from very slight
asymmetries in particle reactions that occur at energies well above 1
TeV. We have no idea what those reactions might be or what detailed
physical laws govern them. What we do know is that violations of both
charge symmetry by itself and in combination with another symmetry
(parity symmetry)
must have been involved. The combination of charge and parity
symmetry is called CP symmetry. We do know some reactions where CP is
violated at accessible energy levels, but curiously only for
interactions involving the weak force. It has been conjectured that
certain as-yet undiscovered particles called axions may account for
why we don't see violations of CP symmetry in interactions that involve
the strong force. Even more tantalizingly, it is possible that axions
make up some part of the nonbaryonic dark matter.
A leading candidate for (nonbaryonic) dark matter is particles
predicted by the theory of supersymmetry, particularly the lightest
supersymmetric partner, or LSP (since that may be the only
supersymmetric particle which is stable). The LSP is often conjectured
to be a particle called the neutralino.
If that accounts for most of
the dark matter and if we could understand supersymmetry, we should be able to
predict the actual ratio of nonbaryonic matter to baryonic matter,
given conditions that existed at energy levels around 1 TeV or higher.
The real mystery of dark energy, if it is represented by the
cosmological constant, is why the latter is not vastly larger than it
seems to be, perhaps by as much as 120 orders of magnitude.
We have no acceptable theory at all of what generates the
cosmological constant. If we did, it might be related to processes that
occurred during the time period under discussion here, perhaps some very
minuscule symmetry breaking like the very slight
breaking of CP symmetry.
A little later, when we come to discuss nucleosynthesis
– the process in which nuclei of elements such as helium and
heavy hydrogen (deuterium) are formed – we will see a concrete
example of how applying well-understood physical laws to conditions in
the universe within a few minutes after the big bang makes readily
testable predictions. The fact that these predictions have been
confirmed is solid evidence for the big bang model, since the model
tells us what the conditions of temperature and matter density
must have been.
But that kind of reasoning can be turned around. If physicists come
up with sufficiently detailed theories for things like supersymmetry and
CP symmetry breaking that make predictions about particle interactions,
then we can use the big bang model at higher energy levels to
predict observable effects – such as the ratios of
baryons to photons and baryonic to nonbaryonic matter, which have
both already been measured.
There is, thus, an interplay between cosmology and very high energy
physics. The early universe is the ultimate particle accelerator.
Observational facts from cosmology put useful constraints on the physics.
Conversely, anything we are able to learn about the physics of CP
symmetry breaking or supersymmetry (for example) might suggest new
phenomena to look for in cosmology.
One key connection between cosmology and particle physics has to do
with simply counting numbers of particles of different types.
Suppose p denotes any particle (not just a proton, as it usually does),
and p′ denotes its antiparticle. Everyone knows that when
a particle and its antiparticle interact, they annihilate each other
and produce massless particles of energy – photons. This
reaction can be written:
p + p′ ⇄ γ + γ
Here γ is a photon. (Two photons are always produced, in order to
conserve both energy and momentum.) The reaction can also proceed in
the opposite direction, as indicated by the bidirectional arrows. This
reverse reaction is called pair production, and it means that
any type of particle, together with its antiparticle, can be created
out of the interaction of two photons, provided only that the photons
have enough energy.
And therein lies the rub. As long as most photons have at least as
much energy as the rest mass energy of a particular type of particle,
the above reaction will go on constantly in both directions. Pairs
of any type of particle will ceaselessly be produced out of photons, and
will just as readily annihilate each other to give back photons.
There's nothing to limit the reaction in the left to right direction,
since photons can exist with any amount of energy however large or
small.
However, the same is not true in the opposite direction.
There is no such thing as a proton with an energy less than the proton
rest mass of 938.3 MeV, though a proton can have an arbitrarily
larger amount of energy in the form of kinetic energy. Therefore, as
soon as the universe has cooled to the point where there are almost
no photons left having that much energy, essentially no new protons
will be created. Ever. (Well, hardly ever. Virtual particles of any
kind can be created for very brief periods of time by virtue of the
Heisenberg uncertainty principle. Virtual particles can even affect
the "real" world, if they happen to interact with something else
during their brief existence. A virtual X particle, for example, could
possibly cause a proton to decay. But because the X is so massive
its lifetime must be extremely short. Therefore, the probability
of causing a proton to decay is extremely small, and the expected
lifetime of a proton is extremely long.) Likewise, essentially no additional
particles of any kind will be created once the universe has cooled
beyond the point that essentially all photons have less energy than the
rest mass of the particle. This includes neutrons (which have a rest
mass of about 939.6 MeV), and that is crucial for the theory of
nucleosynthesis, to be discussed later.
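As a rough illustration of this cutoff, the sketch below finds the temperature at which the typical thermal energy kT equals a particle's rest mass energy. This is only an order-of-magnitude marker, not a sharp threshold: a thermal population of photons always has a tail of more energetic photons, so pair production fades out gradually. The function name is hypothetical.

```python
BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant, eV per kelvin


def threshold_temperature_k(rest_mass_mev: float) -> float:
    """Temperature (in kelvin) at which k*T equals the given rest mass energy."""
    return rest_mass_mev * 1e6 / BOLTZMANN_EV_PER_K


# Rest masses as quoted in the text, in MeV.
proton_threshold_k = threshold_temperature_k(938.3)
neutron_threshold_k = threshold_temperature_k(939.6)

print(f"proton:  {proton_threshold_k:.2e} K")
print(f"neutron: {neutron_threshold_k:.2e} K")
```

Both come out near 10^13 K, roughly a thousand times hotter than the temperature at which nucleosynthesis begins.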
Of course, the same thing applies to heavier particles as well.
Quarks are somewhat of a more complicated case, since below a certain
energy level quarks apparently cannot exist except in bound states
of a quark and an antiquark (mesons) or three quarks (baryons). But all hadrons
(either mesons or baryons) other than protons are unstable, and
eventually decay spontaneously. Therefore, after they can no longer
be created in the pair production process because photons of sufficient
energy are not available, they will disappear from the universe.
(Unless created in very unusual circumstances such as supernova
events or in the vicinity of a black hole.)
This is relevant to one curious fact – the existence of any
ordinary (baryonic) matter at all. The term baryogenesis refers
to the sequence of events that leads to the residual baryonic matter. As
previously noted, all different types of particles should exist in
approximately equal numbers, at some sufficiently high temperature. This
is simply because, at high enough energies, all particles, including
photons, should be able to convert into each other, at least indirectly.
For instance, a quark and an antiquark annihilate with each other,
producing two photons. So quark plus antiquark → two photons. The
two photons, in turn, can give rise (for example) to a muon plus
antimuon. So pair production suggests that two photons should exist for
each particle-antiparticle pair. (Photons are their own antiparticles,
but you can arbitrarily count half of them as antiparticles.)
When the universe becomes sufficiently cool (at a different
temperature for each particle type), it is no longer possible to produce
any more new particles of each type by pair production. (As mentioned,
particle-antiparticle pairs can still be created as virtual particles,
but that's a different process, and the particles exist for only very
brief instants.) After that point, all particles and antiparticles which
can annihilate with suitable partners of the same kind will do so.
If the number of particles and antiparticles is exactly the same, then
either equal numbers of particles and corresponding
antiparticles remain, or else neither remains at all. But many
observations indicate that antimatter does not exist at all in
significant amounts. Yet ordinary matter clearly does.
This is a problem, so the assumption of exactly equal
numbers of particles and corresponding antiparticles must be wrong.
Instead, there must have been slightly more quarks and leptons than
antiquarks and antileptons. But why? Pair production and annihilation
can't produce an imbalance between particles and antiparticles.
So the excess must be due to unknown processes other than pair production.
And such other processes must exhibit some symmetry breaking, however
slight.
Quarks and leptons must have existed from the earliest
times when different particle types became distinguishable, after
the breaking of GUT symmetry. They would remain in equilibrium with
photons until the temperature level of the universe dropped to a
certain point. For leptons, what happens is simple. As soon
as the universe is cool enough that no new lepton-antilepton
pairs of a given type are created by pair production, the
reverse reaction proceeds until all antileptons are gone.
With quarks, however, the situation is a little more complicated.
The two lightest quarks are called "up" and "down". They have
rest masses of about 3 MeV and 6 MeV, respectively. A proton
consists of two up quarks and one down quark, while a neutron
consists of two down quarks and one up quark. It turns out that
the strong force does not allow free quarks to exist
below a certain energy level, somewhere around 1000 MeV.
(Which happens to be about the rest mass of protons and neutrons.)
So, for example, all up and down quarks become incorporated into
protons and neutrons at that level.
(The substantial excess rest mass of protons and neutrons over that of
the constituent quarks is due to "binding energy".)
This phenomenon is known as quark confinement.
Thus around the 1 GeV energy level all quarks become bound into
either quark-antiquark pairs (mesons) or groups of three (baryons). And
therefore, long before the lightest quarks cease to be created by pair
production, they become confined. But the same principles apply to
the resulting hadrons, and
there are simply too few photons with sufficient energy to create many
baryons such as protons and neutrons by pair production after the
baryons appear as a result of quark confinement.
In any case, all particles and antiparticles of the same type that can
pair up annihilate with each other, yielding photons. Only a small
unpaired excess of matter particles remains in the form of baryons. From
calculations of the amount of deuterium that should be produced by
nucleosynthesis, there must have been about one baryon per 1.8 billion
photons in order to account for the amount of deuterium that is actually
observed.
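To get a feeling for what this ratio means today, we can estimate how many photons and baryons there are per cubic meter. The sketch below assumes the standard formula for the number density of photons in blackbody radiation, n = (2ζ(3)/π^2)(kT/ħc)^3, which is not derived in this article, and applies it to the 2.725 K microwave background. The function and variable names are our own.

```python
import math

K_B = 1.380649e-23       # Boltzmann's constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
ZETA_3 = 1.2020569       # Riemann zeta function at 3


def photon_number_density(temperature_k: float) -> float:
    """Photons per cubic meter in blackbody radiation at the given temperature."""
    inverse_length = K_B * temperature_k / (HBAR * C)   # k*T / (hbar*c), in 1/m
    return 2.0 * ZETA_3 / math.pi**2 * inverse_length**3


photons_per_m3 = photon_number_density(2.725)   # microwave background temperature
baryons_per_m3 = photons_per_m3 / 1.8e9         # one baryon per 1.8 billion photons

print(f"photons per cubic meter: {photons_per_m3:.3e}")
print(f"baryons per cubic meter: {baryons_per_m3:.2f}")
```

On these assumptions there are a few hundred million photons in every cubic meter of space, but only about one baryon in every four or five cubic meters.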
Nucleosynthesis
About one second after the big bang, the universe was at a temperature
of about 10^10 K (≈ .001 GeV = 1 MeV).
This is a very interesting energy level, as several critical
things happen close to this level:
The rest mass of a neutron is about 1.3 MeV more than that of a
proton
Neutrinos cease interacting with protons and neutrons and "fall
out of thermal equilibrium" with the nucleons
The ratio of neutrons to protons drops from 1:1 to about
1:5
Electrons have a rest mass of about .5 MeV
The binding energy of a deuterium nucleus (proton plus neutron)
is about 1 MeV per nucleon.
These facts all conspire to produce two important results:
It becomes possible for atomic nuclei consisting of
two or more protons and neutrons to exist, whereas at higher energies
any that happened to form would quickly be blasted apart by
energetic photons.
The relative abundances of certain light nuclei (hydrogen,
deuterium, helium-3, helium-4, and lithium-7) are determined based
on various conditions prevailing at the time, and these abundances
are fairly close to what can be measured today.
We are going to discuss the process of nucleosynthesis in much more
detail later, because the big bang model successfully predicts
certain key observational facts about the relative abundances of
several very light nuclei. And this is therefore one of the strongest
pieces of evidence in favor of the model. So we'll provide just a
brief outline at this point.
As mentioned, stable atomic nuclei (other than hydrogen, a single
proton) cannot exist at energies substantially higher than 1 MeV. But
even when the temperature falls into this range, the probability that a
nucleus will actually form depends on how fast a few critical reactions
proceed. And these reaction rates depend on factors such as the
temperature, the rate of expansion of the universe, the densities of
baryons and photons, and the ratio of existing protons to neutrons.
The situation is further complicated by the fact that neutrons which are
not bound into nuclei are unstable, with a half-life of 614 seconds.
Once substantially all uncombined neutrons have decayed (into a proton,
an electron, and an antineutrino), nucleosynthesis stops. (In principle, two helium-4
nuclei, for example, could fuse to form beryllium-8, but such reactions
are unlikely, mainly due to the electrostatic repulsion between
positively charged nuclei.) Consequently, there is a race against the
clock. All nuclei heavier than hydrogen that are going to form at this
stage of the universe must form within the first half hour or so. Additional
nuclei (from beryllium through iron in atomic number) form much later in
stellar interiors. And still heavier nuclei can form only in supernova
events.
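To get a feeling for this race against the clock, here is a minimal sketch of how quickly free neutrons disappear, using the 614-second half-life quoted above. It deliberately ignores everything else going on (in particular, neutrons being captured into nuclei), and the function name is only illustrative.

```python
NEUTRON_HALF_LIFE_S = 614.0  # half-life of a free neutron, as quoted in the text


def surviving_fraction(t_seconds, half_life_s=NEUTRON_HALF_LIFE_S):
    """Fraction of an initial population of free neutrons not yet decayed."""
    return 0.5 ** (t_seconds / half_life_s)


# 100 s and 300 s bracket the start of nucleosynthesis; 1800 s is "half an hour".
for t in (100.0, 300.0, 614.0, 1800.0):
    print(f"t = {t:6.0f} s  ->  {surviving_fraction(t):.1%} of free neutrons remain")
```

After half an hour (about three half-lives) only a little over a tenth of any still-unbound neutrons would be left.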
Yet another impediment to nucleosynthesis is that most reactions add
only one neutron at a time. And the first step is the
hardest, because one proton plus one neutron yields deuterium.
Unfortunately, deuterium is only barely stable
and has a very small binding energy, about 1 MeV per nucleon. Consequently,
deuterium cannot exist in significant amounts until the temperature
drops by a factor of 10 from 10^10 K
(1 MeV). In this era, the energy of radiation (photons) governs
the expansion of the universe and hence the rate at which temperature
decreases. Specifically, temperature is inversely proportional to the
square root of time (we'll show later why that is), so a tenfold drop
in temperature takes a hundredfold increase in time, and
deuterium production can't begin until the universe is at least
100 seconds old. General nucleosynthesis doesn't really get going until
more like 300 seconds, when there are no longer enough energetic
photons to destroy deuterium, and enough deuterium nuclei have formed to
combine with additional neutrons to make helium-3, helium-4, and
lithium-7.
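The timing argument can be sketched numerically. The code below assumes that temperature falls as the inverse square root of time during this era, and it anchors the relation at 10^10 K at one second, as stated earlier. The function names are ours.

```python
T_REFERENCE_K = 1.0e10   # temperature at the reference time
t_REFERENCE_S = 1.0      # reference time: about one second after the big bang


def time_at_temperature(temperature_k):
    """Time (s) at which the universe has cooled to temperature_k,
    assuming T is proportional to 1 / sqrt(t)."""
    return t_REFERENCE_S * (T_REFERENCE_K / temperature_k) ** 2


def temperature_at_time(t_seconds):
    """Inverse relation: temperature (K) at time t_seconds."""
    return T_REFERENCE_K * (t_REFERENCE_S / t_seconds) ** 0.5


print(f"Cooling to 1e9 K takes about {time_at_temperature(1e9):.0f} s")
print(f"At 300 s the temperature is about {temperature_at_time(300.0):.2e} K")
```

A tenfold drop in temperature corresponds to a hundredfold increase in time, which is where the figure of 100 seconds comes from.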
That, in a nutshell, is nucleosynthesis. Since the nuclear physics
required to compute reaction rates is well understood, it is in principle
simple to compute the relative abundances of different nuclei that
result from this process. Computationally, it is rather more complex,
since there are a number of interrelated factors involved. It's
fairly easy to compute that most of the resulting nuclei
are either hydrogen (protons) or helium-4. And further, given the
ratio of protons to neutrons (which is about 7 to 1 when the process
is well underway), the final result is that helium-4 makes up
about 25% of all matter, by weight, and hydrogen makes up most of the rest.
Everything else is only a trace. This prediction is verified by
observation, which is important evidence in favor of the big bang model.
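The helium figure follows from simple counting, under the simplifying assumption (ours, for illustration) that essentially every available neutron ends up in a helium-4 nucleus and that protons and neutrons have equal mass. A rough sketch:

```python
def helium_mass_fraction(protons_per_neutron):
    """Mass fraction of helium-4 if every neutron ends up in helium-4.

    Each helium-4 nucleus takes 2 neutrons and 2 protons, so with n neutrons
    and p protons the helium mass is 2n out of a total of n + p nucleons.
    """
    n = 1.0
    p = float(protons_per_neutron)
    return 2.0 * n / (n + p)


for ratio in (5, 6, 7, 8):
    fraction = helium_mass_fraction(ratio)
    print(f"{ratio} protons per neutron -> helium-4 mass fraction {fraction:.3f}")
```

A ratio of 7 to 1 gives exactly 25%, and 8 to 1 gives about 22%, so the result is not very sensitive to the precise ratio.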
By taking into account how other factors affect other relative
abundances, it is possible to infer certain other important facts,
such as:
The amount of deuterium produced is sensitively dependent on both
the density of baryons (protons and neutrons) and the ratio of photons
to baryons at the time of nucleosynthesis. If we take account of the
measured abundances of deuterium, helium-4, and the other light
elements, we can determine both the baryon density and the photon-baryon
ratio. The result for baryon density is such that at most about one part
in six of all dark matter can be in the form of baryons – a very
important result. This value agrees very well with an independent
estimate obtained from information in the cosmic microwave background
(CMB). As for the photon-to-baryon ratio, it must be about
1.8×10^9 to 1.
From the baryon density and the photon-baryon ratio we can compute
what the photon density was at the time of nucleosynthesis, and from
this the present photon density. The result implies a temperature of the
CMB very close to the value of 2.725 K
that is observed. The correctness of this prediction is very good
evidence for both the validity of nucleosynthesis calculations and the
big bang model in general. But 15 years before the CMB was discovered,
Gamow and his associates used a very rough estimate of the photon-baryon
ratio to estimate the photon density. From this they predicted that
photons having a current temperature less than 50 K or so would still
exist. (The rough estimate of the photon-baryon ratio and an
underestimate of the age of the universe account for most of the
difference from the correct value.) Had this prediction been taken more
seriously, the CMB would have been discovered earlier.
The density of neutrinos present during nucleosynthesis has an
effect on the amount of helium-4 produced. Since this latter amount has
been determined by observation, we can estimate the neutrino density.
But this in turn depends on the number of different types of neutrinos
there are (i.e. whether there exist any other neutrinos besides those
associated with electrons, muons, and tauons). The observed abundance of
helium-4 implies that there cannot be more than three types of
neutrinos, which is also a very important result.
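The link between the CMB temperature, the photon density, and the baryon density can be illustrated by running the argument in the reverse direction: start from the observed temperature and ask what photon and baryon densities it implies today. The sketch below uses the standard formula for the number density of photons in blackbody radiation, n = (2ζ(3)/π²)(k_B T/ħc)³, which is not given in the text above and is brought in here as an assumption. The variable names are ours.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H_BAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
ZETA_3 = 1.2020569031595942  # Riemann zeta function at 3


def photon_number_density(temperature_k):
    """Photons per cubic meter in blackbody radiation at temperature_k."""
    inverse_length = K_B * temperature_k / (H_BAR * C)
    return (2.0 * ZETA_3 / math.pi ** 2) * inverse_length ** 3


T_CMB_K = 2.725              # observed CMB temperature quoted in the text
PHOTONS_PER_BARYON = 1.8e9   # photon-to-baryon ratio quoted in the text

n_photons = photon_number_density(T_CMB_K)
n_baryons = n_photons / PHOTONS_PER_BARYON

print(f"Photons today: about {n_photons / 1e6:.0f} per cubic centimeter")
print(f"Baryons today: about {n_baryons:.2f} per cubic meter")
```

This gives roughly 400 photons per cubic centimeter today, and only a few tenths of a baryon per cubic meter.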
After nucleosynthesis
Once nucleosynthesis ended, events moved much more slowly, and
nothing particularly dramatic happened for some time.
The first thing worth mentioning here is the transition of the
universe from a state where radiation was dominant to a state
where matter was dominant. The importance of this is that the two
constituents of the universe – matter and radiation – affect
the evolution in different ways.
We will explain the differing effects in much more detail in the
next major section. But basically what happens is that as the universe
expands, the density of energy in the form of radiation (photons) decreases
faster than the density of energy in the form of matter. The
reason for this is that the density of particles of matter decreases
as the inverse cube of the length scale of the universe, since the number of particles
remains about the same (after the annihilation of antimatter), while
the volume increases as the cube of the linear dimension. The number density
of photons also decreases as the inverse cube of the length scale, but in addition
the wavelength of every photon is stretched by the expansion
of the universe, and this decreases the energy of each photon
by an additional factor of length, so that altogether the energy
density of radiation decreases as the inverse fourth power of length.
There was much more energy density in the form of radiation to start
out with. But at approximately 47,000 years after the big bang,
at a temperature of about 8000 K,
the energy densities of matter and radiation became equal. After that
point, the energy density of matter was higher. At that point we say
that the universe went from a state of radiation dominance to one of
matter dominance. This change did not have any abrupt consequences.
All that happened was that the universe began to expand more rapidly.
More precisely, the length scale of the universe
transitioned smoothly from increasing in proportion to the
square root (one-half power) of the time to the two-thirds power of time.
The temperature of the universe began to decrease more
rapidly at this point, in a similar way.
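The different dilution rates of matter and radiation can be put into a few lines of code. This is a rough sketch only: it assumes that the radiation temperature is inversely proportional to the length scale, and it uses the temperatures quoted in the text (8000 K at equality, 2.725 K today). The names are ours.

```python
T_NOW_K = 2.725        # present CMB temperature quoted in the text
T_EQUALITY_K = 8000.0  # temperature at matter-radiation equality quoted in the text


def scale_factor_at_temperature(temperature_k):
    """Length scale relative to today (1 = now), assuming T scales as 1/length."""
    return T_NOW_K / temperature_k


A_EQUALITY = scale_factor_at_temperature(T_EQUALITY_K)


def matter_to_radiation_ratio(a):
    """Energy density of matter divided by that of radiation.

    Matter dilutes as a**-3 and radiation as a**-4, so the ratio grows in
    proportion to a; it is normalized to 1 at equality.
    """
    x = a / A_EQUALITY
    return x ** -3 / x ** -4


for a in (A_EQUALITY / 100.0, A_EQUALITY, 1.0):
    print(f"a = {a:.3e}  ->  matter/radiation = {matter_to_radiation_ratio(a):.3g}")
```

Well before equality the ratio is far below 1 (radiation dominates); with these round numbers it has grown to a few thousand by today.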
This increase in the rate of expansion and temperature decline is
important to keep track of for purposes of calculating how
length scale and temperature vary. Apart from that, the universe
wasn't greatly affected by the change from radiation dominance to
matter dominance. Photons and matter continued to interact strongly
with each other.
The next important transition, however, has much more important
observational consequences. It is what gives rise to the
cosmic microwave background which, if our eyes were sensitive enough to
microwaves, would give the night sky a very distinctive pattern.
This effect is the direct result of a change that occurred in how
photons interact with matter when the temperature of the universe
was in a specific range.
Although nuclei of a few lightweight elements were formed in the
nucleosynthesis process, they were not yet atoms as we know them, since
they were fully ionized – electrons still moved about freely
instead of being bound to a nucleus. Since both nuclei and electrons
carry electric charge, photons (which are carriers of electromagnetic
force) interacted strongly with them. As a result, the still extremely
hot plasma was opaque to light. No telescope, however powerful, could
see anything in the universe that occurred from the beginning to more
than 300,000 years after the big bang.
Indeed, the next major event occurred when electrons finally began to
bind with nuclei. This is called the recombination era,
even though it was in fact the first time that nuclei and
electrons combined. The process of recombination began only when
the temperature dropped far enough that a "typical" photon no longer
had enough energy to break the electrostatic attraction (also known
as Coulomb force) between
an electron and a nucleus. This process happened gradually,
because not all photons had the same energy. They were distributed
in a way characteristic of blackbody radiation, where the energy
range of photons is spread out on either side of a peak value.
The range is especially large on the high end, so there are always
a few very energetic photons around even after the "typical"
photon has an energy close to the peak of the curve.
The recombination process began around 240,000 years after the
big bang, when the temperature had dropped to very low levels,
relatively speaking – about 3700 K. By comparison, the surface
of the Sun has a temperature of about 6000 K. The peak of the spectrum of
a blackbody radiator at 6000 K is at roughly 500 nm (nanometers), in
the blue-green part of the visible spectrum, although the Sun's light as a
whole looks yellowish-white. At 3700 K the blackbody spectrum is flatter, and has a peak
around 780 nm, on the boundary between red and infrared. Remember, though,
that there would still be many photons with much higher energies, so
the probability was still high that an electron bound to a nucleus
would interact with a photon.
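The blackbody peak wavelengths discussed above can be estimated with Wien's displacement law, which says the peak wavelength is inversely proportional to temperature. A minimal sketch (the function name is ours):

```python
WIEN_CONSTANT_M_K = 2.897771955e-3  # Wien displacement constant, meter-kelvin


def peak_wavelength_nm(temperature_k):
    """Wavelength (nm) at which a blackbody at temperature_k is brightest."""
    return WIEN_CONSTANT_M_K / temperature_k * 1.0e9


for label, temperature in (("Surface of the Sun", 6000.0),
                           ("Start of recombination", 3700.0)):
    print(f"{label}: {temperature:.0f} K -> peak near "
          f"{peak_wavelength_nm(temperature):.0f} nm")
```

This puts the peak a little below 500 nm at 6000 K and a little below 800 nm at 3700 K.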
3700 K is the temperature at which half of the
nuclei should have had a full complement of electrons, while the other half
were still fully or partially ionized. A photon could still interact with an
electron bound to a nucleus without stripping the electron from
the nucleus; it might simply raise the energy level of the electron
rather than ejecting it from the atom.
The universe continued to expand, so the density of photons
and matter kept dropping. The simultaneous decrease of both density
and temperature made it increasingly less likely that a photon
would interact with matter. The probability of interaction can be expressed
in terms of the mean free path of a photon, which is the
expected distance that a photon can travel before interacting with
matter. When this mean free path became longer than the Hubble length
(defined as the speed of light divided by the Hubble parameter at the
time – which is the distance light could travel in the approximate
age of the universe at the time), the probability of interaction became
essentially nil. The time at which this occurred is called the time of
photon decoupling. That was about 350,000 years after the big
bang, when the temperature was about 3000 K.
At the time of photon decoupling, the universe first became transparent,
for all practical purposes – light was essentially no longer
scattered by matter. This is the origin of cosmic background
radiation, also known as the cosmic microwave background (CMB).
(The "microwave" here is misleading – that is the part of the
spectrum in which the radiation is observed today, over 13 billion years later.
At the time of photon decoupling, the radiation was mostly in the
near infrared.)
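How far the radiation has shifted since decoupling can be estimated from the two temperatures alone. The sketch below assumes that the blackbody temperature falls in inverse proportion to the stretching of wavelengths, and it uses Wien's displacement law for the peak wavelength. Both are standard results, but neither is derived in this section. The names are ours.

```python
WIEN_CONSTANT_M_K = 2.897771955e-3  # Wien displacement constant, meter-kelvin

T_DECOUPLING_K = 3000.0  # temperature at photon decoupling quoted in the text
T_NOW_K = 2.725          # present CMB temperature quoted in the text


def peak_wavelength_m(temperature_k):
    """Peak wavelength (m) of a blackbody at temperature_k."""
    return WIEN_CONSTANT_M_K / temperature_k


# Factor by which every wavelength has been stretched since decoupling.
stretch_factor = T_DECOUPLING_K / T_NOW_K

peak_then_nm = peak_wavelength_m(T_DECOUPLING_K) * 1.0e9
peak_now_mm = peak_wavelength_m(T_NOW_K) * 1.0e3

print(f"Wavelengths stretched by a factor of about {stretch_factor:.0f}")
print(f"Peak at decoupling: about {peak_then_nm:.0f} nm (near infrared)")
print(f"Peak today: about {peak_now_mm:.2f} mm (microwave)")
```

The stretch factor of about 1100 is what carries the radiation from just beyond the red end of the visible spectrum to wavelengths of about a millimeter.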
Of course, the probability of a photon interacting with matter is
never quite zero. Photons from the CMB are still interacting with
matter today, such as the matter in one of our microwave antennas.
That is how we are able to "see" the CMB. The fact that we can
actually "see" the CMB is one of two important features about it.
It is important, because the existence of the CMB, at an equivalent
temperature of about 2.725 K, is a very strong piece of evidence
supporting the big bang model. The model predicts its existence,
so we must see it (as we do) to confirm the prediction.
It is also a feature that is very hard to account for in
other proposed cosmological theories, such as the "steady state"
theory.
The other reason that the CMB is so important is that it carries a
wealth of information about the universe at the time of decoupling,
and much earlier as well. Some of what we have learned from
very careful measurements of the CMB includes:
The CMB is almost (but not quite) perfectly homogeneous. The
homogeneity, with temperature variations no more than 1 part in 10,000,
is what we expect from the hypothesis that the universe itself is
homogeneous. The small amount of inhomogeneity is exactly what we expect
from slight pressure waves that reverberated in the universe at the time
of decoupling.
The size of inhomogeneities (measured in angular size rather
than temperature variation from the mean) is just what we would
expect on the assumption the universe is nearly "flat", with no
measurable curvature. This is strong evidence for the hypothesis
of cosmic inflation. In addition, flatness is a very important property
influencing the evolution of the universe, as we will explain
at length in the next section.
The full size spectrum of inhomogeneities also has implications
for the relative amounts of baryonic and nonbaryonic dark matter,
as well as for the way that matter later clumped together (after
about half a billion years) to form the earliest galaxies.
The lack of repetitive patterns of any kind observable in
the temperature variations in the CMB implies that the universe
as a whole has a very simple topology (shape), rather than
something more complicated such as a torus.
We have an extensive discussion of the CMB
elsewhere, so that should be enough about
it for now.
The period after photon decoupling is called, simply, the period of the
"early universe". The universe was a very boring place then. Light
could move freely, but (at first) there was nothing to see –
no stars, no galaxies. The period is sometimes also called the
"dark age" – partly because the only light source was the
CMB, which was actually infrared at the time, hence not all that
bright, and partly because we know very little about the period.
It can be calculated that the existing hydrogen and helium gas
should have been able to condense into the first generation of
stars within 100 million years after the big bang, perhaps
even earlier. It can also be calculated that the very first stars
would have been rather different from typical stars today. They
would have been much more massive. But like very massive stars
today, they would have converted matter to energy in thermonuclear
reactions quite rapidly, and so they would have been very bright,
with very high surface temperatures. The stars would have been
so hot that the photons they emitted would be energetic enough
to reionize lots of interstellar gas, producing glowing gas clouds such
as still exist around very hot stars. Accordingly, this period
is known as the period of reionization.
Still later, maybe around 500 million years after the big bang,
stars began to cluster together in galaxies. This could not have
occurred so early without the presence of vast amounts of nonbaryonic
dark matter. It is supposed that there were regions of relatively
higher and lower densities of such dark matter. The regions of
higher density were massive enough to cause existing stars and gas
(baryonic matter) to fall into them – making galaxies.
Some of these very first galaxies may actually have been imaged
by the Hubble Space Telescope, at a redshift of 8 or more.
We will discuss elsewhere in more detail the processes that went on in the
early universe to produce stars and
galaxies, so this is an appropriate time to bring our overview of
the first few hundred million years of the universe to a close.
Cosmological models
And now for something completely different.
In order to give a more detailed explanation of how the universe evolves
under the big bang scenario, we need to look at the general features
of any cosmological model. And to do that, we need to say a few words about
the use of "models" in science.
A model in science is really quite simple. It typically consists of
a set of concepts that either implicitly or explicitly refer to some
aspect of the "real world". In addition to the concepts, a model
explicitly contains some assumptions about properties of
the real world things to which the concepts refer. Generally, these
assumptions make some sort of intuitive sense, or may even seem
fairly self-evidently true – but not necessarily. A good case in
point would be Einstein's special theory of relativity. Two of the
key assumptions there are that the speed of light (in a vacuum)
is always the same and that nothing, neither physical objects nor
information, can travel faster than the speed of light. The first
assumption seems reasonable enough, though it certainly requires
experimental confirmation. The second assumption is not so intuitively
obvious, and it certainly isn't easy to imagine how it could be
definitely proven by experiment. Nevertheless, the theory actually
works very well: it makes testable predictions, and its
predictions have survived all experimental tests – so far.
What's the difference between a "model" and a "theory", then? Nothing,
really, other than semantics. The term "theory" carries with it various
differing connotations. Sometimes the term is used as in "It's only a
theory", implying that the theory has not been rigorously confirmed,
and may in fact be somewhat doubtful. However, nothing in science has
in fact been absolutely confirmed beyond any possibility of doubt.
Science doesn't claim to be able to do that. It claims only, at best,
to be able to confirm some theories very, very well, beyond any
"reasonable" doubt. Quite a few "theories" are actually in this
category, such as Newton's "theory" of gravity (at least as an
approximation – some of its details are known to be not 100%
accurate), and the special theory of relativity.
With a model, however, one doesn't worry excessively whether all of its
assumptions seem intuitively true or can even be confirmed directly.
The only thing that really matters is that the model has testable
predictions, and that its predictions, when tested, range from "pretty
accurate" to "correct up to the limits of measurement". The "theory
of quantum electrodynamics", for example, which describes the behavior
of light and electric charge, is often cited as one of the most
exact theories in science, and its predictions have been confirmed to
more than 10 decimal places.
But many times, a scientific model may be far less accurate and
definitive. Physicists, for instance, often make various simplifying
assumptions about a situation in order to make numerical predictions
feasible. From this arise such known impossibilities as a "perfect gas"
or a "frictionless plane". Part of the folk humor of the physics
community is the story of the physicist who, when asked by a dairy
farmer for advice, began by saying, "Consider a spherical cow..."
Often a model will contain many somewhat arbitrary adjustable parameters
and arbitrary mathematical equations that relate parts of the model.
Typically, the equations can't be solved exactly, and the whole model
is programmed into computer code. You run the code, and if the
results are a good match with details that can be measured empirically,
the model is considered a success. The better the match, and the
wider and more general the set of circumstances covered, the more
successful the model is considered to be. There are a vast number
of such models used in science today, from the "standard model" of
particle physics to weather and climate models.
But this doesn't really have anything to do with computers and certainly
isn't new in science. It's actually just an instance of what used to
be called "inductive reasoning". If you make a series of observations
and consequently are able to formulate a set of rules which accurately
describe the results, then you have a "theory" or a "model" or whatever
you want to call it. For example, between 1609 and 1618 Johannes Kepler
enunciated three "laws of planetary motion". These laws accurately
described mathematically the motion of planets in the solar system.
Kepler was an astronomer (actually, an astrologer)
who began as an assistant to Tycho Brahe. He
devised his laws inductively, based on the extensive observations made
by Tycho and himself. The result was quite a good model of planetary
motion. Kepler could not explain or justify why his laws worked. It was
good enough that they did. But of course, about 50 years later, Isaac
Newton came along and demonstrated mathematically why Kepler's laws
worked – provided you accepted Newton's theory of gravity and his
more general "laws of motion". Newton's theory
is considered to be "deductive", even though his own laws of motion
themselves are essentially just plausible assumptions which are
verifiable and which, when analyzed mathematically, imply Kepler's
empirical laws among their predictions.
Newton's theory of gravity itself was eventually superseded by Einstein's
general theory of relativity. And this theory, like Newton's before it,
starts by making various sweeping but plausible assumptions. One of these
was the special theory of relativity. Others concerned the mathematical
form that a theory of gravity should have. But what eventually resulted
was a theory which not only generalized Newton's theory, but was
also able to make correct predictions where Newton's theory failed
(such as the orbit of Mercury), and to make entirely new and surprising
predictions, such as the bending of light by matter.
Fundamental assumptions
With all this as preamble, it should be a little clearer what is meant
when we discuss the big bang as a type of cosmological model – in fact,
a family of closely related models – and so successful that it is called
the standard cosmological model. Although some of the assumptions
may seem a little arbitrary, and it does use a few adjustable parameters,
the model has made a number of successful predictions, and it is by a
wide margin the best description we currently have of what our universe
is "really" like.
The main assumption that is made in the model is what is known as the
cosmological principle. This says that the universe looks,
approximately, the same from any vantage point. More specifically, the
assumption is, first, that the universe is homogeneous –
essentially the same everywhere. And second, the universe is
isotropic – it looks essentially the same in any direction.
We hedge a little, with words like "approximately" and "essentially",
because clearly on a small scale the assumptions are not true. Our part
of the universe, near a medium size star in a rather average spiral
galaxy, is certainly not like the interior of a black hole or the
empty space between galaxies. However, the assumption is that on
very large scales the universe has no preferred location and no
preferred direction.
The two parts of the assumption are not redundant. The universe could
be isotropic when viewed from one location
without being homogeneous. For instance, it could be
much denser (on a large scale) at some distances than others. And it
could be homogeneous without being isotropic if, for example, all
galaxies were lined up parallel to each other. But it is true that if
the universe is isotropic everywhere, it must in fact be homogeneous.
These two conditions amount to the universe having certain types of
symmetry: everything about the universe is symmetric (at large scales)
under the symmetry operations of spatial translation and rotation.
One consequence of the homogeneity condition is that the big bang
event (if such existed, which isn't in fact absolutely required in
all versions of the model) didn't happen at one particular point in
space. It must have happened "everywhere". And although the universe
certainly appears to be expanding away from us in all directions, the
same must be true from all other vantage points as well. This is the
familiar analogy to spots on an inflating balloon.
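The claim that uniform expansion looks the same from every vantage point can be checked with a toy example. The sketch below places a few galaxies along a line and gives each a recession velocity proportional to its distance from one of them. The positions and the value of the Hubble constant are made-up illustrative numbers, not measurements.

```python
H = 70.0  # illustrative Hubble constant, km/s per megaparsec

# Positions (megaparsecs) of a few galaxies along a line; index 2 is "us".
positions = [-300.0, -100.0, 0.0, 50.0, 200.0]

# Velocities as seen from the galaxy at position 0: v = H * x.
velocities = [H * x for x in positions]


def view_from(observer_index):
    """Relative (position, velocity) of every galaxy as seen by one observer."""
    x0 = positions[observer_index]
    v0 = velocities[observer_index]
    return [(x - x0, v - v0) for x, v in zip(positions, velocities)]


def sees_hubble_law(observer_index, tolerance=1e-9):
    """True if this observer also finds velocity = H * distance for everyone."""
    return all(abs(dv - H * dx) <= tolerance
               for dx, dv in view_from(observer_index))


for i in range(len(positions)):
    print(f"Observer at {positions[i]:7.1f} Mpc sees a uniform expansion: "
          f"{sees_hubble_law(i)}")
```

Every observer finds all the other galaxies receding at a speed proportional to distance, with the same constant, so none of them can claim to be at the center.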
Note that the universe is not assumed to have such
symmetry with respect to time, even though spacetime is still
assumed to be 4-dimensional just as described in special relativity.
The symmetries pertain strictly to the 3 spatial dimensions. It is
in fact definitely assumed that the universe must have been very
different at times sufficiently far back in the past, as well as
(probably) far in the future. The so-called "steady state" cosmological
model which assumes time symmetry – that the universe has always
looked about the same and always will – seems to conflict in a
number of ways with what we actually see around us.
For instance, there are several compelling reasons to believe that the
universe must have been much hotter and denser in the past than it
is now.
This brings us to another assumption: that the universe is not spatially
static but is in fact expanding, in a certain sense that we will make
more precise. As we noted above, this assumption is founded on the
observations of galactic redshifts first made by Edwin Hubble, which,
since his time, have been confirmed in many ways. Yet in some sense, the
assumption of an expanding universe remains "just" an assumption. There
are still astronomers and cosmologists today who seek alternative
explanations for the observed redshifts. If one of these alternatives
were correct, it might not be necessary to assume the universe is
expanding, as it appears to be. However, if the universe is not in
fact expanding, then a variety of other things we can observe, such
as the cosmic microwave background (CMB), have no obvious explanation.
So we make the assumption of expansion not only because that's the
simplest explanation of Hubble's observations, but also because it
yields a consistent model that explains many other things too –
such as the apparent age of the oldest stars, the chemical composition
of the universe, and the CMB.
It's also worth noting that Einstein himself, shortly after developing
the general theory of relativity, assumed the universe was not
expanding, since before Hubble almost everyone assumed the universe
was static. Einstein was in fact much annoyed because his theory
strongly implied that the universe shouldn't be static, and
ought to be either expanding or contracting. After all, an object
thrown into the air from the surface of the Earth doesn't just hang in
midair. At first it rises, and then it falls back under the force of
gravity. It would be very surprising if the universe, under the force
of gravity, didn't behave similarly. Einstein even went so far as to
modify his theory to allow for the universe to be static by adding
the cosmological constant to it. As it turned out, after Hubble
proved the universe wasn't static, it seemed that adding the cosmological
constant was a mistake. (Though it has returned, with a vengeance,
as a result of fairly recent observations, as we will explain at some
length.)
General relativity, geometry, coordinate systems
The general theory of relativity is very much a fundamental ingredient
of any viable cosmological model today. The theory "explains" gravity
in terms of geometry. We discuss the theory in much more detail
elsewhere. But the general idea is that
4-dimensional spacetime is curved. Gravity is "nothing but" an
apparent force that results from the curvature of spacetime. And yet,
gravitating matter is also what "causes" spacetime to curve. As it is
sometimes put, the curvature of spacetime is what tells matter how to
move, but it is matter that tells spacetime how to curve.
The mathematical way to describe space and curvature is a subject
known as Riemannian geometry, after Bernhard Riemann
(1826-66), who developed the necessary mathematics about 50 years
before Einstein availed himself of it. In modern terminology, the
theory is all about objects known as Riemannian manifolds.
The key feature of such an object is that it possesses something known
as a metric, which provides a way of measuring distances within
the manifold, independently of any external geometrical constructs, such
as a global Cartesian coordinate system. For example, it is possible to
describe distances and all other geometric aspects of the 2-dimensional
surface of a sphere (such as the surface of the Earth), without
referring to how the surface is embedded in 3-dimensional space.
Likewise, the 4-dimensional spacetime that figures in the special
and general theories of relativity can be described completely in
terms of a metric, and things like curvature can be defined
rigorously, even though it is not easy to visualize just how a
4-dimensional space is able to curve.
From our assumption that the universe is homogeneous, we can deduce
that the same sort of local coordinate system can be used everywhere.
Further, the metric used to define distances must have the same form
everywhere. This isn't to say that one single coordinate system
applies everywhere, only the same sort of coordinate system.
We can make this more concrete. Imagine any point within the universe,
such as the current location of the Earth. You can make this the
origin of a spherical coordinate system, involving radial distance
and two angular coordinates – essentially
radial distance from the center of the Earth, plus latitude and
longitude. (This is only a way to describe 3-dimensional space. For the
moment, don't worry about the time dimension of spacetime.)
By virtue of the assumed homogeneity of the universe, it simply does
not matter where we place the center of the coordinate system.
It could be here on Earth, or it could be somewhere a few billion
light-years away. Any description of the universe should be
essentially the same no matter where we place the center.
There is one special type of coordinate system which is particularly
useful. It is called a comoving coordinate system. Still
assuming a spherical type of coordinate system as well, given some
arbitrary point as a center, we can identify any other point in the
universe by a 3-tuple of numbers: (x, θ, φ).
The quantity x is still
the distance from the center to the other given point, and θ,
φ are the two angular coordinates. The thing about a comoving
coordinate system is that, with both the center and the other point
fixed, the 3-tuple of coordinates never changes as a result of
deformations (such as expansion and contraction) of space itself. You
could imprint the coordinate system onto space, and the numbers would
never change. Of course, an actual particle momentarily located at a
particular point can move around within space, but the points through
which the particle moves keep the same coordinates no matter what space
does. That is, x, θ, and φ are constants.
However, we are still assuming that space actually is expanding (or maybe
contracting). Therefore, the "real" distance r (sometimes called
proper distance) from the origin to the selected point will
change as space expands or contracts. r = r(t) is actually a function
of time alone. By virtue of our assumption
that the universe is isotropic (on a large scale), the "real" angles θ
and φ of the selected point never change. This means that we
assume the universe isn't expanding at different rates in different directions
that we might look. (r(t), θ, φ)
is the time-dependent vector that defines the real location of the
selected point as time passes. It follows from our assumptions of
homogeneity and isotropy that r(t) = a(t)x for some function a(t)
that doesn't depend on the origin of the coordinate system,
x, θ, or φ – only on time.
Note that a(t) is a ratio of distances, and so it is a dimensionless
quantity. We are still free to pick the time at which proper distance
and comoving distance agree. It's only natural to pick this to be the present
time, which means that right now real distance = comoving distance,
and so a(t_now) = 1. This removes the remaining freedom in defining a(t)
(up to uncertainty of measurement).
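As a small illustration of the relation r(t) = a(t)x, the sketch below converts a comoving distance into a proper distance. The function name `proper_distance` and the sample numbers are illustrative assumptions, not observed values.

```python
def proper_distance(a, x):
    """Proper distance r = a * x for scale factor a and comoving distance x."""
    if a <= 0:
        raise ValueError("scale factor must be positive")
    return a * x


# Illustrative comoving distance in an arbitrary length unit (assumed value).
x = 1000.0

# With the convention a(t_now) = 1, proper and comoving distance agree today.
r_now = proper_distance(1.0, x)

# When the scale factor was half its present value, the same two points
# were half as far apart in proper distance.
r_then = proper_distance(0.5, x)

print(r_now, r_then)
```

The comoving distance x stays the same in both calls; only the scale factor changes.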
a(t) is the most important mathematical quantity in the big bang model.
It is called the scale factor. a(t) describes how space is
expanding or contracting everywhere (by virtue of homogeneity)
solely as a function of time, t, measured since the instant of the
big bang. For some values of t, space may be expanding (a(t) is
increasing) and at other values space may be contracting (a(t) is
decreasing). If we knew a(t), then we'd know, in principle,
exactly what space is doing at any point in time. Other
interesting quantities one might care about, such as the
rate of expansion (i.e., the Hubble "constant"), the density of
matter and energy, etc., can be related to a(t).
A simple way to think of the scale factor a(t) is that it is a
kind of ruler one may use to measure distances in the universe.
This ruler has the useful property that, regardless of when you
measure the distance between two particular distantly-separated
objects, like galaxies, the distance never changes. (The objects
should be far enough apart that actual relative motions are small
enough to be negligible in comparison to the expansion of space.)
So this ruler
measures comoving distance. But it is a peculiar sort of ruler,
which itself expands (or contracts) just as space does.
Since the scale factor measures the expansion of space, it will affect
the wavelength of all photons. Therefore it must be related to redshift
z, which was defined so that z + 1 is the ratio of a photon's
observed wavelength to its wavelength at the time of emission. This
ratio is the same as the ratio between the present scale factor, which
is 1, and the scale factor when the photon was emitted. Therefore z + 1 =
1/a(t), where t is the time of emission. So a is about 1/z for large z
(i.e., early in the history of the universe).
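The relation between redshift and scale factor can be sketched in a few lines. The function name `scale_factor_at_emission` is a hypothetical helper, and the redshift values are chosen only for illustration.

```python
def scale_factor_at_emission(z):
    """Scale factor a at emission for light observed now at redshift z.

    Uses z + 1 = 1/a with the convention that the present scale factor is 1.
    """
    if z <= -1:
        raise ValueError("redshift must be greater than -1")
    return 1.0 / (1.0 + z)


# Illustrative redshifts (assumed values, not measurements).
for z in (1.0, 9.0, 1000.0):
    a = scale_factor_at_emission(z)
    # For large z the exact value 1/(1 + z) is close to the shortcut 1/z.
    print(z, a, 1.0 / z)
```

At z = 1 the universe was half its present size, so the shortcut 1/z is poor there; by z = 1000 the two values differ by only about one part in a thousand.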
There is another, equivalent, way to conceptualize a(t). It is
based on the mathematical formulation of general relativity.
General relativity can be summarized in a single equation:
\[ R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = \frac{8\pi G}{c^4} T_{\mu\nu} \]
This is an equation involving things called tensors, which are,
loosely speaking, generalizations of vectors that carry more than one index.
R_μν is called the Ricci tensor. It describes the curvature
of spacetime. (μ and ν are indices taking integer values from 0 to 3.)
g_μν is the metric tensor; it consists of numbers that specify the metric
used to measure distance.
R is the Ricci scalar, an averaged value of curvature.
G is Newton's gravitational constant, c is the speed of light, and
T_μν is the energy-momentum tensor that describes some given distribution of
matter and energy, in terms of things like momentum, density, and pressure.
The tensor notation simply makes the equation more compact. In reality,
this tensor equation is the same as a system of up to 10 separate but
related equations.
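The count of 10 can be checked directly. Each tensor in the equation has two indices running from 0 to 3, which would give 16 components, but the tensors are symmetric (swapping μ and ν gives the same component), so only the unordered index pairs are independent. The short sketch below simply enumerates those pairs; the variable names are illustrative.

```python
from itertools import combinations_with_replacement

# Indices mu and nu each take the values 0, 1, 2, 3.
indices = range(4)

# For a symmetric tensor the (mu, nu) and (nu, mu) components are equal,
# so only unordered pairs with mu <= nu are independent.
independent_pairs = list(combinations_with_replacement(indices, 2))

print(len(independent_pairs))
print(independent_pairs)
```

Symmetries of a particular problem, such as the homogeneity and isotropy assumed in cosmology, reduce the number of distinct equations further, which is why the text says "up to" 10.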
From this, assuming a particular metric tensor,
if a distribution of matter and energy is specified, the
curvature of space can (in principle) be calculated. Conversely,
given complete information about curvature, the distribution of
matter and energy can be calculated.
One way to use the Einstein equation is
to assume a metric tensor of a specific form and some distribution
of matter and energy, and from these derive simpler differential
equations that can be solved, at least approximately.
So, as promised, the equation relates the curvature of space to the
distribution of matter and energy. (Note that energy as well as matter
affects how space curves.)
The first step in applying the equation is to put some constraints
on the form of the metric tensor. For purposes of cosmology, we want
to have a metric which
corresponds to a spacetime that is spatially homogeneous and
isotropic. It can be shown that the metric must have the form:
\[ ds^2 = -c^2\,dt^2 + a(t)^2 \left[ \frac{dr^2}{1 - k r^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right] \]
Here, r, θ and φ are the comoving coordinates already mentioned (the
radial coordinate r plays the role that x played above).
dr, dθ, and dφ are infinitesimal changes in coordinate values,
in the sense of calculus. ds is the resulting infinitesimal interval
in spacetime. c is the speed of light, and k is a constant that
reflects curvature. k may be positive, negative, or zero, but it must
be constant if the universe is to be homogeneous. That is, the
curvature must be the same everywhere (on a large scale).
The only other free parameter in this metric is the function
a(t), which is none other than the scale factor.
This metric is known as the Robertson-Walker metric, after
Howard Robertson and Arthur Walker, who derived it in the 1930s.
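One consequence of this metric can be sketched numerically. At a fixed instant (dt = 0) and along a purely radial line (dθ = dφ = 0), the metric reduces to ds = a dr/√(1 − kr²), so the proper distance out to radial coordinate r is a times the integral of dr/√(1 − kr²). The code below evaluates that integral with a simple midpoint rule and compares it with the standard closed forms. The function names, step count, and sample values of a, r, and k are all illustrative assumptions.

```python
import math


def radial_proper_distance(a, r, k, steps=10_000):
    """Midpoint-rule estimate of a * integral_0^r dr' / sqrt(1 - k r'^2)."""
    if k > 0 and k * r * r >= 1:
        raise ValueError("need k * r**2 < 1 for positive curvature")
    h = r / steps
    total = 0.0
    for i in range(steps):
        mid = (i + 0.5) * h
        total += h / math.sqrt(1.0 - k * mid * mid)
    return a * total


def closed_form(a, r, k):
    """Closed-form value of the same integral for k = 0, k > 0, k < 0."""
    if k == 0:
        return a * r  # flat space: proper distance is just a * r
    s = math.sqrt(abs(k))
    if k > 0:
        return a * math.asin(s * r) / s
    return a * math.asinh(s * r) / s


# Illustrative values: scale factor 0.5, coordinate r = 1 (arbitrary units).
for k in (0.0, 0.25, -0.25):
    print(k, radial_proper_distance(0.5, 1.0, k), closed_form(0.5, 1.0, k))
```

For k = 0 the result is exactly a times r, which is the relation r(t) = a(t)x used earlier.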
Usually this metric isn't worked with directly in cosmology. Instead,
one uses the metric and the Einstein equation to derive simpler
differential equations, most importantly one called the Friedmann
equation (after Alexander Friedmann), which involves quantities of
more direct cosmological relevance. The Friedmann equation is rigorously
derivable from Einstein's equation and
the Robertson-Walker metric, but in fact it can be derived,
with a little hand-waving, from Newton's original theory of gravity.
(Unfortunately, we have to assume you've been exposed to the
mathematical form of some of Newton's laws involving energy, as well as
a tiny bit of calculus in order to fully understand this part.)
The Friedmann equation
To derive the Friedmann equation, we begin by picking an arbitrary
point in space as the origin of a spherical coordinate system. It
doesn't matter where this point is, because space is assumed to be
homogeneous. And the orientation of the coordinate system
corresponding to angles θ = 0 and φ = 0 doesn't matter
either, since space is assumed to be isotropic.
Our plan for deriving the equation is to consider a particle of matter,
a "test particle", having mass m and coordinates (r, θ, φ) in
our chosen coordinate system. The test particle may be at an arbitrarily
large distance r from the origin of the coordinate system, even billions
of light years. A sphere of radius r centered at the origin will contain
a certain amount of matter. Because we are assuming that
space is homogeneous, we don't need to specify where each individual
lump of matter is, only that all matter is distributed uniformly with
average density of ρ (mass per unit volume, expressed in units
compatible with those used for r
and m). It is perfectly true that this idealized situation is not
exactly correct in the real universe, but – like such things as
"frictionless planes" – it is close enough that we actually get a
useful result that is a good approximation for our purposes when we make
the simplifying assumptions.
Now, a basic theorem of Newton's theory of gravitation is that the
motion of the test particle depends only on the matter located inside a
sphere about the origin whose radius r is the same as the particle's
distance from the origin, so that the particle lies on the surface of
that sphere. In other words, all matter farther from the origin than the
test particle is irrelevant to how the particle moves (as long as all
that matter is distributed uniformly). Moreover, the particle's motion
is the same as what it would be if the entire mass inside the sphere
were located at the origin. This reduces the whole problem to a
situation involving only two particles.
Let M be the mass of the matter inside the sphere, so that
M = 4πρr³/3, since the volume of the
sphere is 4πr³/3. Newton's law then says
that the force acting on the test particle is
\[ F = \frac{G M m}{r^2} = \frac{4\pi G \rho r m}{3} \]
where G is Newton's constant.
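The two forms of the force can be checked against each other numerically. In the sketch below the density, radius, and test mass are illustrative values chosen only to exercise the formulas, and the function names are hypothetical.

```python
import math

G = 6.674e-11  # Newton's constant in SI units (m^3 kg^-1 s^-2)


def enclosed_mass(rho, r):
    """Mass inside a sphere of radius r filled with uniform density rho."""
    return 4.0 * math.pi * rho * r**3 / 3.0


def force_from_mass(rho, r, m):
    """Force magnitude G M m / r^2, with M the enclosed mass."""
    return G * enclosed_mass(rho, r) * m / r**2


def force_from_density(rho, r, m):
    """Equivalent form 4 pi G rho r m / 3."""
    return 4.0 * math.pi * G * rho * r * m / 3.0


# Illustrative inputs (assumed, not measured): density in kg/m^3,
# radius in m, test mass in kg.
rho, r, m = 1e-26, 1e25, 1.0

print(force_from_mass(rho, r, m), force_from_density(rho, r, m))
```

Because the enclosed mass grows as r³ while the inverse-square law falls as 1/r², the force on the test particle grows linearly with its distance from the origin.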
Most of the time when one wants to derive equations for the motion
of a particle, it is done by appealing to the principle of energy
conservation – the total energy of the system is constant in time.
The total energy is made up of two parts: kinetic energy and potential
energy – energy due to motion and energy due to gravitational force.
The mass M at the origin is assumed not to move
(in the chosen coordinate system), so it has no
kinetic energy. Since the test particle moves as though only one
other particle is involved, all its motion is in the radial direction,
and its velocity is dr/dt = r′, so its kinetic energy is
mr′²/2.
The potential energy is defined in terms of the two masses together, and
is given by
\[ V = -\frac{G M m}{r} = -\frac{4\pi G \rho r^2 m}{3} \]
Conservation of energy means that the total energy:
\[ E = \frac{m r'^2}{2} - \frac{4\pi G \rho r^2 m}{3} \]
is constant as a function of time.
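Energy conservation in this two-particle picture can be illustrated with a short simulation. The sketch below moves the test particle radially under the attraction of the central mass and tracks E = mr′²/2 − GMm/r at every step. It assumes that the mass M inside the sphere stays fixed as the sphere expands (so the density falls as the radius grows), uses made-up units in which G = M = m = 1, and uses a velocity Verlet integrator, which is a choice made here and not something the text prescribes.

```python
# Hypothetical units chosen for convenience: G = M = m = 1.
G = 1.0
M = 1.0  # mass inside the sphere, assumed constant as the sphere expands
m = 1.0  # mass of the test particle


def total_energy(r, v):
    """Kinetic plus potential energy of the test particle."""
    return 0.5 * m * v * v - G * M * m / r


def acceleration(r):
    """Radial acceleration of the test particle toward the origin."""
    return -G * M / (r * r)


def evolve(r, v, dt, steps):
    """Advance radius r and radial velocity v with velocity Verlet steps."""
    energies = [total_energy(r, v)]
    acc = acceleration(r)
    for _ in range(steps):
        v_half = v + 0.5 * dt * acc
        r = r + dt * v_half
        acc = acceleration(r)
        v = v_half + 0.5 * dt * acc
        energies.append(total_energy(r, v))
    return r, v, energies


# Start at r = 1 moving outward with speed 1 (illustrative initial state).
r_end, v_end, energies = evolve(1.0, 1.0, 1e-4, 10_000)
drift = max(abs(e - energies[0]) for e in energies)
print(r_end, v_end, drift)
```

The particle moves outward and slows down, while the total energy stays constant to within the small numerical error of the integrator.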
We can now switch to a comoving coordinate system where x (which
is related to r(t) by r(t) = a(t)x) is the unvarying distance of the
test particle from the origin. Taking derivatives, r′ =
a′x (since x is constant in time). Substituting this into
the above gives
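\[ E = \frac{m (a' x)^2}{2} - \frac{4\pi G \rho (a x)^2 m}{3} \]
Dividing both sides of the equation by a²
and rearranging gives
\[ \left(\frac{a'}{a}\right)^2 \frac{m x^2}{2} = \frac{4\pi G \rho x^2 m}{3} + \frac{E}{a^2} \]
and so
\[ \left(\frac{a'}{a}\right)^2 = \frac{8\pi G \rho}{3} + \frac{2E/(m x^2)}{a^2} \]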
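The rearrangement above is easy to check with numbers. The sketch below picks arbitrary values for m, x, a, a′, and ρ, computes E from the energy equation, and confirms that the last equation reproduces (a′/a)². All numerical values and function names are assumptions made for the check, not physical measurements.

```python
import math

G = 6.674e-11  # Newton's constant in SI units


def energy(m, x, a, a_dot, rho):
    """E = m (a' x)^2 / 2 - 4 pi G rho (a x)^2 m / 3."""
    return 0.5 * m * (a_dot * x) ** 2 - 4.0 * math.pi * G * rho * (a * x) ** 2 * m / 3.0


def right_hand_side(m, x, a, rho, E):
    """8 pi G rho / 3 + (2E / (m x^2)) / a^2."""
    return 8.0 * math.pi * G * rho / 3.0 + (2.0 * E / (m * x * x)) / (a * a)


# Arbitrary illustrative values (SI units assumed).
a, a_dot, rho = 0.8, 1.5e-18, 5e-27

# Two different test particles: different masses and comoving distances.
for m, x in ((2.0, 3e22), (7.0, 9e23)):
    E = energy(m, x, a, a_dot, rho)
    lhs = (a_dot / a) ** 2
    rhs = right_hand_side(m, x, a, rho, E)
    # 2E/(m x^2) comes out the same for both particles.
    print(lhs, rhs, 2.0 * E / (m * x * x))
```

Both test particles give the same value of 2E/(mx²), even though their energies E differ.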
We have written the equation in this form to isolate the quantity
(a′/a)² for reasons that will be clear
momentarily. But first, note that a = a(t) is a function of t that doesn't
depend on the value of x (or m), so the same is true of
(a′/a)². The first term on the right side
of the equation is obviously independent of m and x as well. Therefore,
2E/(mx²) is also independent of m and x.
Hence E (the total energy) is proportional to
mx², and there must be a constant k
such that
\[ -\frac{2E}{c^2} = k\,m x^2, \qquad \text{so} \qquad k = -\frac{2E}{m c^2 x^2}. \]
We threw in c², the square of the speed of
light, so that the mass-energy mc² of the
test particle appears in the equation for k, showing that k has units of
(length)⁻². Although E depends on m and x, it
is independent of time – since energy is conserved. Hence k is
independent of time as well. k turns out to be a constant related to
the overall curvature of space. Using k we can also rewrite the
equation for (a′/a)² as
\[ \left(\frac{a'}{a}\right)^2 = \frac{8\pi G \rho}{3} - \frac{k c^2}{a^2} \]
This is the Friedmann equation, which is the