
THE GALAXY

Notes for Lecture Courses


ASTM002 and MAS430
Queen Mary University of London
Bryn Jones
Prasenjit Saha
January to March 2006
Chapter 1
Introducing Galaxies
Galaxies are a slightly difficult topic in astronomy at present. Although a great amount is
known about them, galaxies are much less well understood than, say, stars. There remain
many problems relating to their formation and evolution, and even with aspects of their
structure. Fortunately, extragalactic research currently is very active and our understanding
is changing, and improving, noticeably each year.
One reason that galaxies are difficult to understand is that they are made of three very
different entities: stars, an interstellar medium (gas with some dust, often abbreviated as
ISM), and dark matter. There is an interplay between the stars and gas, with stars forming
out of the gas and with gas being ejected back into the interstellar medium from evolved
stars. The dark matter affects the other material through its strong gravitational potential,
but very little is known about it directly. We shall study each of these in this course, and to
a small extent how they influence each other.
An additional factor complicating an understanding of galaxies is that their evolution
is strongly affected by their environment. The gravitational effects of other galaxies can be
important. Galaxies can sometimes interact and even merge. Intergalactic gas, for example
that found in clusters of galaxies, can be important. These various factors, and the feedback
from one to another, mean that a number of important problems remain to be solved in
extragalactic science, but this gives the subject its current vigour.
One important issue of terminology needs to be clarified at the outset. The word
'Galaxy' with a capital G refers to our own Galaxy, i.e. the Milky Way Galaxy, as does
the word 'Galactic'. In contrast, 'galaxy', 'galaxies' and 'galactic' with a lower-case g
refer to other galaxies and to galaxies in general.
1.1 Galaxy Types
Some galaxies (more of them in earlier epochs) have active nuclei which can vastly outshine
the starlight. We shall not go into that here: we shall confine ourselves to normal galaxies
and ignore active galaxies.
There are three broad categories of normal galaxies:
elliptical galaxies (denoted E);
disc galaxies, i.e. spiral (S) and lenticular (S0) galaxies;
irregular galaxies (I or Irr).
Classifications often include peculiar galaxies which have unusual shapes. These are mostly
the result of interactions and mergers.
These classifications are based on the shapes and structures of galaxies, i.e. on the morphology.
They are therefore known as morphological types.
1.2 Disc Galaxies
Disc galaxies have prominent flattened discs. They have masses of $10^{9}\,M_\odot$ to $10^{12}\,M_\odot$. They
include spiral galaxies and S0 (or lenticular) galaxies. Spirals are gas rich and this gas takes
part in star formation. S0s on the other hand have very little gas and no star formation,
while their discs are more diffuse than those of spirals.
1.2.1 Spiral galaxies
Spiral galaxies have much gas within their discs, plus some embedded dust, which amounts
to 1-20% of their visible mass (the rest of the visible mass is stars). This gas shows active
star formation. The discs contain stars having a range of ages as a result of this continuing
star formation. Spiral arms are apparent in the discs, defined by young, luminous stars
and by HII regions. Spiral galaxies have a central bulge component containing mostly old
stars. These bulges superficially resemble small elliptical galaxies. The disc and bulge are
embedded in a fainter halo component composed of stars and globular clusters. The stars
of this stellar halo are very old and metal-poor (deficient in chemical elements other than
hydrogen and helium). Some spirals have bars within their discs. These are called barred
spirals and are designated type SB. Non-barred, or normal spirals are designated type S or
SA. The spectra are dominated by F- and G-type stars but also show prominent emission
lines from the gas: the spectra show the absorption lines from the stars with the emission
lines superimposed. The spiral disc is highly flattened and the gas is concentrated close to
the plane. The spiral arms are associated with regions of enhanced gas density that are
usually caused by density waves. Discs rotate, with the stars and gas in near-circular orbits
close to the plane, having circular velocities of about 200 to 250 km s$^{-1}$. In contrast, the halo stars
have randomly oriented orbits with speeds of about 250 km s$^{-1}$. Spiral galaxies are plentiful away
from regions of high galaxy density (away from the cores of galaxy clusters). All disc galaxies
seem to be embedded in much larger dark haloes; the ratio of total mass to visible stellar
mass is about 5, but we do not really have a good mass estimate for any disc galaxy.
The disc's surface brightness $I$ tends to follow a roughly exponential decline with radial
distance $R$ from the centre, i.e.,
$$ I(R) = I_0 \exp(-R/R_0) $$
where $I_0 \sim 10^{2}\,L_\odot\,\mathrm{pc}^{-2}$ is the central surface brightness, and $R_0$ is a scale length for the
decline in brightness. The scale length $R_0$ is about 3.5 kpc for the Milky Way.
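As a rough illustration of the exponential profile, the sketch below evaluates I(R) and the total luminosity it implies. The default values are the ones quoted above; the closed-form total, 2 pi I_0 R_0^2, is the standard integral of the profile over an infinite face-on disc and is an assumption added here rather than a result stated in these notes. The function names are illustrative.

```python
import math

def exponential_disc_brightness(R_kpc, I0=100.0, R0_kpc=3.5):
    """Surface brightness I(R) = I0 * exp(-R / R0), in L_sun pc^-2."""
    return I0 * math.exp(-R_kpc / R0_kpc)

def exponential_disc_luminosity(I0=100.0, R0_kpc=3.5):
    """Total luminosity (L_sun) of an infinite, face-on exponential disc.

    Integrating I(R) * 2*pi*R dR from 0 to infinity gives 2*pi*I0*R0**2.
    """
    R0_pc = R0_kpc * 1000.0  # I0 is per square parsec, so convert kpc to pc
    return 2.0 * math.pi * I0 * R0_pc ** 2

# Brightness at the Sun's distance (8 kpc) and the implied total luminosity.
print(exponential_disc_brightness(8.0))
print(exponential_disc_luminosity())
```

With these illustrative inputs the total comes out at several times 10^9 solar luminosities, which is only an order-of-magnitude figure given how uncertain I_0 is.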
There are clear trends in the properties of spirals. Those containing the smallest amounts
of gas are called subtype Sa and tend to have large, bright bulges compared to the discs,
tightly wound spiral arms, and relatively red colours. More gas-rich spirals, such as subtype
Sc, have small bulges, open spiral arms and blue colours. There is a gradual variation in
these properties from subtype Sa through Sab, Sb, Sbc, to Sc. Some classification schemes
include more extreme subtypes Sd and Sm. Subtypes of barred spirals are denoted SBa,
SBab, SBb, SBbc, SBc, ...
Because the spiral arms mark regions of recent star formation, the stars in the arms
are young and blue in colour. Therefore, spiral arms are most prominent when a galaxy
is observed in blue or ultraviolet light, and less prominent when observed in the red or
infrared. Figure 1.5 shows three images of a spiral galaxy recorded through blue, red and
infrared filters only. The spiral pattern is strong in the blue image, but weak in the infrared.
Observations show that there is a correlation between luminosity L (the total power
output of galaxies due to emitted light, infrared radiation, ultraviolet etc.) and the maximum
rotational velocity $v_\mathrm{rot}$ of the disc for spiral galaxies. The relationship is close to
$$ L \propto v_\mathrm{rot}^{4} $$
Figure 1.1: the sequence of morphological types of elliptical galaxies. Elliptical galaxies are
classified according to their shape. Panels: E0 (NGC 1407), E2 (NGC 1395), E4 (NGC 584), E6 (NGC 4033).
[Created with blue-band data from the SuperCOSMOS Sky Survey.]
Figure 1.2: the sequence of normal (non-barred) spiral types. Spiral galaxies are classified
according to how tightly wound their arms are, the prominence of the central bulge, and
the quantity of interstellar gas. Panels: Sa (ESO286-G10), Sb (NGC 3223), Sc (M74), Sd (NGC 300), Sm (NGC 4395).
[Created with blue-band data from the SuperCOSMOS Sky Survey.]
Figure 1.3: the sequence of barred spiral types. Panels: SBa (NGC 4440), SBb (NGC 1097), SBc (NGC 1073),
SBd (NGC 1313), SBm (NGC 4597). [Created with blue-band data from the SuperCOSMOS Sky Survey.]
Figure 1.4: examples of irregular galaxies. Panels: NGC 1569, NGC 4214, NGC 4449, NGC 7292.
[Created with blue-band data from the SuperCOSMOS Sky Survey.]
Figure 1.5: the spiral galaxy NGC2997 in blue, red and infrared light (three panels: Blue, Red, Infrared),
showing that the prominence of the spiral arms varies with the observed wavelength of light. The blue spiral
arms are most evident in the blue image, while the old (red) stellar population is seen more
clearly in the infrared image. [The picture was produced using data from the SuperCOSMOS
Sky Survey of the Royal Observatory Edinburgh, based on photography from the United
Kingdom Schmidt Telescope.]
This relation, $L \propto v_\mathrm{rot}^{4}$, is known as the Tully-Fisher relationship. It is important because it allows the luminosity
$L$ to be calculated from the rotation velocity $v_\mathrm{rot}$, which is measured using optical or radio spectroscopy.
A comparison of the luminosity and the observed brightness gives the distance to the galaxy.
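The distance argument can be sketched numerically without committing to any calibration of the relation. The example below compares two spirals: the luminosity ratio follows from the fourth-power scaling, and the distance ratio then follows from the inverse-square law, flux = L / (4 pi d^2). The inverse-square step, the neglect of dust extinction and inclination corrections, and the function names are assumptions made for this illustration.

```python
def tully_fisher_luminosity_ratio(v_rot_a, v_rot_b, exponent=4.0):
    """L_a / L_b implied by L proportional to v_rot**exponent."""
    return (v_rot_a / v_rot_b) ** exponent

def relative_distance(v_rot_a, v_rot_b, flux_a, flux_b):
    """d_a / d_b from the inverse-square law, flux = L / (4 pi d**2).

    Only ratios enter, so the units of velocity and flux cancel.
    """
    lum_ratio = tully_fisher_luminosity_ratio(v_rot_a, v_rot_b)
    flux_ratio = flux_a / flux_b
    return (lum_ratio / flux_ratio) ** 0.5

# Two spirals that look equally bright, one rotating twice as fast:
# it is 16 times more luminous and therefore 4 times further away.
print(relative_distance(300.0, 150.0, 1.0, 1.0))
```

An absolute distance would need the constant of proportionality, which has to be calibrated on galaxies whose distances are known by other means.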
1.2.2 S0 galaxies
S0 galaxies, sometimes also known as lenticular galaxies, are flattened disc systems like
spirals but have very little gas or dust. They therefore contain only older stars. They have
probably been formed by spirals that have lost or exhausted their gas.
1.3 Elliptical Galaxies
These have masses from $10^{10}\,M_\odot$ to $10^{13}\,M_\odot$ (not including dwarf ellipticals which have lower
masses). They have elliptical shapes, but little other structure. They contain very little gas,
so almost all of the visible component is in the form of stars (there is very little dust). With
so little gas, there is no appreciable star formation, with the result that elliptical galaxies
contain almost only old stars. Their colours are therefore red. K-type giant stars dominate
the visible light, and their optical spectra are broadly similar to K-type stars with no emission
lines from an interstellar medium: the spectra have absorption lines only. Dark matter is
important, and there is probably an extensive dark matter halo with a similar proportion of
dark to visible matter as spirals.
Ellipticals are classified by their observed shapes. They are given a type E$n$ where $n$
is an integer describing the apparent ellipticity defined as $n = \mathrm{int}[10(1 - b/a)]$ where $b/a$
is the axis ratio (the ratio of the semi-minor to semi-major axes) as seen on the sky. In
practice we observe only types E0 (circular) to E7 (most flattened). We never see ellipticals
flatter than about E7. The reason (as indicated by simulations and normal mode analyses)
seems to be that a stellar system any flatter is unstable to buckling, and will eventually
settle into something rounder. Note that these subtypes reflect the observed shapes, not the
three-dimensional shapes: a very elongated galaxy seen end-on would be classified as type
E0.
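The classification rule is simple enough to write down directly. The following is a minimal sketch; the function name and the small tolerance added before truncation are choices made here (the tolerance guards against floating-point round-off, e.g. 10 * (1 - 0.8) evaluating to just under 2), not part of the definition.

```python
import math

def elliptical_subtype(axis_ratio):
    """Return the type 'En' for an apparent axis ratio b/a, with n = int[10(1 - b/a)]."""
    if not 0.0 < axis_ratio <= 1.0:
        raise ValueError("axis ratio b/a must lie in the interval (0, 1]")
    # The 1e-9 offset is an illustrative guard against round-off just below an integer.
    n = math.floor(10.0 * (1.0 - axis_ratio) + 1e-9)
    return f"E{n}"

for ratio in (1.0, 0.8, 0.65, 0.3):
    print(ratio, elliptical_subtype(ratio))
```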
Luminous ellipticals have very little net rotation. The orbits of the stars inside them
are randomly oriented. The motions of the stars are characterised by a velocity dispersion
$\sigma$ along the line of sight, most commonly the velocity dispersion at the centre, $\sigma_0$. These
ellipticals are usually triaxial in shape and have different velocity dispersions in the directions
of the different axes. Less luminous ellipticals can have some net rotation.
Their surface brightness distributions are more centrally concentrated than those of spirals.
There are various functional forms around for fitting the surface brightness, of which
the best known is the de Vaucouleurs model,
$$ I(R) = I_0 \exp\left[ -\left( \frac{R}{R_0} \right)^{1/4} \right] , $$
with $I_0 \sim 10^{5}\,L_\odot\,\mathrm{pc}^{-2}$ for giant ellipticals. (To fit to observations, one typically un-squashes
the ellipses to circles first. Also, the functional forms are only fitted to observations over
the restricted range in which $I(R)$ is measurable. So don't be surprised to see very different
looking functional forms being fitted to the same data.)
Ellipticals have large numbers of globular clusters. These are visible as faint star-like
images superimposed on the galaxies. These globular clusters have masses of $10^{4}\,M_\odot$ to
a few $\times 10^{6}\,M_\odot$.
Ellipticals are plentiful in environments where the density of galaxies is high, such as in
galaxy clusters. Isolated ellipticals are rare.
Observations show that there is a correlation between three important observational
parameters for elliptical galaxies. These quantities are the scale size $R_0$, the central surface
brightness $I_0$, and the central velocity dispersion $\sigma_0$. The observations show that
$$ R_0 \, I_0^{0.9} \, \sigma_0^{-1.4} \simeq \mathrm{constant} . $$
This relation is known as the fundamental plane for elliptical galaxies. (The relation is also
often expressed in terms of the radius $R_e$ containing half the light of the galaxy and the
surface brightness $I_e$ at this radius.)
An older, and cruder, relation is that between luminosity $L$ and the central velocity
dispersion:
$$ L \propto \sigma_0^{4} . $$
This is known as the Faber-Jackson relation. The Faber-Jackson relation, and particularly
the fundamental plane, are very useful in estimating the distances to elliptical galaxies: the
observational parameters $I_0$ and $\sigma_0$ give an estimate of $R_0$, which in turn with $I_0$ gives the
total luminosity of the galaxy, which can then be used with the observed brightness to derive
a distance.
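The chain of reasoning from the fundamental plane to a distance can be sketched with ratios between two ellipticals, which avoids having to fix the constant in the relation. The sketch assumes that the luminosity scales as I_0 R_0^2 (true if both galaxies share the same profile shape) and that brightness follows the inverse-square law; both assumptions, and the function names, are introduced here for illustration.

```python
def size_ratio(sigma_ratio, brightness_ratio):
    """R0_a / R0_b from R0 * I0**0.9 * sigma0**-1.4 = constant."""
    return sigma_ratio ** 1.4 * brightness_ratio ** -0.9

def luminosity_ratio(sigma_ratio, brightness_ratio):
    """L_a / L_b, assuming L is proportional to I0 * R0**2."""
    return brightness_ratio * size_ratio(sigma_ratio, brightness_ratio) ** 2

def distance_ratio(sigma_ratio, brightness_ratio, flux_ratio):
    """d_a / d_b from the inverse-square law, flux = L / (4 pi d**2)."""
    return (luminosity_ratio(sigma_ratio, brightness_ratio) / flux_ratio) ** 0.5

# Same central surface brightness, twice the central velocity dispersion,
# same observed flux: galaxy a is larger, more luminous and further away.
print(size_ratio(2.0, 1.0))
print(distance_ratio(2.0, 1.0, 1.0))
```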
1.4 Irregular Galaxies
Irregular galaxies have irregular, patchy morphologies. They are gas-rich, showing strong star
formation with many young stars. Ionised gas, particularly HII regions, is prominent around
the regions of star formation. They tend to have strong emission lines from the interstellar
gas, and their starlight is dominated by B, A and F types. As a result their colours are
blue and their spectra show strong emission lines from the interstellar gas superimposed on
the stellar absorption-line spectrum. Their internal motions are relatively chaotic. They are
denoted type I or Irr.
1.5 Other Types of Galaxy
As already noted, some galaxies have unusual, disturbed morphologies and are called peculiar.
These are mostly the result of interactions and mergers between galaxies. They are
particularly numerous among distant galaxies.
Figure 1.6: examples of the optical spectra of elliptical, spiral and irregular galaxies. The
elliptical spectrum shows only absorption lines produced by the stars in the galaxy. The
spiral galaxy has absorption lines from its stars and some emission lines from its interstellar
gas. In contrast, the irregular galaxy has very strong emission lines on a weaker stellar
continuum. [Produced with data from the 2dF Galaxy Redshift Survey.]
Figure 1.7: The tuning fork diagram of Hubble types. The galaxies on the left are known as
early types, and those on the right as late types.
Clusters of galaxies often have a very luminous, dominant elliptical at their cores, of a
type called a cD galaxy. These have extensive outer envelopes of stars.
Low-luminosity galaxies are called dwarf galaxies. They have masses of $10^{6}$ to $10^{9}\,M_\odot$.
Common subclasses are dwarf irregulars having a large fraction of gas and active star formation,
and dwarf ellipticals which are gas poor and have no star formation. Dwarf spheroidal
galaxies are very low luminosity, very low surface brightness systems, essentially extreme
versions of dwarf ellipticals. Our Galaxy has several dwarf spheroidal satellites.
1.6 The Hubble Sequence
On the whole, galaxy classification probably should not be taken as seriously as stellar classification,
because there are not (yet) precise physical interpretations of what the gradations
mean. But some physical properties do clearly correlate with the so-called Hubble types.
The basic system of classification described above was defined in detail by Edwin Hubble,
although a number of extensions to his system are available. Figure 1.7 shows the Hubble
tuning fork diagram which places the various types in a sequence based on their shapes.
Ellipticals go on the left, arranged in a sequence based on their ellipticities. Then come the
lenticulars or disc galaxies without spiral arms: S0 and SB0. Then spirals with increasingly
spaced arms, Sa etc. if unbarred, SBa etc. if barred.
The left-hand galaxies are called early types, and the right-hand ones late types. People
once thought this represented an evolutionary sequence, but that has long been obsolete.
(Our current understanding is that, if anything, galaxies tend to evolve towards early types.)
But the old names early and late are still used.
Note that for spirals, bulges get smaller as spiral arms get more widely spaced. The
theory behind spiral density waves predicts that the spacing between arms is proportional
to the disc's mass density.
Several galaxy properties vary in a sequence from ellipticals to irregulars. However, the
precise shapes of ellipticals are not important in this: all ellipticals lie in the same position
in the sequence:
E (all Es)    S0    Sa    Sb    Sc    Sd    Irr
Early type                                  Late type
Old stars                                   Young stars
Red colour                                  Blue colour
Gas poor                                    Gas rich
Absorption-line spectra                     Strong emission lines in spectra
Some evolution along this sequence from right to left (late to early) can occur if gas is
used up in star formation or gas is taken out of the galaxies.
1.7 A Description of Galaxy Dynamics
Interactions between distributions of matter can be very important over the lifetime of a
galaxy, be these the interactions of stars, the interactions between clouds of gas, or the
interactions of a galaxy with a near neighbour.
An important distinction between interactions is whether they are collisional or collisionless.
Encounters between bodies of matter are:
collisional if interactions between individual particles substantially affect their motions;
collisionless if interactions between individual particles do not substantially affect
their motions.
Gas is collisional. If two gas clouds collide, even with the low densities found in astronomy,
individual atoms/molecules interact. These interactions on the atomic scale strongly
influence the motions of the two gas clouds.
Stars are collisionless on the galactic scale. If two stellar systems collide, the interactions
between individual stars have little effect on their motions. The particles are the stars in
this case. The motions of the stars are mostly affected by the gravitational potentials of the
two stellar systems. Interactions between individual stars are rare on the scale of galaxies.
Stars are therefore so compact on the scale of a galaxy that a stellar system behaves like a
collisionless fluid (except in the cores of galaxies and globular clusters), resembling a plasma
in some respects.
This distinction between stars and gas leads to two very important differences between
stellar and gas dynamics in a galaxy.
1. Gas will tend to settle into rotating discs within galaxies. Stars will not settle in this
way.
2. Star orbits can cross each other, but in equilibrium gas must follow closed paths which
do not cross (and in the same sense). Two streams of stars can go through each other
and hardly notice, but two streams of gas will shock (and probably form stars). You
could have a disc of stars with no net rotation (just reverse the directions of motion of
some stars), but not so with a disc of gas.
The terms rotational support and pressure support are used to describe how material
in galaxies balances its self-gravity. Stars and gas move in roughly circular orbits in the
discs of spiral galaxies, where they achieve a stable equilibrium because they are rotationally
supported against gravity. The stars and gas in the spiral discs have rotational velocities of
about 250 km s$^{-1}$, while the dispersion of the gas velocities locally around this net motion is only
about 10 km s$^{-1}$.
In contrast, stars in luminous elliptical galaxies maintain a stable equilibrium because
they are moving in randomly oriented orbits. Drawing a parallel with atoms/molecules in a
gas, they are said to be pressure supported against gravity. Random velocities of about 300 km s$^{-1}$
are typical. The velocities for pressure support need not be isotropically distributed. It is
also possible to have a mixture of rotational and pressure support for a system of stars, with
some appreciable net rotation.
1.8 A Brief Overview of Galaxy Evolution Processes
Galaxies formed early in the history of the Universe, but exactly when is not known with
certainty. They were probably formed by the coalescence of a number of separate clumps
of dark matter that also contained gas and stars, rather than by the collapse of single large
bodies of dark matter and gas. Galaxies have evolved with time to give us the galaxy
populations seen today. Some of the processes driving the evolution of galaxies are briefly
stated in this section. Exactly how these processes affect galaxies is not understood precisely
at present.
Gas in a galaxy will fall into a rotationally supported disc if it has angular momentum.
Subsequent star formation in the gaseous disc will form the stellar disc of a spiral galaxy.
Stars formed in gas clouds falling radially inwards during the formation of a galaxy can con-
tribute to the stellar halo of a spiral galaxy. Stars in smaller galaxies falling radially inwards
in a merging process can also contribute to the stellar halo of a spiral galaxy. Differential
rotation in a spiral's disc will generate spiral density waves in the disc, leading to spiral
arms. Spiral discs without a bulge can be unstable, and can buckle and thicken, with mass
being redistributed into a bulge, giving the bulge some rotation in the process. Continuing
star formation in the disc gives rise to a range of ages for stars in the disc, while bulges will
tend to be older. If a spiral uses up most of its gas in star formation, it will have a stellar
disc but no spiral arms.
Mergers and interactions between galaxies can be important. In a merger two galaxies
fuse together. In an interaction, however, one galaxy interacts with another through their
gravitational effects. One or both galaxies may survive an interaction, but may be altered in
the process. If two spiral galaxies merge, or one spiral is disrupted by a close encounter with
another galaxy, the immediate result can be an irregular or peculiar galaxy with strong star
formation. This can produce an elliptical galaxy if and when the gas is exhausted.
If a merger between two galaxies produces an elliptical with no overall angular momentum,
it will be a pressure-supported system. If a merger produces an elliptical with some
net angular momentum, the elliptical will have an element of rotational support. Ellipticals,
although conventionally gas-poor, can shed gas from their stars through mass loss and
supernova remnants, which can settle into gas discs and form stars in turn.
Almost all galaxies have some very old stars. Most galaxies appear to have been formed
fairly early on (> 10 Gyr ago) but some have been strongly influenced by mergers and
interactions since then.
These processes are not well understood at present. Understanding the evolution of galaxies
is currently a subject of much active research, from both a theoretical and observational
perspective. Then there is dark matter ...
1.9 The Galaxy: an Overview
1.9.1 The Structure of the Galaxy
We live in a spiral galaxy. It is relatively difficult to measure its morphology from inside,
but observations show that it is almost certainly a barred spiral. The best assessment of its
morphological type puts it at type SBbc (intermediate between SBb and SBc).
Figure 1.8: A sketch of the Galaxy seen edge on, illustrating the various components.
Figure 1.9: The Galaxy showing its geometry.
The Sun lies close to the plane of the Galaxy, at a distance of 8.0 ± 0.4 kpc from the
Galactic Centre. It is displaced slightly to the north of the plane. The overall diameter of
the disc is approximately 40 kpc.
The structure of the Galaxy can be broken into various distinct components. These are:
the disc, the central bulge, the bar, the stellar halo, and the dark matter halo. These are
illustrated in Figure 1.8. The disc consists of stars and of gas and dust. The gas and dust
are concentrated more closely on the plane than the stars, while younger stars are more
concentrated around the plane than old stars. The disc is rotationally supported against
gravity. The bulge and bar are found within the central few kpc. They consist mostly of old
stars, some metal-poor (deficient in heavy elements). The Galactic Centre has a compact
nucleus, probably with a black hole at its core. The stellar halo contains many isolated stars
and about 150 globular clusters. These are very old and very metal-poor. The stellar halo
is pressure supported against gravity. The entire visible system is embedded in an extensive
dark matter halo which probably extends out to > 100 kpc. Its properties are not well
defined and the nature of the dark matter is still uncertain.
We shall return to discuss the structure of the Galaxy and its individual components in
detail later in the course.
1.9.2 Stellar Populations
It was realised in the early 20th century that the Galaxy could be divided into the disc
and into a spheroid (which consists of the bulge and stellar halo). The concept of stellar
populations was introduced by Walter Baade in 1944. In his picture, the stars in the discs
of spiral galaxies, which contain many young stars, were called Population I. This
included the disc of our own Galaxy. In contrast, elliptical galaxies and the spheroids of
spiral galaxies contain many old stars, which he called Population II. Population I systems
consisted of young and moderately young stars which had chemical compositions similar to
the Sun. Meanwhile, Population II systems were made of old stars which were deficient in
heavy elements compared to the Sun. Population I systems were blue in colour, Population II
were red. Spiral discs and irregular galaxies contained Population I stars, while ellipticals
and the haloes and bulges of disc galaxies were Population II.
This picture was found to be rather simplistic. The population concept was later refined
for our Galaxy, with a number of subtypes replacing the original two classes. Today, it is
more common to refer to the stars of individual components of the Galaxy separately. For
example, we might speak of the halo population or the bulge population. The disc population
is often split into the young disc population and the old disc population.
1.9.3 Galactic Coordinates
A galactic coordinate system is frequently used to specify the positions of objects within the
Galaxy, and the positions of other galaxies on the sky. In this system, two angles are used to
specify the direction of objects as seen from the Earth, called galactic longitude and galactic
latitude (denoted l and b respectively). The system works in a way that is very similar to
latitude and longitude on the Earth's surface: galactic longitude measures an angle in the
plane of the galactic equator (which is defined to be in the plane of the Galaxy), and galactic
latitude measures the angle from the equator.
Note that the galactic coordinate system is centred on the Earth. Galactic longitude is
expressed as an angle $l$ between 0° and 360°. Galactic latitude is expressed as an angle $b$
between −90° and +90°. Zero longitude is defined to be the direction of the Galactic Centre. Therefore
the Galactic Centre is at $(l, b)$ = (0°, 0°). The direction on the sky opposite to the Galactic
Centre is known as the Galactic Anticentre and has coordinates $(l, b)$ = (180°, 0°). The North
Galactic Pole is at $b$ = +90°, and the South Galactic Pole is at $b$ = −90°. Note again that
these are positions on the sky relative to the Earth, and not relative to the Galactic Centre.
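Because $(l, b)$ are just two angles on a sphere centred on the observer, directions can be turned into unit vectors and compared. The sketch below does this using one common convention, assumed here for illustration: x points towards the Galactic Centre, y towards l = 90° in the Galactic plane, and z towards the North Galactic Pole. The function names are also illustrative.

```python
import math

def galactic_to_unit_vector(l_deg, b_deg):
    """Unit vector for galactic longitude l and latitude b (degrees).

    Assumed axes: x towards (l, b) = (0, 0), y towards (90, 0), z towards b = +90.
    """
    l = math.radians(l_deg)
    b = math.radians(b_deg)
    return (math.cos(b) * math.cos(l), math.cos(b) * math.sin(l), math.sin(b))

def angular_separation_deg(l1, b1, l2, b2):
    """Angle on the sky, in degrees, between two galactic directions."""
    v1 = galactic_to_unit_vector(l1, b1)
    v2 = galactic_to_unit_vector(l2, b2)
    dot = sum(p * q for p, q in zip(v1, v2))
    dot = max(-1.0, min(1.0, dot))  # clamp to guard acos against round-off
    return math.degrees(math.acos(dot))

# Galactic Centre to Anticentre, and Galactic Centre to North Galactic Pole.
print(angular_separation_deg(0.0, 0.0, 180.0, 0.0))
print(angular_separation_deg(0.0, 0.0, 0.0, 90.0))
```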
1.10 Density profiles versus surface brightness profiles
The observed surface brightness profiles of galaxies are the result of the projection of three-dimensional
density distributions of stars. For example, the observed surface brightness of an
elliptical galaxy is well fitted by the de Vaucouleurs $R^{1/4}$ law, as was discussed earlier. This
surface brightness is the projection into two dimensions on the sky of the three-dimensional
density distribution of stars in space.
Let us consider an example which demonstrates this. An observer on the Earth observes
a spherically symmetric galaxy, such as an E0-type elliptical galaxy. The mean density of
stars in space at a radial distance $r$ from the centre of the galaxy is $\rho(r)$ (measured in
$M_\odot\,\mathrm{pc}^{-3}$ or kg m$^{-3}$, and smoothed out over space). Consider a sight line that passes a
tangential distance $R$ from the galaxy's centre.
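Consider an element of the path length $dl$ at a distance
$l$ from the point on the sight line closest to the nucleus.
Let the contribution to the surface brightness from the
element be
$$ dI(R) = k \, \rho(r) \, dl \qquad (1.1) $$
where $k$ is a constant of proportionality. Since $r^2 = R^2 + l^2$, we can write $dl = dr / \sin\alpha$
with $\sin\alpha = \sqrt{r^2 - R^2} / r$, where $\alpha$ is the angle at the galaxy's centre between the
directions to the element and to the closest point on the sight line. Then
$$ dI(R) = k \, \rho(r) \, \frac{dr}{\sin\alpha} = k \, \rho(r) \, \frac{r \, dr}{\sqrt{r^2 - R^2}} . $$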
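Integrating along the line of sight from B ($r = R$) to infinity ($r \to \infty$),
$$ I_B = \int_R^\infty k \, \rho(r) \, \frac{r}{\sqrt{r^2 - R^2}} \, dr = k \int_R^\infty \frac{r \, \rho(r)}{\sqrt{r^2 - R^2}} \, dr . $$
This has neglected the near side of the galaxy. From symmetry, the total surface brightness
along the line of sight is $I(R) = 2 \, I_B(R)$, so
$$ I(R) = 2k \int_R^\infty \frac{r \, \rho(r) \, dr}{\sqrt{r^2 - R^2}} . \qquad (1.2) $$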
In general, the density profile of a galaxy will not be spherical and we need to take account
of the profile $\rho(\mathbf{r})$ as a function of position vector $\mathbf{r}$.
The calculation of the surface brightness profile from the density profile is straightforward
numerically, if not always analytically. The inverse problem, converting from an observed
surface brightness profile to a density profile, often has to be done numerically.
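A minimal numerical sketch of the forward projection follows. Rather than integrating Equation 1.2 over $r$, where the integrand is singular at $r = R$, it integrates $2k\,\rho(r)\,dl$ directly along the sight line with $r = \sqrt{R^2 + l^2}$, which is the same integral before the change of variable. The substitution $l = \mathrm{scale} \times \tan\theta$, the midpoint rule, and the Plummer-type test density (chosen only because its projection is known in closed form) are all choices made here for illustration.

```python
import math

def project_density(rho, R, k=1.0, scale=1.0, n_steps=20000):
    """Surface brightness I(R) = 2k * integral_0^inf rho(sqrt(R^2 + l^2)) dl.

    Uses l = scale * tan(theta) to map the infinite range onto [0, pi/2),
    and the midpoint rule so the endpoint theta = pi/2 is never evaluated.
    """
    h = (math.pi / 2.0) / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * h
        l = scale * math.tan(theta)
        r = math.hypot(R, l)                      # r^2 = R^2 + l^2
        dl_dtheta = scale / math.cos(theta) ** 2  # Jacobian of the substitution
        total += rho(r) * dl_dtheta
    return 2.0 * k * total * h

def plummer_density(r, a=1.0):
    """Test density (1 + r^2/a^2)^(-5/2), in arbitrary units."""
    return (1.0 + (r / a) ** 2) ** -2.5

def plummer_projected(R, a=1.0):
    """Analytic projection of plummer_density: (4a/3) (1 + R^2/a^2)^(-2)."""
    return (4.0 * a / 3.0) * (1.0 + (R / a) ** 2) ** -2.0

for R in (0.0, 0.5, 2.0):
    print(R, project_density(plummer_density, R), plummer_projected(R))
```

The inverse problem (recovering the density from the surface brightness) is less forgiving, since it involves differentiating noisy data.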
Chapter 2
Stellar Dynamics in Galaxies
2.1 Introduction
A system of stars behaves like a fluid, but one with unusual properties. In a normal fluid two-body
interactions are crucial in the dynamics, but in contrast star-star encounters are very
rare. Instead stellar dynamics is mostly governed by the interaction of individual stars with
the mean gravitational field of all the other stars combined. This has profound consequences
for how the dynamics of the stars within galaxies are described mathematically, allowing for
some considerable simplifications.
This chapter establishes some basic results relating to the motions of stars within galaxies.
The virial theorem provides a very simple relation between the total potential and kinetic
energies of stars within a galaxy, or other system of stars, that has settled down into a steady
state. The virial theorem is derived formally here. The timescale for stars to cross a system
of stars, known as the crossing time, is a simple but important measure of the motions of
stars. The relaxation time measures how long it takes for two-body encounters to influence
the dynamics of a galaxy, or other system of stars. An expression for the relaxation time is
derived here, which is then used to show that encounters between stars are so rare within
galaxies that they have had little effect over the lifetime of the Universe.
The motions of stars within galaxies can be described by the collisionless Boltzmann
equation, which allows the numbers of stars to be calculated as a function of position and
velocity in the galaxy. The equation is derived from first principles here. Similarly, the
Jeans Equations relate the densities of stars to position, velocity, velocity dispersion and
gravitational potential.
2.2 The Virial Theorem
2.2.1 The basic result
Before going into the main material on stellar dynamics, it is worth stating and deriving
a basic principle known as the virial theorem. It states that for any system of particles bound
by an inverse-square force law, the time-averaged kinetic energy $\langle T \rangle$ and the time-averaged
potential energy $\langle V \rangle$ satisfy
$$ 2 \, \langle T \rangle + \langle V \rangle = 0 , \qquad (2.1) $$
for a steady equilibrium state. $\langle T \rangle$ will be a very large positive quantity and $\langle V \rangle$ a very large
negative quantity. Of course, for a galaxy to hold together, the total energy $\langle T \rangle + \langle V \rangle < 0$;
the virial theorem provides a much tighter constraint than this alone. Typically, $\langle T \rangle$ and
$-\langle V \rangle$ are $\sim 10^{50}$ to $10^{54}$ J for galaxies.
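The quoted energy range can be checked with a back-of-the-envelope estimate. The sketch below takes an illustrative galaxy of $10^{11}\,M_\odot$ in stars moving at a typical 200 km s$^{-1}$; both numbers are assumptions picked from the ranges mentioned in Chapter 1, and treating every star as having the same speed is a deliberate simplification.

```python
M_SUN_KG = 1.989e30  # solar mass in kilograms

def kinetic_energy_joules(mass_msun, speed_km_s):
    """Kinetic energy (1/2) M v^2 in joules for a total mass moving at one typical speed."""
    mass_kg = mass_msun * M_SUN_KG
    speed_m_s = speed_km_s * 1.0e3
    return 0.5 * mass_kg * speed_m_s ** 2

T = kinetic_energy_joules(1.0e11, 200.0)  # illustrative galaxy, not a measured one
V = -2.0 * T                              # potential energy implied by 2<T> + <V> = 0

print(f"T ~ {T:.2e} J, V ~ {V:.2e} J, total ~ {T + V:.2e} J")
```

The result, a few times $10^{51}$ J, sits inside the quoted range, and the total energy is negative as required for a bound system.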
In practice, many systems of stars are not in a perfect final steady state and the virial
theorem does not apply exactly. Despite this, it does give important, approximate results
for many astronomical systems.
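The simplest system on which to see the theorem at work is a single light body on a bound Kepler orbit around a fixed mass, which is an inverse-square system in a steady state. The sketch below time-averages $T$ and $V$ over one orbit. It relies on two standard results not derived in these notes, and so flagged here as assumptions: the vis-viva relation $v^2 = GM(2/r - 1/a)$, and the fact that time along the orbit advances as $dt \propto (1 - e\cos E)\,dE$ in terms of the eccentric anomaly $E$. Units with $GM = m = a = 1$ are chosen for convenience.

```python
import math

def kepler_time_averages(a=1.0, e=0.6, GM=1.0, m=1.0, n_steps=10000):
    """Return (<T>, <V>) time-averaged over one bound Kepler orbit.

    a is the semi-major axis and e the eccentricity (0 <= e < 1).
    """
    sum_T = sum_V = sum_w = 0.0
    for i in range(n_steps):
        E = 2.0 * math.pi * (i + 0.5) / n_steps  # eccentric anomaly
        r = a * (1.0 - e * math.cos(E))          # orbital radius
        w = 1.0 - e * math.cos(E)                # time weight: dt proportional to w dE
        v2 = GM * (2.0 / r - 1.0 / a)            # vis-viva relation
        sum_T += w * 0.5 * m * v2
        sum_V += w * (-GM * m / r)
        sum_w += w
    return sum_T / sum_w, sum_V / sum_w

mean_T, mean_V = kepler_time_averages()
print(mean_T, mean_V, 2.0 * mean_T + mean_V)
```

At any single instant $2T + V$ is generally not zero on an eccentric orbit (it is positive at pericentre and negative at apocentre); only the time averages satisfy Equation 2.1.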
2.2.2 Deriving the virial theorem from first principles
To prove the virial theorem, consider a system of $N$ stars. Let the $i$th star have a mass $m_i$
and a position vector $\mathbf{x}_i$. The velocity of the $i$th star is $\dot{\mathbf{x}}_i \equiv d\mathbf{x}_i/dt$, where $t$ is the time.
Consider a parameter
$$ F \equiv \sum_{i=1}^{N} m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i , \qquad (2.2) $$
($F$ is related to the moment of inertia of the system, as we shall see below.)
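Differentiating with respect to time $t$,
$$ \begin{aligned}
\frac{dF}{dt} &= \frac{d}{dt} \left( \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \right) \\
&= \sum_i \frac{d}{dt} \left( m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \right) \\
&= \sum_i m_i \, \frac{d}{dt} \left( \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \right) && \text{assuming that the masses } m_i \text{ do not change} \\
&= \sum_i m_i \left( \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \dot{\mathbf{x}}_i \cdot \dot{\mathbf{x}}_i \right) && \text{from the product rule} \\
&= \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \sum_i m_i \, \dot{\mathbf{x}}_i^{\,2} && (2.3)
\end{aligned} $$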
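The kinetic energy of the $i$th particle is $\tfrac{1}{2} m_i \, \dot{\mathbf{x}}_i^{\,2}$. Therefore the total kinetic energy of the
entire system of stars is
$$ T = \sum_i \tfrac{1}{2} m_i \, \dot{\mathbf{x}}_i^{\,2} \qquad \therefore \qquad \sum_i m_i \, \dot{\mathbf{x}}_i^{\,2} = 2 \, T . $$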
Substituting this into Equation 2.3,
dF
dt
= 2 T +

i
m
i
x
i
x
i
, (2.4)
at any time t.
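We now need to remember that the average of any parameter $y(t)$ over time $t = 0$ to $\tau$ is
$$\langle y \rangle = \frac{1}{\tau} \int_0^{\tau} y(t) \, dt .$$
Consider the average value of $dF/dt$ over a time interval $t = 0$ to $\tau$:
$$\begin{aligned}
\left\langle \frac{dF}{dt} \right\rangle
&= \frac{1}{\tau} \int_0^{\tau} \left( 2T + \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \right) dt \\
&= \frac{2}{\tau} \int_0^{\tau} T \, dt + \frac{1}{\tau} \int_0^{\tau} \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt \\
&= 2 \langle T \rangle + \sum_i m_i \, \frac{1}{\tau} \int_0^{\tau} \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt \quad \text{assuming } m_i \text{ is constant over time} \\
&= 2 \langle T \rangle + \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle .
\end{aligned} \tag{2.5}$$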
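We can define a moment of inertia of the system of particles about the origin as
$$I \equiv \sum_i m_i \, \mathbf{x}_i \cdot \mathbf{x}_i .$$
(Note that this definition of moment of inertia is different from the moment of inertia about
a particular axis that is commonly used to study the rotation of bodies.)
Differentiating with respect to time,
$$\frac{dI}{dt} = \frac{d}{dt} \sum_i m_i \, \mathbf{x}_i \cdot \mathbf{x}_i
 = \sum_i m_i \frac{d}{dt}\left( \mathbf{x}_i \cdot \mathbf{x}_i \right)
 = \sum_i m_i \left( \dot{\mathbf{x}}_i \cdot \mathbf{x}_i + \mathbf{x}_i \cdot \dot{\mathbf{x}}_i \right)
 = 2 \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i .$$
Substituting for $F = \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i$ from Equation 2.2,
$$F = \frac{1}{2} \frac{dI}{dt} .$$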
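When the system of stars eventually reaches equilibrium, the moment of inertia will be
constant. Therefore, $F = 0$ at all times after equilibrium has been reached. So, $\langle dF/dt \rangle = 0$.
(An alternative way of visualising this is by considering that $F$ will be bounded in any
physical system. Therefore the long-time average $\langle dF/dt \rangle$ will vanish as $\tau$ becomes large, i.e.
$$\lim_{\tau \to \infty} \left\langle \frac{dF}{dt} \right\rangle
 = \lim_{\tau \to \infty} \left( \frac{1}{\tau} \int_0^{\tau} \frac{dF}{dt} \, dt \right)
 = \lim_{\tau \to \infty} \frac{F(\tau) - F(0)}{\tau} \to 0 .)$$
Substituting $\langle dF/dt \rangle = 0$ into Equation 2.5,
$$2 \langle T \rangle + \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle = 0 . \tag{2.6}$$
The term $\sum_i m_i \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle$ is related to the gravitational potential. We next need to show how.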
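Newton's Second Law of Motion gives for the $i$th star,
$$m_i \, \ddot{\mathbf{x}}_i = \sum_{j \neq i} \mathbf{F}_j ,$$
where $\mathbf{F}_j$ is the force exerted on the $i$th star by the $j$th star. Using the law of universal
gravitation,
$$m_i \, \ddot{\mathbf{x}}_i = - \sum_{j \neq i} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) .$$
Taking the scalar product (dot product) with $\mathbf{x}_i$,
$$m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = \left( - \sum_{j \neq i} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \right) \cdot \mathbf{x}_i .$$
Summing over all $i$,
$$\sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i
 = - \sum_i \sum_{j \neq i} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i
 = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i . \tag{2.7}$$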
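Switching $i$ and $j$, we have
$$\sum_j m_j \, \ddot{\mathbf{x}}_j \cdot \mathbf{x}_j
 = - \sum_{\substack{j,i \\ i \neq j}} \frac{G m_j m_i}{|\mathbf{x}_j - \mathbf{x}_i|^3} \, (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j . \tag{2.8}$$
Adding Equations 2.7 and 2.8,
$$\sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \sum_j m_j \, \ddot{\mathbf{x}}_j \cdot \mathbf{x}_j
 = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i
 - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_j m_i}{|\mathbf{x}_j - \mathbf{x}_i|^3} \, (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j$$
$$\therefore \quad 2 \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i
 = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3}
 \left[ (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i + (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j \right] .$$
But
$$\begin{aligned}
(\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i + (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j
&= (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i - (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_j \\
&= (\mathbf{x}_i - \mathbf{x}_j) \cdot (\mathbf{x}_i - \mathbf{x}_j) \quad \text{(factorising)} \\
&= |\mathbf{x}_i - \mathbf{x}_j|^2 .
\end{aligned}$$
$$\therefore \quad 2 \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i
 = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, |\mathbf{x}_i - \mathbf{x}_j|^2$$
$$\therefore \quad \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i
 = - \frac{1}{2} \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . \tag{2.9}$$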
We now need to find the total potential energy of the system.
The gravitational potential at star $i$ due to star $j$ is
$$\Phi_{ij} = - \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} .$$
Therefore the gravitational potential at star $i$ due to all other stars is
$$\Phi_i = \sum_{j \neq i} \Phi_{ij} = - \sum_{j \neq i} \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} .$$
Therefore the gravitational potential energy of star $i$ due to all the other stars is
$$V_i = m_i \Phi_i = - m_i \sum_{j \neq i} \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} .$$
The total potential energy of the system is therefore
$$V = \frac{1}{2} \sum_i V_i = - \frac{1}{2} \sum_i \left( m_i \sum_{j \neq i} \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} \right) .$$
The factor $\frac{1}{2}$ ensures that we only count each pair of stars once (otherwise we would count
each pair twice and would get a result twice as large as we should). Therefore,
$$V = - \frac{1}{2} \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|} .$$
Substituting for the total potential energy into Equation 2.9,
$$\sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = V .$$
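Equation 2.6 uses time-averaged quantities. So, averaging over time $t = 0$ to $\tau$,
$$\frac{1}{\tau} \int_0^{\tau} \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt = \langle V \rangle
 \quad \therefore \quad \sum_i m_i \, \frac{1}{\tau} \int_0^{\tau} \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt = \langle V \rangle
 \quad \therefore \quad \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle = \langle V \rangle .$$
Substituting this into Equation 2.6,
$$2 \langle T \rangle + \langle V \rangle = 0 .$$
This is Equation 2.1, the Virial Theorem.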
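As a quick sanity check of the result, the sketch below evaluates $2T + V$ for the simplest bound system one can write down: two equal point masses on a circular orbit about their centre of mass. For a circular orbit $T$ and $V$ are constant in time, so the time averages equal the instantaneous values. The function name and the numerical constants are illustrative choices, not part of the notes.

```python
# Check 2<T> + <V> = 0 for two equal point masses on a circular orbit.
# On a circular orbit T and V are constant, so no time averaging is needed.

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg (illustrative value)
AU = 1.496e11      # astronomical unit, m


def circular_binary_energies(m, d):
    """Return (T, V) in joules for two stars of mass m a distance d apart."""
    # Each star moves on a circle of radius d/2 about the centre of mass.
    # Centripetal force = gravitational force:
    #   m v^2 / (d / 2) = G m^2 / d^2   =>   v^2 = G m / (2 d)
    v_squared = G * m / (2.0 * d)
    kinetic = 2 * (0.5 * m * v_squared)   # sum over both stars
    potential = -G * m * m / d            # the single pair, counted once
    return kinetic, potential


T, V = circular_binary_energies(M_SUN, 1.0 * AU)
virial_residual = (2.0 * T + V) / abs(V)
print(f"T = {T:.3e} J, V = {V:.3e} J, (2T + V)/|V| = {virial_residual:.1e}")
```

The residual vanishes to rounding error, as the theorem requires for a steady bound system.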
2.2.3 Using the Virial Theorem
The virial theorem applies to systems of stars that have reached a steady equilibrium state.
It can be used for many galaxies, but can also be used for other systems such as some
star clusters. However, we need to be careful that we use the theorem only for equilibrium
systems.
The theorem can be applied, for example, to:
• elliptical galaxies
• evolved star clusters, e.g. globular clusters
• evolved clusters of galaxies (with the galaxies acting as the particles, not the individual stars)
Examples of places where the virial theorem cannot be used are:
• merging galaxies
• newly formed star clusters
• clusters of galaxies that are still forming or still have infalling galaxies
The virial theorem provides an easy way to make rough estimates of masses, because
velocity measurements can give $\langle T \rangle$. To do this we need to measure the observed velocity
dispersion of stars (the dispersion along the line of sight using radial velocities obtained from
spectroscopy). The theorem then gives the total gravitational potential energy, which can
provide the total mass. This mass, of course, is important because it includes dark matter.
Virial masses are particularly important for some galaxy clusters (using galaxies or atoms in
X-ray emitting gas as the particles).
But it is prudent to consider virial mass estimates as order-of-magnitude only, because
(i) generally one can measure only line-of-sight velocities, and getting $T = \frac{1}{2} \sum_i m_i \dot{\mathbf{x}}_i^{\,2}$ from
these requires more assumptions (e.g. isotropy of the velocity distribution); and (ii) the
systems involved may not be in a steady state, in which case of course the virial theorem
does not apply. Clusters of galaxies are particularly likely to be quite far from a steady state.
Note that for galaxies beyond our own, we cannot measure three-dimensional velocities
of stars directly (although some projects are attempting to do this for some Local Group
galaxies). We have to use radial velocities (the component of the velocity along the line of
sight to the galaxy) only, obtained from spectroscopy through the Doppler shift of spectral
lines. Beyond nearby galaxies, radial velocities of individual stars become difficult to obtain.
It becomes necessary to measure velocity dispersions along the line of sight from the observed
widths of spectral lines in the combined light of millions of stars.
2.2.4 Deriving masses from the Virial Theorem: a naive example
Consider a spherical elliptical galaxy of radius $R$ that has uniform density and which consists
of $N$ stars each of mass $m$ having typical velocities $v$.
From the virial theorem,
$$2 \langle T \rangle + \langle V \rangle = 0 ,$$
where $\langle T \rangle$ is the time-averaged total kinetic energy and $\langle V \rangle$ is the average total potential
energy.
We have
$$T = \sum_{i=1}^{N} \tfrac{1}{2} m v^2 = \tfrac{1}{2} N m v^2$$
and averaging over time, $\langle T \rangle = \frac{1}{2} N m v^2$ also.
The total gravitational potential energy of a uniform sphere of mass $M$ and radius $R$ (a
standard result) is
$$V = - \frac{3}{5} \frac{G M^2}{R} ,$$
where $G$ is the universal gravitational constant. So the time-averaged potential energy of
the galaxy is
$$\langle V \rangle = - \frac{3}{5} \frac{G M^2}{R} ,$$
where $M$ is the total mass. Substituting this into the virial theorem equation,
$$2 \left( \tfrac{1}{2} N m v^2 \right) - \frac{3}{5} \frac{G M^2}{R} = 0 .$$
But the total mass is $M = Nm$.
$$\therefore \quad v^2 = \frac{3}{5} \frac{N G m}{R} = \frac{3}{5} \frac{G M}{R} .$$
The calculation is only approximate, so we shall use
$$v^2 \simeq \frac{N G m}{R} \simeq \frac{G M}{R} . \tag{2.10}$$
This gives the mass to be
$$M \simeq \frac{v^2 R}{G} . \tag{2.11}$$
So an elliptical galaxy having a typical velocity $v = 350$ km s$^{-1}$ $= 3.5 \times 10^5$ m s$^{-1}$, and
a radius $R = 10$ kpc $= 3.1 \times 10^{20}$ m, will have a mass $M \sim 6 \times 10^{41}$ kg $\sim 3 \times 10^{11}\,M_\odot$.
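The arithmetic of this estimate can be reproduced in a few lines. The sketch below simply evaluates Equation 2.11, $M \simeq v^2 R / G$; the function name `virial_mass` and the unit-conversion constants are illustrative assumptions.

```python
# Order-of-magnitude virial mass, M ~ v^2 R / G (Equation 2.11).

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
KPC = 3.086e19     # one kiloparsec in metres
M_SUN = 1.989e30   # solar mass, kg


def virial_mass(v, radius):
    """Rough virial mass in kg for typical speed v (m/s) and radius (m)."""
    return v**2 * radius / G


mass_kg = virial_mass(350.0e3, 10.0 * KPC)
mass_solar = mass_kg / M_SUN
print(f"M ~ {mass_kg:.1e} kg ~ {mass_solar:.1e} solar masses")
```

Because the factor $3/5$ was dropped in going from the uniform-sphere result to Equation 2.10, only the leading digit of the result is meaningful.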
2.2.5 Example: the fundamental plane for elliptical galaxies
We can derive a relationship between scale size, central surface brightness and central velocity
dispersion for elliptical galaxies that is rather similar to the fundamental plane, using only
assumptions about a constant mass-to-light ratio and a constant functional form for the
surface brightness profile.
We shall assume here that:
• the mass-to-light ratio is constant for ellipticals (all E galaxies have the same $M/L$
regardless of size or mass), and
• elliptical galaxies have the same functional form for the mass distribution, only scalable.
Let $I_0$ be the central surface brightness and $R_0$ be a scale size of a galaxy (in this case,
different galaxies will have different values of $I_0$ and $R_0$). The total luminosity will be
$$L \propto I_0 R_0^2 ,$$
because $I_0$ is the light per unit projected area. Since the mass-to-light ratio is a constant for
all galaxies, the mass of the galaxy is $M \propto L$.
$$\therefore \quad M \propto I_0 R_0^2 .$$
From the virial theorem, if $v$ is a typical velocity of the stars in the galaxy,
$$v^2 \simeq \frac{G M}{R_0} .$$
The observed velocity dispersion along the line of sight, $\sigma_0$, will be related to the typical velocity
$v$ by $\sigma_0 \propto v$ (because $v$ is a three-dimensional space velocity). So
$$\sigma_0^2 \propto \frac{M}{R_0} \qquad \therefore \quad M \propto \sigma_0^2 R_0 .$$
Equating this with $M \propto I_0 R_0^2$ from above,
$$\sigma_0^2 R_0 \propto I_0 R_0^2 \qquad \therefore \quad R_0 \, I_0 \, \sigma_0^{-2} \simeq \text{constant} .$$
This is close to, but not the same as, the observed fundamental plane result $R_0 \, I_0^{0.9} \, \sigma_0^{-1.4} \simeq$
constant. The deviation from this virial prediction presumably has something to do with a
varying mass-to-light ratio, but why it is a very good correlation in practice is not understood
in detail.
2.3 The Crossing Time, $T_{\rm cross}$
The crossing time is a simple, but important, parameter that measures the timescale for
stars to move significantly within a system of stars. It is sometimes called the dynamical
timescale.
It is defined as
$$T_{\rm cross} \equiv \frac{R}{v} , \tag{2.12}$$
where $R$ is the size of the system and $v$ is a typical velocity of the stars.
As a simple example, consider a stellar system of radius $R$ (and therefore an overall size
$2R$), having $N$ stars each of mass $m$; the stars are distributed roughly homogeneously, with
$v$ being a typical velocity, and the system is in dynamical equilibrium. Then from the virial
theorem,
$$v^2 \simeq \frac{N G m}{R} .$$
The crossing time is then
$$T_{\rm cross} \sim \frac{2R}{v} \sim \frac{2R}{\sqrt{N G m / R}} \sim 2 \sqrt{\frac{R^3}{N G m}} . \tag{2.13}$$
But the mass density is
$$\rho = \frac{N m}{\frac{4}{3} \pi R^3} = \frac{3 N m}{4 \pi R^3} \qquad \therefore \quad \frac{R^3}{N m} = \frac{3}{4 \pi \rho} .$$
$$\therefore \quad T_{\rm cross} = 2 \sqrt{\frac{3}{4 \pi G \rho}} .$$
So approximately,
$$T_{\rm cross} \sim \frac{1}{\sqrt{G \rho}} . \tag{2.14}$$
Although this equation has been derived for a particular case, that of a homogeneous sphere,
it is an important result and can be used for order of magnitude estimates in other situations.
(Note that $\rho$ here is the mass density of the system, averaged over a volume of space, and
not the density of individual stars.)
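Example: an elliptical galaxy of $10^{11}$ stars, radius 10 kpc.
$R \simeq 10$ kpc $\simeq 3.1 \times 10^{20}$ m
$N = 10^{11}$
$m \simeq 1\,M_\odot \simeq 2 \times 10^{30}$ kg
$T_{\rm cross} \sim 2 \sqrt{R^3 / (N G m)}$ gives $T_{\rm cross} \sim 3 \times 10^{15}$ s $\simeq 10^8$ yr.
The Universe is 14 Gyr old. So if a galaxy is 14 Gyr old, there have been of order a hundred
crossing times in a galaxy's lifetime so far.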
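The crossing-time estimate can be evaluated directly. The sketch below computes Equation 2.13 for the example galaxy and confirms that the density form $2\sqrt{3/(4\pi G\rho)}$ is the same quantity; the function names and constants are illustrative assumptions.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
KPC = 3.086e19     # one kiloparsec in metres
M_SUN = 1.989e30   # solar mass, kg
YEAR = 3.156e7     # one year in seconds


def crossing_time(radius, n_stars, star_mass):
    """Crossing time 2 sqrt(R^3 / (N G m)) in seconds (Equation 2.13)."""
    return 2.0 * math.sqrt(radius**3 / (n_stars * G * star_mass))


def crossing_time_from_density(rho):
    """The same quantity written in terms of the mean density rho."""
    return 2.0 * math.sqrt(3.0 / (4.0 * math.pi * G * rho))


R, N, m = 10.0 * KPC, 1.0e11, M_SUN
t_cross = crossing_time(R, N, m)
rho = 3.0 * N * m / (4.0 * math.pi * R**3)
t_cross_rho = crossing_time_from_density(rho)

print(f"T_cross ~ {t_cross:.1e} s ~ {t_cross / YEAR:.1e} yr")
print(f"crossing times in 14 Gyr ~ {14.0e9 * YEAR / t_cross:.0f}")
```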
2.4 The Relaxation Time, $T_{\rm relax}$
The relaxation time is the time taken for a star's velocity $v$ to be changed significantly by
two-body interactions. It is defined as the time needed for a change $\Delta v^2$ in $v^2$ to be the
same as $v^2$, i.e. the time for
$$\Delta v^2 = v^2 . \tag{2.15}$$
To estimate the relaxation time we need to consider the nature of encounters between stars
in some detail.
2.5 Star-Star Encounters
2.5.1 Types of encounters
We might expect that stars, as they move around inside a galaxy or other system of stars, will
experience close encounters with other stars. The gravitational effects of one star on another
would change their velocities and these velocity perturbations would have a profound effect
on the overall dynamics of the galaxy. The dynamics of the galaxy might evolve with time,
as a result only of the internal encounters between stars.
The truth, however, is rather different. Close star-star encounters are extremely rare and
even the effects of distant encounters are so slight that it takes an extremely long time for
the dynamics of galaxies to change substantially.
We can consider two different types of star-star encounters:
• strong encounters: a close encounter that strongly changes a star's velocity; these
are very rare in practice
• weak encounters: these occur at a distance and produce only very small changes in a
star's velocity, but are much more common
2.5.2 Strong encounters
A strong encounter between two stars is defined as one in which, at
the closest approach, the change in the potential energy is larger than or equal to the initial
kinetic energy.
For two stars of mass $m$ that approach to a distance $r_0$, if the change in potential energy
is larger than the initial kinetic energy,
$$\frac{G m^2}{r_0} \geq \frac{1}{2} m v^2 ,$$
where $v$ is the initial velocity of one star relative to the other.
$$\therefore \quad r_0 \leq \frac{2 G m}{v^2} .$$
So we define a strong encounter radius
$$r_S \equiv \frac{2 G m}{v^2} . \tag{2.16}$$
A strong encounter occurs if two stars approach to within a distance $r_S \equiv 2Gm/v^2$.
For an elliptical galaxy, $v \simeq 300$ km s$^{-1}$. Using $m = 1\,M_\odot$, we find that $r_S \simeq 3 \times 10^{9}$ m $\simeq$
0.02 AU. This is a very small figure on the scale of a galaxy. The typical separation between
stars is $\sim 1$ pc $\simeq 200\,000$ AU.
For stars in the Galactic disc in the solar neighbourhood, we can use a velocity dispersion
of $v = 30$ km s$^{-1}$ and $m = 1\,M_\odot$. This gives $r_S \simeq 3 \times 10^{11}$ m $\simeq 2$ AU. This again is very
small on the scale of the Galaxy.
So strong encounters are very rare. The mean time between them in the Galactic disc
is $\sim 10^{15}$ yr, while the age of the Galaxy is $\simeq 13 \times 10^{9}$ yr. In practice, we can ignore their
effect on the dynamics of stars.
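The two strong-encounter radii quoted above follow directly from Equation 2.16. A minimal sketch, with an illustrative function name and constants:

```python
# Strong-encounter radius r_S = 2 G m / v^2 (Equation 2.16).

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
AU = 1.496e11      # astronomical unit, m


def strong_encounter_radius(star_mass, relative_speed):
    """Separation (m) inside which an encounter counts as strong."""
    return 2.0 * G * star_mass / relative_speed**2


r_s_elliptical = strong_encounter_radius(M_SUN, 300.0e3)   # elliptical galaxy
r_s_disc = strong_encounter_radius(M_SUN, 30.0e3)          # Galactic disc

print(f"elliptical: r_S ~ {r_s_elliptical:.1e} m ~ {r_s_elliptical / AU:.3f} AU")
print(f"disc:       r_S ~ {r_s_disc:.1e} m ~ {r_s_disc / AU:.1f} AU")
```

Since $r_S \propto v^{-2}$, the factor of ten in velocity between the two cases gives the factor of a hundred between the two radii.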
2.5.3 Distant weak encounters between stars
A star experiences a weak encounter if it approaches another to a minimum distance $r_0$
when
$$r_0 > r_S \equiv \frac{2 G m}{v^2} , \tag{2.17}$$
where $v$ is the relative velocity before the encounter and $m$ is the mass of the perturbing
star. Weak encounters in general provide only a tiny perturbation to the motions of stars in
a stellar system, but they are so much more numerous than strong encounters that they are
more important than strong encounters in practice.
We shall now derive a formula that expresses the change $\delta v$ in the velocity $v$ during a
weak encounter (Equation 2.19 below). This result will later be used to derive an expression
for the square of the velocity change caused by a large number of weak encounters, which
will then be used to obtain an estimate of the relaxation time in a system of stars.
Consider a star of mass $m_s$ approaching a perturbing star of mass $m$ with an impact
parameter $b$. Because the encounter is weak, the change in the direction of motion will be
small and the change in velocity will be perpendicular to the initial direction of motion. At
any time $t$ when the separation is $r$, the component of the gravitational force perpendicular
to the direction of motion will be
$$F_{\rm perp} = \frac{G m_s m}{r^2} \cos\phi ,$$
where $\phi$ is the angle at the perturbing mass between the point of closest approach and the
perturbed star. Let the component of velocity perpendicular to the initial direction of motion
be $v_{\rm perp}$ and let the final value be $v_{\rm perp\,f}$.
Making the approximation that the speed along the trajectory is constant, $r \simeq \sqrt{b^2 + v^2 t^2}$
at time $t$ if $t = 0$ at the point of closest approach. Using $\cos\phi = b/r \simeq b / \sqrt{b^2 + v^2 t^2}$ and
applying $F = ma$ perpendicular to the direction of motion we obtain
$$\frac{dv_{\rm perp}}{dt} = \frac{G m b}{(b^2 + v^2 t^2)^{3/2}} ,$$
where $v_{\rm perp}$ is the component at time $t$ of the velocity perpendicular to the initial direction
of motion. Integrating from time $t = -\infty$ to $\infty$,
$$\Big[ v_{\rm perp} \Big]_0^{v_{\rm perp\,f}} = G m b \int_{-\infty}^{\infty} \frac{dt}{(b^2 + v^2 t^2)^{3/2}} .$$
Substituting $s = vt/b$ turns the right-hand side into $(Gm/bv) \int_{-\infty}^{\infty} (1 + s^2)^{-3/2} \, ds$. We have the
standard integral $\int_{-\infty}^{\infty} (1 + s^2)^{-3/2} \, ds = 2$ (which can be shown using the
substitution $s = \tan x$). Using this standard integral, the final component of the velocity
perpendicular to the initial direction of motion is
$$v_{\rm perp\,f} = \frac{2 G m}{b v} . \tag{2.18}$$
Because the deflection is small, the change of velocity is $\delta v \equiv |\delta \mathbf{v}| = v_{\rm perp\,f}$. Therefore the
change in the velocity $v$ is given by
$$\delta v = \frac{2 G m}{b v} , \tag{2.19}$$
where $G$ is the constant of gravitation, $b$ is the impact parameter and $m$ is the mass of the
perturbing star.
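The impulse integral leading to Equation 2.19 can be checked numerically. The sketch below integrates $dv_{\rm perp}/dt$ with a simple midpoint rule over a finite window of $\pm 1000\,b/v$ (an assumed cut-off standing in for $\pm\infty$) and compares the result with $2Gm/(bv)$. The encounter parameters and function name are illustrative choices only.

```python
# Numerical check of the weak-encounter kick, delta_v = 2 G m / (b v).

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
AU = 1.496e11      # astronomical unit, m


def perpendicular_kick(m, b, v, half_width=1000.0, steps=200_000):
    """Midpoint-rule integral of G m b / (b^2 + v^2 t^2)^(3/2) dt.

    The integral runs over |t| <= half_width * b / v, an assumed finite
    stand-in for the infinite range used in the analytic result.
    """
    t_max = half_width * b / v
    dt = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps):
        t = -t_max + (k + 0.5) * dt
        total += G * m * b / (b * b + v * v * t * t) ** 1.5
    return total * dt


m, b, v = M_SUN, 1000.0 * AU, 30.0e3   # illustrative encounter
numerical = perpendicular_kick(m, b, v)
analytic = 2.0 * G * m / (b * v)
relative_error = abs(numerical - analytic) / analytic
print(f"numerical = {numerical:.4f} m/s, analytic = {analytic:.4f} m/s")
```

The small remaining difference comes from truncating the tails of the integral, which fall off as $1/t^3$.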
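As a star moves through space, it will experience a number of perturbations caused by
weak encounters. Many of these velocity changes will cancel, but some net change will occur
over time. As a result, the sum over all $\delta v$ will remain small, but the sum of the squares
$\delta v^2$ will build up with time. It is this change in $v^2$ that we need to consider in the definition
of the relaxation time (Equation 2.15). Because the change in velocity $\delta \mathbf{v}$ is perpendicular
to the initial velocity $\mathbf{v}$ in a weak encounter, $\mathbf{v} \cdot \delta \mathbf{v} = 0$ and the change in $v^2$ is therefore
$$\begin{aligned}
\delta v^2 \equiv v_f^2 - v^2 &= |\mathbf{v} + \delta \mathbf{v}|^2 - v^2
 = (\mathbf{v} + \delta \mathbf{v}) \cdot (\mathbf{v} + \delta \mathbf{v}) - v^2 \\
&= \mathbf{v} \cdot \mathbf{v} + 2 \mathbf{v} \cdot \delta \mathbf{v} + \delta \mathbf{v} \cdot \delta \mathbf{v} - v^2
 = 2 \mathbf{v} \cdot \delta \mathbf{v} + (\delta v)^2 = (\delta v)^2 ,
\end{aligned}$$
where $v_f$ is the final velocity of the star. The change in $v^2$ resulting from a single encounter
that we need to consider is
$$\delta v^2 = \left( \frac{2 G m}{b v} \right)^2 . \tag{2.20}$$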
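Consider all weak encounters occurring in a time period $t$ that have impact parameters in
the range $b$ to $b + db$ within a uniform spherical system of $N$ stars and radius $R$.
The volume swept out by impact parameters $b$ to $b + db$ in time $t$ is $2 \pi b \, db \, v t$.
Therefore the number of stars encountered with impact parameters between $b$ and $b + db$ in
time $t$ is
$$\text{(volume swept out)} \times \text{(number density of stars)}
 = \left( 2 \pi b \, db \, v t \right) \frac{N}{\frac{4}{3} \pi R^3} = \frac{3 \, b \, v \, t \, N \, db}{2 R^3} .$$
The total change in $v^2$ caused by all encounters in time $t$ with impact parameters in the
range $b$ to $b + db$ will be
$$\Delta v^2 = \left( \frac{2 G m}{b v} \right)^2 \left( \frac{3 \, b \, v \, t \, N \, db}{2 R^3} \right) .$$
Integrating over $b$, the total change in a time $t$ from all impact parameters from $b_{\min}$ to $b_{\max}$
is
$$\Delta v^2(t) = \int_{b_{\min}}^{b_{\max}} \left( \frac{2 G m}{b v} \right)^2 \left( \frac{3 \, b \, v \, t \, N}{2 R^3} \right) db
 = \frac{3}{2} \left( \frac{2 G m}{v} \right)^2 \frac{v \, t \, N}{R^3} \int_{b_{\min}}^{b_{\max}} \frac{db}{b}$$
$$\therefore \quad \Delta v^2(t) = 6 \left( \frac{G m}{v} \right)^2 \frac{v \, t \, N}{R^3} \ln\left( \frac{b_{\max}}{b_{\min}} \right) . \tag{2.21}$$
It is sometimes useful to have an expression for the change in $v^2$ that occurs in one crossing
time. In one crossing time $T_{\rm cross} = 2R/v$, the change in $v^2$ is
$$\Delta v^2(T_{\rm cross}) = 6 \left( \frac{G m}{v} \right)^2 \frac{v}{R^3} \left( \frac{2R}{v} \right) N \ln\left( \frac{b_{\max}}{b_{\min}} \right)
 = 12 N \left( \frac{G m}{R v} \right)^2 \ln\left( \frac{b_{\max}}{b_{\min}} \right) . \tag{2.22}$$
The maximum scale over which weak encounters will occur corresponds to the size of the
system of stars. So we shall use $b_{\max} \simeq R$.
$$\Delta v^2(T_{\rm cross}) = 12 N \left( \frac{G m}{R v} \right)^2 \ln\left( \frac{R}{b_{\min}} \right) . \tag{2.23}$$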
We are more interested here in the relaxation time $T_{\rm relax}$. The relaxation time is defined
as the time taken for $\Delta v^2 = v^2$. Substituting for $\Delta v^2$ from Equation 2.21 we get,
$$6 \left( \frac{G m}{v} \right)^2 \frac{v \, T_{\rm relax} \, N}{R^3} \ln\left( \frac{b_{\max}}{b_{\min}} \right) = v^2$$
$$\therefore \quad T_{\rm relax} = \frac{1}{6 N \ln\left( b_{\max} / b_{\min} \right)} \, \frac{(R v)^3}{(G m)^2} , \tag{2.24}$$
or putting $b_{\max} \simeq R$,
$$T_{\rm relax} = \frac{1}{6 N \ln\left( R / b_{\min} \right)} \, \frac{(R v)^3}{(G m)^2} . \tag{2.25}$$
Equation 2.25 enables us to estimate the relaxation time for a system of stars, such as
a galaxy or a globular cluster. Different derivations can have slightly different numerical
constants because of the different assumptions made.
In practice, $b_{\min}$ is often set to the scale on which strong encounters begin to operate,
so $b_{\min} \simeq 1$ AU. The precise values of $b_{\max}$ and $b_{\min}$ have relatively little effect on the
estimation of the relaxation time because of the log dependence.
As an example of the calculation of the relaxation time, consider an elliptical galaxy.
This has: $v \simeq 300$ km s$^{-1}$ $= 3.0 \times 10^5$ m s$^{-1}$, $N \simeq 10^{11}$, $R \simeq 10$ kpc $\simeq 3.1 \times 10^{20}$ m and
$m \simeq 1\,M_\odot \simeq 2.0 \times 10^{30}$ kg. So, $\ln(R/b_{\min}) \simeq 21$ and $T_{\rm relax} \sim 10^{24}$ s $\sim 10^{17}$ yr. The
Universe is $14 \times 10^{9}$ yr old, which means that the relaxation time is $\sim 10^{7}$ times the age of
the Universe. So star-star encounters are of no significance for galaxies.
For a large globular cluster, we have: $v \simeq 10$ km s$^{-1}$ $= 10^4$ m s$^{-1}$, $N \simeq 500\,000$, $R \simeq 5$ pc
$\simeq 1.6 \times 10^{17}$ m and $m \simeq 1\,M_\odot \simeq 2.0 \times 10^{30}$ kg. So, $\ln(R/b_{\min}) \simeq 14$ and $T_{\rm relax} \sim 5 \times 10^{15}$ s
$\sim 10^{8}$ yr. This is a small fraction ($\sim 10^{-2}$) of the age of the Galaxy. Two-body interactions
are therefore significant in globular clusters.
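The two relaxation-time examples can be evaluated with Equation 2.25. The sketch below assumes $b_{\min} = 1$ AU as in the text; the function name and unit constants are illustrative.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
PC = 3.086e16      # one parsec in metres
AU = 1.496e11      # astronomical unit, m
M_SUN = 1.989e30   # solar mass, kg
YEAR = 3.156e7     # one year in seconds


def relaxation_time(radius, speed, n_stars, star_mass, b_min=AU):
    """Relaxation time in seconds from Equation 2.25 (b_max taken as R)."""
    coulomb_log = math.log(radius / b_min)
    return (radius * speed) ** 3 / (6.0 * n_stars * coulomb_log * (G * star_mass) ** 2)


t_galaxy = relaxation_time(10.0e3 * PC, 300.0e3, 1.0e11, M_SUN)
t_cluster = relaxation_time(5.0 * PC, 10.0e3, 5.0e5, M_SUN)

print(f"elliptical galaxy: T_relax ~ {t_galaxy:.1e} s ~ {t_galaxy / YEAR:.1e} yr")
print(f"globular cluster:  T_relax ~ {t_cluster:.1e} s ~ {t_cluster / YEAR:.1e} yr")
```

Changing `b_min` by a factor of ten alters the result by only about 10 per cent, which illustrates the weak logarithmic dependence noted earlier.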
The importance of the relaxation time calculation is that it enables us to decide whether
we need to allow for star-star interactions when modelling the dynamics of a system of stars.
Modelling becomes much easier if two-body encounters can be ignored. A system where
these interactions are not important is called a collisionless system. This is why stars on
the scale of galaxies were described as being collisionless in Chapter 1. Fortunately, we can
ignore these star-star interactions when modelling galaxies and this makes possible the use
of a result called the collisionless Boltzmann equation later.
2.6 The Ratio of the Relaxation Time to the Crossing Time
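An approximate expression for the ratio of the relaxation time to the crossing time can be
calculated easily. Dividing the expression for the relaxation time (Equation 2.25) by the
crossing time $T_{\rm cross} = 2R/v$ (as in Equation 2.13),
$$\frac{T_{\rm relax}}{T_{\rm cross}} = \frac{1}{12 N \ln\left( R / b_{\min} \right)} \, \frac{R^2 v^4}{(G m)^2} .$$
For a uniform sphere, from the virial theorem (Equation 2.10),
$$v^2 \simeq \frac{N G m}{R} ,$$
and setting $b_{\min}$ equal to the strong encounter radius $r_S = 2Gm/v^2$ (Equation 2.16), so that
$R v^2 / (2Gm) = N/2$, we get,
$$\frac{T_{\rm relax}}{T_{\rm cross}} = \frac{1}{12 N \ln\left( \dfrac{R v^2}{2 G m} \right)} \, \frac{R^2 v^4}{(G m)^2} \simeq \frac{N^2}{12 N \ln(N)} ,$$
where $\ln(N/2) \simeq \ln N$ for large $N$.
$$\therefore \quad \frac{T_{\rm relax}}{T_{\rm cross}} \simeq \frac{N}{12 \ln N} . \tag{2.26}$$
For a galaxy, $N \sim 10^{11}$. Therefore $T_{\rm relax}/T_{\rm cross} \sim 10^{9}$. For a globular cluster, $N \sim 10^{5}$ and
$T_{\rm relax}/T_{\rm cross} \sim 10^{3}$.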
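Equation 2.26 depends only on $N$, so it is easy to tabulate. A minimal sketch, with an illustrative function name; the quoted values in the text are order-of-magnitude figures, so agreement to within a factor of a few is all that should be expected.

```python
import math


def relax_to_cross_ratio(n_stars):
    """T_relax / T_cross ~ N / (12 ln N), Equation 2.26."""
    return n_stars / (12.0 * math.log(n_stars))


ratio_galaxy = relax_to_cross_ratio(1.0e11)
ratio_cluster = relax_to_cross_ratio(1.0e5)

print(f"galaxy  (N = 1e11): T_relax / T_cross ~ {ratio_galaxy:.1e}")
print(f"cluster (N = 1e5):  T_relax / T_cross ~ {ratio_cluster:.1e}")
```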
2.7 The Nature of the Gravitational Potential in a Galaxy
The gravitational potential in a galaxy can be represented as essentially having two
components. The first of these is the broad, smooth, underlying potential due to the entire galaxy.
This is the sum of the potentials of all the stars, and also of the dark matter and the
interstellar medium. The second component is the localised deeper potentials due to individual
stars.
We can effectively regard the potential as being made of a smooth component with very
localised deep potentials superimposed on it. This is illustrated figuratively in Figure 2.1.
Figure 2.1: A sketch of the gravitational potential of a galaxy, showing the broad potential
of the galaxy as a whole, and the deeper, localised potentials of individual stars.
Interactions between individual stars are rare, as we have seen, and therefore it is the
broad distribution that determines the motions of stars. Therefore, we can represent the
dynamics of a system of stars using only the smooth underlying component of the gravitational
potential $\Phi(\mathbf{x}, t)$, where $\mathbf{x}$ is the position vector of a point and $t$ is the time. If the galaxy has
reached a steady state, $\Phi$ is $\Phi(\mathbf{x})$ only. We shall neglect the effect of the localised potentials
of stars in the following sections, which is an acceptable approximation as we have shown.
2.8 Gravitational potentials, density distributions and masses
2.8.1 General principles
The distribution of mass in a galaxy, including both the visible and dark matter, determines
the gravitational potential. The potential $\Phi$ at any point is related to the local density $\rho$ by
Poisson's Equation, $\nabla^2 \Phi \; (\equiv \nabla \cdot \nabla \Phi) = 4 \pi G \rho$. This means that if we know the density $\rho(\mathbf{x})$
as a function of position across a galaxy, we can calculate the potential $\Phi$, either analytically
or numerically, by integration. Alternatively, if we know $\Phi(\mathbf{x})$, we can calculate the density
profile $\rho(\mathbf{x})$ by differentiation. In addition, because the acceleration due to gravity $\mathbf{g}$ is
related to the potential by $\mathbf{g} = -\nabla \Phi$, we can compute $\mathbf{g}(\mathbf{x})$ from $\Phi(\mathbf{x})$ and vice versa.
Similarly, substituting for $\mathbf{g} = -\nabla \Phi$ in the Poisson Equation gives $\nabla \cdot \mathbf{g} = -4 \pi G \rho$.
These computations are often done for some example theoretical representations of the
potential or density. A number of convenient analytical functions are encountered in the
literature, depending on the type of galaxy being modelled and particular circumstances.
The issue of determining actual density profiles and potentials from observations of
galaxies is much more challenging, however. Observations readily give the projected density
distributions of stars on the sky, and we can attempt to derive the three-dimensional distribution
of stars from this. However, it is the total density $\rho(\mathbf{x})$, including dark matter $\rho_{\rm DM}(\mathbf{x})$, that
is relevant gravitationally, with $\rho(\mathbf{x}) = \rho_{\rm DM}(\mathbf{x}) + \rho_{\rm VIS}(\mathbf{x})$. The dark matter distribution can
only be inferred from the dynamics of visible matter (or to a limited extent from
gravitational lensing of background objects). In practice, therefore, the three-dimensional density
distribution $\rho(\mathbf{x})$ and the gravitational potential $\Phi(\mathbf{x})$ are poorly known.
2.8.2 Spherical symmetry
Calculating the relationship between density and potential is much simpler if we are dealing
with spherically symmetric distributions, which are appropriate in some circumstances such
as spherical elliptical galaxies. Under spherical symmetry, $\rho$ and $\Phi$ are functions only of the
radial distance $r$ from the centre of the distribution. Therefore,
$$\nabla^2 \Phi = \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = 4 \pi G \rho ,$$
because $\Phi$ is independent of the angles $\theta$ and $\phi$ in a spherical coordinate system (see
Appendix B).
Another useful parameter for spherically symmetric distributions is the mass $M(r)$ that
lies inside a radius $r$. We can relate this to the density $\rho(r)$ by considering a thin spherical
shell of radius $r$ and thickness $dr$ centred on the distribution. The mass of this shell is
$dM(r) = \rho(r) \times \text{surface area} \times \text{thickness} = 4 \pi r^2 \rho(r) \, dr$. This gives us the differential equation
$$\frac{dM}{dr} = 4 \pi r^2 \rho , \tag{2.27}$$
often known as the equation of continuity of mass. The total mass is $M_{\rm tot} = \lim_{r \to \infty} M(r)$.
The gravitational acceleration $\mathbf{g}$ in a spherical distribution has an absolute value $|\mathbf{g}|$ of
$$g = \frac{G M(r)}{r^2} , \tag{2.28}$$
at a distance $r$ from the centre, where $G$ is the constant of gravitation (derived in
Appendix B), and is directed towards the centre of the distribution.
2.8.3 Two examples of spherical potentials
The Plummer Potential
A function that is often used for the theoretical modelling of spherically-symmetric galaxies
is the Plummer potential. This has a gravitational potential $\Phi$ at a radial distance $r$ from
the centre that is given by
$$\Phi(r) = - \frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} , \tag{2.29}$$
where $M_{\rm tot}$ is the total mass of the galaxy and $a$ is a constant. The constant $a$ serves to
flatten the potential in the core.
For this potential the density $\rho$ at a radial distance $r$ is
$$\rho(r) = \frac{3 M_{\rm tot}}{4 \pi} \frac{a^2}{(r^2 + a^2)^{5/2}} , \tag{2.30}$$
which can be derived from the expression for $\Phi$ using the Poisson equation $\nabla^2 \Phi = 4 \pi G \rho$.
This density scales with radius as $\rho \sim r^{-5}$ at large radii.
The mass interior to a point $M(r)$ can be computed from the density $\rho$ using $dM/dr =
4 \pi r^2 \rho$, or from the potential using Gauss's Law in the form $\int_S \nabla \Phi \cdot d\mathbf{S} = 4 \pi G M(r)$ for a
spherical surface of radius $r$. The result is
$$M(r) = \frac{M_{\rm tot} \, r^3}{(r^2 + a^2)^{3/2}} . \tag{2.31}$$
The Plummer potential was first used in 1911 by H. C. K. Plummer (1875–1946) to
describe globular clusters. Because of the simple functional forms, the Plummer model is
sometimes useful for approximate analytical modelling of galaxies, but the $r^{-5}$ density profile
is much steeper than elliptical galaxies are observed to have.
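The consistency of Equations 2.30 and 2.31 can be checked numerically by integrating $dM/dr = 4\pi r^2 \rho$ outward from the centre. The sketch below uses a midpoint rule in units where $M_{\rm tot} = a = 1$; the function names and step count are illustrative choices.

```python
import math


def plummer_density(r, m_tot, a):
    """Plummer density, Equation 2.30."""
    return 3.0 * m_tot / (4.0 * math.pi) * a**2 / (r**2 + a**2) ** 2.5


def plummer_enclosed_mass(r, m_tot, a):
    """Analytic enclosed mass, Equation 2.31."""
    return m_tot * r**3 / (r**2 + a**2) ** 1.5


def integrated_mass(r, m_tot, a, steps=20_000):
    """Midpoint-rule integral of 4 pi s^2 rho(s) ds from s = 0 to r."""
    ds = r / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        total += 4.0 * math.pi * s**2 * plummer_density(s, m_tot, a)
    return total * ds


m_tot, a, r = 1.0, 1.0, 2.0            # units with M_tot = a = 1
numerical = integrated_mass(r, m_tot, a)
analytic = plummer_enclosed_mass(r, m_tot, a)
mass_error = abs(numerical - analytic) / analytic
print(f"M(r = 2a): numerical = {numerical:.6f}, analytic = {analytic:.6f}")
```

Evaluating `plummer_enclosed_mass` at a very large radius returns a value close to `m_tot`, confirming that the model has a finite total mass.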
The Isothermal Sphere
The density distribution known as the isothermal sphere is a spherical model of a galaxy
that is identical to the distribution that would be followed by a stable cloud of gas having
the same temperature everywhere. A spherically-symmetric cloud of gas having a single
temperature $T$ throughout would have a gas pressure $P(r)$ at a radius $r$ from its centre that
is related to $T$ by the ideal gas law as $P(r) = n_p(r) \, k_B T$, where $n_p(r)$ is the number density of
gas particles (atoms or molecules) at radius $r$ and $k_B$ is the Boltzmann constant. The cloud
will be supported by hydrostatic equilibrium, so therefore
$$\frac{dP}{dr} = - \frac{G M(r)}{r^2} \, \rho(r) , \tag{2.32}$$
where $M(r)$ is the mass enclosed within a radius $r$. The gradient in the mass is $dM/dr =
4 \pi r^2 \rho(r)$.
These equations have a solution
$$\rho(r) = \frac{\sigma^2}{2 \pi G r^2} , \quad \text{and} \quad M(r) = \frac{2 \sigma^2}{G} \, r , \quad \text{where} \quad \sigma^2 \equiv \frac{k_B T}{m_p} , \tag{2.33}$$
where $m_p$ is the mass of each gas particle. The parameter $\sigma$ is the root-mean-square velocity
in any direction.
The isothermal sphere model for a system of stars is defined to be a model that has
the same density distribution as the isothermal gas cloud. Therefore, an isothermal galaxy
would also have a density $\rho(r)$ and mass $M(r)$ interior to a radius $r$ given by
$$\rho(r) = \frac{\sigma^2}{2 \pi G r^2} , \quad \text{and} \quad M(r) = \frac{2 \sigma^2}{G} \, r , \tag{2.34}$$
where $\sigma$ is the root-mean-square velocity of the stars along any direction.
The isothermal sphere model is sometimes used for the analytical modelling of galaxies.
While it has some advantages of simplicity, it does suffer from the disadvantage of being
unrealistic in some important respects. Most significantly, the model fails totally at large
radii: formally the limit of $M(r)$ as $r \to \infty$ is infinite.
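One can verify numerically that the isothermal solution satisfies both the hydrostatic equation (2.32) and $dM/dr = 4\pi r^2 \rho$. The sketch below writes the pressure as $P = \rho \sigma^2$, which follows from the ideal gas law with $\rho = n_p m_p$, and uses central finite differences; the chosen $\sigma$, radius and function names are illustrative assumptions.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
KPC = 3.086e19     # one kiloparsec in metres


def density(r, sigma):
    """Isothermal-sphere density, sigma^2 / (2 pi G r^2)."""
    return sigma**2 / (2.0 * math.pi * G * r**2)


def enclosed_mass(r, sigma):
    """Isothermal-sphere enclosed mass, 2 sigma^2 r / G."""
    return 2.0 * sigma**2 * r / G


def pressure(r, sigma):
    """P = n_p k_B T = rho sigma^2, using rho = n_p m_p."""
    return density(r, sigma) * sigma**2


sigma = 200.0e3        # illustrative velocity dispersion, m/s
r = 5.0 * KPC          # illustrative radius
h = 1.0e-4 * r         # finite-difference step

# Hydrostatic equilibrium: dP/dr = -G M(r) rho(r) / r^2
dP_dr = (pressure(r + h, sigma) - pressure(r - h, sigma)) / (2.0 * h)
rhs = -G * enclosed_mass(r, sigma) * density(r, sigma) / r**2
hydrostatic_error = abs(dP_dr - rhs) / abs(rhs)

# Mass continuity: dM/dr = 4 pi r^2 rho
dM_dr = (enclosed_mass(r + h, sigma) - enclosed_mass(r - h, sigma)) / (2.0 * h)
continuity_error = abs(dM_dr - 4.0 * math.pi * r**2 * density(r, sigma)) / dM_dr

print(f"hydrostatic residual = {hydrostatic_error:.1e}")
print(f"continuity residual  = {continuity_error:.1e}")
```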
2.9 Phase Space and the Distribution Function f(x, v, t)
To describe the dynamics of a galaxy, we could use:
• the positions of each star, $\mathbf{x}_i$
• the velocities of each star, $\mathbf{v}_i$
where $i = 1$ to $N$, with $N \sim 10^{6}$ to $10^{12}$. However, this would be impractical numerically.
If we tried to store these data on a computer as 4-byte numbers for every star in a galaxy
having $N \sim 10^{12}$ stars, we would need $6 \times 4 \times 10^{12}$ bytes $\simeq 2 \times 10^{13}$ bytes $\simeq 20\,000$ Gbyte.
This is such a large data size that the storage requirements are prohibitive. If we needed to
simulate a galaxy theoretically, we would need to follow the galaxy over time using a large
number of time steps. Storing the complete set of data for, say, $10^{3}$ to $10^{6}$ time steps would
be impossible. Observationally, meanwhile, it is impossible to determine the positions and
motions of every star in any galaxy, even our own.
In practice, therefore, people represent the stars in a galaxy using the distribution function
$f(\mathbf{x}, \mathbf{v}, t)$ over position $\mathbf{x}$ and velocity $\mathbf{v}$, at a time $t$. This is the probability density in the
6-dimensional phase space of position and velocity at a given time. It is also known as
the phase space density. It requires only modest data resources to store the function
numerically for a model of a galaxy, while $f$ can also be modelled analytically.
The number of stars in a rectangular box between $x$ and $x + dx$, $y$ and $y + dy$, $z$ and
$z + dz$, with velocity components between $v_x$ and $v_x + dv_x$, $v_y$ and $v_y + dv_y$, $v_z$ and $v_z + dv_z$,
is $f(\mathbf{x}, \mathbf{v}, t) \, dx \, dy \, dz \, dv_x \, dv_y \, dv_z \equiv f(\mathbf{x}, \mathbf{v}, t) \, d^3\mathbf{x} \, d^3\mathbf{v}$. The number density $n(\mathbf{x}, t)$
of stars in space can be obtained from the distribution function $f$ by integrating over the
velocity components,
$$n(\mathbf{x}, t) = \int_{-\infty}^{\infty} f(\mathbf{x}, \mathbf{v}, t) \, dv_x \, dv_y \, dv_z = \int_{-\infty}^{\infty} f(\mathbf{x}, \mathbf{v}, t) \, d^3\mathbf{v} . \tag{2.35}$$
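To illustrate Equation 2.35, the sketch below takes an assumed distribution function at one point in space, an isotropic Maxwellian in velocity normalised to a number density `n0`, and recovers `n0` by summing $f \, d^3\mathbf{v}$ over a velocity grid. The Maxwellian form, the grid extent of $\pm 6\sigma$ and all names are illustrative assumptions, not something the notes prescribe.

```python
import math


def maxwellian_f(vx, vy, vz, n0, sigma):
    """Assumed distribution function at a fixed position x."""
    norm = n0 / (2.0 * math.pi * sigma**2) ** 1.5
    return norm * math.exp(-(vx * vx + vy * vy + vz * vz) / (2.0 * sigma**2))


def number_density(n0, sigma, half_width=6.0, cells=40):
    """Midpoint-rule integral of f over a cube of velocities |v_i| <= 6 sigma."""
    dv = 2.0 * half_width * sigma / cells
    centres = [-half_width * sigma + (k + 0.5) * dv for k in range(cells)]
    total = 0.0
    for vx in centres:
        for vy in centres:
            for vz in centres:
                total += maxwellian_f(vx, vy, vz, n0, sigma)
    return total * dv**3


n0 = 0.1           # stars per cubic parsec (illustrative)
sigma = 30.0       # km/s (illustrative)
n_recovered = number_density(n0, sigma)
density_error = abs(n_recovered - n0) / n0
print(f"n recovered = {n_recovered:.6f}, input n0 = {n0}")
```

Note that the velocity grid here holds only $40^3$ values, which hints at why storing $f$ is so much cheaper than storing every star.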
2.10 The Continuity Equation
We shall assume here that stars are conserved: for the purpose of modelling galaxies we shall
assume that the number of stars does not change. This means ignoring star formation and
the deaths of stars, but it is acceptable for the present purposes.
The assumption that stars are conserved results in the continuity equation. This relates
the rate of change of the distribution function $f$ with time to the rates of change
with position and velocity. The equation becomes an important starting point in deriving
other equations that relate $f$ to the gravitational potential and to observational quantities.
Consider the $x$–$v_x$ plane within the 6-dimensional phase space $(x, y, z, v_x, v_y, v_z)$ in
Cartesian coordinates. Consider a rectangular box in the plane extending from $x$ to $x + \Delta x$
and $v_x$ to $v_x + \Delta v_x$.
The velocity $v_x$ means that stars move in $x$ ($v_x \equiv dx/dt$), while accelerations change $v_x$.
So there is a flow of stars through the box in both the $x$ and the $v_x$ directions.
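We can represent the flow of stars by the continuity equation:
$$\frac{\partial f}{\partial t}
 + \frac{\partial}{\partial x}\left( f \frac{dx}{dt} \right)
 + \frac{\partial}{\partial y}\left( f \frac{dy}{dt} \right)
 + \frac{\partial}{\partial z}\left( f \frac{dz}{dt} \right)
 + \frac{\partial}{\partial v_x}\left( f \frac{dv_x}{dt} \right)
 + \frac{\partial}{\partial v_y}\left( f \frac{dv_y}{dt} \right)
 + \frac{\partial}{\partial v_z}\left( f \frac{dv_z}{dt} \right) = 0 . \tag{2.36}$$
This can be abbreviated as
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left[ \frac{\partial}{\partial x_i}\left( f \frac{dx_i}{dt} \right) + \frac{\partial}{\partial v_i}\left( f \frac{dv_i}{dt} \right) \right] = 0 , \tag{2.37}$$
where $x_1 \equiv x$, $x_2 \equiv y$, $x_3 \equiv z$, $v_1 \equiv v_x$, $v_2 \equiv v_y$, and $v_3 \equiv v_z$. It is sometimes also abbreviated
as
$$\frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left( f \frac{d\mathbf{x}}{dt} \right) + \frac{\partial}{\partial \mathbf{v}} \cdot \left( f \frac{d\mathbf{v}}{dt} \right) = 0 , \tag{2.38}$$
where, in this notation, for any vectors $\mathbf{a}$ and $\mathbf{b}$ with components $(a_1, a_2, a_3)$ and $(b_1, b_2, b_3)$,
$$\frac{\partial}{\partial \mathbf{a}} \cdot \mathbf{b} \equiv \sum_{i=1}^{3} \frac{\partial b_i}{\partial a_i} . \tag{2.39}$$
(Note that it does not mean a direct differentiation by a vector.)
It is also possible to simplify the notation further by introducing a combined phase space
coordinate system $\mathbf{w} = (\mathbf{x}, \mathbf{v})$ with components $(w_1, w_2, w_3, w_4, w_5, w_6) = (x, y, z, v_x, v_y, v_z)$.
In this case the continuity equation becomes
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{6} \frac{\partial}{\partial w_i} \left( f \, \dot{w}_i \right) = 0 . \tag{2.40}$$
The equation of continuity can also be expressed in terms of the momentum $\mathbf{p} = m\mathbf{v}$, where
$m$ is the mass of an element of gas, as
$$\frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left( f \frac{d\mathbf{x}}{dt} \right) + \frac{\partial}{\partial \mathbf{p}} \cdot \left( f \frac{d\mathbf{p}}{dt} \right) = 0 . \tag{2.41}$$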
2.11 The Collisionless Boltzmann Equation
2.11.1 The importance of the Collisionless Boltzmann Equation
Equation 2.25 showed that the relaxation time for galaxies is very long, significantly longer
than the age of the Universe: galaxies are collisionless systems. This, fortunately, simplifies
the analysis of the dynamics of stars in galaxies.
It is possible to derive an equation from the continuity equation that more explicitly
states the relation between the distribution function $f$, position $\mathbf{x}$, velocity $\mathbf{v}$ and time $t$.
This is the collisionless Boltzmann equation (C.B.E.), which takes its name from a similar
equation in statistical physics derived by Boltzmann to describe particles in a gas.
2.11.2 A derivation of the Collisionless Boltzmann Equation
The continuity equation (2.37) states that
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left[\frac{\partial}{\partial x_i}\left(f\frac{dx_i}{dt}\right) + \frac{\partial}{\partial v_i}\left(f\frac{dv_i}{dt}\right)\right] = 0 ,$$
where $f$ is the distribution function in the Cartesian phase space $(x_1, x_2, x_3, v_1, v_2, v_3)$. But the acceleration of a star is given by the gradient of the gravitational potential $\Phi$:
$$\frac{dv_i}{dt} = -\frac{\partial \Phi}{\partial x_i}$$
in each direction (i.e. for each value of $i$ for $i = 1, 2, 3$). (This is simply $d\mathbf{v}/dt = \mathbf{g} = -\nabla\Phi$ resolved into each dimension.)
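We also have $dx_i/dt = v_i$, so,
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left[\frac{\partial}{\partial x_i}\left(f v_i\right) + \frac{\partial}{\partial v_i}\left(-f\frac{\partial \Phi}{\partial x_i}\right)\right] = 0 .$$
But $v_i$ is a coordinate, not a value associated with a particular star: we are using the continuous function $f$ rather than considering individual stars. Therefore $v_i$ is independent of $x_i$. So,
$$\frac{\partial}{\partial x_i}\left(f v_i\right) = v_i \frac{\partial f}{\partial x_i} .$$
The potential $\Phi \equiv \Phi(\mathbf{x}, t)$ does not depend on $v_i$: $\Phi$ is independent of velocity.
$$\therefore \quad \frac{\partial}{\partial v_i}\left(-f\frac{\partial \Phi}{\partial x_i}\right) = -\frac{\partial \Phi}{\partial x_i}\frac{\partial f}{\partial v_i}$$
$$\therefore \quad \frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left[v_i\frac{\partial f}{\partial x_i} - \frac{\partial \Phi}{\partial x_i}\frac{\partial f}{\partial v_i}\right] = 0 .$$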
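But $\dfrac{dv_i}{dt} = -\dfrac{\partial \Phi}{\partial x_i}$, so,
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left[v_i\frac{\partial f}{\partial x_i} + \frac{dv_i}{dt}\frac{\partial f}{\partial v_i}\right] = 0 . \tag{2.42}$$
This is the collisionless Boltzmann equation. It can also be written as
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left[\frac{dx_i}{dt}\frac{\partial f}{\partial x_i} + \frac{dv_i}{dt}\frac{\partial f}{\partial v_i}\right] = 0 . \tag{2.43}$$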
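Alternatively it can be expressed as,
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{6} \dot{w}_i \frac{\partial f}{\partial w_i} = 0 , \tag{2.44}$$
where $\mathbf{w} = (\mathbf{x}, \mathbf{v})$ is a 6-dimensional coordinate system, and also as
$$\frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{v}}{dt} \cdot \frac{\partial f}{\partial \mathbf{v}} = 0 , \tag{2.45}$$
and as
$$\frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 . \tag{2.46}$$
Note the use here of the notation
$$\frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} \equiv \sum_{i=1}^{3} \frac{dx_i}{dt}\frac{\partial f}{\partial x_i} , \quad \text{etc.} \tag{2.47}$$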
2.11.3 Deriving the Collisionless Boltzmann Equation using Hamiltonian
Mechanics
The collisionless Boltzmann equation can also be derived from the continuity equation using
Hamiltonian mechanics. This derivation is given here. It has the advantage of being neat.
However, do not worry if you are not familiar with Hamiltonian mechanics: this is given as
an alternative to Section 2.11.2.
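Hamilton's equations relate the differentials of the position vector $\mathbf{x}$ and of the (generalised) momentum $\mathbf{p}$ to the differential of the Hamiltonian $H$:
$$\frac{d\mathbf{x}}{dt} = \frac{\partial H}{\partial \mathbf{p}} , \qquad \frac{d\mathbf{p}}{dt} = -\frac{\partial H}{\partial \mathbf{x}} . \tag{2.48}$$
(In this notation this means
$$\frac{dx_i}{dt} = \frac{\partial H}{\partial p_i} \quad \text{and} \quad \frac{dp_i}{dt} = -\frac{\partial H}{\partial x_i} \quad \text{for } i = 1 \text{ to } 3, \tag{2.49}$$
where $x_i$ and $p_i$ are the components of $\mathbf{x}$ and $\mathbf{p}$.)
Substituting for $d\mathbf{x}/dt$ and $d\mathbf{p}/dt$ into the continuity equation,
$$\frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left(f\frac{\partial H}{\partial \mathbf{p}}\right) + \frac{\partial}{\partial \mathbf{p}} \cdot \left(-f\frac{\partial H}{\partial \mathbf{x}}\right) = 0 .$$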
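For a star moving in a gravitational potential $\Phi$, the Hamiltonian is
$$H = \frac{p^2}{2m} + m\Phi(\mathbf{x}) = \frac{\mathbf{p} \cdot \mathbf{p}}{2m} + m\Phi(\mathbf{x}) , \tag{2.50}$$
where $\mathbf{p}$ is its momentum and $m$ is its mass. Differentiating,
$$\frac{\partial H}{\partial \mathbf{p}} = \frac{d}{d\mathbf{p}}\left(\frac{\mathbf{p} \cdot \mathbf{p}}{2m}\right) + \frac{d}{d\mathbf{p}}\left(m\Phi\right) = \frac{\mathbf{p}}{m} + 0 = \frac{\mathbf{p}}{m} ,$$
because $\Phi(\mathbf{x}, t)$ is independent of $\mathbf{p}$, and
$$\frac{\partial H}{\partial \mathbf{x}} = \frac{\partial}{\partial \mathbf{x}}\left(\frac{p^2}{2m}\right) + m\frac{\partial \Phi}{\partial \mathbf{x}} = 0 + m\frac{\partial \Phi}{\partial \mathbf{x}} = m\frac{\partial \Phi}{\partial \mathbf{x}} ,$$
because $p^2 = \mathbf{p} \cdot \mathbf{p}$ is independent of $\mathbf{x}$.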
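Substituting for $\partial H/\partial \mathbf{p}$ and $\partial H/\partial \mathbf{x}$,
$$\frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left(f\frac{\mathbf{p}}{m}\right) - \frac{\partial}{\partial \mathbf{p}} \cdot \left(f m\frac{\partial \Phi}{\partial \mathbf{x}}\right) = 0$$
$$\therefore \quad \frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m} \cdot \frac{\partial f}{\partial \mathbf{x}} - m\frac{\partial \Phi}{\partial \mathbf{x}} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0$$
because $\mathbf{p}$ is independent of $\mathbf{x}$, and because $\partial\Phi/\partial\mathbf{x}$ is independent of $\mathbf{p}$ since $\Phi \equiv \Phi(\mathbf{x}, t)$.
But the momentum $\mathbf{p} = m\,d\mathbf{x}/dt$ and the acceleration is $\frac{1}{m}\,d\mathbf{p}/dt = -\partial\Phi/\partial\mathbf{x}$ (the gradient of the potential).
$$\therefore \quad \frac{\partial \Phi}{\partial \mathbf{x}} = -\frac{1}{m}\frac{d\mathbf{p}}{dt} .$$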
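So,
$$\frac{\partial f}{\partial t} + \frac{m}{m}\frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} - m\left(-\frac{1}{m}\frac{d\mathbf{p}}{dt}\right) \cdot \frac{\partial f}{\partial \mathbf{p}} = 0$$
$$\therefore \quad \frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 .$$
The left-hand side is the differential $df/dt$. So,
$$\frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} \equiv \frac{df}{dt} = 0 , \tag{2.51}$$
the collisionless Boltzmann equation.
While this equation is called the collisionless Boltzmann equation (or CBE) in stellar dynamics, in Hamiltonian dynamics it is known as Liouville's theorem.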
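The content of Liouville's theorem can be illustrated numerically. The sketch below is an illustration added to these notes, not part of the derivation: it assumes a one-dimensional harmonic potential $\Phi = \frac{1}{2}\omega^2 x^2$ and a kick-drift-kick leapfrog integrator (both choices are assumptions made for simplicity). It follows the three corners of a small triangular patch in the $(x, v)$ plane and compares the area of the patch before and after. Since the number of stars inside a co-moving patch is fixed, a constant area corresponds to a constant phase-space density $f$.

```python
def leapfrog_step(x, v, dt, omega=1.0):
    """One kick-drift-kick step in the 1-D potential Phi = 0.5 * omega**2 * x**2."""
    v_half = v - 0.5 * dt * omega**2 * x          # half kick: dv/dt = -dPhi/dx
    x_new = x + dt * v_half                       # drift: dx/dt = v
    v_new = v_half - 0.5 * dt * omega**2 * x_new  # second half kick
    return x_new, v_new


def triangle_area(p0, p1, p2):
    """Area of a triangle in the (x, v) plane from its three vertices."""
    return 0.5 * abs((p1[0] - p0[0]) * (p2[1] - p0[1])
                     - (p2[0] - p0[0]) * (p1[1] - p0[1]))


def evolve_patch(points, n_steps, dt):
    """Advance every vertex of a phase-space patch by n_steps leapfrog steps."""
    evolved = []
    for x, v in points:
        for _ in range(n_steps):
            x, v = leapfrog_step(x, v, dt)
        evolved.append((x, v))
    return evolved


# Hypothetical initial patch: three nearby phase-space points.
patch = [(1.0, 0.0), (1.1, 0.0), (1.0, 0.1)]
area_before = triangle_area(*patch)
area_after = triangle_area(*evolve_patch(patch, n_steps=1000, dt=0.01))
print(area_before, area_after)
```

The harmonic potential is chosen because its flow is linear, so a triangle stays a triangle and its area can be computed from the corners alone. In a general potential the patch is sheared into a more complicated shape, but its area (its six-dimensional volume in the full problem) is still preserved.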
2.12 The implications of the Collisionless Boltzmann Equation
The collisionless Boltzmann equation tells us that df/dt = 0. This means that the density
in phase space, f, does not change with time for a test particle. Therefore if we follow a star
in orbit, the density f in 6-dimensional phase space around the star is constant.
This simple result has important implications. If a star moves inwards in a galaxy as it
follows its orbit, the density of stars in space increases (because the density of stars in each
of the components of the galaxy is greater closer to the centre). df/dt = 0 then tells us that
the spread of stellar velocities around the star will increase to keep f constant. Therefore
the velocity dispersion around the star increases as the star moves inwards. The velocity
dispersion is larger in regions of the galaxy where the density of stars is greater. Conversely,
if a star moves out from the centre, the density of stars around it will decrease and the
velocity dispersion will decrease to keep f constant.
The collisionless Boltzmann equation, and the Poisson equation (which is the gravitational analogue of Gauss's law in electrostatics) together constitute the basic equations of stellar dynamics:
$$\frac{df}{dt} = 0 , \qquad \nabla^2 \Phi(\mathbf{x}) = 4\pi G \rho(\mathbf{x}) , \tag{2.52}$$
where $f$ is the distribution function, $t$ is time, $\Phi(\mathbf{x}, t)$ is the gravitational potential at point $\mathbf{x}$, $\rho(\mathbf{x}, t)$ is the mass density at point $\mathbf{x}$, and $G$ is the constant of gravitation.
The collisionless Boltzmann equation applies because star-star encounters do not change the motions of stars significantly over the lifetime of a galaxy, as was shown in Section 2.5. Were this not the case and the system was collisional, the CBE would have to be modified by adding a collisional term on the right-hand side.
Though $f$ is a density in phase space, the full form of the collisionless Boltzmann equation does not necessarily have to be written in terms of $\mathbf{x}$ and $\mathbf{p}$. We can express $df/dt = 0$ in any set of six variables in phase space. You should remember that $f$ is always taken to be a density in six-dimensional phase space, even in situations where it is a function of fewer variables. For example, if $f$ happens to be a function of energy alone, it is not the same as the density in energy space.
2.13 The Collisionless Boltzmann Equation in Cylindrical Coordinates
So far we have considered Cartesian coordinates $(x, y, z, v_x, v_y, v_z)$. However, the form
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left[\frac{dx_i}{dt}\frac{\partial f}{\partial x_i} + \frac{dv_i}{dt}\frac{\partial f}{\partial v_i}\right] = 0 ,$$
for the collisionless Boltzmann equation of Equation 2.43 applies to any coordinate system. For a galaxy, it is often more convenient to use cylindrical coordinates with the centre of the galaxy as the origin.
The coordinates of a star are $(R, \phi, z)$. A cylindrical system is particularly useful for spiral galaxies like our own where the $z = 0$ plane is set to the Galactic plane. (Note the use of a lower-case $\phi$ as a coordinate angle, whereas elsewhere we have used a capital $\Phi$ to denote the gravitational potential.)
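The collisionless Boltzmann equation in this system is
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{dR}{dt}\frac{\partial f}{\partial R} + \frac{d\phi}{dt}\frac{\partial f}{\partial \phi} + \frac{dz}{dt}\frac{\partial f}{\partial z} + \frac{dv_R}{dt}\frac{\partial f}{\partial v_R} + \frac{dv_\phi}{dt}\frac{\partial f}{\partial v_\phi} + \frac{dv_z}{dt}\frac{\partial f}{\partial v_z} = 0 , \tag{2.53}$$
where $v_R$, $v_\phi$, and $v_z$ are the components of the velocity in the $R$, $\phi$, $z$ directions.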
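We need to replace the differentials of the velocity components with more convenient terms. $dv_R/dt$, $dv_\phi/dt$ and $dv_z/dt$ are related to the acceleration $\mathbf{a}$ (but are not actually the components of the acceleration for the $R$ and $\phi$ directions). The velocity and acceleration in terms of these differentials in a cylindrical coordinate system are
$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{dR}{dt}\,\mathbf{e}_R + R\frac{d\phi}{dt}\,\mathbf{e}_\phi + \frac{dz}{dt}\,\mathbf{e}_z$$
$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \left[\frac{d^2R}{dt^2} - R\left(\frac{d\phi}{dt}\right)^2\right]\mathbf{e}_R + \left[2\frac{dR}{dt}\frac{d\phi}{dt} + R\frac{d^2\phi}{dt^2}\right]\mathbf{e}_\phi + \frac{d^2z}{dt^2}\,\mathbf{e}_z \tag{2.54}$$
where $\mathbf{e}_R$, $\mathbf{e}_\phi$ and $\mathbf{e}_z$ are unit vectors in the $R$, $\phi$ and $z$ directions (a standard result for any cylindrical coordinate system, and for any velocity, acceleration or force). Representing the velocity as $\mathbf{v} = v_R\,\mathbf{e}_R + v_\phi\,\mathbf{e}_\phi + v_z\,\mathbf{e}_z$ and equating coefficients of the unit vectors,
$$\frac{dR}{dt} = v_R , \qquad \frac{d\phi}{dt} = \frac{v_\phi}{R} , \qquad \frac{dz}{dt} = v_z . \tag{2.55}$$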
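The acceleration can be related to the gravitational potential $\Phi$ with $\mathbf{a} = -\nabla\Phi$ (because the only forces acting on the star are those of gravity). In a cylindrical coordinate system,
$$\nabla\Phi \equiv \mathbf{e}_R\frac{\partial \Phi}{\partial R} + \mathbf{e}_\phi\frac{1}{R}\frac{\partial \Phi}{\partial \phi} + \mathbf{e}_z\frac{\partial \Phi}{\partial z} . \tag{2.56}$$
Using this result and equating coefficients, we obtain,
$$\frac{d^2R}{dt^2} - R\left(\frac{d\phi}{dt}\right)^2 = -\frac{\partial \Phi}{\partial R} , \qquad 2\frac{dR}{dt}\frac{d\phi}{dt} + R\frac{d^2\phi}{dt^2} = -\frac{1}{R}\frac{\partial \Phi}{\partial \phi} , \qquad \frac{d^2z}{dt^2} = -\frac{\partial \Phi}{\partial z} .$$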
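Rearranging these and substituting for $dR/dt$, $d\phi/dt$ and $dz/dt$ from Equation 2.55, we obtain,
$$\frac{dv_R}{dt} = -\frac{\partial \Phi}{\partial R} + \frac{v_\phi^2}{R} , \qquad \frac{dv_z}{dt} = -\frac{\partial \Phi}{\partial z} ,$$
and with some more manipulation,
$$\begin{aligned}
\frac{dv_\phi}{dt} &= \frac{d}{dt}\left(R\frac{d\phi}{dt}\right) = \frac{dR}{dt}\frac{d\phi}{dt} + R\frac{d^2\phi}{dt^2} \\
&= v_R\frac{v_\phi}{R} + \left(-\frac{1}{R}\frac{\partial \Phi}{\partial \phi} - 2\frac{dR}{dt}\frac{d\phi}{dt}\right) \\
&= \frac{v_R v_\phi}{R} - \frac{1}{R}\frac{\partial \Phi}{\partial \phi} - 2\frac{v_R v_\phi}{R} \\
&= -\frac{1}{R}\frac{\partial \Phi}{\partial \phi} - \frac{v_R v_\phi}{R} .
\end{aligned} \tag{2.57}$$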
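Substituting these into Equation 2.53, we obtain,
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + v_R\frac{\partial f}{\partial R} + \frac{v_\phi}{R}\frac{\partial f}{\partial \phi} + v_z\frac{\partial f}{\partial z} + \left(\frac{v_\phi^2}{R} - \frac{\partial \Phi}{\partial R}\right)\frac{\partial f}{\partial v_R} - \frac{1}{R}\left(v_R v_\phi + \frac{\partial \Phi}{\partial \phi}\right)\frac{\partial f}{\partial v_\phi} - \frac{\partial \Phi}{\partial z}\frac{\partial f}{\partial v_z} = 0 . \tag{2.58}$$
This is the collisionless Boltzmann equation in cylindrical coordinates. This form relates $f$ to observable parameters $(R, \phi, z, v_R, v_\phi, v_z)$ and the potential $\Phi$.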
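The equations of motion that feed into this result (Equations 2.55 and 2.57) can be checked numerically. The following is a minimal sketch, not part of the original notes. It assumes an axisymmetric potential, so that $\partial\Phi/\partial\phi = 0$, of the illustrative form $\Phi = -GM/\sqrt{R^2 + (z/q)^2 + a^2}$ with made-up parameter values, and it uses a classical fourth-order Runge-Kutta integrator. With $\partial\Phi/\partial\phi = 0$, Equation 2.57 reduces to $dv_\phi/dt = -v_R v_\phi/R$, which implies that $L_z = R v_\phi$ should stay constant along the orbit.

```python
G, M, A, Q = 1.0, 1.0, 0.5, 0.7  # illustrative units and shape parameters (assumed)


def grad_phi(R, z):
    """Return (dPhi/dR, dPhi/dz) for Phi = -G*M / sqrt(R^2 + (z/Q)^2 + A^2)."""
    s3 = (R * R + (z / Q) ** 2 + A * A) ** 1.5
    return G * M * R / s3, G * M * z / (Q * Q * s3)


def derivs(y):
    """Right-hand sides of Equations 2.55 and 2.57 with dPhi/dphi = 0."""
    R, phi, z, v_R, v_phi, v_z = y
    dphi_dR, dphi_dz = grad_phi(R, z)
    return (v_R,
            v_phi / R,
            v_z,
            -dphi_dR + v_phi * v_phi / R,
            -v_R * v_phi / R,
            -dphi_dz)


def rk4_step(y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = derivs(y)
    k2 = derivs(tuple(a + 0.5 * dt * b for a, b in zip(y, k1)))
    k3 = derivs(tuple(a + 0.5 * dt * b for a, b in zip(y, k2)))
    k4 = derivs(tuple(a + dt * b for a, b in zip(y, k3)))
    return tuple(a + dt * (b + 2.0 * c + 2.0 * d + e) / 6.0
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))


def integrate(y0, dt=1e-3, n_steps=20000):
    """Advance the state (R, phi, z, v_R, v_phi, v_z) by n_steps steps."""
    y = y0
    for _ in range(n_steps):
        y = rk4_step(y, dt)
    return y


y0 = (1.0, 0.0, 0.1, 0.1, 0.6, 0.05)  # hypothetical starting point
y1 = integrate(y0)
Lz0, Lz1 = y0[0] * y0[4], y1[0] * y1[4]
print(Lz0, Lz1)
```

The two printed values of $L_z$ should agree to within the integration error, even though $R$ and $v_\phi$ each vary along the orbit.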
In many practical cases, particularly spiral galaxies, $\Phi$ will be independent of $\phi$, so $\partial\Phi/\partial\phi = 0$ (but not if we include spiral arms where the potential will be slightly deeper).
2.14 Orbits of Stars in Galaxies
2.14.1 The character of orbits
The term "orbit" is used to describe the trajectories of stars within galaxies, even though they are very different to Keplerian orbits such as those of planets in the Solar System. The orbits of stars in a galaxy are usually not closed paths and in general they are three dimensional (they do not lie in a plane). They are often complex. In general they are highly chaotic, even if the galaxy is in equilibrium.
The orbit of a star in a spherical potential, to consider the simplest example, is confined to a plane perpendicular to the angular momentum vector of the star. It is, however, not a closed path and has an appearance that is usually described as a "rosette". In axisymmetric potentials (e.g. an oblate elliptical galaxy) the orbit is confined to a plane that precesses. This plane is inclined to the axis of symmetry and rotates about the axis. The orbit within the plane is similar to that in a spherical potential.
Triaxial potentials can have orbits that are much more complex. Triaxial potentials often have the tendency to tumble about one axis, which leads to chaotic star orbits.
Figure 2.2: An example of the orbit of a star in a spherical potential. An example star has been put into an orbit in the x-y plane. Its orbit follows a rosette pattern, but it remains in the x-y plane. [These diagrams were plotted using data generated assuming a Plummer potential: the potential lacks a deep central cusp.]
Figure 2.3: The orbit of a star in a flattened (oblate) potential. An example star has been put into an orbit inclined to the x-y plane. The galaxy is flattened in the z direction with an axis ratio of 0.7. The orbit follows a rosette pattern, but the plane of the orbit precesses. This illustrates the trajectory of a star in an oblate elliptical galaxy, for example.
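An orbit of the kind shown in Figure 2.2 can be generated with a few lines of code. The sketch below is an assumption-laden illustration and not the program used to make the figures: it takes a Plummer potential with $G = M_{\rm tot} = a = 1$, a kick-drift-kick leapfrog integrator, and arbitrary starting values in the x-y plane. It records the track so that it could be plotted, and reports the radial range, the maximum excursion in $z$, and the drift in the energy per unit mass.

```python
import math

G, M, A = 1.0, 1.0, 1.0  # illustrative units (assumed)


def accel(pos):
    """Acceleration -grad(Phi) for the Plummer potential Phi = -G*M/sqrt(r^2 + A^2)."""
    r2 = sum(c * c for c in pos)
    k = -G * M / (r2 + A * A) ** 1.5
    return [k * c for c in pos]


def energy(pos, vel):
    """Energy per unit mass, v^2/2 + Phi."""
    r2 = sum(c * c for c in pos)
    return 0.5 * sum(c * c for c in vel) - G * M / math.sqrt(r2 + A * A)


def orbit(pos, vel, dt=0.01, n_steps=20000):
    """Kick-drift-kick leapfrog; returns the list of positions and the final velocity."""
    pos, vel = list(pos), list(vel)
    track = [tuple(pos)]
    acc = accel(pos)
    for _ in range(n_steps):
        vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]  # half kick
        pos = [x + dt * v for x, v in zip(pos, vel)]        # drift
        acc = accel(pos)
        vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]  # half kick
        track.append(tuple(pos))
    return track, vel


# Hypothetical starting point: in the x-y plane, moving slower than the circular speed.
pos0, vel0 = (1.0, 0.0, 0.0), (0.0, 0.3, 0.0)
track, vel1 = orbit(pos0, vel0)
radii = [math.hypot(x, y) for x, y, _ in track]
print("r_min, r_max:", min(radii), max(radii))
print("max |z|:", max(abs(z) for _, _, z in track))
print("energy drift:", energy(track[-1], vel1) - energy(pos0, vel0))
```

The star oscillates between an inner and an outer radius while circulating about the centre, and it never leaves the plane in which it started. Plotting the $x$ and $y$ columns of `track` against each other gives the rosette.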
2.14.2 The chaotic nature of many orbits
In chaotic systems, stars that initially move along similar paths will diverge, eventually moving along very different orbits. The divergence in their paths is exponential in time, which is the technical definition of chaos in dynamical systems. Their motion shows a stretching and folding in phase space. This can be so even if there is no collective motion of stars at all ($f$ in equilibrium).
Figure 2.4: The orbit of a star in a triaxial potential. An example star has been put into an orbit inclined to the x-y plane. The galaxy has different dimensions in each of the x, y and z directions. The orbit is complex and it maps out a region of space. This illustrates the trajectory of a star in a triaxial elliptical galaxy, for example. (This simulation extends over a longer time period than those of Figures 2.2 and 2.3.)
This stretching and folding in phase space can be appreciated using an analogy. When making bread, a baker's dough behaves essentially as a fluid. Dough is incompressible, but that does not prevent the baker stretching it in one direction and shrinking it in others, and then folding it back. So while the dough keeps much the same overall shape, particles initially nearby within it can be dispersed to widely different parts of it, through the repeated stretching and folding. The same stretching and folding operation can take place for stars in phase space. In fact it appears that phase space is typically riddled with regions where $f$ gets stretched in one direction while being shrunk in others. Thus nearby orbits tend to diverge, and the divergence is exponential in time.
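The dough analogy has a standard mathematical counterpart, the baker's map on the unit square, which can be used to see the exponential divergence directly. The sketch below is a toy model only and is not a simulation of stellar orbits; the map, the starting points and the initial separation of $10^{-9}$ are all illustrative assumptions. Each step stretches the square by a factor of 2 in $x$, squeezes it by a factor of 2 in $y$ (so area is preserved, like incompressible dough), and stacks the two halves.

```python
def baker_map(x, y):
    """One stretch-and-fold step of the baker's map on the unit square."""
    if x < 0.5:
        return 2.0 * x, 0.5 * y
    return 2.0 * x - 1.0, 0.5 * y + 0.5


def circular_distance(a, b):
    """Separation in x, measured on the unit interval with its ends joined."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def separations(p, q, n_steps):
    """Track the x-separation of two points over n_steps applications of the map."""
    out = [circular_distance(p[0], q[0])]
    for _ in range(n_steps):
        p, q = baker_map(*p), baker_map(*q)
        out.append(circular_distance(p[0], q[0]))
    return out


# Two hypothetical points that start a distance 1e-9 apart in x.
seps = separations((0.3, 0.4), (0.3 + 1e-9, 0.4), n_steps=20)
for n in (0, 5, 10, 15, 20):
    print(n, seps[n])
```

The separation doubles at every step, so it grows as $2^n = e^{n \ln 2}$: the e-folding time of this toy system is $1/\ln 2$ steps. The exponential growth stops only once the separation becomes comparable to the size of the square.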
Simulations show that the timescale for divergence (the e-folding time) is $T_{\rm diverge} \sim T_{\rm cross}$, the crossing time, and gets shorter for higher star densities.
However, in some special cases, there is no chaos. These systems are said to be integrable. If the dynamics is confined to one real-space dimension (hence two phase-space dimensions) then no stretching-and-folding can happen, and orbits are regular. So in a spherical system all orbits are regular. In addition, there are certain potentials (usually referred to as Stäckel potentials) where the dynamics decouples into three effectively one-dimensional systems; so if some equilibrium $f$ generates a Stäckel potential, the orbits will stay chaos-free. Also, small perturbations of non-chaotic systems tend to produce only small regions of chaos,$^1$ and orbits may be well described through perturbation theory.
2.14.3 Integrals of the motion
To solve the collisionless Boltzmann equation for stars in a galaxy, we need further constraints on the position and velocity. This can be done using integrals of the motion. These are simply functions of the star's position $\mathbf{x}$ and velocity $\mathbf{v}$ that are constant along its orbit. They are useful in potentials $\Phi(\mathbf{x})$ that are constant over time. The distribution function $f$ is also constant along the orbit and can be written as a function of integrals of the motion.
$^1$ If you ever come across the KAM theorem, that's basically it.
Examples of integrals of the motion are:
• The total energy. The energy $E$ of a particular star in a potential $\Phi$ is constant over time, so $E(\mathbf{x}, \mathbf{v}) = \frac{1}{2} m v^2 + m\Phi(\mathbf{x})$. Because this is dependent on the mass of the star, it is more normal to work with the energy per unit mass, which will be written as $E_m$ here. So $E_m = \frac{1}{2} v^2 + \Phi$ is a constant.
• In an axisymmetric potential (e.g. our Galaxy), the $z$-component of the angular momentum, $L_z$, is conserved. Therefore $L_z$ is an integral of the motion in such a potential.
• In a spherical potential, the total angular momentum $\mathbf{L}$ is constant. Therefore $\mathbf{L}$ is an integral of the motion in this potential, and the $x$, $y$ and $z$ components of $\mathbf{L}$ are each integrals of the motion.
An orbit is said to be regular if it has as many isolating integrals that can define the orbit unambiguously as there are spatial dimensions.
2.14.4 Isolating integrals and integrable systems
The collisionless Boltzmann equation tells us that $df/dt = 0$ (Section 2.12). As was discussed earlier, if we move with a star in its orbit, $f$ is constant locally as the star passes through phase space at that instant in time. But if the system is in a steady state (the potential is constant over time), $f$ is constant along the star's path at all times. This means that the orbits of stars map out constant values of $f$.
An integral of the motion for a star (e.g. energy per unit mass, $E_m$) is constant (by definition). It therefore defines a 5-dimensional hypersurface in 6-dimensional phase space. The motion of a star is confined to that 5-dimensional surface in phase space. Therefore $f$ is constant over that hypersurface.
A different value of the isolating integral (e.g. a different value of $E_m$) will define a different hypersurface. In turn, $f$ will be different on this surface. So $f$ is a function of the isolating integral, i.e. $f(x, y, z, v_x, v_y, v_z) = \mathrm{fn}(I_1)$ where $I_1$ is an integral of the motion. $I_1$ here isolates a hypersurface. Therefore the integral of the motion is known as an isolating integral.
Integrals that fail to confine orbits are called non-isolating integrals. A system is integrable if we can define isolating integrals that enable the orbit to be determined.
In integrable systems there are significant simplifications. Each orbit is (i) confined to a three-dimensional toroidal subspace of six-dimensional phase space, and (ii) fills its torus evenly.$^2$ Phase space itself is filled by nested orbit-carrying tori; they have to be nested, since orbits can't cross in phase space. Therefore the time-average of each orbit is completely specified once we have specified which torus it is on; this takes three numbers for each orbit, and these are called isolating integrals (they are constants for each orbit, of course). Think of the isolating integrals as a coordinate system that parameterises orbital tori; a transformation to a different set of isolating integrals is like a coordinate transformation.
If isolating integrals exist, then any $f$ that depends only on them will automatically satisfy the collisionless Boltzmann equation. Conversely, since orbits fill their tori evenly, any equilibrium $f$ cannot depend on location on the tori; it can only depend on the tori themselves, i.e., on the isolating integrals. This result is known as Jeans' theorem.
2.14.5 The Jeans Theorem
The Jeans Theorem is an important result in stellar dynamics that states the importance of integrals of the motion in solving the collisionless Boltzmann equation for gravitational potentials that do not change with time. It was named after its discoverer, the English astronomer, physicist and mathematician Sir James Hopwood Jeans (1877-1946).
$^2$ These two statements are important results from Hamiltonian dynamical systems which we won't try to prove here. But the statements that follow in Section 2.14.4 are straightforward consequences of (i) and (ii).
It states that any steady-state solution of the collisionless Boltzmann equation depends on the phase-space coordinates only through integrals of the motion in the galaxy's potential, and any function of the integrals yields a steady-state solution of the collisionless Boltzmann equation.
This means that in a potential that does not change with time, we can express the collisionless Boltzmann equation in terms of integrals of motion, and then solve for the distribution function $f$ in terms of those integrals of motion. We can then convert the solution of $f$ in terms of the integrals to a solution for $f$ in terms of the space and velocity coordinates. For example, if the energy per unit mass $E_m$ and total angular momentum components $L_x$ and $L_y$ are constant for each star in some potential, then we can solve for $f$ uniquely as a function of $E_m$, $L_x$ and $L_y$. Then we can convert from $E_m$, $L_x$ and $L_y$ to give $f$ as a function of $(x, y, z, v_x, v_y, v_z)$.
You should be wary of Jeans' theorem, especially when people tacitly assume it, because as we saw, it assumes that the system is integrable, which is in general not the case.
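The second half of the theorem (any function of the integrals gives a steady-state solution) can be checked numerically for a simple case. The sketch below is only an illustration under stated assumptions: it takes a Plummer potential with $G = M_{\rm tot} = a = 1$, an arbitrary function $f = (-E_m)^{7/2}$ of the energy per unit mass, and an arbitrary phase-space point, and it evaluates the two remaining terms of the steady-state collisionless Boltzmann equation, $\sum_i v_i\,\partial f/\partial x_i$ and $-\sum_i (\partial\Phi/\partial x_i)\,\partial f/\partial v_i$, with central finite differences.

```python
import math


def phi(x):
    """Plummer potential with G = M = a = 1 (illustrative units)."""
    return -1.0 / math.sqrt(sum(c * c for c in x) + 1.0)


def f_of_energy(x, v):
    """A distribution function that depends on phase space only through E_m."""
    e_m = 0.5 * sum(c * c for c in v) + phi(x)
    return (-e_m) ** 3.5 if e_m < 0.0 else 0.0


def partial(func, args, which, i, h=1e-5):
    """Central difference of func(x, v) in component i of x (which=0) or v (which=1)."""
    up = [list(args[0]), list(args[1])]
    dn = [list(args[0]), list(args[1])]
    up[which][i] += h
    dn[which][i] -= h
    return (func(*up) - func(*dn)) / (2.0 * h)


def cbe_terms(func, x, v):
    """Return the streaming and force terms of the steady-state CBE at (x, v)."""
    streaming = sum(v[i] * partial(func, (x, v), 0, i) for i in range(3))
    grad_phi = [partial(lambda xx, vv: phi(xx), (x, v), 0, i) for i in range(3)]
    force = -sum(grad_phi[i] * partial(func, (x, v), 1, i) for i in range(3))
    return streaming, force


# Hypothetical phase-space point of a bound star.
streaming, force = cbe_terms(f_of_energy, (0.5, -0.3, 0.8), (0.2, 0.4, -0.1))
print(streaming, force, streaming + force)
```

Each term is non-zero on its own, but they cancel to within the finite-difference error, so $\partial f/\partial t = 0$ as the theorem requires. Replacing `f_of_energy` by a function that depends on position or velocity in some other way generally destroys the cancellation.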
2.15 Spherical Systems
2.15.1 Solving for f in spherical galaxies
The Jeans Theorem does apply in spherical systems of stars, such as spherical elliptical galaxies. As a consequence, $f$ can depend on (at most) three integrals of motion in a spherical system. The simplest case is for $f$ to be a function of the energy of the stars only. (Since we are considering bound systems, $f = 0$ for $E > 0$ always.) To find an equilibrium solution, we only have to satisfy Poisson's equation.
The total energy of a star of mass $m$ moving with a velocity $v$ is $E = \frac{1}{2} m v^2 + m\Phi$, where $\Phi$ is the gravitational potential at the point where the star is situated. Here it is more convenient to use the energy per unit mass $E_m = \frac{1}{2} v^2 + \Phi$.
A spherical galaxy can be described very simply by a spherical polar coordinate system $(r, \theta, \phi)$ with the origin at the centre. Poisson's equation relates the Laplacian of the gravitational potential $\Phi$ at a point to the local mass density $\rho$ as $\nabla^2\Phi = 4\pi G\rho$. In a spherical polar coordinate system the Laplacian of any scalar function $A(r, \theta, \phi)$ is
$$\nabla^2 A \equiv \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial A}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\frac{\partial A}{\partial \theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 A}{\partial \phi^2} \tag{2.59}$$
(a standard result from vector calculus: see Appendix B).
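In a spherically symmetric galaxy that does not change with time, the potential is a function of the radial distance $r$ from the centre only. So $\partial\Phi/\partial\theta = 0$ and $\partial\Phi/\partial\phi = 0$. Therefore,
$$\nabla^2\Phi = \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Phi}{dr}\right) . \tag{2.60}$$
Substituting this into the Poisson equation,
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Phi}{dr}\right) = 4\pi G\rho . \tag{2.61}$$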
The distribution function $f$ is related to the number density $n$ of stars by
$$n = \int f \, d^3v$$
(from Equation 2.35), and in this case $f$ is a function of energy per unit mass: $f = f(E_m)$. We can relate this to the density using $\rho = \bar{m} n$ where $\bar{m}$ is the mean mass of a star, giving,
$$\rho = \bar{m}\int f \, d^3v . \tag{2.62}$$
This integral is over all velocities. We can convert from $d^3v$ to $dv$ by considering a thin spherical shell in a space defined by the three velocity components, which gives $d^3v = 4\pi v^2\,dv$. So
$$\rho = 4\pi\bar{m}\int f v^2 \, dv . \tag{2.63}$$
Note that this integration can be performed over velocity at each and every point in the galaxy, so this is $\rho(r)$.
We must determine the limits on this integral. For any particular point in the galaxy (i.e. any value of $r$), the minimum possible velocity is 0, which occurs when a star moving on a radial orbit reaches its maximum distance from the centre at that point. The maximum velocity occurs when a star has the greatest possible energy ($E_m = 0$, which would allow a star to move out from the point to arbitrary distance). Therefore the maximum velocity is $v = \sqrt{-2\Phi(r)}$. So the integration is from velocity $v = 0$ to $\sqrt{-2\Phi(r)}$. So,
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Phi}{dr}\right) = (4\pi)^2 G\bar{m}\int_0^{\sqrt{-2\Phi(r)}} f \, v^2 \, dv . \tag{2.64}$$
We can convert this integral to an integral over energy per unit mass. $E_m = \frac{1}{2} v^2 + \Phi$ gives $dE_m = v\,dv$ at a fixed position (and hence for a constant $\Phi$). The maximum possible energy per unit mass is 0, while the minimum possible value at a radius $r$ would be given by a star that is stationary at that point: $E_m = \Phi(r)$ (which is of course negative). So, at any radius $r$,
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Phi}{dr}\right) = (4\pi)^2\sqrt{2}\,G\bar{m}\int_{\Phi(r)}^{0} \sqrt{E_m - \Phi(r)}\; f(E_m)\, dE_m , \tag{2.65}$$
on substituting $v = \sqrt{2(E_m - \Phi)}$.
It is usual in Equation 2.64 to take $f(v)$ as given and to try to solve for $\Phi(r)$ and hence $\rho(r)$; this is a nonlinear differential equation. In Equation 2.65 we would normally take $\Phi$ as given, and try to solve for $f(E_m)$; this is a linear integral equation.
There are $f(E_m)$ models in the literature, and you can always concoct a new one by picking some $\rho(r)$, computing $\Phi(r)$ and then solving Equation 2.65 numerically. Note that the velocity distribution is isotropic for any $f(E_m)$. If $f$ depends on other integrals of motion, say angular momentum $L$ or its $z$ component, or both (thus $f(E_m, L^2, L_z)$), then the velocity distribution will be anisotropic, and there are many examples of these around too.
2.15.2 Example of a spherical, isotropic distribution function: the Plummer potential
As discussed earlier, the Plummer potential has a gravitational potential $\Phi$ and a mass density $\rho$ at a radial distance $r$ from the centre that are given by
$$\Phi(r) = -\frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} , \qquad \rho(r) = \frac{3 M_{\rm tot}}{4\pi}\frac{a^2}{(r^2 + a^2)^{5/2}} , \qquad \text{(2.29) and (2.30)}$$
where $M_{\rm tot}$ is the total mass of the galaxy and $a$ is a constant. The distribution function for the Plummer model is related to the density by Equation 2.63. It can be shown that these $\Phi(r)$ and $\rho(r)$ forms give a solution,
$$f(E_m) = \frac{24\sqrt{2}}{7\pi^3}\frac{a^2}{G^5 M_{\rm tot}^4\,\bar{m}}\left(-E_m\right)^{7/2} . \tag{2.66}$$
This can be verified by inserting in Equation 2.64, although this is not trivial to do. This result gives the distribution function $f$ as a function only of the energy per unit mass $E_m$. To calculate $f$ for any point $(x, y, z, v_x, v_y, v_z)$ in phase space, we need only to calculate $E_m$ from these coordinates and then calculate the value of $f$ associated with that $E_m$.
2.15.3 Example of a spherical, isotropic distribution function: the isothermal sphere
The isothermal sphere was introduced in Section 2.8.3. The density profile was given in Equation 2.34. The isothermal sphere is defined by analogy with a Maxwell-Boltzmann gas, and therefore the distribution function as a function of the energy per unit mass $E_m$ is given by,
$$f(E_m) = \frac{n_0}{(2\pi\sigma^2)^{3/2}}\exp\left(-\frac{E_m}{\sigma^2}\right) = \frac{n_0}{(2\pi\sigma^2)^{3/2}}\exp\left(-\frac{\frac{1}{2}v^2 + \Phi}{\sigma^2}\right) , \tag{2.67}$$
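where $\sigma^2$ is a velocity dispersion and acts in this distribution like a temperature does in a gas. $n_0$ is a constant. Integrating over velocities gives
$$\begin{aligned}
n(r) = \int f \, d^3v = \int_0^\infty f \cdot 4\pi v^2 \, dv &= \frac{4\pi n_0}{(2\pi\sigma^2)^{3/2}}\exp\left(-\frac{\Phi}{\sigma^2}\right)\int_0^\infty v^2\exp\left(-\frac{v^2}{2\sigma^2}\right) dv \\
&= n_0\exp\left(-\frac{\Phi(r)}{\sigma^2}\right) ,
\end{aligned} \tag{2.68}$$
using the standard integral $\int_0^\infty e^{-ax^2}\,dx = \sqrt{\pi}/(2\sqrt{a})$, which on differentiating with respect to $a$ gives $\int_0^\infty x^2 e^{-ax^2}\,dx = \sqrt{\pi}/(4a^{3/2})$. Converting this to density $\rho(r)$ using $\rho = \bar{m} n$, where $\bar{m}$ is the mean mass of the stars, we get,
$$\rho(r) = \rho_0\exp\left(-\frac{\Phi(r)}{\sigma^2}\right) , \quad \text{and equivalently,} \quad \Phi(r) = -\sigma^2\ln\left(\frac{\rho(r)}{\rho_0}\right) , \tag{2.69}$$
where $\rho_0$ is a constant. Using this, Poisson's equation ($\nabla^2\Phi = 4\pi G\rho$) in a spherically symmetric potential becomes, on substituting for $d\Phi/dr$,
$$\frac{d}{dr}\left(r^2\frac{d\ln\rho}{dr}\right) = -\frac{4\pi G}{\sigma^2}\,r^2\rho , \tag{2.70}$$
for which the solution is
$$\rho(r) = \frac{\sigma^2}{2\pi G r^2} \tag{2.71}$$
(see Equation 2.34). As already commented in Section 2.8.3, the isothermal sphere has infinite mass! (A side effect of this is that the boundary condition $\Phi(\infty) = 0$ cannot be used, which is why we needed the redundant-looking constant $n_0$ in Equations 2.67 and 2.68.) Nevertheless, it is often used as a model, with some large-$r$ truncation assumed, for the dark haloes of disc galaxies.
The same $\rho(r)$ can be produced by many different $f$, all having different velocity distributions.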
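As a quick consistency check (a worked step added here for illustration, not part of the original derivation), the profile of Equation 2.71 can be substituted back into Equation 2.70, evaluating the left-hand and right-hand sides separately:

```latex
\begin{align*}
\ln\rho = \ln\!\left(\frac{\sigma^2}{2\pi G}\right) - 2\ln r
 \;&\Rightarrow\; r^2\frac{d\ln\rho}{dr} = -2r
 \;\Rightarrow\; \frac{d}{dr}\!\left(r^2\frac{d\ln\rho}{dr}\right) = -2 , \\
-\frac{4\pi G}{\sigma^2}\,r^2\rho
 &= -\frac{4\pi G}{\sigma^2}\,r^2\,\frac{\sigma^2}{2\pi G r^2} = -2 .
\end{align*}
```

Both sides equal $-2$ at every radius, so the profile satisfies the equation. The mass enclosed within radius $r$ is $M(r) = \int_0^r 4\pi r'^2 \rho\, dr' = 2\sigma^2 r/G$, which grows without limit as $r$ increases; this is the sense in which the total mass is infinite.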
2.16 Observable and Measurable Quantities
The phase space distribution $f$ is usually very difficult to measure observationally, because of the challenges of measuring the distribution of stars over space and particularly over velocity. Velocity components along the line of sight can be measured spectroscopically from a Doppler shift. However, transverse velocity components cannot be measured directly for galaxies beyond our own (or at least beyond the Local Group). As a function of seven variables (six of the phase space, plus time), the function $f$ can be awkward to compute theoretically. It is therefore more convenient to use quantities related to $f$.
The number density $n$ of stars in space can be measured observationally by counting more luminous stars for nearby galaxies, or from the observed intensity of light for more distant galaxies. Star counts combined with estimates of the distances of individual stars can provide $n$ as a function of position within our Galaxy. For a distant galaxy, converting the intensity along the line of sight of the integrated light from large numbers of stars into number densities (a process known as deprojection) requires assumptions about the stellar populations and their three-dimensional distribution. Nevertheless, reasonable attempts can be made in many instances.
Spectroscopy provides mean velocities $\langle v_r \rangle$ along the line of sight through a galaxy, and the widths of absorption lines provide velocity dispersions $\sigma_r$ along the line of sight. These mean velocities will be weighted according to the numbers of stars.
It is therefore much more convenient to calculate quantities involving number densities
n, mean velocities and velocity dispersions from f. These quantities can then be compared
with observations more directly. A series of equations called the Jeans Equations allow this
to be done.
2.17 The Jeans Equations
The Jeans Equations relate number densities, mean velocities, velocity dispersions and the gravitational potential. They were first used in stellar dynamics by Sir James Jeans in 1919.
It is useful to derive equations for the quantities
$$\begin{aligned}
n &= \int f \, d^3v , \\
n\langle v_i \rangle &= \int v_i f \, d^3v , \\
n\sigma_{ij}^2 &= \int \left(v_i - \langle v_i \rangle\right)\left(v_j - \langle v_j \rangle\right) f \, d^3v ,
\end{aligned} \tag{2.72}$$
by taking moments of the collisionless Boltzmann equation (expressed in the Cartesian variables $x_i$ and $v_i$). $\sigma_{ij}$ is a velocity dispersion tensor: it is discussed in more detail below.
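The collisionless Boltzmann equation gives (Equation 2.43)
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3}\left[\frac{dx_i}{dt}\frac{\partial f}{\partial x_i} + \frac{dv_i}{dt}\frac{\partial f}{\partial v_i}\right] = 0 ,$$
or equivalently,
$$\frac{\partial f}{\partial t} + \sum_{i=1}^{3} v_i\frac{\partial f}{\partial x_i} - \sum_{i=1}^{3}\frac{\partial \Phi}{\partial x_i}\frac{\partial f}{\partial v_i} = 0 ,$$
on substituting for the components of acceleration from $d\mathbf{v}/dt = -\nabla\Phi$.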
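To derive the first of the Jeans Equations, we shall consider the zeroth moment by integrating this equation over all velocities.
$$\int\left[\frac{\partial f}{\partial t} + \sum_{i=1}^{3} v_i\frac{\partial f}{\partial x_i} - \sum_{i=1}^{3}\frac{\partial \Phi}{\partial x_i}\frac{\partial f}{\partial v_i}\right] d^3v = \int 0 \, d^3v . \tag{2.73}$$
$$\therefore \quad \int\frac{\partial f}{\partial t}\, d^3v + \sum_{i=1}^{3}\int v_i\frac{\partial f}{\partial x_i}\, d^3v - \sum_{i=1}^{3}\frac{\partial \Phi}{\partial x_i}\int\frac{\partial f}{\partial v_i}\, d^3v = 0 ,$$
(with the right-hand side being zero because it is a definite integral). Some of these terms can be simplified, particularly by noting the integration is performed over all velocities at each position and time.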
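But
$$\begin{aligned}
\int\frac{\partial f}{\partial t}\, d^3v &= \frac{\partial}{\partial t}\int f \, d^3v && \text{because } t \text{ and the } v_i \text{ are independent} \\
&= \frac{\partial n}{\partial t} && \text{because } n = \int f \, d^3v ,
\end{aligned}$$
and
$$\begin{aligned}
\int v_i\frac{\partial f}{\partial x_i}\, d^3v &= \int\frac{\partial (v_i f)}{\partial x_i}\, d^3v && \text{because the } v_i \text{ and } x_i \text{ are independent} \\
&= \frac{\partial}{\partial x_i}\int v_i f \, d^3v && \text{because the } x_i \text{ and } v_i \text{ are independent} \\
&= \frac{\partial \left(n\langle v_i \rangle\right)}{\partial x_i} && \text{on substituting } n\langle v_i \rangle = \int v_i f \, d^3v ,
\end{aligned}$$
and
$$\begin{aligned}
\int\frac{\partial \Phi}{\partial x_i}\frac{\partial f}{\partial v_i}\, d^3v &= \frac{\partial \Phi}{\partial x_i}\int\frac{\partial f}{\partial v_i}\, d^3v && \text{because the } x_i \text{ and } \Phi \text{ are independent of the } v_i \\
&= \frac{\partial \Phi}{\partial x_i}\,(0) && \text{because } f \to 0 \text{ as } |v_i| \to \infty \text{ (by analogy with the divergence theorem)} \\
&= 0 .
\end{aligned}$$
Substituting for these terms,
$$\frac{\partial n}{\partial t} + \sum_{i=1}^{3}\frac{\partial \left(n\langle v_i \rangle\right)}{\partial x_i} = 0 , \tag{2.74}$$
which is a continuity equation. This is the first of the Jeans Equations.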
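To derive the second of the Jeans Equations, we consider the first moment of the collisionless Boltzmann equation by multiplying by $v_i$ and integrating over all velocities. Multiplying the C.B.E. throughout by $v_i$, we obtain,
$$v_i\frac{\partial f}{\partial t} + v_i\sum_{j=1}^{3} v_j\frac{\partial f}{\partial x_j} - v_i\sum_{j=1}^{3}\frac{\partial \Phi}{\partial x_j}\frac{\partial f}{\partial v_j} = 0 , \tag{2.75}$$
where the summation is performed over an integer $j$ because we have introduced a velocity component $v_i$. Note that the use of $v_i$ means that we are considering one particular velocity component only at this stage, i.e. one value of $i$ from $i = 1$ to $3$. Integrating this over all velocities,
$$\int\left[v_i\frac{\partial f}{\partial t} + \sum_{j=1}^{3} v_i v_j\frac{\partial f}{\partial x_j} - \sum_{j=1}^{3} v_i\frac{\partial \Phi}{\partial x_j}\frac{\partial f}{\partial v_j}\right] d^3v = \int 0 \, d^3v . \tag{2.76}$$
$$\therefore \quad \int v_i\frac{\partial f}{\partial t}\, d^3v + \sum_{j=1}^{3}\int v_i v_j\frac{\partial f}{\partial x_j}\, d^3v - \sum_{j=1}^{3}\int v_i\frac{\partial \Phi}{\partial x_j}\frac{\partial f}{\partial v_j}\, d^3v = 0 .$$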
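But
$$\begin{aligned}
\int v_i\frac{\partial f}{\partial t}\, d^3v &= \int\frac{\partial (v_i f)}{\partial t}\, d^3v && \text{because } v_i \text{ and } t \text{ are independent} \\
&= \frac{\partial}{\partial t}\int v_i f \, d^3v \\
&= \frac{\partial}{\partial t}\left(n\langle v_i \rangle\right) && \text{because } n\langle v_i \rangle = \int v_i f \, d^3v ,
\end{aligned}$$
and
$$\begin{aligned}
\int v_i v_j\frac{\partial f}{\partial x_j}\, d^3v &= \int\frac{\partial}{\partial x_j}\left(v_i v_j f\right) d^3v && \text{because } v_i \text{ and } v_j \text{ are independent of } x_j \\
&= \frac{\partial}{\partial x_j}\int v_i v_j f \, d^3v && \text{because the } x_i \text{ and } v_i \text{ are independent} \\
&= \frac{\partial \left(n\langle v_i v_j \rangle\right)}{\partial x_j} && \text{on substituting } n\langle v_i v_j \rangle = \int v_i v_j f \, d^3v ,
\end{aligned}$$
and
$$\int v_i\frac{\partial \Phi}{\partial x_j}\frac{\partial f}{\partial v_j}\, d^3v = \frac{\partial \Phi}{\partial x_j}\int v_i\frac{\partial f}{\partial v_j}\, d^3v \qquad \text{because the } x_j \text{ and } \Phi \text{ are independent of the } v_i .$$
But
$$\frac{\partial (v_i f)}{\partial v_j} = v_i\frac{\partial f}{\partial v_j} + f\frac{\partial v_i}{\partial v_j}
\qquad \therefore \quad v_i\frac{\partial f}{\partial v_j} = \frac{\partial (v_i f)}{\partial v_j} - f\frac{\partial v_i}{\partial v_j}$$
and
$$\frac{\partial v_i}{\partial v_j} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \text{, because } v_i \text{ and } v_j \text{ are independent if } i \neq j \end{cases}$$
$$\therefore \quad \frac{\partial v_i}{\partial v_j} = \delta_{ij}
\qquad \therefore \quad v_i\frac{\partial f}{\partial v_j} = \frac{\partial (v_i f)}{\partial v_j} - \delta_{ij} f ,$$
where $\delta_{ij}$ is the Kronecker delta.
So
$$\begin{aligned}
\int v_i\frac{\partial \Phi}{\partial x_j}\frac{\partial f}{\partial v_j}\, d^3v &= \frac{\partial \Phi}{\partial x_j}\int\left[\frac{\partial (v_i f)}{\partial v_j} - \delta_{ij} f\right] d^3v \\
&= \frac{\partial \Phi}{\partial x_j}\left[\int\frac{\partial (v_i f)}{\partial v_j}\, d^3v - \delta_{ij}\int f \, d^3v\right] \\
&= \frac{\partial \Phi}{\partial x_j}\left[0 - \delta_{ij}\, n\right] && \text{because } v_i f \to 0 \text{ as } |v_i| \to \infty \\
&= -\frac{\partial \Phi}{\partial x_j}\,\delta_{ij}\, n .
\end{aligned}$$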
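Substituting for these terms,
$$\frac{\partial \left(n\langle v_i \rangle\right)}{\partial t} + \sum_{j=1}^{3}\frac{\partial}{\partial x_j}\left(n\langle v_i v_j \rangle\right) - \sum_{j=1}^{3}\left(-\frac{\partial \Phi}{\partial x_j}\,\delta_{ij}\, n\right) = 0 .$$
So,
$$\frac{\partial \left(n\langle v_i \rangle\right)}{\partial t} + \sum_{j=1}^{3}\frac{\partial}{\partial x_j}\left(n\langle v_i v_j \rangle\right) = -n\frac{\partial \Phi}{\partial x_i} , \tag{2.77}$$
for each of $i = 1, 2, 3$. This is the second of the Jeans Equations.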
We need to introduce a tensor velocity dispersion $\sigma_{ij}$ defined so that
$$n\sigma_{ij}^2 \equiv \int \left(v_i - \langle v_i \rangle\right)\left(v_j - \langle v_j \rangle\right) f \, d^3v \tag{2.78}$$
for $i, j = 1$ to $3$ (see Equation 2.72 above). This is used to represent the spread of velocities in each direction. It is a symmetric tensor and we can choose some coordinate system in which it is diagonal (i.e. $\sigma_{11} \neq 0$, $\sigma_{22} \neq 0$, $\sigma_{33} \neq 0$, but all the other elements are zero). This is known as the velocity ellipsoid. For example, in a cylindrical coordinate system, we might use elements such as $\sigma_{RR}$, $\sigma_{\phi\phi}$ and $\sigma_{zz}$. If the velocity dispersion is isotropic, $\sigma_{11} = \sigma_{22} = \sigma_{33}$, which we might simplify by writing as $\sigma$ only.
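Rearranging Equation 2.78 and multiplying out,
$$\begin{aligned}
\sigma_{ij}^2 &= \frac{1}{n}\int \left(v_i - \langle v_i \rangle\right)\left(v_j - \langle v_j \rangle\right) f \, d^3v \\
&= \frac{1}{n}\int \left(v_i v_j - v_i\langle v_j \rangle - \langle v_i \rangle v_j + \langle v_i \rangle\langle v_j \rangle\right) f \, d^3v \\
&= \frac{1}{n}\int v_i v_j f \, d^3v - \frac{1}{n}\int v_i\langle v_j \rangle f \, d^3v - \frac{1}{n}\int \langle v_i \rangle v_j f \, d^3v + \frac{1}{n}\int \langle v_i \rangle\langle v_j \rangle f \, d^3v \\
&= \frac{1}{n}\int v_i v_j f \, d^3v - \langle v_j \rangle\frac{1}{n}\int v_i f \, d^3v - \langle v_i \rangle\frac{1}{n}\int v_j f \, d^3v + \langle v_i \rangle\langle v_j \rangle\frac{1}{n}\int f \, d^3v \\
&\qquad \text{because } \langle v_i \rangle \text{ and } \langle v_j \rangle \text{ are constants} \\
&= \langle v_i v_j \rangle - \langle v_j \rangle\langle v_i \rangle - \langle v_i \rangle\langle v_j \rangle + \langle v_i \rangle\langle v_j \rangle \qquad \text{from Equation 2.72.}
\end{aligned}$$
So,
$$\sigma_{ij}^2 = \langle v_i v_j \rangle - \langle v_i \rangle\langle v_j \rangle . \tag{2.79}$$
This can be used to find $\langle v_i v_j \rangle$ using
$$\langle v_i v_j \rangle = \sigma_{ij}^2 + \langle v_i \rangle\langle v_j \rangle .$$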
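The identity of Equation 2.79 is easy to confirm on a sample of stellar velocities, where the integrals over $f$ become averages over stars. The sketch below is illustrative only: the sample is a hypothetical one, drawn from Gaussian distributions with made-up mean velocities and dispersions, and the function name is not taken from any library. It computes the dispersion tensor both from the definition (Equation 2.78) and from $\langle v_i v_j \rangle - \langle v_i \rangle\langle v_j \rangle$.

```python
import random


def velocity_moments(velocities):
    """Return the mean velocity and the dispersion tensor computed in two ways.

    sigma2_direct uses the definition (Equation 2.78);
    sigma2_moments uses <v_i v_j> - <v_i><v_j> (Equation 2.79).
    """
    n = len(velocities)
    mean = [sum(v[i] for v in velocities) / n for i in range(3)]
    sigma2_direct = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in velocities) / n
                      for j in range(3)] for i in range(3)]
    second = [[sum(v[i] * v[j] for v in velocities) / n
               for j in range(3)] for i in range(3)]
    sigma2_moments = [[second[i][j] - mean[i] * mean[j]
                       for j in range(3)] for i in range(3)]
    return mean, sigma2_direct, sigma2_moments


random.seed(1)
# Hypothetical sample: Gaussian velocities with a mean drift and unequal dispersions.
sample = [(random.gauss(10.0, 30.0), random.gauss(200.0, 20.0), random.gauss(0.0, 15.0))
          for _ in range(5000)]
mean, direct, moments = velocity_moments(sample)
print(mean)
print(direct[0][0] ** 0.5, direct[1][1] ** 0.5, direct[2][2] ** 0.5)
```

The two tensors agree to rounding error, and the square roots of the diagonal elements recover dispersions close to the input values of the sample. The off-diagonal elements are close to zero here only because the three components were drawn independently.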
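Substituting for $\langle v_i v_j \rangle$ into the second of the Jeans Equations (Equation 2.77),
$$\frac{\partial \left(n\langle v_i \rangle\right)}{\partial t} + \sum_{j=1}^{3}\left[\frac{\partial}{\partial x_j}\left(n\sigma_{ij}^2\right) + \frac{\partial}{\partial x_j}\left(n\langle v_i \rangle\langle v_j \rangle\right)\right] = -n\frac{\partial \Phi}{\partial x_i} ,$$
for each of $i = 1, 2$ and $3$. Therefore,
$$\langle v_i \rangle\frac{\partial n}{\partial t} + n\frac{\partial \langle v_i \rangle}{\partial t} + \sum_{j=1}^{3}\frac{\partial}{\partial x_j}\left(n\sigma_{ij}^2\right) + \sum_{j=1}^{3}\frac{\partial}{\partial x_j}\left(n\langle v_i \rangle\langle v_j \rangle\right) = -n\frac{\partial \Phi}{\partial x_i} . \tag{2.80}$$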
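We can eliminate the 1st and 4th terms using the first of the Jeans Equations (Equation 2.74). Multiplying that equation throughout by $\langle v_i \rangle$,
$$\langle v_i \rangle\frac{\partial n}{\partial t} + \langle v_i \rangle\sum_{j=1}^{3}\frac{\partial}{\partial x_j}\left(n\langle v_j \rangle\right) = 0$$
$$\therefore \quad \langle v_i \rangle\frac{\partial n}{\partial t} + \sum_{j=1}^{3}\langle v_i \rangle\frac{\partial}{\partial x_j}\left(n\langle v_j \rangle\right) = 0 \tag{2.81}$$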
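But
$$\frac{\partial}{\partial x_j}\bigl( n \langle v_i\rangle \langle v_j\rangle \bigr) = \langle v_i\rangle \frac{\partial}{\partial x_j}\bigl( n \langle v_j\rangle \bigr) + n \langle v_j\rangle \frac{\partial \langle v_i\rangle}{\partial x_j}$$
Substituting for $\langle v_i\rangle \frac{\partial}{\partial x_j}( n \langle v_j\rangle )$,
$$\langle v_i\rangle \frac{\partial n}{\partial t} + \sum_{j=1}^{3} \left[ \frac{\partial}{\partial x_j}\bigl( n \langle v_i\rangle \langle v_j\rangle \bigr) - n \langle v_j\rangle \frac{\partial \langle v_i\rangle}{\partial x_j} \right] = 0$$
$$\therefore \quad \langle v_i\rangle \frac{\partial n}{\partial t} + \sum_{j=1}^{3} \frac{\partial}{\partial x_j}\bigl( n \langle v_i\rangle \langle v_j\rangle \bigr) = n \sum_{j=1}^{3} \langle v_j\rangle \frac{\partial \langle v_i\rangle}{\partial x_j}$$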
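Substituting this into Equation 2.80, we obtain,
$$n \frac{\partial \langle v_i\rangle}{\partial t} + n \sum_{j=1}^{3} \langle v_j\rangle \frac{\partial \langle v_i\rangle}{\partial x_j} = -n \frac{\partial \Phi}{\partial x_i} - \sum_{j=1}^{3} \frac{\partial}{\partial x_j}\bigl( n \sigma_{ij}^2 \bigr) , \qquad (2.82)$$
where i can be any of 1, 2 or 3. This is a third Jeans Equation.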
This can also be expressed as,
$$\frac{d\langle \mathbf{v}\rangle}{dt} = -\nabla \Phi - \frac{1}{n} \nabla \cdot \bigl( n \sigma^2 \bigr) , \qquad (2.83)$$
where $\langle \mathbf{v}\rangle$ is the mean velocity vector, t is the time, $\Phi$ is the potential, n is the number
density of stars and $\sigma^2$ represents the tensor $\sigma_{ij}^2$. Note that here d/dt is not $\partial/\partial t$, but
$$\frac{d\mathbf{v}}{dt} \left( \equiv \frac{D\mathbf{v}}{Dt} \right) = \frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} , \qquad (2.84)$$
which is sometimes called the convective derivative; it is also sometimes written as D/Dt to
emphasise that it is not simply $\partial/\partial t$.
This is similar to the Euler equation in fluid dynamics. An ordinary fluid has
$$\frac{d\langle \mathbf{v}\rangle}{dt} = -\nabla \Phi - \frac{\nabla p}{\rho} + \text{viscous terms} , \qquad (2.85)$$
where the pressure p arises because of the high rate of molecular encounters, which also
leads to the equation of state, and p is isotropic. In stellar dynamics, the stars behave
like a fluid in which $\nabla \cdot (n \sigma^2)$ plays the role of the pressure gradient, but it is anisotropic. Indeed, this
anisotropy is the reason that it is represented by a tensor, whereas in an ordinary fluid the
pressure is represented by a scalar. A related fact is that in the flow of an ordinary fluid
the particle paths and streamlines coincide, whereas stellar orbits and the streamlines of $\langle \mathbf{v}\rangle$
do not generally coincide.
The Jeans Equations have been represented here in terms of the number density n of
stars. However, it is possible to work with the mean mass density $\rho$ in space instead of
n. The Jeans Equations can be used for all stars in a galaxy, but sometimes they are used for
subpopulations in our Galaxy (e.g. G dwarfs, K giants). If they are used for subpopulations,
$\Phi$ remains the total gravitational potential of all matter (including dark matter), but the
velocities and number densities refer to the subpopulations.
2.18 The Jeans Equations in an Axisymmetric System, e.g.
the Galaxy
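Using cylindrical coordinates $(R, \phi, z)$ and assuming axisymmetry (so $\partial/\partial\phi = 0$), the second
Jeans Equation is
$$\frac{\partial}{\partial t}\bigl( n \langle v_R\rangle \bigr) + \frac{\partial}{\partial R}\bigl( n \langle v_R^2\rangle \bigr) + \frac{\partial}{\partial z}\bigl( n \langle v_R v_z\rangle \bigr) + \frac{n}{R} \Bigl( \langle v_R^2\rangle - \langle v_\phi^2\rangle \Bigr) = -n \frac{\partial \Phi}{\partial R}$$
for the R direction,
$$\frac{\partial}{\partial t}\bigl( n \langle v_\phi\rangle \bigr) + \frac{\partial}{\partial R}\bigl( n \langle v_R v_\phi\rangle \bigr) + \frac{\partial}{\partial z}\bigl( n \langle v_\phi v_z\rangle \bigr) + \frac{2n}{R} \langle v_R v_\phi\rangle = 0$$
for the $\phi$ direction, and
$$\frac{\partial}{\partial t}\bigl( n \langle v_z\rangle \bigr) + \frac{\partial}{\partial R}\bigl( n \langle v_R v_z\rangle \bigr) + \frac{\partial}{\partial z}\bigl( n \langle v_z^2\rangle \bigr) + \frac{n \langle v_R v_z\rangle}{R} = -n \frac{\partial \Phi}{\partial z}$$
for the z direction. (2.86)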
In a steady state, where the potential does not change with time, we can use $\partial/\partial t = 0$.
This axisymmetric form of the second Jeans Equation is useful in spiral galaxies, such as
our own Galaxy, provided that we neglect any change in the potential in the $\phi$ direction
(although there might be a $\phi$ dependence if the potential is deeper in the spiral arms).
Meanwhile, the first of the Jeans Equations in a cylindrical coordinate system with
axisymmetry ($\partial/\partial\phi = 0$) is,
$$\frac{\partial n}{\partial t} + \frac{1}{R} \frac{\partial}{\partial R}\bigl( R\, n \langle v_R\rangle \bigr) + \frac{\partial}{\partial z}\bigl( n \langle v_z\rangle \bigr) = 0 . \qquad (2.87)$$
2.19 The Jeans Equations in a Spherically Symmetric System
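The second Jeans Equation in a steady-state ($\partial/\partial t = 0$) spherically-symmetric ($\partial/\partial\theta = \partial/\partial\phi = 0$)
galaxy in a spherical polar coordinate system $(r, \theta, \phi)$ is
$$\frac{d}{dr}\bigl( n \langle v_r^2\rangle \bigr) + \frac{n}{r} \Bigl[ 2 \langle v_r^2\rangle - \langle v_\theta^2\rangle - \langle v_\phi^2\rangle \Bigr] = -n \frac{d\Phi}{dr} . \qquad (2.88)$$
This might be used, for example, for a spherical elliptical galaxy.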
We can calculate the gradient in the potential in this spherical case very simply. Using
the general result that the acceleration due to gravity is $\mathbf{g} = -\nabla \Phi$, that $g = -GM(r)/r^2$
in a spherically symmetric system where M(r) is the mass interior to the radius r, and that
$\nabla \Phi = d\Phi/dr$ in a spherical system, we get $d\Phi/dr = GM(r)/r^2$.
As a simple test to see whether this really does work, let us make a crude model of
our Galaxy's stellar halo. We shall assume that the halo is spherical, assume a logarithmic
potential of the form $\Phi(r) = v_0^2 \ln r$ where $v_0$ is a constant, assume there are isotropic
velocity components (i.e. $\langle v_r^2\rangle = \langle v_\theta^2\rangle = \langle v_\phi^2\rangle = \sigma^2$, where $\sigma$ is a constant), and assume that
the star number density of the halo can be approximated by $n(r) \propto r^{-l}$ where l is a constant.
Equation 2.88 becomes
$$\frac{d}{dr}\bigl( n \sigma^2 \bigr) + \frac{n}{r}\,( 0 ) = -n \frac{d\Phi}{dr} ,$$
on substituting for the velocity terms. Substituting for $\Phi(r)$ and $n(r)$ gives
$-l \sigma^2 n / r = -n v_0^2 / r$, and cancelling $n/r$ we
obtain $\sigma = v_0 / \sqrt{l}$. For the Milky Way's halo, observations show that $n \propto r^{-3.5}$ (i.e. l = 3.5),
while $v_0$ as measured from gas on circular orbits is 220 km s$^{-1}$, and rotation is negligible
(which is a requirement for $\langle v_\phi^2\rangle = \sigma^2$ etc.). So we expect $\sigma \simeq 220$ km s$^{-1}$ $/ \sqrt{3.5} \simeq 120$ km s$^{-1}$.
And it is.
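The arithmetic of this halo test can be checked with a short Python sketch. It evaluates $\sigma = v_0/\sqrt{l}$ for the values quoted above and confirms that this $\sigma$ makes both sides of the isotropic form of Equation 2.88 agree. The function names and the choice of test radius are illustrative assumptions, not part of the original notes.

```python
import math

def isotropic_halo_dispersion(v0_kms, l):
    """sigma = v0 / sqrt(l) for Phi = v0^2 ln r, n proportional to r^-l, isotropic velocities."""
    return v0_kms / math.sqrt(l)

def jeans_residual(sigma_kms, v0_kms, l, r):
    """Left side minus right side of d(n sigma^2)/dr = -n dPhi/dr with n = r^-l."""
    n = r ** (-l)
    lhs = sigma_kms ** 2 * (-l) * r ** (-l - 1.0)   # sigma^2 dn/dr
    rhs = -n * v0_kms ** 2 / r                      # -n dPhi/dr, with dPhi/dr = v0^2 / r
    return lhs - rhs

sigma = isotropic_halo_dispersion(220.0, 3.5)
print(round(sigma, 1))  # 117.6, i.e. roughly 120 km/s
residual = jeans_residual(sigma, 220.0, 3.5, r=10.0)  # r = 10 is an arbitrary test radius
```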
2.20 Example of the Use of the Jeans Equations: the Surface
Mass Density of the Galactic Disc
The Jeans Equations can be applied to our Galaxy to measure the surface mass density of
the Galactic disc at the solar distance from the centre using observations of the velocities
along the line of sight of stars lying some distance above or below the Galactic plane. The
surface mass density is the mass per unit area of the disc when viewed from a great
distance. It is expressed in units of kg m$^{-2}$, or more commonly solar masses per square
parsec ($M_\odot$ pc$^{-2}$). This analysis is important because it allows the quantity of dark matter
in the disc to be estimated. Determining whether there is dark matter in the Galactic disc
is a very important constraint on the nature of dark matter.
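The second Jeans Equation in a cylindrical coordinate system $(R, \phi, z)$ centred on the Galaxy,
with z = 0 in the plane and R = 0 at the Galactic Centre, states for the z direction that
$$\frac{\partial ( n \langle v_z\rangle )}{\partial t} + \frac{\partial ( n \langle v_R v_z\rangle )}{\partial R} + \frac{\partial ( n \langle v_z^2\rangle )}{\partial z} + \frac{n \langle v_R v_z\rangle}{R} = -n \frac{\partial \Phi}{\partial z}$$
(Equation 2.86), where n is the star number density, $v_R$ and $v_z$ are the velocity components
in the R and z directions, $\Phi(R, z, t)$ is the Galactic gravitational potential and t is time.
The Galaxy is in a steady state, so n does not change with time. Therefore the first term
$\partial ( n \langle v_z\rangle ) / \partial t = 0$.
Observations show that
$$\frac{\partial ( n \langle v_R v_z\rangle )}{\partial R} \simeq 0 \quad \text{and} \quad \frac{n \langle v_R v_z\rangle}{R} \simeq 0 ,$$
as is to be expected because of the cancelling of positive and negative terms of the z-
components of the velocity. Therefore,
$$\frac{\partial ( n \langle v_z^2\rangle )}{\partial z} = -n \frac{\partial \Phi}{\partial z} . \qquad (2.89)$$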
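$\langle v_z^2\rangle$ is the mean square velocity in the direction perpendicular to the Galactic plane.
Poisson's equation gives $\nabla^2 \Phi = 4\pi G \rho$, where $\rho$ is the mass density at a point. In cylindrical
coordinates the Laplacian is
$$\nabla^2 \Phi = \frac{1}{R} \frac{\partial}{\partial R} \left( R \frac{\partial \Phi}{\partial R} \right) + \frac{1}{R^2} \frac{\partial^2 \Phi}{\partial \phi^2} + \frac{\partial^2 \Phi}{\partial z^2} .$$
If we observe stars above and below the Galactic plane, all at the same Galactocentric radius
R, we can neglect the $\partial \Phi/\partial R$ and $\partial^2 \Phi/\partial \phi^2$ terms. Then
$$\frac{\partial^2 \Phi}{\partial z^2} = 4\pi G \rho ,$$
and so
$$\frac{\partial}{\partial z} \left[ -\frac{1}{n} \frac{\partial}{\partial z}\bigl( n \langle v_z^2\rangle \bigr) \right] = 4\pi G \rho .$$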
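Integrating perpendicular to the Galactic plane from $-z$ to $z$, the surface mass density within
a distance z of the plane at a Galactocentric radius R is
$$\begin{aligned}
\Sigma(R, z) &= \int_{-z}^{z} \rho \, dz' \\
&= \int_{-z}^{z} \frac{1}{4\pi G} \frac{\partial}{\partial z'} \left[ -\frac{1}{n} \frac{\partial}{\partial z'}\bigl( n \langle v_z^2\rangle \bigr) \right] dz' \\
&= -\frac{1}{4\pi G} \left[ \frac{1}{n} \frac{\partial}{\partial z'}\bigl( n \langle v_z^2\rangle \bigr) \right]_{z'=-z}^{z} \\
&= -\frac{1}{2\pi G n} \left. \frac{\partial}{\partial z}\bigl( n \langle v_z^2\rangle \bigr) \right|_{z} ,
\end{aligned}$$
assuming symmetry about z = 0. Therefore the surface mass density within a distance z of
the plane at the solar Galactocentric radius $R_0$ is
$$\Sigma(R_0, z) = -\frac{1}{2\pi G n} \left. \frac{\partial}{\partial z}\bigl( n \langle v_z^2\rangle \bigr) \right|_{z} . \qquad (2.90)$$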
If the star densities n can be measured as a function of height z from the plane and if the
z-component of the velocities $v_z$ can be measured as spectroscopic radial velocities, we can
solve for $\Sigma(R_0, z)$ as a function of z. This gives, after modelling the contribution from the
dark matter halo, the surface mass density of the Galactic disc.
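To see how Equation 2.90 turns observables into a surface density, here is a minimal Python sketch that evaluates it with a central finite difference. The tracer model is an assumption made only for illustration: an exponential density profile with scale height h = 300 pc and a constant $\langle v_z^2\rangle = (20\ \mathrm{km\,s^{-1}})^2$, for which Equation 2.90 reduces analytically to $\Sigma = \sigma_z^2 / (2\pi G h)$. Neither these numbers nor the function names come from the notes.

```python
import math

G = 4.30091e-3  # gravitational constant in pc (km/s)^2 / M_sun

def surface_density(n, vz2, z, dz=1.0):
    """Equation 2.90 by central differences.

    n(z): tracer number density (any normalisation), vz2(z): <v_z^2> in (km/s)^2,
    z and dz in pc. Returns the surface density in M_sun / pc^2.
    """
    d_nvz2 = (n(z + dz) * vz2(z + dz) - n(z - dz) * vz2(z - dz)) / (2.0 * dz)
    return -d_nvz2 / (2.0 * math.pi * G * n(z))

# Hypothetical isothermal, exponential tracer population (illustrative values only).
h = 300.0        # scale height in pc
sigma_z = 20.0   # vertical velocity dispersion in km/s

def n_tracer(z):
    return math.exp(-abs(z) / h)

def vz2_tracer(z):
    return sigma_z ** 2

sigma_num = surface_density(n_tracer, vz2_tracer, 500.0)
sigma_exact = sigma_z ** 2 / (2.0 * math.pi * G * h)
print(f"numerical: {sigma_num:.2f}  analytic: {sigma_exact:.2f}  (M_sun / pc^2)")
```

With real data the derivative is taken of noisy measured quantities, so in practice n and $\langle v_z^2\rangle$ would be smoothed or fitted with a model before differentiating.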
The analysis can be performed on some subclass of stars, such as G giants or K giants.
In this case the number density n of stars in space is that of the subclass. Number counts
of stars towards the Galactic poles, combined with estimates of the distances to individual
stars, give n. Spectroscopic observations give radial velocities (the velocity components along
the line of sight) through the Doppler effect. By observing towards the Galactic poles, the
radial velocities are the same as the $v_z$ components.
This analysis gives $\Sigma(R_0, z)$ as a function of z. The value increases with z as a greater
proportion of the stars of the disc are included, until all the disc matter is included. $\Sigma(R_0, z)$
will still increase slowly with z beyond this as an increasing amount of mass from the dark
matter halo is included. Indeed, it is necessary to determine the contribution $\Sigma_d(R_0)$ from the
disc alone to the observed data. An additional complication is that in measuring $\partial ( n \langle v_z^2\rangle ) / \partial z$
as a function of z, we are dealing with the differential of observed quantities. This means
that the effects of observational errors can be considerable.
The first measurement of the surface density of the Galactic disc was carried out by
Oort in 1932. More modern attempts were carried out in the 1980s by Bahcall and by
Kuijken and Gilmore. There has been considerable debate about the interpretation of results.
Early studies claimed evidence of dark matter in the Galactic disc, but more recently some
consensus has developed that there is little dark matter in the disc itself, apart from the
contribution from the dark matter halo that extends into the disc. A modern value is
$\Sigma_d(R_0) = 50 \pm 10\ M_\odot$ pc$^{-2}$. The absence of significant dark matter in the disc indicates that
dark matter does not follow baryonic matter closely on a small scale.
2.21 N-body Simulations
An alternative approach that can be adopted to study the dynamics of stars in galaxies
is to use N-body simulations. In these analyses, the system of stars is represented by a
large number of particles and computer modelling is used to trace the dynamics of these
particles under their mutual gravitational attractions. These simulations usually determine
the positions of the test particles at each of a series of time steps, calculating the changes
in their positions between each step. It is possible to add further particles to trace gas and
dark matter, although the gas must be made collisional.
The individual particles in a galaxy simulation, however, do not correspond to stars. It is
impossible to represent every star in a galaxy in N-body simulations. In practice, the limits
on computational power allow only $10^5$ to $10^8$ particles, whereas there may be as many
as $10^{12}$ stars in the galaxies being modelled. The appropriate interpretation of simulation
particles is as Monte-Carlo samplers of the distribution function f.
In Section 2.6 we found that the ratio of the relaxation time to the crossing time was
$T_{relax}/T_{cross} \simeq N / (12 \ln N)$ for a system of N particles. It follows that a system that is
modelled computationally by too few particles will have a relaxation time that is too short,
and may experience the effects of two-body encounters. As a consequence, the particles in
N-body simulations have to be made collisionless artificially. The standard way of doing this
is to replace the 1/r gravitational potential by $(r^2 + a^2)^{-1/2}$, which amounts to smearing out
the mass on the softening length scale a.
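The two ideas in this paragraph can be put into numbers with a small Python sketch. It evaluates the ratio $N/(12 \ln N)$ for particle numbers typical of simulations and of real galaxies, and compares a softened potential with the pure 1/r form. The function names, the unit choice GM = 1 and the softening length a = 0.1 are illustrative assumptions.

```python
import math

def relax_to_cross_ratio(n_particles):
    """T_relax / T_cross ~ N / (12 ln N), as quoted from Section 2.6."""
    return n_particles / (12.0 * math.log(n_particles))

def softened_potential(r, a, gm=1.0):
    """Softened point-mass potential -GM / sqrt(r^2 + a^2); a = 0 recovers -GM / r."""
    return -gm / math.sqrt(r * r + a * a)

for n in (1e5, 1e8, 1e12):
    print(f"N = {n:.0e}: T_relax/T_cross ~ {relax_to_cross_ratio(n):.2e}")

# The softened potential stays finite at r = 0 and tends to -GM/r for r >> a.
a = 0.1  # assumed softening length, in the same arbitrary length units as r
for r in (0.0, 0.1, 1.0, 100.0):
    print(f"r = {r:6.1f}: softened potential = {softened_potential(r, a):.6f}")
```

The ratio for a simulation with $10^5$ particles is smaller by many orders of magnitude than for a galaxy of $10^{12}$ stars, which is why two-body relaxation has to be suppressed artificially.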
Early N-body computer codes performed calculations for each time step that took a time
that depended on the number of particles as $N^2$. Modern codes perform faster computations
by treating distant particles differently to nearby particles. Tree codes combine the effects
of a number of distant particles together. This increases their efficiency and the computation
times scale as only $N \ln N$.
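The $N^2$ scaling of the early codes comes from summing the force over every pair of particles. The Python sketch below shows such a direct summation with the softened force law, and counts the pairs it visits. It is a minimal illustration under assumed units (G = 1) and with made-up particle data; it is not a tree code and is not taken from any particular N-body package.

```python
import math

def direct_accelerations(positions, masses, a, g=1.0):
    """Softened gravitational accelerations by direct summation over all pairs.

    positions: list of (x, y, z); masses: list of floats; a: softening length.
    Returns (accelerations, number_of_pairs). The pair loop runs N(N-1)/2 times.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    pair_count = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + a * a
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += g * masses[j] * dx[k] * inv_r3   # pull of j on i
                acc[j][k] -= g * masses[i] * dx[k] * inv_r3   # equal and opposite pull
            pair_count += 1
    return acc, pair_count

# Hypothetical four-particle system in arbitrary units.
pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
mass = [1.0, 2.0, 0.5, 1.5]
acc, pairs = direct_accelerations(pos, mass, a=0.05)

# Newton's third law: the mass-weighted accelerations should sum to zero.
net_force = [sum(mass[i] * acc[i][k] for i in range(len(pos))) for k in range(3)]
print("pairs evaluated:", pairs, " net force:", net_force)
```

Doubling N roughly quadruples the number of pairs, which is the cost that tree codes avoid by grouping distant particles.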
N-body simulations are widely used now to study the evolution of galaxies, and an
active research area at present is to incorporate gas dynamics in them. In contrast to
standard N-body methods, smoothed particle hydrodynamics (SPH) is often used to study
the gas in galaxies. Modern simulations include the effects of dark matter alongside stars
and gas. They can follow the collapse of clumps of dark matter in the early Universe that
lead to the formation of galaxies. Simulations can also follow the growth of structure in the
Universe as gravitational attraction produced the clustering of galaxies observed today.
N-body simulations can model the effects of large changes in gravitational potentials, whereas
analytical methods can find these more challenging.
Chapter 3
The Interstellar Medium
3.1 An Introduction to the Interstellar Medium
The interstellar medium (ISM) of a galaxy consists of the gas and dust distributed between
the stars. The mass in the gas is much larger than that in dust, with the mass of dust in
the disc of our Galaxy $\simeq 0.1$ of the mass of gas. Generally, the interstellar medium amounts
to only a small fraction of a galaxy's luminous mass, but this fraction is strongly correlated
with the galaxy's morphological type. This fraction is $\simeq 0$ % for an elliptical galaxy, 1 to 25 %
for a spiral (the figure varies smoothly from type Sa to Sd), and 15 to 50 % for an irregular
galaxy.
The interstellar gas is very diffuse: in the plane of our Galaxy, where the Galactic gas
is at its densest, the particle number density is $10^3$ to $10^9$ atomic nuclei m$^{-3}$. Some of
this gas is in the form of single neutral atoms, some is in the form of simple molecules, some
exists as ions. Whether gas is found as atoms, molecules or ions depends on its temperature,
density and the presence of radiation fields, primarily the presence of ultraviolet radiation
from nearby stars. Note that the density of the gas is usually expressed as the number of
atoms, ions and molecules per unit volume; here they will be expressed in S.I. units of m$^{-3}$,
but textbooks, reviews and research papers still often use cm$^{-3}$ (with 1 cm$^{-3}$ $\equiv 10^6$ m$^{-3}$, of
course).
The interstellar medium in a galaxy is a mixture of gas remaining from the formation
of the galaxy, gas ejected by stars, and gas accreted from outside (such as infalling diffuse
gas or the interstellar medium of other galaxies that have been accreted). The ISM is very
important to the evolution of a galaxy, primarily because it forms stars in denser regions. It
is important observationally. It enables us, for example, to observe the dynamics of the gas,
such as rotation curves, because spectroscopic emission lines from the gas are prominent.
The chemical composition is about 90 % hydrogen, 9% helium plus a trace of heavy
elements (expressed by numbers of nuclei). The heavy elements in the gas can be depleted
into dust grains. Dust consists preferentially of particles of heavy elements.
Individual clouds of gas and dust are given the generic term nebulae. However, the
interstellar gas is found to have a diverse range of physical conditions, having very different
temperatures, densities and ionisation states. There are several distinct types of nebula, as
is described below.
3.2 Background to the Spectroscopy of Interstellar Gas
The gas in the interstellar medium readily emits detectable radiation and can be studied
relatively easily. The gas is of very low density compared to conditions on the Earth, even
compared to many vacuums created in laboratories. Therefore spectral lines are observed
from the interstellar gas that are not normally observed in the laboratory. These are referred
to as forbidden lines, whereas those that are readily observed under laboratory conditions
are called permitted lines. Under laboratory conditions, spectral lines with low transition
probabilities are forbidden because the excited states get collisionally de-excited before
they can radiate. In the ISM, collisional times are typically much longer than the lifetimes
of those excited states that only have forbidden transitions. So forbidden lines are observable
from the ISM, and in fact they can dominate the spectrum.
Astronomy uses a particular notation to denote the atoms and ions that is seldom
encountered in other sciences. Atoms and ions are written with symbols such as HI, HII, HeI
and HeII. In this notation, I denotes a neutral atom, II denotes a singly ionised positively
charged ion, III denotes a doubly ionised positive ion, etc. So, HI is H$^0$, HII is H$^+$, HeI is
He$^0$, HeII is He$^+$, HeIII is He$^{2+}$, Li I is Li$^0$, etc. A negatively charged ion, such as H$^-$, is
indicated only as H$^-$, although few of these are encountered in astrophysics. Square brackets
around the species responsible for a spectral line indicate a forbidden line, for example [OII].
3.3 Cold Gas: the 21cm Line of Neutral Hydrogen
Cold gas emits only in the radio and the microwave region, because collisions between atoms
and the radiation field (e.g. from stars) are too weak to excite the electrons to energy levels
that can produce optical emission.
The most important ISM line from cold gas is the 21 cm line of atomic hydrogen (HI). It
comes from the hyperfine splitting of the ground state of the hydrogen atom (split because
of the coupling of the nuclear and electron spins). The energy difference between the two
states is $\Delta E = 9.4 \times 10^{-25}$ J $= 5.9 \times 10^{-6}$ eV. This produces emission with a rest wavelength
$\lambda_0 = hc/\Delta E = 21.1061$ cm and a rest frequency $\nu_0 = \Delta E/h = 1420.41$ MHz. In this process,
hydrogen atoms are excited into the upper state through collisions (collisional excitation),
but these collisions are rare in the low densities encountered in the cold interstellar medium.
The transition probability is $A = 2.87 \times 10^{-15}$ s$^{-1}$. The lifetime of an excited state is
1/A = 11 million years. This is still much shorter than times between collisions, which
allows de-excitation to occur through spontaneous emission rather than through collisional
de-excitation.
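The numbers quoted for the 21 cm line are mutually consistent, as the following short Python check shows. It starts from the rest frequency and the transition probability given above and derives the energy difference, the wavelength and the lifetime. The physical constants are standard values that are not stated in these notes, and the variable names are illustrative.

```python
H_PLANCK = 6.62607015e-34    # Planck constant, J s
C_LIGHT = 2.99792458e8       # speed of light, m / s
EV = 1.602176634e-19         # one electronvolt in joules
YEAR = 3.15576e7             # one Julian year in seconds

nu0 = 1420.41e6              # rest frequency of the HI line in Hz (from the text)
a_coeff = 2.87e-15           # transition probability in s^-1 (from the text)

delta_e_joule = H_PLANCK * nu0
delta_e_ev = delta_e_joule / EV
lambda0_cm = 100.0 * C_LIGHT / nu0
lifetime_myr = 1.0 / a_coeff / YEAR / 1.0e6

print(f"Delta E  = {delta_e_joule:.2e} J = {delta_e_ev:.2e} eV")
print(f"lambda_0 = {lambda0_cm:.3f} cm")
print(f"lifetime = {lifetime_myr:.1f} million years")
```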
The 21cm spin flip transition itself cannot be observed in a laboratory because of the
extremely low transition probabilities, but the split ground state shows up in the laboratory
through the hyperfine splitting of the Lyman lines in the ultraviolet. In the ISM, the
21cm line is observed primarily in emission, but can also be observed in absorption against
a background radio continuum source.
HI observations have many uses. One critically important application is to measure the
orbital motions of gas to determine rotation curves in our own Galaxy and in other galaxies.
HI observations can map the distribution of gas in and around galaxies.
3.4 Cold Gas: Molecules
Molecular hydrogen (H$_2$) is very difficult to detect directly. It has no radio lines, which
is unfortunate, since it prevents the coldest and densest parts of the ISM being observed
directly. There is H$_2$ band absorption in the far ultraviolet, but this can only be observed
from above the Earth's atmosphere.
What saves the situation somewhat is that other molecules do emit in the radio/microwave
region. Molecular energy transitions can be due to changes in the electron energy levels, and
also to changes in vibrational and rotational energies of the molecules. All three types of
energy are quantised. Transitions between the electron states are in general the most en-
ergetic, and can produce spectral lines in the optical, ultraviolet and infrared. Transitions
between the vibrational states can produce lines in the infrared. Transitions between the
rotational states are in general the least energetic and produce lines in the radio/microwave
region.
CO has strong lines at 1.3mm and 2.6mm from transitions between rotational states. CO
is particularly useful as a tracer of H$_2$ molecules on the assumption that the densities of the
two are proportional. Mapping CO density is therefore used to determine the distribution
of cold gas in the ISM.
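The two wavelengths quoted for CO correspond to its two lowest rotational transitions. The Python sketch below converts their rest frequencies to wavelengths. The frequencies (about 115.271 GHz for J = 1 to 0 and 230.538 GHz for J = 2 to 1) are standard laboratory values that are not given in these notes, so treat them as an added assumption.

```python
C_LIGHT = 2.99792458e8  # speed of light, m / s

# Rest frequencies of the two lowest rotational transitions of CO, in GHz
# (standard values, not quoted in the notes).
CO_LINES_GHZ = {"J = 1 -> 0": 115.271, "J = 2 -> 1": 230.538}

def wavelength_mm(frequency_ghz):
    """Wavelength in millimetres for a frequency given in GHz."""
    return 1.0e3 * C_LIGHT / (frequency_ghz * 1.0e9)

co_wavelengths = {name: wavelength_mm(nu) for name, nu in CO_LINES_GHZ.items()}
for name, lam in co_wavelengths.items():
    print(f"CO {name}: {lam:.2f} mm")
```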
Cold, dense molecular gas is concentrated into clouds. These are relatively small in size
($\simeq$ 2 to 40 pc across). Temperatures are $T \simeq 10$ K and densities $10^8$ to $10^{11}$ m$^{-3}$. Molecular
clouds fill only a very small volume of the ISM but contain substantial mass. Regions of gas
in molecular clouds can collapse under their own gravitation to form stars. The
newly formed hot stars in turn irradiate the gas with ultraviolet light, ionising and heating
the gas.
Giant molecular clouds are larger regions of cold, molecular gas. Their masses are large
enough (up to $10^6\ M_\odot$) that they can have perturbing gravitational effects on stars in the
disc of the Galaxy. Within our Galaxy they are mostly found in the spiral arms.
3.5 Hot Gas: HII Regions
Hot gas is readily observed in the optical region of the spectrum. Gas that is largely ionised
produces emission lines from electronic transitions as ions and some atoms revert to lower
energy states. The gas is therefore observed as nebulae.
An important kind of object is HII regions, which are regions of partially ionised hydrogen
surrounding very hot young stars of O or B spectral types. Hot stars produce a large
flux of ultraviolet photons, and any Lyman continuum photons (i.e., photons with wavelengths
$\lambda < 912$ Å) will photoionise hydrogen producing a region of H$^+$, i.e. HII ions. These
wavelengths correspond to energies > 13.6 eV, the ionisation energy from the ground state
of hydrogen. The ionised hydrogen then recombines with electrons to form neutral atoms.
But the hydrogen does not have to recombine into the ground state; it can recombine into
an excited state and then radiatively decay after that. Electrons will cascade down energy
levels, emitting photons as they do, in a process known as radiative decay.
This process produces a huge variety of observable emission lines and continua in the
ultraviolet, optical, infrared and radio parts of the spectrum. Free-bound transitions
(involving free electrons combining with ions to become bound in atoms) produce continuum
radiation. Bound-bound transitions (involving electrons inside an atom moving to a different
energy level in the same atom) produce spectral lines.
[Figure: The optical spectrum of the Orion Nebula. The spectrum shows very strong
emission lines from species such as HI, [OII] and [OIII].]
Transitions in hydrogen atoms down to the first excited level (n = 2) produce Balmer
lines, which lie in the optical. These are prominent in nebulae. Transitions down to the
ground (n = 1) state produce Lyman lines in the ultraviolet. In each series, the individual
lines are labelled $\alpha$, $\beta$, $\gamma$, $\delta$, ..., in order of decreasing wavelength. The transitions from n to
n − 1 levels are the strongest. Therefore the $\alpha$ line of any series is the strongest.

Lyman series lines of hydrogen           Balmer series lines of hydrogen
(in ultraviolet)                         (in optical)
Ly$\alpha$   1216 Å                      H$\alpha$     6563 Å
Ly$\beta$    1026 Å                      H$\beta$      4861 Å
Ly$\gamma$    973 Å                      H$\gamma$     4340 Å
                                         H$\delta$     4102 Å
                                         H$\epsilon$   3970 Å
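The wavelengths in this table follow from the Rydberg formula, $1/\lambda = R_H (1/n_1^2 - 1/n_2^2)$, which is not written out in these notes. The Python sketch below applies it with an assumed value of the hydrogen Rydberg constant. The formula gives vacuum wavelengths, so the optical Balmer lines come out 1 to 2 Å longer than the air wavelengths conventionally quoted in the table.

```python
R_H = 1.0967758e7  # Rydberg constant for hydrogen in m^-1 (standard value, assumed here)

def hydrogen_wavelength_angstrom(n_lower, n_upper):
    """Vacuum wavelength in Angstroms of the transition n_upper -> n_lower."""
    inverse_wavelength = R_H * (1.0 / n_lower ** 2 - 1.0 / n_upper ** 2)
    return 1.0e10 / inverse_wavelength

def series_limit_angstrom(n_lower):
    """Shortest wavelength of a series (n_upper -> infinity), in Angstroms."""
    return 1.0e10 * n_lower ** 2 / R_H

greek = ["alpha", "beta", "gamma", "delta", "epsilon"]
for name, n_lower, count in (("Ly", 1, 3), ("H", 2, 5)):
    for k in range(count):
        lam = hydrogen_wavelength_angstrom(n_lower, n_lower + 1 + k)
        print(f"{name} {greek[k]:8s} {lam:8.1f} A")

print(f"Lyman limit: {series_limit_angstrom(1):.1f} A")
```

The Lyman limit returned by the last line corresponds to the 912 Å threshold for photoionising hydrogen from the ground state.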
Atoms in HII regions can also be collisionally excited. Atomic hydrogen has no levels
accessible at collision energies characteristic of HII regions ($T \simeq 10^4$ K) but NII, OII, SII,
OIII, NeIII all do. The [OIII] lines at 4959 Å and 5007 Å are particularly prominent. Some
of the most prominent lines in the optical spectra of HII regions, other than the hydrogen
lines listed above, are:

[OII]    3727 Å          HeI     5876 Å
[NeIII]  3869 Å          [NII]   6548 Å
[OIII]   4959 Å          [NII]   6584 Å
[OIII]   5007 Å
Colour optical images of HII regions of the kind used in popular astronomy books show
strong red and green colours: the red is produced mainly by the H$\alpha$ line, while the green
is produced by [OIII] and H$\beta$. HII regions are seen prominently in images of spiral and
irregular galaxies. Their emission lines dominate the spectra of late-type galaxies and are
valuable for use in measuring redshifts.
The photoionisation and recombination process in HII regions and planetary nebulae
produces, by a convenient accident, one Balmer photon for each Lyman continuum photon
from the hot star, so the ultraviolet flux from the star can be measured by observing an
optical spectrum of the HII region surrounding the hot star. The reason is basically that the
gas is opaque to Lyman photons and transparent to other photons, since almost all the H
atoms are in the ground state. A Lyman continuum photon initially from the star will get
absorbed by a hydrogen atom, producing a free electron. This electron will then be captured
into some bound state. If it gets captured to the ground state we are back where we started
(with a ground state atom and a Lyman continuum photon), so consider the case where the
electron is captured to some n > 1 state. Such a capture releases a free-bound continuum
photon which then escapes, and leaves an excited state which wants to decay to n = 1. If it
decays to n = 1 bypassing n = 2, it will just produce a Lyman photon which will almost
certainly get absorbed again. Only if it decays to some n > 1 will a photon escape. In other
words, if the decay bypasses n = 2 it almost always gets another chance to decay to n = 2
and produce a Balmer photon that escapes. The Ly$\alpha$ photons produced by the final decay
from n = 2 to n = 1 random-walk through the gas as they get absorbed and re-emitted again
and again. The total Balmer photon flux thus equals the Lyman continuum photon flux.
One can then place the source star in an optical-ultraviolet colour-magnitude diagram, and
determine a colour temperature which is called the Zanstra temperature in this context.
HII regions and planetary nebulae also produce thermal continuum radiation. The
process that produces this is free-free emission: free electrons in the HII can interact with
protons without recombination, and the acceleration of the electrons in this process
produces radiation. (Electrons can interact with other electrons in similar fashion as well, but
this produces no radiation because the net electric dipole moment does not change.) The
resulting spectrum is not blackbody because the gas is transparent to free-free photons: there
is no redistribution of the energy of the free-free photons. In fact the spectrum is quite
flat at radio frequencies; this is the same thing as saying that the time scale for free-free
encounters is $\ll 1/\nu$ for radio frequency $\nu$.
Detailed analyses of the relative strengths of the emission lines from HII regions can
provide measurements of the temperatures, densities and chemical composition of the inter-
stellar gas. This is possible for nebulae in our Galaxy and for emission lines in other galaxies.
Measuring the chemical compositions of gas in galaxies is important in understanding how
the abundances of chemical elements vary from one place in the Universe to another.
3.6 Hot Gas: Planetary Nebulae
A planetary nebula is like a compact HII region, except that it surrounds the exposed core
of a hot, highly evolved star rather than a hot young star. The gas is ejected from the star
through mass loss over time. Ultraviolet photons from the star ionise the gas in a manner
similar to HII regions, and the gas emits photons like a HII region. Emission processes are
similar to HII regions, but the density, temperature and ionisation state of the gas around a
planetary nebula can be somewhat different to a HII region.
Planetary nebulae are relatively luminous and have prominent emission lines. As such
they can be observed in other galaxies and can be detected at greater distances than many
individual ordinary stars. They can be used to trace the distribution and kinematics of stars
in other galaxies.
3.7 Hot Gas: Supernova Remnants
Supernovae eject material at very high velocities into the interstellar medium. This gas
shocks, heats and disrupts the ISM. Low density components of the ISM can be significantly
affected, but dense molecular clouds are less strongly affected. Hot gas from supernovae can
even be ejected out of the Galactic disc into the halo of the Galaxy. Supernova remnants
have strong line emission. They expand into and mix with the ISM.
3.8 Hot Gas: Masers
In the highest density HII regions ($\sim 10^{14}$ m$^{-3}$), either very near a young star, or in a
planetary-nebula-like system near an evolved star, population inversion between certain
states becomes possible. The overpopulated excited state then decays by stimulated
emission, i.e., it becomes a maser. An artificial maser or laser uses a cavity with reflecting walls to
mimic an enormous system, but in an astrophysical maser the enormous system is available
for free; so an astrophysical maser is not directed perpendicular to some mirrors but shines
in all directions. But as in an artificial maser, the emission is coherent (hence polarised),
with very narrow lines and high intensity. Masers from OH and H$_2$O are known. Their high
intensity and relatively small size make masers very useful as kinematic tracers.
3.9 Hot Gas: Synchrotron Radiation
Finally, we'll just briefly mention synchrotron radiation. It is a broad-band non-thermal
radiation emitted by electrons gyrating relativistically in a magnetic field, and can be observed
in both optical and radio. The photons are emitted in the instantaneous direction of electron
motion and polarised perpendicular to the magnetic field. The really spectacular sources of
synchrotron emission are systems with jets (young stellar objects with bipolar outflows, or
active galactic nuclei). It is synchrotron emission that lights up the great lobes of radio
galaxies.
3.10 Absorption-Line Spectra from the Interstellar Gas
If interstellar gas is seen in front of a continuum background light source, light from the
source is found to be absorbed at particular wavelengths. A number of interstellar lines and
molecular bands are seen in absorption. This process requires a relatively bright background
source in practice.
The molecular absorption can be very complex. Electron transitions in the molecules do
not produce single strong spectral lines, but a set of bands. This is because of the effect of
the various vibrational and rotational energy states of the molecules in the gas, combined
with the stronger electronic transitions.
Some of the interstellar absorption features are not well understood. One particular
problem is the diffuse interstellar bands in the infrared. These are probably caused by
carbon molecules, possibly by polycyclic aromatic hydrocarbons (PAHs).
An interesting example of the importance of interstellar absorption lines concerns
measurements of the temperatures of cold interstellar CN molecules. Like most heteronuclear
molecules, CN has rotational modes which produce radio lines. The radio lines can be
observed directly, but more interesting are the optical lines that have been split because of
these rotational modes. Observations of cold CN against background stars reveal, through
the relative widths of the split optical lines, the relative populations of the rotational modes,
and hence the temperature of the CN. The temperature turns out to be 2.7 K, i.e., these cold
clouds are in thermal equilibrium with the microwave background. The temperature of
interstellar space was first estimated to be $\simeq 3$ K in 1941, well before the Big Bang predictions
of 1948 and later, but nobody made the connection at the time.
3.11 The Components of the Interstellar Medium
It is sometimes convenient to divide the diffuse gas in the interstellar medium into distinct
components, also called phases:
• the cold neutral medium consisting of neutral hydrogen (HI) and molecules at
  temperatures $T \sim 10 - 100$ K and relatively high densities;
• the warm neutral medium consisting of HI but at temperatures $T \sim 10^3 - 10^4$ K of
  lower densities;
• the warm ionised medium consisting of ionised gas (HII) at temperatures $T \sim 10^4$ K
  of lower densities;
• the hot ionised medium consisting of ionised gas (HII) at very high temperatures
  $T \sim 10^5 - 10^6$ K but very low densities.
These phases are pressure-confined and are stable in the long term. Ionisation by supernova
remnants is an important mechanism in producing the hot ionised medium. The cold neutral
medium makes up a significant fraction ($\simeq 50$ %) of the ISM's mass, but occupies only a very
small fraction by volume. Some other individual structures in the ISM, such as supernova
remnants, planetary nebulae, HII regions and giant molecular clouds, are not included in
these phases because they are not in pressure equilibrium with these phases.
3.12 Interstellar Dust
Interstellar dust consists of particles of silicates or carbon compounds. They are relatively
small, but have a broad range in size. The largest are $\simeq 0.5\ \mu$m with $\sim 10^4$ atoms, but some
appear to have $\sim 10^2$ atoms and thus are not significantly different from large molecules.
Dust has a profound observational effect: it absorbs and scatters light. Dust diminishes
the light of background sources, a process known as interstellar extinction. Examples of this
are dark nebulae and the zone of avoidance for galaxies at low galactic latitudes.
Consider light of wavelength $\lambda$ with a specific intensity $I_\lambda$ passing through interstellar
space. (Specific intensity of light here means the energy transmitted in a direction per unit
time through unit area perpendicular to that direction into unit solid angle per unit interval
in wavelength.) If the light passes through an element of length in the interstellar medium,
it will experience a change $dI_\lambda$ in the intensity $I_\lambda$ due to absorption and scattering by dust.
This is related to the change $d\tau_\lambda$ in the optical depth $\tau_\lambda$ at the wavelength $\lambda$ that the light
experiences along its journey by
\[ \frac{dI_\lambda}{I_\lambda} = -\, d\tau_\lambda . \]
Integrating over the line of sight from a light source to an observer, the observed intensity is
\[ I_\lambda = I_{\lambda 0} \, e^{-\tau_\lambda} , \]
where $I_{\lambda 0}$ is the light intensity at the source and $\tau_\lambda$ is the total optical depth along the line
of sight.
This can be related to the loss of light in magnitudes. The magnitude $m$ in some photometric
band is related to the flux $F$ in that band by $m = C - 2.5 \log_{10}(F)$, where $C$ is a
calibration constant. So, the change in magnitude caused by an optical depth $\tau$ in the band
is $A = -2.5 \log_{10}(e^{-\tau}) = +1.086\,\tau$. The observed magnitude $m$ is related to the intrinsic
magnitude $m_0$ by $m = m_0 + A$, where $A$ is the extinction in magnitudes (the intrinsic
magnitude $m_0$ is the magnitude that the star would have if there were no interstellar extinction).
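The conversion between optical depth and extinction in magnitudes can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names and the sample optical depths are illustrative assumptions.

```python
import math


def extinction_from_optical_depth(tau):
    """Extinction A in magnitudes for optical depth tau: A = 2.5 log10(e) * tau."""
    return 2.5 * math.log10(math.e) * tau


def transmitted_fraction(a_mag):
    """Fraction I/I_0 of the light that survives an extinction of a_mag magnitudes."""
    return 10.0 ** (-0.4 * a_mag)


if __name__ == "__main__":
    # Sample optical depths chosen only for illustration.
    for tau in (0.1, 1.0, 5.0):
        a_mag = extinction_from_optical_depth(tau)
        print(f"tau = {tau:4.1f}   A = {a_mag:6.3f} mag   "
              f"I/I0 = {transmitted_fraction(a_mag):.4f}")
```

The two functions are consistent with each other: converting an optical depth to magnitudes and back to a transmitted fraction recovers $e^{-\tau}$.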
$A$ depends on the photometric band. For example, for the V (visual) band (corresponding
to yellow-green colours and centred at 5500 Å), $V = V_0 + A_V$, while for the B (blue) band
(centred at 4400 Å), $B = B_0 + A_B$. For sight lines at the Galactic poles, $A_V \simeq 0.00$ to
$0.05$ mag, while at intermediate galactic latitudes, $A_V \simeq 0.05$ to $0.2$ mag. However, in the
Galactic plane, the extinction can be many magnitudes, and towards the Galactic Centre it
is $A_V \gtrsim 20$ mag.
$A_\lambda$ is a strong function of wavelength and it scales as $A_\lambda \propto 1/\lambda$ (but not as strong as
the $1/\lambda^4$ relation of the Rayleigh law). There is therefore much stronger absorption in the
blue than in the red.
Figure: The interstellar extinction law. The extinction caused by dust is plotted against
wavelength and extends from the ultraviolet through to the near-infrared. [Based on data from
Savage & Mathis, Ann. Rev. Astron. Astrophys., 1979.]
This produces an interstellar reddening by dust: as light consisting of a
range of wavelengths passes through the interstellar medium, it is reddened by the selective
loss of short wavelength light compared to long wavelengths. Colour indices are reddened,
e.g. $B - V$ is reddened so that the observed value is $(B - V) = (B - V)_0 + E_{B-V}$, where
$(B - V)_0$ is the intrinsic value (what would be observed in the absence of reddening) and
$E_{B-V}$ is known as the colour excess. The colour excess measures how reddened a source is.
The colour excess is therefore the difference in the extinctions in the two bands, e.g.
$E_{B-V} = A_B - A_V$. If the intrinsic colour can be predicted, i.e. we can predict $(B - V)_0$ (e.g.
from a spectrum), it is possible to calculate $E_{B-V}$ from $E_{B-V} = (B - V) - (B - V)_0$. $E_{B-V}$
data can then be used to map the dust distribution in space. It is found from observations
that the extinction in the V band is $A_V \simeq 3.1\,E_{B-V}$. Extinction gets less severe for $\lambda \gtrsim 1\,\mu$m
as the wavelength gets much longer than the grains. Grains are transparent to X-rays.
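As a worked illustration of the colour excess, the sketch below corrects a star's V magnitude for extinction using the observed ratio of 3.1 between the V-band extinction and the colour excess. The star's magnitudes and colours are hypothetical values chosen for the example, and the function names are assumptions rather than anything defined in these notes.

```python
R_V = 3.1  # ratio A_V / E(B-V) quoted in the text


def colour_excess(bv_observed, bv_intrinsic):
    """Colour excess E(B-V) = (B-V) - (B-V)_0."""
    return bv_observed - bv_intrinsic


def visual_extinction(e_bv, r_v=R_V):
    """V-band extinction A_V in magnitudes from the colour excess."""
    return r_v * e_bv


def intrinsic_v_magnitude(v_observed, e_bv, r_v=R_V):
    """Intrinsic magnitude V_0 = V - A_V."""
    return v_observed - visual_extinction(e_bv, r_v)


if __name__ == "__main__":
    # Hypothetical star: observed B-V = 1.00, intrinsic colour 0.65 (e.g. from
    # its spectral type), observed V = 12.00.
    e_bv = colour_excess(1.00, 0.65)
    print(f"E(B-V) = {e_bv:.2f} mag")
    print(f"A_V    = {visual_extinction(e_bv):.3f} mag")
    print(f"V_0    = {intrinsic_v_magnitude(12.00, e_bv):.3f} mag")
```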
The extinction becomes very strong for long sight lines through the disc of the Galaxy.
The Galactic Centre is completely opaque to optical observations. There is some patchiness
in the distribution of the dust. A few areas of lower dust extinction towards the bulge of our
Galaxy, such as Baade's Window, enable the stars in the bulge to be studied.
We can model the extinction caused by dust in the Galaxy to predict how the extinction
in magnitudes towards distant galaxies will depend on their galactic latitude $b$. The optical
depth caused by dust extinction when light of wavelength $\lambda$ travels a short distance $ds$
through the interstellar medium is given by $d\tau_\lambda = \kappa_\lambda \rho_d \, ds$, where $\rho_d$ is the density of dust at
that point in space and $\kappa_\lambda$ is the mass extinction coefficient at that point for the wavelength
$\lambda$. Observations of star numbers show that their number density declines exponentially with
distance from the Galactic plane. We shall adopt a similar behaviour for the density of dust,
and therefore assume that the density of dust varies with distance $z$ above the Galactic plane
as $\rho_d(z) = \rho_{d0} \, e^{-|z|/h}$, where $\rho_{d0}$ and $h$ are constants.
The optical depth when travelling a distance $ds$ along the line of sight at galactic latitude
$b$ is $d\tau_\lambda = \kappa_\lambda \rho_d(z) \, ds = \kappa_\lambda \rho_d(z) \, dz / \sin|b|$, where $z$ is the distance north of the Galactic plane.
Integrating along the line of sight,
\[ \int_0^{\tau_\lambda} d\tau'_\lambda = \int_0^\infty \frac{\kappa_\lambda \rho_d(z)}{\sin|b|} \, dz = \int_0^\infty \frac{\kappa_\lambda \rho_{d0} \, e^{-|z|/h}}{\sin|b|} \, dz = \frac{\kappa_\lambda \rho_{d0}}{\sin|b|} \int_0^\infty e^{-|z|/h} \, dz , \]
assuming that the opacity $\kappa_\lambda$ does not vary with distance from the Galactic plane. Therefore,
\[ \tau_\lambda = \frac{\kappa_\lambda \rho_{d0} h}{\sin|b|} = \kappa_\lambda \rho_{d0} h \, \mathrm{cosec}|b| . \]
Since the extinction in magnitudes is $A_\lambda = 1.086\,\tau_\lambda$, we get
\[ A_\lambda = 1.086 \, \kappa_\lambda \rho_{d0} h \, \mathrm{cosec}|b| . \]
Therefore the extinction in magnitudes towards an extragalactic object is predicted to vary
with galactic latitude $b$ as $A_\lambda \propto \mathrm{cosec}|b|$ in this particular model. By coincidence, the same
$\mathrm{cosec}|b|$ dependence of $A_\lambda$ on galactic latitude $b$ is obtained using a simplistic model in which
the dust is found in a slab of uniform density centred around the Galactic plane.
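The cosec|b| result can be verified by integrating the exponential dust layer numerically and comparing with the analytic expression. This is a minimal sketch under stated assumptions: the symbols kappa, rho_d0 and h stand for the mass extinction coefficient, the mid-plane dust density and the scale height, and the numerical values used (arbitrary units giving a polar optical depth of 0.05) are purely illustrative.

```python
import math


def optical_depth_analytic(kappa, rho_d0, h, b_deg):
    """tau = kappa * rho_d0 * h * cosec|b| for an exponential dust layer."""
    return kappa * rho_d0 * h / math.sin(math.radians(abs(b_deg)))


def optical_depth_numeric(kappa, rho_d0, h, b_deg, z_max_in_h=40.0, n_steps=20000):
    """Midpoint-rule integral of kappa * rho_d(z) dz / sin|b| from z = 0 outwards.

    The integral is truncated at z = z_max_in_h * h, where the exponential
    density is negligible.
    """
    sin_b = math.sin(math.radians(abs(b_deg)))
    dz = z_max_in_h * h / n_steps
    total = 0.0
    for i in range(n_steps):
        z = (i + 0.5) * dz
        total += kappa * rho_d0 * math.exp(-z / h) * dz / sin_b
    return total


if __name__ == "__main__":
    kappa, rho_d0, h = 1.0, 0.05, 1.0  # illustrative values in arbitrary units
    for b in (90.0, 60.0, 30.0, 10.0):
        tau = optical_depth_analytic(kappa, rho_d0, h, b)
        tau_num = optical_depth_numeric(kappa, rho_d0, h, b)
        print(f"b = {b:5.1f} deg   tau = {tau:.4f} (numeric {tau_num:.4f})   "
              f"A = {1.086 * tau:.3f} mag")
```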
3.13 Interstellar Dust: Polarisation by Dust
However, extinction by dust does one very useful thing for optical astronomers. Dust
grains are not spherical and tend to have some elongation. Spinning dust grains tend to align
with their long axes perpendicular to the local magnetic field. They thus preferentially block
light perpendicular to the magnetic field: extinction produces polarised light. The observed
polarisation will tend to be parallel to the magnetic field. Hence polarisation measurements
of starlight reveal the direction of the magnetic field (or at least the sky-projection of the
direction).
Dust also reflects light, with some polarisation. This is observable as reflection nebulae,
where faint diffuse starlight can be seen reflected by dust.
3.14 Interstellar Dust: Radiation by Dust
Light absorbed by dust will be reradiated as a black-body spectrum (or close to a black-body
spectrum). The Wien displacement law states that the maximum of the Planck function $B_\lambda$
over wavelength of a black body at a temperature $T$ is found at a wavelength
\[ \lambda_{\max} = \frac{2.898 \times 10^{-3}}{T} \ \mathrm{K\,m} . \]
This predicts that the peak of the black-body spectrum from dust at a temperature of
$T = 10$ K will be at a wavelength $\lambda_{\max} = 290\,\mu$m, from dust at $T = 100$ K will be at a
wavelength $\lambda_{\max} = 29\,\mu$m, and for $T = 1000$ K will be at $\lambda_{\max} = 2.9\,\mu$m.
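The peak wavelengths quoted above follow directly from the Wien displacement law. A short sketch (the function name is an assumption made for this example) reproduces them:

```python
WIEN_CONSTANT = 2.898e-3  # Wien displacement constant in K m, as used in the text


def wien_peak_wavelength(temperature_k):
    """Wavelength in metres at which the Planck function B_lambda peaks."""
    return WIEN_CONSTANT / temperature_k


if __name__ == "__main__":
    for temperature in (10.0, 100.0, 1000.0):
        peak_microns = wien_peak_wavelength(temperature) * 1.0e6
        print(f"T = {temperature:6.0f} K   lambda_max = {peak_microns:7.1f} micron")
```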
The radiation emitted by dust will be found in the infrared, mostly in the mid-infrared,
given the expected temperatures of dust. This can be observed, for example using observations
from space (such as those made by the IRAS satellite), as diffuse emission superimposed
on a reflected starlight spectrum. However, the associated temperature of the black body is
surprisingly high, $T \sim 10^3$ K, which is much hotter than most of the dust. The interpretation
of this is that some dust grains are so small ($< 100$ atoms) that a single ultraviolet
photon packs enough energy to heat them to $\sim 10^3$ K, after which these stochastically
heated grains cool again by radiating, mostly in the infrared. This process may be part
of the explanation for the correlation between infrared and radio continuum luminosities of
galaxies (e.g., at 0.1 mm and 6 cm), which seems to be independent of galaxy type. The
idea is that ultraviolet photons from the formation of massive stars cause stochastic heating
of dust grains, which then reradiate the energy to give the infrared luminosity. The supernovae
resulting from the same stellar populations produce relativistic electrons which produce the
radio continuum as synchrotron emission.
3.15 Star formation
Stars form by the collapse of dense regions of the interstellar medium under their own gravity.
This occurs in the cores of molecular clouds, where the gas is cold ($\sim 10$ K) and densities
can exceed $10^{10}$ molecules m$^{-3}$.
A region of cold gas will collapse when its gravitational self-attraction is greater than the
hydrostatic pressure support. This gravitational instability is often described by the Jeans
length and Jeans mass. For gas of uniform density $\rho$, the Jeans length $\lambda_J$ is the diameter of
a region of the gas that is just large enough for the gravitational force to exceed the pressure
support. It is given by
\[ \lambda_J = c_s \sqrt{\frac{\pi}{G \rho}} , \qquad (3.1) \]
where $c_s$ is the speed of sound in the gas and $G$ is the constant of gravitation. The Jeans
mass is the mass of a region that has a diameter equal to the Jeans length and is therefore
\[ M_J = \frac{\pi}{6} \, \rho \, \lambda_J^3 . \qquad (3.2) \]
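To give a feel for the numbers, the sketch below evaluates Equations 3.1 and 3.2 for conditions like those quoted for molecular cloud cores (10 K and $10^{10}$ molecules m$^{-3}$). Two things here are assumptions not made in the notes: the gas is taken to be pure molecular hydrogen, and the sound speed is taken to be the isothermal value $\sqrt{kT/m}$. The results should therefore be read as order-of-magnitude estimates only.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23   # Boltzmann constant, J K^-1
M_H = 1.6735e-27     # mass of a hydrogen atom, kg
M_SUN = 1.989e30     # solar mass, kg
PARSEC = 3.0857e16   # parsec in metres


def jeans_length(c_s, rho):
    """Jeans length (Equation 3.1) for sound speed c_s (m/s) and density rho (kg/m^3)."""
    return c_s * math.sqrt(math.pi / (G * rho))


def jeans_mass(c_s, rho):
    """Jeans mass (Equation 3.2): mass in a sphere of diameter equal to the Jeans length."""
    return (math.pi / 6.0) * rho * jeans_length(c_s, rho) ** 3


def isothermal_sound_speed(temperature, particle_mass):
    """Assumed isothermal sound speed sqrt(k T / m); not specified in the notes."""
    return math.sqrt(K_B * temperature / particle_mass)


if __name__ == "__main__":
    m_h2 = 2.0 * M_H                  # assumption: gas is pure H2
    rho = 1.0e10 * m_h2               # 10^10 molecules per cubic metre
    c_s = isothermal_sound_speed(10.0, m_h2)
    print(f"c_s      = {c_s:.0f} m/s")
    print(f"lambda_J = {jeans_length(c_s, rho) / PARSEC:.2f} pc")
    print(f"M_J      = {jeans_mass(c_s, rho) / M_SUN:.1f} solar masses")
```

Because $\lambda_J \propto c_s \rho^{-1/2}$, colder and denser gas has a smaller Jeans length, so dense cold cores are the regions most prone to collapse.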
Star formation can be self-propagating. Stars will form, heat up and ionise the cold
molecular gas, with the resulting outwards flow of gas compressing gas ahead of it. This
compression causes instabilities that result in local collapse to form new stars. The enhanced
density in the spiral arms of our Galaxy and in other spiral galaxies means that star formation
occurs preferentially in the arms.
Chapter 4
Galactic Chemical Evolution
4.1 Introduction
Chemical evolution is the term used for the changes in the abundances of the chemical
elements in the Universe over time, since the earliest times to the present day. The study
of these changes is an important field in astronomy, for both our Galaxy and for other
galaxies. This includes the study of the elemental abundances in stars and in the interstellar
medium. A fundamental objective of studies of chemical evolution is to develop a complete
understanding of how the elemental abundances correlate with parameters like time, location
within a galaxy, and stellar velocities.
The term chemical here refers to the chemical elements. It does not refer to chemistry in
the broader sense: the study of interactions between molecules in the Universe is a different,
distinct field called astrochemistry.
4.2 Chemical Abundances
The relative abundances of the chemical elements can be measured in a number of astro-
nomical objects, most importantly using spectroscopic techniques. The observed strengths
of spectral lines depend on a variety of factors among which are the chemical abundances of
the elements producing the spectral lines. Abundances can be measured in stellar photospheres
from the strengths of absorption lines. The observed strengths of lines in stellar spectra depend
on the abundance of the element responsible, on the effective temperature of the star,
on the acceleration due to gravity at its surface and on small-scale turbulence in the atmosphere
of the star. All these parameters can be solved for if there are sufficient spectroscopic
observations, while the analysis becomes simpler if the temperature can be determined in
advance from photometric observations. Equally, abundances can be determined from the
strengths of emission lines from interstellar gas, most notably from HII regions.
It is important to define some terms and parameters that are used for the study of the
abundances of chemical elements. In astronomy, for historical reasons, the term metals is
used for elements other than H and He; the term heavy elements is also used for these.
Under this definition, even elements such as carbon, oxygen, nitrogen and sulphur are called
metals. The term metallicity is used for the fraction of heavy elements, usually expressed as
a fraction by mass.
It is convenient to define the fractions by mass of hydrogen $X$, of helium $Y$, and of heavy
elements $Z$. Therefore, $Z$ = (mass of heavy elements)/(total mass of all nuclei) in some
object, objects or region of space. We therefore have $X + Y + Z = 1$.
In some other applications, the abundances by number of nuclei are used. These are
usually expressed relative to hydrogen. So N(He)/N(H) is the ratio of the abundance of
helium to hydrogen by number, and N(Fe)/N(H) the iron-to-hydrogen ratio.
For convenience, chemical abundances in the Universe are often compared to the values
in the Sun. Solar abundances give X = 0.70, Y = 0.28, Z = 0.02 by mass. An object, such
as a star, that has a heavy element fraction significantly lower than the Sun is said to be
metal poor, while one that has a larger heavy element fraction is said to be metal rich.
Abundance ratios by number are expressed relative to the Sun using a parameter [A/B],
where A and B are the chemical symbols of two elements, and is defined as,
\[ [\mathrm{A/B}] = \log_{10}\left( \frac{N(\mathrm{A})}{N(\mathrm{B})} \right) - \log_{10}\left( \frac{N(\mathrm{A})}{N(\mathrm{B})} \right)_\odot , \qquad (4.1) \]
where $\odot$ represents the abundance ratio in the Sun. So the ratio of iron to hydrogen in a
star relative to the Sun is written as
\[ [\mathrm{Fe/H}] = \log_{10}\left( \frac{N(\mathrm{Fe})}{N(\mathrm{H})} \right) - \log_{10}\left( \frac{N(\mathrm{Fe})}{N(\mathrm{H})} \right)_\odot . \qquad (4.2) \]
The [Fe/H] parameter for the Sun is, by definition, 0. A mildly metal-poor star in the
Galaxy might have [Fe/H] $\simeq -0.3$, while a very metal-poor star in the halo of the Galaxy
might have [Fe/H] $\simeq -1.5$ to $-2$. A metal-rich star in the Galaxy might have [Fe/H] $\simeq +0.3$.
The interstellar gas in the Galaxy has a near-solar metallicity with [Fe/H] $\simeq 0$. The
iron-to-hydrogen number ratio is encountered more often in research papers than other ratios
because iron produces large numbers of absorption lines in the spectra of late-type stars like
our Sun, making iron abundances relatively straightforward to measure.
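Because the bracket notation is logarithmic, it can help to convert it back to a linear ratio. The sketch below is illustrative only: the function names are assumptions, and the solar iron-to-hydrogen number ratio used in the demonstration is a round placeholder value, not a measurement taken from these notes.

```python
import math


def bracket_ratio(n_a, n_b, n_a_sun, n_b_sun):
    """[A/B] = log10(N(A)/N(B)) - log10(N(A)/N(B))_sun  (Equation 4.1)."""
    return math.log10(n_a / n_b) - math.log10(n_a_sun / n_b_sun)


def linear_ratio_relative_to_sun(bracket_value):
    """Abundance ratio as a multiple of the solar ratio: 10**[A/B]."""
    return 10.0 ** bracket_value


if __name__ == "__main__":
    # Placeholder solar Fe/H number ratio, used only to exercise the function.
    fe_h_sun = 3.0e-5
    star_fe_h = 0.5 * fe_h_sun  # a hypothetical star with half the solar Fe/H
    print(f"[Fe/H] = {bracket_ratio(star_fe_h, 1.0, fe_h_sun, 1.0):+.2f}")

    for fe_h in (-2.0, -1.5, -0.3, 0.0, +0.3):
        factor = linear_ratio_relative_to_sun(fe_h)
        print(f"[Fe/H] = {fe_h:+.1f}  ->  Fe/H is {factor:.3f} times solar")
```

A star with [Fe/H] = $-0.3$ therefore has about half the solar iron-to-hydrogen ratio, while [Fe/H] = $-2$ corresponds to one hundredth of the solar value.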
4.3 The Chemical Enrichment of Galaxies
Current cosmological models show that the Big Bang produced primordial gas having a chemical
composition that was about 93 % H and 7 % He by number, plus a trace of Li$^7$. This corresponds
to a composition by mass that was 77 % H and 23 % He: i.e. $X = 0.77$, $Y = 0.23$, $Z = 0.00$.
The baryonic material produced by the Big Bang was therefore almost pure hydrogen and
helium, with about thirteen hydrogen nuclei for every one of helium. This production of the chemical
elements in the Big Bang is known as primordial nucleosynthesis.
The material we find around us in the Universe today contains significant quantities of
heavy elements, although these are still only minor contributors to the total mass of baryonic
matter. These heavy elements have been synthesised in nuclear reactions in stars, a process
known as nucleosynthesis.
These nuclear reactions in general occur in the cores of stars, producing enriched material
in stellar interiors. Enriched material can be ejected into the interstellar medium in the later
stages of stellar evolution, through mass loss and in supernovae. Star formation therefore
produces stars which, after a time delay, eject heavy elements into the interstellar medium,
including heavy elements newly synthesised in the stellar interiors. Star formation from this
enriched material in turn results in stars with enhanced abundances of heavy elements. This
process occurs repeatedly over time, with the continual recycling of gas, leading to a gradual
increase in the metallicity of the interstellar medium with time.
Supernovae are important to chemical enrichment. They can eject large quantities of
enriched material into interstellar space and can themselves generate heavy elements in
nucleosynthesis. Type II supernovae are produced by massive stars ($M \gtrsim 8\,M_\odot$). They eject
enriched material into the interstellar medium $\sim 10^7$ yr after formation. This material is
rich in C, N and O. Type Ia supernovae are probably caused by explosive fusion reactions
in binary systems and eject enriched material $\sim 10^8$ yr after the initial star formation. This
material is rich in iron.
The main sequence lifetime $T_{MS}$ of a star is a very strong function of the star's mass
$M$. Stars with masses $M \lesssim 0.8\,M_\odot$ have $T_{MS} >$ age of the Universe. So mass that goes
into low mass stars is lost from the recycling process. Usefully, samples of low mass stars
preserve abundances of the interstellar gas from which they formed if enriched material is
not dredged up from the stellar interiors. This is true for G dwarfs for example, where the
gas in the photosphere is almost unchanged in chemical composition from the gas from which
they formed. So a sample of G dwarfs provides samples of chemical abundances through the
history of the Galaxy. In contrast, observations of the interstellar gas in a galaxy provide
information on present-day abundances and on the current state of chemical evolution.
4.4 The Simple Model of Galactic Chemical Evolution
The Simple Model of chemical evolution simulates the build up of the metallicity Z in a
volume of space. The Simple Model makes some simplifying assumptions:
• the volume initially contains only unenriched gas: initially there are no stars and no
heavy elements;
• the volume of space where the evolution takes place is a closed box: no gas enters
or leaves the volume;
• the gas in the volume is well mixed: it has the same chemical composition throughout;
• instantaneous recycling occurs: following star formation, all newly created heavy
elements that enter the ISM do so immediately;
• the fraction of newly-synthesised heavy elements ejected into the ISM after material
forms stars is constant.
These assumptions, although slightly naive, allow some important predictions about the
variation of the metallicity in the interstellar gas with the amount of star formation that has
taken place.
Consider a volume within a galaxy, small enough to be fairly homogeneous, but large
enough to contain a good sample of stars. The total mass of chemical elements in this volume
at time $t$ is $M_{\mathrm{total}}$, made up of $M_{\mathrm{stars}}$ in stars and $M_{\mathrm{gas}}$ in gas. So
\[ M_{\mathrm{total}} = M_{\mathrm{stars}} + M_{\mathrm{gas}} . \qquad (4.3) \]
Here, $M_{\mathrm{stars}} = M_{\mathrm{stars}}(t)$ and $M_{\mathrm{gas}} = M_{\mathrm{gas}}(t)$. The assumption that the volume is a closed
box means that $M_{\mathrm{total}}$ = constant at all times. Initially, at time $t = 0$, $M_{\mathrm{stars}} = 0$ (the
volume contains pure gas initially).
Let $M_{\mathrm{metals}}$ be the mass of heavy elements in the gas within the volume at time $t$.
Therefore the heavy element mass fraction of the gas is
\[ Z \equiv \frac{M_{\mathrm{metals}}}{M_{\mathrm{gas}}} . \qquad (4.4) \]
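Consider a time interval from $t$ to $t + \delta t$. Star formation will occur in this time, with gas
forming stars. Let the change in $M_{\mathrm{stars}}$ and $M_{\mathrm{gas}}$ in this time be $\delta M_{\mathrm{stars}}$ and $\delta M_{\mathrm{gas}}$. Some
stars will eject enriched gas back into the interstellar medium (through supernovae and mass
loss).
We firstly need to express the change $\delta Z$ in the metallicity of the interstellar gas in terms
of $\delta M_{\mathrm{stars}}$ and $\delta M_{\mathrm{gas}}$. From Equation 4.4, we have $Z = \mathrm{fn}(M_{\mathrm{metals}}, M_{\mathrm{gas}})$, so the differential
of $Z$ with respect to time is
\[ \frac{dZ}{dt} = \frac{\partial Z}{\partial M_{\mathrm{metals}}} \frac{dM_{\mathrm{metals}}}{dt} + \frac{\partial Z}{\partial M_{\mathrm{gas}}} \frac{dM_{\mathrm{gas}}}{dt} . \]
Differentiating Equation 4.4,
\[ \frac{\partial Z}{\partial M_{\mathrm{metals}}} = \frac{1}{M_{\mathrm{gas}}} \quad \mathrm{and} \quad \frac{\partial Z}{\partial M_{\mathrm{gas}}} = -\frac{M_{\mathrm{metals}}}{M_{\mathrm{gas}}^2} , \]
which gives,
\[ \frac{dZ}{dt} = \frac{1}{M_{\mathrm{gas}}} \frac{dM_{\mathrm{metals}}}{dt} - \frac{M_{\mathrm{metals}}}{M_{\mathrm{gas}}^2} \frac{dM_{\mathrm{gas}}}{dt} . \]
For a small time interval $\delta t$, we have,
\[ \delta Z = \frac{\delta M_{\mathrm{metals}}}{M_{\mathrm{gas}}} - \frac{M_{\mathrm{metals}}}{M_{\mathrm{gas}}^2} \, \delta M_{\mathrm{gas}} , \]
which gives from the definition of $Z$ in Equation 4.4,
\[ \delta Z = \frac{\delta M_{\mathrm{metals}}}{M_{\mathrm{gas}}} - Z \, \frac{\delta M_{\mathrm{gas}}}{M_{\mathrm{gas}}} . \qquad (4.5) \]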
We need to distinguish between the total mass in stars $M_{\mathrm{stars}}$ at time $t$ and the total
mass that has taken part in star formation $M_{\mathrm{SF}}$ over all periods up to time $t$. When a mass
$\delta M_{\mathrm{SF}}$ goes into stars during star formation, the total mass in stars will change by an amount
less than this, because material from the new stars is ejected back into the interstellar gas.
So, $M_{\mathrm{SF}} > M_{\mathrm{stars}}$, and $\delta M_{\mathrm{SF}} > \delta M_{\mathrm{stars}}$.
Let $\alpha$ be the fraction of mass participating in star formation that remains locked up in
long-lived stars and stellar remnants. So,
\[ \delta M_{\mathrm{stars}} = \alpha \, \delta M_{\mathrm{SF}} . \qquad (4.6) \]
The mass of newly synthesised heavy elements ejected back into the ISM is proportional
to the mass that goes into stars (from the Simple Model assumptions listed above). Let the
mass of newly synthesised heavy elements ejected into the ISM be $p \, \delta M_{\mathrm{stars}}$, where $p$ is a
parameter known as the yield, with $p$ set to be a constant here.
The change $\delta M_{\mathrm{metals}}$ in the mass of heavy elements in the gas in a time $\delta t$ will be caused
by the loss of heavy elements in the gas that goes into star formation, by the gain of old heavy
elements that have gone into star formation and are then ejected back into the gas, and by
the gain of newly synthesised heavy elements from stars that are ejected into the gas. The
contribution to $\delta M_{\mathrm{metals}}$ from the loss of heavy elements in the gas going into star formation
will be $-\delta M_{\mathrm{SF}} \, M_{\mathrm{metals}}/M_{\mathrm{gas}} = -\delta M_{\mathrm{SF}} \, Z$. The contribution from old heavy elements in
the gas that have gone into star formation then are ejected back unchanged into the gas will
be $Z \, \delta M_{\mathrm{SF}} \times$ (fraction of mass going into stars that is ejected back) $= Z \, (1 - \alpha) \, \delta M_{\mathrm{SF}}$. The
contribution from the newly synthesised heavy elements that are ejected into the gas will be
$p \, \delta M_{\mathrm{stars}}$ from the definition of the yield $p$ above.
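Therefore, in time $\delta t$,
\[ \delta M_{\mathrm{metals}} = -Z \, \delta M_{\mathrm{SF}} + Z \, (1 - \alpha) \, \delta M_{\mathrm{SF}} + p \, \delta M_{\mathrm{stars}} , \]
which gives on expanding and cancelling,
\[ \delta M_{\mathrm{metals}} = -\alpha Z \, \delta M_{\mathrm{SF}} + p \, \delta M_{\mathrm{stars}} = -Z \, \delta M_{\mathrm{stars}} + p \, \delta M_{\mathrm{stars}} , \qquad (4.7) \]
on substituting $\alpha \, \delta M_{\mathrm{SF}} = \delta M_{\mathrm{stars}}$. Dividing by $M_{\mathrm{gas}}$,
\[ \frac{\delta M_{\mathrm{metals}}}{M_{\mathrm{gas}}} = -Z \, \frac{\delta M_{\mathrm{stars}}}{M_{\mathrm{gas}}} + p \, \frac{\delta M_{\mathrm{stars}}}{M_{\mathrm{gas}}} . \]
But from Equation 4.3, the changes in masses are related by $\delta M_{\mathrm{total}} = \delta M_{\mathrm{stars}} + \delta M_{\mathrm{gas}} = 0$
for a closed box. Therefore,
\[ \delta M_{\mathrm{stars}} = -\delta M_{\mathrm{gas}} . \qquad (4.8) \]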
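so that
\[ \frac{\delta M_{\mathrm{metals}}}{M_{\mathrm{gas}}} = Z \, \frac{\delta M_{\mathrm{gas}}}{M_{\mathrm{gas}}} - p \, \frac{\delta M_{\mathrm{gas}}}{M_{\mathrm{gas}}} . \]
Substituting this into Equation 4.5,
\[ \delta Z = Z \, \frac{\delta M_{\mathrm{gas}}}{M_{\mathrm{gas}}} - p \, \frac{\delta M_{\mathrm{gas}}}{M_{\mathrm{gas}}} - Z \, \frac{\delta M_{\mathrm{gas}}}{M_{\mathrm{gas}}} , \]
\[ \delta Z = -p \, \frac{\delta M_{\mathrm{gas}}}{M_{\mathrm{gas}}} . \qquad (4.9) \]
Converting this to a differential and integrating from time 0 to $t$,
\[ \int_0^{Z(t)} dZ' = -\int_{M_{\mathrm{gas}}(0)}^{M_{\mathrm{gas}}(t)} p \, \frac{dM'_{\mathrm{gas}}}{M'_{\mathrm{gas}}} , \]
\[ Z(t) - 0 = -p \left[ \ln M'_{\mathrm{gas}} \right]_{M_{\mathrm{gas}}(0)}^{M_{\mathrm{gas}}(t)} . \]
This gives,
\[ Z(t) = -p \, \ln \left( \frac{M_{\mathrm{gas}}(t)}{M_{\mathrm{gas}}(0)} \right) . \qquad (4.10) \]
Since $M_{\mathrm{gas}}(0) = M_{\mathrm{total}}(t)$ (a constant) for all $t$ (because we have a closed box
that initially contained only gas), we can rewrite this equation using the gas fraction
$\mu \equiv M_{\mathrm{gas}}(t)/M_{\mathrm{total}}(t)$ as
\[ Z(t) = -p \, \ln \mu . \qquad (4.11) \]
Both $Z$ and $\mu$ can in principle be measured with appropriate observations, and this equation
does not depend on time $t$ or star formation rate explicitly. This is an important prediction
from the theory that is discussed more in Section 4.5, where comparisons with observations
are considered.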
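The closed-box result can be checked by stepping the mass bookkeeping of Equations 4.7 and 4.8 forward numerically and comparing the final metallicity with the analytic expression $Z = -p \ln \mu$. This is a minimal sketch, not part of the original notes: the function names are assumptions, the total mass is normalised to 1, and the yield and final gas fraction are illustrative values.

```python
import math


def metallicity_closed_box(gas_fraction, p):
    """Analytic Simple Model result, Equation 4.11: Z = -p ln(mu)."""
    return -p * math.log(gas_fraction)


def integrate_closed_box(p, final_gas_fraction, n_steps=20000):
    """Step the closed box forward in small increments of stellar mass.

    Each step locks a mass d_m_stars into stars, changing the metal mass in
    the gas by (p - Z) * d_m_stars (Equation 4.7) and the gas mass by
    -d_m_stars (Equation 4.8). The total mass is normalised to 1.
    """
    m_gas = 1.0
    m_metals = 0.0
    d_m_stars = (1.0 - final_gas_fraction) / n_steps
    for _ in range(n_steps):
        z = m_metals / m_gas
        m_metals += (p - z) * d_m_stars
        m_gas -= d_m_stars
    return m_metals / m_gas


if __name__ == "__main__":
    p = 0.010   # illustrative yield
    mu = 0.10   # illustrative final gas fraction
    print(f"analytic Z = {metallicity_closed_box(mu, p):.5f}")
    print(f"numeric  Z = {integrate_closed_box(p, mu):.5f}")
```

Note that the star formation history never enters: only the amount of gas consumed matters, which is the point made in the text.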
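We now need to consider how the mass in stars depends on metallicity. Equation 4.9
gives the change in metallicity $\delta Z$ in time $\delta t$. But from Equation 4.3, at any time,
$M_{\mathrm{total}} = M_{\mathrm{stars}} + M_{\mathrm{gas}}$, and at time 0, $M_{\mathrm{total}} = M_{\mathrm{gas}}(0)$. So, $M_{\mathrm{gas}}(t) = M_{\mathrm{gas}}(0) - M_{\mathrm{stars}}(t)$. From
Equation 4.8, $\delta M_{\mathrm{stars}} = -\delta M_{\mathrm{gas}}$. Substituting for $M_{\mathrm{gas}}(t)$ and $\delta M_{\mathrm{gas}}$ in Equation 4.9,
\[ \delta Z = p \, \frac{\delta M_{\mathrm{stars}}}{M_{\mathrm{gas}}(0) - M_{\mathrm{stars}}(t)} . \]
Integrating from time 0 to $t$,
\[ \int_0^{Z} dZ' = p \int_0^{M_{\mathrm{stars}}} \frac{dM'_{\mathrm{stars}}}{M_{\mathrm{gas}}(0) - M'_{\mathrm{stars}}} , \]
\[ Z - 0 = -p \left[ \ln \left( M_{\mathrm{gas}}(0) - M'_{\mathrm{stars}} \right) \right]_0^{M_{\mathrm{stars}}} , \]
\[ Z = -p \, \ln \left( M_{\mathrm{gas}}(0) - M_{\mathrm{stars}}(t) \right) + p \, \ln \left( M_{\mathrm{gas}}(0) \right) = -p \, \ln \left( \frac{M_{\mathrm{gas}}(0) - M_{\mathrm{stars}}(t)}{M_{\mathrm{gas}}(0)} \right) , \]
which rearranges to
\[ \frac{M_{\mathrm{stars}}(t)}{M_{\mathrm{gas}}(0)} = 1 - e^{-Z(t)/p} . \qquad (4.12) \]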
This is a prediction of how the fraction of the mass of the volume that is in stars varies
with metallicity. Again we have a neat prediction of the Simple Model that involves time
only through $Z(t)$ and $M_{\mathrm{stars}}(t)$. This equation does not involve the star formation rate as
a function of time explicitly, which means that the predictions are much simpler to compare
with observations.
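As a quick consistency check, the stellar mass fraction from Equation 4.12 and the gas fraction obtained by inverting Equation 4.11 should add up to one at every metallicity. The sketch below assumes illustrative function names and an illustrative yield of 0.010.

```python
import math


def stellar_mass_fraction(z, p):
    """M_stars / M_gas(0) from Equation 4.12: 1 - exp(-Z/p)."""
    return 1.0 - math.exp(-z / p)


def gas_fraction(z, p):
    """Gas fraction mu from inverting Equation 4.11, Z = -p ln(mu)."""
    return math.exp(-z / p)


if __name__ == "__main__":
    p = 0.010  # illustrative yield
    for z in (0.0, 0.005, 0.010, 0.020):
        stars = stellar_mass_fraction(z, p)
        gas = gas_fraction(z, p)
        print(f"Z = {z:.3f}   stars = {stars:.3f}   gas = {gas:.3f}   sum = {stars + gas:.3f}")
```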
Today, at time $t_1$, we have a metallicity $Z_1$ and a mass in stars $M_{\mathrm{stars},1}$. Therefore we
have
\[ \frac{M_{\mathrm{stars},1}}{M_{\mathrm{gas}}(0)} = 1 - e^{-Z_1/p} \]
at the present time. Dividing Equation 4.12 by this,
\[ \frac{M_{\mathrm{stars}}(t)}{M_{\mathrm{stars},1}} = \frac{1 - e^{-Z(t)/p}}{1 - e^{-Z_1/p}} . \qquad (4.13) \]
This is a prediction of how the mass in stars at any time varies with the metallicity.
If we observe some subsample of long-lived stars of similar mass, the number $N(Z)$ of
these stars having a metallicity $Z$ and less will be related to the mass by $N(Z) \propto M_{\mathrm{stars}}(t)$.
So,
\[ \frac{N(Z)}{N_1} = \frac{M_{\mathrm{stars}}(t)}{M_{\mathrm{stars},1}} , \]
where $N_1$ is the value of $N(Z)$ today. This then gives,
\[ \frac{N(Z)}{N_1} = \frac{1 - e^{-Z(t)/p}}{1 - e^{-Z_1/p}} . \qquad (4.14) \]
This gives a specific prediction of the number of stars as a function of metallicity. In practice
it is often easier to work with the differential distribution $dN/dZ$, which expresses the number
of stars with metallicity $Z$ against $Z$.
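Equation 4.14 is straightforward to evaluate. The sketch below computes the predicted cumulative fraction of stars below a given metallicity, together with the differential form obtained by differentiating Equation 4.14 with respect to $Z$. The function names are assumptions; the parameter values $p = 0.010$ and $Z_1 = 0.017$ are those quoted in the figure captions of Section 4.6.

```python
import math


def cumulative_fraction(z, z_now, p):
    """N(Z)/N_1 from Equation 4.14: fraction of stars with metallicity <= Z."""
    return (1.0 - math.exp(-z / p)) / (1.0 - math.exp(-z_now / p))


def differential_distribution(z, z_now, p):
    """d(N/N_1)/dZ, the derivative of Equation 4.14 with respect to Z."""
    return math.exp(-z / p) / (p * (1.0 - math.exp(-z_now / p)))


if __name__ == "__main__":
    p, z_now = 0.010, 0.017
    for fraction_of_present in (0.1, 0.25, 0.5, 1.0):
        z = fraction_of_present * z_now
        print(f"Z = {fraction_of_present:4.2f} Z_1   "
              f"N(Z)/N_1 = {cumulative_fraction(z, z_now, p):.3f}   "
              f"dN/dZ = {differential_distribution(z, z_now, p):6.1f}")
```

With these parameters the Simple Model places roughly 40% of long-lived stars below one quarter of the present-day metallicity, which gives a sense of how many metal-poor stars the model expects.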
4.5 Comparing the Simple Model with Observations: Abundances in the Interstellar Gas of Galaxies
Equation 4.11 provided a simple expression for the metallicity in the Simple Model:
\[ Z = -p \, \ln (\mathrm{gas\ fraction}) . \]
This was derived by considering the chemical evolution in a single closed box. However, this
is a general result for any closed box and it can be compared with the observed metallicities
and gas fractions in a number of galaxies, provided that the yield is the same in all places.
Magellanic irregular galaxies are found to fit this relation reasonably well, and $p$ is estimated
to be $\sim 0.0025$ from observations. In spiral galaxies, the gas fraction in the disc
increases as we go outwards, and Z is indeed observed to decrease, though perhaps more
steeply than this crude model predicts.
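Inverting the relation gives an estimate of the yield from a measured metallicity and gas fraction. The sketch below is illustrative only: the function name is an assumption and the input values describe a hypothetical gas-rich galaxy, not a real measurement.

```python
import math


def yield_from_observations(z, gas_fraction):
    """Yield p implied by the Simple Model, from Z = -p ln(gas fraction)."""
    if not 0.0 < gas_fraction < 1.0:
        raise ValueError("gas fraction must lie strictly between 0 and 1")
    return -z / math.log(gas_fraction)


if __name__ == "__main__":
    # Hypothetical galaxy: Z = 0.002 with 45% of its mass still in gas.
    p_estimate = yield_from_observations(0.002, 0.45)
    print(f"implied yield p = {p_estimate:.4f}")
```

Applying this to several galaxies and finding similar values of $p$ would be one way to test whether a single closed-box relation describes them all.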
4.6 Comparing the Simple Model with Observations: Stellar
Abundances in the Galaxy and the G-Dwarf Problem
Equation 4.14 gives a prediction of the number of stars $N(Z)$ having a metallicity of $Z$ or less as
a function of Z for a sample of long-lived stars, based on the Simple Model assumptions in
Section 4.4. These predictions can be compared with observations.
These comparisons require metallicity data for relatively large numbers of stars to en-
sure adequate statistics. Therefore, metallicity estimates are often made using photometric
techniques, rather than using more precise spectroscopic measurements. These photometric
estimates tend to measure the abundances of the elements that produce strong absorption in
the light of the stars, rather than the overall metallicity. Therefore iron abundances [Fe/H]
are generally quoted for studies of the metallicity distribution.
G- or K-type main sequence stars can conveniently be used in these studies because
their lifetimes are sufficiently long that they will have survived from the earliest times to
the present: these are known as G and K dwarfs. G dwarfs have generally been used
because of the advantage of their greater luminosity and because the techniques of estimating
metallicities have been better calibrated.
The metallicity distribution observed for metal-poor globular clusters (where the abundance
of each cluster is used instead of individual stars) gives a tolerably good fit to the
Simple Model prediction. However, matters are very different for the stars in the solar
neighbourhood, within the disc of the Galaxy.
The figure below gives a comparison of the predicted metallicity distribution of Equation 4.14
with observations of long-lived stars in the solar neighbourhood. The Simple Model
prediction is found to be very different to the observed distribution. The Simple Model predicts
a far larger proportion of metal-poor stars than are actually found. This has become
known as the G dwarf problem.
Figure: The observed cumulative metallicity distribution for stars in the solar neighbourhood,
compared with the Simple Model prediction for $p = 0.010$ and $Z_1 = Z_\odot = 0.017$. [The observed
distribution uses data from Kotoneva et al., M.N.R.A.S., 336, 879, 2002, for stars in the
Hipparcos Catalogue.]
The differential metallicity distribution, representing simply a histogram of star numbers
as a function of [Fe/H], also shows the failure of the Simple Model prediction. This is shown
in the figure below.
Figure: The observed differential metallicity distribution for stars in the solar neighbourhood,
compared with the Simple Model prediction for $p = 0.010$ and $Z_1 = Z_\odot = 0.017$. [The observed
distribution uses data from Kotoneva et al., M.N.R.A.S., 336, 879, 2002, for stars in the
Hipparcos Catalogue.]
Determining the metallicity distribution in galaxies outside our own is very difficult
observationally. G dwarf stars are very faint in even the nearest Local Group galaxies.
4.7 Solutions to the G-Dwarf Problem
The G-dwarf problem indicates that the Simple Model is an oversimplification in the solar
neighbourhood: one or more of the assumptions in Section 4.4 must be wrong. This is a
very important result, but precisely which of the assumptions are wrong is difficult to say. A
better fit to the observed data can be had by relaxing any of a number of the Simple Model
assumptions. These include:
• the gas was not initially of zero metallicity;
• there has been an inflow of very metal-poor gas (this can help, but the value of the
yield $p$ must be adjusted);
• there has been a variable initial mass function (which could result in a change in the
fraction of mass that remains locked up in long-lived stars, or in a change in the
yield $p$);
• the samples of stars used to test the Simple Model are biased against low-metallicity
stars (but considerable care is taken by observers to correct for these effects).
(The initial mass function is the number N of stars as a function of star mass M immediately
following star formation. There is virtually no evidence that the initial mass function varied
with time.)
One possible change to the Simple Model is to allow for a loss of gas from the volume.
This is plausible, because supernovae following star formation could drive gas out of the
region of the galaxy being studied. One possible prescription would be to set the outflow
rate to be proportional to the rate of star formation. Therefore, the loss of gas from the
volume in a time $\delta t$ is $c \, \delta M_{\mathrm{stars}}$, where $c$ is a constant. In this case the total mass $M_{\mathrm{total}}$
in the volume varies with time, unlike in the Simple Model. However, it is found that a
loss of enriched gas would make the G dwarf problem even worse: it reduces the quantity of
enriched gas to make stars.
One modification to the Simple Model that can achieve a better fit between theoretical
prediction and observations is to allow for the inflow of gas. This gas could be unprocessed,
primordial gas. Analytic models of this type often set the inflow rate to be proportional to
the star formation rate, so the inflow of gas in a time $\delta t$ is $c \, \delta M_{\mathrm{stars}}$, where $c$ is a constant.
This can produce a better fit between models and observations, provided the yield $p$ is chosen
appropriately.
4.8 Nucleosynthesis
The processes by which chemical elements are created are called nucleosynthesis. The elements
produced in the hot, dense conditions in the Big Bang (hydrogen, helium and Li$^7$)
were created in primordial nucleosynthesis. Other isotopes and elements were produced by
nucleosynthesis inside stars and in supernova explosions (including some additional helium).
The nuclear reactions responsible are complex and varied. The proton-proton chain is a
series of reactions that fuse protons to form helium. In summary these reactions involve
4 H$^1$ $\rightarrow$ He$^4$ with the additional production of positrons, neutrinos and gamma rays. The
carbon-nitrogen cycle and the carbon-nitrogen-oxygen bicycle also fuse protons to form He$^4$
using pre-existing C$^{12}$ in these reactions, creating N and O as (mostly) temporary byproducts
which ultimately return to C$^{12}$. However, incomplete reactions in this series can leave behind
some N, O and F.
Helium burning occurs in evolved red giant stars. In summary, 3 He$^4$ $\rightarrow$ C$^{12}$ through an
intermediate stage involving Be$^8$. Reactions of this type can continue, with carbon burning
and oxygen burning producing elements such as Mg and Si.
Another important process involves the capture of He nuclei by other nuclei, known as
$\alpha$-capture (because the He nucleus is an $\alpha$ particle). For example, O$^{16}$ + He$^4$ $\rightarrow$ Ne$^{20}$.
Elements that are mainly in the form of isotopes consisting of multiples of $\alpha$ particles are known
as $\alpha$ elements and are relatively abundant.
A number of isotopes are built up by neutron capture. This process can occur slowly inside
stars over long periods of time, when it is called the s-process. Under extreme circumstances
in supernova explosions, neutron capture can occur rapidly and it is called the r-process.
Some isotopes are particularly stable, while others are fragile in the high temperatures
in stellar interiors (and participate readily in reactions with other nuclei, producing other
isotopes). This stability/fragility affects the abundances of the elements produced during
nucleosynthesis.
When the amount of an element produced by nucleosynthesis in stars does not depend
on the abundances of elements in the gas that formed the stars, that element is said to be a
primary element. On the other hand, if the amount of an element produced by nucleosynthesis
does depend on the abundances in the gas that went into the star, the element is said
to be a secondary element. For example, the amount of the isotope N$^{14}$ produced in stars
depends on the abundance of carbon in the gas that formed the stars: the more carbon, the
more N$^{14}$ can be formed as a minor byproduct of the CNO bicycle.
4.9 Element Ratios
The sections on chemical evolution above considered the changes in the overall metallicity Z. However, the abundances of individual elements provide very important additional information. The ratios of the abundances of individual elements, for example oxygen to iron or carbon to oxygen, have been determined by the details of nucleosynthesis and the enrichment of the interstellar medium over time, including how the relative quantities of these elements ejected into the interstellar gas have varied with time.
The [O/Fe] element abundance ratio plotted as a function of [Fe/H]. [Based on data from
Edvardsson et al., Astron. Astrophys., 275, 101, 1993, supplemented with data from Zhang
& Zhao, M.N.R.A.S., 364, 712, 2005.]
Chemical abundances can be measured from the spectra of stars. Correlations are often found between abundance ratios. An important example is how the oxygen-to-iron ratio, [O/Fe], varies with the iron-to-hydrogen ratio, [Fe/H]. Stars with metallicities similar to that of the Sun have, unsurprisingly, [O/Fe] values similar to that of the Sun. Metal-poor stars have [O/Fe] values that are larger than that of the Sun, with the [O/Fe] values increasing with decreasing [Fe/H] until a near-constant value is reached. The conventional interpretation of this is that the heavy-element enrichment that produced the material that went into very metal-poor stars was caused predominantly by type II supernovae. Type II supernovae occur soon after a burst of star formation and produce large quantities of oxygen relative to iron. Therefore the material in very metal-poor stars had high values of [O/Fe]. Later, type Ia supernovae produced larger quantities of iron compared with oxygen, reducing the oxygen-to-iron ratio in the interstellar gas. Later stars were therefore less metal-poor and had [O/Fe] values closer to that of the Sun.
Chapter 5
Rotation Curves
5.1 Circular Velocities and Rotation Curves
The circular velocity $v_{\rm circ}$ is the velocity that a star in a galaxy must have to maintain a circular orbit at a specified distance from the centre, on the assumption that the gravitational potential is symmetric about the centre of the orbit. In the case of the disc of a spiral galaxy (which has an axisymmetric potential), the circular velocity is the orbital velocity of a star moving in a circular path in the plane of the disc. If the absolute value of the acceleration is $g$, for circular velocity we have $g = v_{\rm circ}^2/R$ where $R$ is the radius of the orbit (with $R$ a constant for the circular orbit). Therefore, $\partial\Phi/\partial R = v_{\rm circ}^2/R$, assuming symmetry.
The rotation curve is the function $v_{\rm circ}(R)$ for a galaxy. If $v_{\rm circ}(R)$ can be measured over a range of $R$, it will provide very important information about the gravitational potential. This in turn gives fundamental information about the mass distribution in the galaxy, including dark matter.
We can go further in cases of spherical symmetry. Spherical symmetry means that the gravitational acceleration at a distance $R$ from the centre of the galaxy is simply $GM(R)/R^2$, where $M(R)$ is the mass interior to the radius $R$. In this case,
$$\frac{v_{\rm circ}^2}{R} = \frac{GM(R)}{R^2} \quad \text{and therefore,} \quad v_{\rm circ} = \sqrt{\frac{GM(R)}{R}} . \qquad (5.1)$$
If we can assume spherical symmetry, we can estimate the mass inside a radial distance $R$ by inverting Equation 5.1 to give
$$M(R) = \frac{v_{\rm circ}^2 R}{G} , \qquad (5.2)$$
and can do so as a function of radius. This is a very powerful result which is capable of telling us important information about the mass distribution in galaxies, provided that we have spherical symmetry. However, we must use a more sophisticated analysis for the general case where we do not have spherical symmetry. The more general case of axisymmetry is considered in Section 5.3.
5.2 Observations
Gas and young stars in the disc of a spiral galaxy will move on nearly closed orbits, and if the underlying potential is axisymmetric these will be nearly circular. Therefore if the bulk velocity $v$ of gas or young stars can be measured, it provides $v_{\rm circ}^2 = R\,\partial\Phi/\partial R$. Old stars should be avoided: old stars have a greater velocity dispersion around their mean orbital motion and their bulk rotational velocity will be slightly smaller than the circular velocity.
Spectroscopic radial velocities can be used to determine the rotational velocities of spiral galaxies provided that the galaxies are inclined to the line of sight. The analysis is impossible for face-on spiral discs, but inclined spirals can be used readily. The circular velocity $v_{\rm circ}$ is related to the velocity $v_r$ along the line of sight (corrected for the bulk motion of the galaxy) by $v_r = v_{\rm circ} \cos i$ where $i$ is the inclination angle of the disc of the galaxy to the line of sight (defined so that $i = 90^\circ$ for a face-on disc). Placing a spectroscopic slit along the major axis of the elongated image of the disc on the sky provides the rotation curve from optical observations. Radio observations of the 21 cm line of neutral hydrogen at a number of positions on the disc of the galaxy can also provide rotation curves.
For example, in our Galaxy the circular velocity at the solar distance from the Galactic Centre is 220 km s$^{-1}$ (at $R_0 = 8.0$ kpc from the centre).
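As a rough illustration, Equation 5.2 can be applied to these solar-neighbourhood values. The sketch below assumes spherical symmetry (which the Galactic disc does not strictly satisfy), and the function name and the rounded physical constants are choices made here for illustration, not part of the notes. It suggests a mass of roughly $9 \times 10^{10}$ solar masses interior to the Sun's orbit.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
KPC = 3.0857e19    # one kiloparsec in metres (rounded)
M_SUN = 1.989e30   # solar mass in kg (rounded)


def enclosed_mass(v_circ_kms: float, radius_kpc: float) -> float:
    """Mass interior to radius R from Equation 5.2, M = v^2 R / G.

    Returns the mass in solar masses. Assumes spherical symmetry.
    """
    v = v_circ_kms * 1.0e3      # km/s -> m/s
    r = radius_kpc * KPC        # kpc -> m
    return v * v * r / G / M_SUN


# Values quoted in the text: 220 km/s at R0 = 8.0 kpc.
mass_inside_sun = enclosed_mass(220.0, 8.0)
print(f"M(R0) ~ {mass_inside_sun:.2e} solar masses")
```

Because the estimate scales as $v_{\rm circ}^2 R$, a flat rotation curve makes the inferred mass grow linearly with radius.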
When people first started measuring rotation curves (c. 1970), it quickly became clear that the mass in disc galaxies does not follow the visible disc. It was found that disc galaxies generically have rotation curves that are fairly flat to as far out as they could be measured (out to several scale lengths). This is very different to the behaviour that would be expected were the visible mass (the mass of the stars and gas) the only matter in the galaxies. This is interpreted as strong evidence for the existence of dark matter in galaxies.
The simplest interpretation of a flat rotation curve is that based on the assumption that the dark matter is spheroidally distributed in a dark halo. For a spherical distribution of mass, $v_{\rm circ} = {\rm constant}$ implies that the enclosed mass $M(r) \propto r$, and so $\rho(r) \propto 1/r^2$.
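The step from a constant circular velocity to the density law can be written out explicitly. The short derivation below uses only Equation 5.2 and the relation between enclosed mass and density for a spherical system, $dM/dr = 4\pi r^2 \rho$; it is a sketch of the reasoning rather than an additional result.

```latex
\begin{align*}
M(r) &= \frac{v_{\rm circ}^2\, r}{G} \;\propto\; r
      \qquad \text{(Equation 5.2 with constant } v_{\rm circ}\text{)} , \\
\rho(r) &= \frac{1}{4\pi r^2}\,\frac{dM}{dr}
         = \frac{v_{\rm circ}^2}{4\pi G\, r^2} \;\propto\; \frac{1}{r^2} .
\end{align*}
```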
Rotation curves determined from optical spectra are generally limited to a few scale lengths (assuming an exponential density profile). These do provide important evidence of flat rotation curves. However, 21 cm radio observations can be followed out to significantly greater distances from the centres of spiral galaxies, using the emission from the atomic hydrogen gas. These HI observations provide powerful evidence of a constant circular velocity with radius, out to radial distances where the density of stars has declined to very low levels, providing strong evidence for the existence of extensive dark matter haloes.
As yet it is not clear exactly how far dark matter haloes extend. Neither is there a good
estimate of the total mass of any disc galaxy. This is what makes disc rotation curves very
important.
Figure 5.1: The spiral galaxy NGC 2841 and its HI 21 cm radio rotation curve. The figure on the left presents an optical (blue light) image of the galaxy, while that on the right gives the rotation curve in the form of the circular velocity plotted against radial distance. The optical image covers the same area of the galaxy as the radio observations: the 21 cm radio emission from the atomic hydrogen gas is detected over a much larger area than the galaxy covers in the optical image. [The optical image was created using Digitized Sky Survey II blue data from the Palomar Observatory Sky Survey. The rotation curve was plotted using data by A. Bosma (Astron. J., 86, 1791, 1981) taken from S. M. Kent (AJ, 93, 816, 1987).]
5.3 Theoretical Interpretation
However, one needs to be careful about interpreting flat rotation curves. The existence of dark matter haloes is a very important subject and caution is appropriate before accepting evidence that has profound significance to our understanding of matter in the Universe. For this reason, attempts were made to model observed rotation curves using as little mass in the dark matter haloes as possible. These maximal disc models attempted to fit the observed data by assuming that the stars in the galactic discs had as much mass as could still be consistent with our understanding of stellar populations. They still, however, required a contribution from a dark matter halo at large radii when HI observations were taken into account.
Importantly, the maximum contribution to the rotation curve from an exponential disc, with surface density $\propto e^{-R/R_0}$, is not (as we might naively expect) around $R_0$ but around $2.5R_0$. Adding the effect of a bulge can easily give a fairly flat rotation curve to $\sim 4R_0$ without a dark halo. To be confident about the dark halo, one needs to have the rotation curve out to $\gtrsim 5R_0$. In practice, that means HI measurements; optical rotation curves do not go out far enough to say anything conclusive about dark haloes.
The rest of this section is a more detailed working out of the previous paragraph. It
follows an elegant derivation and explanation due to A. J. Kalnajs.
Consider an axisymmetric disc galaxy. Consider the rotation curve produced by the disc matter only (at this stage we shall not consider the contribution from the bulge or from the dark matter halo). This analysis will use a cylindrical coordinate system $(R, \phi, z)$ with the disc at $z = 0$ and $R = 0$ at the centre of the galaxy. Let the surface mass density of the disc be $\Sigma(R)$.
The gravitational potential in the plane of the disc at the point $(R, \phi, 0)$ is
$$\Phi(R) = -G \int_0^\infty R'\,\Sigma(R')\, dR' \int_0^{2\pi} \frac{d\phi}{\sqrt{R^2 + R'^2 - 2RR'\cos\phi}} , \qquad (5.3)$$
found by integrating the contribution from surface elements over the whole disc. To make this tractable, let us first define a function $L(u)$ so that
$$L(u) \equiv \frac{1}{2\pi} \int_0^{2\pi} \frac{d\phi}{\sqrt{1 + u^2 - 2u\cos\phi}} . \qquad (5.4)$$
(The function within the integral can be expanded into terms called Laplace coefficients, which are explained in many old celestial mechanics books.)
This can be expanded as
$$L(u) = 1 + \frac{u^2}{4} + \frac{9}{64}u^4 + \frac{25}{256}u^6 + \frac{1225}{16384}u^8 + O(u^{10}) \quad \text{for } u < 1 , \qquad (5.5)$$
either using a binomial expansion of the function in $u$ or by expressing it as Laplace coefficients (which uses Legendre polynomials), and then integrating each term in the expansion.
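The series in Equation 5.5 can be checked numerically against the defining integral in Equation 5.4. The sketch below is only a sanity check: the function names are invented here, and the midpoint rule is simply one convenient quadrature for a smooth periodic integrand.

```python
import math


def laplace_L_numeric(u: float, n: int = 2000) -> float:
    """Evaluate L(u) of Equation 5.4 with the midpoint rule on [0, 2*pi]."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * h
        total += 1.0 / math.sqrt(1.0 + u * u - 2.0 * u * math.cos(phi))
    return total * h / (2.0 * math.pi)


def laplace_L_series(u: float) -> float:
    """Series of Equation 5.5 truncated after the u^8 term (needs u < 1)."""
    u2 = u * u
    return (1.0 + u2 / 4.0 + 9.0 / 64.0 * u2**2
            + 25.0 / 256.0 * u2**3 + 1225.0 / 16384.0 * u2**4)


for u in (0.1, 0.3, 0.5):
    print(u, laplace_L_numeric(u), laplace_L_series(u))
```

The two agree closely for small $u$; the truncated series falls increasingly short of the integral as $u$ approaches 1, where the omitted $O(u^{10})$ terms matter.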
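The integration over $\phi$ in Equation 5.3 can be expressed in terms of $L(u)$ as
$$\int_0^{2\pi} \frac{d\phi}{\sqrt{R^2 + R'^2 - 2RR'\cos\phi}} = \begin{cases} \dfrac{2\pi}{R}\, L\!\left(\dfrac{R'}{R}\right) & \text{for } R' < R , \\[2ex] \dfrac{2\pi}{R'}\, L\!\left(\dfrac{R}{R'}\right) & \text{for } R' > R , \end{cases} \qquad (5.6)$$
where the two cases are needed because the expansion of $L(u)$ assumed that $u < 1$.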
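Splitting the integration in Equation 5.3 into two parts (for $R' = 0$ to $R$, and for $R' = R$ to $\infty$) and substituting for $L(R'/R)$ and $L(R/R')$, we obtain
$$\Phi(R) = -2\pi G \int_0^R \frac{R'}{R}\, \Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR' \;-\; 2\pi G \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR' . \qquad (5.7)$$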
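Consider a star in a circular orbit in the disc at radius $R$, having a velocity $v$. The radial component of the acceleration is
$$\frac{v^2}{R} = \frac{\partial\Phi}{\partial R} ,$$
and hence
$$v^2(R) = R\,\frac{\partial\Phi}{\partial R} = -2\pi G R \frac{d}{dR} \int_0^R \frac{R'}{R}\, \Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR' \;-\; 2\pi G R \frac{d}{dR} \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR' .$$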
These two differentials of integrals can be simplified by using a result known as Leibniz's Integral Rule, or Leibniz's Theorem for the differential of an integral. This states for a function $f$ of two variables,
$$\frac{d}{dc} \int_{a(c)}^{b(c)} f(x, c)\, dx = \int_{a(c)}^{b(c)} \frac{\partial}{\partial c} f(x, c)\, dx + f(b, c)\frac{db}{dc} - f(a, c)\frac{da}{dc} . \qquad (5.8)$$
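This gives
$$\frac{d}{dR} \int_0^R \frac{R'}{R}\, \Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR' = \int_0^R \frac{\partial}{\partial R}\left[\frac{R'}{R}\, L\!\left(\frac{R'}{R}\right)\right] \Sigma(R')\, dR' + \Sigma(R)\,L(1)$$
and
$$\frac{d}{dR} \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR' = \int_R^\infty \frac{\partial}{\partial R}\left[ L\!\left(\frac{R}{R'}\right)\right] \Sigma(R')\, dR' - \Sigma(R)\,L(1) .$$
The two boundary terms $\pm\Sigma(R)L(1)$ cancel when the results are combined.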
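Therefore we get
$$v^2(R) = -2\pi G R \int_0^R \frac{d}{dR}\left[\frac{R'}{R}\, L\!\left(\frac{R'}{R}\right)\right] \Sigma(R')\, dR' \;-\; 2\pi G R \int_R^\infty \Sigma(R')\, \frac{d}{dR} L\!\left(\frac{R}{R'}\right) dR' .$$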
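But,
$$\begin{aligned} \frac{d}{dR}\left[\frac{R'}{R}\, L\!\left(\frac{R'}{R}\right)\right] &= -\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) + \frac{R'}{R}\,\frac{d}{dR} L\!\left(\frac{R'}{R}\right) \\ &= -\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) + \frac{R'}{R}\,\frac{dL\!\left(\frac{R'}{R}\right)}{d(R'/R)}\,\frac{d(R'/R)}{dR} \\ &= -\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) - \frac{R'^2}{R^3}\, L'\!\left(\frac{R'}{R}\right) , \end{aligned}$$
writing $L'(u) \equiv \dfrac{dL(u)}{du}$.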
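Substituting this result, and using $\frac{d}{dR} L(R/R') = L'(R/R')/R'$ in the second integral, gives
$$v^2 = +2\pi G \int_0^R \left[\frac{R'}{R}\, L\!\left(\frac{R'}{R}\right) + \left(\frac{R'}{R}\right)^2 L'\!\left(\frac{R'}{R}\right)\right] \Sigma(R')\, dR' \;-\; 2\pi G \int_R^\infty \left(\frac{R}{R'}\right) L'\!\left(\frac{R}{R'}\right) \Sigma(R')\, dR' . \qquad (5.9)$$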
This can be quite messy and it can be abbreviated as
$$v^2(R) = 2\pi G \int_0^\infty K\!\left(\frac{R}{R'}\right) \Sigma(R')\, dR' , \qquad (5.10)$$
where the function $K(R/R')$ represents the integrand of Equation 5.9 over both the $R' = 0$ to $R$ and the $R' = R$ to $\infty$ domains.
Changing variables to $x \equiv \ln R$, $y \equiv \ln R'$, we can write this as a convolution
$$v^2(R) = 2\pi G \int_{-\infty}^{\infty} K(e^{x-y})\, R'\,\Sigma(R')\, dy . \qquad (5.11)$$
The kernel $K(R/R')$ is shown in Figure 5.2.
Figure 5.2: The kernel $K(R/R')$. Observe that the $R > R'$ part tends to have higher absolute value than the $R < R'$ part.
Figure 5.3 shows $R\,\Sigma(R)$ and $v^2$ for an exponential disc, but the general shapes are not very sensitive to whether $\Sigma(R)$ is precisely exponential. The important qualitative fact is that whatever $R\,\Sigma(R)$ does, $v^2$ does roughly the same, but expanded by a factor of $\sim e$.
The distinctive shape of the $v^2(\ln R)$ curve for realistic discs makes it very easy to recognise non-disc mass. Figure 5.4, following Kalnajs, shows the rotation curves you get by adding either a bulge or a dark halo. (Actually this figure fakes the bulge/halo contribution by adding a smaller/larger disc; but if you properly add spherical mass distributions for the bulge/halo, the result is very similar.) Kalnajs's point is that a bulge+disc rotation curve has a similar shape to a disc+halo rotation curve; only the scale is different. So when examining a flat(-ish) rotation curve, you must ask what the disc scale radius is.
5.4 Representing Dark Matter Distributions
The dark matter within spiral galaxies does not appear to be confined to the discs and it is probably distributed approximately spheroidally. A popular density profile that has been adopted for modelling dark matter haloes has the form
$$\rho(r) = \frac{\rho_0}{1 + (r/a)^2} , \qquad (5.12)$$
where $r$ is the radial distance from the centre of the galaxy, $\rho_0$ is the central dark matter density, and $a$ is a constant. This form does reproduce the observed rotation curves of spiral galaxies adequately: it gives a circular velocity that is $v_{\rm circ} = 0$ at $R = 0$, that rises rapidly with the radial distance $R$ in the plane of the disc, and then becomes flat ($v_{\rm circ} = {\rm constant}$) for $R \gg a$. This profile, however, has the problem that its mass is infinite. Therefore a more practical functional form is
$$\rho(r) = \frac{\rho_0}{1 + (r/a)^n} , \qquad (5.13)$$
where $a$ and $n$ are constants, with $n > 3$ giving a finite mass (the enclosed mass grows as $\int r^{2-n}\,dr$ at large $r$, which converges only for $n > 3$).
Figure 5.3: The dashed curve is $R\,\Sigma(R)$ for an exponential disc with $\Sigma \propto e^{-R}$ and the solid curve is $v^2(R)$. Note that $R$ is measured in disc scale lengths, but the vertical scales are arbitrary.
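The behaviour described for the profile of Equation 5.12 can be illustrated with a short calculation. Integrating that density over spheres gives an enclosed mass $M(r) = 4\pi\rho_0 a^2\,[\,r - a\arctan(r/a)\,]$, so that $v_{\rm circ}^2 = GM/r = 4\pi G\rho_0 a^2\,[\,1 - (a/r)\arctan(r/a)\,]$. This closed form is derived here rather than quoted from the notes, and the function name and the asymptotic speed $v_\infty = \sqrt{4\pi G\rho_0}\,a$ are labels introduced for this sketch.

```python
import math


def v_circ_over_v_inf(r_over_a: float) -> float:
    """Circular velocity for the halo of Equation 5.12, in units of the
    asymptotic value v_inf = sqrt(4*pi*G*rho_0) * a.

    Uses v^2 / v_inf^2 = 1 - arctan(x) / x with x = r / a (x > 0).
    """
    x = r_over_a
    return math.sqrt(1.0 - math.atan(x) / x)


# The curve rises from zero near the centre and flattens for r >> a.
for x in (0.1, 1.0, 3.0, 10.0, 30.0, 100.0):
    print(f"r/a = {x:6.1f}   v/v_inf = {v_circ_over_v_inf(x):.3f}")
```

The approach to the flat part is slow: at $r = 10a$ the velocity is still several per cent below its asymptotic value.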
Some numerical N-body simulations of galaxy formation have predicted that dark matter haloes will have density profiles of the form
$$\rho(r) = \frac{k}{r\,(a + r)^2} , \qquad (5.14)$$
where $a$ and $k$ are constants. This is known as the Navarro-Frenk-White profile after the scientists who first described it. It fits the densities of collections of particles representing dark matter haloes in numerical simulations, and does so adequately over broad ranges in masses and sizes. It is therefore often used to represent the dark matter haloes of galaxies and also of clusters of galaxies.
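For comparison with a flat rotation curve, the circular velocity implied by Equation 5.14 can be sketched in the same way. Integrating the profile gives $M(r) = 4\pi k\,[\ln(1 + r/a) - r/(a + r)]$, a closed form derived here rather than taken from the notes. The code below works in units where $4\pi G k/a = 1$ (an arbitrary normalisation chosen for illustration) and locates the peak of the curve with a simple grid scan.

```python
import math


def nfw_v2(x: float) -> float:
    """v_circ^2 for the profile of Equation 5.14 at x = r/a (x > 0),
    in units of 4*pi*G*k/a."""
    return (math.log(1.0 + x) - x / (1.0 + x)) / x


# Scan a grid in r/a to find where the rotation curve peaks.
grid = [0.001 * i for i in range(1, 20001)]   # r/a from 0.001 to 20
x_peak = max(grid, key=nfw_v2)
print(f"rotation curve peaks near r/a = {x_peak:.3f}")
```

Unlike the profile of Equation 5.12, this rotation curve is not asymptotically flat: it rises, peaks at a little over twice the scale radius $a$, and then declines slowly.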
The profiles above are spherical: the density depends only on the radial distance $r$ from the centre. These functional forms for $\rho$ can be modified to allow for flattened systems.
Figure 5.4: Plots of $v^2$ against $\ln R$ (upper panel) or $v$ against $R$ (lower panel). For one curve in each panel, a second exponential disc with mass and scale radius both scaled down by $e^2 \simeq 7.39$ has been added (to mimic a bulge); for the other curve a second exponential disc with mass and scale radius both scaled up by $e^2 \simeq 7.39$ has been added (to mimic a dark halo).
Chapter 6
Gravitational Lensing and Dark
Matter in the Galactic Halo
6.1 Introduction
Gravitational lensing is the process that causes the appearance of distant bright objects to be
altered by the gravity of foreground mass. Being a purely gravitational effect makes lensing
astrophysically important as a probe of mass, including dark matter as well as visible matter.
Examples of gravitational lensing that have been observed include
microlensing by stars, brown dwarfs etc. in the Galactic halo;
deflection of light and radio waves by the Sun;
lensing by distant galaxies; and
lensing by galaxy clusters.
We shall begin this Chapter with a detailed review of gravitational lensing. The purpose
of this is to explain the background to gravitational lensing before using these principles to
understand how the light of distant stars can be lensed by objects within our Galaxy. It is
the sections on microlensing in the Milky Way that are really syllabus material. The rest
you should consider as relevant background material, plus information of general interest.
6.2 Gravitational Deflection of Light by a Point Mass
Photons are affected by a gravitational field, but not in the same way as massive particles are. For the details we need general relativity, but fortunately, for astrophysical applications we only need to take over a few simple results. The most important is that if a light ray passes by a mass $M$ with impact parameter $R$ ($\gg GM/c^2$ and the size of the mass), it gets deflected by an angular amount
$$\alpha = \frac{4GM}{c^2 R} . \qquad (6.1)$$
In contrast, a massive body at high speed $v$ gets deflected by $\alpha = 2GM/(v^2 R)$ (which was proved in the discussion of relaxation time in stellar dynamics).
Figure 6.1: The deflection of a ray of light by a point mass. The deflection angle is $\alpha$.
In most practical applications, the gravitational deflection of light is very small. For example, the deflection of a ray of light skimming the surface of the Sun is only $8.5 \times 10^{-6}$ rad $\simeq 1.75$ arcsec. This was first measured using observations of a solar eclipse in 1919 by a team led by Arthur Eddington.
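The solar value follows directly from Equation 6.1. The sketch below evaluates it with rounded values for the physical constants and the solar radius; the function name and the constants are choices made here for illustration.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)
R_SUN = 6.957e8    # solar radius, m (rounded)


def deflection_angle(mass_kg: float, impact_parameter_m: float) -> float:
    """Deflection angle in radians from Equation 6.1, alpha = 4GM / (c^2 R)."""
    return 4.0 * G * mass_kg / (C * C * impact_parameter_m)


alpha_rad = deflection_angle(M_SUN, R_SUN)
alpha_arcsec = math.degrees(alpha_rad) * 3600.0
print(f"alpha = {alpha_rad:.2e} rad = {alpha_arcsec:.2f} arcsec")
```

Since $\alpha \propto 1/R$, a ray passing at two solar radii from the Sun's centre is deflected by half this amount.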
If lensing takes place over a short enough distance that the deflection can be taken to be sudden, the lens is said to be geometrically thin. Otherwise it is a thick lens.
6.3 The Lensing Equation
To make Equation 6.1 useful we need two approximations, both very good in almost all astrophysical situations:
(i) The deflector is much smaller than the distances to the observer and the object being viewed (the source);
(ii) The deflections are always very small, so we can freely use $\sin\theta = \theta$, and also we can get the total deflection from a mass distribution by integrating Equation 6.1.
Figure 6.2: The definitions of the quantities $D_L$, $D_S$, $D_{LS}$, $\theta$, $\theta_S$, and $\alpha$.
Accordingly, let us consider a situation as in Figure 6.2: the observer is viewing a source at distance $D_S$, with a lens (a mass screen) intervening at distance $D_L$; $D_{LS}$ is the distance from the lens to the source. On galactic scales $D_L$, $D_S$, $D_{LS}$ are ordinary distances, but on cosmological scales they must be understood as angular diameter distances, and $D_S \neq D_L + D_{LS}$. The reason for this complication is that the universe will have expanded substantially over the light travel time. We shall ignore these cosmological effects in this analysis because our objective is to understand lensing within the neighbourhood of our Galaxy.
We can use angular coordinates to describe the transverse positions.¹ Let $\boldsymbol{\theta}_S$ be the position of the source, and $\boldsymbol{\theta}$ be its observed position after being deflected. Note that these are two-dimensional angular positions, and they are therefore represented as vectors. Let $\boldsymbol{\alpha}(\boldsymbol{\theta})$ be the deflection angle. $\boldsymbol{\theta}_S$, $\boldsymbol{\theta}$ and $\boldsymbol{\alpha}$ will be measured in radians. Let $\Sigma(\boldsymbol{\theta})$ be the lens's surface mass density (expressed as the mass per unit solid angle, in units such as kg sr$^{-1}$, solar masses per steradian, or solar masses per square arcsecond).
Then, comparing vectors in the source plane, we get
$$D_S\,\boldsymbol{\theta} = D_S\,\boldsymbol{\theta}_S + D_{LS}\,\boldsymbol{\alpha} . \qquad (6.2)$$
(By convention,² $\boldsymbol{\alpha}$ is directed outwards from the deflecting mass rather than towards it.)
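Using Equation 6.1 to get $\boldsymbol{\alpha}$ in terms of $\Sigma$, we get
$$\boldsymbol{\theta} = \boldsymbol{\theta}_S + \frac{D_{LS}}{D_S}\,\boldsymbol{\alpha}(\boldsymbol{\theta}) , \qquad \boldsymbol{\alpha}(\boldsymbol{\theta}) = \frac{4G}{c^2 D_L} \int \Sigma(\boldsymbol{\theta}')\, \frac{(\boldsymbol{\theta} - \boldsymbol{\theta}')}{|\boldsymbol{\theta} - \boldsymbol{\theta}'|^2}\, d^2\theta' . \qquad (6.3)$$
This is known as the lens equation. It gives $\boldsymbol{\theta}_S$ as an explicit function of $\boldsymbol{\theta}$, but $\boldsymbol{\theta}$ as an implicit function of $\boldsymbol{\theta}_S$. Moreover, $\boldsymbol{\theta}(\boldsymbol{\theta}_S)$ need not be single-valued, so sources can be multiply imaged.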
6.4 Time Delays
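The deflected light experiences a time delay because of:
the increased geometric light travel time $T_{\rm geom} = \frac{1}{2} T_0 (\boldsymbol{\theta} - \boldsymbol{\theta}_S)^2$, where $T_0 \equiv \dfrac{D_L D_S}{c\,D_{LS}}$;
the delay within the gravitational potential, $-\Psi(\boldsymbol{\theta})$ (the Shapiro time delay).
The total time delay is therefore,
$$T = \frac{1}{2} T_0 (\boldsymbol{\theta} - \boldsymbol{\theta}_S)^2 - \Psi(\boldsymbol{\theta}) . \qquad (6.4)$$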
The potential time delay, $\Psi$, is a scalar function with the dimensions of time. We denote it here by the capital letter Psi (a related quantity called the lensing potential will be introduced later and will be denoted by a lower-case psi, $\psi$: take care not to confuse them). It is given by
$$\Psi(\boldsymbol{\theta}) = \frac{4G}{c^3} \int \Sigma(\boldsymbol{\theta}')\, \ln|\boldsymbol{\theta} - \boldsymbol{\theta}'|\, d^2\theta' . \qquad (6.5)$$
The total time delay $T$ has the property that
$$\nabla T = 0 ,$$
where $\nabla$ represents the gradient in the lens plane (i.e. $\dfrac{\partial}{\partial\theta_x}\,\mathbf{e}_x + \dfrac{\partial}{\partial\theta_y}\,\mathbf{e}_y$). This is merely Fermat's Principle, a standard result in optics. It states that the path taken by the light makes the travel time stationary (in the simplest case, a minimum) given the particular lens-observer-source configuration.

¹ Later on, we'll use $\theta_r$, $\theta_x$, $\theta_y$ as coordinates rather than $r$, $x$, $y$, to remind us that these are angles on the sky, not distances.
² The astrophysical convention being that you first think how a rational person would do it, and then you change the sign.
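The four equations
$$\nabla T = 0 , \qquad T = \frac{1}{2} T_0 (\boldsymbol{\theta} - \boldsymbol{\theta}_S)^2 - \Psi(\boldsymbol{\theta}) ,$$
$$\Psi(\boldsymbol{\theta}) = \frac{4G}{c^3} \int \Sigma(\boldsymbol{\theta}')\, \ln|\boldsymbol{\theta} - \boldsymbol{\theta}'|\, d^2\theta' , \qquad T_0 = \frac{D_L D_S}{c\,D_{LS}} , \qquad (6.6)$$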
represent a reformulation of the results of Equation 6.3. Although it is possible to work entirely with Equation 6.3, Equations 6.6 are much more intuitive. In the cosmological situation, both terms for $T$ need to be multiplied by $(1 + z_L)$, where $z_L$ is the redshift of the lensing object.
The gravitational time delay can be derived directly from general relativity, independently of Equation 6.1, and is known as the Shapiro time delay. Radio astronomers can measure it directly within the Solar System.
6.5 The Einstein Radius and Einstein Rings
Figure 6.3: The Einstein ring produced by symmetrical lensing of a point object by a point
mass.
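Consider a point mass $M$, which happens to be precisely between us and a point source. In other words $\boldsymbol{\theta}_S = 0$ and $\Sigma(\boldsymbol{\theta}) = M\,\delta(\boldsymbol{\theta})$. From the symmetry, we expect to observe a ring. The lens equation (6.3) is solved by $|\boldsymbol{\theta}| = \theta_E$, with
$$\theta_E^2 = \frac{4GM}{c^2}\, \frac{D_{LS}}{D_L D_S} . \qquad (6.7)$$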
The interpretation of this is that this perfectly aligned lens will produce an image that is a ring with an angular radius $\theta_E$ given (in radians) by
$$\theta_E = \sqrt{\frac{4GM}{c^2}\, \frac{D_{LS}}{D_L D_S}} . \qquad (6.8)$$
This circular image is called an Einstein ring.
The physical length $R_E$ corresponding to the angle $\theta_E$ is called the Einstein radius and is given by $R_E = \theta_E D_L$. The Einstein radius is therefore,
$$R_E = \sqrt{\frac{4GM}{c^2}\, \frac{D_L D_{LS}}{D_S}} . \qquad (6.9)$$
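To give a feel for the numbers, Equations 6.8 and 6.9 can be evaluated for a stellar-mass lens inside the Galaxy. The configuration below (a one solar mass lens at 4 kpc and a source at 8 kpc) is an assumed example chosen for illustration, not a case taken from the notes, and the function name and rounded constants are likewise choices made here. It takes $D_{LS} = D_S - D_L$, which holds only for non-cosmological distances.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)
KPC = 3.0857e19    # kiloparsec, m (rounded)
AU = 1.496e11      # astronomical unit, m (rounded)


def einstein_angle(mass_kg: float, d_l: float, d_s: float) -> float:
    """Einstein angle in radians from Equation 6.8.

    Distances are in metres; D_LS = D_S - D_L is assumed (no cosmology).
    """
    d_ls = d_s - d_l
    return math.sqrt(4.0 * G * mass_kg / C**2 * d_ls / (d_l * d_s))


# Assumed example: 1 solar mass lens at 4 kpc, source at 8 kpc.
d_l, d_s = 4.0 * KPC, 8.0 * KPC
theta_e = einstein_angle(M_SUN, d_l, d_s)
theta_e_mas = math.degrees(theta_e) * 3600.0e3   # milliarcseconds
r_e_au = theta_e * d_l / AU                      # Equation 6.9, in AU
print(f"theta_E ~ {theta_e_mas:.2f} mas, R_E ~ {r_e_au:.1f} AU")
```

An angular scale of about a milliarcsecond is far below what ground-based optical telescopes can resolve, which is why lensing by stars in the Galaxy is detected through brightness changes rather than through separate images.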
By a Gauss's-law type argument, for any circular mass distribution $\Sigma(\theta_r)$, $\alpha(\theta_r)$ and $\Psi(\boldsymbol{\theta})$ will be influenced only by interior mass. So we'll get the same images for any circular distribution of the mass $M$, provided it fits within an Einstein radius. Bodies that fit within their own Einstein radius are said to be compact. But the Einstein radius depends on where the source and observer are:
$$R_E \sim (\text{Schwarzschild radius} \times D_L)^{1/2} .$$
This effectively means that the further away you look, the easier it gets to see examples of gravitational lensing. This is a surprising fact at first, but it's really just the gravitational analogue of a familiar fact about glass lenses: to get the maximum effect from a lens you have to be near the focal plane, and if you're too near, the lens doesn't have much effect.
6.6 The Critical Surface Density $\Sigma_{\rm crit}$
For given values of $D_L$ and $D_S$, for a lens to be a compact object you have to pack a mass $M$ (in projection) into a circle of radius $\theta_E$. However, the area of this circle is proportional to the mass (through the dependence of $\theta_E$ on the mass). So clearly there has to be a critical density, say $\Sigma_{\rm crit}$, such that if $\Sigma \geq \Sigma_{\rm crit}$ somewhere then there is a compact object (or sub-object).
$\Sigma_{\rm crit}$ corresponds to a mass surface density of $M/(\pi\theta_E^2)$. Substituting for $\theta_E$ from Equation 6.8, we find that
$$\Sigma_{\rm crit} = \frac{D_L D_S}{D_{LS}}\, \frac{c^2}{4\pi G} . \qquad (6.10)$$
It has units of mass per unit solid angle (e.g. kg sr$^{-1}$, $M_\odot$ sr$^{-1}$ or $M_\odot$ arcsec$^{-2}$). Note that $\Sigma_{\rm crit}$ depends on the distances $D_L$, $D_S$ and $D_{LS}$. If $\Sigma > \Sigma_{\rm crit}$, the lensing is compact.
The fact that there is a critical density, and that it depends on distances, has important astrophysical consequences. For example, a galaxy as a whole (a smooth distribution of $\sim 10^{12}\,M_\odot$ on a scale of $\sim 10^5$ pc) is not compact to lensing for $D_L$ below about $10^9$ pc, that is, below cosmological distances. But clumps within the galaxy may be compact at much smaller distances. In particular, a star is compact to lensing at distances of even 1 pc.
6.7 The Lensing Potential
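It is possible to define a two-dimensional function $\psi(\boldsymbol{\theta})$ so that $\boldsymbol{\alpha}$ is related to the gradient of $\psi$ in the lens plane as,
$$\nabla\psi(\boldsymbol{\theta}) = \boldsymbol{\theta} - \boldsymbol{\theta}_S = \frac{D_{LS}}{D_S}\,\boldsymbol{\alpha}(\boldsymbol{\theta}) , \qquad (6.11)$$
from the lensing equation.
$\psi$ is a dimensionless scalar function known as the lensing potential. It is denoted by a lower-case letter $\psi$ and care should be taken to avoid confusing it with the time delay $\Psi$ caused by the gravitational potential. $\psi$ and $\Psi$ are related by
$$\Psi = T_0\,\psi = \frac{D_L D_S}{c\,D_{LS}}\,\psi . \qquad (6.12)$$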
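Using this, we can write Equation 6.6 more concisely as
$$\begin{aligned} \nabla T &= 0 , \qquad T = T_0 \left[ \frac{1}{2} (\boldsymbol{\theta} - \boldsymbol{\theta}_S)^2 - \psi(\boldsymbol{\theta}) \right] , \\ \psi(\boldsymbol{\theta}) &= \frac{1}{\pi} \int \kappa(\boldsymbol{\theta}')\, \ln|\boldsymbol{\theta} - \boldsymbol{\theta}'|\, d^2\theta' , \end{aligned} \qquad (6.13)$$
where $\kappa$ is the projected mass density in units of the critical density (i.e. $\kappa = \Sigma/\Sigma_{\rm crit}$). From the second line of Equation 6.13 it should be evident that $\psi$ satisfies a two-dimensional Poisson equation
$$\nabla^2 \psi = 2\kappa . \qquad (6.14)$$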
6.8 The Arrival Time Surface
The surface $T(\boldsymbol{\theta})$ is known as the time delay surface or the arrival time surface. Wherever the arrival time is stationary (i.e., the surface has a maximum, minimum, or saddle point) there will be constructive interference, and an image. This is Fermat's principle. Furthermore, the less the curvature of the surface at the images, the more magnified the image will be. This is formalised in the next section.
Try to visualise the arrival time surface. The geometrical part is a paraboloid with a minimum at $\boldsymbol{\theta}_S$. Having mass in the lens pushes up the surface variously. If $\kappa(\boldsymbol{\theta}) > 1$ anywhere, there will be a maximum somewhere near there, hence another image. There must be a third image too, because to have a minimum and a maximum in a surface you must have a saddle point somewhere. In fact
$$\text{maxima} + \text{minima} = \text{saddle points} + 1 . \qquad (6.15)$$
This is really a statement about geometry that should be intuitively clear, though a formal proof is difficult.
A good way of gaining some intuition about the arrival time surface is to take a transparency with a blank piece of paper behind it and look at the reflections of a light bulb. Notice how images merge and split, and how you get grotesquely stretched images just as they do. Deep images of rich clusters of galaxies show just these effects!
6.9 Magnification
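By magnification we mean: how much does the image move when we move the source? It should be clear that this magnification can't be a scalar, because an image doesn't in general move in the same direction as the source. In fact the magnification is a tensor. We'll denote it by $\mathsf{M}$ (the letter A for amplification is also used). Formalising our definition, we have
$$\mathsf{M}^{-1} = \frac{\partial\boldsymbol{\theta}_S}{\partial\boldsymbol{\theta}} = \frac{1}{T_0}\,\frac{\partial^2}{\partial\boldsymbol{\theta}^2}\, T(\boldsymbol{\theta}) , \qquad (6.16)$$
where the factor $1/T_0$ makes the right-hand side dimensionless, in agreement with Equation 6.13.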
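In Cartesian coordinates
$$\mathsf{M}^{-1} = \begin{pmatrix} 1 - \dfrac{\partial^2\psi}{\partial\theta_x^2} & -\dfrac{\partial^2\psi}{\partial\theta_x\,\partial\theta_y} \\[2ex] -\dfrac{\partial^2\psi}{\partial\theta_x\,\partial\theta_y} & 1 - \dfrac{\partial^2\psi}{\partial\theta_y^2} \end{pmatrix} .$$
Notice that $\mathsf{M}^{-1}$ is basically taking the curvature of the arrival time surface.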
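It is helpful to write $\mathsf{M}^{-1}$ in terms of its eigenvalues, and the usual form is
$$\mathsf{M}^{-1} = (1 - \kappa) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \gamma \begin{pmatrix} \cos 2\phi & \sin 2\phi \\ \sin 2\phi & -\cos 2\phi \end{pmatrix} . \qquad (6.17)$$
The eigenvalues are of course $1 - \kappa \pm \gamma$. The first term in Equation 6.17 is the trace part, and comparing Equations 6.17 and 6.14 shows that the quantity appearing in it must be the $\kappa$ of Equation 6.14 (the trace of $\mathsf{M}^{-1}$ is $2 - \nabla^2\psi = 2(1 - \kappa)$), while the second term is traceless. The term with $\kappa$ produces an isotropic expansion or contraction, while the $\gamma$ term produces a stretching in the $\phi$ direction and a shrinking in the perpendicular direction; $\kappa$ is known as convergence and $\gamma$ as shear.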
The determinant of $\mathsf{M}$ can be thought of as a scalar magnification,
$$|\mathsf{M}| = \left[ (1 - \kappa)^2 - \gamma^2 \right]^{-1} . \qquad (6.18)$$
The areas of images on the sky are increased by a factor $|\mathsf{M}|$. The places where one of the eigenvalues of $\mathsf{M}^{-1}$ becomes zero (and in consequence $|\mathsf{M}|$ is infinite) are in general curves and are known as critical curves. When critical curves are mapped onto the source plane through the lens equation, they give caustics; a source lying on a caustic gets infinitely magnified.
6.10 Examples of the Magnification: Lensing by a Point-Mass and by an Isothermal Sphere
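For a point mass, the lens equation is
$$\theta_{Sx} = \theta_x - \frac{\theta_x}{\theta_r^2}\,\theta_E^2 , \qquad \theta_{Sy} = \theta_y - \frac{\theta_y}{\theta_r^2}\,\theta_E^2 ,$$
and this gives
$$\mathsf{M}^{-1} = \begin{pmatrix} 1 - \left( \dfrac{1}{\theta_r^2} - \dfrac{2\theta_x^2}{\theta_r^4} \right) \theta_E^2 & \dfrac{2\theta_x\theta_y}{\theta_r^4}\,\theta_E^2 \\[2ex] \dfrac{2\theta_x\theta_y}{\theta_r^4}\,\theta_E^2 & 1 - \left( \dfrac{1}{\theta_r^2} - \dfrac{2\theta_y^2}{\theta_r^4} \right) \theta_E^2 \end{pmatrix} . \qquad (6.19)$$
Taking the determinant and simplifying, we get
$$|\mathsf{M}|^{-1} = 1 - \frac{\theta_E^4}{\theta_r^4} . \qquad (6.20)$$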
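Equation 6.20 can be combined with the point-mass lens equation to obtain the total magnification of a source. Solving $\theta_S = \theta - \theta_E^2/\theta$ along the line through the lens and the source gives two images at $\theta_\pm = \frac{1}{2}\left(\theta_S \pm \sqrt{\theta_S^2 + 4\theta_E^2}\right)$. The sketch below, with all angles in units of $\theta_E$ and with function names invented here, sums the absolute magnifications of the two images and compares the result with the closed form $(u^2 + 2)/(u\sqrt{u^2 + 4})$, where $u = \theta_S/\theta_E$. That closed form is a standard result quoted here as a check, not something derived in the text above.

```python
import math


def image_positions(u: float) -> tuple[float, float]:
    """Two image positions (in units of theta_E) for a source at offset u > 0,
    found by solving the point-mass lens equation u = theta - 1/theta."""
    root = math.sqrt(u * u + 4.0)
    return 0.5 * (u + root), 0.5 * (u - root)


def magnification(theta: float) -> float:
    """Signed scalar magnification from Equation 6.20, theta in units of theta_E."""
    return 1.0 / (1.0 - theta ** -4)


def total_magnification(u: float) -> float:
    """Sum of the absolute magnifications of the two images."""
    return sum(abs(magnification(t)) for t in image_positions(u))


for u in (0.1, 0.5, 1.0, 2.0):
    closed_form = (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))
    print(f"u = {u:3.1f}   total = {total_magnification(u):.4f}   closed form = {closed_form:.4f}")
```

The image inside the Einstein ring has negative signed magnification (it is mirror-reversed), which is why the absolute values are summed.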
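We shall now consider a circular mass distribution $\Sigma \propto 1/\theta_r$. This is known as the isothermal lens, because it is the $\rho \propto 1/r^2$ isothermal sphere in projection. The lens equation for the isothermal lens is
$$\theta_{Sx} = \theta_x - \frac{\theta_x}{\theta_r}\,\theta_E , \qquad \theta_{Sy} = \theta_y - \frac{\theta_y}{\theta_r}\,\theta_E ,$$
and gives
$$\mathsf{M}^{-1} = \begin{pmatrix} 1 - \left( \dfrac{1}{\theta_r} - \dfrac{\theta_x^2}{\theta_r^3} \right) \theta_E & \dfrac{\theta_x\theta_y}{\theta_r^3}\,\theta_E \\[2ex] \dfrac{\theta_x\theta_y}{\theta_r^3}\,\theta_E & 1 - \left( \dfrac{1}{\theta_r} - \dfrac{\theta_y^2}{\theta_r^3} \right) \theta_E \end{pmatrix} .$$
And from this we get
$$|\mathsf{M}|^{-1} = 1 - \frac{\theta_E}{\theta_r} . \qquad (6.21)$$
This is shorter in polar coordinates, but tensor components in polar coordinates can get confusing.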
6.11 The Conservation of Surface Brightness
Magnification in lensing conserves surface brightness. We can prove this in a rather interesting way. Let us consider the axial direction as a formal time variable $t$; then light rays can be thought of as trajectories. Now allow observers to be at arbitrary transverse position (say $\mathbf{w}$, two dimensional) and arbitrary $t$. Then $\boldsymbol{\theta}$ as observed at $(\mathbf{w}, t)$ is just the local $d\mathbf{w}/dt$ for the corresponding light ray, up to a constant factor. This means we can make a formal analogy with the Hamiltonian formulation of stellar dynamics, with $\boldsymbol{\theta}$ (up to a constant) playing the role of the momentum, $\mathbf{w}$ playing the role of the coordinates, and a potential depending on $(\mathbf{w}, t)$ replacing the Newtonian potential. The phase space density $f$ is the density of photons in $(\mathbf{w}, \boldsymbol{\theta})$ space, or the number of photons per unit solid angle on the sky per unit telescope area, i.e., the surface brightness. The collisionless Boltzmann equation applies (as it does for any Hamiltonian system) and it tells us that surface brightness is conserved along trajectories! Surface brightness must be conserved by the act of placing the lens there too: think of surface brightness before and after going through the lens. QED. We must be careful, though, to understand "along the trajectories" correctly. It means we must always be looking at photons from the same source, so if the image is moved in the sky by lensing we must follow it when we measure surface brightness.
This means that lensing changes the apparent sizes (and shapes) of objects, but does not
alter their surface brightness. Lensing a source will change its apparent brightness because
its angular area is changed, not because the surface brightness changes.
6.12 The Preservation of Spectroscopic and Colour Information
The gravitational lensing of light does not depend on the wavelength: the wavelength of light,
or equivalently the energy of photons, does not appear in the equations describing lensing.
Therefore, the spectrum of a source is not changed if the source is lensed by a foreground
object. Equally, the colour of the source is unchanged.
This provides an important test for the effects of lensing. If two images on the sky are suspected to be caused by lensing of a single object, we expect that their spectra will be the same. They should show the same spectral features, and have the same redshift.
One qualification needs to be made. An extended source, such as a galaxy, may emit light from different regions that have different spectra and different colours. This can produce some practical changes in observed spectra on account of lensing in some instances, caused by the different magnification of subregions.
6.13 Multiple-image QSOs
Multiple imaging of a quasi-stellar object (QSO) happens when a foreground galaxy lies within about $\theta_E$ (in projection) of a QSO, and produces two or four images with arcsecond order separations. Two-image systems have a minimum and a saddle point, while four-image systems have two minima and two saddle points. In both cases there is a maximum as well, at the bottom of the galaxy's potential well; but since that is also generally the densest part of the galaxy, $\kappa$ is very high and $|\mathsf{M}|$ nearly vanishes, so these central images are too faint to detect.
Multiple-image QSOs are of great astrophysical interest, and two things make them so.
The first is that since QSOs are often very time-variable and the different images have
different arrival times, the images will show the same time-variability, but with offsets. These
offsets are simply the differences in $T(\theta)$ between different images. (So far they have been
explicitly measured for several lenses.) Provided we know (or can model) $\kappa(\theta)$, the
measured time offsets tell us $T_0$, and hence $H_0$. Basically it's this: normally we can only
measure dimensionless quantities (image separations, relative magnifications) in lens systems;
but if we succeed in measuring a quantity that has a scale (the time delays), it can tell us the
scale of the Universe ($H_0$). In practice, there is considerable uncertainty about the
distribution of mass in the lensing galaxies, and this translates into an uncertainty in the
inferred $H_0$ that is much larger than errors in the time delays. Maybe this problem will be
overcome, maybe not.
The second thing has to do with the extremely small size of QSOs in optical continuum.
Now the $\kappa(\theta)$ of a galaxy isn't perfectly smooth: it becomes granular on the scale of
individual stars. This produces a very complicated network of critical lines (in the lens
plane), and a correspondingly complicated network of caustics in the source plane (like the
pattern at the bottom of a swimming pool). The optical continuum emitting regions of QSOs are
small enough to fit between the caustics, but the line emitting regions straddle several caustics.
As proper motions move the caustic network, the continuum region will sometimes cross
a caustic, and show a sudden change in brightness; the time taken for the brightness to
change is the time it takes to cross the caustic. This is the phenomenon of QSO microlensing:
the continuum shows it but the lines don't. (It's just the gravitational version of stars twinkling
and planets not twinkling.) This has been observed, and modelling the caustic network and
putting in plausible values for the proper motion leads to an estimate of the intrinsic size of
the continuum regions of QSOs. It's very small: of order 100 AU.
Figure 6.4: Examples of gravitational lensing of quasars and distant galaxies by foreground
galaxies. The pictures show images of candidate lenses recorded with the Hubble Space
Telescope. [From NASA HST press release 1999-18, produced by Kavan Ratnatunga (Carnegie
Mellon Univ.), NASA and the Space Telescope Science Institute.]
6.14 Galaxy Clusters
Galaxy clusters are generally not in dynamical equilibrium (there haven't been enough
crossing times since they formed). Their mass distributions and potentials are thus warped in
more complicated ways than for single galaxies. They are also much bigger on the sky and
thus have many more background objects (faint blue galaxies) to lens.
The transparency with a paper behind it and several light bulbs overhead is a good
analogy for lensing by a cluster. Rich clusters show many highly stretched images of background
galaxies, and these are known as arcs. A deep HST image of Abell 2218 shows over a hundred
arcs, including seven multiple-image systems.
An arc is close to a zero eigenvalue of $M^{-1}$, and is stretched along the corresponding
eigenvector. Thus each arc provides some sort of constraint on the $\kappa$ of the cluster.
Clusters also show weak lensing. That's when the eigenvalues $1 - \kappa \pm \gamma$ are too close
to unity to show up as arcs, but if many galaxies in the same region are examined then
statistically a stretching is measurable. The statistical stretching measures the ratio of the
two eigenvalues, and thus $\gamma/(1 - \kappa)$.
Several groups have been reconstructing cluster mass profiles from information provided
by multiple images, arcs, and weak lensing.
Figure 6.5: Gravitational lensing by a cluster of galaxies. This Hubble Space Telescope
image of the cluster Abell 2218 shows many arcs caused by the lensing of distant background
galaxies by the mass distribution in the cluster. [NASA image recorded with the Hubble
Space Telescope by Andrew Fruchter and the ERO Team and released by the Space Telescope
Science Institute (as STScI-2000-07).]
6.15 Microlensing in the Milky Way
The exact nature of dark matter in the Universe is still unclear, despite dedicated research
over many years. The available evidence indicates that a majority of the dark matter is
dynamically cold and that it is collisionless. It could be in the form of subatomic particles,
such as weakly-interacting massive particles (WIMPs), or could be astronomical objects that
emit little or no radiation, known as massive astrophysical compact halo objects (MACHOs).
One possibility is that the dark matter in the Milky Way halo is MACHOs in the form
of brown dwarfs, compact objects below the hydrogen burning threshold of $0.08\,M_\odot$. Such
objects would act as point lenses. Indeed, the gravitational lensing of background stars by
dark astronomical objects in the halo would be one way of detecting individual dark matter
objects, and therefore of identifying the nature of dark matter.
A point lens has two images, at
$$\theta = \frac{1}{2}\left(\theta_S \pm \sqrt{\theta_S^2 + 4\theta_E^2}\right). \qquad (6.22)$$
(There is formally a third image at $\theta = 0$, i.e., at the lens itself, but for a point mass
this image has zero magnification.) The image separation for a $\sim M_\odot$ lens at distances of
$\sim 10$ kpc is $< 1$ mas, far too small to resolve. What will be observed is a brightening equal
to the combined magnification of both images. Using the result Equation 6.20 for $|M|$ for a
point lens, and adding the absolute values of $|M|$ at the two image positions, we get
$$M_{tot} = \frac{u^2 + 2}{u\,(u^2 + 4)^{1/2}}, \qquad u = \frac{\theta_S}{\theta_E}. \qquad (6.23)$$
Now because of stellar motions, $\theta_S$ will change by an amount $\sim \theta_E$ over times of
order a month, so microlensing in the Milky Way can be observed by monitoring light curves. If
the background source star has impact parameter $b$ and velocity $v$ (projected onto the lens
plane) with respect to the lens, then
$$u = \frac{(b^2 + v^2 t^2)^{1/2}}{D_L \theta_E}. \qquad (6.24)$$
Figure 6.6: Predicted light curves for impact parameters of $R_E$ (lowest), $0.5R_E$ and $0.2R_E$
(top). The unit of time is how long it takes the source to move a distance $R_E$.
Inserting Equation 6.24 into 6.23 gives us $M_{tot}(t)$, i.e. the light curve. This is plotted for
three different $b$ in Figure 6.6. The height of a measured light curve immediately gives $R_E/b$,
and the width gives $R_E/v$.
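The light curve described above can be sketched numerically. The following is a minimal illustration, not part of the original notes: it works in units where $R_E = D_L\theta_E = 1$ and $v = 1$, so that $u = \sqrt{b^2 + t^2}$, and the function names and the sampling grid are assumptions chosen for the example. The magnitude conversion uses $m(t) = m_0 - 2.5\log M_{tot}(t)$.

```python
import math


def total_magnification(u):
    """Equation 6.23: combined magnification of the two images of a point lens."""
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))


def light_curve(b, times):
    """M_tot(t) from Equations 6.23 and 6.24.

    b is the impact parameter in units of R_E = D_L * theta_E, and each time is
    in units of R_E / v, so that u = sqrt(b^2 + t^2).
    """
    return [total_magnification(math.hypot(b, t)) for t in times]


# Illustrative sampling grid (an assumption): t = -2.0, -1.5, ..., +2.0
times = [0.5 * i for i in range(-4, 5)]

for b in (1.0, 0.5, 0.2):
    peak = max(light_curve(b, times))
    # m(t) = m_0 - 2.5 log10 M_tot(t), so the peak brightening in magnitudes is:
    delta_m = 2.5 * math.log10(peak)
    print(f"b = {b:.1f} R_E: peak M_tot = {peak:.3f}, brightening = {delta_m:.2f} mag")
```

Smaller impact parameters give taller and narrower peaks, which is why the height and width of an observed curve constrain $R_E/b$ and $R_E/v$ respectively.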
Though trying to resolve the images in microlensing seems hopeless with foreseeable
technology, there are some prospects for tracking the moving double image indirectly.
By combining the positions and magnifications of the two images, we have for the centroid
$$\theta_{cen} = \frac{u\,(3 + u^2)}{2 + u^2}\,\theta_E. \qquad (6.25)$$
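As a sanity check on the centroid formula, one can weight the two image positions of Equation 6.22 by their individual magnifications and compare with Equation 6.25. This is a hedged sketch: the individual magnifications $\frac{u^2+2}{2u\sqrt{u^2+4}} \pm \frac{1}{2}$ are the standard point-lens result and are assumed here rather than taken from this section, and all function names are illustrative. Angles are in units of $\theta_E$.

```python
import math


def image_positions(u):
    """Equation 6.22 in units of theta_E: the two image positions for source offset u."""
    s = math.sqrt(u * u + 4.0)
    return 0.5 * (u + s), 0.5 * (u - s)


def image_magnifications(u):
    """Absolute magnifications of the two images (standard point-lens result, assumed)."""
    s = math.sqrt(u * u + 4.0)
    common = (u * u + 2.0) / (2.0 * u * s)
    return common + 0.5, common - 0.5


def centroid_from_images(u):
    """Magnification-weighted mean image position, in units of theta_E."""
    theta_plus, theta_minus = image_positions(u)
    mag_plus, mag_minus = image_magnifications(u)
    return (mag_plus * theta_plus + mag_minus * theta_minus) / (mag_plus + mag_minus)


def centroid_formula(u):
    """Equation 6.25 in units of theta_E."""
    return u * (3.0 + u * u) / (2.0 + u * u)


for u in (0.2, 0.5, 1.0, 2.0):
    print(u, centroid_from_images(u), centroid_formula(u))
```

The two columns printed for each $u$ should agree to rounding error, and the two magnifications sum to $M_{tot}$ of Equation 6.23.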
Such microlensing events are rare, because $\theta_S$ has to be $\lesssim \theta_E$ for significant
magnification. People speak of an "optical depth to microlensing" in a field. This is the
probability of a star being (in projection) within $\theta_E$ of a foreground lens, at any given
time. From Equation 6.23 with $u \leq 1$, it amounts to the probability of
$M_{tot} \geq 3/\sqrt{5} = 1.34$. It's just the covering factor of discs of radius $\theta_E$
(Einstein rings) from all lenses between us and the stars in the field. The source
stars might be bright stars in the Large Magellanic Cloud (LMC) and the lenses very faint
stars or brown dwarfs in the Milky Way halo. Note that the term optical depth has a very
different meaning here from its use in radiation physics, as was used for example
in the discussion of extinction by dust in the interstellar medium.
We can derive an expression giving the optical depth towards a source plane as a function
of the density of microlensing objects in space between the source plane and an observer.
Consider a field on the sky subtending a solid angle $\Omega$. Microlensing occurs due to objects
of mass $M_L$ between the observer and the source plane. Consider a thin shell over this solid
angle between a distance $D_L$ and $D_L + dD_L$. The fraction of the field covered by the Einstein
discs of the lenses in this thin shell is
$$d\tau = \pi \theta_E^2 \, dN / \Omega$$
Figure 6.7: The observed light curve of microlensing event BUL SC3 91382 from the OGLE
survey. The graph shows the observed brightness of a star in the Galactic Bulge, expressed
as the infrared magnitude, plotted against time. The star was observed to brighten and fade
over a period of several weeks. The curve is a fit to the data points using the expression
for $M_{tot}$ from Equation 6.23 and $u$ from Equation 6.24, after choosing the time of maximum
brightness, the magnitude $m_0$ before/after the lensing event, and the quantities
$b/(D_L\theta_E)$ and $v/(D_L\theta_E)$ to achieve the best fit. The magnitude at any time is fitted
with $m(t) = m_0 - 2.5 \log M_{tot}(t)$. [Plotted with data provided by the OGLE project.]
where $dN$ is the number of lenses in this thin shell. If $n$ is the number density of lenses,
$$dN = n\,\Omega\,D_L^2\,dD_L.$$
The mass density is $\rho = nM_L$. Therefore,
$$d\tau = \frac{\pi \rho\,\theta_E^2\,D_L^2\,dD_L}{M_L}.$$
Substituting for the angular Einstein radius, $\theta_E^2 = 4GM_L D_{LS}/(c^2 D_L D_S)$, and
integrating over lens distance $D_L$ from the observer to the source plane, we obtain
$$\tau = \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L\,D_{LS}\,\rho(D_L)\,dD_L. \qquad (6.26)$$
The really nice thing about the formula 6.26 is that it doesn't depend on the mass
distribution of the lenses, as long as each mass fits within its own Einstein radius (diffuse
gas clouds don't count, nor does any kind of diffuse dark matter). So $\tau$ estimated from light
curve monitoring could be used to make inferences about $\rho$.
How large is $\tau$ through the Galactic halo? To estimate that, we need an estimate for
$\rho$. Now the Milky Way rotation curve suggests an isothermal halo,
$\rho = \sigma^2/(2\pi G r^2)$, with $\sigma \simeq 200$ km/sec. If we then say that $r$ will be of
order the $D$ factors in Equation 6.26, we get
$$\tau \sim \frac{\sigma^2}{c^2}, \quad \text{or } 10^{-7} \text{ to } 10^{-6}.$$
This is a very low figure, showing that the probability of detecting a microlensing event when
monitoring a single star is negligible. However, monitoring very large numbers ($\sim 10^6$ to
$10^7$) of stars simultaneously can make this feasible.
With some more care, we can estimate the lensing optical depth more accurately. As an
example, consider a study of stars in a (hypothetical) dwarf galaxy companion to our own
Galaxy that lies at a distance $S$ from the Sun in the direction of the South Galactic Pole. The
survey aims to detect the brightening of stars in the dwarf galaxy caused by microlensing by
MACHOs in the Galactic halo along the sight line to the dwarf galaxy, as a test of whether
the dark matter halo of our Galaxy is made out of MACHOs. We shall assume that the dark
matter halo can be represented by an isothermal sphere model in which the density profile
is given by
$$\rho(r) = \frac{\sigma^2}{2\pi G r^2}, \qquad (6.27)$$
where $r$ is the radial distance from the Galactic Centre, $\sigma$ is the velocity dispersion of
particles moving in this potential, and $G$ is the constant of gravitation. This isothermal sphere
model is likely to be a reasonable representation of the real density profile because it implies
a flat rotation curve, provided we do not use it close to the Galactic Centre (because it implies
infinite density as $r \to 0$) or at very large distances (because it implies a flat rotation
curve out to infinite distance).
In this example, the source of light is a star in the companion galaxy at a distance
$D_S = S$. The lensing object is a MACHO at a distance $D_L$ from the Sun. The distance of
the lensing object from the Galactic Centre is $r = \sqrt{R_0^2 + D_L^2}$, where $R_0$ is the
distance of the Sun from the Galactic Centre. The distance between the lensing object and the
source is $D_{LS} = D_S - D_L = S - D_L$. Using Equation 6.26,
$$\begin{aligned}
\tau &= \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L\,D_{LS}\,\rho(r)\,dD_L \\
&= \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L\,(D_S - D_L)\,\frac{\sigma^2}{2\pi G r^2}\,dD_L \\
&= \frac{4\pi G}{c^2 S} \int_0^{S} D_L\,(S - D_L)\,\frac{\sigma^2}{2\pi G\,(R_0^2 + D_L^2)}\,dD_L \\
&= \frac{2\sigma^2}{c^2 S} \int_0^{S} \frac{D_L S - D_L^2}{R_0^2 + D_L^2}\,dD_L \\
&= \frac{2\sigma^2}{c^2 S} \left[ S \int_0^{S} \frac{D_L\,dD_L}{R_0^2 + D_L^2} - \int_0^{S} \frac{D_L^2\,dD_L}{R_0^2 + D_L^2} \right].
\end{aligned}$$
This can be solved using the standard integrals
$$\int \frac{x\,dx}{a^2 + x^2} = \frac{1}{2}\ln\left(a^2 + x^2\right) + \text{constant},$$
$$\int \frac{x^2\,dx}{a^2 + x^2} = x - a\tan^{-1}\left(\frac{x}{a}\right) + \text{constant}.$$
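Using these standard integrals, we get
$$\begin{aligned}
\tau &= \frac{2\sigma^2}{c^2 S} \left[ \left[ \frac{S}{2}\ln\left(R_0^2 + D_L^2\right) \right]_{D_L=0}^{S} + \left[ -D_L + R_0\tan^{-1}\left(\frac{D_L}{R_0}\right) \right]_{D_L=0}^{S} \right] \\
&= \frac{2\sigma^2}{c^2 S} \left[ \frac{S}{2}\ln\left(R_0^2 + S^2\right) - \frac{S}{2}\ln\left(R_0^2\right) - S + R_0\tan^{-1}\left(\frac{S}{R_0}\right) + 0 - R_0\tan^{-1} 0 \right] \\
&= \frac{2\sigma^2}{c^2 S} \left[ \frac{S}{2}\ln\left(1 + \frac{S^2}{R_0^2}\right) - S + R_0\tan^{-1}\left(\frac{S}{R_0}\right) \right],
\end{aligned}$$
which simplifies to
$$\tau = \frac{\sigma^2}{c^2} \left[ \ln\left(1 + \frac{S^2}{R_0^2}\right) - 2 + 2\,\frac{R_0}{S}\tan^{-1}\left(\frac{S}{R_0}\right) \right]. \qquad (6.28)$$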
The distance of the Sun from the Galactic Centre is $R_0 = 8.0$ kpc. Putting $S = 100$ kpc as
the distance to the companion galaxy (a reasonable figure), we get $S/R_0 = 12.5$. Therefore,
$$\tau = 3.3\,\frac{\sigma^2}{c^2}.$$
Observations of stars in the stellar halo of the Galaxy find $\sigma = 200$ km s$^{-1}$. Using this
figure for the hypothetical MACHOs gives $\tau = 1.5 \times 10^{-6}$. That is to say, were the dark
matter halo made of compact objects such as low mass stars, brown dwarfs, planet-size bodies or
stellar mass black holes, we would expect microlensing events to occur with a frequency defined
by $\tau = 1.5 \times 10^{-6}$.
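The closed-form result of Equation 6.28 can be cross-checked against a direct numerical integration of Equation 6.26 with the isothermal density of Equation 6.27. The sketch below is illustrative only: the function names, the midpoint rule and the number of integration steps are assumptions made for the example, and distances are in kpc and velocities in km/s.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def tau_closed_form(sigma_km_s, S_kpc, R0_kpc):
    """Equation 6.28: optical depth through an isothermal halo towards the Galactic pole."""
    x = S_kpc / R0_kpc
    bracket = math.log(1.0 + x * x) - 2.0 + 2.0 * math.atan(x) / x
    return (sigma_km_s / C_KM_S) ** 2 * bracket


def tau_numerical(sigma_km_s, S_kpc, R0_kpc, n=20000):
    """Midpoint-rule integration of Equation 6.26 with the density of Equation 6.27."""
    h = S_kpc / n
    total = 0.0
    for i in range(n):
        d_l = (i + 0.5) * h
        total += d_l * (S_kpc - d_l) / (R0_kpc ** 2 + d_l ** 2)
    return 2.0 * (sigma_km_s / C_KM_S) ** 2 * total * h / S_kpc


tau_a = tau_closed_form(200.0, 100.0, 8.0)
tau_n = tau_numerical(200.0, 100.0, 8.0)
print(f"closed form: {tau_a:.3e}, numerical: {tau_n:.3e}")
```

Both routes should give an optical depth close to $1.5 \times 10^{-6}$ for $\sigma = 200$ km s$^{-1}$, $S = 100$ kpc and $R_0 = 8$ kpc.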
So to have any hope of detecting such microlensing events, it is necessary to monitor
the light curves of millions of stars. A number of surveys have been undertaken in the last
decade, observing fields in the LMC and the Milky Way bulge among others. (The bulge
surveys go through the Milky Way disc, of course, but do also probe that part of the dark
matter halo that extends through the disc; see also footnote 3.)
6.16 Results of Microlensing Surveys
Several surveys of large numbers of stars have been conducted over the past several years to
identify microlensing events. These have observed stars in the Galactic Bulge, in the Large
Magellanic Cloud and in the Andromeda Galaxy M31. These surveys have monitored many
millions of stars regularly for years to search for increases in their brightnesses consistent
with microlensing by possible MACHOs, and also with lensing by ordinary stars.
A considerable number of microlensing events have been observed to date. The current
measurements of $\tau$ from observations are $\sim 10^{-7}$ towards the LMC and
$\sim 3 \times 10^{-6}$ towards the Bulge. However, this frequency is substantially lower than would
be expected were the dark matter halo of the Galaxy made entirely of MACHOs: too few lensing
events have been found to explain the dark matter in the halo. Many of the lensing events can
be explained as being caused by main-sequence stars or white dwarfs. How much of the lensing
mass is in brown dwarfs as distinct from faint stars is not entirely clear, but the available
evidence suggests that compact astronomical objects can make up $\lesssim 20\%$ of the dark matter
halo of the Galaxy. Meanwhile, the huge number of variable stars discovered by these surveys
is revolutionising that field of study.
Footnote 3: An estimate of $\tau$ from a survey will include a correction for the detection
efficiency. Surveys have to be very wary of spurious detections; hence any light curve possibly
contaminated by stellar variability has to be discarded for microlensing purposes. Detection
efficiencies are of order 30%.
Chapter 7
The Galaxy: Its Structure and
Content
7.1 Introduction
Chapter 1 included a brief overview of our Galaxy, while later chapters discussed important
processes affecting the Galaxy and other galaxies. Here we bring these concepts together to
develop an understanding of the Galaxy itself. This will include a consideration of each of
the components (disc, bulge, halo, etc.) in detail.
The Milky Way Galaxy is, as far as we know, a typical disc galaxy. Figure 1.8 was a
cartoon to remind you of its different components. The luminous parts are mostly a disc of
population I stars and a bulge of older population II stars. We live in the disc, with the Sun
at a distance $R_0 = 8.0$ kpc from the centre. Apart from stars, the disc also has clusters of
young stars and HII regions, and gas and dust; the gas is mostly observed as an HI layer
which flares at large radii. There is some evidence that there are two or three spiral arms
in the disc (the dust makes it hard to tell). The bulge is accompanied by a bar, though the
dimensions of it are unclear. There are some very old stars (and globular clusters of very
old stars) in the stellar halo. But the most massive part is the dark matter halo, which is
made of dark matter of unknown composition.
That is not all: there are also the small companion galaxies. The best known of these
are the Large and Small Magellanic Clouds (LMC and SMC), which are about 50 kpc away;
these are associated with a trail of debris, mostly HI gas, known as the Magellanic Stream.
Then there is the Sagittarius Dwarf Galaxy which appears to be merging with the Milky
Way now.
7.2 The Mass of the Galaxy
The mass of the Galaxy enclosed within different radii can be determined using a variety of
methods. Observations of the rotation curve provide measurements out to about 12-15 kpc.
Measurements of the dynamics of globular clusters can constrain the enclosed mass to a
greater distance.
However, while there are good estimates of the enclosed mass of the Milky Way within
different radii, it is not known where the halo of the Milky Way finally fades out (or even if
the size of the halo is a very meaningful concept). So the only way to get at the total mass
of the Milky Way is to observe its effect on other galaxies. The simplest but most robust
of these methods comes from an analysis of the mutual dynamics of the Milky Way and M31 (the
Great Andromeda Galaxy): it is known as the timing argument. This is discussed in the
next section.
7.3 The Mass of the Galaxy from Dynamical Timing Arguments
The dynamical timing argument relies on modelling the dynamics of the Galaxy and nearby
galaxies. The Local Group contains two substantial spiral galaxies, the Galaxy and M31 (it
does also contain one less massive spiral, M33, several irregular galaxies of modest mass, and
numerous low mass dwarfs). We shall first consider the mass constraint that can be obtained
from the dynamics of M31 and the Galaxy, ignoring the other Local Group galaxies.
The observational inputs are (i) M31 is 750 kpc away, and (ii) the Milky Way and M31
are approaching at 121 km s$^{-1}$. (The transverse velocity of M31 is poorly determined at
present.) A simple approximation for their dynamics is to suppose that they started out at
the same point moving apart with initial velocities from the Big Bang, and have since turned
around because of mutual gravity. This is not strictly true of course, because galaxies had
not already formed at the Big Bang; however it is thought that galaxies (at least galaxies like
these) formed early in the history of the Universe, so the approximation may be acceptable.
Writing $l$ for the distance of M31 from the Galaxy, and $M$ for the combined mass of both
systems, the equation of motion for the reduced Keplerian one-body problem is
$$\frac{d^2 l}{dt^2} = -\frac{GM}{l^2}. \qquad (7.1)$$
Here we shall count time from the Big Bang, so that $t = 0$ refers to the Big Bang. The
current time and separation are $t_0$ and $l_0$.
In considering a Keplerian problem without perturbation we are, of course, assuming
that the gravity from Local Group dwarfs and the cosmological tidal field is negligible; but
as there are no other large galaxies within a few Mpc this seems a fair approximation.
It is not obvious how to solve this nonlinear equation, but fortunately the solutions are
well known and easy to verify. There are actually three solutions, depending on the precise
circumstances. One solution applies to the case where the combined mass $M$ is too small to
halt the expansion and the two galaxies drift further apart for ever: this is not the case we
have here. A second solution applies to the limiting case where the mass is just insufficient
to stop the motion apart (so $dl/dt \to 0$ and $l \to \infty$ as $t \to \infty$). The third solution
applies to the case where the mass is great enough to halt the drift apart and the galaxies fall
back toward each other: it is this case that we have here, where the two galaxies are already
falling towards each other.
This solution is most conveniently expressed in parametric form, as
$$t = \tau_0\,(\eta - \sin\eta), \qquad l = \left(GM\tau_0^2\right)^{1/3} (1 - \cos\eta). \qquad (7.2)$$
Here $\tau_0$ is an integration constant. The other integration constant has been eliminated by the
boundary condition that the two galaxies (or at least the material from which they formed)
were at the same position immediately after the Big Bang: i.e. $l = 0$ at $t = 0$.
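It is easy to show that these equations for $t$ and $l$ are a solution to Equation 7.1 by
differentiating them to get $d^2l/dt^2$ and substituting them into Equation 7.1. This can be
done using
$$\frac{d^2 l}{dt^2} = \frac{d}{dt}\left(\frac{dl}{dt}\right) = \frac{d}{d\eta}\left(\frac{dl}{dt}\right)\left(\frac{dt}{d\eta}\right)^{-1} = \frac{d}{d\eta}\left(\frac{dl}{d\eta}\left(\frac{dt}{d\eta}\right)^{-1}\right)\left(\frac{dt}{d\eta}\right)^{-1}.$$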
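For completeness, the differentiation can be carried through explicitly. The lines below are a sketch of that algebra and are not part of the original notes; the shorthand $K \equiv (GM\tau_0^2)^{1/3}$ is introduced here purely for convenience.

```latex
\begin{align*}
\frac{dt}{d\eta} &= \tau_0\,(1 - \cos\eta), \qquad \frac{dl}{d\eta} = K \sin\eta, \\
\frac{dl}{dt} &= \frac{K}{\tau_0}\,\frac{\sin\eta}{1 - \cos\eta}, \\
\frac{d}{d\eta}\left(\frac{dl}{dt}\right)
  &= \frac{K}{\tau_0}\,\frac{\cos\eta\,(1 - \cos\eta) - \sin^2\eta}{(1 - \cos\eta)^2}
   = -\frac{K}{\tau_0}\,\frac{1}{1 - \cos\eta}, \\
\frac{d^2 l}{dt^2}
  &= -\frac{K}{\tau_0^2}\,\frac{1}{(1 - \cos\eta)^2}
   = -\frac{GM}{K^2\,(1 - \cos\eta)^2}
   = -\frac{GM}{l^2}.
\end{align*}
```

The third line uses $\cos\eta\,(1-\cos\eta) - \sin^2\eta = -(1-\cos\eta)$, and the last line uses $K^3 = GM\tau_0^2$ together with $l = K(1 - \cos\eta)$.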
Figure 7.1: The change in the separation $l$ of the Galaxy and M31 with time $t$ since the Big
Bang in the model used for the timing argument.
To determine the total mass $M$, we first consider the dimensionless quantity
$$\frac{t_0}{l_0}\left(\frac{dl}{dt}\right)_{t_0} = \frac{\sin\eta_0\,(\eta_0 - \sin\eta_0)}{(1 - \cos\eta_0)^2}, \qquad (7.3)$$
where the subscripts in $t_0$ and so on refer to the current time, as is conventional in
cosmology. The quantity on the left-hand side can be calculated directly from observational data.
Inserting the observed values of $l_0 = 750$ kpc and $(dl/dt)_{t_0} = -121$ km s$^{-1}$ and a
plausible value of 14 Gyr for $t_0$ (the age of the Universe), we get
$$\frac{\sin\eta_0\,(\eta_0 - \sin\eta_0)}{(1 - \cos\eta_0)^2} = -2.32.$$
This can be solved numerically to give $\eta_0 = 4.28$. Inserting these values into
$t_0 = \tau_0\,(\eta_0 - \sin\eta_0)$ from Equation 7.2, we get $\tau_0 = 2.70$ Gyr. Then
$l_0 = (GM\tau_0^2)^{1/3}(1 - \cos\eta_0)$ gives $(GM\tau_0^2)^{1/3}$, and using the value we found
for $\tau_0$ gives (see footnote 1)
$$M \approx 4.4 \times 10^{12}\,M_\odot. \qquad (7.4)$$
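The numerical steps of the timing argument can be reproduced with a few lines of code. This is a hedged sketch rather than the authors' own calculation: the bisection solver, the function names and the unit-conversion constants are assumptions made for the example, with $G$ taken as $4.50 \times 10^{-15}\,M_\odot^{-1}$ pc$^3$ yr$^{-2}$ (derived from $GM_\odot = 1.327 \times 10^{20}$ m$^3$ s$^{-2}$).

```python
import math

G = 4.4985e-15                 # pc^3 Msun^-1 yr^-2 (from G*Msun = 1.327e20 m^3 s^-2)
KM_S_IN_PC_PER_YR = 1.0227e-6  # 1 km/s expressed in pc/yr


def timing_function(eta):
    """Right-hand side of Equation 7.3."""
    return math.sin(eta) * (eta - math.sin(eta)) / (1.0 - math.cos(eta)) ** 2


def solve_eta(target, lo=math.pi, hi=2.0 * math.pi - 1e-3):
    """Bisection on (pi, 2*pi), the infall phase, where timing_function falls from 0."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if timing_function(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


l0 = 750e3                       # pc
v0 = -121.0 * KM_S_IN_PC_PER_YR  # pc/yr, negative because M31 is approaching
t0 = 14e9                        # yr

eta0 = solve_eta(t0 * v0 / l0)
tau0 = t0 / (eta0 - math.sin(eta0))   # yr, from Equation 7.2
scale = l0 / (1.0 - math.cos(eta0))   # (G M tau0^2)^(1/3) in pc
mass = scale ** 3 / (G * tau0 ** 2)   # solar masses
t_merge = 2.0 * math.pi * tau0 - t0   # yr until l returns to zero (eta = 2*pi)

print(f"eta0 = {eta0:.2f}, tau0 = {tau0 / 1e9:.2f} Gyr")
print(f"M = {mass:.2e} Msun, l = 0 again in {t_merge / 1e9:.1f} Gyr")
```

Bisection is used only because the function is monotonic on the infall branch $\pi < \eta < 2\pi$; any root finder would do. Small differences from the rounded values quoted in the text arise from the unit conversions.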
From its luminosity and rotation curve, M31 appears to have approximately twice the
mass of the Milky Way, i.e. $M_{M31} \simeq 2M_{Galaxy}$. Using $M = M_{M31} + M_{Galaxy}$, this
implies that the mass of the Milky Way exceeds $10^{12}\,M_\odot$. Estimates for the mass of the
luminous part of the Milky Way range from $(0.05 - 0.12) \times 10^{12}\,M_\odot$, which confirms
that the majority of the mass of the Galaxy is unseen (it is dark matter).
It should be noted that this analysis predicts that the Galaxy and M31 will collide,
and consequently merge, at some time, about 3.0 Gyr in the future. However, it fails to
take account of the component of the velocity of M31 tangential to our line of sight. M31
and the Galaxy may have a tangential component, and therefore may have enough angular
momentum that they may not actually come together.
The timing argument can be applied not only to the Andromeda Galaxy, but also to Local
Group dwarf galaxies (which have much less mass and behave just as tracers). Figure 7.2
shows plots of $l$ against $dl/dt$ for some Local Group dwarfs, along with the predictions of
the timing argument for different values of $GM/\tau_0$. This uses
$$l = \left(GM\tau_0^2\right)^{1/3} (1 - \cos\eta) \quad \text{and} \quad \frac{dl}{dt} = \left(\frac{GM}{\tau_0}\right)^{1/3} \frac{\sin\eta}{1 - \cos\eta}. \qquad (7.5)$$
This model assumes that the dwarfs have been moving on radial trajectories since the Big
Bang.
Figure 7.2: Distances and velocities of six Local Group dwarf galaxies, and predictions for
different values of $GM/\tau_0$ (by Alan Whiting).
Footnote 1: It is useful to remember $G$ in convenient astrophysical units as
$4.50 \times 10^{-15}\,M_\odot^{-1}$ pc$^3$ yr$^{-2}$.
7.4 Kinematics in the Solar Neighbourhood
The Milky Way is a differentially rotating system. The local standard of rest (LSR) is a
system located at the Sun and moving with the local circular velocity (which is about
220 km s$^{-1}$). The Sun has its own peculiar motion of about 13 km s$^{-1}$ with respect to the LSR.
The rotation velocity and its derivative at the solar position are traditionally expressed
in terms of Oort's constants:
$$A \equiv \frac{1}{2}\left(\frac{v_\phi}{R} - \frac{dv_\phi}{dR}\right) \text{ at } R = R_0, \qquad B \equiv -\frac{1}{2}\left(\frac{v_\phi}{R} + \frac{dv_\phi}{dR}\right) \text{ at } R = R_0. \qquad (7.6)$$
Observations show that $A = +14.4 \pm 1.2$ km s$^{-1}$ kpc$^{-1}$ and $B = -12.0 \pm 2.8$ km s$^{-1}$ kpc$^{-1}$.
One reason that these parameters are useful is that $A$ vanishes for solid body rotation
(i.e. $A = 0$ when the angular velocity $\Omega(R)$ = constant, because then
$v_\phi/R = dv_\phi/dR = \Omega$). Another useful property is that
the gradient of the rotational velocity is $dv_\phi/dR = -(A + B)$ at $R = R_0$, which means that
$A + B = 0$ if the rotation curve is flat (because $v_\phi$ is the same as $v_{circ}$, and therefore we
have $dv_\phi/dR = dv_{circ}/dR = 0$ for $v_{circ}$ = constant). Therefore calculating $A + B$ is a test of
whether the Galaxy has a flat rotation curve close to the Sun's distance from the Galactic
Centre. Similarly, the angular velocity in the solar neighbourhood is
$\Omega_0 = v_\phi/R\,|_{R=R_0} = A - B$.
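Plugging the observed Oort constants into these relations gives a quick feel for the numbers. The snippet below is a minimal sketch: the function names are illustrative, $R_0 = 8.0$ kpc is the value adopted elsewhere in these notes, and the error estimate assumes (as a simplification) that the uncertainties on $A$ and $B$ are independent.

```python
import math

A, A_ERR = 14.4, 1.2    # km/s/kpc
B, B_ERR = -12.0, 2.8   # km/s/kpc
R0 = 8.0                # kpc, Sun's distance from the Galactic Centre


def angular_velocity(a, b):
    """Omega_0 = A - B, in km/s/kpc."""
    return a - b


def velocity_gradient(a, b):
    """dv_phi/dR at R0, equal to -(A + B), in km/s/kpc."""
    return -(a + b)


omega0 = angular_velocity(A, B)
gradient = velocity_gradient(A, B)
# Assumes independent errors on A and B, so both combinations share this uncertainty.
combined_err = math.hypot(A_ERR, B_ERR)
v_circ = omega0 * R0    # km/s

print(f"Omega_0 = {omega0:.1f} +/- {combined_err:.1f} km/s/kpc")
print(f"dv/dR   = {gradient:.1f} +/- {combined_err:.1f} km/s/kpc")
print(f"v_circ  = {v_circ:.0f} km/s at R0 = {R0} kpc")
```

On these numbers the velocity gradient is smaller than its uncertainty, so the data are consistent with a locally flat rotation curve, and the implied circular velocity is close to the commonly quoted value of about 220 km s$^{-1}$.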
The radial and tangential components of the velocity of stars or gas in circular orbit, $v_r$
and $v_t$, can be written as functions of the galactic longitude $l$ as
$$v_r \simeq A\,d\,\sin(2l), \qquad v_t \simeq A\,d\,\cos(2l) + B\,d \qquad (7.7)$$
locally (within about 1 kpc), where $d$ is the distance from the Sun.
The advantage of the Oort constants A and B is that they describe the motions of stars
around the Sun in the Galaxy, and they can be measured from simple velocity and distance
data. But now that we have accurate proper motions from the Hipparcos satellite mission,
and hence (combining with ground-based line-of-sight velocities) three-dimensional stellar
velocities in the solar neighbourhood, A and B are less important.
If you take the average (three-dimensional) velocity and dispersions of any class of stars
in the solar neighbourhood, then $\langle v_R \rangle$ and $\langle v_z \rangle$ turn out to be nearly
zero, while $\langle v_\phi \rangle$ is such that $\langle v_\phi \rangle - v_{LSR}$ is negative and
$\propto \sigma_{RR}$. This is known as the asymmetric drift and essentially
expresses the degree of rotational support versus pressure support. Young stars are almost
entirely supported by $\langle v_\phi \rangle$, like the gas that produced them. The asymmetric drift for
young stars is therefore nearly zero, because $\langle v_\phi \rangle \simeq v_{LSR}$. Older stars pick up
increasing amounts of pressure support in the form of $\sigma_{RR}$; they then need less
$\langle v_\phi \rangle$ to support them, and thus tend to lag behind the LSR. The linear relation can be
derived from the Jeans equations, but we won't go through that because you've probably had
enough of Jeans equations for now...
When examined in detail using proper motions from the Hipparcos astrometry satellite,
the velocity structure in the solar neighbourhood is more complicated than anyone expected.
Figure 7.3 shows a reconstruction of the stellar (u, v) (i.e., radial and tangential velocity)
distribution in the solar neighbourhood for stars in different ranges of the main sequence
(see footnote 2). Notice the clumps in the velocity distribution which appear for stars of all
ages. (And these are clumps only in velocity space, not in real space.) The idea that there are
groups of stars at similar velocities is itself not new; it actually dates from the early proper
motion measurements of nearly a century ago. But these streams have generally been interpreted
as groups of stars which formed in the same complex and were later stretched in real space
over several galactic orbits. The surprising new finding is that the streams are seen for stars
of all ages, which indicates a dynamical origin; they seem to be wanting to tell us something
interesting about Milky Way dynamics, but as yet we don't know what.
7.5 Dynamics of the Galactic Disc
The orbits of stars in galaxies are, in general, not closed paths, as we saw in Chapter 2. This
is equally true of stars in the disc of our Galaxy. The tangential ($v_\phi$) component dominates
for disc stars ($v_\phi$ is much larger than the $v_R$ and $v_z$ components). We can therefore break
the motion of disc stars into two parts: a uniform motion about the Galactic Centre, plus
the motion relative to this uniform rotation. This second part is called an epicycle. The
epicycle is very nearly an ellipse, but the period of the motion around the epicycle is not
the same as the period of the uniform component of the motion. (The term epicycle was
used historically for the complicated system of cycles that was used to fit the motion of the
planets before Kepler explained the elliptical orbits about the Sun.)
7.6 The Disc (or Thin Disc)
The disc of the Galaxy contains mostly stars, with some gas. The stars are distributed with
an exponential density profile in both the R and z directions. The density $\rho(R, z)$ is therefore
described by
$$\rho(R, z) = \rho_0\, e^{-R/h_R}\, e^{-|z|/h_z},$$
where $h_R$ and $h_z$ are the scale lengths in the R and z directions, and $\rho_0$ is a constant.
Observations show that $h_R \simeq 3.5$ kpc. The vertical scale height $h_z$ is different for
different age stars (young stars have smaller scale heights), but $h_z = 250$ pc is a typical value.
Footnote 2: The Schwarzschild ellipsoid and its vertex deviation that you may find in textbooks
should now be considered obsolete; they are essentially the result of washing out the structure
in Figure 7.3.
Figure 7.3: Distribution of radial (u) and tangential (v) velocities of main sequence stars in
the solar neighbourhood, recently reconstructed from Hipparcos proper motions by Walter
Dehnen (1998). The upper left panel is for the youngest (and bluest) stars; these are
estimated to be < 0.4 Gyr old. The upper right panel is for stars younger than 2 Gyr, and the
lower left panel is for stars younger than 8 Gyr. The lower right panel shows the combined
distribution for all main sequence stars. The Sun is at (0, 0) and the LSR is marked by a
triangle.
Figure 7.4: Epicyclic orbits. The first diagram shows the orbit of a star in an elliptical orbit
in the disc of a spiral galaxy as viewed in an inertial frame. It moves in a rosette pattern.
The middle diagram shows the same orbit as viewed in a rotating frame, with the frame
rotating with uniform motion and with a period equal to the orbital period of the star. The
third diagram shows the orbit of the star in the centre diagram in greater detail, showing the
epicyclic motion.
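The double-exponential disc profile of this section is easy to explore numerically. The sketch below is illustrative only: it uses the scale lengths quoted in the text ($h_R = 3.5$ kpc, $h_z = 250$ pc), normalises the density to its value in the plane at the solar radius (taken as 8.0 kpc), and the function names are assumptions made for the example.

```python
import math

H_R = 3.5    # radial scale length in kpc
H_Z = 0.25   # vertical scale height in kpc (250 pc)


def relative_density(R, z, R_ref=8.0):
    """rho(R, z) / rho(R_ref, 0) for the double-exponential disc; the constant rho_0 cancels."""
    return math.exp(-(R - R_ref) / H_R) * math.exp(-abs(z) / H_Z)


def fraction_within(z_max):
    """Fraction of a vertical column at |z| < z_max, from integrating exp(-|z|/h_z)."""
    return 1.0 - math.exp(-z_max / H_Z)


print(f"density at z = 250 pc relative to the plane: {relative_density(8.0, 0.25):.3f}")
print(f"density at R = 11.5 kpc relative to R = 8 kpc: {relative_density(11.5, 0.0):.3f}")
print(f"fraction of column within one scale height: {fraction_within(H_Z):.3f}")
```

Each step of one scale length or scale height reduces the density by a factor of $e$, and about 63% of the stars in a vertical column lie within one scale height of the plane.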
The disc is rotationally supported with a circular velocity $v_{circ} \simeq 220$ km s$^{-1}$ at the
Sun's distance from the centre. There is a small velocity dispersion around this of about
15 km s$^{-1}$ for young stars and about 40 km s$^{-1}$ for old ones. Young stars form in the gas and
naturally have a small velocity dispersion about the mean rotation. The greater velocity
dispersion of older stars has probably been caused by perturbations of the stars during
encounters with giant molecular clouds. We learnt in Chapter 2 that stars are collisionless in
galaxies. However, encounters between stars and giant molecular clouds can perturb stellar
velocities in galactic discs to a limited degree over the lifetime of a spiral galaxy.
Heavy element abundances in disc stars are close to the solar values. Typical metallicities
are [Fe/H] = $-0.4$ to $+0.2$.
The gas and its associated dust are concentrated close to the Galactic plane. The gas
moves in circular orbits. The HI gas layer flares and warps at large radius.
The main disc is often called the thin disc to distinguish it from the thick disc, described
below.
7.7 The Thick Disc
The term "thick disc" is usually given to a distribution of stars that is more extended in the
vertical direction (perpendicular to the plane) than the main Galactic disc (the thin disc).
The term is associated with stars, not gas. It consists of moderately metal-poor, older stars,
with [Fe/H] close to $-0.6$. The system is rotationally supported, but with a mean rotation
slightly smaller than for the thin disc, a consequence of the stars showing a velocity dispersion
that is larger than for those of the thin disc. The asymmetric drift is 30-50 km s$^{-1}$. Only
about 2% of the stars in the solar neighbourhood belong to the thick disc. The density
distribution, like that of the thin disc, is a double exponential function, with probably a
comparable radial scale length to the thin disc, but the vertical scale height is
$h_z \simeq 1.3$ kpc. There has been some controversy over whether it is a distinct component of the
Galaxy in its own right, or is made merely out of a small number of disc stars with extreme
metallicities and kinematics.
7.8 The Bulge
The bulge is a spheroidally distributed, but flattened, system in the central regions of the
Galaxy, confined to the inner 2 kpc or so. Its stars are old. They show a very wide range in
metallicity, ranging from [Fe/H] = $-1$ to $+1$. It is largely pressure supported.
7.9 The Bar
There is little doubt now that the distribution of stars in the region of the Milky Way bulge
is triaxial there is a (rotating) bar with the positive l side nearer to us and moving away.
The evidence for this was at rst indirect, and took the following form. Consider gas in the
ring, which must move on closed orbits. If it moved on circular orbits in the disc, and we
measured its Galactic longitude l and line of sight velocity v, then all the gas at positive l
(i.e. on one side of the Galactic Centre) would have one sign for v and similarly all the gas at
negative l (on the other side) would have the opposite sign for v. In fact gas at positive l is
seen with both signs for v, and likewise at negative l. So the gas orbits must be non-circular,
and hence the gravitational potential must be non-circular in the disc. This suggests a bar
and indeed the observed gas kinematics is well fitted by a bar.
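The circular-orbit argument above can be illustrated numerically. The following is a minimal sketch, assuming a flat rotation curve with a circular speed of 220 km s$^{-1}$ and a solar Galactocentric radius of 8.5 kpc (both values, and the function name, are assumptions made here for illustration only). For gas on a circular orbit of radius R the line-of-sight velocity is v = (Omega(R) - Omega_0) R_0 sin l, so for gas inside the solar circle the sign of v simply follows the sign of sin l.

```python
import math

R0_KPC = 8.5    # assumed solar Galactocentric radius (kpc)
V0_KMS = 220.0  # assumed flat circular speed (km/s)


def v_los_circular(l_deg, R_kpc):
    """Line-of-sight velocity (km/s) of gas on a circular orbit of radius
    R_kpc seen at Galactic longitude l_deg, for a flat rotation curve.
    Only meaningful when R_kpc >= R0_KPC * |sin l| (the sight line must
    actually reach that radius)."""
    omega = V0_KMS / R_kpc    # angular speed of the gas
    omega0 = V0_KMS / R0_KPC  # angular speed of the Sun
    return (omega - omega0) * R0_KPC * math.sin(math.radians(l_deg))


# Inner-Galaxy gas: every radius gives the same sign of v on a given side.
for l in (-10.0, -5.0, 5.0, 10.0):
    velocities = [round(v_los_circular(l, R), 1) for R in (2.0, 3.0, 4.0)]
    print(f"l = {l:+5.1f} deg  v_los = {velocities} km/s")
```

With purely circular orbits no choice of inner radius produces both signs of v at one longitude, which is why the observed mixture of signs points to non-circular orbits.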
Figure 7.5: Schematic of the bar in the Milky Way Bulge, viewed from the North Galactic
pole (left), and from the Sun (right). (From Blitz and Spergel, ApJ, 1991. The right panel
uses minus the usual convention for l.)
The features of a bar can in fact be seen in an infrared map of the bulge, if you know
what to look for. Figure 7.5 shows a bar in the plane, and its effect on an (l, b) map.
1. The side nearer to us is brighter. Contours of constant surface brightness are further
apart in both l and b on the nearer side.
2. Very near the centre, the further side appears brighter, so the brightest spot is slightly
to the further side of l = 0. The reason is that on the further side our line of sight
passes through a greater depth of bar material, which more than compensates for it
being slightly further.
Feature 1 can be discerned in many different data sets; feature 2 is harder to
find, and only just shows up in the COBE maps of the bulge.
7.10 The Galactic Centre
Observing the centre of the Galaxy is extremely difficult in the optical because the extinction
caused by dust in the Galactic plane is approximately 30 magnitudes in the V band. Things
are not so extreme in the infrared, and in the K band (2.2 $\mu$m) the extinction is a more
moderate 3 to 4 mag. However, the available data show that there is a very dense star cluster
at the Galactic centre, with a compact radio source, Sagittarius A$^*$ (abbreviated Sgr A$^*$), at
its centre. There is a ring or disc of gas around the centre, about 5 pc across, detected by
its molecular emission.
Orbital velocities immediately around Sgr A$^*$ are very high. The available evidence
suggests that there is a compact massive object at the centre of Sgr A$^*$. This is probably a
central black hole with a mass of $(1$ to $3) \times 10^6\,M_\odot$.
7.11 The Stellar Halo
The stellar halo is the spheroidally distributed, slightly flattened, system that extends far
from the disc. It includes diffuse field stars and globular clusters. The stars are very old
(13 Gyr) and are very metal-poor, having [Fe/H] $\simeq -1$ to $-2.5$. It makes only a very small
contribution to the total mass of the Galaxy. It is a pressure-supported system. The velocity
dispersion is $\sigma = 200$ km s$^{-1}$. The asymmetric drift is 190 km s$^{-1}$. These two figures mean
that the kinematics of halo stars are very different from those of the disc. Only about 1/1000
of the stars in the solar neighbourhood belong to the halo. Examples of halo stars in the solar
neighbourhood, of value for example in studying chemical abundances in very metal-poor
stars, have often been found from their high proper motions: relatively nearby halo stars
usually have large motions across the sky compared with typical (disc) stars.
7.12 Globular Clusters
The Galaxy contains about 150 globular clusters. They are compact systems of $\sim 10^5$
stars. Many of these are very metal-poor, are distributed spheroidally and have randomly
orientated orbits: they appear to be associated with the stellar halo.
However, some globular clusters are only moderately metal-poor. Those globular clusters
having [Fe/H] $> -0.8$ form a more flattened system. They may be associated with the thick
disc.
7.13 The Dark Matter Halo
The dark matter halo appears to extend out to large radii, to 100 kpc, as is shown by
studies of the dynamics of companion dwarf galaxies, for example. It dominates the mass of
the Galaxy. It appears to be spheroidally distributed: it does not appear to be concentrated
in the Galactic disc, as we saw in Section 2.20.
The nature of dark matter is unknown, but it appears not to be in the form of stellar
mass compact objects, such as white dwarfs or brown dwarfs, as microlensing surveys have
shown (at least only a small proportion of the dark matter can be in stellar mass compact
objects). Neither is it likely to be in the form of dark compact objects having masses
$\sim 1000\,M_\odot$, such as massive black holes. Objects of this type would perturb the dynamics
of disc stars, thickening the disc, which is not observed on a significant scale.
Constraints from primordial nucleosynthesis imply that baryonic matter only contributes
4% of the closure density of the Universe. Cosmological results indicate that matter contributes
27% of the closure density. Therefore we expect that most of the dark matter in
the Universe is not in the form of baryonic matter, if the cosmological models are correct.
This is consistent with the results of Galactic microlensing surveys.
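The arithmetic behind this statement is simple enough to write out. The short sketch below only uses the two percentages quoted above; the variable names are ours.

```python
# Fractions of the closure density quoted in the text.
omega_baryon = 0.04  # baryonic matter, from primordial nucleosynthesis
omega_matter = 0.27  # all matter, from cosmological results

baryon_fraction = omega_baryon / omega_matter
print(f"baryonic fraction of all matter: {baryon_fraction:.2f}")
print(f"non-baryonic fraction:           {1.0 - baryon_fraction:.2f}")
```

On these figures roughly 85% of the matter would be non-baryonic, subject to the caveat in the text that the cosmological models are correct.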
The dark matter must be composed of individual particles, be they subatomic particles
or astronomical bodies. To support a spheroidal distribution within the Galaxy's potential,
these particles must be moving on mostly randomly orientated trajectories with a velocity
dispersion of 200 to 400 km s$^{-1}$. The dark matter particles do not dissipate significant energy
in interactions with each other (or with the luminous matter); otherwise they would settle down
to a rotating disc or a single mass at the centre of the Galaxy's potential, which they have not
done.
7.14 The Local Group
The Galaxy lies in a system of more than 40 galaxies called the Local Group, about 1 Mpc
across. There are two large spiral galaxies, the Galaxy and the Great Andromeda Galaxy
(M31), and one spiral (M33) of slightly lower mass. There are a few irregular galaxies,
most notably the Large Magellanic Cloud and the Small Magellanic Cloud, but these are
not particularly massive. All the other members are dwarf galaxies, either dwarf irregulars,
dwarf ellipticals or dwarf spheroidals.
A large majority of the galaxies are companions of either the Galaxy or M31. For
example, the Magellanic Clouds are situated 50 to 60 kpc from the Galaxy. Several of the
dwarf spheroidal galaxies lie within 200 kpc.
7.15 The Formation of the Galaxy
A fundamental question relating to the Galaxy is how it was formed. Some stars in the
Galaxy, such as those in the thick disc and bulge, and particularly in the halo, are old.
The oldest stars were formed relatively early in the history of the Universe, indicating that
the Galaxy had a relatively early origin. Star formation has continued in the disc, at least,
throughout its history.
There are two main scenarios for the formation of the Galaxy:
• the monolithic collapse model, and
• the merging of subunits.
The monolithic collapse model was developed in the 1960s, particularly by Eggen, Lynden-
Bell and Sandage. In this picture, the Galaxy formed by the collapse of a protogalactic cloud
of gas that had some net angular momentum. The gas initially had a very low metallicity.
The collapse occurred mostly in the radial direction and some modest star formation occurred
during this time. This produced very metal-poor stars with randomly-oriented elongated or-
bits, which today are observed as the stellar halo. The gas settled into a broad rotating disc,
which was moderately metal-poor by this time as a result of the enrichment of the gas by
the heavy elements created by halo stars. The rotation was the result of the net angular
momentum of the protogalactic cloud. Star formation in this cooling, settling disc produced
thick disc stars which are rotationally supported but have an appreciable velocity dispersion.
The gas continued to settle into a thinner, stable, rotating disc. Residual gas that fell to the
central regions formed the bulge stars. Star formation continued in the gas disc at a gradual
rate, building up the stars of the thin disc.
This model predicted the main features of the Galaxy, and did so very neatly. It explained
the fast rotation of disc stars with their near-solar metallicities, and the randomly orientated
orbits of halo stars with their very low metallicities and great ages.
The merging scenario instead maintained that the Galaxy was built up by the merging
and accretion of subunits. It was first developed by Searle and Zinn in 1977 for the
stellar halo. The merging model is strongly supported by detailed computer simulations of
galaxy formation. These simulations predict that the primordial material from the Big Bang
clumped into large numbers of dark matter haloes that also contained gas. These small
haloes then merged through their mutual gravitational attraction, building up larger haloes
in the process. The gas formed some stars in these dark matter clumps. A large number of
these subunits produced our Galaxy, with the gas settling into a rotating disc as a natural
consequence of the dissipative, collisional nature of the gas. Star formation in the disc then
formed the disc stars. The stars of the Galactic stellar halo may have come from the accre-
tion of subunits that had already formed some stars. This process of building galaxies by
the merging of clumps to form successively larger and larger units is known as hierarchical
galaxy formation.
Mergers certainly played an important role in the formation and subsequent evolution of
the Galaxy. Detailed computer modelling of galaxy formation provides powerful evidence in
favour of this picture. Indeed the merging process may well be occurring today.
7.16 The Sagittarius Dwarf
We shall end our discussion of our Galaxy with the Sagittarius Dwarf. Although it has been
in the past an independent galaxy, it is today plunging into our Galaxy and appears to be
in the process of being pulled apart by the gravitational influence of our Galaxy.
Figure 7.6: A partial map of the Sagittarius dwarf galaxy, from RR Lyrae variables. We
are at (0, 0), the ellipses around (8.5, 0) represent the bulge, and the four circles indicate the
four microlensing survey fields where the RR Lyraes were found. The horizontal axis is X (kpc).
(From Minniti et al. 1997.)
It may seem amazing that this fairly substantial companion galaxy of the Milky Way
remained undiscovered till 1993; the reason is that it is located behind the bulge, and thus
has the densest part of the Milky Way in the foreground as camouflage. We do not know in
detail how large the Sagittarius Dwarf is, because its stars are difficult to distinguish from
the foreground stars. A lower limit on its size comes indirectly from microlensing surveys,
because they detect RR Lyrae variables in their fields. Figure 7.6 shows its rough extent.
The Sagittarius Dwarf is a highly elongated body. It includes some globular clusters,
including M54, and is associated with a very faint star stream. It contains mostly moderately
old or very old stars. It is almost certainly being tidally stretched as it passes through the Milky
Way halo: that would explain its long, thin structure. It probably will be totally disrupted
over the next $10^8$ to $10^9$ years, with its stars being lost into the Galaxy's stellar halo.
The Sagittarius Dwarf provides evidence that the Galaxy does accrete small companion
galaxies. It may well have consumed many such galaxies in the past. Indeed, a recent study
of infrared observations has found debris from a dwarf galaxy, which has been called the
Canis Major Dwarf, situated within the Galaxy about 13 kpc from the Galactic centre. This
provides evidence that merging is a significant process in galaxy evolution, and possibly in
galaxy formation.
Appendix A
Revision of Astronomical Quantities
A1. Astronomical Units
At a research and academic level in astronomy, large distances are expressed in parsecs (pc),
while at a popular level they tend to be given in light years (ly).
\[ 1\ \mathrm{pc} = 3.0857 \times 10^{16}\ \mathrm{m} = 3.2616\ \mathrm{ly} \]
Distances on the scale of galaxies are often expressed in kiloparsecs (kpc), with 1 kpc
$\equiv$ 1000 pc. Distances between galaxies and on the cosmological scale are usually expressed in
megaparsecs (Mpc), with 1 Mpc $\equiv 10^{6}$ pc.
Distances on the scale of the Solar System are measured in terms of the semi-major axis of
the Earth's orbit, the astronomical unit (AU), with 1 AU $= 1.4960 \times 10^{11}$ m.
Masses are often measured in terms of the mass of the Sun, the solar mass $M_\odot$, with
\[ 1\ M_\odot = 1.989 \times 10^{30}\ \mathrm{kg} \]
Luminosities, defined as the total power output of radiation (in the form of visible light,
infrared, ultraviolet etc.), are often expressed in terms of the luminosity of the Sun, the solar
luminosity $L_\odot$, with
\[ 1\ L_\odot = 3.826 \times 10^{26}\ \mathrm{W} \]
Angular separations on the sky are measured in degrees (deg or $^\circ$), minutes of arc (arcmin
or $'$) and seconds of arc (arcsec or $''$). The abbreviations arcmin and arcsec are used in
preference to min and sec to distinguish them from the minutes and seconds of time that are
used when expressing coordinates of right ascension on the sky.
Wavelengths of light are sometimes expressed in Ångström units (Å), with 1 Å $\equiv 10^{-10}$ m,
in preference to the nanometre (nm, with 1 nm $\equiv 10^{-9}$ m $\equiv$ 10 Å). Wavelengths of infrared
radiation are often expressed in micrometres ($\mu$m), with 1 $\mu$m $\equiv 10^{-6}$ m. The micrometre
is often called the micron.
Time is often expressed in years, with 1 yr $= 3.1557 \times 10^{7}$ s.
Long time spans are often measured in gigayears, with 1 Gyr $\equiv 10^{9}$ yr $= 3.1557 \times 10^{16}$ s.
In all other instances, S.I. units should be used. Unfortunately, some older systems, such as
cgs units, still persist in research articles and textbooks.
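As a quick consistency check on the conversion factors above, the following sketch recomputes the parsec in light years and in astronomical units. It uses the values quoted in this section; the speed of light is not quoted in the text and is supplied here as an extra assumption, and the variable names are ours.

```python
# Conversion factors quoted in Section A1 (SI units).
PC_M = 3.0857e16   # one parsec in metres
AU_M = 1.4960e11   # one astronomical unit in metres
YR_S = 3.1557e7    # one year in seconds

# Assumption: speed of light in m/s (not given in the text).
C_MS = 2.99792458e8

LY_M = C_MS * YR_S       # one light year in metres
pc_in_ly = PC_M / LY_M   # should reproduce about 3.2616
pc_in_au = PC_M / AU_M   # about 206 265, the number of arcsec in a radian
gyr_in_s = 1.0e9 * YR_S  # one gigayear in seconds

print(f"1 pc  = {pc_in_ly:.4f} ly")
print(f"1 pc  = {pc_in_au:.0f} AU")
print(f"1 Gyr = {gyr_in_s:.4e} s")
```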
A2. Astronomical Magnitudes
The brightnesses of astronomical objects in the optical, near infrared and near ultraviolet
regions of the spectrum are expressed on a logarithmic scale called magnitudes. A magnitude
is the brightness integrated over some range of wavelength, and consequently any particular
magnitude applies to a certain region of the spectrum. This region of the spectrum is
conventionally selected by passing the light through a coloured filter, and that region of the
spectrum is called the filter's waveband, passband or photometric band.
Commonly used wavebands are the U band in the near ultraviolet (around 360 nm wave-
length), the B band in the blue (around 440 nm), the V band in the green/yellow (around
550 nm), the R band in the red (around 640 nm) and the I band in the near infrared (around
790 nm). It is always necessary to specify which waveband is being used when magnitudes
are quoted (and precisely which definition of the passband is being used).
The apparent magnitude $m_F$ of an object in some waveband $F$ is related to the flux of
radiation $F_F$ in that band at the top of the Earth's atmosphere by
\[ m_F = C_F - 2.5 \log_{10}(F_F) , \]
where $C_F$ is a calibration constant for that band. The constant $-2.5$ has been chosen to
maintain consistency with historical definitions of magnitudes. A fundamental consequence
of this definition is that brighter objects have smaller magnitudes. For example, a magnitude
16.3 star is brighter than a magnitude 19.7 star.
Therefore two objects that have fluxes $F_1$ and $F_2$ in some band will have apparent
magnitudes $m_1$ and $m_2$ in that band that are related by
\[ m_1 - m_2 = -2.5 \log_{10}\left(\frac{F_1}{F_2}\right) \]
and equivalently,
\[ \frac{F_1}{F_2} = 10^{-\frac{2}{5}(m_1 - m_2)} . \]
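As an illustration of the flux-ratio relation, the sketch below (the function name is ours) evaluates it for the two stars mentioned above, with magnitudes 16.3 and 19.7.

```python
def flux_ratio(m1, m2):
    """Flux ratio F1/F2 for two objects with apparent magnitudes m1 and m2
    measured in the same band: F1/F2 = 10**(-0.4 * (m1 - m2))."""
    return 10.0 ** (-0.4 * (m1 - m2))


ratio = flux_ratio(16.3, 19.7)
print(f"F(16.3 mag) / F(19.7 mag) = {ratio:.1f}")

# A difference of exactly 5 magnitudes corresponds to a factor of 100 in flux.
print(f"5 mag difference: factor {flux_ratio(0.0, 5.0):.0f}")
```

The magnitude 16.3 star is therefore about 23 times brighter in flux than the magnitude 19.7 star.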
The absolute magnitude is the magnitude that an object would have if it were observed at
a distance of precisely 10 pc. The absolute magnitude therefore measures the luminosity, or
total power output, in the photometric band. Absolute magnitudes are denoted by a capital
$M$ with a subscript indicating the photometric band, such as $M_V$ for the V-band absolute
magnitude.
The apparent magnitude $m_F$ and the absolute magnitude $M_F$ of some object through
some filter $F$ are related by
\[ m_F - M_F = 5 \log_{10}(D/\mathrm{pc}) - 5 + A_F , \]
where $D$ is the distance (here expressed in parsecs) and $A_F$ is the loss of light due to extinction
by intervening material (usually interstellar dust). This equation has to be modified for
distant galaxies, for which cosmological effects are important, by using
\[ m_F - M_F = 5 \log_{10}(D_L/\mathrm{pc}) - 5 + A_F + k_F , \]
where $D_L$ is the luminosity distance (again expressed here in parsecs), and $k_F$ is known as
the k-correction (it expresses the effect of redshift on the passband).
Apparent magnitudes are often denoted simply by the name of the waveband, rather
than by a letter $m$ followed by a subscript indicating the band. For example, $V$ as well
as $m_V$ denotes the V-band apparent magnitude, and $B$ as well as $m_B$ denotes the B-band
apparent magnitude.
The difference between magnitudes in different wavebands is known as a colour index.
The colour index is an excellent measure of the colour of an object. For example, the $(B-V)$
colour index measures the relative brightness of an object in the blue and yellow parts of the
spectrum.
The calibration constants $C_F$ for different photometric bands are usually defined so that a
star of spectral type A0 V (a relatively hot main sequence star) has zero colour indices. So the
bright star Vega, which happens to be of type A0 V, has $(B-V) = 0.00$ and $(V-R) = 0.00$.
As an example of the use of magnitudes, if a star is observed to have a B-band apparent
magnitude of $B = 17.85$ mag and a V-band apparent magnitude of $V = 17.05$ mag, its
$(B-V)$ colour index will be $(B-V) = 17.85 - 17.05 = 0.80$ mag. If it lies at a distance
of $D = 2000$ pc and there is negligible interstellar extinction between us and the star (i.e.
$A_B = A_V = 0.00$), the absolute magnitudes will be
$M_B = B - 5\log_{10}(D/\mathrm{pc}) + 5 - A_B = +6.34$ mag and
$M_V = V - 5\log_{10}(D/\mathrm{pc}) + 5 - A_V = +5.54$ mag.
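The worked example above can be reproduced with a few lines of code. This is a minimal sketch of the distance-modulus relation without the k-correction; the function and variable names are ours.

```python
import math


def absolute_magnitude(m, distance_pc, extinction=0.0):
    """Absolute magnitude from M = m - 5 log10(D/pc) + 5 - A
    (no k-correction, so suitable only for nearby objects)."""
    return m - 5.0 * math.log10(distance_pc) + 5.0 - extinction


B, V = 17.85, 17.05  # apparent magnitudes from the example
D_PC = 2000.0        # distance in parsecs, negligible extinction

colour_index = B - V
M_B = absolute_magnitude(B, D_PC)
M_V = absolute_magnitude(V, D_PC)

print(f"(B-V) = {colour_index:.2f} mag")
print(f"M_B   = {M_B:+.2f} mag")
print(f"M_V   = {M_V:+.2f} mag")
```

Because the extinction is the same (zero) in both bands here, the colour index of the absolute magnitudes, M_B - M_V, equals the observed (B-V).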
Appendix B
Revision of Newtonian Gravitation
B1. Summary
This appendix summarises some basic results relating to gravitation from Newtonian Me-
chanics. This information covers most of the basic principles from physics about gravitation
that are needed for the course.
B2. The Gravitational Field from a Point Mass
The attractive force between two particles of mass $m_1$ and $m_2$ a distance $r$ apart is
\[ F_{\rm grav} = \frac{G m_1 m_2}{r^2} , \]
where $G$ is the universal constant of gravitation, with $G = 6.673 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$.
Using Newton's Second Law, the acceleration due to gravity at a distance $r$ from a point
mass $m$ is
\[ g = \frac{G m}{r^2} , \]
directed towards the point mass.
The acceleration due to gravity is the gravitational field strength.
The gravitational potential a distance $r$ from a point mass $m$ is
\[ \Phi = -\frac{G m}{r} . \]
B3. General Results about Gravitational Fields
The acceleration due to gravity $\mathbf{g}$ at any point is related to the gradient of the gravitational
potential $\Phi$ by
\[ \mathbf{g} = -\nabla\Phi , \]
in any gravitational field.
The potential is negative at all times, tending to zero at infinite distance.
The potential energy of a particle of mass $m$ at a point in a gravitational field is
\[ V = m\,\Phi , \]
where $\Phi$ is the potential at the point. This definition means that the potential energy is
negative.
Gauss's Law relates the integral of the gravitational acceleration over a closed surface to the
mass lying inside that surface. If $\mathbf{g}$ is the acceleration due to gravity and $d\mathbf{S}$ is an element
of the surface $S$, then
\[ \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G M_S \]
for any closed surface $S$, where $M_S$ is the total mass contained within the surface. This is
the direct equivalent of Gauss's Law for electrostatics ($\oint_S \mathbf{E}\cdot d\mathbf{S} = Q_S/\epsilon_0$).
Substituting for $\mathbf{g} = -\nabla\Phi$, we also have
\[ \oint_S \nabla\Phi\cdot d\mathbf{S} = +4\pi G M_S . \]
The Poisson Equation relates the Laplacian of the potential at a point to the mass density.
It states that
\[ \nabla^2\Phi = 4\pi G\rho , \]
where $\Phi$ is the potential at the point and $\rho$ is the density.
B4. The Gravitational Field within a Spherically Symmetric
Distribution of Mass
The acceleration due to gravity at a distance $r$ from the centre of a spherically symmetric
distribution of mass is
\[ g = \frac{G M(r)}{r^2} , \]
where $M(r)$ is the mass interior to a radius $r$, and is directed towards the centre of the
distribution. This result does not depend on how the mass is distributed, other than that it is
spherically symmetric. Mass outside the radius $r$ does not affect the gravitational field at
$r$ in this spherically symmetric case. This is the same acceleration as would be given by a
point mass $M(r)$ at the centre of the distribution.
This result can be derived very easily using Gauss's Law.
Consider a spherical surface $S$ of radius $r$ centred on the distribution.
The acceleration due to gravity at a point on the surface is $\mathbf{g}$. The magnitude of the
acceleration everywhere on the surface is $|\mathbf{g}| \equiv g$, from symmetry.
From Gauss's Law,
\[ \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G M(r) , \]
where $M(r)$ is the mass inside the surface and $G$ is the universal constant of gravitation.
But an element of the surface area $d\mathbf{S}$ is anti-parallel to the acceleration due to gravity $\mathbf{g}$,
so $\mathbf{g}\cdot d\mathbf{S} = -|\mathbf{g}|\,|d\mathbf{S}| = -g\,dS$. But since $g$ is constant over the spherical surface,
\[ -g \oint_S dS = -4\pi G M(r) \]
\[ \therefore\quad g\,(4\pi r^2) = 4\pi G M(r) , \]
which gives,
\[ g = \frac{G M(r)}{r^2} . \]
This analysis is possible because of the spherical symmetry.
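The result can be illustrated with a simple numerical sketch. A uniform-density sphere is used here purely as an example of a spherically symmetric distribution (the choice of distribution, the numerical values and the function names are all assumptions made for illustration): inside the sphere $M(r) \propto r^3$, so $g \propto r$, while outside it the field is that of a point mass.

```python
G = 6.673e-11  # m^3 kg^-1 s^-2, value quoted in Section B2


def enclosed_mass_uniform(r, total_mass, radius):
    """M(r) for a uniform-density sphere (an illustrative choice)."""
    if r >= radius:
        return total_mass
    return total_mass * (r / radius) ** 3


def g_magnitude(r, total_mass, radius):
    """|g| = G M(r) / r^2, directed towards the centre."""
    return G * enclosed_mass_uniform(r, total_mass, radius) / r**2


M, R = 1.0e30, 1.0e9  # hypothetical total mass (kg) and radius (m)
for r in (0.5 * R, R, 2.0 * R):
    print(f"r = {r:.1e} m   g = {g_magnitude(r, M, R):.3e} m s^-2")
```

Halving the radius inside the sphere halves g, while doubling the radius outside it reduces g by a factor of four; mass outside r never enters the calculation.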
B5. Gradient Operators
The results presented above use various operators from vector calculus. It may be useful to
quote mathematical expressions for these explicitly for some important coordinate systems.
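In a Cartesian coordinate system $(x, y, z)$ with unit vectors $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$, the gradient of any
function $A(x, y, z)$ is
\[ \nabla A \equiv \hat{\imath}\,\frac{\partial A}{\partial x} + \hat{\jmath}\,\frac{\partial A}{\partial y} + \hat{k}\,\frac{\partial A}{\partial z} . \]
The Laplacian operator in the Cartesian system is
\[ \nabla^2 A \equiv \nabla\cdot(\nabla A) \equiv \frac{\partial^2 A}{\partial x^2} + \frac{\partial^2 A}{\partial y^2} + \frac{\partial^2 A}{\partial z^2} . \]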
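In a spherical polar coordinate system $(r, \theta, \phi)$ with unit vectors $\hat{e}_r$, $\hat{e}_\theta$ and $\hat{e}_\phi$, these operators
are
\[ \nabla A \equiv \hat{e}_r\,\frac{\partial A}{\partial r} + \hat{e}_\theta\,\frac{1}{r}\frac{\partial A}{\partial\theta} + \hat{e}_\phi\,\frac{1}{r\sin\theta}\frac{\partial A}{\partial\phi} \]
\[ \nabla^2 A \equiv \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial A}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial A}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 A}{\partial\phi^2} , \]
for any scalar function $A(r, \theta, \phi)$.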
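In a cylindrical coordinate system $(R, \phi, z)$ with unit vectors $\hat{e}_R$, $\hat{e}_\phi$ and $\hat{e}_z$, these operators
are
\[ \nabla A \equiv \hat{e}_R\,\frac{\partial A}{\partial R} + \hat{e}_\phi\,\frac{1}{R}\frac{\partial A}{\partial\phi} + \hat{e}_z\,\frac{\partial A}{\partial z} \]
\[ \nabla^2 A \equiv \frac{1}{R}\frac{\partial}{\partial R}\left(R\,\frac{\partial A}{\partial R}\right) + \frac{1}{R^2}\frac{\partial^2 A}{\partial\phi^2} + \frac{\partial^2 A}{\partial z^2} , \]
for any scalar function $A(R, \phi, z)$.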
B6. Distributions of point masses
In this section we shall consider the gravitational effects of a series of point masses $m_i$ which
are located at positions $\mathbf{r}_i$, for $i = 1, \ldots, N$.
The gravitational potential at some position $\mathbf{r}$ caused by the distribution is
\[ \Phi(\mathbf{r}) = -G \sum_{i=1}^{N} \frac{m_i}{|\mathbf{r} - \mathbf{r}_i|} , \]
where $G$ is the constant of gravitation.
The acceleration due to gravity at the point $\mathbf{r}$ is
\[ \mathbf{g} = -G \sum_{i=1}^{N} \frac{m_i}{|\mathbf{r} - \mathbf{r}_i|^3}\,(\mathbf{r} - \mathbf{r}_i) . \]
The internal gravitational potential energy of the distribution of point masses is
\[ V = -\frac{1}{2}\,G \sum_{\substack{i,j \\ i \neq j}} \frac{m_i m_j}{|\mathbf{r}_i - \mathbf{r}_j|} . \]
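These three sums translate directly into code. The following is a minimal direct-summation sketch in plain Python (the function names and the two-body test values are ours, chosen for illustration). The potential energy is summed over pairs with i < j, which is equivalent to half of the sum over all i and j with i not equal to j.

```python
import math

G = 6.673e-11  # m^3 kg^-1 s^-2, value quoted in Section B2


def potential(r, masses, positions):
    """Phi(r) = -G * sum_i m_i / |r - r_i|."""
    return -G * sum(m / math.dist(r, p) for m, p in zip(masses, positions))


def acceleration(r, masses, positions):
    """g(r) = -G * sum_i m_i (r - r_i) / |r - r_i|^3, as an [x, y, z] list."""
    g = [0.0, 0.0, 0.0]
    for m, p in zip(masses, positions):
        d = math.dist(r, p)
        for k in range(3):
            g[k] -= G * m * (r[k] - p[k]) / d**3
    return g


def potential_energy(masses, positions):
    """V = -G * sum over pairs i < j of m_i m_j / |r_i - r_j|."""
    total = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            total -= G * masses[i] * masses[j] / math.dist(positions[i], positions[j])
    return total


# Hypothetical example: two equal masses 2 m apart on the x axis.
masses = [1.0e3, 1.0e3]
positions = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(potential((0.0, 0.0, 0.0), masses, positions))
print(acceleration((0.0, 0.0, 0.0), masses, positions))
print(potential_energy(masses, positions))
```

At the midpoint between the two equal masses the accelerations cancel, while the potentials add.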
B7. Continuous distributions of mass
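The gravitational potential at a position $\mathbf{r}$ in a continuous distribution of mass enclosed in
a volume $V$ is given by
\[ \Phi(\mathbf{r}) = -G \int_V \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,dV' , \]
where $\mathbf{r}'$ is the position vector of the volume element $dV'$, $\rho(\mathbf{r}')$ is the mass density at the
position $\mathbf{r}'$, and $G$ is the constant of gravitation. The gradient of this is
\[ \nabla\Phi(\mathbf{r}) = G \int_V \frac{\rho(\mathbf{r}')\,(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\,dV' . \]
So the acceleration due to gravity at the point $\mathbf{r}$ is
\[ \mathbf{g} = -\nabla\Phi(\mathbf{r}) = -G \int_V \frac{\rho(\mathbf{r}')\,(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\,dV' . \]
The internal gravitational potential energy of some distribution of mass is
\[ V = -\frac{1}{2}\,G \int_V \int_V \frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,dV\,dV' , \]
where $\mathbf{r}$ is the position vector of the volume element $dV$ and where $\mathbf{r}'$ is the position vector
of the volume element $dV'$.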
Appendix C: Example Problems
Problem 1: Question
Suppose some category of galaxies has an observed surface brightness profile $I(R) = I_0\,f(R/R_0)$
with all galaxies having the same $I_0$ and function $f$ but different galaxies having different
$R_0$. If the mass-to-light ratio is constant everywhere then show that
\[ L \propto v^4 , \]
where $L$ is the total luminosity and $v$ is a characteristic velocity.
Problem 2: Question
The Plummer potential has a gravitational potential $\Phi(r)$ at a distance $r$ from the centre of
a spherically-symmetric mass distribution that is given by
\[ \Phi(r) = -\frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} , \]
where $M_{\rm tot}$ is the total mass, $G$ is the constant of gravitation, and $a$ is a constant. Derive
from this an expression for the mass $M(r)$ interior to a radius $r$, and show that the density
$\rho(r)$ at a radius $r$ is
\[ \rho(r) = \frac{3 M_{\rm tot}}{4\pi}\,\frac{a^2}{(r^2 + a^2)^{5/2}} . \]
Problem 3: Question
A family of radial density profiles $\rho(r)$ that have been popular for the theoretical modelling
of spherically symmetric galaxies are the Dehnen models. These are defined so that the
density profiles are
\[ \rho(r) = \frac{q\,a}{4\pi}\,\frac{r^q}{r^3\,(r + a)^{q+1}}\,M_{\rm tot} , \]
where $r$ is the radial distance from the centre of the galaxy, $q$ is an adjustable parameter, $a$
is a scaling constant (determining the size of the galaxy), and $M_{\rm tot}$ is the total mass. The
special case of $q = 1$, which is called the Jaffe model, is particularly important because it
is found to fit the observed $I(R)$ of ellipticals at least as well as the de Vaucouleurs $R^{1/4}$
profile.
What is the mass $M(r)$ interior to a radius $r$ for any value of $q$?
What is the gravitational potential of a mass distribution having a Jaffe $\rho(r)$?
The Dehnen models have an interesting limit as $q \to 0$. What is it?
You may use the standard integral
\[ \int \frac{r^{q-1}}{(r + a)^{q+1}}\,dr = \frac{1}{q\,a}\,\frac{r^q}{(r + a)^q} + \mbox{constant} . \]
Problem 1: Answer
Integrating over the surface brightness gives the luminosity of a galaxy to be $L \propto I_0 R_0^2$.
Because $I_0$ is constant for all galaxies of this type, $L \propto R_0^2$ for all. The virial theorem
implies $M/R_0 \propto v^2$, where $M$ is the mass of a galaxy and $v$ is a typical velocity of stars
in a galaxy. Eliminating $R_0$ gives $L \propto M^2 v^{-4}$. Because the mass-to-light ratio is constant,
$M/L = \mbox{constant}$, so $M \propto L$. Substituting for $M$ in $L \propto M^2 v^{-4}$ gives $L \propto L^2 v^{-4}$, which in
turn gives
\[ L \propto v^4 , \]
the required result.
This is the same as the Tully-Fisher relation for spiral galaxies, or the Faber-Jackson relation
for elliptical galaxies (and observed samples of both types of galaxies do tend to have only
a limited range in $I_0$ and standard $I(R)$ profiles).
Problem 2: Answer
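Gauss's Law gives $\oint_S \nabla\Phi\cdot d\mathbf{S} = 4\pi G M_S$ for any closed surface $S$, where $\Phi$ is the potential at
a point on the surface and $M_S$ is the total mass enclosed within that surface. Consider the
surface $S$ to be a spherical surface of radius $r$ centred on the mass distribution. Therefore
$\Phi = \Phi(r)$ is constant over the surface of radius $r$.
Using a spherical polar coordinate system $(r, \theta, \phi)$ centred on the mass distribution with unit
vectors $\hat{e}_r$, $\hat{e}_\theta$ and $\hat{e}_\phi$,
\[ \nabla\Phi \equiv \hat{e}_r\,\frac{\partial\Phi}{\partial r} + \hat{e}_\theta\,\frac{1}{r}\frac{\partial\Phi}{\partial\theta} + \hat{e}_\phi\,\frac{1}{r\sin\theta}\frac{\partial\Phi}{\partial\phi} = \hat{e}_r\,\frac{d\Phi}{dr} , \]
in this case because the $\partial\Phi/\partial\theta$ and $\partial\Phi/\partial\phi$ terms are zero on account of the spherical
symmetry. So, $\nabla\Phi$ is directed radially outwards and $|\nabla\Phi| = d\Phi/dr$.
So $\nabla\Phi$ and $d\mathbf{S}$ are parallel over the whole surface. Therefore $\nabla\Phi\cdot d\mathbf{S} = |\nabla\Phi|\,|d\mathbf{S}|\cos 0 =
|\nabla\Phi|\,dS$, which gives in Gauss's Law,
\[ |\nabla\Phi| \oint_S dS = 4\pi G M(r) , \]
using the fact that $|\nabla\Phi|$ is constant over the surface. Substituting for $|\nabla\Phi| = d\Phi/dr$ we get,
\[ \frac{d\Phi}{dr}\,(4\pi r^2) = 4\pi G M(r) \quad\Rightarrow\quad M(r) = \frac{r^2}{G}\,\frac{d\Phi}{dr} . \]
Differentiating the expression for $\Phi$ in the question,
\[ \frac{d\Phi}{dr} = \frac{G M_{\rm tot}\,r}{(r^2 + a^2)^{3/2}} , \quad\mbox{which gives}\quad M(r) = \frac{M_{\rm tot}\,r^3}{(r^2 + a^2)^{3/2}} , \]
the result given in the lectures.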
To determine the density $\rho$, we can consider a thin spherical shell of radius $r$ and thickness
$dr$ centred on the mass distribution. The volume of the shell is $4\pi r^2\,dr$ and its mass is
$4\pi r^2 \rho(r)\,dr$, where $\rho(r)$ is the density at a radius $r$. So,
\[ \rho(r) = \frac{1}{4\pi r^2}\,\frac{dM}{dr} . \]
Differentiating the expression for $M(r)$ derived above using the product rule,
\[ \frac{dM}{dr} = \frac{3 M_{\rm tot}\,r^2}{(r^2 + a^2)^{3/2}} - \frac{3 M_{\rm tot}\,r^4}{(r^2 + a^2)^{5/2}} = \frac{3 M_{\rm tot}\,a^2 r^2}{(r^2 + a^2)^{5/2}} . \]
\[ \therefore\quad \rho(r) = \frac{3 M_{\rm tot}}{4\pi}\,\frac{a^2}{(r^2 + a^2)^{5/2}} , \]
the result we had to prove.
As an alternative method, we could use Poisson's equation $\nabla^2\Phi = 4\pi G\rho$, which in this case
of spherical symmetry gives
\[ \rho(r) = \frac{1}{4\pi G}\,\nabla^2\Phi = \frac{1}{4\pi G}\,\frac{1}{r^2}\frac{d}{dr}\left(r^2\,\frac{d\Phi}{dr}\right) . \]
Substituting the expression for $d\Phi/dr$ from above and differentiating would give the
required result.
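The two analytic results can be checked numerically. The sketch below works in units with $G = M_{\rm tot} = a = 1$ (a choice made here for convenience) and uses central finite differences; the function names and the step size are ours. It recovers $M(r)$ from the potential through $M(r) = (r^2/G)\,d\Phi/dr$ and the density from $\rho = (dM/dr)/(4\pi r^2)$, and compares both with the closed-form expressions.

```python
import math


def plummer_potential(r):
    """Phi(r) = -G M_tot / sqrt(r^2 + a^2), with G = M_tot = a = 1."""
    return -1.0 / math.sqrt(r * r + 1.0)


def plummer_mass(r):
    """Analytic M(r) = M_tot r^3 / (r^2 + a^2)^(3/2)."""
    return r**3 / (r * r + 1.0) ** 1.5


def plummer_density(r):
    """Analytic rho(r) = (3 M_tot / 4 pi) a^2 / (r^2 + a^2)^(5/2)."""
    return 3.0 / (4.0 * math.pi) / (r * r + 1.0) ** 2.5


def mass_from_potential(r, h=1.0e-5):
    """M(r) = (r^2 / G) dPhi/dr, with dPhi/dr from a central difference."""
    dphi_dr = (plummer_potential(r + h) - plummer_potential(r - h)) / (2.0 * h)
    return r * r * dphi_dr


def density_from_mass(r, h=1.0e-5):
    """rho(r) = (dM/dr) / (4 pi r^2), with dM/dr from a central difference."""
    dm_dr = (plummer_mass(r + h) - plummer_mass(r - h)) / (2.0 * h)
    return dm_dr / (4.0 * math.pi * r * r)


for r in (0.5, 1.0, 3.0):
    print(r, mass_from_potential(r), plummer_mass(r),
          density_from_mass(r), plummer_density(r))
```

The numerical and analytic columns agree to within the finite-difference error, which supports the algebra above without replacing it.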
Problem 3: Answer
Consider a thin spherical shell of radius $r'$ and thickness $dr'$ concentric with the galaxy. The
mass in the shell will be
\[ dM' = 4\pi r'^2\,dr'\,\rho(r') . \]
Integrating from the centre of the galaxy ($r' = 0$) to radial distance $r$,
\[ \int_0^{M(r)} dM' = \int_0^r 4\pi r'^2\,\rho(r')\,dr' = \int_0^r 4\pi r'^2\,\frac{q\,a}{4\pi}\,\frac{r'^q}{r'^3\,(r' + a)^{q+1}}\,M_{\rm tot}\,dr' . \]
\[ \therefore\quad M(r) = q\,a\,M_{\rm tot} \int_0^r \frac{r'^{q-1}}{(r' + a)^{q+1}}\,dr' = M_{\rm tot} \left[\frac{r'^q}{(r' + a)^q}\right]_{r'=0}^{r} \]
on using the standard integral provided. This gives,
\[ M(r) = M_{\rm tot} \left[\frac{r^q}{(r + a)^q} - \frac{0^q}{(0 + a)^q}\right] = M_{\rm tot}\,\frac{r^q}{(r + a)^q} , \]
the required result (for all $q \neq 0$).
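The gravitational potential $\Phi$ is related to the mass $M_S$ inside a surface $S$ by Gauss's Law,
$\oint_S \nabla\Phi\cdot d\mathbf{S} = 4\pi G M_S$. If $S$ is a sphere of radius $r$ centred on the galaxy, $\oint_S \nabla\Phi\cdot d\mathbf{S} =
4\pi G M(r)$, which in this spherically symmetric case becomes
\[ \oint_S \frac{d\Phi}{dr}\,dS = 4\pi G M(r) \]
\[ \therefore\quad \frac{d\Phi}{dr} \oint_S dS = 4\pi G M_{\rm tot}\,\frac{r^q}{(r + a)^q} \]
\[ \therefore\quad 4\pi r^2\,\frac{d\Phi}{dr} = 4\pi G M_{\rm tot}\,\frac{r}{(r + a)} \quad\mbox{for the Jaffe model } (q = 1) \]
\[ \therefore\quad \frac{d\Phi}{dr} = G M_{\rm tot}\,\frac{1}{r}\,\frac{1}{(r + a)} . \]
Integrating from a radius $r$ to infinity (with $\Phi(\infty) = 0$),
\[ \int_{\Phi(r)}^{0} d\Phi' = G M_{\rm tot} \int_r^\infty \frac{1}{r'}\,\frac{1}{(r' + a)}\,dr' . \]
This can be solved using partial fractions:
\[ \Phi(r) - 0 = -G M_{\rm tot} \int_r^\infty \frac{1}{a}\left(\frac{1}{r'} - \frac{1}{(r' + a)}\right) dr' = -\frac{G M_{\rm tot}}{a}\,\Big[\ln r' - \ln(r' + a)\Big]_{r'=r}^{\infty} \]
\[ \therefore\quad \Phi(r) = -\frac{G M_{\rm tot}}{a} \left[\ln\left(\frac{1}{1 + a/r'}\right)\right]_{r'=r}^{\infty} = \frac{G M_{\rm tot}}{a}\,\ln\left(\frac{r}{r + a}\right) = -\frac{G M_{\rm tot}}{a}\,\ln\left(\frac{r + a}{r}\right) , \]
for the potential at a radius $r$ in the Jaffe ($q = 1$) model.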
Alternatively, we could approach the problem from a more physical perspective and consider
the potential energy released when a particle of mass $m$ is brought from infinity to a radius
$r$ in the presence of the gravitational force $F = -G M(r)\,m/r^2$. The potential energy at a
distance $r$ from the centre is then $V_p(r) = m\,\Phi(r)$, from which we could calculate $\Phi(r)$. This
would give the same result as the method above.
If $q \to 0$, the density profile gives $\rho = 0$ for $r > 0$. However,
\[ \rho(0) = \frac{a\,M_{\rm tot}}{4\pi} \lim_{q,\,r \to 0} \frac{q\,r^q}{r^3\,(r + a)^{q+1}} . \]
At the same time, $M(r) = M_{\rm tot}\,r^q/(r + a)^q \to M_{\rm tot}$ for every $r > 0$ as $q \to 0$.
So $q \to 0$ implies that all the mass $M_{\rm tot}$ is concentrated at the centre: it corresponds to a
point mass.
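A short numerical sketch can support both the enclosed-mass formula and the point-mass limit. It works in units with $a = M_{\rm tot} = 1$ (our choice), integrates $4\pi r^2 \rho(r)$ over a shell with a simple midpoint rule, and compares the result with the difference of the analytic $M(r)$ at the shell boundaries. The function names, shell radii and number of steps are assumptions made for illustration.

```python
import math


def dehnen_density(r, q, a=1.0, m_tot=1.0):
    """rho(r) = (q a / 4 pi) r^q / (r^3 (r + a)^(q+1)) M_tot."""
    return q * a / (4.0 * math.pi) * r**q / (r**3 * (r + a) ** (q + 1)) * m_tot


def dehnen_mass(r, q, a=1.0, m_tot=1.0):
    """Analytic enclosed mass M(r) = M_tot r^q / (r + a)^q."""
    return m_tot * (r / (r + a)) ** q


def shell_mass(r1, r2, q, n=2000):
    """Mass between r1 and r2 from a midpoint-rule integral of 4 pi r^2 rho."""
    h = (r2 - r1) / n
    total = 0.0
    for k in range(n):
        r = r1 + (k + 0.5) * h
        total += 4.0 * math.pi * r * r * dehnen_density(r, q) * h
    return total


for q in (1.0, 2.0):  # q = 1 is the Jaffe model
    numeric = shell_mass(0.5, 3.0, q)
    analytic = dehnen_mass(3.0, q) - dehnen_mass(0.5, q)
    print(f"q = {q}: numeric {numeric:.6f}, analytic {analytic:.6f}")

# As q -> 0 nearly all of the mass lies inside even a very small radius.
for q in (1.0, 0.1, 1.0e-4):
    print(f"q = {q}: M(r = 0.001 a) / M_tot = {dehnen_mass(1.0e-3, q):.5f}")
```

The last loop shows the enclosed mass fraction within r = 0.001 a rising towards unity as q decreases, consistent with the point-mass interpretation.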
Problem 4: Question
The distribution function $f$ in a spherically-symmetric galaxy is related to the mass density
$\rho(r)$ at a radial distance $r$ from the centre by
\[ \rho(r) = 4\pi\sqrt{2}\,m \int_{\Phi(r)}^{0} \sqrt{E_m - \Phi(r)}\;f(E_m)\,dE_m , \]
where $E_m$ is the energy per unit mass for a star, $\Phi(r)$ is the gravitational potential at a radius
$r$, and $m$ is the mean mass per star. Show that a functional form $f(E_m) = b\,(-E_m)^{7/2}$ is a
solution to this equation for a Plummer potential, where $b$ is a constant, using the potential
and density given in Question 2. Express $b$ in terms of $G$, $M_{\rm tot}$ and $a$ using the result of
Question 2. The substitution $E_m = \Phi\cos^2\theta$ and the standard result
\[ \int_0^{\pi/2} \sin^2\theta\,\cos^8\theta\,d\theta = \frac{7\pi}{512} \]
may prove useful.
Assuming $m = 0.70\,M_\odot$, what is the value of the distribution function $f$ for $(x, y, z, v_x, v_y, v_z) =
(10\ \mathrm{kpc}, 0, 0, 0, 0, 200\ \mathrm{km\,s^{-1}})$ in a galaxy having a Plummer potential with a softening parameter
$a = 1.70$ kpc and a total mass of $2.0 \times 10^{12}\,M_\odot$? Note that $x = y = z = 0$ corresponds
to the centre of the galaxy in this coordinate system.
[ $1\,M_\odot = 1.989 \times 10^{30}$ kg, 1 kpc $= 3.0857 \times 10^{19}$ m, and $G = 6.673 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$.]
Problem 4: Answer
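Try $f = b\,(-E_m)^{7/2}$, where $E_m$ is the energy per unit mass and $b$ is a constant. The density
becomes
\[ \rho(r) = 4\pi\sqrt{2}\,m\,b \int_{\Phi}^{0} \sqrt{E_m - \Phi}\;(-E_m)^{7/2}\,dE_m . \]
Note that $E_m$ and $\Phi$ are both negative, so $-E_m$ and $-\Phi$ are both positive. Use the
substitution $E_m = \Phi\cos^2\theta$. Differentiating, $dE_m = -2\Phi\sin\theta\cos\theta\,d\theta$. The limits of the integral
are:
when $E_m = \Phi$, $\cos^2\theta = 1$, so $\cos\theta = 1$. Take $\theta = 0$.
when $E_m = 0$, $\cos^2\theta = 0$, so $\cos\theta = 0$. Take $\theta = \pi/2$.
Using this substitution, the density becomes
\[ \rho(r) = 4\pi\sqrt{2}\,m\,b \int_0^{\pi/2} \sqrt{\Phi\cos^2\theta - \Phi}\;(-\Phi\cos^2\theta)^{7/2}\,(-2\Phi\sin\theta\cos\theta\,d\theta) \]
\[ = 4\pi\sqrt{2}\,m\,b \int_0^{\pi/2} \sqrt{-\Phi}\,\sin\theta\;(-\Phi)^{7/2}\cos^7\theta\;(2)(-\Phi)\sin\theta\cos\theta\,d\theta \]
\[ = 8\pi\sqrt{2}\,m\,b\,(-\Phi)^5 \int_0^{\pi/2} \sin^2\theta\,\cos^8\theta\,d\theta . \]
Using the standard integral, we get,
\[ \rho(r) = 8\pi\sqrt{2}\,m\,b\,(-\Phi)^5 \left(\frac{7\pi}{512}\right) = \frac{7\sqrt{2}\,\pi^2}{64}\,m\,b\,(-\Phi)^5 . \]
Substituting for the Plummer potential from Question 2, we obtain,
\[ \rho(r) = \frac{7\sqrt{2}\,\pi^2}{64}\,m\,b\,\frac{G^5 M_{\rm tot}^5}{(r^2 + a^2)^{5/2}} . \]
This is the same as the expression for the density of the Plummer potential in Question 2 if
\[ b = \frac{24\sqrt{2}}{7\pi^3}\,\frac{a^2}{m\,G^5 M_{\rm tot}^4} . \]
So $f(E_m) = b\,(-E_m)^{7/2}$ is a solution to the equation relating density and the distribution
function given in the question if the constant $b$ has this value.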
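The energy per unit mass for the position and velocity in the question is
$$E_m = -\frac{G M_{tot}}{\sqrt{r^2 + a^2}} + \frac{1}{2} v^2 = -8.48 \times 10^{11} + 2.00 \times 10^{10}\ \mathrm{J\,kg^{-1}} = -8.28 \times 10^{11}\ \mathrm{J\,kg^{-1}}$$
using a radial distance from the centre of the galaxy of $r = \sqrt{x^2 + y^2 + z^2} = 10$ kpc $= 3.09 \times 10^{20}$ m. But $f = b\,(-E_m)^{7/2}$ with
$$b = \frac{24\sqrt{2}}{7\pi^3}\, \frac{a^2}{m\, G^5 M_{tot}^4} = 0.1564\, \frac{(1.70 \times 3.0857 \times 10^{19})^2}{(0.70 \times 1.989 \times 10^{30})\,(6.673 \times 10^{-11})^5\,(2 \times 10^{12} \times 1.989 \times 10^{30})^4}\ \mathrm{m^{-13}\,s^{10}}$$
$$= 0.1564\, \frac{(1.70 \times 3.0857)^2 \times 10^{38}}{(0.70 \times 1.989)\,(6.673)^5\,(2 \times 1.989)^4 \times 10^{143}}\ \mathrm{m^{-13}\,s^{10}} = 9.33 \times 10^{-112}\ \mathrm{m^{-13}\,s^{10}} .$$
So $f = b\,(-E_m)^{7/2}$ gives
$$f = 9.33 \times 10^{-112} \times (8.28 \times 10^{11})^{7/2}\ \mathrm{m^{-3}\,(m\,s^{-1})^{-3}} = 4.82 \times 10^{-70}\ \mathrm{m^{-3}\,(m\,s^{-1})^{-3}} .$$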
[Part of this question appeared in the May 2005 examination.]
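The arithmetic in this answer is easy to get wrong by hand because of the very large and very small exponents, so a short numerical check may help. The following is a minimal sketch that evaluates $E_m$, $b$ and $f$ from the constants quoted in the question; the variable names are illustrative, and the stellar mass of $0.70\,M_\odot$ is taken from the numbers substituted in the worked answer.

```python
import math

# Constants as quoted in the question (SI units).
G = 6.673e-11        # m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
KPC = 3.0857e19      # m

# Problem parameters; m_star = 0.70 M_sun is the value used in the worked answer.
a = 1.70 * KPC
M_tot = 2.0e12 * M_SUN
m_star = 0.70 * M_SUN
r = 10.0 * KPC       # radial distance for (x, y, z) = (10 kpc, 0, 0)
v = 200.0e3          # speed in m s^-1

# Energy per unit mass in the Plummer potential.
E_m = -G * M_tot / math.sqrt(r**2 + a**2) + 0.5 * v**2

# Normalisation constant b and distribution function f = b (-E_m)^(7/2).
b = 24.0 * math.sqrt(2.0) / (7.0 * math.pi**3) * a**2 / (m_star * G**5 * M_tot**4)
f = b * (-E_m) ** 3.5

print(f"E_m = {E_m:.3e} J/kg")
print(f"b   = {b:.3e} m^-13 s^10")
print(f"f   = {f:.3e} m^-3 (m/s)^-3")
```

All intermediate quantities stay within double-precision range, so no rescaling of units is needed for this particular set of values.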
Problem 5: Question
A spherical elliptical galaxy has a total density distribution
$$\rho_{tot}(r) = \frac{\rho_0}{1 + r^2/a^2} ,$$
as a function of radial distance $r$ from its centre, where $\rho_0$ and $a$ are constants. Show that the mass $M(r)$ interior to a radius $r$ has the form $M(r) \propto r^3$ for $r \ll a$ and $M(r) \propto r$ for $r \gg a$.
Consider a population of massless test particles in the potential of this galaxy. Assume that this population is spherical, non-rotating, isothermal and isotropic, with velocity dispersion $\sigma$ in each velocity component. What is the radial density distribution $\rho_p(r)$ of this test particle population, expressed in terms of $M(r)$ and $r$?
Solve for $\rho_p(r)$ in terms of $r$ explicitly for large radii (i.e. for regions where $r \gg a$) to show that the density has a power law dependence on radius. What is the index of this power law? Give a physical interpretation of this index. What is the condition for the density distributions of the test particle population and the galaxy itself to have similar forms at large $r$?
Problem 6: Question
Many of the researchers who perform N-body simulations do so to study the dynamics of galaxies, but some others use N-body techniques to study the dynamics of globular clusters. Naively, we might expect that the latter group of people would have an easier job, because they can easily afford as many particles in their simulations as there are stars in the real objects, and they do not need to worry about gas dynamics. We might therefore expect that globular cluster dynamics would be a well-understood subject by now. However, many problems have not been solved fully and plenty of difficult research remains to be done. This problem is to work out why.
Consider a globular cluster and a galaxy, both $10^{10}$ yr old. The globular cluster has a size 20 pc across and contains $10^5$ stars moving with a typical velocity 15 km s$^{-1}$. The galaxy is 20 kpc across and contains $10^{11}$ stars with a typical velocity 200 km s$^{-1}$. Both of these are simulated using $10^5$ particles. Give two reasons why the globular cluster simulation will be more difficult.
Problem 5: Answer
For a thin spherical shell we have $dM(r) = 4\pi r^2 \rho(r)\, dr$. Integrating from the centre ($r = 0$) to a radial distance $r$, we obtain
$$M(r) = 4\pi\rho_0 \int_0^r \frac{r'^2}{1 + r'^2/a^2}\, dr' = 4\pi\rho_0 a^2 \left( r - a \tan^{-1}(r/a) \right)$$
(using a substitution $r' = a\tan\theta$ to solve the integral).
When $r \ll a$, the expansion $\tan^{-1}x = x - x^3/3 + \ldots$ gives
$$M(r) = 4\pi\rho_0 a^2 \left[ r - r + \frac{r^3}{3a^2} - O(r^5) \right] = \frac{4\pi\rho_0 r^3}{3} - O(r^5) .$$
Therefore, $M(r) \propto r^3$ when $r \ll a$.
When $r \gg a$, $\tan^{-1}(r/a) \to \pi/2$, so $r - a\tan^{-1}(r/a) \simeq r$, which gives $M(r) = 4\pi\rho_0 a^2 r$. Therefore, $M(r) \propto r$ when $r \gg a$. (This is an important result because $M(r) \propto r$ gives a circular velocity $v_{circ}$ = constant, as is observed in the rotation curves of spiral galaxies.)
Actually the asymptotic forms are obvious from $\rho \simeq \rho_0$ (small $r$) and $\rho \propto \rho_0/r^2$ (large $r$).
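For a spherically-symmetric potential, the second Jeans equation gives
$$\frac{\partial}{\partial r}\left( n \langle v_r^2 \rangle \right) + \frac{n}{r}\left[ 2\langle v_r^2 \rangle - \langle v_\theta^2 \rangle - \langle v_\phi^2 \rangle \right] = -n\, \frac{\partial \Phi}{\partial r}$$
where $n$ is the number density of some system of particles or stars, $\Phi$ is the gravitational potential, while $\langle v_r^2 \rangle$, $\langle v_\theta^2 \rangle$ and $\langle v_\phi^2 \rangle$ are the mean values of the squares of the velocity in the $r$, $\theta$ and $\phi$ directions (from Section 2.18 of the course notes). For an isotropic distribution with no net rotation, $\langle v_r^2 \rangle = \langle v_\theta^2 \rangle = \langle v_\phi^2 \rangle = \sigma^2$, where $\sigma$ is a constant. So,
$$\frac{d}{dr}\left( n \sigma^2 \right) = -n\, \frac{\partial \Phi}{\partial r} .$$
To find $d\Phi/dr$, use the fact that the acceleration due to gravity is $\mathbf{g} = -\nabla\Phi$ and that $g = -GM(r)/r^2$ for a spherical distribution of mass. Since the particles are isothermal, $\sigma$ is independent of $r$. Therefore,
$$\sigma^2\, \frac{dn}{dr} = -n\, \frac{GM(r)}{r^2}$$
in terms of $M(r)$ and $r$. Using $\rho_p \propto n$ and integrating,
$$\sigma^2 \int \frac{d\rho_p}{\rho_p} = -G \int \frac{M(r)}{r^2}\, dr ,$$
which gives,
$$\ln \rho_p = -\frac{G}{\sigma^2} \int \frac{M(r)}{r^2}\, dr .$$
This is the density of the test particles as a function of $r$ and $M(r)$.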
For large $r$, we have $M(r) = 4\pi\rho_0 a^2 r$, which gives,
$$\ln \rho_p = -\frac{4\pi G \rho_0 a^2}{\sigma^2} \int \frac{dr}{r} = -\frac{4\pi G \rho_0 a^2}{\sigma^2} \ln r + k_1 ,$$
where $k_1$ is a constant. Rearranging,
$$\rho_p = k_3\, r^{-4\pi G \rho_0 a^2/\sigma^2} ,$$
where $k_3$ is a constant. This is a power law of the form $\rho_p \propto r^{-l}$ where the index is $l = 4\pi G \rho_0 a^2/\sigma^2$. So the density of the population of test particles will have a power law dependence on distance at large radii.
Here $\sqrt{l}$ can be interpreted as the ratio of circular speed to dispersion, since $v_{circ}^2 = GM(r)/r = 4\pi G \rho_0 a^2$ at large $r$. The tracer population will have the same large-$r$ density law as the massive population if $l = 2$ (i.e. $\rho \propto r^{-2}$ and $\rho_p \propto r^{-2}$ at large $r$ if $l = 2$).
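The two limiting forms of $M(r)$ can be checked numerically. The sketch below works in units where $\rho_0 = a = 1$ (an assumption made purely for convenience) and compares the closed-form enclosed mass with a uniform sphere at small $r$ and with the linear form at large $r$; the function name is illustrative.

```python
import math

def enclosed_mass(r, rho0=1.0, a=1.0):
    """M(r) = 4 pi rho0 a^2 (r - a arctan(r/a)) for rho = rho0 / (1 + r^2/a^2)."""
    return 4.0 * math.pi * rho0 * a**2 * (r - a * math.atan(r / a))

# Small-r limit (r << a): compare with the uniform-sphere mass (4/3) pi rho0 r^3.
r_small = 1.0e-2
ratio_small = enclosed_mass(r_small) / (4.0 / 3.0 * math.pi * r_small**3)

# Large-r limit (r >> a): compare with 4 pi rho0 a^2 r (here a = rho0 = 1).
r_large = 1.0e4
ratio_large = enclosed_mass(r_large) / (4.0 * math.pi * r_large)

print(f"M(r) / (4/3 pi r^3) at r = 0.01 a : {ratio_small:.5f}")
print(f"M(r) / (4 pi a^2 r)  at r = 1e4 a : {ratio_large:.5f}")
```

Both ratios should approach unity, which is the statement that $M(r) \propto r^3$ for $r \ll a$ and $M(r) \propto r$ for $r \gg a$.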
Problem 6: Answer
The crossing time of the globular cluster will be $T_{cross} \sim 20$ pc / 15 km s$^{-1}$ $\simeq 1.3 \times 10^{6}$ yr, while that of the galaxy will be 20 kpc / 200 km s$^{-1}$ $\simeq 1.0 \times 10^{8}$ yr. So, the globular cluster will be $\sim 10^{4}\, T_{cross}$ old, whereas the galaxy will be $\sim 10^{2}\, T_{cross}$ old. N-body modelling of the dynamics of the two objects will use a series of time steps, with the positions of the particles being computed at each of these steps. However, the globular cluster simulation will need steps $10^{2}$ times smaller than for the galaxy to achieve steps representing the same fraction of the crossing time in both systems. Using $T_{relax}/T_{cross} \simeq N/(12 \ln N)$, the globular cluster will have $T_{relax} \simeq 700\, T_{cross} \simeq 10^{9}$ yr, so the globular cluster will be $\sim 10\, T_{relax}$ old. In contrast, the galaxy will be $\ll T_{relax}$ old. The globular cluster simulation will therefore have to consider two-body relaxation, while the galaxy simulation can ignore it. Both these considerations make the globular cluster simulation more difficult.
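The timescale estimates above can be reproduced in a few lines. This is a minimal sketch using the relation $T_{relax}/T_{cross} \simeq N/(12 \ln N)$ quoted in the answer; the unit conversion factors and the function name are assumptions introduced here for illustration.

```python
import math

PC = 3.086e16     # metres per parsec
YR = 3.156e7      # seconds per year (approximate)
AGE = 1.0e10      # age of both systems in years

def timescales(size_pc, speed_kms, n_stars):
    """Return (T_cross, T_relax) in years, with T_relax/T_cross ~ N / (12 ln N)."""
    t_cross = size_pc * PC / (speed_kms * 1.0e3) / YR
    t_relax = t_cross * n_stars / (12.0 * math.log(n_stars))
    return t_cross, t_relax

gc_cross, gc_relax = timescales(20.0, 15.0, 1.0e5)        # globular cluster
gal_cross, gal_relax = timescales(20.0e3, 200.0, 1.0e11)  # galaxy

for name, t_cross, t_relax in [("cluster", gc_cross, gc_relax),
                               ("galaxy", gal_cross, gal_relax)]:
    print(f"{name}: T_cross = {t_cross:.2e} yr, T_relax = {t_relax:.2e} yr, "
          f"age = {AGE / t_cross:.0f} T_cross = {AGE / t_relax:.2e} T_relax")
```

The real star counts are used for the relaxation times, since two-body relaxation in the physical system depends on the true number of stars rather than on the number of simulation particles.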
Problem 7: Question
A star lying in the Galactic plane is observed to have a visual magnitude of $V = 13.60$ mag and a colour index $(B - V) = 0.98$ mag. Its spectrum shows it to be a dwarf star of spectral type G6 with a solar composition. Stars of this type are known to have an intrinsic colour of $(B - V)_0 = 0.76$ mag and an absolute visual magnitude of $M_V = +5.20$.
What is the extinction by the interstellar medium in the V band between us and the star? What is the distance of the star? What is the mean extinction per unit distance in the direction of the star expressed in mag kpc$^{-1}$ for the V band? Will this extinction per unit distance be the same for other stars in the sky?
What would you expect the extinction to be towards the star in the I and K photometric bands (which have central wavelengths of 790 nm and 2.2 $\mu$m respectively)?
Problem 8: Question
Observations of a part of the interstellar medium of the Galaxy show that a region of hot ionised gas (with a temperature 500 000 K, number density of ions 6000 m$^{-3}$), a region of cold neutral gas (temperature 50 K, number density of molecules $2 \times 10^{7}$ m$^{-3}$), and a region of warm neutral gas (temperature 10 000 K, number density of atoms $1 \times 10^{5}$ m$^{-3}$) are in contact with each other. Which, if any, of these are in pressure equilibrium with the others?
Problem 9: Question
The mean density in the form of stars in the disc of the Galaxy is observed to vary with the distance $z$ from the Galactic plane as $\rho_s(z) = \rho_{so}\, e^{-|z|/h_s}$ close to the Sun, where $\rho_{so}$ is the density of stars in space in the plane, and $h_s$ is a scale height ($\rho_{so}$ and $h_s$ are therefore constants at the distance of the Sun from the Galactic Centre). The density of the interstellar gas $\rho_g$ is also found to vary exponentially with height, with $\rho_g(z) = \rho_{go}\, e^{-|z|/h_g}$, where $\rho_{go}$ and $h_g$ are constants. Observations show that $h_s = 250$ pc and $h_g = 150$ pc and $\rho_{so} = 6\,\rho_{go}$.
What is the ratio of the surface density of stars, $\Sigma_s$, to that of gas, $\Sigma_g$, at the Sun's distance from the Galactic Centre?
How do you expect the surface density of the dust, $\Sigma_d$, to compare with $\Sigma_s$?
Problem 7: Answer
The $(B-V)$ colour excess of the star is $E_{(B-V)} = (B-V) - (B-V)_0 = 0.98 - 0.76 = 0.22$ mag.
The mean interstellar extinction curve has the relation $A_V = 3.3\, E_{(B-V)}$. Therefore, we expect $A_V = 3.3 \times 0.22 = 0.73$ mag. The relation between apparent and absolute magnitude for the V band is $V - M_V = 5 \log_{10}(D/\mathrm{pc}) - 5 + A_V$. Therefore, the distance is $D = 10^{(V - M_V + 5 - A_V)/5}$ pc $= 342$ pc. The distance to the star is 340 pc.
The mean extinction per unit distance is 0.73 mag / 342 pc $= 2.1$ mag kpc$^{-1}$.
[Note the linear relation. This comes from the extinction $A_V = 1.086\, \tau_V$ where $\tau_V$ is the optical depth in the V band. Put $\tau_V = \rho_d\, \kappa_V\, D$, where $\rho_d$ is the density of dust in space, $\kappa_V$ is a coefficient expressing how strongly dust absorbs light in the V band (a constant for the V band unless the type of dust particles varies substantially), and $D$ is the distance to the star. So $A_V = 1.086\, \rho_d\, \kappa_V\, D$ and therefore $A_V$ varies linearly with distance $D$.]
This figure depends on the density of dust in space. Therefore it will be large (as in this case) in the direction of the Milky Way, and small away from the plane of the Galaxy. It will be large along sight lines that pass through dense gas (such as cold, neutral gas) and smaller along sight lines through lower density gas (such as hot ionised gas). Therefore the mean extinction per unit distance varies strongly across the sky.
[In the $A_V = 1.086\, \rho_d\, \kappa_V\, D$ representation used above, $A_V/D = 1.086\, \rho_d\, \kappa_V$. The parameter $\kappa_V$ will vary only slightly, but $\rho_d$ varies greatly. Therefore, $A_V/D$ varies greatly.]
From the mean interstellar extinction law graph (on page 57 of the course notes), $A_I/E_{(B-V)} = 2.0$, and $A_K/E_{(B-V)} = 0.4$ by extrapolation. So $A_I = 2.0\, E_{(B-V)} = 2.0 \times 0.22 = 0.44$ mag, and $A_K = 0.4\, E_{(B-V)} = 0.4 \times 0.22 = 0.09$ mag. So the extinction in the I and K bands will be 0.44 and 0.09 mag respectively.
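The chain of steps in this answer (colour excess, extinction, distance, extinction per unit distance) can be sketched as a short calculation. The coefficients 3.3, 2.0 and 0.4 are those quoted in the answer; the variable names are illustrative only.

```python
# Observed and intrinsic quantities from the question (magnitudes).
V, B_V, B_V0, M_V = 13.60, 0.98, 0.76, 5.20

# Ratios of extinction to colour excess quoted in the answer.
R_V, R_I, R_K = 3.3, 2.0, 0.4

E_BV = B_V - B_V0                              # colour excess E(B-V)
A_V = R_V * E_BV                               # V-band extinction
D_pc = 10.0 ** ((V - M_V + 5.0 - A_V) / 5.0)   # distance from the distance modulus
A_V_per_kpc = A_V / (D_pc / 1000.0)            # mean extinction per unit distance
A_I = R_I * E_BV                               # I-band extinction
A_K = R_K * E_BV                               # K-band extinction

print(f"E(B-V) = {E_BV:.2f} mag, A_V = {A_V:.2f} mag")
print(f"D = {D_pc:.0f} pc, A_V/D = {A_V_per_kpc:.1f} mag/kpc")
print(f"A_I = {A_I:.2f} mag, A_K = {A_K:.2f} mag")
```

Neglecting $A_V$ in the distance modulus would overestimate the distance, which is why the extinction is computed first.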
Problem 8: Answer
The ideal gas law relates the pressure $P$ in a gas to the number density $n$ of particles and the absolute temperature $T$ by $P = n k_B T$, where $k_B$ is the Boltzmann constant. This gives the pressure in the hot ionised gas as $P_{hot} = 6000\ \mathrm{m^{-3}} \times k_B \times 500\,000\ \mathrm{K} = 3 \times 10^{9}\, k_B$ K m$^{-3}$ (in this problem, it is easier to leave the pressure in terms of the constant $k_B$ than to calculate the result explicitly). For the cold neutral gas we find that the pressure is $P_{cold} = 1 \times 10^{9}\, k_B$ K m$^{-3}$. For the warm neutral gas the pressure is $P_{warm} = 1 \times 10^{9}\, k_B$ K m$^{-3}$.
We can see that $P_{cold} \simeq P_{warm} \neq P_{hot}$. Therefore the cold neutral region and the warm neutral region are in pressure equilibrium. The hot ionised region is not in equilibrium with the other two regions.
[Because the hot ionised region has a higher pressure than the others, it will expand by compressing them, although the higher densities in the others mean that their inertias slow the process significantly.]
[In practice, magnetic fields and cosmic rays can provide contributions to the pressure in addition to the gas pressure considered here. They will increase the pressures, particularly for the hot ionised gas which will have been produced by supernovae (supernovae will produce cosmic rays and the neutron stars they leave behind can add to the strength of the magnetic field). These extra effects are ignored in this case.]
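The comparison of the three gas phases reduces to comparing the products $nT$. A minimal sketch, leaving the pressure in units of $k_B$ K m$^{-3}$ as in the answer; the dictionary layout and the factor-of-two tolerance used to call two pressures "similar" are assumptions made here for illustration.

```python
# (number density in m^-3, temperature in K) for each phase, from the question.
phases = {
    "hot ionised": (6000.0, 5.0e5),
    "cold neutral": (2.0e7, 50.0),
    "warm neutral": (1.0e5, 1.0e4),
}

# Ideal gas law P = n k_B T, so P / k_B = n T.
pressure_over_kB = {name: n * T for name, (n, T) in phases.items()}

names = list(pressure_over_kB)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        ratio = pressure_over_kB[first] / pressure_over_kB[second]
        # Illustrative criterion: pressures within a factor of two count as similar.
        balanced = 0.5 <= ratio <= 2.0
        print(f"{first} vs {second}: P ratio = {ratio:.1f}, "
              f"{'in' if balanced else 'not in'} approximate pressure equilibrium")
```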
Problem 9: Answer
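The surface mass density of stars can be obtained by integrating the density over height $z$:
$$\Sigma_s = \int_{-\infty}^{\infty} \rho_s(z)\, dz = \int_{-\infty}^{\infty} \rho_{so}\, e^{-|z|/h_s}\, dz = 2 \int_{0}^{\infty} \rho_{so}\, e^{-z/h_s}\, dz \quad \text{(from the symmetry)}$$
$$= 2\rho_{so} \int_{0}^{\infty} e^{-z/h_s}\, dz = 2\rho_{so} \left[ -h_s\, e^{-z/h_s} \right]_{z=0}^{\infty} = 2\rho_{so} h_s .$$
Similarly, for the gas, $\Sigma_g = 2\rho_{go} h_g$. Therefore,
$$\frac{\Sigma_s}{\Sigma_g} = \frac{2\rho_{so} h_s}{2\rho_{go} h_g} = \frac{\rho_{so}}{\rho_{go}}\, \frac{h_s}{h_g} = 6 \times \frac{250}{150} = 10 .$$
So $\Sigma_s = 10\, \Sigma_g$ at the Sun's distance from the Galactic Centre.
Dust density $\rho_d$ closely follows that of gas and observations show that $\rho_d/\rho_g \simeq 0.1$. Therefore we expect $\Sigma_s \simeq 100\, \Sigma_d$ at the Sun's distance from the Galactic Centre.
[In reality, the density of the interstellar medium is highly variable from place to place. This exponential law is a mean representation of the decline in density from the Galactic plane.]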
Problem 10: Question
One variant on the Simple Model of galactic chemical evolution is the leaky-box model. This simulates the effect of shocks from supernovae and winds from young massive stars by allowing gas to leave the box at a rate proportional to the star formation rate. Therefore the change $\delta M_{total}$ in the total mass $M_{total}$ in the box is
$$\delta M_{total} = -c\, \delta M_{stars} ,$$
where $\delta M_{stars}$ is the change in the mass in stars, and $c$ is a constant of proportionality.
Use this to derive an expression for the mass in gas $M_{gas}(t)$ at time $t$ in terms of $M_{total}(0)$ and $M_{stars}(t)$.
Now modify the closed-box relation between $\delta M_{metals}$ and $\delta M_{stars}$ by adding an appropriate leaking term.
Use these two expressions to derive
$$\delta Z = \frac{p\, \delta M_{stars}}{M_{total}(0) - (1 + c)\, M_{stars}} .$$
This expression shows that the leaky-box model won't solve the G-dwarf problem. Why?
Problem 11: Question
For gravitational lensing, for very distant sources (i.e., $D_S \gg D_L$), we can write the expression for the Einstein angular radius as
$$\theta_E = k \sqrt{M/D_L} ,$$
where $k$ is a constant. Find the value of $k$ in arcsec if $M$ is measured in solar masses and $D_L$ in parsecs.
Problem 10: Answer
We can apply the same analysis to this problem as was used for the Simple Model, but the total mass $M_{total}$ is now a function of time $t$. Since $\delta M_{total} = -c\, \delta M_{stars}$, the mass of gas at time $t$ is
$$M_{gas}(t) = M_{total}(0) - (1 + c)\, M_{stars}(t) .$$
Some results from the Simple Model still apply, such as the expression for the change $\delta Z$ in the metallicity $Z$ of the gas in time $\delta t$, and the relation between the changes in the mass in stars and the total mass $M_{SF}$ that has participated in star formation up to time $t$:
$$\delta Z = \frac{\delta M_{metals}}{M_{gas}} - Z\, \frac{\delta M_{gas}}{M_{gas}} \qquad \text{and} \qquad \delta M_{stars} = \alpha\, \delta M_{SF} ,$$
where $\alpha$ is the fraction of the mass participating in star formation that remains in long-lived stars and stellar remnants. With outflow we have
$$\delta M_{metals} = -Z\, \delta M_{SF} + Z (1 - \alpha)\, \delta M_{SF} + p\, \delta M_{stars} - c\, Z\, \delta M_{stars} = p\, \delta M_{stars} - Z\, \delta M_{stars} - c\, Z\, \delta M_{stars} .$$
Substituting in the expression for $\delta Z$ gives the required result
$$\delta Z = \frac{p\, \delta M_{stars}}{M_{total}(0) - (1 + c)\, M_{stars}} .$$
But this is just the closed-box result with $p$ replaced by $p/(1 + c)$, and $M_{total} = M_{gas}(0)$ replaced by $M_{total}(0)/(1 + c)$. So the leaky-box model just changes the effective value of $p$ and doesn't change the distribution of stellar metallicities.
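The substitution step can be written out explicitly. The following is a sketch using only the relations quoted above; the intermediate relation for $\delta M_{gas}$ is obtained here by differencing the expression for $M_{gas}(t)$ and is not stated separately in the original answer. The symbol $\delta$ denotes a small change, as in the Simple Model.

```latex
\begin{align*}
\delta M_{\rm gas} &= -(1+c)\,\delta M_{\rm stars}, \\
\delta Z &= \frac{\delta M_{\rm metals}}{M_{\rm gas}} - Z\,\frac{\delta M_{\rm gas}}{M_{\rm gas}}
  = \frac{\left[\,p - (1+c)Z\,\right]\delta M_{\rm stars} + (1+c)Z\,\delta M_{\rm stars}}{M_{\rm gas}}
  = \frac{p\,\delta M_{\rm stars}}{M_{\rm total}(0) - (1+c)\,M_{\rm stars}}, \\
\delta Z &= \frac{\left[\,p/(1+c)\,\right]\delta M_{\rm stars}}{M_{\rm total}(0)/(1+c) - M_{\rm stars}} .
\end{align*}
```

The last line is obtained by dividing numerator and denominator by $(1+c)$, which makes the correspondence with the closed-box form explicit: only the effective yield and the effective initial mass are rescaled.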
Problem 11: Answer
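The angular size corresponding to the Einstein radius is (from the course notes)
$$\theta_E = \sqrt{ \frac{4GM}{c^2}\, \frac{D_{LS}}{D_L D_S} } ,$$
where $M$ is the mass of the lensing object, $D_L$ is the distance to the lensing object, $D_S$ is the distance to the source, $D_{LS}$ is the distance between the lens and source, $G$ is the constant of gravitation and $c$ is the speed of light.
For $D_S \gg D_L$, we have $D_{LS} \simeq D_S$. Then
$$\theta_E = \frac{2\sqrt{G}}{c} \sqrt{\frac{M}{D_L}}\ \mathrm{rad} = \frac{2\sqrt{6.673 \times 10^{-11}}}{2.998 \times 10^{8}} \sqrt{\frac{M}{D_L}}\ \mathrm{rad\ kg^{-1/2}\, m^{1/2}} = 5.45 \times 10^{-14} \sqrt{\frac{M}{D_L}}\ \mathrm{rad\ kg^{-1/2}\, m^{1/2}}$$
$$= 5.45 \times 10^{-14} \sqrt{\frac{M}{D_L}} \left( \frac{180 \times 60 \times 60}{\pi} \right) \left( \frac{1.989 \times 10^{30}}{3.086 \times 10^{16}} \right)^{1/2} \mathrm{arcsec}\ M_\odot^{-1/2}\, \mathrm{pc}^{1/2} = 0.090 \sqrt{\frac{M}{D_L}}\ \mathrm{arcsec}\ M_\odot^{-1/2}\, \mathrm{pc}^{1/2} .$$
So the constant is $k = 0.090$ arcsec $M_\odot^{-1/2}$ pc$^{1/2}$.
This means that a solar-mass lens at a distance of 1 pc (about the distance of the nearest star) imaging an extragalactic source will produce an Einstein angular radius of about 0.1 arcsec.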
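The unit conversion for $k$ is a natural place for a numerical cross-check. The sketch below evaluates $2\sqrt{G}/c$ in SI units and then converts to arcsec $M_\odot^{-1/2}$ pc$^{1/2}$, using the constants quoted in the answer; the variable names are illustrative.

```python
import math

G = 6.673e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m s^-1
M_SUN = 1.989e30     # kg
PC = 3.086e16        # m
RAD_TO_ARCSEC = 180.0 * 60.0 * 60.0 / math.pi

# theta_E = (2 sqrt(G) / c) sqrt(M / D_L) when D_S >> D_L.
k_si = 2.0 * math.sqrt(G) / C                          # rad kg^-1/2 m^1/2

# Express M in solar masses and D_L in parsecs, and the angle in arcsec.
k = k_si * RAD_TO_ARCSEC * math.sqrt(M_SUN / PC)       # arcsec Msun^-1/2 pc^1/2

print(f"k (SI)  = {k_si:.3e} rad kg^-1/2 m^1/2")
print(f"k       = {k:.4f} arcsec Msun^-1/2 pc^1/2")
```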
Problem 12: Question
An optical microlensing survey images a star field in the Galactic bulge close to the Galactic centre. Assuming that the dark matter halo is made from compact objects with approximately stellar masses and has a density distribution
$$\rho(r) = \frac{\rho_0\, b^2}{r^2 + b^2} ,$$
where $r$ is the radial distance from the Galactic centre, $\rho_0$ is the central dark matter density and $b$ is a constant, derive an expression for the optical depth $\tau$ of microlensing to the field in terms of $\rho_0$, $b$ and $R_0$, where $R_0$ is the distance of the Sun from the Galactic centre. You may assume that the star field is not significantly affected by dust extinction for this calculation.
Calculate $\tau$ if $R_0 = 8.0$ kpc, $b = 2.0$ kpc and $\rho_0 = 2.0 \times 10^{-20}$ kg m$^{-3}$.
Problem 13: Question
A weakly-interacting massive particle (WIMP) with a mass of $1000\, m_p$, where $m_p = 1.6726 \times 10^{-27}$ kg is the mass of the proton, lenses the light of a star in the Large Magellanic Cloud, which is situated 50 kpc from the Earth. Calculate the Einstein angular radius of the WIMP if it lies at a distance 20 kpc from the Earth. How does this figure compare with the angular radius of the star if it has the same radius, $6.96 \times 10^{8}$ m, as the Sun? Will the microlensing effect of the WIMP be noticeable? Are dark matter microlensing surveys sensitive to the lensing of stars by WIMPs?
What is the Einstein angular radius of a brown dwarf with a mass of $0.05\, M_\odot$ at the same location as the WIMP? Will the lensing effect of the brown dwarf on the background star be noticeable if there is a suitable alignment?
Problem 14: Question
The separation $l$ between the Galaxy and M31 is expected to obey the equation
$$\frac{d^2 l}{dt^2} = -\frac{GM}{l^2} ,$$
over time $t$, where $M$ is their combined mass, on the assumption that they move only under their mutual attraction. Show that, in addition to the parametric solutions involving sin and cos discussed in the lectures, the solutions (i) $l = (GM\tau_0^2)^{1/3} (\cosh\eta - 1)$, $t = \tau_0 (\sinh\eta - \eta)$, where $\eta$ is a parameter, and (ii) $l = (9GM/2)^{1/3}\, t^{2/3}$ are also solutions. What do these two cases represent physically? What observational constraints show that these solutions are inappropriate to the real Galaxy-M31 system?
Problem 12: Answer
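Use the equation
$$\tau = \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L\, D_{LS}\, \rho(D_L)\, dD_L ,$$
where $D_S$ is the distance from the observer to the source, $D_L$ is the distance from the observer to the lens, and $D_{LS}$ is the distance from the lens to the source. Considering the geometry, $D_S = R_0$, $D_{LS} = R_0 - D_L$ and $D_{LS} = r$. Therefore, $r = R_0 - D_L$ and $dD_L = -dr$. The optical depth is then
$$\tau = \frac{4\pi G}{c^2 R_0} \int_0^{R_0} (R_0 - r)\, r\, \frac{\rho_0 b^2}{r^2 + b^2}\, dr = \frac{4\pi G \rho_0 b^2}{c^2 R_0} \int_0^{R_0} \left[ \frac{R_0\, r}{r^2 + b^2} - \frac{r^2}{r^2 + b^2} \right] dr$$
$$= \frac{4\pi G \rho_0 b^2}{c^2 R_0} \left[ \frac{R_0}{2} \ln(r^2 + b^2) - r + b \tan^{-1}\left( \frac{r}{b} \right) \right]_0^{R_0}$$
using the result $\int r^2/(r^2 + b^2)\, dr = r - b \tan^{-1}(r/b)$ + constant, from the solution to problem 3. This gives,
$$\tau = \frac{2\pi G \rho_0 b^2}{c^2} \left[ \ln\left( 1 + \frac{R_0^2}{b^2} \right) - 2 + \frac{2b}{R_0} \tan^{-1}\frac{R_0}{b} \right] .$$
Substituting for $\rho_0 = 2.0 \times 10^{-20}$ kg m$^{-3}$, $b = 2.0 \times 10^{3}$ pc $= 2.0 \times 10^{3} \times 3.086 \times 10^{16}$ m, $R_0 = 8.0 \times 10^{3}$ pc $= 8.0 \times 10^{3} \times 3.086 \times 10^{16}$ m, gives $\tau = 5.3 \times 10^{-7}$.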
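Both the closed-form result and the original integral can be evaluated numerically as a consistency check. The sketch below is illustrative only: the function names are assumptions, and the midpoint-rule integration is simply one convenient way to confirm that the analytic integration has been carried out correctly.

```python
import math

G = 6.673e-11      # m^3 kg^-1 s^-2
C = 2.998e8        # m s^-1
PC = 3.086e16      # m

def optical_depth(rho0, b, R0):
    """Closed-form microlensing optical depth for rho = rho0 b^2 / (r^2 + b^2)."""
    x = R0 / b
    bracket = math.log(1.0 + x**2) - 2.0 + (2.0 / x) * math.atan(x)
    return 2.0 * math.pi * G * rho0 * b**2 / C**2 * bracket

def optical_depth_numeric(rho0, b, R0, n=100000):
    """Midpoint-rule evaluation of (4 pi G / c^2 R0) * integral of (R0 - r) r rho(r) dr."""
    dr = R0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += (R0 - r) * r * rho0 * b**2 / (r**2 + b**2)
    return 4.0 * math.pi * G / (C**2 * R0) * total * dr

rho0 = 2.0e-20          # kg m^-3
b = 2.0e3 * PC          # 2.0 kpc in metres
R0 = 8.0e3 * PC         # 8.0 kpc in metres

tau = optical_depth(rho0, b, R0)
tau_num = optical_depth_numeric(rho0, b, R0)
print(f"tau (closed form) = {tau:.3e}")
print(f"tau (numerical)   = {tau_num:.3e}")
```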
Problem 13: Answer
The expression for the Einstein angular radius is
$$\theta_E = \sqrt{ \frac{4GM}{c^2}\, \frac{D_{LS}}{D_L D_S} }\ \mathrm{rad} ,$$
where the mass of the lens $M$ is $1000\, m_p = 1.673 \times 10^{-24}$ kg, the distance between the observer and lens is $D_L = 20$ kpc $= 6.17 \times 10^{20}$ m, the distance between the observer and source is $D_S = 50$ kpc $= 1.54 \times 10^{21}$ m, and the distance between the lens and source is $D_{LS} = 30$ kpc $= 9.26 \times 10^{20}$ m. For the WIMP we have $\theta_E = 2.2 \times 10^{-36}$ rad.
The star, with a radius $6.96 \times 10^{8}$ m, lies in the Large Magellanic Cloud at the source distance of 50 kpc $= 1.54 \times 10^{21}$ m, so it subtends an angular radius of $6.96 \times 10^{8} / 1.54 \times 10^{21}$ rad $= 4.5 \times 10^{-13}$ rad. So the angular radius of the star is $2 \times 10^{23}$ times the Einstein angular radius of the WIMP. The lensing effect of the WIMP will take place on a scale that is $\sim 10^{23}$ smaller than the scale of the star image. It will be completely undetectable.
[This question assumes that WIMPs do exist. Despite dedicated experiments to search for them, little firm evidence has been found that they exist in any significant numbers.]
A $M = 0.05\, M_\odot$ brown dwarf will have $\theta_E = 5.4 \times 10^{-10}$ rad $= 1.1 \times 10^{-4}$ arcsec. The Einstein angular radius of the brown dwarf is about 1200 times larger than the angular radius of the background star. Lensing effects will be significant if there is a suitable alignment.
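The two Einstein angular radii can be evaluated directly from the formula at the start of this answer. A minimal sketch, assuming simple Euclidean distances so that $D_{LS} = D_S - D_L$ (as in the answer, where $D_{LS} = 30$ kpc); the function name and constants block are illustrative.

```python
import math

G = 6.673e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m s^-1
KPC = 3.0857e19      # m
M_SUN = 1.989e30     # kg
M_P = 1.6726e-27     # proton mass in kg
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi

def einstein_radius(mass, d_lens, d_source):
    """Einstein angular radius in radians, assuming D_LS = D_S - D_L."""
    d_ls = d_source - d_lens
    return math.sqrt(4.0 * G * mass / C**2 * d_ls / (d_lens * d_source))

D_L = 20.0 * KPC     # lens distance
D_S = 50.0 * KPC     # distance of the Large Magellanic Cloud

theta_wimp = einstein_radius(1000.0 * M_P, D_L, D_S)
theta_bd = einstein_radius(0.05 * M_SUN, D_L, D_S)

print(f"WIMP:        theta_E = {theta_wimp:.2e} rad")
print(f"brown dwarf: theta_E = {theta_bd:.2e} rad "
      f"= {theta_bd * RAD_TO_ARCSEC:.2e} arcsec")
```

Because $\theta_E \propto \sqrt{M}$, the enormous mass ratio between a brown dwarf and a WIMP translates into a ratio of Einstein radii of order $10^{26}$.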
Problem 14: Answer
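The formulae can be shown to be solutions by differentiation and appropriate substitution into the differential equation.
To prove (i) is a solution, differentiating $l = (GM\tau_0^2)^{1/3} (\cosh\eta - 1)$ and $t = \tau_0 (\sinh\eta - \eta)$, we get
$$\frac{dl}{d\eta} = (GM\tau_0^2)^{1/3} \sinh\eta \qquad \text{and} \qquad \frac{dt}{d\eta} = \tau_0 (\cosh\eta - 1) .$$
Therefore,
$$\frac{dl}{dt} = \frac{dl}{d\eta} \left( \frac{dt}{d\eta} \right)^{-1} = \frac{(GM\tau_0^2)^{1/3} \sinh\eta}{\tau_0 (\cosh\eta - 1)} .$$
Differentiating again,
$$\frac{d}{d\eta} \left( \frac{dl}{dt} \right) = \frac{(GM\tau_0^2)^{1/3} \cosh\eta \; \tau_0 (\cosh\eta - 1) - (GM\tau_0^2)^{1/3} \sinh\eta \; \tau_0 \sinh\eta}{\tau_0^2 (\cosh\eta - 1)^2} = -\frac{(GM\tau_0^2)^{1/3}}{\tau_0 (\cosh\eta - 1)}$$
using $\cosh^2 x - \sinh^2 x \equiv 1$.
Therefore,
$$\frac{d^2 l}{dt^2} = \frac{d}{d\eta} \left( \frac{dl}{dt} \right) \cdot \left( \frac{dt}{d\eta} \right)^{-1} = -\frac{(GM\tau_0^2)^{1/3}}{\tau_0 (\cosh\eta - 1)}\, \frac{1}{\tau_0 (\cosh\eta - 1)} = -\frac{(GM\tau_0^2)^{1/3}}{\tau_0^2 (\cosh\eta - 1)^2} .$$
Therefore,
$$\frac{d^2 l}{dt^2} + \frac{GM}{l^2} = -\frac{(GM\tau_0^2)^{1/3}}{\tau_0^2 (\cosh\eta - 1)^2} + \frac{GM}{\left[ (GM\tau_0^2)^{1/3} (\cosh\eta - 1) \right]^2} = -\frac{(GM)^{1/3}}{\tau_0^{4/3} (\cosh\eta - 1)^2} + \frac{(GM)^{1/3}}{\tau_0^{4/3} (\cosh\eta - 1)^2} = 0 .$$
So, $\dfrac{d^2 l}{dt^2} = -\dfrac{GM}{l^2}$, the original equation of motion.
Therefore the parametric equations are solutions to the equation of motion.
To prove (ii) is a solution, differentiating $l = (9GM/2)^{1/3}\, t^{2/3}$ we get,
$$\frac{dl}{dt} = \frac{2}{3} \left( \frac{9GM}{2} \right)^{1/3} \frac{1}{t^{1/3}} \qquad \text{and} \qquad \frac{d^2 l}{dt^2} = -\frac{2}{9} \left( \frac{9GM}{2} \right)^{1/3} \frac{1}{t^{4/3}} .$$
Therefore,
$$\frac{d^2 l}{dt^2} + \frac{GM}{l^2} = -\frac{2}{9} \left( \frac{9GM}{2} \right)^{1/3} \frac{1}{t^{4/3}} + \frac{GM}{(9GM/2)^{2/3}\, t^{4/3}} = -\frac{2}{9} \left( \frac{9GM}{2} \right)^{1/3} \frac{1}{t^{4/3}} + \frac{2}{9} \left( \frac{9GM}{2} \right)^{1/3} \frac{1}{t^{4/3}} = 0 .$$
So, $\dfrac{d^2 l}{dt^2} = -\dfrac{GM}{l^2}$, the original equation of motion.
Therefore $l = (9GM/2)^{1/3}\, t^{2/3}$ is a solution to the equation of motion.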
Case (i) represents the situation where the mass of the Galaxy-M31 system is insufficient to reverse their initial movement apart. They continue to move apart at all times, with $dl/dt > 0$ even in the limit $t \to \infty$. The separation $l$ continues to increase with time at the present day: it predicts that M31 will still be moving away from the Galaxy today, which is inconsistent with radial velocity measurements.
Case (ii) represents the situation where the mass of the Galaxy-M31 system is just insufficient to reverse their initial movement apart. The separation $l$ continues to increase with time to the present day, but $dl/dt \to 0$ in the limit $t \to \infty$. It predicts that M31 will still be moving away from the Galaxy today, which is inconsistent with radial velocity measurements.
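The algebraic verification of the two solutions can be complemented by a numerical one. The sketch below is illustrative only: it works in units where $GM = 1$ and the time constant of solution (i) is also 1 (both assumptions made for convenience), estimates $d^2l/dt^2$ by central finite differences, and checks that the residual $d^2l/dt^2 + GM/l^2$ is close to zero. The function names and the sample values of the parameter and of $t$ are arbitrary choices.

```python
import math

GM = 1.0      # units with GM = 1 (assumption for illustration)
TAU0 = 1.0    # time constant of solution (i), also set to 1

# Solution (i): parametric form in terms of the parameter eta.
def l_i(eta):
    return (GM * TAU0**2) ** (1.0 / 3.0) * (math.cosh(eta) - 1.0)

def t_i(eta):
    return TAU0 * (math.sinh(eta) - eta)

def velocity_i(eta, h=1.0e-3):
    """dl/dt from central differences of l and t with respect to eta."""
    return (l_i(eta + h) - l_i(eta - h)) / (t_i(eta + h) - t_i(eta - h))

def accel_i(eta, h=1.0e-3):
    """d^2l/dt^2 from central differences of dl/dt and t with respect to eta."""
    return (velocity_i(eta + h) - velocity_i(eta - h)) / (t_i(eta + h) - t_i(eta - h))

# Solution (ii): explicit power law in t.
def l_ii(t):
    return (4.5 * GM) ** (1.0 / 3.0) * t ** (2.0 / 3.0)

def accel_ii(t, h=1.0e-3):
    """d^2l/dt^2 from the standard three-point central difference."""
    return (l_ii(t + h) - 2.0 * l_ii(t) + l_ii(t - h)) / h**2

eta, t = 1.5, 2.0
res_i = accel_i(eta) + GM / l_i(eta) ** 2
res_ii = accel_ii(t) + GM / l_ii(t) ** 2
print(f"residual for solution (i)  at eta = {eta}: {res_i:.2e}")
print(f"residual for solution (ii) at t = {t}:   {res_ii:.2e}")
```

The residuals are not exactly zero because of the finite-difference truncation error, but they should be many orders of magnitude smaller than the individual terms.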
