
Gravitation, Gauge Theories and the Early Universe

Fundamental Theories of Physics


An International Book Series on The Fundamental Theories of
Physics: Their Clarification, Development and Application

Editor: ALWYN VAN DER MERWE


University of Denver, U.S.A.

Editorial Advisory Board:


ASIM BARUT, University of Colorado, U.S.A.
HERMANN BONDI, University of Cambridge, U.K.
BRIAN D. JOSEPHSON, University of Cambridge, U.K.
CLIVE KILMISTER, University of London, U.K.
GÜNTER LUDWIG, Philipps-Universität, Marburg, F.R.G.
NATHAN ROSEN, Israel Institute of Technology, Israel
MENDEL SACHS, State University of New York at Buffalo, U.S.A.
ABDUS SALAM, International Centre for Theoretical Physics, Trieste, Italy
HANS-JÜRGEN TREDER, Zentralinstitut für Astrophysik der Akademie der
Wissenschaften, G.D.R.

Gravitation,
Gauge Theories and
the Early Universe
Edited by

B. R. Iyer,

N. Mukunda
and
C. V. Vishveshwara

KLUWER ACADEMIC PUBLISHERS


DORDRECHT / BOSTON / LONDON

Library of Congress Cataloging in Publication Data


Gravitation, gauge theories and the early universe.
(Fundamental theories of physics)
1. Physics. 2. Astrophysics. 3. Cosmology.
4. Gauge fields (Physics) I. Iyer, B. R. II. Mukunda, N.
III. Vishveshwara, C. V. IV. Series.

QC71.G64

19'3:;:

530'.1

88-775

ISBN-13: 978-94-010-7664-7
e-ISBN-13: 978-94-009-2577-9
DOI: 10.1007/978-94-009-2577-9

Published by Kluwer Academic Publishers,


P.O. Box 17, 3300 AA Dordrecht, The Netherlands.
Kluwer Academic Publishers incorporates
the publishing programmes of
D. Reidel, Martinus Nijhoff, Dr W. Junk and MTP Press.
Sold and distributed in the U.S.A. and Canada
by Kluwer Academic Publishers,
101 Philip Drive, Norwell, MA 02061, U.S.A.
In all other countries, sold and distributed
by Kluwer Academic Publishers Group,
P.O. Box 322, 3300 AH Dordrecht, The Netherlands.

Printed on acid-free paper

All Rights Reserved


© 1989 by Kluwer Academic Publishers
Softcover reprint of the hardcover 1st edition 1989
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner

Table of Contents

Preface

PART I: GRAVITATION AND COSMOLOGY

1. P. C. VAIDYA / Introduction to General Relativity
   1. From Special Theory to General Theory
   2. Einstein's Thought Experiment
   3. Geometry or Geometries?
   4. Riemannian Geometry and Geodesics
   5. Geometry and Gravitation
   6. The Line Element
   7. Summation Convention
   8. Vectors and Tensors
   9. Quotient Law
   10. The Fundamental Tensor
   11. Raising and Lowering the Suffixes (Indices)
   12. Length of a Vector
   13. Addition of Vectors at a Point
   14. Covariant Derivative of a Contravariant Vector
   15. Covariant Derivative of a Covariant Vector
   16. The Christoffel Symbols
   17. Geodesics
   18. The Curvature Tensor
   19. Natural Coordinates at a Point
   20. Symmetry Properties of the Curvature Tensor
   21. Bianchi Identities and the Ricci Tensor
   22. The Einstein Tensor and the Field Equations of Gravitation
   23. Matter Tensor for a Perfect Fluid
   24. Exercises

2. C. V. VISHVESHWARA / Introduction to Black Holes
   1. Preamble
   2. The Schwarzschild Black Hole
   3. Properties of the Schwarzschild Black Hole
   4. The Kerr Black Hole
   5. The Black Hole and the Ergosphere
   6. The Penrose Process
   7. Charged Black Holes
   8. Conclusion
   References

3. B. R. IYER / Black-Hole Thermodynamics and Hawking Radiation

4. C. V. VISHVESHWARA / Introduction to Relativistic Cosmology
   1. Preamble
   2. The Cosmic Spacetime
   3. Cosmological Models
   4. Dust Models
   5. Radiation Models
   6. Models with Nonzero Cosmological Constants
   7. Observational Contacts
   8. Conclusion
   References

5. J. V. NARLIKAR / Relics of the Big Bang
   1. The Early Universe
   2. Thermodynamics of the Early Universe
   3. Primordial Neutrinos
   4. The Neutron/Proton Ratio
   5. The Synthesis of Helium and Other Nuclei
   6. The Microwave Background
   7. Anisotropies of the Microwave Background
   8. Cosmology and Particle Physics
   9. Survival of Massive Particles
   10. Problems of the Very Early Universe
   References

6. A. K. RAYCHAUDHURI / An Approach to Anisotropic Cosmologies
   1. Motivation
   2. Killing Vectors and Bianchi Types
   3. Kinematics - Analysis of the Velocity Field
   4. Perfect Fluid Solutions Classified According to Kinematic Properties
   5. Some Anisotropic Cosmological Solutions
   6. Problems

7. P. S. JOSHI / Topics in Spacetime Structure
   1. Introduction
   2. The Manifold Model
   3. Spacetime Diffeomorphisms
   4. Killing Vector Fields
   5. Boundary Attachment and Conformal Compactification for Spacetimes
   References

8. A. R. PRASANNA / Differential Forms and Einstein-Cartan Theory
   1. Basic Definitions
   2. Algebra and Calculus of Forms
   3. Connection and Curvature Forms
   4. Einstein-Cartan Theory - The Gauge Theory of Gravity
   5. Gravitation in the Presence of Fermionic Matter
   References

PART II: INTRODUCTION TO PARTICLE PHYSICS AND GAUGE FIELD THEORIES

9. R. P. SAXENA / Introduction to Classical and Quantum Lagrangian Field Theory
   1. Classical Lagrangian Field Theory
   2. Canonical Quantization
   3. Discrete Symmetries
   4. Interacting Fields
   5. Invariant Perturbation Theory
   6. Primitive Divergences in QED
   7. QED as a Renormalizable Theory
   8. V-A as a Nonrenormalizable Theory
   9. Dimensional Regularization
   Further Reading

10. J. PASUPATHY / Introduction to Particle Physics, Symmetries and Conservation Laws
   1. Introduction
   2. Charge Independence of Nuclear Forces - Isotopic Spin
   3. Strange Particles
   4. Nucleon Number Conservation
   5. Lepton Number Conservation
   6. Discrete Symmetries
   7. γ5-Invariance and Weak Interactions
   8. Strong Interactions: Quarks and Gluons
   9. Need for Colour
   10. Gauge Invariance
   Further Reading

11. G. RAJASEKARAN / Building up the Standard Gauge Model of High-Energy Physics
   1. Introduction
   2. U(1) Gauge Theory
   3. Spontaneous Breakdown of Symmetry - Goldstone Model
   4. Higgs Model
   5. SU(2) Gauge Theory
   6. Spontaneous Breakdown of SU(2) Symmetry
   7. One More Model
   8. General Case of Non-Abelian Symmetry Breakdown
   9. SU(2) x U(1) Model
   10. 'Standard Model' before Gauge Theory
   11. Current Algebra and SU(2) x U(1) Charges of the Fermions
   12. The Electroweak Gauge Theory
   13. Consequences of the Electroweak Theory
   14. Renormalizability
   15. Spontaneous Symmetry Breaking and Phase Transitions
   16. Deep Inelastic Scattering, Asymptotic Freedom and Colour SU(3)
   17. The Renormalization Group Equation
   18. Formal Derivation of the Renormalization Group Equation
   19. Solution of the Renormalization Group Equation
   20. Hydrodynamic Analogy
   21. Fixed Points and Asymptotic Freedom
   22. Asymptotic Freedom of QCD
   23. Infrared Problem and Colour Confinement
   24. Tests of QCD
   25. The Standard Model of High Energy Physics
   26. Beyond the Standard Model
   References

12. K. C. WALI / Introduction to Grand Unification Theories
   1. Grand Unification - A Survey of Basic Ideas
   2. Grand Unified Theory Based on G = SU(5)
   3. Spontaneous Symmetry Breaking
   4. Predictions of Minimal SU(5)
   5. Baryon Asymmetry
   6. Phase Transitions in the Early Universe
   Bibliography

13. B. R. SITARAM / Topology and Homotopy
   1. What is Topology?
   2. Why the Recent Interest in Topology?
   3. Homotopy Theory
   4. Chern Classes
   References

14. N. MUKUNDA / Introduction to Compact Simple Lie Groups

PART III: QUANTUM EFFECTS IN THE EARLY UNIVERSE AND APPROACHES TO THE UNIFICATION OF FUNDAMENTAL FORCES

15. B. R. IYER / Quantum Field Theory in Curved Spacetime: Canonical Quantization
   1. Quantum Field Theory in Curved Spacetime
   2. Canonical Quantization of the Scalar Field in CST
   3. The Conformal Vacuum
   4. A Toy Model with Particle Creation
   5. The Adiabatic Vacuum
   References

16. D. LOHIYA / Zeta Function Regularisation and Effective Action in Curved Spacetime
   1. The Riemann Zeta Function
   2. Applications
   3. Path Integral Formulation for QFT in CST
   4. Conformal Anomalies
   5. Phase Transition in a De Sitter Universe
   References

17. N. PANCHAPAKESAN / Inflationary Cosmology and Quantum Effects in the Early Universe
   1. Quantum Field Theory in Curved Spacetime: A Short History
   2. Problems in Standard Cosmology
   3. Inflation
   4. Free Lunch
   5. The 'New' Model
   6. Evolution of the Scalar Field
   7. Linde's Chaotic Inflation
   8. Hawking's Limits on Inflationary Models
   9. Quantum Effects in the Early Universe
   10. The Fundamental Problem
   11. De Witt-Schwinger Expansion of Green's Function
   12. Renormalization
   13. Other Methods
   14. Example of Back Reaction
   15. Applications
   References

18. T. PADMANABHAN / Quantum Cosmology - The Story So Far
   1. Introduction
   2. Minisuperspace of Conformal Degree of Freedom
   3. Quantized FRW Universes
   4. Applications of Quantum Gravity
   5. Critique, Comparison and Open Questions
   Appendix 1: Schrödinger Approach to Field Theory
   Appendix 2: The Wheeler-De Witt Equation
   Notes and References

19. A. MAHESHWARI / The Photon, the Graviton and the Gravitino
   1. The Photon
   2. The Graviton
   3. The Gravitino
   4. The Rarita-Schwinger Lagrangian

20. A. MAHESHWARI / The Vierbein, Vielbeins and Spinors in Higher Dimensions
   1. The Vierbein
   2. Vielbeins
   3. Spinors in d-dimensions

21. A. MAHESHWARI / Kaluza-Klein Theories
   1. Kaluza-Klein Theories
   2. Spontaneous Compactification and Isometry Groups
   3. Harmonic Expansions, Chiral Fermions and All That
   References

22. J. SAMUEL / Kaluza-Klein Cosmology
   1. Introduction
   2. Five-Dimensional Kaluza-Klein Theory
   3. Remarks
   4. Dimensional Reduction
   5. Cosmology
   References

23. N. MUKUNDA / An Elementary Introduction to the Gauge Theory Approach to Gravity
   1. Introduction
   2. The Yang-Mills Construction
   3. Gauging a Special Relativistic Matter Lagrangian
   4. Kinematics of the Gravitational Variables
   5. The Gravitational Action
   6. Translational Gauge Potentials
   References

24. B. R. SITARAM / Graded Lie Algebras
   1. Introduction
   2. Examples of Graded Lie Algebras (GLAs)
   3. Maps of GLAs
   4. Classification of GLAs
   References

25. R. K. KAUL / Supersymmetry and Supergravity
   1. Introduction
   2. Coleman-Mandula Theorem and Supersymmetry Algebra
   3. Representation of the Supersymmetry Algebra on One-Particle States
   4. Representations of the Supersymmetry Algebra on Fields and Invariant Lagrangians
   5. Spontaneous Breakdown of Supersymmetry
   6. Pure N = 1 Supergravity in Four Dimensions
   7. N = 1, D = 11 Supergravity
   8. N = 1, D = 10 Supergravity
   9. Concluding Remarks
   References

26. H. S. SHARATCHANDRA / An Overview of Superstring Theory
   1. Introduction
   2. Duality
   3. The Veneziano Formula
   4. Free Relativistic String
   5. Orthonormal Gauge
   6. Quantization
   7. Light Cone Quantization
   8. Hamiltonian Formalism
   9. Quantization
   10. Lorentz Covariance
   11. Spectrum
   12. Closed Strings
   13. Interacting Strings
   14. Field Theory Limit
   15. Superstrings
   16. Problems and Prospects
   References

Index

Preface
This book evolved out of some one hundred lectures given by twenty experts at
a special instructional conference sponsored by the University Grants Commission, India. It is pedagogical in style and self-contained in several interrelated
areas of physics which have become extremely important in present-day
theoretical research. The articles begin with an introduction to general relativity
and cosmology as well as particle physics and quantum field theory. This is
followed by reviews of the standard gauge models of high-energy physics,
renormalization group and grand unified theories. The concluding parts of the
book comprise discussions in current research topics such as problems of the
early universe, quantum cosmology and the new directions towards a unification
of gravitation with other forces. In addition, special concise treatments of
mathematical topics of direct relevance are also included.
The content of the book was carefully worked out for the mutual education of
students and research workers in general relativity and particle physics. This
ambitious programme consequently necessitated the involvement of a number of
different authors. However, care has been taken to ensure that the material
meshes into a unified, cogent and readable book. We hope that the book will
serve to initiate and guide a student in these different areas of investigation
starting from first principles and leading to the exciting current research
problems of an interdisciplinary nature in the context of the origin and structure
of the universe.
We are grateful to all the authors, who in spite of their busy professional
commitments, gave us their precious time and enthusiastic support. It is
a pleasure to thank Ms Moksha Halesh, Mr Ramachandra Rao and Mr Raju
Varghese for patient and valuable help in preparing the manuscript for
publication. And finally we acknowledge the cooperation and encouragement of
our publishers Kluwer Academic Publishers, The Netherlands in bringing out
this volume.


Part I:
Gravitation and Cosmology
The first part of this volume, dealing essentially with classical general relativity
and cosmology, consists of discussions at several different levels. It begins with an
elementary, but adequate, presentation of the basic tenets and mathematical tools
of general relativity. Glimpses of some interesting developments in black-hole
physics and cosmology are presented with suitable introductory articles preceding these discussions. Finally, some of the advanced mathematical techniques
that have become indispensable to current research are also introduced.
The chapter by P. C. Vaidya (Chapter 1) takes the special theory of relativity as
its starting point and demonstrates how the phenomenon of gravitation naturally
leads to the Riemannian geometry of curved spacetime. The basic mathematical
techniques of tensor analysis and differential geometry are developed ending with
Einstein's field equation. This article forms the foundation for all others involving
classical general relativity.
Black-hole physics has been one of the most important developments in
general relativity during the past two decades or so. An elementary introduction
to the geometrical structure of black holes is provided by the first of the two
chapters by C. V. Vishveshwara (Chapter 2). The characteristic properties of the
nonrotating and rotating black holes are pointed out, compared and contrasted
using simple mathematics. Some of the important results that have emerged in
black-hole physics are also briefly described.
The above chapter serves as a preamble for the next one by B. R. Iyer (Chapter
3). One of the most fascinating features of black holes is the Hawking radiation
and the consequent quantum evaporation of black holes. This phenomenon is
discussed, first considering black hole thermodynamics. Ideas such as the
reversible and irreversible processes, the thermodynamic quantities associated
with the black hole - especially the notion of its temperature - and, finally, the
Hawking radiation exhibiting a Planckian distribution corresponding to this
temperature, are the main points focussed on in this chapter.
The second chapter by C. V. Vishveshwara (Chapter 4) is again a short
introduction to the relativistic cosmological models. The fundamental observational facts of isotropy and homogeneity leading to the simple Robertson-Walker
geometries are explained. The different Friedmann models and their evolution
are considered. Finally, the observational contacts of these different models are
discussed. This article is the prelude to the two chapters on cosmology to follow.
The possible creation of the universe as a whole with the Big Bang has excited,
intrigued and tantalized all cosmologists. In Chapter 5, J. V. Narlikar considers in
sufficient detail the formation and evolution of the relics of the Big Bang. After
considering the thermodynamics of the early universe, Narlikar goes into various

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 1-2.
© 1989 by Kluwer Academic Publishers.

questions related to these relics such as the synthesis of helium and the
characteristic features of the microwave background. The interplay between
particle physics and cosmology, which has become increasingly intense in recent
years, is analyzed. Some problems related to the very early universe, including
galaxy formation, are also touched upon.
Although the universe, as we see it today, appears to be isotropic to an
extraordinary degree, it is not inconceivable - or rather it should be expected
- that the universe was once anisotropic. A. K. Raychaudhuri's Chapter 6 starts
by setting out the motivation for the study of anisotropic cosmological models. It
then offers the mathematical basis for the study of such models as well as the
description of some of the exact solutions of this genre. Killing vectors, which spell
out spacetime symmetries, are defined, and the Bianchi classification of spacetimes based on the structure of the Killing vectors is described. After considering
the kinematics of matter flow, some of the known solutions are presented and
their properties described.
Global techniques have found an important place in the study of spacetime
structure. In Chapter 7, P. S. Joshi elucidates some of the mathematical concepts
underlying these techniques. After introducing the idea of a differentiable
manifold, diffeomorphisms of spacetime, Lie derivatives and Killing symmetries
are introduced. The chapter ends with the treatment of the conformal compactification which facilitates the study of null boundaries of spacetime.
Differential forms have proved to be of great efficacy in computations and
analyses within the framework of general relativity. In Chapter 8, A. R. Prasanna
develops the algebra and the calculus of differential forms. The results are then
applied to the Einstein-Cartan theory, which includes spin as a source term in
addition to the usual energy-momentum of the distribution. The relation of this
theory to a possible gauge theory of gravity is also examined.
To sum up, this part of the book centres around classical general relativity and
cosmology. It includes foundations for more advanced topics as well as glimpses
of some problems of current interest. It should profitably serve as a take-off point
for the different directions in general relativity.

1. Introduction to General Relativity


P. C. VAIDYA
Department of Mathematics, Gujarat University, Ahmedabad, 380009, India

1. From Special Theory to General Theory

We shall assume familiarity with the special theory of relativity (SR). Two inertial
observers, i.e., two observers who move uniformly in a straight line relative to
each other, describe nature in identical terms. Certainly, aesthetics demands that
if nature would not show preference for one or the other of two observers in
uniform rectilinear motion relative to each other, then it should also not show
any preference between two observers with any type of relative motion. This
implies that we must search for a more general principle of relativity demanding
invariance, not merely under Lorentz transformations, but under more general
transformations arising out of nonuniform relative motion of two observers. This
was one motivation for going over from SR to the general theory of relativity
(GR).
The other motivation was more of a practical nature. SR could encompass
almost the whole of physics but could not include gravitation. If one tries to make
the Newtonian law of gravitation invariant under Lorentz transformations, one
reaches the conclusion that inertial mass is not equal to gravitational mass, which
goes against the conclusion of the Eötvös experiment. One should not be
surprised, however, at such a negative consequence because (1) SR assumes the
existence of inertial observers and (2) in the presence of gravitation, no two
observers can continue to move uniformly in a straight line. In the presence of
gravitational fields, the nearest we can get to the concept of an inertial observer is
that of a freely falling observer who has an accelerated motion relative to a distant
inertial observer. This, then, is another motivation. In order to accommodate
gravitation in the scheme of relativity, one has to generalize SR and evolve
a scheme under which the laws would be invariant, as described by two observers
with any general relative motion.

2. Einstein's Thought Experiment

The gravitational field from which one cannot run away in our earth-bound
laboratories is the gravitational field of the Earth. In order to get an insight into
the possible contents of the general principle of relativity, Einstein devised
a thought experiment involving a free-falling observer in the Earth's gravitational
field. Imagine an observer O1 travelling down in a lift from the top floor of a tall

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 3-29.
© 1989 by Kluwer Academic Publishers.

skyscraper, while the cable supporting the lift has snapped so that he is in
a free-falling lift. In this free-falling lift, O1 conducts an experiment. He takes out
a coin from his pocket, brings it up to the level of his eyes and releases it there. What
will he observe? He will observe that the coin remains poised in the air at the point
where he released it. To him, it does not appear to fall. For comparison, consider
another observer O2 on the ground. He finds that the lift is falling freely and with
it observer O1 and the coin are also falling freely with acceleration g (of
Newtonian theory). Observations of O2 are in accordance with Newton's law of
gravitation but the observations of O1 contradict that law. One conclusion that
one can immediately draw from this thought experiment is that if two observers
have relative accelerated motion (e.g. O1 and O2 here) they will not subscribe to
Newton's law of gravitation in identical terms. The introduction of a general
principle of relativity would imply a new look at the law of gravitation.
To see what the 'new look' would be like, let us proceed with the thought
experiment. Suppose that O1 now gives a slight push to the coin in a direction
parallel to the floor of the lift. What would be the result? Because of the push, the
coin will start moving in the direction of the push and O1 will find that the coin
moves in a straight line parallel to the floor until it reaches the opposite wall of the
lift. For O1 the coin moves uniformly in a straight line.
Let us now see how O2 will describe the same situation. For O2, the coin was
falling under gravity and when it was given a horizontal push, its path became
a parabola, like the path of a Newtonian projectile. Again, the two descriptions
(those of O1 and O2) differ but now the difference is geometrical. Observer O1
describes the path of the coin as a straight line, while O2 describes the same path
as a parabola. This thought experiment suggests that one must look into the
paths of particles in order to accommodate gravitation with a general principle of
relativity.
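
The two descriptions of the coin's path can be compared numerically. The following is only a minimal Newtonian sketch: the value of g, the starting height of the lift, the eye level and the push speed are illustrative assumptions, and the function names are hypothetical. It tabulates the coin's position as seen from the ground (O2) and relative to the falling lift (O1).

```python
# Newtonian sketch of the lift thought experiment (all numbers are assumed,
# illustrative values; they do not come from the text).
G_ACC = 9.81  # assumed magnitude of g in m/s^2


def ground_frame_position(t, x0, y0, vx, vy):
    """Position, as seen by the ground observer O2, of a body in free fall."""
    return (x0 + vx * t, y0 + vy * t - 0.5 * G_ACC * t * t)


def coin_in_lift_frame(t, push_speed, lift_height=100.0, eye_level=1.5):
    """Position of the coin relative to the freely falling lift (observer O1)."""
    lift = ground_frame_position(t, 0.0, lift_height, 0.0, 0.0)
    coin = ground_frame_position(t, 0.0, lift_height + eye_level, push_speed, 0.0)
    return (coin[0] - lift[0], coin[1] - lift[1])


times = [0.0, 0.5, 1.0, 1.5]
# O2 sees a parabola: the height drops quadratically in time.
ground_path = [ground_frame_position(t, 0.0, 101.5, 2.0, 0.0) for t in times]
# O1 sees uniform motion in a straight line at constant height.
lift_path = [coin_in_lift_frame(t, 2.0) for t in times]

for t, ground, lift in zip(times, ground_path, lift_path):
    print(t, ground, lift)
```

In the lift frame the gravitational term cancels between the coin and the lift, which is why the relative height stays fixed while the horizontal coordinate grows linearly with time.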
We shall now turn our attention to geometry but will return to the thought
experiment after our brief excursion through geometry!

3. Geometry or Geometries?

The geometry that we have studied in schools is known by the name of Euclidean
geometry after the great mathematician Euclid (3rd century B.C.) who collected
the known geometrical knowledge of his day and arranged it in a logical sequence
of axioms and theorems. His axioms were like self-obvious truths. 'One and only
one straight line passes through two given points' and 'all right angles are equal'
are examples of his self-obvious axioms. But then he introduced one
axiom which could not be classified as self-obvious. This has come to be known as
the parallel postulate. It states that given a straight line and a point outside it, one
and only one straight line can be drawn parallel to the given line and pass through
the given point. It is clear that this is not at all 'self-obvious' and Euclid himself hesitated a great deal before accepting it as an unproved assumption. We
may note that several well-known theorems of our school geometry are based on
the validity of this axiom, e.g. the theorem about the sum of the three angles of
a triangle being two right angles.
The hesitation which Euclid experienced in accepting the parallel postulate as
an unproved assumption troubled later mathematicians for more than 1500
years. Geometer after geometer attempted to prove this postulate on the basis of
the other axioms of Euclid but with no success. Ultimately, in 1829, the Russian
mathematician Lobachevsky first conceived the idea that it may be
possible to prove that the parallel postulate cannot be proved! And he succeeded
in doing so. He replaced Euclid's postulate by the following: given a straight line
and an outside point, two straight lines can be drawn parallel to the given line and
pass through the given point. He did not find any logical flaw following from this
assumption and, thus, developed a perfectly logical geometry known as
Lobachevskian geometry, where the sum of three angles of a triangle is always
less than two right angles.
About 25 years later, Riemann developed another geometry. He changed
Euclid's postulates about straight lines and replaced the parallel postulate by the
following. Given a straight line and a point outside it, no straight line can be
drawn parallel to the given straight line and pass through the given point. We
could construct a logically consistent geometry in which the concept of parallel
straight lines is absent. This geometry is known as Riemannian geometry. In this
geometry, the sum of three angles of a triangle is always greater than two right
angles.
Let us mention some interesting features of this geometry which would enable
us to interpret the results of Einstein's thought experiment.

4. Riemannian Geometry and Geodesics

We, who live on the surface of a spherical Earth, should really be more familiar
with Riemannian geometry. It is easy to see that the geometry on the surface of
a sphere should be Riemannian. We have only to realize that there will be no
parallel straight lines on the surface of the sphere. But then, what is a straight line
on a sphere? Our intuitive notion of a straight line is limited to straight lines on
a plane. For a curved surface, a straight line joining any two points on the surface
is defined as a curve on the surface along which the distance between the two
given points is the least. This curve is given a special name, a geodesic. What are
geodesics on a sphere? The calculus of variations can be used to find the shortest
distance between two points on a sphere. We are going to do this later on in
a more general case. It will be found that geodesics on a sphere are great circles.
And any two great circles on a sphere always intersect. If we take a great circle on
a sphere and try to draw a parallel circle on the sphere, we find that it becomes
a small circle and so will not be a geodesic. Hence, no two geodesics on a sphere
can be parallel. The geometry on a sphere is Riemannian.
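
The statement that a circle of latitude is not a geodesic can be illustrated with a short numerical comparison. This is only a sketch on the unit sphere: the helper names are hypothetical and the sample points are arbitrary choices. It compares the arc of the circle through the centre of the sphere joining two points with the arc of the small circle of constant colatitude joining the same points.

```python
import math


def great_circle_distance(theta1, phi1, theta2, phi2):
    """Geodesic distance on the unit sphere between two (colatitude, longitude) points."""
    cos_angle = (math.cos(theta1) * math.cos(theta2)
                 + math.sin(theta1) * math.sin(theta2) * math.cos(phi2 - phi1))
    # Clamp against rounding so that acos always receives a value in [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def parallel_arc_length(theta, phi1, phi2):
    """Length of the arc along the small circle of constant colatitude theta."""
    return math.sin(theta) * abs(phi2 - phi1)


# Two points at colatitude 30 degrees whose longitudes differ by 90 degrees
# (an arbitrary illustrative choice).
theta = math.radians(30.0)
phi_a, phi_b = 0.0, math.radians(90.0)

along_great_circle = great_circle_distance(theta, phi_a, theta, phi_b)
along_parallel = parallel_arc_length(theta, phi_a, phi_b)
print(along_great_circle, along_parallel)
```

On the equator (colatitude 90 degrees) the two lengths coincide, because the equator is itself a circle through the centre of the sphere; at any other colatitude the path along the parallel is the longer one.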

Although geometry on the Earth's surface is Riemannian, we use Euclidean
geometry for the normal planning of roads, buildings or playgrounds. This is so
because in the region under consideration, the curvature of the earth can be
neglected and it could be regarded as a plane. It is a general property of
Riemannian geometry that if the geometry in a certain region is Riemannian, then
for a small neighbourhood of a point in that region, Riemannian geometry can be
replaced by Euclidean geometry.
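
One way to see this property at work is to watch the angle sum of a triangle on the unit sphere approach two right angles as the triangle shrinks. The sketch below is an illustration under stated assumptions: the family of triangles (one vertex at the pole, two vertices at the same colatitude a quarter turn apart) and all function names are choices made here, not taken from the text, and the vertex angles are obtained from the spherical law of cosines.

```python
import math


def to_unit_vector(theta, phi):
    """Cartesian unit vector for colatitude theta and longitude phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def arc(u, v):
    """Geodesic length between two points of the unit sphere."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


def angle_sum(p, q, r):
    """Sum of the three vertex angles of the spherical triangle p, q, r."""
    a, b, c = arc(q, r), arc(p, r), arc(p, q)  # sides opposite p, q, r

    def vertex_angle(opposite, side1, side2):
        # Spherical law of cosines solved for the angle between side1 and side2.
        x = ((math.cos(opposite) - math.cos(side1) * math.cos(side2))
             / (math.sin(side1) * math.sin(side2)))
        return math.acos(max(-1.0, min(1.0, x)))

    return vertex_angle(a, b, c) + vertex_angle(b, a, c) + vertex_angle(c, a, b)


def pole_triangle_angle_sum(size):
    """Triangle with one vertex at the pole and two at colatitude `size`."""
    p = to_unit_vector(0.0, 0.0)
    q = to_unit_vector(size, 0.0)
    r = to_unit_vector(size, math.pi / 2)
    return angle_sum(p, q, r)


excess_large = pole_triangle_angle_sum(math.pi / 2) - math.pi  # one octant of the sphere
excess_small = pole_triangle_angle_sum(0.01) - math.pi         # a very small triangle
print(excess_large, excess_small)
```

For the octant the angle sum exceeds two right angles by a full right angle, while for the small triangle the excess is tiny, in keeping with the replacement of Riemannian by Euclidean geometry in a small neighbourhood.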

5. Geometry and Gravitation

With the last noted property of Riemannian geometry, we can now return to
Einstein's thought experiment. Consider the regions of observation of the
two observers O1 and O2. For observer O1, the region of observation is limited to
the falling lift or (to use our earlier notation) to a small neighbourhood of O1. For
O2, the region of observation is much larger, covering part of the earth, the falling
lift and the surrounding space. Could it not be that the geometry applying to the
observations of both O1 and O2 is Riemannian which, for O1, becomes Euclidean
in his neighbourhood (i.e. in the falling lift)? If that be so, then both O1 and O2
observe the path of the coin as a geodesic and, for O1, the geometry being
Euclidean, this geodesic is the straight line, while for O2 the observed curved path
will be the geodesic of the Riemannian geometry.
One more point should be noted. When we say that the region of observation of
O1 is small, that smallness not only refers to the dimensions of the lift, but also to
the small time-interval during which the lift continues to fall freely. Hence, the
Riemannian geometry that we have been mentioning here is four-dimensional
(3 + 1-dimensional).
Einstein's thought experiment has prepared us for the basic postulate of the
general theory of relativity, viz. that the spacetime observations of events in
a gravitational field form a four-dimensional continuum, that the geometry of this
continuum is Riemannian, and that freely falling objects in the gravitational field
describe the geodesics of this Riemannian manifold. Given a gravitational field,
how to find the corresponding Riemannian geometry is the central problem of
GR, with which we shall concern ourselves from now on.

6. The Line Element

We know that the geometry on the surface of a sphere is Riemannian. Consider
a point P on the surface of a sphere of unit radius (Figure 1a). Its position is
given by two angular coordinates $(\theta, \varphi)$, $\theta$ being the colatitude ($\angle NOP$) and $\varphi$ the
longitude ($\varphi = \angle XUP$). If $\theta$ is constant and $\varphi$ varies from 0 to $2\pi$, the point
P moves along the small circle of radius $\sin\theta$. If $\varphi$ is constant and $\theta$ varies from
0 to $\pi$, the point P moves along the meridian great circle NPS (Figure 1a). If
$Q(\theta + d\theta, \varphi + d\varphi)$ is a neighbouring point on the sphere at an infinitesimal
distance $ds$ from P (Figure 1b), then

$$ds^2 = d\theta^2 + \sin^2\theta \, d\varphi^2. \tag{1}$$

[Fig. 1a: the point P on the unit sphere. Fig. 1b: the neighbouring points P and Q, with the element ds and the side sin θ dφ marked.]
It is customary to regard relations like (1) as defining a line element. Formula (1) is
the line element on the surface of a sphere or a line element of a two-dimensional
Riemannian space.
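
As a rough numerical check, one can embed the unit sphere in ordinary three-dimensional space and compare the straight-line distance between two neighbouring points with the value given by the line element, taken here in its standard unit-sphere form ds² = dθ² + sin²θ dφ². The sketch below makes several assumptions: the Cartesian embedding, the sample point and the size of the increments are all illustrative choices, and the function names are hypothetical.

```python
import math


def embed(theta, phi):
    """Cartesian coordinates of the point (theta, phi) on the unit sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def chord_length(theta, phi, dtheta, dphi):
    """Straight-line distance in the embedding space between P and its neighbour Q."""
    return math.dist(embed(theta, phi), embed(theta + dtheta, phi + dphi))


def line_element(theta, dtheta, dphi):
    """ds computed from ds^2 = dtheta^2 + sin^2(theta) dphi^2."""
    return math.sqrt(dtheta ** 2 + (math.sin(theta) * dphi) ** 2)


theta, phi = 1.0, 0.3          # an arbitrary point P
dtheta, dphi = 1e-4, 2e-4      # small coordinate increments to the neighbour Q

ds_formula = line_element(theta, dtheta, dphi)
ds_chord = chord_length(theta, phi, dtheta, dphi)
print(ds_formula, ds_chord)
```

The two numbers agree to leading order in the increments; the small remaining difference comes from terms of higher order in dθ and dφ, which the line element neglects by construction.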
If, however, we were considering two neighbouring points on a plane with
coordinates $(x, y)$, $(x + dx, y + dy)$, the line element would be

$$ds^2 = dx^2 + dy^2. \tag{2}$$

As the geometry on the plane is Euclidean, we can say that (2) is a line element of
a two-dimensional Euclidean space.
On comparison of (1) and (2), we see that there are many points of similarity
between them, e.g. both are quadratic in differentials of the coordinates.
However, there is one striking difference. The coefficients of the quadratic terms in (2)
are all constants, while those of (1) are functions of the coordinates. The one
feature that distinguishes a line element of Riemannian space from that of
a Euclidean space, is that the coefficients of the quadratic terms (some of them at
least) are functions (not constant functions) of the coordinates in a Riemannian
space. We begin our study of Riemannian spacetime with this distinction in mind.
In a four-dimensional spacetime, let $(x^1, x^2, x^3, x^4)$ be the coordinates of an
event. A neighbouring event will have the coordinates $(x^1 + dx^1, x^2 + dx^2, x^3 + dx^3, x^4 + dx^4)$.
We take the line element as a quadratic in $dx^1, dx^2, dx^3, dx^4$:

$$
\begin{aligned}
ds^2 = {} & g_{11}(dx^1)^2 + 2g_{12}\,dx^1 dx^2 + 2g_{13}\,dx^1 dx^3 + 2g_{14}\,dx^1 dx^4 \\
& + g_{22}(dx^2)^2 + 2g_{23}\,dx^2 dx^3 + 2g_{24}\,dx^2 dx^4 + g_{33}(dx^3)^2 \\
& + 2g_{34}\,dx^3 dx^4 + g_{44}(dx^4)^2
\end{aligned}
$$

or we may write it as

$$ds^2 = \sum_{i=1}^{4}\sum_{k=1}^{4} g_{ik}\,dx^i\,dx^k, \tag{3}$$

where $g_{ik}$ are 10 functions of 'position', i.e. of the four variables $x^1, x^2, x^3, x^4$,
symmetric in $i$ and $k$, $g_{ik} = g_{ki}$.
7. Summation Convention

The types of summations involved in (3) appear very frequently in this study.
A good deal of conciseness in the notation can be introduced by using what has
been termed by Einstein the summation convention. Whenever a suffix is
repeated in a term, it will imply a summation of terms for the values 1, 2, 3, 4 of the
repeated suffix and we shall drop the summation symbol $\sum$. Thus,

$$A_1 B^1 + A_2 B^2 + A_3 B^3 + A_4 B^4 \quad \left(= \sum_{i=1}^{4} A_i B^i\right)$$

will simply be written as $A_i B^i$. With this convention, (3) can be written as
$ds^2 = g_{ik}\,dx^i\,dx^k$.
Let us go back for a while to SR. There we have the Minkowskian line element

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2,$$

which is again of form (3) with $g_{11} = g_{22} = g_{33} = -1$, $g_{44} = c^2$, $g_{ij} = 0$ for $i \neq j$.
Clearly, this is a metric of Euclidean (flat) space, and we know that this is invariant
under Lorentz transformations, which are linear. Also, we have a four-dimensional vector formalism, e.g., the velocity four-vector, the four-vector of
electromagnetic potential and also an electromagnetic field tensor. We would
now go on to develop a similar formalism for the Riemannian metric (3) under
any general transformations.
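
The double sum in (3) and the Lorentz invariance of the Minkowskian line element can be checked numerically. The sketch below is only an illustration: the helper names `line_element_squared`, `minkowski_metric` and `boost_x`, the coordinate ordering $(x, y, z, t)$ and the sample displacement are all assumptions of this example.

```python
import math

def line_element_squared(g, dx):
    """Return ds^2 = g_ik dx^i dx^k, summing over both repeated suffixes."""
    n = len(dx)
    return sum(g[i][k] * dx[i] * dx[k] for i in range(n) for k in range(n))

def minkowski_metric(c=1.0):
    """g_11 = g_22 = g_33 = -1, g_44 = c^2, all other components zero."""
    return [[-1.0, 0.0, 0.0, 0.0],
            [0.0, -1.0, 0.0, 0.0],
            [0.0, 0.0, -1.0, 0.0],
            [0.0, 0.0, 0.0, c ** 2]]

def boost_x(dx, v, c=1.0):
    """Lorentz boost with speed v along x, applied to the displacement (dx, dy, dz, dt)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    x, y, z, t = dx
    return [gamma * (x - v * t), y, z, gamma * (t - v * x / c ** 2)]

g = minkowski_metric()
dx = [0.3, -0.1, 0.2, 1.0]           # an arbitrary small displacement
ds2_before = line_element_squared(g, dx)
ds2_after = line_element_squared(g, boost_x(dx, v=0.6))
```

Both values agree to rounding error, which is the invariance of $ds^2$ under the (linear) Lorentz transformation.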

8. Vectors and Tensors

The simplest vector which we encounter in physics is the displacement vector
$dx^i = (dx^1, dx^2, dx^3, dx^4)$. Let us see how this vector behaves when we carry out
any general transformation of coordinates from $x^i$ to $x'^i$,

$$
\begin{aligned}
x'^1 &= f^1(x^1, x^2, x^3, x^4),\\
x'^2 &= f^2(x^1, x^2, x^3, x^4),\\
x'^3 &= f^3(x^1, x^2, x^3, x^4),\\
x'^4 &= f^4(x^1, x^2, x^3, x^4),
\end{aligned}
$$

or better still,

$$x'^i = f^i(x^1, x^2, x^3, x^4).$$

The displacement vector in the new coordinate system will be $dx'^i$ and, as each $x'^i$
is a function of $x^1, x^2, x^3, x^4$,

$$dx'^i = \frac{\partial x'^i}{\partial x^1}dx^1 + \frac{\partial x'^i}{\partial x^2}dx^2 + \frac{\partial x'^i}{\partial x^3}dx^3 + \frac{\partial x'^i}{\partial x^4}dx^4 = \frac{\partial x'^i}{\partial x^j}\,dx^j. \tag{4}$$

Note that we have used the summation convention in (4). Also (4) is essentially
a set of four equations corresponding to $i = 1, 2, 3, 4$, giving the four components of
the vector $dx'^i$ in terms of the components $(dx^1, dx^2, dx^3, dx^4)$ of $dx^j$.
We say that (4) is the transformation law of the vector $dx^i$. This law was
obtained with the help of the calculus for the simplest vector - the displacement
vector. On the basis of this law, we define any vector $A^i$ as a set of four functions
(called components) which transform according to the law

$$A'^i = \frac{\partial x'^i}{\partial x^k}A^k. \tag{5}$$

We shall call such a vector $A^i$ a contravariant vector.


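We also have another elementary vector following from the differentiation
process. That vector is the gradient of a scalar. Let $\varphi = \varphi(x^1, x^2, x^3, x^4)$ be
a scalar function of position. Then $\partial\varphi/\partial x^i$, $i = 1, 2, 3, 4$, give the components of
a vector - the gradient of $\varphi$.
Let us see how this new vector transforms in our scheme. In the coordinate
system $x'^i$, the gradient will be $\partial\varphi/\partial x'^i$ and we know that

$$\frac{\partial\varphi}{\partial x'^i} = \frac{\partial\varphi}{\partial x^1}\frac{\partial x^1}{\partial x'^i} + \frac{\partial\varphi}{\partial x^2}\frac{\partial x^2}{\partial x'^i} + \frac{\partial\varphi}{\partial x^3}\frac{\partial x^3}{\partial x'^i} + \frac{\partial\varphi}{\partial x^4}\frac{\partial x^4}{\partial x'^i},$$

or again, using the summation convention, we may write

$$\frac{\partial\varphi}{\partial x'^i} = \frac{\partial\varphi}{\partial x^j}\frac{\partial x^j}{\partial x'^i}.$$

We would rather state it as

$$\frac{\partial\varphi}{\partial x'^i} = \frac{\partial x^j}{\partial x'^i}\frac{\partial\varphi}{\partial x^j}. \tag{6}$$

Formula (6) is essentially the transformation law of the gradient vector $\partial\varphi/\partial x^j$ to
$\partial\varphi/\partial x'^i$.

Proceeding exactly as we did with (4) to get definition (5), we now define any
vector $B_i$ as a set of four functions (called components) which transform
according to the rule

$$B'_i = \frac{\partial x^j}{\partial x'^i}B_j. \tag{7}$$

We call such a vector $B_i$ a covariant vector.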
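
The two transformation laws can be tried out on a concrete change of coordinates. The sketch below uses the two-dimensional change from Cartesian $(x, y)$ to plane polar $(r, \varphi)$ coordinates, which is a choice made for this illustration only; the helper names are likewise hypothetical. It transforms one vector by law (5) and another by law (7) and compares the contraction $A^i B_i$ before and after.

```python
import math

def jacobians(x, y):
    """Partial derivatives for x' = (r, phi) as functions of x = (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    # forward[i][k] = d x'^i / d x^k
    forward = [[x / r, y / r],
               [-y / r2, x / r2]]
    # inverse[j][i] = d x^j / d x'^i  (x = r cos(phi), y = r sin(phi))
    inverse = [[x / r, -y],
               [y / r, x]]
    return forward, inverse

def transform_contravariant(A, forward):
    """Law (5): A'^i = (d x'^i / d x^k) A^k."""
    n = len(A)
    return [sum(forward[i][k] * A[k] for k in range(n)) for i in range(n)]

def transform_covariant(B, inverse):
    """Law (7): B'_i = (d x^j / d x'^i) B_j."""
    n = len(B)
    return [sum(inverse[j][i] * B[j] for j in range(n)) for i in range(n)]

def contract(A, B):
    """The sum A^i B_i over the repeated suffix."""
    return sum(a * b for a, b in zip(A, B))

forward, inverse = jacobians(3.0, 4.0)
A = [1.0, 2.0]       # contravariant components in (x, y)
B = [0.5, -1.5]      # covariant components in (x, y)
A_new = transform_contravariant(A, forward)
B_new = transform_covariant(B, inverse)
```

The components themselves change, but `contract(A_new, B_new)` equals `contract(A, B)`, which anticipates the invariance of products of the type $A^i D_i$ asked for in the exercises.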

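Exercises
(1) Show that if the coordinate transformation $x'^i = x'^i(x^1, x^2, x^3, x^4)$ is linear,
the two laws of transformation (5) and (7) of contravariant and covariant vectors
are the same.
If $A^i$, $B^j$ are contravariant vectors and $C_i$, $D_j$ are covariant vectors, show that
the law of transformation of the 16-component product

(2) $A^i B^j$ is
$$A'^i B'^j = \frac{\partial x'^i}{\partial x^k}\frac{\partial x'^j}{\partial x^l}A^k B^l,$$

(3) $C_i D_j$ is
$$C'_i D'_j = \frac{\partial x^k}{\partial x'^i}\frac{\partial x^l}{\partial x'^j}C_k D_l,$$

(4) $A^i D_j$ is
$$A'^i D'_j = \frac{\partial x'^i}{\partial x^k}\frac{\partial x^l}{\partial x'^j}A^k D_l.$$

(5) Show further that

$$\frac{\partial x^i}{\partial x'^k}\frac{\partial x'^k}{\partial x^j} = \frac{\partial x'^i}{\partial x^k}\frac{\partial x^k}{\partial x'^j} = \delta^i_j = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{if } i \neq j.\end{cases}$$

Hence, show that the product $A^i D_i$ is invariant.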


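We now extend our definition of vectors to cover 16-component quantities called
second-rank tensors. Any set of 16 functions $T^{ij}$ transforming according to the
law

$$T'^{ij} = \frac{\partial x'^i}{\partial x^k}\frac{\partial x'^j}{\partial x^l}T^{kl}$$

is a contravariant tensor of the second rank. Similarly, we can define covariant
tensors of the second rank $F_{ij}$ and mixed tensors of the second rank $E^i{}_j$ by the
respective transformation laws

$$F'_{ij} = \frac{\partial x^k}{\partial x'^i}\frac{\partial x^l}{\partial x'^j}F_{kl}, \qquad E'^i{}_j = \frac{\partial x'^i}{\partial x^k}\frac{\partial x^l}{\partial x'^j}E^k{}_l.$$

Note that from the previous exercises (2), (3) and (4), we see that the products of
vectors like $A^i B^j$, $C_i D_j$ and $A^i D_j$ form second-rank tensors of types indicated by
the position of the indices. As a matter of fact, the general law of tensor
transformation was suggested by the transformation laws indicated in these
exercises.
It may now be easy to generalize and define the transformation law of a tensor
of any arbitrary rank or nature, like $T^{a_1 \dots a_m}{}_{b_1 \dots b_n}$, as follows:

$$T'^{a_1 \dots a_m}{}_{b_1 \dots b_n} = \frac{\partial x'^{a_1}}{\partial x^{c_1}}\cdots\frac{\partial x'^{a_m}}{\partial x^{c_m}}\,\frac{\partial x^{d_1}}{\partial x'^{b_1}}\cdots\frac{\partial x^{d_n}}{\partial x'^{b_n}}\;T^{c_1 \dots c_m}{}_{d_1 \dots d_n}.$$

Exercise
Reverse transformation: Given that

$$T'_{ik} = \frac{\partial x^a}{\partial x'^i}\frac{\partial x^b}{\partial x'^k}T_{ab},$$

show that

$$T_{ab} = \frac{\partial x'^i}{\partial x^a}\frac{\partial x'^k}{\partial x^b}T'_{ik}.$$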

9. Quotient Law
This law gives a criterion for determining whether a given set of $4^n$ functions forms
a tensor of $n$th rank or not. We shall first take the simple case of $n = 1$.
Let $B_i$ be a set of four functions of the coordinates $x^a$. Given an arbitrary vector $A_k$,
if it so happens that the product $B_i A_k = T_{ik}$ transforms as a tensor, then $B_i$ must
form a vector. $B_i A_k$ transforms as a tensor so that

$$B'_i A'_k = \frac{\partial x^a}{\partial x'^i}\frac{\partial x^b}{\partial x'^k}B_a A_b = \frac{\partial x^a}{\partial x'^i}\frac{\partial x^b}{\partial x'^k}B_a\left(\frac{\partial x'^c}{\partial x^b}A'_c\right) \quad (\text{since } A_b \text{ transforms as a 4-vector})$$

$$= \frac{\partial x^a}{\partial x'^i}B_a A'_k,$$

and this holds good for the arbitrary vector $A_k$. Therefore, $B'_i - (\partial x^a/\partial x'^i)B_a = 0$;
$B_i$ transforms as a vector.
We shall now state the general law. Let $B_{k_1 k_2 \dots k_m i_{m+1} \dots i_n}$ be a set of $4^n$ functions of
$x^a$. Given an arbitrary tensor $A^{k_1 k_2 \dots k_m}$, if the product $A^{k_1 k_2 \dots k_m}B_{k_1 k_2 \dots k_m i_{m+1} \dots i_n}$
transforms as a tensor, then $B_{k_1 k_2 \dots k_m i_{m+1} \dots i_n}$ must be a tensor. We leave the proof as
an exercise.

10. The Fundamental Tensor

We now turn to the basic line element $ds^2 = g_{ik}\,dx^i\,dx^k$. We have seen that $dx^i$ is
a vector and so $dx^i\,dx^k$ is a contravariant tensor of the second rank. Add to this
the basic geometric requirement that $ds^2$ is an invariant and we have all the
ingredients of the quotient law to show that $g_{ik}$ must be a second-rank covariant
tensor. This tensor determines the nature of the corresponding Riemannian
geometry and so it is often called the fundamental tensor or the metric tensor.
Consider the 16 functions $g_{ik}$ arranged as a $4 \times 4$ matrix $\|g_{ik}\|$. If the
determinant of this matrix is not zero, it will have an inverse matrix. Call the
inverse matrix $\|g^{ik}\|$. It follows from the rules of inversion of matrices that
$g_{ik}\,g^{kl} = \delta_i^l$, where $\delta_i^l$ is the Kronecker delta. We first show that $\delta_i^l$ is a tensor.

Exercise
To show that $\delta_i^l$ is a tensor, we use the quotient law. Let $B_{lk}$ be any arbitrary
tensor. Then the product $B_{lk}\,\delta_i^l = B_{ik}$ is a tensor and so, by the quotient rule,
$\delta_i^l$ is a tensor.
Returning to our equation $g_{ik}\,g^{kl} = \delta_i^l$, we again use the quotient rule to conclude
that $g^{kl}$ is a second-rank contravariant tensor.


Exercise:
(1) We have assumed $g = \det\|g_{ik}\| \neq 0$; as a matter of fact $g < 0$ (signature $-2$). Then

$$g^{ik} = \frac{1}{g}\,(\text{cofactor of } g_{ik} \text{ in } g).$$

(2) For the following two metrics find $g_{ik}$, $g$ and $g^{ik}$:

(a) $ds^2 = -\left(1 - \dfrac{2M}{r}\right)^{-1}dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\varphi^2 + \left(1 - \dfrac{2M}{r}\right)dt^2$,

(b) $ds^2 = \left(1 - \dfrac{2M}{r}\right)du^2 + 2\,du\,dr - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$.

11. Raising and Lowering the Suffixes (Indices)

We have seen that displacement is a contravariant vector. The velocity and
acceleration, which are defined in relation to the displacement, are therefore
contravariant vectors. Thus, according to Newton's second law, force, which is the
product of mass and acceleration, also becomes a contravariant vector.
On the other hand, we have also seen that the gradient of a scalar is a covariant
vector. And so, for a conservative system of forces, force becomes a covariant
vector. How can we then have an equation like force = mass x acceleration for
a conservative system of forces? Well, we have such an equation because the
geometry of the situation allows us to associate a covariant vector with every
contravariant vector and vice versa. This is accomplished by the process of
raising and lowering a suffix which we now proceed to describe.
With a contravariant vector $A^i$, we associate a covariant vector $A^i g_{ik}$ and call it
$A_k$: $A_k = A^i g_{ik}$. We call this process of association the process of lowering the
suffix. Similarly, we associate a contravariant vector $B_i\,g^{ik}$ with a given covariant
vector $B_i$ and call it $B^k$. We call this the process of raising the suffix. It will be seen
that we are using the metric tensor of the geometry to associate a covariant vector
with every contravariant vector and vice versa. This method is self-consistent
because we can easily show that if we begin with a contravariant vector $A^i$ and
apply the process of lowering the suffix, we get the covariant vector $A_k$. If we now
apply the process of raising the suffix to this, we shall get back the original vector $A^i$.
The process of raising and lowering the suffixes could be used for tensors of any
rank. For example, we can associate two mixed tensors $T^i{}_l = T^{ik}g_{kl}$ and
$T_l{}^k = T^{ik}g_{il}$ with a second-rank contravariant tensor $T^{ik}$. Of course, if $T^{ik}$ is
symmetric in $i$ and $k$, the two mixed tensors are identical and then one can write it
as $T^i{}_l$ or $T_l{}^i$. Again, we can use this process of raising and lowering repeatedly and
thus, from $T^i{}_l$, we can further lower $i$ and get $T_{ml}$. As a matter of fact,
$T_{ml} = g_{mi}\,g_{lk}\,T^{ik}$.
The raising and lowering of suffixes brings out the basic mathematical fact that
the rank of a tensor is an important characteristic and not its covariant,
contravariant or mixed nature.
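
The self-consistency of lowering and then raising a suffix can be illustrated numerically. In the sketch below the metric components are arbitrary illustrative numbers (chosen only so that $g \neq 0$), and the helpers `invert`, `lower` and `raise_index` are hypothetical names introduced for this example.

```python
def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("metric is singular (g = 0)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def lower(A_up, g_lo):
    """Lowering the suffix: A_k = A^i g_ik."""
    n = len(A_up)
    return [sum(A_up[i] * g_lo[i][k] for i in range(n)) for k in range(n)]

def raise_index(B_lo, g_up):
    """Raising the suffix: B^k = B_i g^ik."""
    n = len(B_lo)
    return [sum(B_lo[i] * g_up[i][k] for i in range(n)) for k in range(n)]

# Illustrative symmetric metric with one off-diagonal term (values are arbitrary).
g_lo = [[-1.0, 0.0, 0.0, 0.0],
        [0.0, -4.0, 0.0, 0.0],
        [0.0, 0.0, -2.0, 1.0],
        [0.0, 0.0, 1.0, 0.5]]
g_up = invert(g_lo)                 # the matrix ||g^ik||
A_up = [1.0, 2.0, -1.0, 3.0]
A_lo = lower(A_up, g_lo)            # covariant components A_k
A_back = raise_index(A_lo, g_up)    # should reproduce A^i
```

Here `A_back` reproduces `A_up` up to rounding, because $g_{ik}\,g^{kl} = \delta_i^l$.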

12. Length of a Vector

We can associate a scalar $g_{ik}A^iA^k = A_kA^k$ with the vector $A^i$ and call it $A^2$, the
square of the length of the vector, and take $A$ as the length (or magnitude) of
the vector. Similarly, we can define the angle $\theta$ between two vectors $A^i$ and $B^i$ by
the equation

$$AB\cos\theta = g_{ik}A^iB^k = A_kB^k.$$

For a given vector $A^i$, the square of the length $A^2 = A_iA^i$ can be positive, negative
or zero and, accordingly, we call the vector time-like, space-like or null. A null
vector is a vector of zero length but need not be the zero vector.
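
A short sketch of this classification, assuming the Minkowskian metric with $c = 1$ and the coordinate ordering $(x, y, z, t)$; the function names and the tolerance `tol` are choices made for this example only.

```python
def squared_length(g, A):
    """A^2 = g_ik A^i A^k."""
    n = len(A)
    return sum(g[i][k] * A[i] * A[k] for i in range(n) for k in range(n))

def classify(g, A, tol=1e-12):
    """Label a vector by the sign of its squared length."""
    a2 = squared_length(g, A)
    if a2 > tol:
        return "time-like"
    if a2 < -tol:
        return "space-like"
    return "null"

# Minkowskian metric with c = 1, coordinates ordered (x, y, z, t).
eta = [[-1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, 1]]

labels = [classify(eta, v) for v in ([0, 0, 0, 1], [1, 0, 0, 0], [1, 0, 0, 1])]
```

The third vector, $(1, 0, 0, 1)$, has zero length although none of its nonvanishing components is zero, which is the point of the last remark above.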

13. Addition of Vectors at a Point

Vectors have been defined as sets of four functions satisfying certain transformation equations. At each point of spacetime, the vector becomes a set of four real
numbers. In another coordinate system at the same point, a given vector will yield
another set of four real numbers. The numbers of the new set are connected with
those of the old set by the vector transformation law. We give this unusually long
explanation for a comparatively simple matter to emphasize the fact that all the
algebra we have done refers in essence to a set of real numbers defined at each
point of the spacetime.
If we have two vectors $A^i$ and $B^i$ at a spacetime point $P$, we can define
a quantity $A^i + B^i$ at $P$ and this will also satisfy the law of vector transformation:

$$A'^i = \frac{\partial x'^i}{\partial x^k}A^k, \qquad B'^i = \frac{\partial x'^i}{\partial x^k}B^k,$$

so

$$A'^i + B'^i = \frac{\partial x'^i}{\partial x^k}(A^k + B^k).$$

Thus, defining the addition of vectors at the same point is possible.
However, if $A^i$ is given at a point $P$ and $B^i$ is given at a point $Q$, can we define
$A^i + B^i$? As a set of real numbers, we can carry out the summation but the new set
will not transform according to the vector law into another coordinate system. This is
because the multiplying coefficients $\partial x'^i/\partial x^k$ (16 of them!) are also defined at
a point and these weighting numbers are different at the two points $P$ and $Q$.
This situation is very easy to understand. Even in Euclidean geometry, we did
not add two vectors at different points, but we have almost forgotten this
disability of ours because, in order to add a vector $A$ at $P$ and a vector $B$ at $Q$, we
could always replace $B$ at $Q$ by a parallel vector at $P$ and then add the two (say, by
the parallelogram rule). In Riemannian geometry, we do not have a parallel vector at
$P$ because of the absence of a parallel postulate, and so we cannot afford to forget
this disability.
In the next section, we shall try to see how we can carry out some sort of
addition of vectors at neighbouring points in Riemannian geometry. After all, if
we wish to develop a vector (or a tensor) calculus in Riemannian spacetime, we
must define the 'difference' between vectors at neighbouring points.
14. Covariant Derivative of a Contravariant Vector

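If $A^i$ is a contravariant vector, we have the transformation law

$$A'^i = \frac{\partial x'^i}{\partial x^a}A^a.$$

Differentiate this with respect to $x'^k$:

$$
\begin{aligned}
\frac{\partial A'^i}{\partial x'^k} &= \frac{\partial}{\partial x'^k}\left(\frac{\partial x'^i}{\partial x^a}A^a\right) = \frac{\partial}{\partial x^b}\left(\frac{\partial x'^i}{\partial x^a}A^a\right)\frac{\partial x^b}{\partial x'^k}\\
&= \frac{\partial^2 x'^i}{\partial x^b\,\partial x^a}\frac{\partial x^b}{\partial x'^k}A^a + \frac{\partial x'^i}{\partial x^a}\frac{\partial A^a}{\partial x^b}\frac{\partial x^b}{\partial x'^k}\\
&= \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}\frac{\partial A^a}{\partial x^b} + \frac{\partial^2 x'^i}{\partial x^a\,\partial x^b}\frac{\partial x^b}{\partial x'^k}A^a,
\end{aligned}
$$

or, writing a comma for partial differentiation,

$$A'^i{}_{,k} = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}A^a{}_{,b} + \frac{\partial^2 x'^i}{\partial x^a\,\partial x^b}\frac{\partial x^b}{\partial x'^k}A^a,$$

which is not the law of transformation of a tensor. This is not surprising because,
in finding the derivative $\partial A^a/\partial x^b$, we have to take the vector $A^a + \delta A^a$ at
$Q(x^b + \delta x^b)$ and subtract from it the vector $A^a$ at $P(x^b)$. We cannot do this because
we do not have a parallel postulate which would have enabled us to transfer
a vector $A^a$ at $P$ to a parallel and equal vector at $Q$.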
We get round this difficulty in the following manner. $A^i$ is a vector field (Figure
2), so that at every point of spacetime we have a vector; at $P(x^l)$, the vector is
$A^i$. At a neighbouring point $Q(x^l + \delta x^l)$, the vector of the field is $A^i + \delta A^i$. We
now transplant $A^i$ from $P$ to another vector $A^i + \Delta A^i$ at $Q$ and use this
transplanted vector as the representative of $A^i$ at $P$ for defining a derivative. The
real assumption is about $\Delta A^i$: we take $\Delta A^i$ to depend linearly on $A^i$, as well as
on the displacement $\delta x^l$. We assume

$$\Delta A^i = -\Gamma^i_{kl}A^k\,\delta x^l.$$

[Fig. 2: the vector field $A^i$ at the neighbouring points P and Q.]

To use the familiar jargon, we call $A^i + \Delta A^i$ at $Q$ a vector obtained by the parallel
transport of $A^i$ from $P$ to $Q$. We must, of course, show that

$$A^i + \Delta A^i = A^i - \Gamma^i_{kl}A^k\,\delta x^l$$

is a vector at $Q$. After all, the coefficients $\Gamma^i_{kl}$ (64 in all) are undetermined. We shall
see how they should transform in order to make $A^i + \Delta A^i$ a vector at $Q$. Let us,
however, proceed with our differentiation problem. We now define the derivative
of $A^i$ at $P$ by the formula

$$\lim_{\delta x^l \to 0}\frac{(A^i + \delta A^i) - (A^i + \Delta A^i)}{\delta x^l}.$$

We can do the subtraction indicated because both the vectors are at the point $Q$.
Call this derivative $A^i{}_{;l}$:

$$A^i{}_{;l} = \lim_{\delta x^l \to 0}\frac{(A^i + \delta A^i) - (A^i + \Delta A^i)}{\delta x^l} = \lim_{\delta x^l \to 0}\frac{\delta A^i}{\delta x^l} + \Gamma^i_{kl}A^k = \frac{\partial A^i}{\partial x^l} + \Gamma^i_{kl}A^k = A^i{}_{,l} + \Gamma^i_{kl}A^k.$$

If the transplanted vector $A^i + \Delta A^i$ is a vector, this derivative $A^i{}_{;l}$ is a tensor or,
conversely, if we prove that $A^i{}_{;l}$ transforms as a tensor, we would have proved that
the transplanted $A^i + \Delta A^i$ was a vector.
We shall prove that $A^i{}_{;l}$ is a tensor a little later on. For the present we shall
call it the covariant derivative of the contravariant vector $A^i$.
Let us now find the covariant derivative of a covariant vector.

15. Covariant Derivative of a Covariant Vector

For this purpose, let us find the vector $B_i + \Delta B_i$ at $Q$ transplanted from $B_i$ at $P$.
We note that $A^iB_i$ is a scalar which defines the angle between the two vectors and
we know $A^iA_i$ and $B^iB_i$ as the squares of the lengths of the vectors. We must ensure that on
parallel transplantation of the two vectors $A^i$ and $B_i$, the lengths of these vectors
should remain unchanged, as should the angle between them. Therefore, we
must have $\Delta(A^iB_i) = 0$, i.e.

$$\Delta A^i\,B_i + A^i\,\Delta B_i = 0;$$

but

$$\Delta A^i = -\Gamma^i_{kl}A^k\,\delta x^l,$$

so

$$-\Gamma^i_{kl}A^k\,\delta x^l\,B_i + A^i\,\Delta B_i = 0,$$

$$A^k\left[\Delta B_k - \Gamma^i_{kl}\,\delta x^l\,B_i\right] = 0,$$

and this has to be true for any arbitrary $A^k$, so

$$\Delta B_k = \Gamma^i_{kl}\,\delta x^l\,B_i = \Gamma^a_{kl}\,\delta x^l\,B_a.$$

Now we are ready to find the covariant derivative $B_{k;l}$:

$$B_{k;l} = \lim_{\delta x^l \to 0}\frac{(B_k + \delta B_k) - (B_k + \Delta B_k)}{\delta x^l}.$$

So

$$B_{k;l} = B_{k,l} - \Gamma^a_{kl}B_a.$$

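We have to settle two issues before we proceed further: (1) we must find the
unknown functions (64 of them) $\Gamma^i_{kl}$, and (2) we must get their law of transformation which
will ensure that $A^i{}_{;k}$ is a tensor. We do the second now.
We know that $A^i$ is a vector, $A'^i = (\partial x'^i/\partial x^m)A^m$. Also,

$$A^i{}_{;k} = A^i{}_{,k} + \Gamma^i_{lk}A^l$$

and in the new coordinates

$$A'^i{}_{;k} = A'^i{}_{,k} + \Gamma'^i_{lk}A'^l.$$

But we have already found the law of transformation of

$$A'^i{}_{,k} = \frac{\partial A'^i}{\partial x'^k}.$$

Use it here:

$$
\begin{aligned}
A'^i{}_{;k} &= \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}A^a{}_{,b} + \frac{\partial^2 x'^i}{\partial x^a\,\partial x^b}\frac{\partial x^b}{\partial x'^k}A^a + \Gamma'^i_{lk}A'^l\\
&= \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}\left[A^a{}_{;b} - \Gamma^a_{mb}A^m\right] + \frac{\partial^2 x'^i}{\partial x^a\,\partial x^b}\frac{\partial x^b}{\partial x'^k}A^a + \Gamma'^i_{lk}A'^l\\
&= \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}A^a{}_{;b} - \Gamma^a_{mb}A^m\frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k} + \frac{\partial^2 x'^i}{\partial x^a\,\partial x^b}\frac{\partial x^b}{\partial x'^k}A^a + \Gamma'^i_{lk}A'^l.
\end{aligned}
$$

If $A'^i{}_{;k}$ is to transform as a tensor, then

$$A'^i{}_{;k} = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}A^a{}_{;b}$$

and, therefore,

$$\Gamma'^i_{lk}A'^l = \Gamma^a_{mb}A^m\frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k} - \frac{\partial^2 x'^i}{\partial x^a\,\partial x^b}\frac{\partial x^b}{\partial x'^k}A^a.$$

But $A'^l = (\partial x'^l/\partial x^m)A^m$, so that

$$\Gamma'^i_{lk}\frac{\partial x'^l}{\partial x^m}A^m = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}\Gamma^a_{mb}A^m - \frac{\partial^2 x'^i}{\partial x^m\,\partial x^b}\frac{\partial x^b}{\partial x'^k}A^m,$$

and, $A^m$ being arbitrary,

$$\Gamma'^i_{lk}\frac{\partial x'^l}{\partial x^m} = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}\Gamma^a_{mb} - \frac{\partial^2 x'^i}{\partial x^m\,\partial x^b}\frac{\partial x^b}{\partial x'^k}.$$

Multiplying by $\partial x^m/\partial x'^n$,

$$\Gamma'^i_{lk}\frac{\partial x'^l}{\partial x^m}\frac{\partial x^m}{\partial x'^n} = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}\frac{\partial x^m}{\partial x'^n}\Gamma^a_{mb} - \frac{\partial^2 x'^i}{\partial x^m\,\partial x^b}\frac{\partial x^b}{\partial x'^k}\frac{\partial x^m}{\partial x'^n},$$

$$\Gamma'^i_{nk} = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}\frac{\partial x^m}{\partial x'^n}\Gamma^a_{mb} - \frac{\partial^2 x'^i}{\partial x^m\,\partial x^b}\frac{\partial x^b}{\partial x'^k}\frac{\partial x^m}{\partial x'^n}.$$

This is the transformation law of $\Gamma^i_{nk}$.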


One can generalize this notion of covariant derivative to tensors of any rank,
e.g.

$$T^{ik}{}_{;l} = T^{ik}{}_{,l} + \Gamma^i_{nl}T^{nk} + \Gamma^k_{nl}T^{in},$$

$$S_{ik;l} = S_{ik,l} - \Gamma^n_{il}S_{nk} - \Gamma^n_{kl}S_{in},$$

and so on.

16. The Christoffel Symbols


The $\Gamma^i_{kl}$ introduced above is called the affine connection used for connecting
a vector at $P$ to a vector at a neighbouring point $Q$. We now specialize this
connection by assuming (1) $\Gamma^i_{kl} = \Gamma^i_{lk}$, (2) $g_{ik;l} = 0$.
When connections follow these assumptions, the geometry of spacetime
becomes Riemannian geometry. We simplify assumption (2) to get $\Gamma^i_{kl}$ in terms of
$g_{ik}$ and its derivatives:

$$g_{ik;l} = g_{ik,l} - \Gamma^n_{il}\,g_{nk} - \Gamma^n_{kl}\,g_{in} = 0.$$

Define $g_{nk}\Gamma^n_{il} \equiv \Gamma_{k|il}$. Then

$$g_{ik,l} = \Gamma_{k|il} + \Gamma_{i|kl}, \tag{1}$$

$$g_{kl,i} = \Gamma_{l|ki} + \Gamma_{k|li}, \tag{2}$$

$$g_{li,k} = \Gamma_{i|lk} + \Gamma_{l|ik}. \tag{3}$$

Take (1) + (2) - (3):

$$g_{ik,l} + g_{kl,i} - g_{li,k} = 2\Gamma_{k|il},$$

$$\Gamma_{k|il} = \tfrac{1}{2}(g_{ik,l} + g_{kl,i} - g_{li,k}),$$

$$\Gamma_{k|il}\,g^{kb} = \tfrac{1}{2}g^{kb}(g_{ik,l} + g_{kl,i} - g_{li,k}),$$

$$\Gamma^n_{il} = \tfrac{1}{2}g^{nk}(g_{ik,l} + g_{lk,i} - g_{il,k}).$$

The $\Gamma^n_{il}$ are often written as $\left\{ {n \atop i\,l} \right\}$ and are called three-index symbols. The symmetry in
$i$ and $l$ reduces their number from 64 to 40.
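
The formula for $\Gamma^n_{il}$ lends itself to a direct numerical check. The sketch below evaluates it for the unit sphere of Section 6, with the derivatives $g_{ik,l}$ estimated by central differences. The use of finite differences, the step `h` and all function names are assumptions of this example; an exact treatment would differentiate $g_{ik}$ analytically.

```python
import math

def sphere_metric(x):
    """g_ik for the unit sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def sphere_inverse_metric(x):
    """g^ik for the unit sphere (the metric is diagonal, so invert entrywise)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, 1.0 / math.sin(theta) ** 2]]

def metric_derivative(metric, x, l, h=1e-5):
    """Central-difference estimate of g_ik,l at the point x."""
    xp, xm = list(x), list(x)
    xp[l] += h
    xm[l] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[i][k] - gm[i][k]) / (2 * h) for k in range(n)] for i in range(n)]

def christoffel(metric, inverse_metric, x):
    """gamma[n][i][l] = (1/2) g^{nk} (g_ik,l + g_lk,i - g_il,k)."""
    dim = len(x)
    dg = [metric_derivative(metric, x, l) for l in range(dim)]  # dg[l][i][k] = g_ik,l
    g_up = inverse_metric(x)
    return [[[0.5 * sum(g_up[n][k] * (dg[l][i][k] + dg[i][l][k] - dg[k][i][l])
                        for k in range(dim))
              for l in range(dim)]
             for i in range(dim)]
            for n in range(dim)]

point = [1.0, 0.3]   # theta = 1 rad, phi = 0.3 rad
gamma = christoffel(sphere_metric, sphere_inverse_metric, point)
```

With the suffixes ordered $(\theta, \varphi)$, `gamma[0][1][1]` approximates $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and `gamma[1][0][1]` approximates $\Gamma^\varphi_{\theta\varphi} = \cot\theta$; the array is symmetric in its last two suffixes, as assumption (1) requires.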


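Examples.
(1) $g = \|g_{ik}\|$. The rule for differentiation of a determinant says that

$$dg = \sum_{i,k}^{1,4}(\text{cofactor of } g_{ik})\,dg_{ik}$$

or

$$dg = g\,g^{ik}\,dg_{ik},$$

therefore

$$g^{ik}\,dg_{ik} = \frac{dg}{g}.$$

Again, $g^{ik}g_{ik} = 4$, therefore

$$g^{ik}\,dg_{ik} + dg^{ik}\,g_{ik} = 0,$$

and so

$$g_{ik}\,dg^{ik} = -g^{ik}\,dg_{ik} = -\frac{dg}{g}.$$

(2)

$$\Gamma^k_{ik} = \tfrac{1}{2}g^{kn}(g_{in,k} + g_{kn,i} - g_{ik,n}) = \tfrac{1}{2}g^{kn}g_{kn,i} = \frac{1}{2g}\frac{\partial g}{\partial x^i} = \frac{\partial}{\partial x^i}\left(\log\sqrt{-g}\right).$$

(3)

$$g^{ik}\Gamma^l_{ik} = \tfrac{1}{2}g^{ik}g^{ln}(g_{in,k} + g_{kn,i} - g_{ik,n}) = g^{ik}g^{ln}g_{in,k} - \tfrac{1}{2}g^{ln}g^{ik}g_{ik,n}.$$

But $g^{ik}g_{in} = \delta^k_n$, therefore $(g^{ik}g_{in})_{,k} = 0$. So

$$g^{ik}{}_{,k}\,g_{in} + g^{ik}g_{in,k} = 0.$$

Going back to the argument,

$$
\begin{aligned}
g^{ik}\Gamma^l_{ik} &= -g^{ln}g^{ik}{}_{,k}\,g_{in} - g^{ln}\frac{\partial}{\partial x^n}\log\sqrt{-g}\\
&= -g^{lk}{}_{,k} - g^{ln}\frac{1}{\sqrt{-g}}\frac{\partial\sqrt{-g}}{\partial x^n}\\
&= -\frac{1}{\sqrt{-g}}\left[g^{ln}\frac{\partial}{\partial x^n}\sqrt{-g} + \sqrt{-g}\,\frac{\partial}{\partial x^n}g^{ln}\right]\\
&= -\frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^n}\left[\sqrt{-g}\,g^{ln}\right].
\end{aligned}
$$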
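
As a check on examples (2) and (3), the two contracted identities can be worked out by hand for the unit sphere of Section 6. This is only an illustrative sketch: the sphere has a positive determinant, $g = \sin^2\theta$, so $\sqrt{g}$ takes the place of $\sqrt{-g}$, and the values of the connection used below are the standard ones for that metric, which are assumed here rather than derived in the text.

```latex
% Unit sphere: g_{\theta\theta} = 1, g_{\varphi\varphi} = \sin^2\theta, g = \sin^2\theta.
% Nonvanishing connection coefficients:
%   \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta,
%   \Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta.
\begin{align*}
\Gamma^k_{\theta k} &= \Gamma^\theta_{\theta\theta} + \Gamma^\varphi_{\theta\varphi}
  = 0 + \cot\theta
  = \frac{\partial}{\partial\theta}\log\sin\theta
  = \frac{\partial}{\partial\theta}\log\sqrt{g}, \\
g^{ik}\Gamma^\theta_{ik} &= g^{\varphi\varphi}\,\Gamma^\theta_{\varphi\varphi}
  = \frac{1}{\sin^2\theta}\,(-\sin\theta\cos\theta)
  = -\cot\theta, \\
-\frac{1}{\sqrt{g}}\frac{\partial}{\partial x^n}\bigl[\sqrt{g}\,g^{\theta n}\bigr]
  &= -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\bigl[\sin\theta\cdot 1\bigr]
  = -\cot\theta .
\end{align*}
```

The first line reproduces example (2) and the last two lines agree with each other, as example (3) requires.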

17. Geodesics

Having now access to Riemannian geometry, let us find the differential equations
of geodesics, which are going to be the paths of freely falling objects. One can
derive these by using either of the following two properties.
(1) It is a curve joining two points $P_1$ and $P_2$ such that the distance between
the two points measured along it is the least (more generally, stationary).
(2) It is a curve such that if you move along the curve from a point $P$ to
a neighbouring point $Q$, the tangent vector to the curve does not suffer any
change (measuring the 'straightness' property of the geodesics).
We shall derive the equation by using the second property.
Let the equation of the curve be given by $x^i = x^i(\lambda)$ in terms of a parameter $\lambda$.
We know that the tangent vector to the curve at a point $P(x^i)$ is $dx^i/d\lambda$. Call it $V^i$.
We now go a distance $\delta s$ along the curve from $P$ to $Q$, taking the arc length $s$ as the parameter. The total change suffered by
the tangent vector in going from $P$ to $Q$ is $V^i + \delta V^i - (V^i + \Delta V^i)$ and we want
this to be zero in the limit $\delta s \to 0$. Therefore, we want

$$\lim_{\delta s \to 0}\frac{\delta V^i - \Delta V^i}{\delta s} = 0.$$

Hence

$$\lim_{\delta s \to 0}\frac{\delta V^i + \Gamma^i_{kl}V^k\,\delta x^l}{\delta s} = 0,$$

$$\frac{dV^i}{ds} + \Gamma^i_{kl}V^k\frac{dx^l}{ds} = 0,$$

$$\frac{d}{ds}\left(\frac{dx^i}{ds}\right) + \Gamma^i_{kl}\frac{dx^k}{ds}\frac{dx^l}{ds} = 0,$$

$$\frac{d^2x^i}{ds^2} + \Gamma^i_{kl}\frac{dx^k}{ds}\frac{dx^l}{ds} = 0.$$

These are the equations of the geodesics.
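
The geodesic equations can be integrated numerically once the $\Gamma^i_{kl}$ are known. The sketch below does this for the unit sphere of Section 6, using the standard connection coefficients for that metric and a fourth-order Runge-Kutta step. The choice of integrator, the step size, the starting values and the function names are all assumptions of this example.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of the geodesic equations on the unit sphere.

    state = (theta, phi, V_theta, V_phi) with V^i = dx^i/ds.
    The nonzero connection coefficients are
    Gamma^theta_{phi phi} = -sin(theta) cos(theta) and
    Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
    """
    theta, phi, v_theta, v_phi = state
    a_theta = math.sin(theta) * math.cos(theta) * v_phi ** 2
    a_phi = -2.0 * (math.cos(theta) / math.sin(theta)) * v_theta * v_phi
    return (v_theta, v_phi, a_theta, a_phi)

def rk4_step(state, ds):
    """Advance the state by one fourth-order Runge-Kutta step of length ds."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, ds / 2))
    k3 = geodesic_rhs(shifted(state, k2, ds / 2))
    k4 = geodesic_rhs(shifted(state, k3, ds))
    return tuple(s + ds / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed_squared(state):
    """g_ik V^i V^k for the unit sphere."""
    theta, _, v_theta, v_phi = state
    return v_theta ** 2 + math.sin(theta) ** 2 * v_phi ** 2

# Start at colatitude 60 degrees with a unit tangent vector.
theta0 = math.pi / 3
start = (theta0, 0.0, 0.6, 0.8 / math.sin(theta0))
state = start
for _ in range(1000):            # integrate over arc length s = 1
    state = rk4_step(state, 0.001)
```

Along the computed curve $g_{ik}V^iV^k$ stays equal to 1, i.e. the tangent vector keeps its length, which is what parallel transport of $V^i$ along the curve demands.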


18. The Curvature Tensor

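We cannot construct tensors of higher rank from the fundamental tensor by
covariant differentiation, because we have taken $g_{ik;l} = 0$. However, there is
a way of getting over this difficulty. Consider the second covariant derivative of
$A^i$, i.e. $(A^i{}_{;k})_{;l}$:

$$
\begin{aligned}
A^i{}_{;k} &= A^i{}_{,k} + \Gamma^i_{kn}A^n,\\
(A^i{}_{;k})_{;l} &= (A^i{}_{;k})_{,l} + \Gamma^i_{nl}A^n{}_{;k} - \Gamma^n_{kl}A^i{}_{;n},\\
(A^i{}_{;k})_{;l} &= A^i{}_{,kl} + \Gamma^i_{kn,l}A^n + \Gamma^i_{kn}A^n{}_{,l} + \Gamma^i_{nl}(A^n{}_{,k} + \Gamma^n_{mk}A^m) - \Gamma^n_{kl}(A^i{}_{,n} + \Gamma^i_{mn}A^m)\\
&= A^i{}_{,kl} + \Gamma^i_{kn}A^n{}_{,l} + \Gamma^i_{nl}A^n{}_{,k} - \Gamma^n_{kl}A^i{}_{,n} + \Gamma^i_{kn,l}A^n + \Gamma^i_{nl}\Gamma^n_{mk}A^m - \Gamma^n_{kl}\Gamma^i_{mn}A^m.
\end{aligned}
$$

By interchanging $k$ and $l$, we write similarly

$$(A^i{}_{;l})_{;k} = A^i{}_{,lk} + \Gamma^i_{ln}A^n{}_{,k} + \Gamma^i_{nk}A^n{}_{,l} - \Gamma^n_{lk}A^i{}_{,n} + \Gamma^i_{ln,k}A^n + \Gamma^i_{nk}\Gamma^n_{ml}A^m - \Gamma^n_{lk}\Gamma^i_{mn}A^m.$$

Therefore, on subtraction,

$$A^i{}_{;k;l} - A^i{}_{;l;k} = \left(\Gamma^i_{km,l} - \Gamma^i_{lm,k} + \Gamma^i_{nl}\Gamma^n_{mk} - \Gamma^i_{nk}\Gamma^n_{ml}\right)A^m = A^m R^i{}_{mlk},$$

where

$$R^i{}_{mlk} = \Gamma^i_{km,l} - \Gamma^i_{lm,k} + \Gamma^i_{nl}\Gamma^n_{mk} - \Gamma^i_{nk}\Gamma^n_{ml}.$$

Now the left-hand side is a tensor and on the right, $A^m$ is an arbitrary vector.
Therefore, by the quotient law, $R^i{}_{mlk}$ is a tensor. It is called the curvature tensor for
reasons which will be made clear later.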

19. Natural Coordinates at a Point

In order to work out the properties of the curvature tensor, the use of a special
coordinate system is very helpful. In Section 4 we noted a property of Riemannian
geometry that in the immediate neighbourhood of a point, one can forget the
Riemannian complication and treat the geometry as Euclidean. We shall now go
about showing how this can be done.
In Euclidean spacetime (or flat spacetime), $g_{ik}$ can be reduced to constants, so
we take the line element in Euclidean spacetime as $ds^2 = \eta_{ik}\,dx^i\,dx^k$, where $\eta_{ik}$ are
constants.
We now show that in a Riemannian spacetime, we can choose a coordinate
system in which at a point $P$

$$g_{ik}(P) = \eta_{ik} \quad \text{and} \quad g_{ik,l}(P) = 0,$$

i.e., at the point $P$, of course, $g_{ik}$ become numbers and these numbers are the same
as $\eta_{ik}$ and, further, the first derivatives of $g_{ik}$ at $P$ all vanish. This vanishing of the
first derivatives at $P$ ensures that not only at $P$ but within a small neighbourhood,
$g_{ik}$ will remain constant to the first order of small quantities, and so we shall have Euclidean geometry in this
neighbourhood. Let us first see that such a choice of coordinate system is
possible.
First consider any general coordinate system $x'^i$. We want to find a new
coordinate system $x^i$ with the required properties. In the $x'^i$ system, take a point
$P$ with coordinates $x'^i_0$ and let the connection at $P$ be $(\Gamma'^i_{kl})_P$. Then in a small region
round $P$, define a new coordinate system $x^i$ by the equation

$$x'^i = x'^i_0 + x^i - \tfrac{1}{2}(\Gamma'^i_{kl})_P\,x^k x^l. \tag{1}$$

At the point $P$, $x'^i = x'^i_0$ and so $x^i - \tfrac{1}{2}(\Gamma'^i_{kl})_P\,x^k x^l = 0$, which means that at
$P$, $x^i = 0$ to the first order of smallness of the radius of the neighbourhood of $P$.
Thus, in the new system $P$ becomes the origin of coordinates. Again from (1) we
find

$$\left(\frac{\partial x'^i}{\partial x^k}\right)_P = \delta^i_k$$

and

$$\left(\frac{\partial^2 x'^i}{\partial x^k\,\partial x^l}\right)_P = -(\Gamma'^i_{kl})_P.$$

Now use the law of transformation of $\Gamma^i_{kl}$ as

$$\Gamma'^i_{kl} = \frac{\partial x'^i}{\partial x^a}\frac{\partial x^b}{\partial x'^k}\frac{\partial x^c}{\partial x'^l}\Gamma^a_{bc} - \frac{\partial^2 x'^i}{\partial x^m\,\partial x^b}\frac{\partial x^b}{\partial x'^k}\frac{\partial x^m}{\partial x'^l}.$$

Therefore, at the point $P$,

$$(\Gamma'^i_{kl})_P = \delta^i_a\,\delta^b_k\,\delta^c_l\,(\Gamma^a_{bc})_P + (\Gamma'^i_{mb})_P\,\delta^b_k\,\delta^m_l,$$

therefore

$$(\Gamma'^i_{kl})_P = (\Gamma^i_{kl})_P + (\Gamma'^i_{kl})_P$$

and so $(\Gamma^i_{kl})_P = 0$.
We have shown that $g_{kl,i} = \Gamma^n_{ki}\,g_{nl} + \Gamma^n_{li}\,g_{kn}$ and so, when the forty $\Gamma$'s at
$P$ vanish,

$$(g_{kl,i})_P = 0.$$

Thus, we have been able to choose coordinates in the immediate neighbourhood
of a point $P$ such that the first derivatives of $g_{ik}$ vanish and so $g_{ik}$ can be taken as
constants in that neighbourhood. Of course, at $P$ the second derivatives of $g_{ik}$ will
not vanish and so globally the geometry is Riemannian, and only in the
immediate vicinity of a point can one treat it as Euclidean. This special type of
coordinate system is known as the natural coordinate system at the point or the
locally inertial coordinate system at the point. The use of this natural coordinate
system considerably simplifies mathematical proofs in tensor calculus, as we shall
subsequently see.

20. Symmetry Properties of the Curvature Tensor

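We have found that

$$R^i{}_{mlk} = \Gamma^i_{km,l} - \Gamma^i_{ml,k} + \Gamma^i_{nl}\Gamma^n_{mk} - \Gamma^i_{nk}\Gamma^n_{ml}.$$

One property is easily seen from the very method of deriving this tensor,

$$A^i{}_{;k;l} - A^i{}_{;l;k} = A^m R^i{}_{mlk},$$

namely that the tensor $R^i{}_{mlk}$ is antisymmetric in $l$ and $k$. This enables us to write down all
the four terms on the right-hand side if we can write two specific terms. We should
fix our attention on the second and fourth terms on the right. We therefore write

$$R^i{}_{mlk} = -\Gamma^i_{ml,k} + \Gamma^i_{mk,l} - \Gamma^n_{ml}\Gamma^i_{nk} + \Gamma^n_{mk}\Gamma^i_{nl}.$$

One may now be able to write down the tensor with any other set of suffixes, for example

$$R^p{}_{bcd} = -\Gamma^p_{bc,d} + \Gamma^p_{bd,c} - \Gamma^n_{bc}\Gamma^p_{nd} + \Gamma^n_{bd}\Gamma^p_{nc}.$$

Lowering the suffix $p$, we write

$$R_{abcd} = g_{ap}R^p{}_{bcd} = g_{ap}\left[-\Gamma^p_{bc,d} + \Gamma^p_{bd,c} - \Gamma^n_{bc}\Gamma^p_{nd} + \Gamma^n_{bd}\Gamma^p_{nc}\right].$$

We now use natural coordinates at a point $P$ and consider $R_{abcd}$ in these
coordinates in the neighbourhood of $P$. There, the first derivatives of $g_{ik}$ vanish,
but the second derivatives do not, so the derivatives of the $\Gamma$'s do not vanish while
the $\Gamma$'s themselves vanish.
In these coordinates,

$$
\begin{aligned}
R_{abcd} &= g_{ap}\left[-\frac{\partial}{\partial x^d}\Gamma^p_{bc} + \frac{\partial}{\partial x^c}\Gamma^p_{bd}\right]\\
&= -g_{ap}\frac{\partial}{\partial x^d}\left[\tfrac{1}{2}g^{pn}(g_{nb,c} + g_{nc,b} - g_{bc,n})\right] + g_{ap}\frac{\partial}{\partial x^c}\left[\tfrac{1}{2}g^{pn}(g_{nb,d} + g_{nd,b} - g_{bd,n})\right].
\end{aligned}
$$

Again, remembering that $g_{ik,l} = 0$ but that $g_{ik,ln} \neq 0$, we find

$$
\begin{aligned}
R_{abcd} &= -\tfrac{1}{2}\delta^n_a(g_{nb,cd} + g_{nc,bd} - g_{bc,nd}) + \tfrac{1}{2}\delta^n_a(g_{nb,cd} + g_{nd,bc} - g_{bd,nc})\\
&= \tfrac{1}{2}\left[-g_{ab,cd} - g_{ac,bd} + g_{bc,ad} + g_{ab,cd} + g_{ad,bc} - g_{bd,ac}\right]\\
&= \tfrac{1}{2}\left(g_{ad,bc} + g_{bc,ad} - g_{ac,bd} - g_{bd,ac}\right).
\end{aligned}
$$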

This expression for $R_{abcd}$ is valid in a natural coordinate system at a certain point.
Since the symmetry properties are independent of the coordinate system, we can
draw conclusions about these properties from the above simple expression for
$R_{abcd}$. The earlier conclusion about antisymmetry in $c$ and $d$ is easily verified, but
we also find that there is antisymmetry between $a$ and $b$. These two antisymmetries considerably reduce the number of independent components. A fourth-rank tensor has, in all, 256 components. Now any two suffixes lead to 16
components. If, however, there is antisymmetry between these suffixes then, of
the 16, the four diagonal components vanish and the remaining 12 form six
pairs of equal magnitude and opposite sign. Thus, only six independent components are
present. In $R_{abcd}$ the antisymmetry in $cd$ leads to six independent components and
the antisymmetry in $ab$ leads to six independent components corresponding to
each independent component due to $cd$. Thus, the double antisymmetry leads to
6 x 6 = 36 independent components.
We also perceive another symmetry property. If the pair $ab$ is interchanged
with the pair $cd$, the expression is unaltered: $R_{abcd} = R_{cdab}$. Now the pairs $ab$ and
$cd$ have led us to 36 components. Of these 36, six are diagonal components and 30
are nondiagonal. These 30 nondiagonal components give 15 pairs of equal
components. Thus, the total number of independent components further reduces
from 36 to 15 + 6 = 21.
Moreover, we can check that $R_{abcd} + R_{acdb} + R_{adbc} = 0$, which is essentially
only one relation, $R_{1234} + R_{1342} + R_{1423} = 0$. This reduces the number of
independent components by 1 and so we ultimately find that, out of 256
components, only 20 are independent.
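
The counting argument can be retraced in a few lines. The sketch below follows the three steps of the text; the extension to a general dimension `n`, and in particular the statement that the cyclic relation contributes one condition for each choice of four distinct suffixes, goes beyond what is shown in the text and is an assumption of this example, as is the function name.

```python
from math import comb

def riemann_independent_components(n):
    """Count independent components of R_abcd in n dimensions, step by step."""
    pairs = comb(n, 2)                               # antisymmetric pair ab (or cd): 6 for n = 4
    after_antisymmetry = pairs * pairs               # 6 x 6 = 36 for n = 4
    after_pair_symmetry = pairs * (pairs + 1) // 2   # symmetric under ab <-> cd: 21 for n = 4
    cyclic_relations = comb(n, 4)                    # one relation per set of 4 distinct suffixes
    return (after_antisymmetry,
            after_pair_symmetry,
            after_pair_symmetry - cyclic_relations)

counts = riemann_independent_components(4)
```

For $n = 4$ this reproduces the sequence 36, 21, 20 obtained above.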

21. Bianchi Identities and the Ricci Tensor

The Bianchi identities are

$$R_{abcd;e} + R_{abde;c} + R_{abec;d} = 0.$$

In natural coordinates at a point $P$, we write

$$R_{abcd} = \tfrac{1}{2}\left(g_{ad,bc} + g_{bc,ad} - g_{ac,bd} - g_{bd,ac}\right).$$

Again, in natural coordinates $\Gamma^i_{kl}$ vanish and so covariant differentiation is
equivalent to ordinary differentiation. We can write

$$R_{abcd;e} = \tfrac{1}{2}\left(g_{ad,bce} + g_{bc,ade} - g_{ac,bde} - g_{bd,ace}\right)$$

and now it is easy to check that

$$R_{abcd;e} + R_{abde;c} + R_{abec;d} = 0.$$

Since this is a tensor identity, its holding good in one coordinate system ensures
its universal validity.
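In order to define the Ricci tensor, let us revert to the general coordinate system
and write

$$R^p{}_{abc} = -\Gamma^p_{ab,c} + \Gamma^p_{ac,b} - \Gamma^n_{ab}\Gamma^p_{nc} + \Gamma^n_{ac}\Gamma^p_{nb}.$$

Setting $p = c$, we reduce the rank of the tensor to two and call it $R_{ab}$:

$$R_{ab} = R^p{}_{abp} = -\Gamma^p_{ab,p} + \Gamma^p_{ap,b} - \Gamma^n_{ab}\Gamma^p_{np} + \Gamma^n_{ap}\Gamma^p_{nb}.$$

$R_{ab}$ is called the Ricci tensor and we can also write it as

$$R_{ab} = -\Gamma^p_{ab,p} + \frac{\partial^2}{\partial x^a\,\partial x^b}\log\sqrt{-g} + \Gamma^n_{ap}\Gamma^p_{nb} - \Gamma^n_{ab}\frac{\partial}{\partial x^n}\log\sqrt{-g}.$$

We also define a scalar $R$ as $g^{ab}R_{ab} = R^a{}_a$ and call it the Ricci scalar.

22. The Einstein Tensor and the Field Equations of Gravitation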

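The Einstein tensor $G^{ab}$ is built out of the Ricci tensor and the Ricci scalar $R$ in the
following manner:

$$G^{ab} = R^{ab} - \tfrac{1}{2}g^{ab}R.$$

The reason for building up such a combination will be obvious when we prove
that

$$G^{ab}{}_{;b} = 0, \quad \text{i.e.} \quad \left(R^{ab} - \tfrac{1}{2}g^{ab}R\right)_{;b} = 0.$$

We now prove this. We have the Bianchi identities

$$R_{abcd;e} + R_{abde;c} + R_{abec;d} = 0.$$

We raise the first two suffixes and rewrite

$$R^{ab}{}_{cd;e} + R^{ab}{}_{de;c} + R^{ab}{}_{ec;d} = 0.$$

Now we carry out a double contraction and set $a = d$ and $b = c$:

$$
\begin{aligned}
R^{ab}{}_{ba;e} + R^{ab}{}_{ae;b} + R^{ab}{}_{eb;a} &= 0,\\
R^{b}{}_{b;e} - R^{ab}{}_{ea;b} - R^{ba}{}_{eb;a} &= 0,\\
R_{;e} - R^{b}{}_{e;b} - R^{a}{}_{e;a} &= 0,\\
2R^{a}{}_{e;a} - \delta^a_e\,R_{;a} &= 0,\\
\left(R^{a}{}_{e} - \tfrac{1}{2}\delta^a_e\,R\right)_{;a} &= 0.
\end{aligned}
$$

On raising the suffix $e$, we find

$$\left(R^{ae} - \tfrac{1}{2}g^{ae}R\right)_{;a} = 0 \quad \text{or} \quad G^{ae}{}_{;a} = 0.$$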

Now $G^{ab}{}_{;b} = 0$ is a divergence equation and we know that the vanishing of the divergence of
a physical quantity implies conservation of that quantity. A distribution of matter and energy will produce a gravitational field and the Riemannian
geometry that describes this gravitational field will depend on this source. We
know that physically matter and energy can be described by a second-rank
tensor; also, that matter energy is conserved so the second-rank matter tensor
must have its divergence zero. In Riemannian geometry, we do have a geometrical
object $G^{ab}$ whose divergence vanishes. Here is the link between geometry and the
sources of the gravitational field. Einstein postulated his law of gravitation as

$$G^{ab} = -kT^{ab},$$

where $k$ is a constant and $T^{ab}$ is the matter energy tensor.
On comparing his equations with the Newtonian equations of gravity, he found that
if the gravitational field is weak enough and if the velocities involved are much
smaller than the velocity of light then, to that order of approximation, his
theory gives the exact Newtonian Poisson equation, provided

$$k = \frac{8\pi G}{c^4}.$$

Thus, if there are strong gravitational fields and if high velocities are involved,
one must use the full Einstein equations

$$R^{ab} - \tfrac{1}{2}g^{ab}R = -\frac{8\pi G}{c^4}T^{ab}.$$

In space devoid of matter, $T^{ab} = 0$ and we get

$$R^{ab} - \tfrac{1}{2}g^{ab}R = 0; \qquad g_{ab}R^{ab} = \tfrac{1}{2}g_{ab}\,g^{ab}R,$$

$$R = 2R \quad \text{or} \quad R = 0,$$

therefore $R^{ab} = 0$.
Einstein's equations of gravitation in empty space are thus $R_{ab} = 0$.

23. Matter Tensor for a Perfect Fluid


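We shall end this chapter by finding a simple expression for the matter tensor $T^{ik}$
representing a perfect fluid.
From any general coordinate system $x^i$, we transform to a locally inertial
system $x_0^a$. In this system
$$(g^{ik})_0 = \mathrm{diag}(-1, -1, -1, 1) \quad (c = 1).$$
In the neighbourhood of the point where this $x_0^a$ system holds, the perfect fluid is
characterized by its density $\rho$ and a pressure $p$ that is equal in all directions. Therefore, the
matter tensor in this system has
$$T_0^{11} = T_0^{22} = T_0^{33} = p, \qquad T_0^{44} = \rho, \qquad T_0^{ab} = 0 \quad (a \neq b).$$
Also, in this local frame the fluid is at rest so that
$$v_0^a = \frac{dx_0^a}{ds}: \qquad v_0^1 = v_0^2 = v_0^3 = 0, \qquad v_0^4 = 1.$$
Now we are ready to find $T^{ik}$ in the general coordinate system $x^i$. We have the
transformation $x_0^a \to x^i$ giving
$$T^{ik} = \frac{\partial x^i}{\partial x_0^a}\frac{\partial x^k}{\partial x_0^b}\,T_0^{ab}
= \frac{\partial x^i}{\partial x_0^4}\frac{\partial x^k}{\partial x_0^4}\,\rho
+ \left(\frac{\partial x^i}{\partial x_0^1}\frac{\partial x^k}{\partial x_0^1}
+ \frac{\partial x^i}{\partial x_0^2}\frac{\partial x^k}{\partial x_0^2}
+ \frac{\partial x^i}{\partial x_0^3}\frac{\partial x^k}{\partial x_0^3}\right)p,$$
but the law of transformation of $g^{ik}$ gives
$$g^{ik} = \frac{\partial x^i}{\partial x_0^4}\frac{\partial x^k}{\partial x_0^4}
- \left(\frac{\partial x^i}{\partial x_0^1}\frac{\partial x^k}{\partial x_0^1}
+ \frac{\partial x^i}{\partial x_0^2}\frac{\partial x^k}{\partial x_0^2}
+ \frac{\partial x^i}{\partial x_0^3}\frac{\partial x^k}{\partial x_0^3}\right).$$
Also, in the general coordinate system the velocity of the fluid is $v^i$; then
$$v^i = \frac{\partial x^i}{\partial x_0^a}\,v_0^a = \frac{\partial x^i}{\partial x_0^4},$$
hence
$$T^{ik} = v^i v^k \rho + p\,(v^i v^k - g^{ik}) = (\rho + p)\,v^i v^k - p\,g^{ik}.$$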
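As an illustrative check (a minimal sketch, not part of the original text), the final formula can be evaluated in the locally inertial frame, where it should return the diagonal form assumed at the start of the derivation. The function name `perfect_fluid_tensor` and the sample values of density and pressure are assumptions made only for this example.

```python
# Signature (-, -, -, +) with c = 1; index order (x^1, x^2, x^3, x^4 = t).
G_INV_LOCAL = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def perfect_fluid_tensor(rho, p, v, g_inv):
    """Return T^{ik} = (rho + p) v^i v^k - p g^{ik} as a nested list."""
    n = len(v)
    return [
        [(rho + p) * v[i] * v[k] - p * g_inv[i][k] for k in range(n)]
        for i in range(n)
    ]


# Fluid at rest in the local frame: v_0 = (0, 0, 0, 1).
rho, p = 2.0, 0.5  # hypothetical sample values
T_local = perfect_fluid_tensor(rho, p, [0.0, 0.0, 0.0, 1.0], G_INV_LOCAL)

for row in T_local:
    print(row)
```

The spatial diagonal entries come out equal to the pressure, the time-time entry equal to the density, and the off-diagonal entries vanish, as required.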

24. Exercises

1. Consider the line elements
$$\text{(1)}\quad ds^2 = -\left(1 - \frac{2M}{r}\right)^{-1}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2) + \left(1 - \frac{2M}{r}\right)dt^2,$$
$$\text{(2)}\quad ds^2 = 2\,du\,dr + \left(1 - \frac{2M}{r}\right)du^2 - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2).$$
(i) For (1) write down $g_{ik}$. Calculate $g$, calculate $g^{ik}$.
(ii) For (2) do the same as for (1).
(iii) (2) is a transform of (1). Find the equations of transformation.

2. Consider the line element (3)
$$\text{(3)}\quad ds^2 = 2(du + a\sin^2\alpha\,d\beta)\,dt - \left(1 + \frac{2mr}{r^2 + a^2\cos^2\alpha}\right)(du + a\sin^2\alpha\,d\beta)^2 - (r^2 + a^2\cos^2\alpha)(d\alpha^2 + \sin^2\alpha\,d\beta^2), \qquad r = t - u.$$
This is the Kerr metric.
Write down $g_{ik}$, $g$, $g^{ik}$.
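3. The line element
$$ds^2 = dt^2 - R^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)\right]$$
is to be transformed to the form
$$ds^2 = dt^2 - R^2(t)\,e^{\mu}\left[d\bar r^2 + \bar r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)\right], \qquad \mu = \mu(\bar r).$$
Find the law of transformation as well as the function $\mu(\bar r)$.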


4. Show that the special relativity line element
$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2$$
can be transformed to the form
$$\text{(4)}\quad ds^2 = 2\,du\,dv - dy^2 - dz^2; \qquad u = x^1,\ y = x^2,\ z = x^3,\ v = x^4.$$
Write $g_{ik}$, $g$ and $g^{ik}$ for the line element (4).


5. Show that for (4)
(a)

.
VI

(1)2)",0,0')2

A )

is the velocity vector i.e. it is a unit time-like vector.


(b)

. (1)2/ 0, 0, )2A )

VI =

is a unit space-like vector.


(c)

Vi

is orthogonal

Vi.

(d)

Wi = (0, sin 8, cos 8, 0)


is a unit space-like vector and is orthogonal to


(e)

. (1J2Jc'

w' =

Vi.

J2

-cos e, sin e, A )

is a null vector and is orthogonal to Vi and Wi.

2. Introduction to Black Holes


C. V. VISHVESHWARA
Raman Research Institute, Bangalore 560 080, India

1. Preamble

One of the most interesting phenomena, which is entirely an outcome of the


general theory of relativity, is that of the black hole. Although the mathematical
existence of the black hole, as embodied in the exact solution of Schwarzschild,
dates back almost to the advent of general relativity itself, most of the relevant
aspects related to it were discovered essentially after the mid-sixties. These
include the geometric structure of the black hole, the physical phenomena
associated with this structure, the differences between static and rotating black
holes, perturbational effects, black hole thermodynamics, quantum field theory
in the black hole gravitational fields and the uniqueness of the black hole
solutions. These discoveries, incorporating several diverse areas of physics and
mathematics, have revealed an extraordinarily rich variety of phenomena
associated with black holes. In the following discussion, we shall consider the
basic, elementary structural aspects of static and rotating black holes - pointing
out the inherent differences between them - and the corresponding physical
phenomena. At the end, we shall touch upon a few other features of black holes
that would be of relevance to some of the subsequent articles.
2. The Schwarzschild Black Hole

Soon after Einstein proposed his general theory of relativity, Karl Schwarzschild
in 1916 derived the spacetime (or, equivalently, the line element or the metric) named after him. This is the Schwarzschild exterior metric, which is a vacuum
solution with the energy-momentum tensor set equal to zero in the Einstein field
equation. The line element can be written as
$$ds^2 = \left(1 - \frac{2m}{r}\right)dt^2 - \left(1 - \frac{2m}{r}\right)^{-1}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2), \tag{1}$$
with time $t$ ranging from $-\infty$ to $+\infty$, the radial coordinate $r$ from 0 to $\infty$ and the
polar coordinates $\theta$ and $\varphi$, respectively, from 0 to $\pi$ and from 0 to $2\pi$. This metric
is spherically symmetric, as indicated by the angular part of the metric and
by the fact that $g_{00}$ and $g_{11}$ are functions of $r$ alone and do not depend on $\theta$ and
$\varphi$. The only constant parameter appearing in the metric is $m$, which can be

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 31-42.
© 1989 by Kluwer Academic Publishers.

identified with the mass $M$ producing the gravitational field. We have
$$m = GM/c^2 \tag{2}$$
and, of course, we use the geometrized units $G = c = 1$, in which case $m$ is
measured in centimeters.
Notice the following properties of the metric.

(i) If we set $m = 0$, we get flat space, as should happen in the absence of
a gravitational source.
(ii) The metric components are independent of time, $g_{ab,t} = 0$. Further, there
are no mixed terms between time and space, i.e. $g_{\mu 0} = 0$ ($\mu = 1, 2, 3$). This
absence of terms like $dt\,d\varphi$ ($g_{t\varphi} \equiv g_{03} = 0$) indicates that the source is not
rotating and, therefore, there is no rotation inherent to the spacetime
either. Spacetimes which exhibit these two properties are called 'static'.
(iii) As $r$ tends to infinity, the metric becomes that of flat spacetime. This is
asymptotic flatness. As we go farther from the isolated source of mass $m$,
the gravitational field progressively diminishes to zero.
(iv) There exists a rigorous theorem known as the Birkhoff theorem (1923),
which states that any spherically symmetric vacuum solution of Einstein's
equations is necessarily the Schwarzschild solution. Consequently, the
outside of the following mass configurations is described by the Schwarzschild exterior metric: (a) a static spherical mass distribution, (b) a spherical
mass distribution undergoing radial pulsations, (c) a spherical mass
distribution in radial collapse (or explosion), (d) a spherical nonrotating
black hole. Here the metric is considered to be valid down to the origin
$r = 0$.
Now let us consider the last of the situations above. At r = 0, the metric
components become pathological. This has been found to be a true singularity.
Scalars computed out of the curvature tensor blow up there. On the other hand,
at $r = 2m$, $g_{00}$ becomes zero and $g_{11}$ becomes infinite. Once upon a time this
surface had been dubbed 'the Schwarzschild singularity'. It was not taken
seriously because the numerical value of 2m is quite small for any ordinary matter
distribution and the surface lies well within the matter. For instance, 2m is about
a mere 3 km in the case of the Sun. Within the matter distribution the metric is not
the one given by Equation (1) and r = 2m has no special significance. It has,
indeed, a special significance only if it is in the matter-free region. This happens as
a result of continued gravitational collapse, as in the case of a star or stellar core of
mass greater than about three times that of the sun which has exhausted its
nuclear fuel and in which no force can withstand the enormous inward
gravitational pull. As the star collapses unchecked, the $r = 2m$ surface is exposed and
the mass finally reaches the singularity $r = 0$, being progressively compactified in
the process. No one knows the ultimate fate of the collapsed matter, although
quantum effects are expected to prevent the actual formation of the singularity.
For our purpose, however, the important event is the collapse through the surface
$r = 2m$, which can no longer be ignored as was done previously. The pathology
exhibited by the metric components on this surface is only a coordinate effect
similar to what happens in the case of polar coordinates at the poles. Curvature
scalars are well behaved at r = 2m. Furthermore, coordinates can be found in
terms of which no undesirable features are displayed at this surface. Such
coordinates were discovered by Kruskal and Szekeres. On the other hand,
strikingly interesting properties are exhibited by this surface and it is identified
with the static or the nonrotating black hole.
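The figure of about 3 km quoted above for the Sun can be checked numerically. The sketch below restores the constants through $2m = 2GM/c^2$; the rounded SI values of the constants and the function name `schwarzschild_radius` are assumptions of this example rather than anything given in the text.

```python
# Rounded SI constants (assumed values for this illustration).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg


def schwarzschild_radius(mass_kg):
    """Return r = 2m = 2GM/c^2 in metres for a mass given in kilograms."""
    return 2.0 * G * mass_kg / C**2


r_sun_km = schwarzschild_radius(M_SUN) / 1e3
r_three_suns_km = schwarzschild_radius(3.0 * M_SUN) / 1e3

print(f"2m for one solar mass:    {r_sun_km:.2f} km")
print(f"2m for three solar masses: {r_three_suns_km:.2f} km")
```

Because $2m$ scales linearly with mass, a collapsing core of three solar masses has $r = 2m$ at roughly 9 km, still tiny compared with the size of an ordinary star.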
3. Properties of the Schwarzschild Black Hole

We shall now consider three basic properties of the Schwarzschild (or the static or
the nonrotating) black hole.

(i) Static Limit


Consider the metric component $g_{00} = 1 - 2m/r$. For $r > 2m$, $g_{00} > 0$ and,
correspondingly, $t$ is a legitimate time coordinate. But for $r < 2m$, $g_{00} < 0$ and,
therefore, $t$ can no longer measure time. In this region a new time coordinate (a mixture of $t$ and $r$) will have to be defined. The metric will then no longer be
independent of this new time and, hence, the spacetime will cease to be static.
Because of this, the surface $r = 2m$ is called the 'static limit'. A related
consequence is as follows. Outside the static limit we can define 'static' particles
with $(r, \theta, \varphi)$ = constant and only the time $t$ changing. This is possible only up to the
static limit, within which $t$ loses its character of being time. Within the static limit, objects
must have both their $t$ and $r$ coordinates changing; that is, they are
necessarily in a state of fall. We shall make these statements a little more
precise and coordinate independent as follows.
Consider the vector field $\xi^a = \delta^a_0 \equiv (1, 0, 0, 0)$ defined at every point. This
defines translation along time $t$, which leaves the metric unchanged, since the metric is
independent of $t$. Therefore, $\xi^a$ is a vector defining a direction of symmetry, the
motion along which leaves the spacetime geometry unaltered. Such a vector is
called a Killing vector. Consider then the four-dimensional square $\xi^2$ of $\xi^a$, which
is a scalar and, therefore, coordinate independent but coincides with $g_{00}$ in the
Schwarzschild coordinates. That is,
$$\xi^2 \equiv g_{ab}\,\xi^a\xi^b = g_{00} = 1 - \frac{2m}{r}. \tag{3}$$
We can speak of $\xi^2$ without reference to any coordinate system; but it is
convenient to refer to the Schwarzschild coordinates to extract specific information. We note
$$\xi^2 \begin{cases} > 0; & \xi^a \text{ timelike for } r > 2m; \quad g_{00} > 0, \\ = 0; & \xi^a \text{ null on } r = 2m; \quad g_{00} = 0, \\ < 0; & \xi^a \text{ spacelike for } r < 2m; \quad g_{00} < 0. \end{cases} \tag{4}$$

Thus, the static limit is the surface on which the time-like Killing vector becomes
null. For $r > 2m$, we can define static particles (sources, observers and so on) with
four-velocities following the Killing direction
$$u^a = \xi^a/(\xi^b\xi_b)^{1/2}. \tag{5}$$
This is not possible on or within the static limit, where $\xi^2 \le 0$.
The above discussion shows that the Schwarzschild black hole $r = 2m$ is the
static limit on which the Killing vector, which is asymptotically ($r \to \infty$) time-like,
becomes null.
(ii) Infinite Redshift Surface
Suppose we consider in any spacetime an observer with a four-velocity $u^a$. Let
him encounter a particle moving with four-momentum $p^a$. Then the energy of the
particle, as measured by the observer, is given by
$$E = u^a p_a. \tag{6}$$
For instance, in flat spacetime a static observer has $u^a = (1, 0, 0, 0)$ and $p^a = (E, \mathbf{p})$
and the above statement is true. The energy measured depends on the state of
motion of the observer as given by $u^a$ and changes according to Equation (6). This
formula is true in the local elemental flat spacetime and, hence, in any coordinate
system of an arbitrary spacetime, since it is a scalar equation.
Now let us specialize to the static observers and sources following the Killing
vector direction, for whom $u^a = \xi^a/(\xi^b\xi_b)^{1/2}$ as we have seen. Let us also assume
that $p^a$ is the four-momentum of a geodesic so that with proper parametrization
$$p^a{}_{;b}\,p^b = 0. \tag{7}$$

We have seen that $\xi^a$ defines the symmetry of spacetime. Whenever we have a
symmetry, there is a conserved quantity. In mechanics, corresponding to an
ignorable coordinate $x^{(a)}$, the conjugate momentum $p_{(a)}$ is a constant. A similar
situation exists here. Along the geodesic, the scalar $\xi^a p_a$ = constant. To prove this
we use the Killing equations satisfied by $\xi^a$, as will be shown
by A. K. Raychaudhuri in Chapter 8 of this volume. These equations are
$$\xi_{a;b} + \xi_{b;a} = 0. \tag{8}$$
Taking the directional derivative of $\xi^a p_a$ along the geodesic tangent $p^a$, we have
$$(\xi_a p^a)_{;b}\,p^b = \xi_{a;b}\,p^a p^b + \xi_a\,p^a{}_{;b}\,p^b. \tag{9}$$
The first term vanishes by virtue of Equation (8) ($\xi_{a;b}$ is antisymmetric, but $p^a p^b$ is
symmetric and the contraction between the two gives zero) and the second term is
zero because of the geodesic Equation (7). Thus, $\xi^a p_a$ = constant along the
geodesic.
Suppose a particle, say a photon following a geodesic with four-momentum $p^a$,
is emitted at a point 1 by a static source with four-velocity $u_1^a$ and is observed at
point 2 by a static observer with four-velocity $u_2^a$. Then the ratio of the energies
measured at these two points (by static observers) is
$$E_1/E_2 = (u^a p_a)_1/(u^a p_a)_2 = \frac{(\xi^a p_a)_1/(\xi^b\xi_b)_1^{1/2}}{(\xi^a p_a)_2/(\xi^b\xi_b)_2^{1/2}} = (\xi^b\xi_b)_2^{1/2}/(\xi^b\xi_b)_1^{1/2}, \tag{10}$$
since $\xi^a p_a$ = constant. Identifying $E = h\nu$, we get
$$\nu_o/\nu_s = \left(1 - \frac{2m}{r_s}\right)^{1/2} \Big/ \left(1 - \frac{2m}{r_o}\right)^{1/2}, \tag{11}$$
where o and s stand for observer and source, respectively. This is the gravitational
redshift formula for static sources and observers in the Schwarzschild spacetime.
If $r_o$ is kept finite and larger than $2m$, as $r_s$ approaches $2m$, we see that $\nu_o$ goes to
zero. In other words, as the static source approaches the black hole, the redshift
tends to become infinite in the limit.
The black hole is therefore an infinite redshift surface for static sources and
observers.
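The approach to infinite redshift can be seen by evaluating Equation (11) for a sequence of source positions. This is a minimal sketch in geometrized units ($G = c = 1$); the function name `redshift_ratio` and the sample radii are assumptions chosen for illustration.

```python
import math


def redshift_ratio(m, r_source, r_observer):
    """Return nu_o / nu_s for a static source and a static observer, Eq. (11)."""
    if r_source <= 2 * m or r_observer <= 2 * m:
        raise ValueError("static sources and observers exist only for r > 2m")
    return math.sqrt((1 - 2 * m / r_source) / (1 - 2 * m / r_observer))


m = 1.0
r_observer = 100.0  # hypothetical distant static observer
for r_source in (10.0, 3.0, 2.1, 2.001, 2.000001):
    ratio = redshift_ratio(m, r_source, r_observer)
    print(f"r_s = {r_source:<9} nu_o/nu_s = {ratio:.6f}")
```

The printed ratio decreases monotonically toward zero as the source radius approaches $2m$, which is the infinite-redshift behaviour described above.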
(iii) One-Way Membrane
The property of static limit showed the impossibility of defining static particles
within the black hole. This, as we found, is directly related to the idea of infinite
redshift. However, the most important property of the black hole - from which it
derives its name - is that material particles and light can enter it but cannot come
out again. We shall now discuss this defining property of the black hole.
Consider a surface given by the equation
$$f(x^b) = \text{constant}. \tag{12}$$
Here $f$ is any function of the spacetime coordinates represented by $x^b$. The normal
to the surface $n_a$ is given by the gradient of the function evaluated on the surface.
So
$$n_a = f_{,a}. \tag{13}$$
Then the square of the normal is given by
$$n^2 \equiv n^a n_a = g^{ab} f_{,a} f_{,b}. \tag{14}$$
The normal $n_a$ is time-like, space-like or null according as $n^2$ is greater than, less
than or equal to zero. Let us concentrate on the case when $n^2 = 0$. The surface
$f(x^b)$ = constant is then said to be a null surface. Let us now see the significance of
such a null surface.
Consider flat spacetime with Cartesian coordinates $(t, x, y, z)$. A wavefront
moving along the $x$ direction has the equation
$$f(x^b) = t - x = \text{constant}. \tag{15}$$

The normal to the wavefront is then given by
$$n_a = f_{,a} = (1, -1, 0, 0), \tag{16}$$
so that with the diagonal metric $(1, -1, -1, -1)$, we find
$$n^2 = (1)^2 - (-1)^2 = 0. \tag{17}$$
Therefore, the wavefront is a null surface. It can be shown that at every point on
the wavefront, the light cone is tangential to the surface, as shown in Figure 1. Any
time-like trajectory of a material particle confined to within the light cone can
cross the wavefront in only one direction. It cannot recross the wavefront in the
opposite direction. To do this the trajectory will have to turn around and go out
of the light cone. Physically, what this means is that once the wavefront has
crossed a material particle, the particle will have to travel faster than light in order
to catch up with the wavefront and recross it in the other direction. Equivalently,
a particle can cross a wavefront in only one direction. Therefore, the wavefront is
a 'one-way membrane'.

Fig. 1. [Figure not reproduced; label: world line of a material particle.]

The above property is true for any null surface. The light cone is tangential to it
and it behaves as a one-way membrane. Material particles can cross it in one
direction and cannot come out. In flat spacetime, only travelling wavefronts are
examples of null surfaces. When there is a gravitational field, the situation can be
different, as in the case of Schwarzschild spacetime.
Consider the family of surfaces defined by
$$f(x^b) = \xi^a\xi_a = \left(1 - \frac{2m}{r}\right) = \text{constant}. \tag{18}$$
These are two-dimensional spheres $r$ = constant (provided we also take the
section $t$ = constant). Each of these surfaces has the normal
$$n_a = (\xi^b\xi_b)_{,a} = \left(0, \frac{2m}{r^2}, 0, 0\right). \tag{19}$$
Then
$$n^2 = g^{ab}n_a n_b = -\left(1 - \frac{2m}{r}\right)\left(\frac{2m}{r^2}\right)^2. \tag{20}$$

When we set r = 2m, we see that the surface becomes null. It is like a spherical
wavefront frozen in space held in place by gravitation. Therefore, the black hole,
being a null surface, is a one-way membrane. Particles can go in but cannot come
out (including light which can also be seen from Figure 1). This is why it is called
a black hole.
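The sign of $n^2$ in Equation (20) can be tabulated on either side of $r = 2m$. The following is a small sketch in geometrized units; the function names and the sample radii are hypothetical, and the classification follows the convention stated above (normal time-like for $n^2 > 0$, space-like for $n^2 < 0$, null for $n^2 = 0$).

```python
def normal_squared(m, r):
    """Return n^2 = g^{ab} n_a n_b for the surface r = constant, Eq. (20)."""
    g_rr_inverse = -(1.0 - 2.0 * m / r)  # g^{11} in signature (+, -, -, -)
    n_r = 2.0 * m / r**2                 # only nonzero component of n_a, Eq. (19)
    return g_rr_inverse * n_r**2


def classify_normal(m, r):
    """Classify the normal to the surface r = constant by the sign of n^2."""
    n2 = normal_squared(m, r)
    if n2 == 0:
        return "null"
    return "time-like" if n2 > 0 else "space-like"


m = 1.0
for r in (3.0, 2.0, 1.0):
    print(f"r = {r}: n^2 = {normal_squared(m, r):+.4f}, normal is {classify_normal(m, r)}")
```

Only the surface $r = 2m$ has a null normal; outside it the normal is space-like and inside it the normal is time-like.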
From the foregoing we see that the nonrotating Schwarzschild black hole is
characterized by three properties. The first two, static limit and infinite redshift, are interrelated and are a consequence of the time-like Killing vector
becoming null on the black hole. The third property of one-way membrane is the
defining characteristic of the black hole and is the outcome of its being a null
surface. When rotation is introduced, these properties no longer coincide, as we
shall see in the case of the rotating Kerr black hole.

4. The Kerr Black Hole


The Kerr spacetime, which incorporates the rotating black hole, was discovered
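in 1963 by R. P. Kerr. The line element is commonly written in the Boyer-Lindquist coordinates as
$$ds^2 = \left(1 - \frac{2mr}{\Sigma}\right)dt^2 - \frac{4mar\sin^2\theta}{\Sigma}\,dt\,d\varphi - \left[(r^2 + a^2)\sin^2\theta + \frac{2ma^2 r\sin^4\theta}{\Sigma}\right]d\varphi^2 - \frac{\Sigma}{\Delta}\,dr^2 - \Sigma\,d\theta^2, \tag{21}$$
where $\Sigma = r^2 + a^2\cos^2\theta$ and $\Delta = r^2 - 2mr + a^2$.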


We note some of the salient properties of the above metric.
(i) The metric components are time-independent, $g_{ab,t} = 0$. That is, the
spacetime admits a time-like Killing vector $\xi^a = \delta^a_0$. There exists a cross-term
between time and space coordinates, namely $dt\,d\varphi$ ($g_{03} \neq 0$), which cannot be
transformed away. This is the term that incorporates the rotation inherent to the
spacetime. Such spacetimes that are independent of time but have rotational
terms are called 'stationary'.
(ii) The metric components are independent of the coordinate $\varphi$. This means
that the spacetime is axially symmetric. Correspondingly, there exists an axial
Killing vector $\eta^a = \delta^a_3$.
(iii) As r tends to infinity, the metric becomes asymptotically flat, as should be
the case with gravitational fields due to isolated sources.

(iv) If we set $a = 0$, we not only get $g_{03} = 0$ (rotation goes to zero), but the
metric actually becomes Schwarzschild. Therefore, the rotation of spacetime
depends on the parameter $a$.
(v) In 1918, Lense and Thirring showed that, in the first-order approximation, the exterior metric due to a spinning sphere of constant
density was
$$ds^2 = (\text{Schwarzschild line element}) + \frac{4GJ}{c^3 r}\sin^2\theta\,(c\,dt)\,d\varphi, \tag{22}$$
where $J$ is the angular momentum of the spinning sphere. Now expanding the
Kerr metric to first order in $a$, we find that
$$ds^2 = (\text{Schwarzschild line element}) - \frac{4ma}{r}\sin^2\theta\,dt\,d\varphi. \tag{23}$$
Therefore, comparing magnitudes (the relative sign only fixes the sense of rotation), one can identify
$$ma = \frac{GJ}{c^3}, \tag{24}$$
or $a$ = angular momentum per unit mass (in units with $G = c = 1$).
(vi) By similar identification (zeroth-order terms), we find that $m$ is the mass
parameter as before. We have used this fact already in identifying $a$.
5. The Black Hole and the Ergosphere

We shall adopt a line of reasoning similar to that followed in the case of the
Schwarzschild black hole, and look for surfaces with the properties it exhibits.
(i) Stationary Limit
In the Kerr metric, the time-like Killing vector is given by $\xi^a = \delta^a_0$ and its square by
$$\xi^2 = g_{00} = 1 - \frac{2mr}{\Sigma}. \tag{25}$$
The stationary limit (we use the word 'stationary' instead of 'static' because of the
rotation inherent to the spacetime) is the surface where the coordinate $t$ changes
from time-like to space-like. This is where the Killing vector $\xi^a$ becomes null. The
stationary limit is thus given by $\xi^2 = 0$ or $r^2 - 2mr + a^2\cos^2\theta = 0$, that is
$$r_{\mathrm{osl},\,\mathrm{isl}} = m \pm (m^2 - a^2\cos^2\theta)^{1/2}. \tag{26}$$
The subscripts 'osl' and 'isl' stand for 'outer stationary limit' and 'inner stationary
limit', respectively. We can concentrate on the outer stationary limit in our
discussions and ignore the inner one.

(ii) Infinite Redshift Surface
As in the case of the Schwarzschild metric, stationary sources and observers can
be defined up to the stationary limit, following the Killing direction with four-velocities $u^a = \xi^a/(\xi^b\xi_b)^{1/2}$. With respect to these, the redshift is once again given
by
$$\nu_o/\nu_s = (\xi^b\xi_b)_s^{1/2}/(\xi^b\xi_b)_o^{1/2}. \tag{27}$$
As the location of the source approaches the stationary limit, $\xi^2 = 0$
or $g_{00} = 0$,
infinite redshift occurs.
The above two properties follow exactly as in the case of the Schwarzschild
metric. But can the stationary limit be identified with the black hole in the present
case? If so, it should be a null surface with the one-way membrane property,
things going in but not coming out. It is easy to compute the normal to the
stationary limit and check whether the normal is, in fact, a null
vector. The answer is in the negative. The stationary limit is not a null surface. It is
not the black hole. We therefore now seek a null surface which can be identified
with the black hole.
(iii) One-way Membrane: The Black Hole
Consider a surface given by the equation $f(r, \theta)$ = constant. We require the
surface to display the same symmetries as the spacetime and be independent of
$t$ and $\varphi$. The normal to the surface is then given by the gradient
$$n_a = f_{,a} = (0, f_r, f_\theta, 0). \tag{28}$$
Then
$$n^2 = g^{ab}n_a n_b = g^{11}(f_r)^2 + g^{22}(f_\theta)^2 = -\frac{\Delta}{\Sigma}(f_r)^2 - \frac{1}{\Sigma}(f_\theta)^2, \tag{29}$$
since $g^{11} = 1/g_{11}$ and $g^{22} = 1/g_{22}$.
Setting $n^2 = 0$, cancelling $\Sigma \neq 0$, and noting that $f_\theta \neq 0$ would lead to a nonperiodic
function $e^{c\theta}$ as the angular part of $f$, which we do not admit, we finally have
$\Delta f_r^2 = 0$. This has the solution $\Delta = 0$. Since $\Delta = r^2 - 2mr + a^2$, $f_\theta = 0$ as
required. Therefore, the null surface or the one-way membrane or the black hole
is given by the solution to the equation $\Delta \equiv r^2 - 2mr + a^2 = 0$:
$$r_\pm = m \pm (m^2 - a^2)^{1/2}. \tag{30}$$
For all practical purposes, we are concerned with the outer sheet $r_+ = m + (m^2 - a^2)^{1/2}$. The black hole is also referred to as the event horizon, because no
information regarding any event occurring within it can be communicated to the
outside world by means of either material particles or light. Since $r_-$ is within $r_+$,
for the outside world $r_+$ is the black hole. As we had promised, we have shown
how the properties of the stationary limit and the one-way membrane separate
out because of rotation.

Note that the black hole can exist only if $a \le m$. For $a > m$, there is no black
hole and the true singularity at r = 0 will be visible to the external world. Such
a singularity is known as a naked singularity.
When a = 0, r + = 2m and coincides with the outer stationary limit, while r _
and the inner stationary limit both go to zero. We have, of course, recovered the
Schwarzschild case.
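To see how the stationary limit of Equation (26) and the horizon of Equation (30) separate, both can be evaluated for sample parameters. This is a minimal sketch in geometrized units; the function names and the values $m = 1$, $a = 0.6$ are assumptions made for illustration.

```python
import math


def kerr_horizons(m, a):
    """Return (r_minus, r_plus), the roots of r^2 - 2mr + a^2 = 0, Eq. (30)."""
    if abs(a) > m:
        raise ValueError("no horizon for a > m (naked singularity)")
    root = math.sqrt(m * m - a * a)
    return m - root, m + root


def stationary_limits(m, a, theta):
    """Return (inner, outer) stationary limits at polar angle theta, Eq. (26)."""
    root = math.sqrt(m * m - (a * math.cos(theta)) ** 2)
    return m - root, m + root


m, a = 1.0, 0.6  # hypothetical sample parameters
r_minus, r_plus = kerr_horizons(m, a)
for theta in (0.0, math.pi / 4, math.pi / 2):
    r_osl = stationary_limits(m, a, theta)[1]
    print(f"theta = {theta:.3f}: r_osl = {r_osl:.4f}, r_+ = {r_plus:.4f}, "
          f"ergosphere width = {r_osl - r_plus:.4f}")
```

With these numbers the outer stationary limit touches the horizon on the axis ($\theta = 0$) and lies farthest outside it, at $r = 2m$, in the equatorial plane; setting `a = 0.0` makes the two surfaces coincide at $r = 2m$ for every angle.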
The surfaces we have come across are indicated schematically in Figure 2. The
region between the outer stationary limit and the black hole has been christened the
'ergosphere', 'ergo' standing for work. For all appearances, the black hole can
only absorb things, nothing can come out and, hence, it is nothing more than
a cold, dead entity. The indication that this is not true, that the black hole is more
alive than it seems to be, came with the discovery of the possible energy extraction
from the black hole. This process - the Penrose process - takes place in the
ergosphere and is only possible because ofthe existence of this region. Hence, the
name 'ergosphere'. We shall now briefly outline the principle behind the Penrose
process.
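Fig. 2. [Schematic, not reproduced: the surfaces of the Kerr spacetime, with the axis $\theta = 0$, the equatorial plane $\theta = \pi/2$ and the ergosphere marked.]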

6. The Penrose Process

As we have already seen, along a geodesic with four-momentum $p^a$ the quantity
$\xi^a p_a$ is conserved and is proportional to the energy $E$. During processes like the
collision of particles following geodesics, conservation of energy is formulated by
the use of this quantity. Let a particle moving along a geodesic with four-momentum
$p_1^a$ enter the ergosphere from outside and split into two particles
following geodesics with four-momenta $p_2^a$ and $p_3^a$. After the split, particle 3 falls
into the black hole and particle 2 escapes out of the ergosphere. Applying energy
conservation at the point of split, we have
$$(\xi^a p_a)_1 = (\xi^a p_a)_2 + (\xi^a p_a)_3, \qquad E_1 = E_2 + E_3. \tag{31}$$
When $\xi^a$ is time-like, the energy $E$ has to be necessarily positive. Therefore, we must
have $E_1 > 0$ and, since particle 2 goes out of the ergosphere, $E_2 > 0$. On the other hand,
since particle 3 does not come out to the region where $\xi^a$ is time-like at all, there is no
such condition on $E_3$. Therefore, if $p_3^a$ is such that $E_3 < 0$, then we can have
$E_2 > E_1$, or the particle comes out with more energy than that possessed by the
particle that went in. There do exist geodesics with $E_3 < 0$ and, in principle,
energy extraction is possible. During the process, the black hole slows down: the
excess energy comes at the expense of its rotational energy. Energy can be
extracted until the rotation goes to zero and the spacetime becomes Schwarzschild with no ergosphere. Although this energy extraction is possible, it has been
shown to be astrophysically untenable. But as a physical process associated with
a rotating black hole, it is a very interesting one indeed.

7. Charged Black Holes

So far we have discussed black holes that are uncharged. Corresponding to the
two spacetimes we have considered there are two charged versions that are
solutions to the Einstein-Maxwell equations. The charged version of the
Schwarzschild spacetime is known as the Reissner-Nordström spacetime with
the metric
$$ds^2 = \left(1 - \frac{2m}{r} + \frac{Q^2}{r^2}\right)dt^2 - \left(1 - \frac{2m}{r} + \frac{Q^2}{r^2}\right)^{-1}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2), \tag{32}$$
where $Q$ is the charge.
Similarly, the charged version of the Kerr metric is known as the Kerr-Newman metric and is given simply by redefining $\Delta$ by $\Delta \equiv r^2 - 2mr + a^2 + Q^2$.
Properties of the charged black holes in these two cases can be studied following
the same lines as in the uncharged cases.
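Following those same lines, the null surface of a charged black hole is found from $\Delta = 0$ with the redefined $\Delta$. The sketch below solves that quadratic in geometrized units; the function name `charged_horizons` and the sample values are assumptions made for illustration, and setting `a = 0` gives the Reissner-Nordström case.

```python
import math


def charged_horizons(m, a, q):
    """Return (r_minus, r_plus), the roots of r^2 - 2mr + a^2 + Q^2 = 0."""
    discriminant = m * m - a * a - q * q
    if discriminant < 0:
        raise ValueError("no horizon: a^2 + Q^2 exceeds m^2")
    root = math.sqrt(discriminant)
    return m - root, m + root


def delta(m, a, q, r):
    """Evaluate Delta = r^2 - 2mr + a^2 + Q^2 at radius r."""
    return r * r - 2.0 * m * r + a * a + q * q


m, a, q = 1.0, 0.0, 0.6  # hypothetical Reissner-Nordstrom example
r_minus, r_plus = charged_horizons(m, a, q)
print(f"r_- = {r_minus:.4f}, r_+ = {r_plus:.4f}, Delta(r_+) = {delta(m, a, q, r_plus):.2e}")
```

As with rotation, adding charge pulls the outer horizon inside $r = 2m$, and no horizon exists once $a^2 + Q^2 > m^2$.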

8. Conclusion

As mentioned at the outset, a wealth of extraordinary results has emerged from


the study of black holes. Some aspects of black-hole thermodynamics and
quantum evaporation will be discussed later in this book. Physical processes
occurring in the black-hole field have been studied using perturbation theory, by
considering electromagnetic fields, neutrinos and gravitational radiation superposed on the background of Schwarzschild and Kerr geometries. It has been
shown that the Schwarzschild black hole is stable against such perturbations.
A complete rigorous proof of stability does not exist for the Kerr black holes, but
it is almost certain that they are stable as a number of results indicate, in spite of
rare, occasional false alarms. A striking result in the black-hole theory is their
uniqueness. We now possess complete proof, as a result of years of efforts, that the
only possible black holes are the Schwarzschild, Kerr and their charged versions.
In addition to all this, the black hole has been invoked generously in the realm of
astrophysics to explain highly energetic sources. Thus, objects that were once
either totally ignored or whose existence was strongly doubted, have come to
assume a reality of their own.

References
1. R. Adler, M. Bazin, and M. Schiffer, Introduction to General Relativity, McGraw-Hill (1975).
2. R. M. Wald, General Relativity, Univ. of Chicago Press (1984).

3. Black-Hole Thermodynamics and Hawking Radiation
B. R. IYER
Raman Research Institute, Bangalore 560 080, India

The most general known solution to Einstein's field equations that contains
black holes is that of the Kerr-Newman family, which describes axisymmetric
matter-free spacetimes and represents rotating and electrically charged black
holes. This solution is a three-parameter family labelled by total mass energy $M$,
angular momentum $J$ and charge $Q$. The line element is given by
$$ds^2 = \frac{\Delta}{\Sigma}(dt - a\sin^2\theta\,d\varphi)^2 - \frac{\sin^2\theta}{\Sigma}\left((r^2 + a^2)\,d\varphi - a\,dt\right)^2 - \frac{\Sigma}{\Delta}\,dr^2 - \Sigma\,d\theta^2, \tag{1a}$$
where
$$\Delta \equiv r^2 + a^2 - 2Mr + Q^2; \qquad \Sigma \equiv r^2 + a^2\cos^2\theta; \qquad J = Ma. \tag{1b}$$
Let us recall that the infinite redshift surface and the static limit coincide and are
located at
$$r = r_\infty = M + \sqrt{M^2 - a^2\cos^2\theta - Q^2}, \tag{2}$$
while the horizon is located at
$$r = r_h^+ = M + \sqrt{M^2 - a^2 - Q^2}. \tag{3}$$

In general, if $a \neq 0$, the horizon and static limit do not coincide and the region
in between is called the ergosphere, since energy can be extracted from this region
of the Kerr black hole, as was shown by Penrose [4]. In view of the variety of
configurations that can collapse and form a black hole, one would expect to find
equilibrium black holes with a large number of parameters characterizing the
various physical properties of these collapsing configurations. It turns out,
however, that when a body undergoes a gravitational collapse to form a black
hole, very little survives to tell the external observer what the black hole originally
consisted of. The effect of the collapse is to impose a type of coarse-graining when
only distant measurements are concerned. The object settles down rapidly to
a quasi-stationary state characterized by mass, angular momentum and charge.
Wheeler summarized this with his famous aphorism 'black holes have no hair'.
B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 43-49.
© 1989 by Kluwer Academic Publishers.

Black-hole baldness was partly Bekenstein's [2] PhD problem in a way.


Wheeler argued that if one transferred a piece of matter from the exterior into
a black hole, then, an external observer cannot uniquely determine the entropy of
the lost material from the associated change in mass, angular momentum and
charge of the black hole. Thus, for the external observer with no possible inside
information, the second law loses its predictive power. We say that it is
transcended. In trying to think of this problem, Bekenstein was impressed by
Christodoulou's results concerning the efficiency of physical processes involving
black holes. It was proved that the most efficient processes were associated with
the reversible changes of the black hole and the less efficient ones with an
irreversible increase of a quantity called the irreducible mass of the black hole.
The result rang a thermodynamic bell. Any doubts about the generality of this
result were dispelled by Hawking's area theorem that 'the surface area of the
boundary or horizon of any black hole cannot decrease and will always increase
in any dynamical process'. As we shall see below, the black hole area is
proportional to the square of the irreducible mass, so that Christodoulou's result
is a special case of the area theorem. The theorem thus points to a formal analogy
between the black-hole area and entropy, both of which increase. To proceed
further, we consider a Kerr-Newman black hole of mass M, angular momentum
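$J$ and charge $Q$. The area of the outer horizon at $r_h^+$ is seen to be
$$A = 4\pi(r_h^{+2} + a^2). \tag{4a}$$
For a Schwarzschild black hole, $a = Q = 0$, $r_h^+ = 2M$, and
$$A = 16\pi M^2. \tag{4b}$$
For two equilibrium states of the black hole differing by $\delta M$, $\delta J$ and $\delta Q$, from
Equation (4a) we have
$$\delta M = \frac{\kappa}{8\pi}\,\delta A + \Omega_h\,\delta J + \Phi_h\,\delta Q, \tag{5}$$
where
$$\kappa = \frac{(M^2 - a^2 - Q^2)^{1/2}}{r_h^{+2} + a^2}, \tag{6a}$$
$$\Omega_h = \frac{a}{r_h^{+2} + a^2}, \tag{6b}$$
$$\Phi_h = \frac{r_h^+\,Q}{r_h^{+2} + a^2}. \tag{6c}$$
$\kappa$, $\Omega_h$ and $\Phi_h$ are the surface gravity at the horizon, the angular velocity of the
horizon and the electric potential of the horizon, respectively.
For a Schwarzschild black hole
$$\kappa = \frac{1}{4M}. \tag{7}$$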
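Equation (5) can be checked numerically by comparing two nearby Kerr-Newman states. The following is a minimal sketch in geometrized units that uses a forward finite difference for $\delta A$; the function names, the sample state and the size of the perturbation are assumptions made for this illustration.

```python
import math


def horizon_quantities(M, J, Q):
    """Return (A, kappa, Omega_h, Phi_h) from Eqs. (4a) and (6a)-(6c)."""
    a = J / M                                   # J = M a, Eq. (1b)
    s = math.sqrt(M * M - a * a - Q * Q)
    r_plus = M + s                              # outer horizon, Eq. (3)
    denom = r_plus**2 + a * a
    area = 4.0 * math.pi * denom
    return area, s / denom, a / denom, r_plus * Q / denom


def first_law_residual(M, J, Q, dM, dJ, dQ):
    """Return (kappa/8pi) dA + Omega_h dJ + Phi_h dQ - dM for a small change."""
    area0, kappa, omega, phi = horizon_quantities(M, J, Q)
    area1 = horizon_quantities(M + dM, J + dJ, Q + dQ)[0]
    rhs = kappa / (8.0 * math.pi) * (area1 - area0) + omega * dJ + phi * dQ
    return rhs - dM


residual = first_law_residual(1.0, 0.5, 0.3, 1e-6, 2e-6, -1e-6)
print(f"first-law residual: {residual:.3e}")
```

The residual is of second order in the perturbation, so it shrinks quadratically as the step is reduced; for $J = Q = 0$ the surface gravity returned by `horizon_quantities` reduces to $1/4M$, in agreement with Equation (7).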

Physically, $\kappa$ can be understood as being the acceleration due to gravity at the
horizon. In the Kerr case, $\kappa$ is proportional to the magnitude of the acceleration
of a particle corotating at the horizon.
Consider a reversible process where the area is constant (isentropic!). The
extraction of angular momentum and charge then involves a decrease in mass.
This is what happens in the Penrose process, where an infalling particle extracts
angular momentum (charge). The ergosphere consequently shrinks and we can
only continue until both $J$ and $Q$ reduce to zero, so that we are left with
a Schwarzschild black hole of mass $M_{ir}$, the irreducible mass of the black hole.
Since the area is left constant, we obtain
$$r_h^{+2} + a^2 = (2M_{ir})^2 \quad\text{or}\quad M_{ir}^2 = \tfrac{1}{4}(r_h^{+2} + a^2), \quad\text{i.e.}\quad A = 16\pi M_{ir}^2. \tag{8}$$
In an irreversible process, $M_{ir}$ increases. Equation (5) resembles the first law of
thermodynamics, which gives the change in internal energy $du$ as the sum of the heat
transferred to the system and the work done on it:
$$du = T\,ds + dw. \tag{9}$$

Thus, if some multiple of $A$ is analogous to entropy, a multiple of $\kappa$ is analogous to
temperature, with the other two terms representing the work done on the system
in changing its angular momentum by $\delta J$ and charge by $\delta Q$. The $\delta A$ term cannot
be interpreted as work, since unlike the other terms that can increase or decrease,
$\delta A$ can never be negative. In addition to the above formal similarity, we can see why
$A$ is probably a measure of the black-hole entropy. To do this, following
Bekenstein, we estimate the amount of information that has gone to make up
a black hole of a given size. We count the number of particles that are needed to
make the black hole and assign one bit of information to each. Classically,
this number is infinite, since there is no lower limit to the mass of the constituents
that can be used. If, however, the quantum nature of matter is taken into account,
it becomes clear that only those particles can be used whose Compton
wavelength is smaller than the size of the horizon. Thus,

$$\frac{\hbar}{mc} < \frac{2GM}{c^{2}}, \quad\text{i.e.}\quad m > \frac{\hbar c}{2GM}. \tag{10}$$

The number of particles, though large, is now finite and is given by

$$N = \frac{M}{m} = \left(\frac{2G}{\hbar c}\right) M^{2}, \tag{11}$$

which, as required, is proportional to the area of the horizon of the hole. This has
also been extended to the rotating case [8]. Entropy is proportional to M for an


ordinary body. Thus, black-hole entropy is not just the entropy of the matter that
makes the black hole.
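To get a feeling for the size of the number in Equation (11), here is a rough numerical sketch in CGS units. The constants are rounded values assumed for illustration, $\hbar$ is used for the quantum of action, and the function names are hypothetical.

```python
# Rounded CGS constants (assumed values, good to about three figures).
G = 6.674e-8          # cm^3 g^-1 s^-2
HBAR = 1.0546e-27     # erg s
C = 2.998e10          # cm s^-1
M_SUN = 1.989e33      # g


def min_constituent_mass(M):
    """Smallest usable particle mass from Equation (10): m > hbar c / (2 G M)."""
    return HBAR * C / (2.0 * G * M)


def particle_count(M):
    """Bekenstein-style count N = M / m = (2 G / hbar c) M^2, Equation (11)."""
    return M / min_constituent_mass(M)


if __name__ == "__main__":
    print(f"N for one solar mass : {particle_count(M_SUN):.2e}")
    # Doubling the mass quadruples N, i.e. N scales with the horizon area.
    print(particle_count(2.0 * M_SUN) / particle_count(M_SUN))
```

For a solar-mass hole the count is of order $10^{76}$, which is why the estimate is described as large but finite.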
The thermodynamic analogy goes further: we also have analogs of the
zeroth and third laws. The analog of the zeroth law is the result that the
surface gravity of a stationary black hole is constant over the event horizon. The
analog of the third law is the statement that it is impossible, by any procedure
however idealized, to reduce $\kappa$ to zero by a finite sequence of operations. There is
no rigorous proof of this result but one can give a few plausible arguments. For
example, if by a finite sequence of operations one makes $\kappa = 0$ or $M = a$, then by
just one more step one can attain $a > M$ or, in fact, a naked singularity. This
violates the unproved but almost universally believed cosmic censorship
hypothesis due to Penrose that all singularities arising from gravitational
collapse are covered by an event horizon. It should be pointed out that the proof
of the second law (area theorem) is also based on the cosmic censorship
hypothesis.
Now if, following Bekenstein, we insist that the above analogy is in fact the
thermodynamic description of black holes, we end up with a serious problem. For
what does black-hole temperature mean physically? To assign a temperature to
some object means that it can be in equilibrium with a heat bath at the same
temperature. To do so, a black hole will have to emit heat energy at the same rate
as it absorbs it. This is not possible, however, since the event horizon only allows
radiation to go in, but does not let it go out. This forced the opinion contained in
a paper entitled 'The four laws of black-hole mechanics' [3] that the analogy was
suggestive but formal with no profound implications. Although surface gravity
was analogous to temperature, the thermodynamic temperature of a black hole
was zero, since it could not radiate.
In this period, however, rotating (charged) black holes were shown to be
capable of emission. The possibility arose since, as was first realized by Penrose,
the ergosphere contains negative energy trajectories which could be used to
extract energy from the black hole. This is referred to as the Penrose process [4].
What one does is to drop a particle in the ergosphere so that it splits into two. One
fragment is then captured in one of the negative energy orbits, while the other
escapes to infinity with more energy than the incoming particle. Technically this
is possible, since the Killing vector which is time-like at infinity becomes
spacelike in the ergosphere. There exists a physical process for waves which is the
analog of the Penrose process for particles. This is called superradiance [5].
Waves incident on rotating (charged) black holes are scattered with increased
amplitude in certain modes called classical superradiant modes
$$\omega < m\,\Omega_h + e\,\Phi_h, \tag{12}$$
where $\omega$, $m$ and $e$ are the frequency, the projection of the angular momentum
along the rotation axis and the charge of the incident wave. In a particle
description, this corresponds to an increase in the particle number, i.e., stimulated
emission, and this phenomenon of amplification by reflection is called
superradiance. It has been shown that boson waves superradiate, while fermion waves
do not. In general, the Pauli exclusion principle forbids superradiance in the
fermion cases. Using the first and second laws, one can exhibit the possibility of
superradiance for boson cases. Associated with the above stimulated emission is
the spontaneous emission in classical superradiant modes leading to a loss of
mass, angular momentum and charge of the black hole. This typical quantum
mechanical effect is called the Zeldovich-Starobinsky-Unruh [6] emission and
was one of the early computations in quantum field theory on black-hole
backgrounds. Although the black hole loses mass, angular momentum and
charge, it can be shown that emission in the superradiant modes does not violate
the area theorem. Thus, the rotating (charged) black hole may be viewed as an
excited state of black-hole solutions, but once the available energy is radiated
away, one is left with a nonrotating uncharged black hole which thus appears to
be the ground state of black-hole solutions. As we have seen earlier, however, the
thermodynamic analogy seems to point to a different direction.
In early 1974 came the resolution in the work of S. Hawking [7]. He showed
that quantum field theory in the background of a collapsing black hole predicts,
at late times, a radiation of particles in all modes with a characteristic thermal
spectrum at a temperature equal to $\hbar\kappa/2\pi k c$, as would be required by the
thermodynamic analogy. Thus, physically the temperature of the black hole
corresponds to the temperature of the quantum radiation from the black hole. Its
characteristic wavelength is $2M$ so that it is not meaningful to ask where the
radiation originates in the black hole. The effect depends crucially on the
existence of the collapse but not on its details. Moreover, as $\hbar \to 0$, $T \to 0$, so that
the effect is purely quantum mechanical. Thus, classically, black holes are indeed
black. For a Schwarzschild black hole, we may write

$$T = \frac{\hbar c^{3}}{8\pi G k M} = 6 \times 10^{-8}\left(\frac{M_\odot}{M}\right)\ \mathrm{K}, \tag{13}$$

so that it is clear that the temperature is negligible for stellar mass black holes. It is
significant only for primordial or mini black holes of mass $10^{15}$ to $10^{16}$ g that
could have formed by density fluctuations in the early universe. For such a black
hole, there can be net emission leading to a decrease in mass, i.e., an increase in
temperature leading to a catastrophic explosion. Using Stefan's law and one
species of particles, one can show that the lifetime for a black hole to radiate itself
away is given by

$$\tau = 10^{76}\left(\frac{M}{M_\odot}\right)^{3}\ \text{sec}. \tag{14}$$
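The numerical coefficient in Equation (13) is easy to reproduce. The sketch below evaluates $T = \hbar c^{3}/8\pi G k M$ in CGS units; the constants are rounded values assumed for illustration and the function name is hypothetical.

```python
import math

# Rounded CGS constants (assumed values).
HBAR = 1.0546e-27     # erg s
C = 2.998e10          # cm s^-1
G = 6.674e-8          # cm^3 g^-1 s^-2
K_B = 1.3807e-16      # erg K^-1
M_SUN = 1.989e33      # g


def hawking_temperature(mass_grams):
    """Temperature of a Schwarzschild black hole, Equation (13), in kelvin."""
    return HBAR * C ** 3 / (8.0 * math.pi * G * K_B * mass_grams)


if __name__ == "__main__":
    print(f"solar mass hole : {hawking_temperature(M_SUN):.2e} K")
    print(f"1e15 g hole     : {hawking_temperature(1e15):.2e} K")
```

The inverse dependence on mass is what makes the effect negligible for stellar-mass holes and important for the primordial ones discussed above.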

It should be noted that the quantum evaporation entails a loss of mass and,
hence, a decrease in area or entropy of the black hole. This decrease may be
thought of as due to a flux of negative energy into the horizon that allows
violation of the area theorem. This decrease in entropy is, however, more than
compensated by the entropy of the thermal radiation in the exterior. Thus, the


second law should be generalized to include the sum of ordinary entropy and
black-hole entropy.
Hawking radiation is an example of particle creation by strong gravitational
fields. Vacuum fluctuations of the matter field create virtual pairs outside the
horizon of the black hole. If the tidal gravitational force is strong enough so that,
over a Compton wavelength, it can provide energy to cross the mass gap, then the
virtual pairs can become real, taking energy from the gravitational field. This
implies that
$$2mc^{2} \lesssim \left(\frac{GMm\,\lambda_c}{(GM/c^{2})^{3}}\right)\lambda_c \quad\text{or}\quad \lambda_c \gtrsim \frac{GM}{c^{2}}. \tag{15}$$

Equating this to the mean separation of the black-body photons $\hbar c/kT$, one
obtains $T \propto 1/M$ as required. The thermal nature of the emission seems to be
related to the existence of the horizon, since other spacetimes with horizons, like
Rindler spacetime and De Sitter spacetime, seem to also yield a thermal radiation
of particles.
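The scaling quoted above follows in one line. This is an order-of-magnitude sketch only: numerical factors (including the difference between $h$ and $\hbar$) are not tracked, so it reproduces the $1/M$ dependence of Equation (13) but not its coefficient.

```latex
\[
  \lambda_c \sim \frac{GM}{c^{2}}
  \quad\text{and}\quad
  \lambda_c \sim \frac{\hbar c}{kT}
  \;\;\Longrightarrow\;\;
  T \sim \frac{\hbar c^{3}}{G k M} \;\propto\; \frac{1}{M}.
\]
```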
The Hawking effect was the crowning glory of the subject of black-hole
thermodynamics. It was a golden moment in the subject of quantum field theory
in curved spacetime that led to a focussing of interest on this field. This included
the extension to different spacetimes and spins and alternative derivations in an
attempt to obtain a physical understanding of the effect. The techniques
employed in these investigations, especially in the context of quantum field
theory in cosmological models, will be discussed in later chapters. An overview of
the various models studied, as well as the different techniques employed is
beautifully summarized in Isham [9] and we cannot do better than refer to it.

Exercises
I. Using the area theorem, prove the following:

(1) A black hole cannot bifurcate or break up but two black holes can coalesce.
(2) The energy extraction limit in the Penrose process is 29%.
(3) The upper bound on the amount of energy that can be carried away as
gravitational radiation in a collision of two black holes is $(1 - 2^{-1/2})$ for
nonrotating black holes and $(1 - 2^{-1})$ for rotating ones.
II. Using the first law and the area theorem, obtain an expression for the
superradiant modes of the boson field in a Kerr-Newman background.

III. Verify, using the first law, that emission in the superradiant modes does not
violate the area theorem.
IV. An interesting way to spin the Kerr black hole faster is to drop rotating
particles along the spin axis so that one can deliver more angular momentum
than $M^{2}$ and, hence, raise the ratio $J/M^{2}$. Show that if the spin of the particles
is less than or equal to two, this is not possible. Quantum field theory seems to
provide the singularity with a fig leaf.


References
1. D. W. Sciama, Vistas in Astronomy 19, 385 (1976); P. C. W. Davies, Rep. Prog. Phys. 41, 1314 (1978).
2. J. D. Bekenstein, Phys. Rev. D7, 2333 (1973); D9, 3292 (1974).
3. J. M. Bardeen, B. Carter and S. W. Hawking, Commun. Math. Phys. 31, 161 (1973).
4. R. Penrose, Rev. Nuovo Cim. 1, 252 (1969).
5. Y. B. Zeldovich, JETP Lett. 14, 180 (1971); C. W. Misner, Bull. Am. Phys. Soc. 17, 472 (1972); W. H. Press and S. A. Teukolsky, Nature 238, 211 (1972).
6. Y. B. Zel'dovich, JETP 35, 1085 (1972); A. A. Starobinsky, JETP 37, 28 (1973); W. G. Unruh, Phys. Rev. D10, 3194 (1974).
7. S. W. Hawking, Commun. Math. Phys. 43, 199 (1975).
8. W. Zurek and K. Thorne, Phys. Rev. Lett. 54, 2171 (1985).
9. C. J. Isham in M. Papagiannis (ed.) Eighth Texas Symposium in Relativistic Astrophysics.

4. Introduction to Relativistic Cosmology


C. V. VISHVESHWARA
Raman Research Institute, Bangalore 560 080, India

1. Preamble

Perhaps the grandest application of the general theory of relativity is in the sphere
of cosmology. The theory is not merely useful here, but indispensable. Starting
from simple observational facts combined with a limited number of conceptual
assumptions, one can build elegant cosmological models based on general
relativity. These models have been quite successful in providing a framework for
considering various physical processes in detail and in explaining cosmological
observations.
In building cosmological models, one is inevitably led to making certain
a-priori assumptions. For instance, we take it for granted that the physical laws
that have been found to be valid on the terrestrial scale, are also valid on the
cosmological scales of both spatial distances and time. The universe is the totality
of all that exists with no outside platform from which it can be viewed. It is
self-contained and all its properties must be accounted for in terms of its
components. As far as we know, the universe is unique. We cannot study several
universes in existence and abstract their common properties. So far, no theory,
including general relativity, has been able to yield a unique mathematical
model as the only possibility. The best theory that is available, namely the general
theory of relativity, offers different types of possible models and only observation
can ultimately decide which one of these corresponds to reality.
2. The Cosmic Spacetime

The starting point for building general relativistic cosmological models is, of
course, fundamental observations on very large scales. These scales should be
large enough to consider individual galaxies as points. The galaxies can then be
treated as forming a smooth fluid filling the universe.
Probably the most important observation for the present purpose is the
isotropy of the universe. In whichever direction we look, the universe appears the
same. The distribution of galaxies in the sky along with their apparent
magnitudes and red shifts, the distribution of radio sources and the temperature
of the cosmic microwave background radiation all exhibit a remarkable isotropy.
Now combine the observed isotropy with the Copernican principle that we

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 51-58.
© 1989 by Kluwer Academic Publishers.


occupy no special place in the universe. This means that, from wherever you
observe, the universe should appear isotropic. Then the conclusion one arrives at
is that the universe is homogeneous. Its properties are the same at a given moment
of time wherever you go. The fact that the combination of isotropy and the
Copernican principle leads to homogeneity, can actually be proved mathematically. We now further assume that this homogeneity is inherent to the spacetime
itself and thereby arrive at the cosmic geometry.
In order to construct the four-dimensional homogeneous cosmic spacetime, we
assume that the world lines of galaxies, treated as points, are geodesics. Each
galaxy is characterized by constant spatial coordinates. Only the cosmic time
common to all galaxies changes along each world line. These are the comoving
coordinates.
A given surface $t$ = constant is spacelike and represents the state of the
universe as observed by all galaxies at that moment. On this surface, physical
quantities such as the density, pressure and temperature of the cosmic fluid are
constant. This is the statement of homogeneity. At different times of course, these
values may be different. Reflecting this physical homogeneity, each space section
t = constant is geometrically homogeneous. Mathematically, this means that the
spatial section admits three independent spacelike Killing vectors at each point.
This homogeneity of the universe is referred to as the cosmological principle. All
points on any spatial section are the same and there are no preferred positions.
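If the spatial geometry of the universe at a given time $t$ is assumed to be
homogeneous and isotropic, then the line element takes the remarkably simple
form
$$ds^{2} = dt^{2} - S^{2}(t)\left[\frac{dr^{2}}{1 - kr^{2}} + r^{2}\left(d\theta^{2} + \sin^{2}\theta\, d\phi^{2}\right)\right]. \tag{1}$$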

The constant $k$ can take on the values 0, +1 or $-1$, and $S(t)$ is an arbitrary function
of time $t$ which, as we shall see, is determined from the field equations and depends
on the equation of state of the cosmic fluid. This line element is known as the
Robertson-Walker line element. Corresponding to $k = 0$, $+1$ and $-1$, the
spatial sections are, respectively, flat (zero curvature), three-sphere (constant
positive curvature) and three-hyperboloid (constant negative curvature). The
metric can also be expressed in other convenient forms, but for our purpose, we
can use the line element as given in Equation (1).

3. Cosmological Models

The dynamics of the Robertson-Walker geometry depends on the scale factor
S(t) which tells us how the spatial geometry changes with time. This function is
determined, for a particular value of k, by the field equations and the equation of
state relating pressure and density of the cosmic fluid. Thus, for different choices
of these factors, different models are obtained. These are referred to as the


Friedmann models in honour of A. Friedmann, one of the founders of relativistic


cosmology. The field equations, as we have seen in Chapter 2, can be written as
Gab

8nG

(2)

- 2 - Tab'

The energy-momentum tensor for a perfect fluid with density $\rho$ and pressure $p$ is
given by
$$T_{ab} = (\rho + p)\,u_a u_b - p\,g_{ab}, \tag{3}$$

with $c = G = 1$. Here $u^{a} = (1, 0, 0, 0)$ is the four-velocity of any galaxy in
comoving coordinates. As Friedmann discovered, the field equations inevitably
lead to dynamic solutions. Einstein, in order to accommodate a static universe
(as was his belief in regard to the universe we live in), changed his field equation
to
(4)

where $\Lambda$ is the cosmological constant which can give rise to a repulsion cancelling
the normal gravitational attraction. With Hubble's observations, the expansion
of the universe came to be a reality, thereby dismissing the idea of a static
universe. In what follows, we shall assume $\Lambda = 0$, although this term has been
invoked from time to time, most recently within the context of the inflationary
universe.
Einstein's field equations for the Robertson-Walker metric can be summarized
as
$$\dot{S}^{2} + k = \frac{8\pi}{3}\,\rho S^{2} \tag{5}$$

and

$$\dot{\rho} + 3(\rho + p)\,\frac{\dot{S}}{S} = 0, \tag{6}$$

where the dot represents the derivative with respect to time.


The pressure $p$ arises from the peculiar motion of galaxies, radiation, magnetic
fields, cosmic rays and so on. Two extreme cases correspond to $p = 0$ for dust and
$p = \rho/3$ for radiation. At the present epoch, one can assume, to a very good
approximation, that $p = 0$, while in the early times, one can show the universe
was dominated by radiation so that one could assume $p = \rho/3$ to describe those
stages.

4. Dust Models (p = 0)

If we set $p = 0$, Equation (6) can be integrated to give

$$\rho S^{3} = \text{constant} = C, \tag{7}$$


which is a conservation equation. Substituting this in Equation (5), we get

(8)
From this equation, one can show that if $S$ starts from zero at $t = 0$, it increases
continuously for $k = 0, -1$ (ever-expanding open universe) whereas, for $k = +1$, it
oscillates between 0 and $S_{max}$ (oscillating closed universe). We shall very briefly
consider these three cases below.
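The substitution that leads to the equation labelled (8) can be spelled out as follows. This is a sketch under stated assumptions: the Friedmann equation is taken in the form $\dot{S}^{2} + k = (8\pi/3)\rho S^{2}$ with $c = G = 1$, and the normalisation of the constant $C$ is not certain from the text. If $C$ is defined so as to absorb the factor $8\pi/3$, the right-hand side becomes simply $C/S$, which is the normalisation in which the closed model reaches $S_{max} = C$.

```latex
\[
  \dot{S}^{2} + k = \frac{8\pi}{3}\,\rho S^{2},
  \qquad \rho S^{3} = C
  \;\;\Longrightarrow\;\;
  \dot{S}^{2} + k = \frac{8\pi C}{3S}.
\]
\[
  \text{With } \tfrac{8\pi}{3}\,\rho S^{3} = C \text{ instead:}\qquad
  \dot{S}^{2} + k = \frac{C}{S},
  \qquad S_{max} = C \ \ (k = +1,\ \dot{S} = 0).
\]
```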

(i) Einstein-De Sitter Model (k = 0)


As we have already mentioned, the spatial sections are flat and infinite in extent
(open model). Equation (8) can be readily integrated to give $S(t) \propto t^{2/3}$ or

$$S(t) = \left(\frac{3H_0}{2}\,t\right)^{2/3}, \tag{9}$$

where, as we shall see, $H_0$ is the Hubble constant. In this model, the universe starts
at $t = 0$ from a singular state with $S(t) = 0$ and the curvature scalars infinite. Then
it expands forever according to the time dependence given by Equation (9).

(ii) The Open Hyperbolic Model (k = -1)


The space sections are three-dimensional hyperboloids of constant negative
curvature and, therefore, infinite in extent. From Equation (8), one can show that
for small values of $t$ (early times), $S(t) \sim t^{2/3}$ and for large values of $t$ (late times)
$S(t) \sim t$ (unaccelerated expansion). Once again, the universe starts from a singular
state at $t = 0$ and expands forever.

(iii) The Closed Elliptic Model (k = + 1)


In this case, the space sections are three-spheres with positive constant curvature.
Their radius at time $t$ is given by $S$. Equation (8) with $k = 1$ can be integrated in
parametric form to obtain a cycloid for $S(t)$. At $t = 0$, $S(t) = 0$ and the universe
starts from a singular state at this moment. At $t = \pi C/2$, $\dot{S} = 0$ and the universe
attains its maximum expansion with $S = S_{max} = C$ and then recontracts to the
singular state with $S = 0$.
The behaviour of these three universal models is indicated in Figure 1.
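The three dust models can be compared numerically. The sketch below assumes the normalisation $\dot{S}^{2} + k = C/S$ (so that $S_{max} = C$ for $k = +1$) and uses the standard parametric solutions; the function names and the development angle `eta` are illustrative choices, not notation from the text. A finite-difference check confirms that each solution satisfies the assumed equation.

```python
import math


def dust_scale_factor(k, eta, C=1.0):
    """Return (t, S) on the dust solution of S_dot^2 + k = C / S.

    For k = +1 and k = -1, eta is the development angle of the
    parametric solution; for k = 0, eta is simply the time t.
    """
    if k == 1:       # cycloid
        return 0.5 * C * (eta - math.sin(eta)), 0.5 * C * (1.0 - math.cos(eta))
    if k == -1:      # hyperbolic analogue of the cycloid
        return 0.5 * C * (math.sinh(eta) - eta), 0.5 * C * (math.cosh(eta) - 1.0)
    if k == 0:       # Einstein-de Sitter, S proportional to t^(2/3)
        return eta, (2.25 * C) ** (1.0 / 3.0) * eta ** (2.0 / 3.0)
    raise ValueError("k must be -1, 0 or +1")


def friedmann_residual(k, eta, C=1.0, h=1e-5):
    """Central-difference estimate of S_dot^2 + k - C/S along the solution."""
    t_lo, s_lo = dust_scale_factor(k, eta - h, C)
    t_hi, s_hi = dust_scale_factor(k, eta + h, C)
    _, s_mid = dust_scale_factor(k, eta, C)
    s_dot = (s_hi - s_lo) / (t_hi - t_lo)
    return s_dot ** 2 + k - C / s_mid


if __name__ == "__main__":
    for k in (-1, 0, 1):
        print(k, friedmann_residual(k, 1.0))
    # Closed model: maximum expansion S = C is reached at eta = pi, t = pi*C/2.
    print(dust_scale_factor(1, math.pi, C=2.0))
```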

5. Radiation Models

If the cosmic fluid is assumed to be made of radiation, then $p = \rho/3$ and Equation
(6) can be integrated to give
$$\rho S^{4} = \text{constant}. \tag{10}$$

In the beginning when $S$ is small, the radiation density predominates over matter
density, as can be seen by comparing Equations (7) and (10). It can be shown that
in Equation (5) $k$ can be ignored when Equation (10) is substituted and early times
are considered. Then one can easily integrate Equation (5) to obtain
$$S = S_0\, t^{1/2}, \tag{11}$$

where $S_0$ = constant. With the help of Equation (10) and the law $\rho = aT^{4}$, where
$a$ is the radiation density constant, one can find the relation between the
temperature $T$ and time $t$. The dependence of $T$ on $t$ turns out to be $T \propto t^{-1/2}$.

[Fig. 1. Curves of S(t) labelled k = -1, k = 0 (Einstein-de Sitter) and k = +1.]

6. Models with Nonzero Cosmological Constant

We shall very briefly mention two of the better known models with $\Lambda \neq 0$.
(i) The Einstein Static Model
As was mentioned earlier, the motivation for introducing $\Lambda$ came from Einstein's
desire to accommodate a static universe in his theory. Such a model is possible
with $k = +1$, $\Lambda = 4\pi G\rho_0 = S_0^{-2}$ = constant. The universe is closed with finite
volume $V = 2\pi^{2} S_0^{3}$.

(ii) The De Sitter Universe

This model has the parameters $k = \rho = p = 0$. The line element can be written as
$$ds^{2} = dt^{2} - e^{2H_0 t}\left[dr^{2} + r^{2}\left(d\theta^{2} + \sin^{2}\theta\, d\phi^{2}\right)\right], \tag{12}$$

where $H_0 = (\Lambda/3)^{1/2}$ is the Hubble constant. This line element can also be
derived by assuming $\Lambda = 0$ but $p = -\rho$.
The de Sitter model conforms to the 'perfect cosmological principle', according
to which the universe is not only homogeneous in space but also looks the same to
a fundamental observer at all epochs. This model has therefore been used in
connection with the steady-state universe. The exponential expansion of space in
this model has been recently invoked in the context of the idea of the inflationary
universe.

7. Observational Contacts

We shall now discuss some of the observational parameters and their relation to
the theoretical models described above.
(a) Cosmological Redshift
In a homogeneous universe all points are equivalent, so we can choose our
own radial coordinate as $r = 0$. A light ray from another galaxy with spatial
coordinates $(r_e, \theta_1, \phi_1)$ propagates along a null geodesic ($ds^{2} = 0$) with $\theta = \theta_1$ =
constant and $\phi = \phi_1$ = constant ($d\theta = d\phi = 0$). Let the moments of emission
and reception be given by $t_e$ and $t_r$, respectively. Then, using line element (1) with
$ds = d\theta = d\phi = 0$, we see that the travel of this signal should be such that
$$\int_{t_e}^{t_r} \frac{dt}{S(t)} = \int_{0}^{r_e} \frac{dr}{(1 - kr^{2})^{1/2}}. \tag{13}$$

Considering another signal emitted at $t_e + \Delta t_e$ and received at $t_r + \Delta t_r$ (or the
above two signals can be successive crests of a continuous wave), we have

$$\int_{t_e + \Delta t_e}^{t_r + \Delta t_r} \frac{dt}{S(t)} = \int_{0}^{r_e} \frac{dr}{(1 - kr^{2})^{1/2}}. \tag{14}$$
If we assume that $\Delta t_e$ and $\Delta t_r$ are small compared to the expansion timescale, then
we get
$$\frac{\Delta t_r}{S(t_r)} = \frac{\Delta t_e}{S(t_e)}. \tag{15}$$
Noting that $\Delta t$ is the proper time of the cosmological observer ($g_{00} = 1$), we see
that wavelength $\lambda = \Delta t$ (since $c = 1$), that is, the emitted wavelength $\lambda_e = \Delta t_e$ and
the wavelength measured by the receiver $\lambda_r = \Delta t_r$. The cosmological redshift is
defined as
$$z = \frac{\lambda_r - \lambda_e}{\lambda_e}, \tag{16}$$
so that we finally arrive at the relation
$$1 + z = \frac{S(t_r)}{S(t_e)}. \tag{17}$$

If $z > 0$, we have cosmological redshift which requires that $S(t_r) > S(t_e)$. This
means that by the time the light ray reaches the observer, the universe will be in
a more expanded state than when the light was emitted. Also note the redshift is
symmetric with respect to the interchange of emitter and observer, as it should be,
since no point is special.
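Relation (17) is simple enough to evaluate directly once a scale factor is chosen. The sketch below uses the Einstein-De Sitter form $S \propto t^{2/3}$ as an example; the function names are hypothetical and the overall normalisation of $S$ drops out of the ratio.

```python
def redshift(scale_factor, t_emit, t_receive):
    """Cosmological redshift from Equation (17): 1 + z = S(t_r) / S(t_e)."""
    return scale_factor(t_receive) / scale_factor(t_emit) - 1.0


def einstein_de_sitter(t):
    """Scale factor of the k = 0 dust model, up to a constant factor."""
    return t ** (2.0 / 3.0)


if __name__ == "__main__":
    # Light emitted when the universe was one eighth of its present age
    # arrives with 1 + z = 8^(2/3) = 4.
    print(redshift(einstein_de_sitter, 1.0, 8.0))
```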
(b) The Hubble Constant

The Hubble constant is defined as

$$H = \frac{\dot{S}}{S} = \frac{1}{T}, \tag{18}$$


where $T$ is known as the Hubble time. In general, $H$ and $T$ are functions of time and
their values correspond to the present time $t = t_0$. This is sometimes made
specific by the use of subscript 0. The value of $t_0$ itself gives the time elapsed since
the beginning of the universe and, hence, gives the age of the universe. In Figure 1,
the Hubble time is given by the intercept of the tangent to the $S(t)$ curve,
drawn at $t = t_0$, with the $t$ axis. From the shape of the curve, it can be seen (and
can also be demonstrated analytically) that $T > t_0$, that is, the Hubble time is
greater than the age of the universe. For instance, in the case of the Einstein-De Sitter
universe with $k = 0$, Equation (9) gives
$$T = \frac{1}{H_0} = \frac{3}{2}\,t_0. \tag{19}$$

Therefore, determination of $T = H^{-1}$ sets an upper limit to the age of the
universe.
The determination of $H$ can be carried out by the linear Hubble law connecting
redshifts with distances to the galaxies when these quantities are small. From
Equation (13), ignoring the $kr^{2}$ term and assuming $S(t)$ can be equated
approximately to $S(t_0)$ ($t_r = t_0$ = now), we get

$$r_e \approx \frac{t_0 - t_e}{S(t_0)}. \tag{20}$$

Expanding $S(t_e)$ around $t_0$, we obtain

$$S(t_e) \approx S(t_0) + (t_e - t_0)\,\dot{S}(t_0). \tag{21}$$

Dividing throughout by $S(t_0)$, setting $H = \dot{S}/S$ and equating this to $(1 + z)^{-1}$, we
obtain
$$(1 + z)^{-1} \approx 1 - H\,(t_0 - t_e) \tag{22}$$

or
$$z \approx H D_e, \tag{23}$$

where $D_e = r_e S(t_0)$ is the proper distance (approximate if $k \neq 0$) between the
source and the observer corresponding to the epoch $t = t_0$.
Thus, the redshift is approximately proportional to the distance to the
observed galaxy: the famous Hubble law. The constant of proportionality, the
Hubble constant, and hence the Hubble time, can be determined by observation. In
the forties, $T$ as determined by observation turned out to be about $3 \times 10^{9}$ years,
less than the age of the solar system ($4.5 \times 10^{9}$ years) deduced by the study of
rocks, meteorites and stellar evolution! However, later refinements have given
$H$ which does not lead to this contradiction. For instance, $H \sim 50$ km s$^{-1}$ Mpc$^{-1}$
gives $T \sim 1.8 \times 10^{10}$ years.
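The conversion from a Hubble constant in observers' units to a Hubble time in years is a pure unit exercise. A minimal sketch follows; the conversion factors are rounded values assumed here, and the function names are hypothetical. With these factors, 50 km s$^{-1}$ Mpc$^{-1}$ comes out close to $2 \times 10^{10}$ years, the same order as the round figure quoted above, and the Einstein-De Sitter age is two thirds of that, as in Equation (19).

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec (rounded)
SECONDS_PER_YEAR = 3.156e7    # seconds in one year (rounded)


def hubble_time_years(H_km_s_mpc):
    """Hubble time T = 1/H in years, for H given in km/s/Mpc."""
    H_per_second = H_km_s_mpc / KM_PER_MPC
    return 1.0 / H_per_second / SECONDS_PER_YEAR


def einstein_de_sitter_age_years(H_km_s_mpc):
    """Age t0 = (2/3) T of the k = 0 dust model, Equation (19)."""
    return (2.0 / 3.0) * hubble_time_years(H_km_s_mpc)


if __name__ == "__main__":
    print(f"Hubble time : {hubble_time_years(50.0):.2e} yr")
    print(f"EdS age     : {einstein_de_sitter_age_years(50.0):.2e} yr")
```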
(c) Deceleration Parameter

As the nomenclature suggests, this parameter, q, indicates the rate of the slowing


down of the universal expansion. By determining it, one can decide whether our
universe is open or closed.
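The deceleration parameter is defined as
$$q \equiv -\frac{S\ddot{S}}{\dot{S}^{2}} = -\frac{1}{H^{2}(t)}\,\frac{\ddot{S}}{S}. \tag{24}$$

From the field equations it is a straightforward procedure to show that

$$q = \left(\frac{4\pi}{3}\,G T^{2}\right)\rho = \frac{\rho}{2\rho_c}, \tag{25}$$

where $\rho_c = 3H^{2}/8\pi G$ is known as the 'closure density'.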


For the Einstein-De Sitter model, $q = \tfrac{1}{2}$ and $\rho = \rho_c$. For $k = +1$, $q > \tfrac{1}{2}$, $\rho > \rho_c$
and for $k = -1$, $q < \tfrac{1}{2}$, $\rho < \rho_c$. Thus, the value of $q$ tells us whether the density in the
universe is enough to close it or not. For $T \approx 10^{10}$ years, the closure density
$\rho_c \approx 2 \times 10^{-29}$ g/cm$^{3}$. The observed density of the luminous matter is of the
order of $10^{-31}$ g/cm$^{3}$. The deficit between this and $\rho_c$ is known as the 'missing
mass' in the universe, a term which seems to indicate an a-priori preference for
a closed universe. The accurate measurement of $q$ is important for a knowledge of
the type of universe we live in.
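The closure density quoted above can be reproduced from $\rho_c = 3H^{2}/8\pi G$ with $H = 1/T$. The sketch below works in CGS units with rounded constants assumed for illustration, and also evaluates $q = \rho/2\rho_c$ from Equation (25) for the luminous-matter density; the function names are hypothetical.

```python
import math

G = 6.674e-8                  # cm^3 g^-1 s^-2 (rounded)
SECONDS_PER_YEAR = 3.156e7    # rounded


def closure_density(hubble_time_years):
    """Closure density rho_c = 3 H^2 / (8 pi G) in g/cm^3, with H = 1/T."""
    H = 1.0 / (hubble_time_years * SECONDS_PER_YEAR)
    return 3.0 * H * H / (8.0 * math.pi * G)


def deceleration_parameter(rho, hubble_time_years):
    """q = rho / (2 rho_c), Equation (25), for a dust model with Lambda = 0."""
    return rho / (2.0 * closure_density(hubble_time_years))


if __name__ == "__main__":
    rho_c = closure_density(1e10)
    print(f"closure density     : {rho_c:.2e} g/cm^3")
    print(f"q for luminous matter: {deceleration_parameter(1e-31, 1e10):.1e}")
```

A density equal to $\rho_c$ gives $q = 1/2$, the dividing line between the open and closed models.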
We arrived at the linear Hubble law by the Taylor expansion of S(t) to first
order. If we retain the second-order term, we can show
(26)

Thus, q can, in principle, be deduced from the measurement of the redshift as


a function of distance. But distance measurements are not accurate enough to
yield an accurate value for q. The deceleration parameter can be deduced
observationally from the magnitude-redshift relation. Again, the results are not
conclusive. Other observational tests of cosmological models include radio-source counts and angular diameters of galaxies. The final verdict as to the nature
of our universe remains in the future. Whether we live in an open or closed
universe remains an open question.
8. Conclusion

In the foregoing, we have discussed very briefly some aspects of the general
relativistic cosmological models and their implications to observations. The
purpose of this is to serve as a short introduction to other chapters to follow that
will treat in greater detail some of the basic features of our universe.
References
1. J. V. Narlikar, Introduction to Cosmology, Jones and Bartlett (1983).
2. A. K. Raychaudhuri, Theoretical Cosmology, Clarendon Press (1979).
3. S. Weinberg, Gravitation and Cosmology, John Wiley (1972).

5. Relics of the Big Bang


J. V. NARLIKAR
Theoretical Astrophysics, Tata Institute of Fundamental Research,
Homi Bhabha Road, Bombay 400005, India

1. The Early Universe


All Friedmann models have an epoch in the past when the scale factor S was zero.
We refer to this epoch as the big bang epoch. To mathematicians, the big bang
implies a breakdown of the concept of spacetime geometry, and they have come
to recognize it as an inevitable feature of Einstein's general relativity. It is a
feature that prevents the physicist from investigating what happened at S = 0 or
prior to it. To some physicists, this abrupt termination of the past signifies an
incompleteness of the theory of relativity. To them, a more complete theory of the
future may show a way of avoiding the catastrophic nature of the S = 0 epoch.
A universe that has been expanding forever or that has been oscillating between
maximum and minimum (but finite) values of S, might result from such a theory.
Here we will continue to put our faith in the validity of general relativity and
push our investigations into the past of the universe as close as possible to the
S = 0 epoch. The purpose of such investigations will be to find out whether we
can point to any present-day evidence that the universe indeed had a past epoch
when S was close to zero. In short, we will be looking for relics of the big bang.
Pioneering work in this field was done by George Gamow in the mid-1940s.
Gamow was concerned with the problem of the origin of elements. Starting from
the (then available) basic building blocks of neutrons and protons, Gamow
attempted to describe the formation of nuclei of deuterium, helium, and so on.
The process envisaged by him involved nuclear fusion, that is, a process in which
nuclei are formed by bringing together neutrons and protons. Astrophysicists
were already sure by the 1940s that such processes operate inside stars, where the
necessary conditions of high temperature and density were known to exist.
Gamow pointed out that similar conditions must have existed in a typical
Friedmann universe soon after the big bang.
We know from cosmological equations that the density of the universe was
very high at small values of S. What about temperature? A simple calculation
shows how the temperature also might have been high. This calculation requires
the assumption that, at present, we have a radiation density $u_0$ that is a relic of an
early hot era. With this assumption, the radiation energy density at a past epoch
$S$ is given by
$$u = u_0\,\frac{S_0^{4}}{S^{4}}, \tag{1}$$
B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 59-88.
© 1989 by Kluwer Academic Publishers.


where $S = S_0$ at the present epoch. Cosmological equations also tell us that at
a critical value of the scale factor the contribution of radiation energy density
equals that of matter energy density, and that prior to this epoch the former was
more dominant. Gamow therefore assumed that, in the early epochs, the
dynamics of expansion were determined by radiant energy rather than by matter
in the form of dust.
If we wish to make a simplified calculation, we can assume that the radiation
was in blackbody form with temperature $T$, so that
$$u = aT^{4}, \tag{2}$$

where a is the radiation constant. This means that in the early stages of the big
bang universe
(3)

We also anticipate that the space-curvature parameter k will not affect the
dynamics of the early universe significantly, and set it equal to zero. Thus, from
Einstein's equations for the (8) component, we get

$$\frac{\dot{S}^{2}}{S^{2}} = \frac{8\pi G a}{3c^{2}}\,T^{4}. \tag{4}$$

Further, from (1) and (2) we get

$$T = \frac{A}{S}, \qquad A = \text{constant}. \tag{5}$$

Substituting (5) into (4) gives a differential equation for $S$ that can be easily solved.
Setting $t = 0$ at $S = 0$, we get
$$S = A\left(\frac{3c^{2}}{32\pi G a}\right)^{-1/4} t^{1/2} \tag{6}$$

and, more importantly,

$$T = \left(\frac{3c^{2}}{32\pi G a}\right)^{1/4} t^{-1/2}. \tag{7}$$

Notice that all the quantities inside the parentheses on the right-hand side of
the above equation are known physical quantities. Thus, we can express the
above result in the following form
$$T_{\text{kelvin}} = 1.52 \times 10^{10}\; t_{\text{second}}^{-1/2}. \tag{8}$$

In other words, about one second after the big bang the radiation temperature of
the universe was $1.52 \times 10^{10}$ K. The universe at this stage was certainly hot
enough to facilitate nucleosynthesis, as Gamow supposed.
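
The numerical coefficient in this temperature-time relation can be checked with a minimal sketch. The CGS values of G, c and the radiation constant a used below are standard reference values assumed here, not numbers taken from the text, and the function name is purely illustrative.

```python
import math

# CGS constants (assumed standard values, not quoted in the text)
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm s^-1
A_RAD = 7.566e-15   # radiation constant a, erg cm^-3 K^-4


def radiation_era_temperature(t_seconds: float) -> float:
    """Temperature in K from T = (3 c^2 / 32 pi G a)^(1/4) * t^(-1/2)."""
    coefficient = (3.0 * C**2 / (32.0 * math.pi * G * A_RAD)) ** 0.25
    return coefficient / math.sqrt(t_seconds)


if __name__ == "__main__":
    for t in (1.0, 100.0):
        print(f"t = {t:g} s  ->  T = {radiation_era_temperature(t):.3e} K")
```

With these constants the coefficient comes out close to the value quoted for one second after the big bang, and the temperature halves each time the age increases fourfold.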
The idea of a hot big bang, as the above picture is called, depends therefore on
the assumption that there is relic radiation present today. It is commonly believed

Relics of the Big Bang


that the microwave background radiation first discovered in 1965 by Arno


Penzias and Robert Wilson is this relic radiation. We will return to the details of
this evidence later. For the present, we will accept this evidence as confirming
Gamow's notion of the hot big bang and proceed further.
2. Thermodynamics of the Early Universe

Considerable progress has been made in our understanding of the properties of


particles and their basic interactions, since the days when Gamow and his
colleagues R. A. Alpher and R. Herman did their calculations of primordial
nucleosynthesis. In the following pages we will briefly outline the basic principles
on which the modern calculations are usually based.
First, it is necessary to specify the building blocks from which nuclei were
constructed in the early epochs. The physicist would naturally like to imagine
that the universe started with the simplest possible material composition
(whatever that may be!) and that more complex structures were built out of
simpler ones by physical interactions. Thus, the cosmologist is forced to take
stock of the knowledge of particle physics. While Gamow and his colleagues took
the existence of particles like protons, neutrons, electrons, and so on for granted,
modern particle physicists believe that a more basic framework accounts for the
creation or existence of these particles.
Here we take up the story from the stage when baryons (neutrons and protons),
leptons (electrons, muons, neutrinos, and their antiparticles) and photons (the
particles of light) are already in existence and are in thermodynamic equilibrium
as particles of an ideal gas. Later, we will consider the more speculative and earlier
epochs and discuss how these particles came into existence.
Before proceeding with calculations, we must clarify what is meant by
'thermodynamic equilibrium' and 'ideal gas'. We have already mentioned that in
these early epochs the dominant form of energy was in particles moving
relativistically. The question arises, therefore, whether these particles were
interacting with one another or whether they were moving freely. The ideal gas
approximation implies that the particles were mostly moving freely. Such
particles would interact and collide, of course, but these instances are assumed to
have occupied very brief time spans, and their effects on motions may be
otherwise neglected. We will shortly express this idea in a quantitative manner.
The collisions and scatterings of the particles would, however, have helped to
redistribute their energies and momenta. If these redistributions occurred
frequently enough, the system of particles as a whole would have reached a state
of thermodynamic equilibrium. In this case for each species of particles there is
a definite rule governing the number of particles in a given range of momentum.
For thermodynamic equilibrium to be reached, the timescales for successive
scatterings should be small compared to the expansion time scale for the universe.
Again, we will express this idea quantitatively in a short while.

2.1. Distribution Functions

Assuming the ideal gas approximation and thermodynamic equilibrium, it is then


possible to write down the distribution functions for any given species of particles.
Let us use the symbol A to denote typical species (A = 1, 2, ...). Thus, $n_A(P)\,dP$
denotes the number density of species A in the momentum range (P, P + dP),
where

$$n_A(P) = \frac{g_A}{2\pi^2\hbar^3}\,P^2\left[\exp\left(\frac{E_A(P) - \mu_A}{kT}\right) \pm 1\right]^{-1}. \tag{9}$$

In the above formula, T = temperature of the distribution, $g_A$ = number of spin
states of the species, k = Boltzmann constant, and

$$E_A(P) = \left(c^2P^2 + m_A^2c^4\right)^{1/2} \tag{10}$$

is the energy corresponding to rest mass $m_A$ of a typical particle. Thus, for the electron
$g_A = 2$; for the neutrino $g_A = 1$, $m_A = 0$, and so on. The + sign in Equation (9)
applies to particles obeying the Fermi-Dirac statistics (these particles are called
fermions), while the - sign applies to particles obeying the Bose-Einstein
statistics (particles known as bosons). For example, electrons and neutrinos are
fermions, and photons are bosons.
The quantity $\mu_A$ is the chemical potential of the species A. For a detailed
discussion of chemical potentials, see any standard text on thermodynamics and
statistical mechanics. We note here that in any reaction involving these particles,
the $\mu_A$ are conserved (just as electric charge, energy, spin, and so on are
conserved). Because photons can be absorbed or emitted in any number in a
typical reaction, we set $\mu_A = 0$ for photons. Since particles and antiparticles
(such as electrons and positrons) annihilate in pairs and produce photons, their
chemical potentials are equal and opposite.
Apart from the dynamic quantities and the electric charge, several other
quantities are found to be conserved in the interactions of particles. These are the
baryon number, the muon lepton number, and the electron lepton number. In
computing these numbers, a value of +1 is assigned to a particle and -1 to its
antiparticle. The electron lepton number counts electrons ($e^-$) and their
neutrinos ($\nu_e$), while the muon lepton number counts muons ($\mu^-$) and their
neutrinos ($\nu_\mu$). Under these conservation rules, reactions like these
are permitted, while a reaction like the following is not, since it would change the
electron lepton number from 0 to 2:

$$n \to p + e^- + \nu_e.$$

(Later we will consider the situation in which the baryon number is not
conserved. At the epochs that we are concerned with here, however, we may safely
assume the conservation of baryon number to apply.)
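
The bookkeeping behind these conservation rules can be sketched in a few lines. The particle labels, the trailing `~` convention for antiparticles and the function names are illustrative assumptions introduced here; only the four conserved numbers themselves come from the text.

```python
# Quantum numbers per particle, in the order:
# (electric charge, baryon number, electron lepton number, muon lepton number)
QUANTUM_NUMBERS = {
    "p": (1, 1, 0, 0),
    "n": (0, 1, 0, 0),
    "e-": (-1, 0, 1, 0),
    "nu_e": (0, 0, 1, 0),
    "mu-": (-1, 0, 0, 1),
    "nu_mu": (0, 0, 0, 1),
    "gamma": (0, 0, 0, 0),
}


def numbers_of(name):
    """Quantum numbers of a particle; a trailing '~' marks the antiparticle."""
    if name.endswith("~"):
        return tuple(-q for q in QUANTUM_NUMBERS[name[:-1]])
    return QUANTUM_NUMBERS[name]


def totals(side):
    """Sum each conserved number over all particles on one side."""
    return tuple(sum(column) for column in zip(*(numbers_of(p) for p in side)))


def is_allowed(initial, final):
    """True if all four numbers balance between initial and final states."""
    return totals(initial) == totals(final)


if __name__ == "__main__":
    print(is_allowed(["n"], ["p", "e-", "nu_e~"]))  # decay with an antineutrino
    print(is_allowed(["n"], ["p", "e-", "nu_e"]))   # decay with a neutrino
```

The check only tests the additive conservation laws listed above; it says nothing about energy thresholds or reaction rates.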
Hence, if we assume that in any reaction electric charge, the baryon number,


the electron lepton number, and the muon lepton number are conserved, then we
have only four independent chemical potentials - those corresponding to
protons, electrons, electron neutrinos, and muon neutrinos. From (9) we see that
the total number of particles per unit volume in each of these species is needed to
determine the corresponding $\mu_A$ and that the number densities will be large for
large $\mu_A > 0$. These number densities are not known with any degree of accuracy,
except that (as we shall shortly see) the ratio

$$\frac{N_B}{N_\gamma} = \frac{\text{Number density of baryons}}{\text{Number density of photons}} \sim 10^{-8}\ \text{to}\ 10^{-10}$$

is small compared to 1.
The smallness of the baryon number density suggests that the number densities
of leptons may also be small compared to $N_\gamma$, and it is usually assumed that this
hypothesis provides a good justification for taking $\mu_A = 0$ for all species. We will
assume that $\mu_A = 0$ for all species in our calculations to follow.
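We then get the following integrals for the particle number density ($N_A$), the
energy density ($\varepsilon_A$), pressure ($p_A$), and entropy density ($s_A$):

$$N_A = \frac{g_A}{2\pi^2\hbar^3}\int_0^\infty \frac{P^2\,dP}{\exp[E_A(P)/kT] \pm 1}, \tag{11}$$

$$\varepsilon_A = \frac{g_A}{2\pi^2\hbar^3}\int_0^\infty \frac{P^2 E_A(P)\,dP}{\exp[E_A(P)/kT] \pm 1}, \tag{12}$$

$$p_A = \frac{g_A}{6\pi^2\hbar^3}\int_0^\infty \frac{c^2P^4\,[E_A(P)]^{-1}\,dP}{\exp[E_A(P)/kT] \pm 1}, \tag{13}$$

$$s_A = (p_A + \varepsilon_A)/T. \tag{14}$$

2.2. High- and Low-Temperature Approximations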

The above expressions become simplified for particles moving relativistically. In
this case

$$T \gg \frac{m_A c^2}{k} \equiv T_A. \tag{15}$$

The details are given in Table I for the different species of interest. The numbers
are expressed in units of the quantities for the photon ($g_A = 2$, symbol $\gamma$):

$$N_\gamma = \frac{2.404}{\pi^2}\left(\frac{kT}{c\hbar}\right)^3, \qquad s_\gamma = \frac{4\pi^2 k}{45}\left(\frac{kT}{c\hbar}\right)^3. \tag{16}$$

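Table I. Thermodynamic quantities for various particle species at $T \gg T_A$.

Particle species A | Symbol | $T_A$ (K) | $g_A$ | $N_A/N_\gamma$ | $\varepsilon_A/\varepsilon_\gamma$ | $s_A/s_\gamma$
Electron | $e^-$ | $5.93 \times 10^9$ | 2 | 3/4 | 7/8 | 7/8
Positron | $e^+$ | $5.93 \times 10^9$ | 2 | 3/4 | 7/8 | 7/8
Muon | $\mu^-$ | $1.22 \times 10^{12}$ | 2 | 3/4 | 7/8 | 7/8
Antimuon | $\mu^+$ | $1.22 \times 10^{12}$ | 2 | 3/4 | 7/8 | 7/8
Muon, electron neutrinos | $\nu_\mu$, $\nu_e$ | 0 | 1 | 3/8 | 7/16 | 7/16
and their antineutrinos | $\bar\nu_\mu$, $\bar\nu_e$ | 0 | 1 | 3/8 | 7/16 | 7/16
Pions | $\pi^+$ | $1.6 \times 10^{12}$ | 1 | 1/2 | 1/2 | 1/2
 | $\pi^-$ | $1.6 \times 10^{12}$ | 1 | 1/2 | 1/2 | 1/2
 | $\pi^0$ | $1.6 \times 10^{12}$ | 1 | 1/2 | 1/2 | 1/2
Proton | p | $\sim 10^{13}$ | 2 | 3/4 | 7/8 | 7/8
Neutron | n | $\sim 10^{13}$ ($T_n - T_p \approx 1.5 \times 10^{10}$) | 2 | 3/4 | 7/8 | 7/8
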
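The fractions 3/4, 7/8, 3/8 and 1/2 in Table I follow from the dimensionless forms of the integrals (11) and (12) for massless particles. The sketch below evaluates them numerically with a hand-written Simpson rule; the cutoff at x = 60, the step count and all function names are choices made here for illustration.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def thermal_integral(power, sign):
    """Integral of x**power / (exp(x) + sign) for x from 0 to infinity.

    sign = +1 for fermions, -1 for bosons; x stands for cP/kT.
    The upper limit is cut at x = 60, where the integrand is negligible.
    """
    def integrand(x):
        if x == 0.0:
            return 0.0
        # expm1(x) + (1 + sign) equals exp(x) + sign, without cancellation
        return x**power / (math.expm1(x) + (1 + sign))
    return simpson(integrand, 0.0, 60.0)


def ratio_to_photon(g, sign, power):
    """Ratio of a species' integral to the photon's (g = 2, boson).

    power = 2 gives the number-density ratio, power = 3 the energy-density ratio.
    """
    return (g / 2.0) * thermal_integral(power, sign) / thermal_integral(power, -1)


if __name__ == "__main__":
    print("photon number integral:", thermal_integral(2, -1))
    print("electron N/N_gamma:", ratio_to_photon(2, +1, 2))
    print("electron eps/eps_gamma:", ratio_to_photon(2, +1, 3))
    print("neutrino N/N_gamma:", ratio_to_photon(1, +1, 2))
```

The photon number integral reproduces the factor 2.404 appearing in (16).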

In this approximation, consider the electrical potential energy of any two
electrons separated by distance r. This is given by

$$U = \frac{e^2}{r}.$$

Now the average interelectron distance is given by $N_e^{-1/3} \approx c\hbar/kT$. Thus, the average
interaction energy is

$$\langle U \rangle \approx \frac{e^2}{\hbar c}\,kT.$$

However, kT measures the energy of motion of electrons. Thus, the interaction
energy is $e^2/\hbar c \approx 1/137$ of the energy of motion. Since this fraction is small, we are
justified in treating the electrons as a free gas.
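
As a quick numerical check of this fraction, the following sketch evaluates $e^2/\hbar c$ in Gaussian CGS units. The constants are standard reference values assumed here, and the function name is illustrative.

```python
# Gaussian CGS constants (assumed standard values, not quoted in the text)
E_CHARGE = 4.803e-10  # electron charge, esu
HBAR = 1.0546e-27     # reduced Planck constant, erg s
C = 2.998e10          # speed of light, cm s^-1


def interaction_to_thermal_ratio() -> float:
    """Ratio of mean Coulomb energy to kT when the spacing is c*hbar/kT."""
    return E_CHARGE**2 / (HBAR * C)


if __name__ == "__main__":
    ratio = interaction_to_thermal_ratio()
    print(f"e^2 / (hbar c) = {ratio:.5f}, i.e. about 1/{1.0 / ratio:.0f}")
```

The ratio is independent of temperature in this approximation, so the free-gas treatment holds equally well throughout the relativistic phase.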
By contrast, at low temperatures $T \ll T_A$ we have for all species with $m_A \neq 0$

$$N_A = \frac{g_A}{\hbar^3}\left(\frac{m_A kT}{2\pi}\right)^{3/2}\exp\left(-\frac{T_A}{T}\right), \qquad p_A = N_A kT. \tag{17}$$

We will often refer to this limit as the nonrelativistic approximation. (For the
photon and zero rest mass neutrino $T_A = 0$ and this approximation never
applies.)

2.3. The Behaviour of Entropy

We now recall the conservation law satisfied by $\varepsilon$ and p in the early stages of the
expanding universe, the law given by

$$\frac{d}{dS}\left(\varepsilon S^3\right) + 3pS^2 = 0, \tag{18}$$

and use it in conjunction with the second law of thermodynamics. This law tells us
that the entropy in a given volume $S^3$ stays constant as the volume expands
adiabatically. From (14) we therefore get

$$\frac{d}{dt}\left(S^3 s\right) = \frac{d}{dt}\left[\frac{S^3}{T}(p + \varepsilon)\right] = 0, \tag{19}$$

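where $s = \sum_A s_A$ is the total entropy density of all the particles in the expanding volume.
Rewriting (19) with the help of (18) we get

$$0 = \frac{1}{T}\frac{d}{dt}\left(S^3 p\right) + \frac{1}{T}\frac{d}{dt}\left(S^3\varepsilon\right) + S^3(p + \varepsilon)\frac{d}{dt}\left(\frac{1}{T}\right)
= \frac{1}{T}\left[\frac{d}{dt}\left(S^3 p\right) - 3pS^2\dot S\right] + S^3(p + \varepsilon)\frac{d}{dt}\left(\frac{1}{T}\right),$$

that is,

$$\frac{dp}{dT} = \frac{1}{T}(p + \varepsilon). \tag{20}$$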

This relation can be directly derived from (12) and (13) by a simple manipulation
of the integrals. Then, starting from (20), we can derive (19). We will use the
constancy of

$$\sigma = \frac{S^3}{T}(p + \varepsilon) \tag{21}$$

in our later calculations.

In the high-temperature approximation we get $p = \varepsilon/3 \propto S^{-4}$ from (18).
Hence, from the constancy of $\sigma$ we recover the relation (5), $T \propto S^{-1}$. A simple
relation like this does not hold if the high-temperature approximation is not
valid.
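
The steps behind the last statement can be written out as a short worked derivation. It assumes only the high-temperature relation $p = \varepsilon/3$ and the blackbody scaling $\varepsilon \propto T^4$, together with (18) and (21).

```latex
\begin{align*}
\frac{d}{dS}\left(\varepsilon S^3\right) + \varepsilon S^2 = 0
  &\;\Longrightarrow\; S\,\frac{d\varepsilon}{dS} = -4\,\varepsilon
  \;\Longrightarrow\; \varepsilon \propto S^{-4}, \\
\varepsilon \propto T^4
  &\;\Longrightarrow\; T \propto S^{-1}, \\
\sigma = \frac{S^3}{T}\,(p + \varepsilon) = \frac{4}{3}\,\frac{S^3\varepsilon}{T}
  &\;\propto\; (ST)^3 = \text{constant}.
\end{align*}
```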

3. Primordial Neutrinos
From Table I we see that for $T < 1.5 \times 10^{12}$ K, the only particles that can be
present with appreciable number densities in thermal equilibrium are $\mu^\pm$, $e^\pm$, $\nu_e$,
$\nu_\mu$, $\bar\nu_e$, $\bar\nu_\mu$, and $\gamma$. The baryons (p and n) and pions ($\pi^\pm$, $\pi^0$) will be cooled below
their critical temperatures $T_A$, so that for them the low-temperature approximation
holds. The photons, $e^\pm$ and $\mu^\pm$ follow their respective distributions of the
type (9). The neutrinos, however, require some attention, since this phase happens
to be crucial in determining the extent of their survival.
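The neutrinos are absorbed, emitted, or scattered in reactions such as the
following:

$$e^- + \mu^+ \leftrightarrow \nu_e + \bar\nu_\mu, \qquad e^+ + \mu^- \leftrightarrow \bar\nu_e + \nu_\mu,$$
$$\nu_e + \mu^- \leftrightarrow \nu_\mu + e^-, \qquad \bar\nu_e + \mu^+ \leftrightarrow \bar\nu_\mu + e^+,$$
$$\nu_e + e^- \leftrightarrow \nu_e + e^-, \qquad \nu_\mu + \mu^+ \leftrightarrow \nu_e + e^+.$$
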
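These are all examples of weak interactions. For $T \lesssim T_\mu$, the cross-section of
a typical reaction is of the order

$$\Sigma \approx g^2\hbar^{-4}c^{-4}(kT)^2, \tag{22}$$

where $g = 1.4 \times 10^{-49}$ erg cm$^3$ is the weak interaction coupling constant.
From (16) and Table I, we see that the number density of the participating particles
$e^\pm$ is of the order

$$\left(\frac{kT}{c\hbar}\right)^3,$$

while for muons we should take account of (17) and introduce an exponential
damping factor of

$$\exp\left(-\frac{T_\mu}{T}\right).$$

Thus, the typical neutrino reaction rate is

$$\eta = c\,\Sigma\left(\frac{kT}{c\hbar}\right)^3\exp\left(-\frac{T_\mu}{T}\right) = g^2\hbar^{-7}c^{-6}(kT)^5\exp\left(-\frac{T_\mu}{T}\right). \tag{23}$$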

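We must now take note of the other rate that is relevant to the maintenance of the
equilibrium of neutrinos: the rate at which a typical volume enclosing them
expands. From Einstein's equations we get, to order of magnitude,

$$H = \frac{\dot S}{S} \approx \left(G\hbar^{-3}c^{-5}\right)^{1/2}(kT)^2. \tag{24}$$

H, the Hubble constant at the particular epoch, measures the rate of expansion of
the volume in question. Thus, the ratio of the reaction rate to the expansion rate is
given by

$$\frac{\eta}{H} \approx G^{-1/2}\hbar^{-11/2}g^2c^{-7/2}(kT)^3\exp\left(-\frac{T_\mu}{T}\right)
\approx \left(\frac{T}{10^{10}\,\mathrm{K}}\right)^3\exp\left(-\frac{10^{12}\,\mathrm{K}}{T}\right)
= T_{10}^3\exp\left(-\frac{1}{T_{12}}\right). \tag{25}$$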

Here we have substituted the values of G, $\hbar$, g, c, k, and $T_\mu$ and arrived at the
above numerical expression. Further, we have written the temperature using the
notation $T_{10}$, $T_{12}$, and so on. In general, $T_n$ indicates temperatures expressed in
units of $10^n$ K.
What does (25) tell us? As the temperature drops below $10^{12}$ K, the exponential
decreases rapidly. This means that the reactions involving neutrinos run at a slower
rate compared to the expansion rate of the universe. The neutrinos then cease to
interact with the rest of the matter and therefore drop out of thermal equilibrium
as temperatures fall appreciably below $T_{12} = 1$. How far below?
The original theory of weak interactions suggested that this temperature may
be around $T_{11} = 1.3$. In the late 1960s and early 1970s, successful attempts to
unify the weak interaction with the electromagnetic interaction led to additional
(neutral current) reactions that keep neutrinos interacting with other matter at
even lower temperatures. We state here the outcome of these investigations: that
the neutrinos can remain in thermal equilibrium down to temperatures of the
order $T_{10} \approx 1$.
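
The figure of $T_{11} \approx 1.3$ for the original theory can be recovered by asking where the ratio of reaction rate to expansion rate equals one. The sketch below takes that ratio in the numerical form $T_{10}^3\exp(-1/T_{12})$ and locates the crossing by bisection; the bracketing interval, tolerance and function names are assumptions made for illustration.

```python
import math


def reaction_to_expansion_ratio(temperature_k: float) -> float:
    """Numerical form of the rate ratio: T_10^3 * exp(-1 / T_12)."""
    t10 = temperature_k / 1e10
    t12 = temperature_k / 1e12
    return t10**3 * math.exp(-1.0 / t12)


def decoupling_temperature(lo=1e10, hi=1e12, rel_tol=1e-6) -> float:
    """Bisection for the temperature at which the ratio equals 1.

    The ratio rises monotonically with T on [lo, hi], is far below 1 at lo
    and far above 1 at hi, so the bracket contains exactly one crossing.
    """
    while (hi - lo) / lo > rel_tol:
        mid = 0.5 * (lo + hi)
        if reaction_to_expansion_ratio(mid) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(f"ratio = 1 at T = {decoupling_temperature():.3e} K")
```

This reproduces only the estimate from the purely charged-current rate; the lower decoupling temperature quoted for the unified theory needs the neutral-current reactions, which are not modelled here.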
However, even though neutrinos decouple themselves from the rest of the
matter, their distribution function still retains its original form with the
temperature dropping as $T \propto S^{-1}$. This is because as the universe expands the
momentum and energy of each neutrino fall as $S^{-1}$ and the number density of
neutrinos falls as $S^{-3}$. Since the temperature of the rest of the mixture also drops
as $S^{-1}$, and since the two temperatures were equal when the neutrinos were
coupled with the rest of the matter, they continue to remain equal, even though
neutrinos and the rest of the matter are no longer in interaction with one
another.*
There is, however, another epoch when the neutrino temperature begins to
differ from the temperature of the rest of the matter. We end this section with
a discussion of this important phase in the early universe.
First consider the universe in the temperature range $T_{12} = 1$ to $T_{10} = 1$. In this
phase we have the neutrinos, the electron-positron pairs, and the photons, each
with distribution functions of the type (9) in the high-temperature approximation
(see Table I). Thus the total energy density $\varepsilon$ is the sum of the contributions of these species.

* Our remarks about neutrinos are meant to apply to all four species $\nu_e$, $\bar\nu_e$, $\nu_\mu$, $\bar\nu_\mu$.

Counting the various g-factors from Table I, we get

$$\varepsilon = \frac{9}{2}\,aT^4. \tag{26}$$

Thus, in this period the expansion equation is modified from our simplified
formula (4) to

$$\frac{\dot S^2}{S^2} = \frac{12\pi G a}{c^2}\,T^4 \tag{27}$$

and the relation (7) is changed to

$$T = \left(\frac{c^2}{48\pi G a}\right)^{1/4} t^{-1/2}, \tag{28}$$

which we may rewrite as

$$T \simeq 1.04 \times 10^{10}\, t^{-1/2}\ \mathrm{K} \qquad (t\ \text{in seconds}). \tag{29}$$

However, in the next phase the situation becomes complicated, as the $e^\pm$ pairs
are no longer relativistic. Thus, the high-temperature approximation is no longer
valid and we have to use the full formulae (12) and (13) to determine $\varepsilon$ and
p and the expansion rate of the universe. We will not go into details of this phase
but, instead, jump across to its end, when the pairs have annihilated, leaving only
photons:

$$e^- + e^+ \to \gamma + \gamma. \tag{30}$$
Thus, the energy, originally in $e^\pm$ and photons, has now vested only in photons,
raising their number and temperature. How can we evaluate this change? It is
here that (21), telling us of the constancy of $\sigma$, comes to our help.
In the relativistic phase ($T_9 > 5$) of $e^\pm$ we have

$$\sigma = \frac{4}{3}\frac{S^3}{T}\left\{\varepsilon_{e^-} + \varepsilon_{e^+} + \varepsilon_\gamma\right\} = \frac{11}{3}\,a(ST)^3. \tag{31}$$

When the $e^\pm$ have annihilated and left only photons, we have the photon
temperature $T_\gamma$ given by

$$\sigma = \frac{4}{3}\frac{S^3}{T_\gamma}\,\varepsilon_\gamma = \frac{4}{3}\,a(ST_\gamma)^3. \tag{32}$$

We now use the result that the neutrino temperature always changes as $S^{-1}$. Let
us write it as

$$T_\nu = \frac{B}{S}, \qquad B = \text{constant}. \tag{33}$$

Then (31) gives

$$\sigma = \frac{11}{3}\,aB^3\left(\frac{T}{T_\nu}\right)^3. \tag{34}$$

Similarly (32) gives

$$\sigma = \frac{4}{3}\,aB^3\left(\frac{T_\gamma}{T_\nu}\right)^3. \tag{35}$$

Now in the preannihilation era $T = T_\nu$, so that (34) tells us $\sigma = (11/3)aB^3$. After
annihilation $\sigma$ must have the same value, so we may equate it to the value given by
(35). Thus, we arrive at the conclusion that the photon temperature at the end of
$e^\pm$ annihilation has risen above the neutrino temperature by the factor

$$\frac{T_\gamma}{T_\nu} = \left(\frac{11}{4}\right)^{1/3} \approx 1.4. \tag{36}$$

So the present-day neutrino temperature is lower than the photon temperature
by the factor $(1.4)^{-1}$. If we take the latter as $\approx 3$ K, the former is $\approx 2.1$ K.
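
The arithmetic of this last step is small enough to check directly. In the sketch below the function name is illustrative, and the photon temperatures passed in are just example inputs.

```python
def neutrino_temperature(photon_temperature_k: float) -> float:
    """Relic neutrino temperature from T_gamma / T_nu = (11/4)^(1/3)."""
    return photon_temperature_k * (4.0 / 11.0) ** (1.0 / 3.0)


if __name__ == "__main__":
    for t_gamma in (3.0, 2.7):
        print(f"T_gamma = {t_gamma} K  ->  T_nu = {neutrino_temperature(t_gamma):.2f} K")
```

A photon temperature of 3 K gives a neutrino temperature a little above 2.1 K, as stated; a lower photon temperature scales the result down in proportion.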
In the above calculation, we have not taken account of neutrinos having
a small but nonzero rest mass. Nor have we considered the question of the
existence of more than two types of neutrinos in the primordial epochs. For
example, particle physicists talk about the so-called $\tau$-neutrino associated with
the $\tau$-lepton. We will take another look at neutrinos later when we will discuss
these questions anew.
4. The Neutron/Proton Ratio

We have so far developed a picture of the early universe that is best expressed
in the form of a time-temperature table of events, as shown in Table II.
In our discussion so far we have not paid much attention to baryons - the
protons and neutrons that are also present in the mixture. In our approximation
of setting the chemical potentials to zero we took the baryon number to be zero.
The validity of the approximation depended on the baryon number density being
several orders (8 to 10) of magnitude smaller than the photon density. Nevertheless, we must now take note of the existence of baryons, however small their
number density; for we need them in order to consider Gamow's idea of nucleosynthesis in the hot universe.
First notice that the temperatures Tn and Tp of Table I are very high, so that
the neutron and proton distribution functions follow the nonrelativistic approximations of (17).

Table II. A time-temperature table of events preceding nucleosynthesis in the early universe

Time since big bang (s) | Temperature (K) | Events
$\lesssim 10^{-4}$ | $> 10^{12}$ | Baryons, mesons, leptons, and photons in thermal equilibrium.
$10^{-4}$ to $10^{-2}$ | $10^{12}$ to $10^{11}$ | $\mu^\pm$ begin to annihilate and disappear from the mixture. Neutrinos begin to decouple from rest of matter.
$10^{-2}$ to 1 | $10^{11}$ to $10^{10}$ | Neutrinos decouple completely. $e^\pm$ pairs still relativistic.
1 to 180 | $10^{10}$ to $10^9$ | The pairs of $e^\pm$ annihilate and disappear, raising the photon gas temperature to $\sim 1.4$ times the temperature of neutrinos.

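Thus we get

$$N_p = \frac{2}{\hbar^3}\left(\frac{m_p kT}{2\pi}\right)^{3/2}\exp\left(-\frac{T_p}{T}\right), \qquad
N_n = \frac{2}{\hbar^3}\left(\frac{m_n kT}{2\pi}\right)^{3/2}\exp\left(-\frac{T_n}{T}\right). \tag{37}$$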

In this approximation, the neutron to proton number ratio is given by

$$\frac{N_n}{N_p} = \exp\left(\frac{T_p - T_n}{T}\right) = \exp\left(-\frac{1.5}{T_{10}}\right). \tag{38}$$

The ratio therefore drops with temperature, from near 1:1 at $T \gtrsim 10^{12}$ K to
about 5:6 at $T = 10^{11}$ K, and to 3:5 at $3 \times 10^{10}$ K.
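
The quoted ratios can be reproduced from the equilibrium formula $N_n/N_p = \exp(-1.5/T_{10})$ with a minimal sketch. The function name and the choice of sample temperatures are illustrative.

```python
import math


def neutron_to_proton_ratio(temperature_k: float) -> float:
    """Equilibrium N_n / N_p = exp(-1.5 / T_10), with T_10 = T / 1e10 K."""
    return math.exp(-1.5e10 / temperature_k)


if __name__ == "__main__":
    for temperature in (1e12, 1e11, 3e10):
        ratio = neutron_to_proton_ratio(temperature)
        print(f"T = {temperature:.0e} K  ->  N_n/N_p = {ratio:.3f}")
```

The values come out near 0.99, 0.86 and 0.61, which the text rounds to 1:1, 5:6 and 3:5. The formula applies only while the weak reactions keep neutrons and protons in equilibrium.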
For thermodynamic equilibrium to be maintained, the reactions that convert
neutrons to protons and vice-versa have to be rapid enough compared to the rate
at which the universe expands. These interactions are none other than the weak
interactions considered earlier when we discussed the decoupling of neutrinos
from the rest of the primordial brew (see Section 3). There is one difference,
however. In discussing the decoupling of neutrinos we were concerned mainly
with the reaction of a neutrino with leptons like $e^\pm$, $\mu^\pm$, and the cross-section
$\Sigma$ given by (22) was determined for such interactions. Similarly, the reaction rate
$\eta$ given by (23) was obtained by multiplying by the number densities of
participating leptons.
In the present case, the cross-section for a typical reaction like

$$\nu_e + n \leftrightarrow e^- + p$$

is larger than that for the pure leptonic reaction like

$$\nu_\mu + e^- \leftrightarrow \nu_e + \mu^-.$$

Also, the lepton densities used in (23) were considerably higher than the nucleon
densities we are considering now. So the probability of a given nucleon
interacting with any neutrino is higher than the probability of a given neutrino
interacting with any nucleon. The result is that the effective temperature at which
n and p cease to be in thermodynamic equilibrium is lower than the effective
temperature for neutrino decoupling determined earlier.
Quantitatively, instead of $\Sigma \propto T^2$ as in (22), the cross-section in the present
case goes as $\propto T$, and the effective decoupling temperature $T_*$ at which the
reaction rate is just about equal to H is $< 10^{10}$ K. Note that if the universe were
expanding faster, $T_*$ would be higher and the ratio $N_n/N_p$ at decoupling as given
by (38) would be higher.
Once the thermodynamic equilibrium ceases to be maintained, the $N_n/N_p$ ratio
is not given by (38) but by detailed consideration of specific reactions involving
the nucleons.
As the universe cooled further, this ratio was therefore determined by the
reactions that change protons to neutrons and vice-versa. These are essentially
weak interactions of the type

$$\nu_e + n \leftrightarrow e^- + p, \qquad e^+ + n \leftrightarrow \bar\nu_e + p, \qquad n \leftrightarrow p + e^- + \bar\nu_e.$$

The reaction rates are therefore determined by the cross-sections computed
according to the weak interaction theory. Until the electro-weak gauge theory
became established in the late 1970s, the V-A theory of weak interaction was used
for these computations. We will not go into details of the calculation here, the
purpose of which is to come up with a differential equation for the ratio

$$X_n = \frac{N_n}{N_p + N_n}. \tag{39}$$

If $\lambda(n \to p)$ denotes the rate at which neutrons are converted to protons and
$\lambda(p \to n)$ the corresponding rate for protons changing to neutrons, then clearly $X_n$
satisfies the equation

$$\frac{dX_n}{dt} = (1 - X_n)\,\lambda(p \to n) - X_n\,\lambda(n \to p). \tag{40}$$

The rates $\lambda$ depend on distribution functions of leptons, which in turn depend
on the temperature, which is related to the scale factor of the expanding universe.
The integration of (40) has to be done numerically, and it is continued until all $e^\pm$
pairs have dropped out of the mixture, which happens at $T \approx 10^9$ K.
When all $e^\pm$ have disappeared, it is still possible for the neutron to decay via the
reaction

$$n \to p + e^- + \bar\nu_e$$

with a characteristic time $\tau = 1013$ s. So from the time the pairs disappear to the
onset of nucleosynthesis, the neutron ratio $X_n$ will decrease by the exponential
factor $\exp(-t/\tau)$.
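
A minimal sketch of the two ingredients just described is given below: free decay with the quoted characteristic time, and a forward-Euler integration of the rate equation for $X_n$. Holding the two conversion rates constant is an illustrative simplification assumed here; in the real calculation they vary with temperature, and the function names are hypothetical.

```python
import math

NEUTRON_LIFETIME_S = 1013.0  # characteristic decay time quoted in the text


def decayed_fraction(x_n, elapsed_s, tau=NEUTRON_LIFETIME_S):
    """Neutron fraction after free decay lasting elapsed_s seconds."""
    return x_n * math.exp(-elapsed_s / tau)


def integrate_xn(x_n, rate_pn, rate_np, t_end, dt=1e-3):
    """Forward-Euler integration of dX_n/dt = (1 - X_n)*rate_pn - X_n*rate_np.

    rate_pn (p -> n) and rate_np (n -> p) are held constant here, which is an
    assumption made purely for illustration.
    """
    for _ in range(int(round(t_end / dt))):
        x_n += dt * ((1.0 - x_n) * rate_pn - x_n * rate_np)
    return x_n


if __name__ == "__main__":
    print("after 180 s of free decay:", decayed_fraction(0.2, 180.0))
    print("constant-rate limit:", integrate_xn(0.5, 1.0, 3.0, 10.0))
```

With constant rates the solution relaxes to rate_pn / (rate_pn + rate_np), which is a convenient sanity check before temperature-dependent rates are substituted.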


Thus, the ratio of neutrons to protons is uniquely determined at the time


nucleosynthesis begins, once we know all the parameters of the weak interaction
process. This is one good aspect of primordial nucleosynthesis theory, which we
will now proceed to discuss.

5. The Synthesis of Helium and Other Nuclei

A typical nucleus Q is described by two quantities A = atomic mass and
Z = atomic number, and is written ${}^A_Z Q$.
This nucleus has Z protons and (A - Z) neutrons. If $m_Q$ is the mass of the nucleus, its
binding energy is given by

$$B_Q = \left[Zm_p + (A - Z)m_n - m_Q\right]c^2. \tag{41}$$

Let us now consider a unit volume of cosmological medium containing $N_N$
nucleons, bound or free. Since the masses of protons and neutrons are nearly
equal, we may denote the typical nucleon mass by m. Thus $m_n \approx m_p = m$. If there
are $N_n$ free neutrons and $N_p$ free protons in the mixture,

$$X_n = \frac{N_n}{N_N}, \qquad X_p = \frac{N_p}{N_N} \tag{42}$$

will denote the fractions by weight of free neutrons and free protons. If a typical
bound nucleus Q has atomic mass A and there are $N_Q$ of them in our unit volume,
we may denote the weight fraction of Q by

$$X_Q = \frac{N_Q A}{N_N}. \tag{43}$$

Now at very high temperatures ($T \gg 10^{10}$ K), the nuclei are expected to be in
thermal equilibrium. However, even at these temperatures, $T \ll T_Q$ and (17)
holds. Further, since we are now concerned with relative number densities, we can
no longer ignore the chemical potentials. Thus,

$$N_Q = \frac{g_Q}{\hbar^3}\left(\frac{m_Q kT}{2\pi}\right)^{3/2}\exp\left(\frac{\mu_Q - m_Q c^2}{kT}\right), \tag{44}$$

where we reinstated the chemical potentials $\mu_Q$. Since chemical potentials are
conserved in nuclear reactions,

$$\mu_Q = Z\mu_p + (A - Z)\mu_n, \tag{45}$$

assuming that the nuclei were built out of neutrons and protons by nuclear
reactions.

Sometimes the suffix Z is suppressed.


The unknown chemical potentials can be eliminated between (44) and similar
relations for $N_p$ and $N_n$. The result is expressed in this form

$$X_Q = \frac{1}{2}\,g_Q\,A^{5/2}\,X_p^Z\,X_n^{A-Z}\,\zeta^{A-1}\exp\left(\frac{B_Q}{kT}\right), \tag{46}$$

where

$$\zeta = \frac{1}{2}\,N_N\left(\frac{mkT}{2\pi\hbar^2}\right)^{-3/2}. \tag{47}$$

For an appreciable buildup of complex nuclei, T must drop to a low enough
value to make $\exp(B_Q/kT)$ large enough to compensate for the smallness of $\zeta^{A-1}$.
This happens for nucleus Q when T has dropped down to

$$T_Q \approx \frac{B_Q}{k\,(A - 1)\,|\ln\zeta|}. \tag{48}$$

Let us consider what happens when we apply the above formula to the nucleus
of $^4$He. The binding energy of this nucleus is $\approx 4.3 \times 10^{-5}$ erg. If we substitute
this value in (48) and estimate $N_N$ from the presently observed value of nucleon
density of around $10^{-6}$ cm$^{-3}$, we find that $T_Q$ is as low as $\approx 3 \times 10^9$ K. However,
at this low temperature the number densities of participating nucleons are so low
that four-body encounters leading to the formation of $^4$He are extremely rare.
Thus, the underlying assumption of thermodynamic equilibrium (which requires
frequent collisions) leading to (48) becomes invalid. We therefore need to proceed
in a less ambitious fashion in order to describe the buildup of complex nuclei.
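
The estimate of $T_Q$ for $^4$He can be reproduced with a short fixed-point iteration, because $\zeta$ itself depends on temperature. The sketch below assumes that the nucleon density scales back from its present value as $T^3$ with a present temperature of 2.7 K, and uses standard CGS constants; these inputs and the function names are assumptions made here, not values fixed by the text.

```python
import math

# CGS constants (assumed standard values)
K_B = 1.3807e-16        # Boltzmann constant, erg K^-1
HBAR = 1.0546e-27       # reduced Planck constant, erg s
M_NUCLEON = 1.6726e-24  # nucleon mass, g


def zeta(temperature_k, n_now=1e-6, t_now=2.7):
    """zeta = (1/2) N_N (m k T / 2 pi hbar^2)^(-3/2), with N_N scaled as T^3."""
    n_nucleon = n_now * (temperature_k / t_now) ** 3
    thermal = (M_NUCLEON * K_B * temperature_k / (2.0 * math.pi * HBAR**2)) ** 1.5
    return 0.5 * n_nucleon / thermal


def buildup_temperature(binding_erg, mass_number, t_guess=1e9, iterations=50):
    """Iterate T = B / (k (A - 1) |ln zeta(T)|) until it settles."""
    temperature = t_guess
    for _ in range(iterations):
        log_term = abs(math.log(zeta(temperature)))
        temperature = binding_erg / (K_B * (mass_number - 1) * log_term)
    return temperature


if __name__ == "__main__":
    t_he4 = buildup_temperature(4.3e-5, 4)
    print(f"T_Q for helium-4: {t_he4:.2e} K")
```

The iteration converges quickly because the logarithm varies slowly with temperature, and the result lies close to the $3 \times 10^9$ K quoted above.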
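Hence, we try using two-body collisions (which are not so rare) to describe the
buildup of heavier nuclei. Thus deuterium (d), tritium ($^3$H), and helium ($^3$He, $^4$He)
are formed via reactions like

$$p + n \leftrightarrow d + \gamma, \qquad d + d \leftrightarrow {}^3\mathrm{He} + n \leftrightarrow {}^3\mathrm{H} + p, \qquad {}^3\mathrm{H} + d \leftrightarrow {}^4\mathrm{He} + n. \tag{49}$$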

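Since formation of deuterium involves only two-body collisions, it quickly
reaches its equilibrium abundance as given by (46) with A = 2, Z = 1, and $g_d = 3$:

$$X_d = 3 \cdot 2^{3/2}\,X_p X_n\,\zeta\,\exp\left(\frac{B_d}{kT}\right). \tag{50}$$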

However, the binding energy $B_d$ of deuterium is low so that unless T drops to less
than $10^9$ K, $X_d$ is not high enough to start further reactions leading to $^3$H, $^3$He,
and $^4$He. In fact, the reactions given in (49) with the exception of the first one do
not proceed fast enough until the temperature has dropped to $\approx 8 \times 10^8$ K.
Although at such temperatures nucleosynthesis does proceed rapidly enough,
it cannot go beyond $^4$He. This is because there are no stable nuclei with A = 5 or
8, and this means we cannot go on adding neutrons and protons to build nuclei
heavier than $^4$He. So the process terminates there. Detailed calculations by
several authors have now established this result quite firmly.
So starting with primordial neutrons and protons, we end up finally with $^4$He
nuclei and free protons. All neutrons have been gobbled up by helium nuclei.
Thus, if we consider the fraction by weight of primordial helium, it is very simply
related to the quantity $X_n$, the neutron concentration before nucleosynthesis
began. Denoting this fraction by weight by the symbol Y, we get

$$Y = 2X_n. \tag{51}$$
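
The relation between the neutron fraction and the helium mass fraction is simple enough to illustrate in a couple of lines. The neutron-to-proton ratio of 1/7 used in the example is an assumed illustrative input, not a value given in the text.

```python
def neutron_fraction(n_over_p: float) -> float:
    """X_n = N_n / (N_n + N_p), expressed through the ratio N_n / N_p."""
    return n_over_p / (1.0 + n_over_p)


def helium_mass_fraction(x_n: float) -> float:
    """Y = 2 X_n: every neutron ends up in helium-4, paired with one proton."""
    return 2.0 * x_n


if __name__ == "__main__":
    x_n = neutron_fraction(1.0 / 7.0)  # illustrative ratio, assumed here
    print(f"X_n = {x_n:.3f}, Y = {helium_mass_fraction(x_n):.2f}")
```

For this assumed ratio the helium fraction is one quarter by weight.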

In Figure 1 the cosmic weight fractions of $^4$He, $^3$He, and $^2$H and so on are plotted
against a parameter $\eta$ defined by

$$\eta = \left(\frac{\rho_0}{2.7 \times 10^{-26}\ \mathrm{g\,cm^{-3}}}\right)\left(\frac{3}{T_0}\right)^3. \tag{52}$$

Thus, $\eta$ essentially measures the nucleon density in the early universe through the
formula

$$\rho_N = \eta\,T_9^3\ \mathrm{g\,cm^{-3}}, \qquad T_9 < 3. \tag{53}$$

Note that the $^4$He weight fraction is insensitive to the parameter $\eta$. This is
because, as we saw just now, it only depends on $X_n$, which in turn depends more
critically on the epoch when the weak interaction rate fell below the expansion
rate. If we go back to (38), we see that in the very early stages the neutron/proton
ratio depends on temperature $T_*$. A faster expansion rate implies that the ratio
becomes frozen at a higher temperature and so is higher, thus leading to a higher
$^4$He abundance. However, the expansion rate in the early stage does not depend
sensitively on the parameter $\eta$. This is why the curve for $^4$He in Figure 1 is nearly
flat, with Y in the neighborhood of 0.25.
In contrast to the behavior of Y, the abundances of other nuclei critically
depend on $\eta$. These abundances are very small compared to Y. Only deuterium
and $^3$He eventually survive; $^3$H (tritium) decays to $^3$He. Of nuclei heavier than
$^4$He, only $^7$Li (lithium) appears in any appreciable quantity, although smaller
than $^3$He. The most interesting situation exists for deuterium, whose abundance
sharply drops as $\eta$ rises above $10^{-4}$. For $T_0 = 3$ K, this corresponds to
(54)

Comparing this with the densities of Friedmann models, we see that for $h_0 = 1$,
$\Omega_0 \approx 0.12$ and, hence, $q_0 \approx 0.06$. Here we have used the present Hubble constant
$H_0$ as $100\,h_0$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_0 = \rho_0$/closure density; $q_0$ is the deceleration
parameter. Therefore, if even a small amount of deuterium believed to be
primordial in origin were found, Friedmann models of the closed variety would
be ruled out. There is, however, a loophole in this argument to which we will
return later.
We can sum up by saying that Gamow's expectation that the early hot universe
would synthesize all types of nuclei has been only partially fulfilled. To obtain
complex nuclei heavier than $^4$He (and possibly $^7$Li), astrophysicists have to look
to other sources: the stars. (In Figure 1, the primordial production of nuclei with
atomic weights exceeding 11 is shown by the curve $A \geq 12$.)

Fig. 1. [Plot not reproduced; horizontal axes: $\eta$ (bottom) and $\rho_0(3/T_0)^3$ in g cm$^{-3}$ (top).]

6. The Microwave Background

Gamow and his colleagues Alpher and Herman made another prediction,
however, that appears to have received confirmation. This is the prediction that
the photons of the early hot era would have cooled down to provide a thermal
radiation background in the microwaves. As mentioned earlier, such radiation
was first detected in 1965 by Penzias and Wilson. To see how this background
forms we have to follow our history of the early universe to stages subsequent to
nucleosynthesis.
The era of nucleosynthesis took place when the temperature was around $10^9$ K.
The universe in subsequent phases cooled as it expanded with the radiation
temperature dropping as $S^{-1}$. The presence of nuclei, free protons, and electrons
did not have much effect on the dynamics of the universe, which was still
radiation-dominated. However, these particles, especially the lightest of them, the
electrons, acted as scattering centers for the ambient radiation and kept it
thermalized. The universe was therefore quite opaque to start with.
However, as the universe cooled, the electron-proton electrical attraction
began to assert itself. In detailed calculations performed by P. J. E. Peebles, the
mixture of electrons and protons and of hydrogen atoms was studied at varying
temperatures. Because of Coulomb attraction between the electron and proton,
the hydrogen atom has a certain binding energy B. The problem of determining
the relative number densities of free electrons, free protons (that is, ions), and the
neutral H-atoms in thermal equilibrium is therefore analogous to that which we
considered earlier in deriving (46) in Section 5 for the mixture of free and bound
nucleons. Following the same method, we arrive at the following formula relating
the number densities of electrons ($N_e$), protons ($N_p = N_e$), and H-atoms ($N_H$) at
a given temperature T:

$$\frac{N_e^2}{N_H} = \left(\frac{m_e kT}{2\pi\hbar^2}\right)^{3/2}\exp\left(-\frac{B}{kT}\right), \tag{55}$$

where $m_e$ = electron mass. (This equation is a particular case of Saha's ionization
equation.)
Writing $N_B$ for the total baryon number density, we may express the fraction of
ionization by the ratio

$$x = \frac{N_e}{N_B}.$$

Then, since $N_H = N_B - N_e$, we get from (55)

$$\frac{x^2}{1 - x} = \frac{1}{N_B}\left(\frac{m_e kT}{2\pi\hbar^2}\right)^{3/2}\exp\left(-\frac{B}{kT}\right). \tag{56}$$

For the H-atom, B = 13.59 eV. Substituting for various quantities on the right-hand
side of (56), we can solve for x as a function of T. The results show that
x drops sharply from 1 to near zero in the temperature range of $\sim 5000$ to 2500 K,
depending on the value of $N_B$, that is, on the parameter $\Omega_0 h_0^2$. For example, for
$\Omega_0 h_0^2 = 0.1$, x = 0.003 at T = 3000 K.
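
The sharp drop in ionization can be seen by solving the Saha relation for x directly. The sketch below treats it as a quadratic in x; the closure density of $1.88 \times 10^{-29}$ g cm$^{-3}$ for $h_0 = 1$, a present temperature of 3 K, the $T^3$ scaling of the baryon density and the CGS constants are all assumptions made here, so the output should be read as an order-of-magnitude check rather than a reproduction of the quoted figure.

```python
import math

# CGS constants (assumed standard values)
K_B = 1.3807e-16         # Boltzmann constant, erg K^-1
HBAR = 1.0546e-27        # reduced Planck constant, erg s
M_E = 9.109e-28          # electron mass, g
M_P = 1.6726e-24         # proton mass, g
B_H = 13.59 * 1.602e-12  # hydrogen binding energy, erg
RHO_CLOSURE = 1.88e-29   # closure density for h_0 = 1, g cm^-3 (assumed)


def baryon_density(temperature_k, omega_h2, t_now=3.0):
    """Baryon number density (cm^-3), scaled back from today as T^3."""
    return omega_h2 * RHO_CLOSURE / M_P * (temperature_k / t_now) ** 3


def ionization_fraction(temperature_k, omega_h2=0.1, t_now=3.0):
    """Solve x^2 / (1 - x) = rhs for the root with 0 < x < 1."""
    thermal = (M_E * K_B * temperature_k / (2.0 * math.pi * HBAR**2)) ** 1.5
    rhs = (thermal * math.exp(-B_H / (K_B * temperature_k))
           / baryon_density(temperature_k, omega_h2, t_now))
    # Positive root of x^2 + rhs*x - rhs = 0, written to avoid cancellation
    return 2.0 * rhs / (rhs + math.sqrt(rhs * rhs + 4.0 * rhs))


if __name__ == "__main__":
    for temperature in (5000.0, 4000.0, 3000.0, 2500.0):
        print(f"T = {temperature:.0f} K  ->  x = {ionization_fraction(temperature):.4f}")
```

Under these assumptions x falls from essentially 1 at 5000 K to a few times $10^{-3}$ at 3000 K, the same order as the value quoted above.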
Thus, by this time most of the free electrons have been removed from the
cosmological brew and, as a result, the main agent responsible for the scattering


of radiation disappears from the scene. The universe becomes effectively


transparent to radiation. This is called the 'recombination epoch'.
The transparency of the universe means a light photon can go a long way
($\sim c/H$) without being absorbed or scattered. Therefore, this epoch signifies the
beginning of the new phase when matter and radiation become decoupled. This
phase has lasted up to the present epoch. During this phase, the frequency of each
photon is redshifted according to the rule

$$\nu \propto \frac{1}{S},$$

while the number density of photons falls as

$$N_\gamma \propto \frac{1}{S^3}.$$

It is easy to see that under these conditions the photon distribution function
preserves the Planckian form with the temperature dropping as

$$T \propto \frac{1}{S}.$$

A present background temperature of ~3 K therefore means that the epoch when
matter decoupled from radiation corresponds to a redshift of ~$10^3$. However, we
also find that the universe changed over from being radiation-dominated to
matter-dominated around the same epoch. Why the transition from opaqueness
to transparency and that from radiation domination to matter domination should
take place around the same time is at present unexplained and must be
considered a coincidence.
Another result, as yet unexplained by early universe physics, is the observed
ratio of photons to baryons

$$\frac{N_\gamma}{N_B} = 4.57 \times 10^{7}\,(\Omega_0 h_0^2)^{-1}\left(\frac{T_0}{3}\right)^{3}. \tag{57}$$

This ratio has been conserved since the time the universe became essentially
transparent, although both $N_\gamma$ and $N_B$ can be studied theoretically at even earlier
epochs. Why the above ratio and no other? Later we will discuss some ideas from
particle physics that are intended to throw light on this mystery.
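The order of magnitude of the photon-to-baryon ratio can be checked from first principles. The sketch below is illustrative only: it assumes a blackbody photon density $(2\zeta(3)/\pi^2)(kT/\hbar c)^3$, a baryon density equal to $\Omega_0 h_0^2$ times the closure density for 100 km/s/Mpc divided by the proton mass, and standard SI constants that do not come from the text. The function names are hypothetical.

```python
import math

# Standard SI constants (assumed values).
K_B = 1.380649e-23
HBAR = 1.054571817e-34
C_LIGHT = 2.99792458e8
G_NEWTON = 6.67430e-11
M_P = 1.67262192e-27
MPC = 3.0857e22
ZETA3 = 1.2020569  # Riemann zeta function at 3


def photon_density(T):
    """Blackbody photon number density (m^-3) at temperature T (K)."""
    return 2.0 * ZETA3 / math.pi**2 * (K_B * T / (HBAR * C_LIGHT)) ** 3


def baryon_density_today(omega_h2):
    """Present baryon number density (m^-3), all matter counted as baryons."""
    h100 = 100.0e3 / MPC
    rho_closure = 3.0 * h100**2 / (8.0 * math.pi * G_NEWTON)
    return omega_h2 * rho_closure / M_P


def photon_to_baryon(omega_h2=1.0, T0=3.0):
    """Ratio N_gamma / N_B for the given density parameter and temperature."""
    return photon_density(T0) / baryon_density_today(omega_h2)


if __name__ == "__main__":
    print(f"N_gamma / N_B = {photon_to_baryon():.3e}")
```

With $\Omega_0 h_0^2 = 1$ and $T_0 = 3$ K the result is a few times $10^7$, the same order as the coefficient in (57); small differences reflect the constants adopted. The ratio scales as $T_0^3$ and inversely with $\Omega_0 h_0^2$.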
The important signature of the relic radiation is, however, its spectrum. In
addition, we will consider a few effects that may cause small perturbations of the
radiation background. But these apart, we should find the background spectrum
to be very close to the Planckian form. Observations to date confirm this
prediction with $T_0 \approx 2.7$ K.

J. V. Narlikar

7. Anisotropies of the Microwave Background

Assuming that the microwave background is the relic of a hot big bang, there are
two types of anisotropies that, if found, should give us clues to the early history
of the universe, especially about the recombination era. They are discussed below.

7.1. Small-Angle Anisotropy

Theories of galaxy formation place lower limits on the fluctuations $\delta\rho/\rho$, the
density contrast at the recombination epoch. Assuming that the fluctuations are
adiabatic, the particle number density will vary as the cube of the radiation
temperature, therefore

$$\left(\frac{\delta T}{T}\right)_R = \frac{1}{3}\left(\frac{\delta\rho}{\rho}\right)_R, \tag{58}$$

where the subscript R denotes the recombination epoch.


Since the universe is optically thin after this epoch, these fluctuations will be
imprinted on the radiation background and would be observed to this day. That
is, if we sweep across the sky we should see ups and downs in the background
temperature. What should be the order of magnitude of this fluctuation in
temperature at the present epoch? Over what characteristic angular size should
we observe these fluctuations?
Simple theories of galaxy formation suggest that we should have present-day
fluctuations of $\delta T/T$ in the range ~$3 \times 10^{-3}$ to $10^{-4}$. This is of course true on
the assumption of optical thinness mentioned before.
The typical angular size of the fluctuation is

$$\Delta\theta \approx 23\left(\frac{M}{10^{11} M_\odot}\right)^{1/3} (h_0 q_0^2)^{1/3}\ \text{arcsec}, \tag{59}$$

where M is the typical galaxy mass. Thus, galaxy formation should leave a
characteristic patchiness of angular size ~20 arcsec. Actual observations,
however, reveal no fluctuations down to $\Delta T/T < 5 \times 10^{-5}$. These null
observations severely constrain the theories of galaxy formation.

7.2. Large-Angle Anisotropy

Particle horizons restrict the distances over which physical signals can travel. It
could be argued that the universe may not have been homogeneous to start with
(as presumed by the cosmological principle), but that it achieved homogeneity by
physical transport of energy and momentum. In that case, at any given epoch the
size of a homogeneous region cannot exceed the diameter of the particle horizon
at that epoch.


Since the last epoch at which a thorough mixing of the radiation background
occurred was that of redshift $z_R$, it is relevant to ask what the
size of the particle horizon was at that epoch. Calculations show that the diameter
of the horizon was

$$2R_H = \frac{2c}{H_0}\,\frac{1}{(1+z_R)\sqrt{2q_0-1}}\,\cos^{-1}\left[1 - \frac{2q_0-1}{q_0(1+z_R)}\right] \tag{60}$$

for the model with $q_0 > 1/2$. Similar results can be obtained for $q_0 \le 1/2$.
The angle $\theta_H$ subtended by the horizon at us is given by

$$\sin\frac{\theta_H}{2} = \frac{q_0\sqrt{1+2q_0 z_R}}{q_0 z_R + (q_0-1)\left\{\sqrt{1+2q_0 z_R}-1\right\}}. \tag{61}$$

For large redshift, $z_R \sim 1000$, (61) can be approximated to give

$$\theta_H \sim 2\sqrt{\frac{2q_0}{z_R}} \sim 5^\circ\sqrt{q_0}. \tag{62}$$

The small angle coming out of this analysis implies that if the radiation
background was thermalized at an early epoch such as $z_R = 1000$, the smallness
of particle horizons should lead to a patchiness in the present intensity
distribution across the sky. Is such patchiness observed? Again, observations
show a null result. Thus, the observed smoothness of the microwave background
is at present a great embarrassment to theorists.
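The size of the horizon angle can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the closed-model ($q_0 > 1/2$) expression $\sin(\theta_H/2) = q_0\sqrt{1+2q_0 z_R}\,/\,[q_0 z_R + (q_0-1)(\sqrt{1+2q_0 z_R}-1)]$ and its large-redshift limit $\theta_H \approx 2\sqrt{2q_0/z_R}$, and the function names are hypothetical.

```python
import math


def horizon_angle_exact(q0, z_r):
    """Angle (radians) subtended today by the particle horizon at redshift z_r."""
    root = math.sqrt(1.0 + 2.0 * q0 * z_r)
    sine_half = q0 * root / (q0 * z_r + (q0 - 1.0) * (root - 1.0))
    return 2.0 * math.asin(sine_half)


def horizon_angle_approx(q0, z_r):
    """Large-redshift approximation to the same angle (radians)."""
    return 2.0 * math.sqrt(2.0 * q0 / z_r)


if __name__ == "__main__":
    for q0 in (0.6, 1.0, 2.0):
        exact = math.degrees(horizon_angle_exact(q0, 1000.0))
        approx = math.degrees(horizon_angle_approx(q0, 1000.0))
        print(f"q0 = {q0}: exact {exact:.2f} deg, approx {approx:.2f} deg")
```

For $z_R = 1000$ both expressions give a few degrees, so regions of the sky separated by more than that could not have been in causal contact at recombination in these models.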

8. Cosmology and Particle Physics

We have so far discussed the properties of the big bang universe starting from the
epoch in which it was ~$10^{-4}$ sec old, when a mixture of baryons, mesons,
leptons, and photons was in thermodynamic equilibrium with a temperature of
~$10^{12}$ K. We discussed how this hot primordial gas evolved as the universe
expanded and cooled down. We ended our story with the formation of the helium
nucleus, by which time the universe was ~3 minutes old.
In the 1960s the above range of epochs would have been considered as
describing the early universe. Today the implication of this phrase has changed.
The 'early universe' now implies the era preceding the above phase, when matter
was in an even more elementary form than that considered above. The reason for
this shift lies less in any development in cosmology than in particle physics. The
remarkable developments in particle physics, which signify progress towards
a unification of the basic interactions of physics, have found their echoes in
cosmology.
So far, particle physicists have relied on the use of powerful accelerators to
study the interaction of particles at high energy. From elementary quantum
theory, it follows that to be able to probe smaller and smaller distances, higher
and higher momenta must be achieved. Thus, high-energy accelerators are
required in order to probe the structure of particles like the proton or the pion.
The present accelerators achieve energies of the order of a few tens or hundreds of
GeV (1 GeV = $10^9$ eV).
In theories of unification, however, interesting phenomena are predicted at
energies ~$10^{15}$ GeV. Energies as high as this are far beyond what
can be achieved by present technology.
It is against this background that particle physicists have turned to cosmology
in the realization that the early hot universe is the poor man's high-energy
accelerator. This is not the first time physicists have turned to astronomy in order
to study the behavior of basic physical processes under conditions unattainable
in a terrestrial laboratory. Even before thermal fusion could be achieved on the
Earth, physicists were studying the process inside stars.
Naturally, the interplay of cosmology and particle physics that we plan to
discuss in this section is highly speculative on both fronts. It depends on the
validity of the cosmological model and on the viability of (as yet fluid) ideas of
particle physics. This should be borne in mind throughout the various
calculations given here.
Let us first consider what particles might exist in the early universe, out of
which the baryons and mesons are formed. This information is supplied by
particle physics.
The masses of particles like quarks, leptons, etc. are expressed in units of
MeV (1 MeV = $10^6$ eV). We have so far not introduced this unit. It is convenient
to do so now, since we shall be using many ideas from particle physics, where this
unit is commonly used. Thus, for each mass m expressed in grams, $mc^2$ is energy
expressed in ergs. We then use the following conversion scale:
1 MeV = $1.6021917 \times 10^{-6}$ erg.
Further, since we are going to describe the hot universe, it is also convenient to
express the temperature in the same unit. Thus, for T expressed in Kelvin, kT is
energy expressed in ergs, which can be written in units of MeV. We therefore have
1 gram = $5.618 \times 10^{26}$ MeV,
1 Kelvin = $8.617 \times 10^{-11}$ MeV.

Although these conversion factors involve many powers of 10, they show why the
MeV is a good unit for the early universe. For example, a temperature of the order
of $10^{12}$ K corresponds to about 86 MeV.
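The conversions between grams, kelvin and MeV can be sketched in a few lines. This is only an illustration: the erg-per-MeV factor is the one quoted above, while the speed of light and the Boltzmann constant are standard CGS values assumed here, and the function names are hypothetical.

```python
ERG_PER_MEV = 1.6021917e-6      # conversion factor quoted in the text
C_CM_PER_S = 2.99792458e10      # speed of light, cm/s (assumed standard value)
K_B_ERG_PER_K = 1.380649e-16    # Boltzmann constant, erg/K (assumed standard value)


def grams_to_mev(mass_g):
    """Rest energy m c^2 of a mass in grams, expressed in MeV."""
    return mass_g * C_CM_PER_S**2 / ERG_PER_MEV


def kelvin_to_mev(temperature_k):
    """Thermal energy kT of a temperature in kelvin, expressed in MeV."""
    return K_B_ERG_PER_K * temperature_k / ERG_PER_MEV


if __name__ == "__main__":
    print(f"1 gram   = {grams_to_mev(1.0):.4e} MeV")
    print(f"1 kelvin = {kelvin_to_mev(1.0):.4e} MeV")
```

Both results land close to the conversion factors listed above, and the same two functions can be reused wherever a temperature or a mass has to be quoted in MeV.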
We end this section by recalling from earlier work the result that relates the
temperature of the universe to its age and is given by the Einstein equation

$$\frac{\dot S^2}{S^2} = \frac{8\pi G}{3}\,\rho. \tag{63}$$

If there are bosons with a total $g_b$ of g-factors and fermions with a total $g_f$ of
g-factors, then

$$\rho c^2 = \tfrac{1}{2}\, g\, a T^4 \tag{64}$$

with

$$g = g_b + \tfrac{7}{8}\, g_f. \tag{65}$$

Thus, we have for g = constant

$$S \propto t^{1/2} \tag{66}$$

with

$$t = \left(\frac{3c^2}{16\pi G a}\right)^{1/2} g^{-1/2}\, T^{-2}. \tag{67}$$
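The age-temperature relation for a radiation-dominated universe, $t = (3c^2/16\pi G a)^{1/2} g^{-1/2} T^{-2}$, can be evaluated directly. The sketch below is illustrative: it uses standard CGS values for G, c and the radiation constant a (assumed, not taken from the text), the kelvin-to-MeV factor $8.617 \times 10^{-11}$, and hypothetical function names.

```python
import math

G_CGS = 6.67430e-8          # gravitational constant, cm^3 g^-1 s^-2 (assumed)
C_CGS = 2.99792458e10       # speed of light, cm/s (assumed)
A_RAD = 7.5657e-15          # radiation constant, erg cm^-3 K^-4 (assumed)
MEV_PER_KELVIN = 8.617e-11  # kT in MeV for T = 1 K


def age_seconds(temperature_k, g):
    """Age (s) of a radiation-dominated universe at temperature T (K)."""
    prefactor = math.sqrt(3.0 * C_CGS**2 / (16.0 * math.pi * G_CGS * A_RAD))
    return prefactor / math.sqrt(g) / temperature_k**2


def age_seconds_from_mev(temperature_mev, g):
    """Same relation with the temperature supplied as kT in MeV."""
    return age_seconds(temperature_mev / MEV_PER_KELVIN, g)


if __name__ == "__main__":
    print(f"coefficient for g = 1, T = 1 MeV: {age_seconds_from_mev(1.0, 1.0):.2f} s")
```

With g = 1 and kT = 1 MeV the age comes out at about 2.4 s, which fixes the numerical coefficient when the temperature is expressed in MeV.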

This relation can be expressed in MeV as

$$t_{\text{second}} = 2.4\, g^{-1/2}\, T_{\text{MeV}}^{-2}. \tag{68}$$

9. Survival of Massive Particles

We will begin with a simple extrapolation of the approach adopted earlier. We
will assume in this section that quarks have combined to form particles (and
antiparticles) and investigate the criteria that determine the survival of a
particular species of particles. In the ideal gas approximation, we will assume the
distribution functions to be those given by (9). In the relativistic
(high-temperature) approximation of Section 2, we have the following formula for
the number density of particles of species A:

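$$N_A = \eta\, g_A N_\gamma = \eta\, g_A\, \frac{2.4}{\pi^2}\left(\frac{kT}{c\hbar}\right)^3, \tag{69}$$

where $N_\gamma$ is the number density of photons and $\eta = 1/2$ for bosons and
$3/8$ for fermions. In the nonrelativistic approximation, we get

$$N_A = \frac{g_A}{\hbar^3}\left(\frac{m_A kT}{2\pi}\right)^{3/2} \exp\left(-\frac{m_A c^2}{kT}\right) \quad \text{for } T \ll T_A. \tag{70}$$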

The assumption leading to (69) or (70) is that the species is in thermodynamic
equilibrium with the rest of the particles. For (69) to hold we need $T \gg T_A \equiv
m_A c^2/k$, while for (70) to hold we should have $T \ll T_A$. Exactly similar results
must hold if species A has antiparticles $\bar{A}$. To fix ideas (since we are eventually
going to use these formulae for baryons, that is, protons and neutrons), we will assume
A to be a fermion. Thus, $\eta = 3/8$.
In general, A and $\bar{A}$ may annihilate if they are brought together. In a typical
reaction, two photons will be produced:

$$A + \bar{A} \to \gamma + \gamma. \tag{71}$$

A comparison of the reaction rate with the rate of expansion of the universe
finally yields the surviving baryon-to-photon ratio as

$$\frac{N_A}{N_\gamma} \approx 2 \times 10^{-18}. \tag{72}$$

Compare this with the observed range:

$$\frac{N_A}{N_\gamma} \approx 2 \times 10^{-8}\,(\Omega_0 h_0^2)\left(\frac{T_0}{3}\right)^{-3}. \tag{73}$$

Since $T_0 \approx 3$ and $\Omega_0 h_0^2$ is not expected to be lower than ~$10^{-3}$ even in the most
extreme case, we have a large discrepancy to account for. There is one further
point of criticism. If we are sure that the universe is made up predominantly of
matter, then $N_A \gg N_{\bar A}$ and the formula (73) applies to $N_A$ ($\approx N_A - N_{\bar A} \approx$ baryon
number density). However, our analysis so far is symmetric between matter and
antimatter and so leads to $N_A = N_{\bar A}$. Clearly new inputs are necessary in the
discussion given above if we are to understand why $N_A \gg N_{\bar A}$ and why $N_A/N_\gamma$ is
as high as indicated by (73).

10. Problems of the Very Early Universe

The baryon to photon ratio described above poses a problem to be solved by the
scenarios purporting to describe the very early universe. It is clear that new inputs
are needed to understand the observed ratio $N_A/N_\gamma$.
On the observational side, our understanding of the material composition of
the universe is largely based on the data provided by the electromagnetic
radiation. Since this radiation treats matter and antimatter in a symmetrical
manner, the distinction between the two cannot be made by this method. Cosmic
rays in the Galaxy and the Local Group of galaxies may be used to argue that we
are living in a matter dominated area. But for remote parts of the universe, the
assumption of a net baryon number corresponding to (73) is largely a conjecture.
Assuming that (73) is characteristic of the whole universe, we see that new
inputs are needed, such as (i) the possibility of baryon nonconserving processes in
particle physics, (ii) the probable lack of thermodynamic equilibrium in the
processes of the very early universe, and (iii) a likely asymmetry between matter
and antimatter at very high energies.
The Grand Unified Theories (GUTs) and other versions of unification
attempts are therefore likely to play an important role in explaining $N_A/N_\gamma$. For
example, the simple SU(5) model has been used to bring this ratio into the range
$10^{-12}$ to $10^{-4}$, which includes the observed range (73). However, since the GUTs
themselves are in a state of flux so far as their final form is concerned, it is too early
to say that the problem of explaining $N_A/N_\gamma$ has been solved.
The other important problems of the very early universe are briefly discussed
below.

10.1. The Horizon Problem

In the very early universe, the size of the particle horizon was extremely small. At
time t > 0 the particle horizon restricts the range of communication to ~ct. At
$T = 10^{15}$ GeV, $g \approx 100$ (values characteristic of the GUTs epoch) we get from (68),

$$t \sim 10^{-37}\ \text{s}.$$

Correspondingly, the horizon range is $R_H \sim 3 \times 10^{-27}$ cm. As the temperature
dropped from $10^{15}$ GeV to the present value of 3 K ~ $2.6 \times 10^{-13}$ GeV, the size
$R_H$ would have expanded to

$$R_H \times \frac{10^{15}}{2.6 \times 10^{-13}} \sim 10\ \text{cm}.$$

Compared to the present size, ~$10^{28}$ cm, this is extremely small, highlighting
the fact that unless the initial composition of the universe was
homogeneous, the horizon cannot allow homogeneity to be established later.
Clearly a departure from the dynamics of the standard model is needed to eliminate
the horizon effect. Later articles will offer solutions through inflation
(Panchapakesan, Chapter 17) and quantum cosmology (Padmanabhan, Chapter 18).
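The arithmetic of the horizon estimate is easy to reproduce. The following sketch is illustrative only: it assumes the age relation $t = 2.4\, g^{-1/2}\, T_{\rm MeV}^{-2}$ seconds, a horizon range of ct, stretching in proportion to the inverse temperature, and the conversion 1 K = $8.617 \times 10^{-14}$ GeV. The function names are hypothetical.

```python
C_CM_PER_S = 2.99792458e10   # speed of light, cm/s (assumed standard value)
GEV_PER_KELVIN = 8.617e-14   # kT in GeV for T = 1 K


def age_seconds(temperature_gev, g):
    """Age from t = 2.4 g^(-1/2) T_MeV^(-2), with T supplied in GeV."""
    temperature_mev = temperature_gev * 1.0e3
    return 2.4 / g**0.5 / temperature_mev**2


def horizon_then_cm(temperature_gev=1.0e15, g=100.0):
    """Horizon range c t at the early epoch, in cm."""
    return C_CM_PER_S * age_seconds(temperature_gev, g)


def horizon_today_cm(temperature_gev=1.0e15, g=100.0, t0_kelvin=3.0):
    """Early horizon range stretched by the temperature ratio to the present."""
    stretch = temperature_gev / (t0_kelvin * GEV_PER_KELVIN)
    return horizon_then_cm(temperature_gev, g) * stretch


if __name__ == "__main__":
    print(f"t     = {age_seconds(1.0e15, 100.0):.1e} s")
    print(f"R_H   = {horizon_then_cm():.1e} cm")
    print(f"today = {horizon_today_cm():.1e} cm")
```

The stretched horizon comes out at a few tens of centimetres at most, some 27 orders of magnitude below the present size of ~$10^{28}$ cm.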

10.2. The Flatness Problem

Consider the field equation

$$\frac{\dot S^2}{S^2} + \frac{kc^2}{S^2} = \frac{8\pi G\rho}{3}. \tag{74}$$

In the early universe scenario, we neglected the 'curvature term' $kc^2/S^2$ in
comparison with the dynamical term $\dot S^2/S^2$. The ratio of the two terms

$$F = \frac{kc^2/S^2}{\dot S^2/S^2} = \frac{kc^2}{\dot S^2} \to 0 \quad \text{as } \dot S \to \infty \tag{75}$$

in standard big bang models, so the approximation was justified.


However, as first pointed out by Dicke and Peebles in 1979, the ratio F is
comparable to unity at present and this corresponds to an extremely small value
at t = 1 s. For t = 10- 37 s, F may be as small as 10- 5 . Thus, the present state of
the universe seems to have arisen from an extreme fine tuning of the curvature


term to the value zero. In other words, all observationally permitted Friedmann
models seem to be bunched round the flat k = 0 model to within a very narrow
range. Why should this be so? Again inflation and quantum cosmology will offer
alternative scenarios to explain this effect.

10.3. The Monopole Problem

GUTs seem to imply the existence of a magnetic monopole type solution.


Monopoles appear to be inevitable consequences of GUTs and once created they
are hard to get rid of. Their present density can be estimated as follows.
Since the particle horizon restricts a monopole solution to a region ~$R_H$, the
monopole number density at $t = 10^{-37}$ s was ~$R_H^{-3}$. By expansion, this should
fall to $(10\ \text{cm})^{-3}$, as we saw earlier. But the mass of the monopole is ~$10^{16}$ GeV,
i.e., $0.2 \times 10^{-7}$ g. This gives us the monopole mass density at the present epoch
as

$$\rho_M \approx 2 \times 10^{-11}\ \text{g cm}^{-3}.$$

This is far greater than the closure density of ~$10^{-29}$ g cm$^{-3}$. Clearly there is
something wrong with the monopole survival scenario! The inflationary universe
offers a way out of this difficulty.
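The monopole estimate amounts to one monopole of mass ~$10^{16}$ GeV per cube of side ~10 cm. A short sketch, assuming the standard conversion 1 GeV/$c^2$ = $1.782662 \times 10^{-24}$ g (not quoted in the text) and using a hypothetical function name, makes the comparison with the closure density explicit.

```python
GRAMS_PER_GEV = 1.782662e-24   # mass equivalent of 1 GeV, in grams (assumed value)
CLOSURE_DENSITY = 1.0e-29      # order of magnitude quoted in the text, g/cm^3


def monopole_mass_density(mass_gev=1.0e16, spacing_cm=10.0):
    """Mass density (g/cm^3) for one monopole per cube of side spacing_cm."""
    return mass_gev * GRAMS_PER_GEV / spacing_cm**3


if __name__ == "__main__":
    rho = monopole_mass_density()
    print(f"rho_M = {rho:.1e} g/cm^3")
    print(f"rho_M / closure density = {rho / CLOSURE_DENSITY:.1e}")
```

The resulting density exceeds the closure density by roughly eighteen orders of magnitude, which is the discrepancy the argument rests on.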

10.4. Galaxy Formation

Despite numerous attempts the detailed scenario of galaxy formation still eludes
the theoreticians. Considerable complexity and arbitrariness has now entered the
picture because (a) there is some astronomical evidence that there is dark matter
in the universe, probably in considerable excess over the visible matter and (b) the
GUTs and SUSY offer several candidates for dark matter, e.g. massive neutrinos,
gravitinos, photinos, etc., as well as other massive particles that may decay in
short timescales. In addition, the inflationary scenarios of various kinds bring
fresh inputs to the galaxy-formation problem.
The observed smoothness of the microwave background is still the biggest
stumbling block to such theories. In addition, massive neutrinos appear to pose
a problem, in that their background induces galaxy formation in
a much more clumpy state than is observed.

10.5. Massive Neutrinos

Recent experiments in the U.S.S.R. suggest that neutrinos may indeed have
a small rest mass. This possibility opens up a number of interesting astrophysical
consequences. As early as 1972, R. Cowsik and J. McClelland had conjectured
that the 'missing mass' in the universe (that is, the difference between $\rho_0$ and $\rho_N$,
where $\rho_N$ is the mass density of nucleons) may be accounted for by relic neutrinos.
What can we say today about such a possibility?
Let us do the calculations taking g = 1 even for massive neutrinos. If the rest
mass of the neutrino is larger than ~$2 \times 10^{-4}$ eV, they will have small random
velocities today. Since experiments suggest $m_\nu$ of the order of a few electron volts,
we will write

$$m_\nu = M_\nu\ (\text{eV}).$$

From Table I we know that the number density of neutrinos is 3/8 of the
number density of photons of the same temperature. We also know that the
number density of photons goes as the cube of the photon temperature. Since in
the post-($e^+ e^-$) annihilation phase

$$\left(\frac{T_\nu}{T_\gamma}\right)^3 = \frac{4}{11}, \tag{76}$$

we get the present number density of neutrinos as

$$N_\nu = \frac{3}{22}\, N_\gamma. \tag{77}$$

Putting everything together, the mass density of neutrinos at present may be
expressed as

$$\rho_\nu = \sum \Omega_\nu\, \rho_c, \tag{78}$$

where $\sum$ denotes the sum over all types of neutrinos and

$$\Omega_\nu = \frac{M_\nu}{150}\left(\frac{T_0}{3}\right)^3 h_0^{-2}. \tag{79}$$

Comparing (54) with (79), we see that $\sum\Omega_\nu$ exceeds the upper limit on $\Omega_N$ even for
such a modest value of $M_\nu$ as 1.5. If we include all neutrino species, we get the
above result as

$$\sum_{\text{all species}} m_\nu \ge 1.5\ \text{eV}.$$

Very massive neutrinos will prove embarrassing for big-bang cosmology. If all
neutrinos (which means $\nu_e, \bar\nu_e, \nu_\mu, \bar\nu_\mu, \nu_\tau, \bar\nu_\tau$) have on average a mass of ~25 eV,
then $\sum\Omega_\nu$ is close to 1. Larger masses than this value and/or an increase in the
number of relic neutrino species would increase $\sum\Omega_\nu$ and the overall $\Omega$ beyond the
closure value $\Omega = 1$. As seen in general, closed universes have shorter ages, and an
overall age < ~$6 \times 10^9$ years may be embarrassingly small in reality. It has
been suggested that under such circumstances, $\lambda$-cosmologies might have to be
invoked.
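The neutrino bookkeeping can be summarised in a short sketch. It assumes the relation $\Omega_\nu = (M_\nu/150)(T_0/3)^3 h_0^{-2}$ per neutrino type with the mass in eV, and a neutrino-to-photon number ratio of $(3/8)(4/11)$; the function names are hypothetical and the example masses are purely illustrative.

```python
def neutrino_to_photon_ratio():
    """Present number density of one neutrino type relative to photons (g = 1)."""
    return (3.0 / 8.0) * (4.0 / 11.0)


def omega_nu(mass_ev, h0=1.0, t0=3.0):
    """Density parameter of one neutrino type of mass mass_ev (in eV)."""
    return (mass_ev / 150.0) * (t0 / 3.0) ** 3 / h0**2


def total_omega_nu(masses_ev, h0=1.0, t0=3.0):
    """Sum of the density parameters over all listed neutrino types."""
    return sum(omega_nu(m, h0, t0) for m in masses_ev)


if __name__ == "__main__":
    six_types = [25.0] * 6   # three flavours plus their antiparticles
    print(f"N_nu / N_gamma = {neutrino_to_photon_ratio():.4f}")
    print(f"sum of Omega_nu for 25 eV each: {total_omega_nu(six_types):.2f}")
```

Six neutrino types of 25 eV each give a total density parameter of 1 for $h_0 = 1$ and $T_0 = 3$ K, which is the closure case discussed above; a smaller $h_0$ raises the total further.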
These calculations illustrate how astrophysics may provide valuable constraints
on properties of elementary particles. We end with another such
constraint coming from neutrinos.
In Sections 4 and 5 we found that the primordial helium production depends
on the terminal value of $N_n/N_p$, and that this value goes up if the universe was
expanding faster at that epoch. Thus, it could happen that if there were more
neutrino flavours (say > 3) then the value of the g-factor would go up and the
universe would expand faster, so that the primordial value of Y would exceed all
reasonable bounds from observations. Thus, cosmologists impose a restriction
on particle theories if their picture of the early universe is to be valid.

References
The early work on primordial nucleosynthesis
G. Gamow, Expanding universe and the origin of elements, Phys. Rev. 70, 572 (1946).
R. A. Alpher and R. C. Herman, Evolution of the universe, Nature 162, 774 (1948).
R. A. Alpher, H. A. Bethe, and G. Gamow, The origin of chemical elements, Phys. Rev. 73, 803 (1948).
[This paper, with the sequence of authors Alpher/Bethe/Gamow, led to the name 'α/β/γ theory'.]
Stellar nucleosynthesis
G. R. Burbidge, E. M. Burbidge, W. A. Fowler, and F. Hoyle, Synthesis of the elements in stars, Rev.
Mod. Phys. 29, 547 (1957).
Later work on primordial nucleosynthesis
C. Hayashi, Proton-neutron concentration ratio in the expanding universe at the stages preceding the
formation of the elements, Progr. Theoret. Phys. (Japan) 5, 224 (1950).
G. R. Burbidge, Nuclear energy generation and dissipation in galaxies, PASP 70, 83 (1958).
F. Hoyle and R. J. Tayler, The mystery of cosmic helium abundance. Nature 203, 1108 (1964).
P. J. E. Peebles, Primordial helium abundance and the primordial fireball, Ap. J. 146, 542 (1966).
Ya B. Zeldovich, The 'hot' model of the universe, Usp. Fiz. Nauk 89, 647 (1966).
R. V. Wagoner, W. A. Fowler, and F. Hoyle, On the synthesis of elements at very high temperatures,
Ap. J. 148, 3 (1967).
D. Schramm and R. V. Wagoner, Element production in the early universe, Ann. Rev. Nucl. Sci. 27, 37
(1977).
Discovery of the microwave background
A. A. Penzias and R. W. Wilson, Measurement of excess antenna temperature at 4080 Mc/s, Ap. J. 142,
419 (1965) (see also the textbooks at the end).
Growth of fluctuations
E. Lifshitz, On the gravitational instability of the expanding universe, J. Phys. (USSR) 10, 116 (1946).
Analytical approaches to galaxy formation
D. Lynden-Bell, Statistical mechanics of violent relaxation in stellar systems, Monthly Notices Roy.
Astron. Soc. 136, 101 (1967).
P. J. E. Peebles, Origin of the angular momentum of galaxies, Ap. J. 155, 393 (1969).
J. E. Gunn and J. R. Gott, On the infall of matter into clusters of galaxies, Ap. J. 176, 1 (1972).
R. A. Sunyaev and Ya. B. Zeldovich, Formation of clusters of galaxies; protocluster fragmentation
and intergalactic gas heating, Astron. Astrophys. 20, 189 (1972).
S. D. M. White and M. J. Rees, Core condensation in heavy halos: a two stage theory of galaxy
formation and clustering, Monthly Notices Roy. Astron. Soc. 183, 341 (1978).
A. G. Doroshkevich, E. M. Saar, and S. F. Shandarin, Spatial structure of protoclusters and the
formation of galaxies, Monthly Notices Roy. Astron. Soc. 184, 643 (1978).


N-body approaches to galaxy formation

A. Toomre, Mergers and some consequences, in R. B. Larson and B. Tinsley (eds.), Evolution of
Galaxies and Stellar Populations, Yale University Observatory, New Haven, p. 401.
S. J. Aarseth, J. R. Gott, and E. L. Turner, N-body simulations of galaxy clustering, Ap. J. 228, 664
(1979).
G. Efstathiou and B. J. T. Jones, The rotation of galaxies: numerical investigations of the tidal torque
theory, Monthly Notices Roy. Astron. Soc. 186, 133 (1979).
Clustering of galaxies
H. Totsuji and T. Kihara, The correlation function for the distribution of galaxies, Publ. Astron. Soc.
(Japan) 21, 221 (1969).
P. J. E. Peebles, The gravitational instability picture and the nature of distribution of galaxies, Ap. J.
189, L51 (1974).
S. J. Aarseth, J. R. Gott, and E. L. Turner, N-body simulations of galaxy clustering, Ap. J. 228, 664
(1979).
Nonrelic microwave background
F. Hoyle, N. C. Wickramasinghe, and V. C. Reddish, Solid hydrogen and the microwave background,
Nature 218, 1124 (1968).
J. V. Narlikar, M. G. Edmunds, and N. C. Wickramasinghe, Limits on a microwave background
without the big bang, in M. Rowan-Robinson (ed.), Far Infrared Astronomy, Pergamon Press, New
York, p. 131.
M. J. Rees, Origin of pregalactic microwave background, Nature 275, 35 (1978).
M. Rowan-Robinson, J. Negroponte, and J. Silk, Distortions of the cosmic microwave background
spectrum by dust, Nature 281, 635 (1979).
N. C. Rana, Cosmic thermalization and the microwave background radiation, Monthly Notices Roy.
Astron. Soc. 197, 1125 (1981).
Baryon excess in the early universe
G. Steigman, Observational tests of antimatter cosmologies, Ann. Rev. Astron. Astrophys. 14, 339
(1976).
M. Yoshimura, Unified gauge theories and the baryon number of the universe, Phys. Rev. Lett. 41, 281
(1978).
S. Weinberg, Baryon-lepton non-conserving processes, Phys. Rev. Lett. 43, 1566.
F. Wilczek and A. Zee, Operator analysis of nucleon decay, Phys. Rev. Lett. 43, 1571 (1979).
Helium abundance and neutrino types
J. Yang, D. Schramm, G. Steigman, and R. T. Rood, Constraints on cosmology and neutrino physics
from big bang nucleosynthesis, Ap. J. 227, 697 (1979).
Experimental data on massive neutrinos
V. A. Lyubimov, E. G. Novikov, V. Z. Nozik, E. F. Tretyakov, and V. S. Kozik, Brighton conference
on High Energy Physics (1983).
Massive neutrinos and cosmology
R. Cowsik and J. McClelland, An upper limit on the neutrino rest mass, Phys. Rev. Lett. 29, 669 (1972).
R. Cowsik and J. McClelland, Gravity of neutrinos of nonzero mass in astrophysics, Ap. J. 180,
7 (1973).
S. Tremaine and J. E. Gunn, Dynamical role of light neutral leptons in cosmology, Phys. Rev. Lett. 42,
407 (1979).
D. Schramm and G. Steigman, Relic neutrinos and the density of the universe, Ap. J. 243, 1 (1981).
Problems of the very early universe
A. D. Linde, The inflationary universe, Rep. Progr. Phys. 47, 925 (1984).

Textbooks on nucleosynthesis and big bang cosmology

D. Clayton, Principles of Stellar Evolution and Nucleosynthesis, McGraw-Hill, New York (1968).
P. J. E. Peebles, Physical Cosmology, Princeton University Press, Princeton (1971).
S. Weinberg, Gravitation and Cosmology, Wiley, New York (1972).
P. J. E. Peebles, The Large-Scale Structure of the Universe, Princeton University Press, Princeton
(1980).
J. V. Narlikar, Introduction to Cosmology, Jones and Bartlett, Boston (1983).

6. An Approach to Anisotropic Cosmologies
A. K. RAYCHAUDHURI
Department of Physics, Presidency College, Calcutta 700073, India

1. Motivation

It is perhaps not wrong to consider Einstein's 1917 paper on cosmology as the


beginning of modern cosmology. It was a peculiar beginning for a branch of
physics. Normally, one has some observational data and the theoretician goes on
to build up a structure with a minimum of ad-hoc assumptions which help to
systematize and 'explain' the data. In 1917, Einstein had hardly any observational
data to explain and the little he had, he chose to ignore. Even at that time, it was
clear that processes are occurring on a large scale leading to a flow of radiation
from celestial bodies and, thanks to the ideas of the special theory of relativity,
this could be understood as a continuous change in the distribution of sources of
gravitation. Besides, it was already known that matter and radiation, even of the
same energy density, differ in their energy-stress tensor. All this was enough to
indicate that the universe, even if in equilibrium at the same stage, would not
continue in that state. Yet Einstein adopted the assumption of a static universe
which was also uniform in space and isotropic.
Later, the assumption of static nature had to be given up to accommodate the
Hubble shift of spectral lines from distant galaxies, but the assumptions of spatial
homogeneity and isotropy still form the cornerstone of standard cosmology.
Indeed, while for a long time there was little empirical evidence in favour of these
assumptions, the microwave radiation background is commonly thought to have
brought powerful support for the isotropic models.
But if standard cosmology were completely successful, there would hardly
be any need to explore other models of the universe, except perhaps for
mathematical recreation. Here and there, however, doubts and difficulties remain
and anisotropic models have been investigated at different stages in the hope that
they may smooth out these difficulties. We just make mention of the problems
that plague standard cosmology and give the motivation for the study of
anisotropic models:
(a) The big bang singularity

Standard cosmology gives, for the metric of the universe*,

* In the following equation and in some of the subsequent ones, the upper and the lower signs
correspond, respectively, to the metric signature (+ - - -) and (- + + +).

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 89-106.
© 1989 by Kluwer Academic Publishers.


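$$ds^2 = \pm\left[dt^2 - R^2(t)\,\frac{dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2}{(1 + kr^2/4)^2}\right], \qquad k = 0, +1, -1, \tag{1}$$

with R obeying the differential equation

$$\frac{\ddot R}{R} = -\frac{4\pi}{3}(\rho + 3p), \tag{2}$$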

where the dots signify differentiation with respect to the time variable t and $\rho$,
p are the energy density and pressure of the matter in the universe (we have chosen
units such that G = c = 1). With $\rho$ and p essentially positive and $\dot R$ also positive
(expanding universe), Equation (2) shows that one would have a state R = 0 at
a finite past. At this stage, physical variables like $\rho$, p and scalars like
$R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta}$ all blow up. This is the beginning of the universe and the beginning of
time, ideas rather repugnant to one's feeling of an everflowing time.
Cosmologists like Tolman and Eddington at one time wondered whether this
singularity would persist in anisotropic models as well, but the hope of having
singularity-free models has not been fulfilled. We shall return to this point later.
(b) The paradox of standard models

With the metric (1), for any signal we have

$$dt^2 - \frac{R^2\,dr^2}{(1 + kr^2/4)^2} \ge 0,$$

so that

$$\int \frac{dr}{1 + kr^2/4} \le \int_0^t \frac{dt}{R}. \tag{3}$$

With the ultrarelativistic equation of state, $p = \rho/3$, in the early universe, $R \sim t^{1/2}$,
so that the integral on the right of the inequality converges and, hence, at any time
t, communication can be established only up to a finite distance. This situation is
referred to as the existence of a horizon and comes in conflict with the observation
of the isotropy of the microwave background.
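Whether a horizon exists depends on the convergence of the integral of dt/R from t = 0. The sketch below illustrates this for an assumed power-law scale factor normalised to $R = t^n$ (an assumption made only for the example): the integral is finite for n < 1, as in the radiation case n = 1/2, and diverges for n ≥ 1. The function names are hypothetical.

```python
import math


def comoving_horizon(t, n):
    """Integral of dt'/R(t') from 0 to t for R = t**n; infinite when n >= 1."""
    if n >= 1.0:
        return math.inf
    return t ** (1.0 - n) / (1.0 - n)


def comoving_horizon_numeric(t, n, t_min, steps=20000):
    """Midpoint-rule estimate of the same integral from t_min to t.

    Uses the substitution u = ln t', so that dt'/t'**n = exp((1 - n) u) du.
    """
    u_low, u_high = math.log(t_min), math.log(t)
    du = (u_high - u_low) / steps
    total = 0.0
    for i in range(steps):
        u_mid = u_low + (i + 0.5) * du
        total += math.exp((1.0 - n) * u_mid) * du
    return total


if __name__ == "__main__":
    for t_min in (1e-4, 1e-8, 1e-12):
        print(t_min, comoving_horizon_numeric(1.0, 0.5, t_min))
    print("analytic limit:", comoving_horizon(1.0, 0.5))
```

As the lower limit is pushed towards zero the numerical value for n = 1/2 settles at 2, the analytic result, whereas for n = 1 it grows without bound as the logarithm of 1/t_min.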
At one time, the question was raised as to whether horizons could be got rid of
by going over to anisotropic models. While this was indeed the case in some
anisotropic models, physical processes bringing about subsequent isotropy were
not found viable.

(c) Primordial magnetic fields


It has sometimes been speculated that there might have been intense magnetic
fields in the early universe. Such fields cannot be accommodated in isotropic
models.


(d) Abundance of helium

Anisotropy in early stages of the universe might have significant influence on the
relative abundance of helium and, thus, in principle observations on the
abundance of helium might provide clues regarding the anisotropy of the early
universe.

(e) The Kaluza-Klein compactification


Recently, there has been a revival of interest in the Kaluza-Klein idea. The idea is
that the universe is an n-dimensional manifold with n > 4 but the extra
dimensions have been 'compactified' and, thus, rendered unobservable. This
compactification of some dimensions, along with expansion of others, requires the
study of anisotropic models.
While we thus see some justification for the investigation of anisotropic
universes, we are faced with a disturbing problem: how to proceed. It is out of the
question to attempt to find some sort of general solution of the equations of
general relativity; we have to simplify and specialize matters. Two ways are
broadly possible.
(i) While giving up complete spatial homogeneity and isotropy, we retain some
symmetry assumptions, e.g. spatial homogeneity only or spherical or cylindrical
symmetry. These require the study of Killing vectors and Lie groups.
(ii) We may analyze the characteristics of the velocity field and place some
restrictions on them. This, in a way, is a more direct physical approach, because in
the other approach, the assumptions of symmetry are basically geometrical and
have physical implications only via the field equations of general relativity.
We proceed to give outlines of the two methods.
2. Killing Vectors and Bianchi Types

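Consider an infinitesimal transformation from $x^{\mu}$ to $x'^{\mu}$

$$x'^{\mu} = x^{\mu} + \epsilon\,\xi^{\mu}(x), \tag{1}$$

where $\epsilon$ is an infinitesimal constant. Under the successive action of this
transformation, any point will in general trace out a curve which is everywhere
tangential to the vector $\xi^{\mu}$. The transformation changes the metric tensor
components

$$g'_{\mu\nu}(x') = \frac{\partial x^{\alpha}}{\partial x'^{\mu}}\frac{\partial x^{\beta}}{\partial x'^{\nu}}\,g_{\alpha\beta}(x) = g_{\mu\nu}(x) - \epsilon\left(g_{\alpha\nu}\,\xi^{\alpha}{}_{,\mu} + g_{\mu\alpha}\,\xi^{\alpha}{}_{,\nu}\right), \tag{2}$$

where we have used (1), neglected the higher powers of $\epsilon$, and indicated partial
differentiations by commas. Also, because the $g_{\mu\nu}$'s are assumed to be differentiable
functions of $x^{\mu}$, we have

$$g_{\mu\nu}(x') = g_{\mu\nu}(x) + \epsilon\,g_{\mu\nu,\alpha}\,\xi^{\alpha}. \tag{3}$$

If $g'_{\mu\nu}(x') = g_{\mu\nu}(x')$, we see that the intrinsic geometry is not altered by the
transformation and the condition for this is obtained from (2) and (3):

$$g_{\mu\nu,\alpha}\,\xi^{\alpha} = -g_{\alpha\nu}\,\xi^{\alpha}{}_{,\mu} - g_{\mu\alpha}\,\xi^{\alpha}{}_{,\nu},$$

which may be written in the covariant form:

$$\xi_{\mu;\nu} + \xi_{\nu;\mu} = 0. \tag{4}$$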
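As a small numerical illustration of the Killing condition (4), the following sketch works in the flat Euclidean plane with Cartesian coordinates, where covariant derivatives reduce to partial derivatives and the condition becomes $\partial_i\xi_j + \partial_j\xi_i = 0$. The function names, the finite-difference step and the two test fields (a rotation and a dilation) are illustrative choices and are not taken from the text.

```python
def killing_residual(xi, point, h=1e-5):
    """Largest |d_i xi_j + d_j xi_i| at `point` for a vector field on flat space.

    In Cartesian coordinates on flat space the Christoffel symbols vanish,
    so the covariant Killing equation reduces to this partial-derivative form.
    """
    n = len(point)

    def partial(j, i):
        # Central difference for d(xi_j)/d(x_i).
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        return (xi(plus)[j] - xi(minus)[j]) / (2 * h)

    return max(abs(partial(j, i) + partial(i, j))
               for i in range(n) for j in range(n))


def rotation(p):
    x, y = p
    return (-y, x)      # generator of rotations about the origin


def dilation(p):
    x, y = p
    return (x, y)       # uniform stretching, which changes distances


rot_res = killing_residual(rotation, (0.3, -1.2))
dil_res = killing_residual(dilation, (0.3, -1.2))
print(rot_res, dil_res)
```

The rotation generator leaves the residual at round-off level, while the dilation does not, in line with the statement that only transformations preserving the intrinsic geometry correspond to Killing vectors.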

Any symmetry property corresponds to the existence of one or more Killing
vectors. Thus, the static (or stationary) character means the existence of
a time-like Killing vector and the cylindrical symmetry corresponds to the
existence of two Killing vectors, one of which has closed orbits.
Spatial homogeneity requires the existence of a group of transformations by
which we may go from one point to another without altering the intrinsic
geometry. Mathematically, this requires the existence of (at least) three linearly
independent Killing vectors, the transformations forming a group. The linear
independence of the three vectors $\xi_1^{i}$, $\xi_2^{i}$, $\xi_3^{i}$ requires the $3 \times 4$ matrix of their
components, $\|\xi_a^{i}\|$ ($a = 1, 2, 3$), to be of rank three, while the group property gives the relation

$$[X_a, X_b] \equiv X_a X_b - X_b X_a = C^{c}_{ab} X_c, \tag{5}$$

where the $C^{c}_{ab}$'s are constants called the structure constants of the group and the
operator $X_a \equiv \xi_a^{i}\,(\partial/\partial x^{i})$.
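Obviously,

$$C^{c}_{ab} = -C^{c}_{ba}, \tag{6a}$$

and the Jacobi identity

$$[[X_a, X_b], X_c] + [[X_b, X_c], X_a] + [[X_c, X_a], X_b] = 0$$

gives

$$C^{e}_{ab}C^{d}_{ec} + C^{e}_{bc}C^{d}_{ea} + C^{e}_{ca}C^{d}_{eb} = 0. \tag{6b}$$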
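The antisymmetry of the structure constants and the quadratic relation that follows from the Jacobi identity can be checked mechanically for a given set of constants. The sketch below does this for two cases: all constants zero (Bianchi type I) and the rotation-group values $C^{1}_{23} = C^{2}_{31} = C^{3}_{12} = 1$ (Bianchi type IX). The storage layout `C[c][a][b]` with zero-based indices and the helper names are assumptions made for this illustration.

```python
from itertools import product


def structure_constants(nonzero):
    """Build C[c][a][b] (indices 0..2) from {(c, a, b): value}, antisymmetric in a, b."""
    C = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for (c, a, b), value in nonzero.items():
        C[c][a][b] = value
        C[c][b][a] = -value
    return C


def is_antisymmetric(C):
    return all(C[c][a][b] == -C[c][b][a]
               for c, a, b in product(range(3), repeat=3))


def jacobi_residual(C):
    """Largest |C^e_ab C^d_ec + C^e_bc C^d_ea + C^e_ca C^d_eb| over a, b, c, d."""
    worst = 0.0
    for a, b, c, d in product(range(3), repeat=4):
        total = sum(C[e][a][b] * C[d][e][c]
                    + C[e][b][c] * C[d][e][a]
                    + C[e][c][a] * C[d][e][b]
                    for e in range(3))
        worst = max(worst, abs(total))
    return worst


bianchi_I = structure_constants({})
bianchi_IX = structure_constants({(0, 1, 2): 1.0, (1, 2, 0): 1.0, (2, 0, 1): 1.0})
print(jacobi_residual(bianchi_I), jacobi_residual(bianchi_IX))
```

Any candidate set of constants for a three-parameter group can be passed through the same two checks before it is used further.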

One may change over from one set of $\xi_a$'s to a linear combination (of course, the
new set of $\xi_a$'s must be linearly independent). This could change the structure
constants but not the basic properties of the group. Bearing this in mind, it has
been shown that there are basically nine different types of three-parameter groups
$G_3$ and these types are called Bianchi types.
The simplest case is the Bianchi type I, where all the $C^{a}_{bc}$'s vanish; in this case
all the three transformations can simultaneously be reduced to translations.
However, the space in this case is generally anisotropic.
The standard isotropic model admits Bianchi types I, V, or IX according to
whether the space sections are of zero, negative, or positive curvatures. However,
isotropy means that the space admits the rotation group as well, so that the
complete group is of six parameters, which is the maximal group for a three-space.
The idea of spatial homogeneity greatly facilitates discussion. Firstly it allows
the metric to be written in some canonical form and, secondly, one can reduce the
partial differential equations of general relativity involving four coordinates as
independent variables to ordinary differential equations involving only one
independent variable. This happens because the orbits (or invariant varieties) of
any Bianchi group form a set of geodesically parallel three-spaces, and if one
chooses one of the coordinates along the normal to these hypersurfaces, then the
metric tensor components can be written as functions of this coordinate and the
Killing vector components. One can then make a systematic study of different
homogeneous universes (as also others which are not homogeneous but
nevertheless admit a number of Killing vectors).

3. Kinematics - Analysis of the Velocity Field

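We begin by recapitulating some results of the theory of elasticity in Newtonian
physics. The state of strain originates due to a relative displacement of particles
constituting the body. Let $\mathbf{u}(x, y, z)$ be the displacement of the point labelled by its
original coordinates $(x, y, z)$. The increase of volume of a region bounded by
a surface $S$ is given by

$$\delta V = \oint_S \mathbf{u}\cdot d\mathbf{S} = \int \nabla\cdot\mathbf{u}\;dv,$$

where the volume integral is over the volume bounded by $S$. Thus the volume
expansion coefficient $\theta \equiv \delta(dv)/dv$ is given by $\theta = \nabla\cdot\mathbf{u}$.
As is well known, $\nabla\times\mathbf{u}$ represents a rotation and, hence, we may split up the
first derivative of $\mathbf{u}$ in the following manner

$$\frac{\partial u_i}{\partial x_k} = \left[\frac12\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}\right) - \frac13\delta_{ik}\nabla\cdot\mathbf{u}\right] + \frac12\left(\frac{\partial u_i}{\partial x_k} - \frac{\partial u_k}{\partial x_i}\right) + \frac13\delta_{ik}\nabla\cdot\mathbf{u}.$$

The three parts on the right-hand side may be described as follows:

(a) The symmetric trace-free tensor; we call this shear. It represents a change
of shape (i.e. of angles) but not of volume. Writing this $\sigma^{(n)}_{ik}$ (the superscript
n indicating Newtonian physics),

$$\sigma^{(n)}_{ik} = \frac12\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}\right) - \frac13\delta_{ik}\nabla\cdot\mathbf{u}.$$

(b) The antisymmetric tensor. This represents a simple rotation without any
associated change of shape or size. We write

$$\omega^{(n)}_{ik} = \frac12\left(\frac{\partial u_i}{\partial x_k} - \frac{\partial u_k}{\partial x_i}\right).$$

(c) The multiple of the unit tensor which represents an isotropic change of
length leading to a volume expansion. Again, we write

$$\theta^{(n)} = \nabla\cdot\mathbf{u}.$$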
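The Newtonian split into shear, rotation and expansion is purely algebraic, so it can be sketched in a few lines. The example below takes a constant displacement gradient `grad[i][k]` standing for $\partial u_i/\partial x_k$ and returns the three parts (a), (b) and (c). The numerical entries of `grad_u` and the function name are made up for illustration.

```python
def decompose(grad):
    """Split a 3x3 displacement gradient grad[i][k] = d u_i / d x_k into
    (shear, rotation, expansion), i.e. parts (a), (b) and (c)."""
    theta = sum(grad[i][i] for i in range(3))          # div u
    shear = [[0.5 * (grad[i][k] + grad[k][i]) - (theta / 3.0 if i == k else 0.0)
              for k in range(3)] for i in range(3)]
    rotation = [[0.5 * (grad[i][k] - grad[k][i]) for k in range(3)]
                for i in range(3)]
    return shear, rotation, theta


grad_u = [[0.02, 0.01, 0.00],
          [0.03, -0.01, 0.00],
          [0.00, 0.00, 0.05]]
shear, rotation, theta = decompose(grad_u)

# Reassemble: shear + rotation + (theta/3) * identity should give back grad_u.
rebuilt = [[shear[i][k] + rotation[i][k] + (theta / 3.0 if i == k else 0.0)
            for k in range(3)] for i in range(3)]
print(theta)
```

The shear comes out symmetric and trace-free, the rotation antisymmetric, and the three parts add back to the original gradient.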

We now try to guess the corresponding quantities for fluid motion in general
relativity. We note the following requirements which have to be satisfied:
(i) the displacement vector is to be replaced by the velocity vector,
(ii) the relevant expressions are to have tensor form,
(iii) the shear tensor and the rotation tensor should be purely 'space tensors'.
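To meet the requirement of a 'space tensor', consider the tensor $\delta^{\mu}_{\nu} - u^{\mu}u_{\nu}$, where
$u^{\mu}$ is the velocity vector (a unit time-like vector). If $A^{\mu}$ is any vector, then the
component of $A^{\mu}$ in the direction of the vector $u^{\mu}$ is $A^{\mu}u_{\mu}$ and the component
orthogonal to $u^{\mu}$ is

$$A^{\mu} - (A^{\nu}u_{\nu})u^{\mu} = A^{\nu}(\delta^{\mu}_{\nu} - u^{\mu}u_{\nu}).$$

Thus, the tensor $\delta^{\mu}_{\nu} - u^{\mu}u_{\nu}$ (hereafter written as $h^{\mu}_{\nu}$) projects $A^{\mu}$ to the local space
of the observer whose velocity is $u^{\mu}$. If we now contract the tensor $u_{\mu;\nu}$ by $h^{\nu}_{\alpha}$, we get

$$u_{\mu;\nu}h^{\nu}_{\alpha} = u_{\mu;\alpha} - \dot u_{\mu}u_{\alpha},$$

where $\dot u_{\mu} \equiv u_{\mu;\nu}u^{\nu}$ is called the acceleration vector. (In a local Lorentz frame in
which the fluid is at rest, $\dot u^{i} = \partial u^{i}/\partial t$.)
Note that the tensor $(u_{\mu;\nu} - \dot u_{\mu}u_{\nu})$ is a complete space tensor, as it vanishes on
contraction both with $u^{\mu}$ and $u^{\nu}$.
We can now replace $u_{\mu;\nu}$ by $u_{\mu;\nu} - \dot u_{\mu}u_{\nu}$ to obtain the following expressions for
shear, rotation, and expansion:

$$\sigma_{\mu\nu} = \tfrac12(u_{\mu;\nu} + u_{\nu;\mu}) - \tfrac12(\dot u_{\mu}u_{\nu} + \dot u_{\nu}u_{\mu}) - \tfrac13\theta\,h_{\mu\nu}, \tag{1}$$

$$\omega_{\mu\nu} = \tfrac12(u_{\mu;\nu} - u_{\nu;\mu}) - \tfrac12(\dot u_{\mu}u_{\nu} - \dot u_{\nu}u_{\mu}), \tag{2}$$

$$\theta = u^{\mu}{}_{;\mu}, \tag{3}$$

$$u_{\mu;\nu} = \sigma_{\mu\nu} + \omega_{\mu\nu} + \tfrac13\theta\,h_{\mu\nu} + \dot u_{\mu}u_{\nu}. \tag{4}$$

(Note that the Kronecker $\delta_{ik}$ in the Newtonian shear expression has been
changed to $h_{\mu\nu}$ to ensure the space character of the shear tensor.)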
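The projection property of $\delta^{\mu}_{\nu} - u^{\mu}u_{\nu}$ can be checked numerically in flat spacetime. The sketch below assumes the Minkowski metric with signature $(+,-,-,-)$ (so that $u^{\mu}u_{\mu} = 1$), an observer moving with speed 0.6 along $x$ in units with $c = 1$, and an arbitrary test vector `A`. All of these values and the helper names are illustrative assumptions.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of the Minkowski metric, signature (+,-,-,-)


def four_velocity(v):
    """Unit time-like vector for speed v (units with c = 1) along x."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [gamma, gamma * v, 0.0, 0.0]


def projector(u):
    """Mixed components h[mu][nu] = delta^mu_nu - u^mu u_nu."""
    u_low = [ETA[n] * u[n] for n in range(4)]
    return [[(1.0 if m == n else 0.0) - u[m] * u_low[n] for n in range(4)]
            for m in range(4)]


def project(h, vec):
    """Contract h^mu_nu with the contravariant vector vec^nu."""
    return [sum(h[m][n] * vec[n] for n in range(4)) for m in range(4)]


u = four_velocity(0.6)
h = projector(u)
A = [2.0, 1.0, -3.0, 0.5]
A_perp = project(h, A)
print(A_perp)
```

The projected vector is orthogonal to $u^{\mu}$, projecting twice changes nothing, and the trace of the projector is three, the dimension of the observer's local space.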
A reference to the locally Lorentz frame in which the matter is at rest at the
point (i.e. $u^{\mu} = \delta^{\mu}_{0}$, $g_{\mu\nu} = \eta_{\mu\nu}$, $g_{\mu\nu,\alpha} = 0$) would make the above expressions go over
to their Newtonian form and, thus, justify the identification. However, we give
a more formal analysis towards the same end.

Consider a pencil of nonintersecting time-like world lines; each world line is
labelled by three constant numbers $a^{\nu}$ ($\nu = 1, 2, 3$) which may be considered as the
Lagrangian coordinates (commonly called comoving coordinates in general
relativity). Of course, $a^{\nu}$ changes as we go from one world line to another. Let the
proper lengths along the world lines be denoted by $\tau$. Then any general
coordinates $x^{\mu}$ are functions of $\tau$ and $a^{\nu}$, $x^{\mu} = f^{\mu}(a^{\nu}, \tau)$, and the tangent vector
to the world lines is given by

$$u^{\mu} = \left(\frac{\partial x^{\mu}}{\partial\tau}\right)_{a^{\nu} = \text{const.}}, \qquad u^{\mu}u_{\mu} = 1.$$

Let $\delta$ denote the change in going from one world line $a^{\nu}$ to a neighbouring world
line $a^{\nu} + \delta a^{\nu}$ at the same value of $\tau$, so that

$$\delta x^{\mu} = \frac{\partial x^{\mu}}{\partial a^{\nu}}\,\delta a^{\nu}.$$

Using the relation

$$\frac{\partial^2 x^{\mu}}{\partial\tau\,\partial a^{\nu}} = \frac{\partial^2 x^{\mu}}{\partial a^{\nu}\,\partial\tau},$$

we get

$$(\delta x^{\mu})_{;\alpha}u^{\alpha} = \delta x^{\alpha}\,u^{\mu}{}_{;\alpha}. \tag{5}$$

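Corresponding to the coordinate separation vector $\delta x^{\mu}$ between points on two
neighbouring world lines, the separation vector in the local space of the world line
$a^{\nu}$ is obtained by contracting with the projection tensor:

$$\delta x_{\perp}^{\mu} = (\delta^{\mu}_{\nu} - u^{\mu}u_{\nu})\,\delta x^{\nu} \tag{6}$$

and the relative velocity (i.e. the rate of change of this vector) again projected on
the local space is

$$\delta u_{\perp}^{\mu} = (\delta^{\mu}_{\nu} - u^{\mu}u_{\nu})(\delta x_{\perp}^{\nu})_{;\alpha}u^{\alpha}, \tag{7}$$

where we have used Equations (5) and (6). Equation (7) may be written

$$\delta u_{\perp}^{\mu} = \delta x_{\perp}^{\nu}\,(\omega^{\mu}{}_{\nu} + \sigma^{\mu}{}_{\nu} + \tfrac13\theta\,\delta^{\mu}_{\nu}) \qquad (\because\ \delta x_{\perp}^{\nu}u_{\nu} = 0). \tag{8}$$

Also for the spatial separation

$$\delta l = |h_{ik}\,\delta x^{i}\delta x^{k}|^{1/2} = (-\delta x_{\perp}^{\mu}\,\delta x_{\perp\mu})^{1/2},$$

we get

$$(\delta l)^{\cdot} \equiv (\delta l)_{,\alpha}u^{\alpha} = -\frac{1}{\delta l}\,\delta x_{\perp}^{\mu}\,\delta x_{\perp}^{\nu}\,u_{\mu;\nu} = (-\sigma_{\alpha\beta}n^{\alpha}n^{\beta} + \tfrac13\theta)\,\delta l, \tag{9}$$

where $n^{\mu} = \delta x_{\perp}^{\mu}/\delta l$ is the unit space-like vector giving the direction of the
separation vector $\delta x_{\perp}^{\mu}$.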
One gets
(10)

for the change of $n^{\mu}$ as one moves along the world line $a^{\nu}$.
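Returning to the rotation tensor $\omega_{\mu\nu}$, we can build up the space-like vorticity
vector

$$\omega^{\mu} \equiv \tfrac12\eta^{\mu\nu\alpha\beta}\omega_{\nu\alpha}u_{\beta} = \tfrac14\eta^{\mu\nu\alpha\beta}(u_{\nu;\alpha} - u_{\alpha;\nu})u_{\beta} = \tfrac12\eta^{\mu\nu\alpha\beta}u_{\nu;\alpha}u_{\beta}.$$

The necessary and sufficient condition for $\omega^{\mu}$ (or $\omega_{\mu\nu}$) to vanish is that $u_{\mu} = \alpha\,\phi_{,\mu}$,
where $\alpha$ and $\phi$ are arbitrary scalars. Geometrically, this means that there exists
a family of hypersurfaces $\phi = \text{const.}$, orthogonal to the velocity vector. Thus, if
vorticity exists, no such orthogonal hypersurfaces exist or, in other words, the
local spaces do not mesh together to form a continuous space.
The shear tensor has the associated scalar $\sigma^2 = \tfrac12\sigma_{\mu\nu}\sigma^{\mu\nu}$. Obviously, this scalar
vanishes if all the components of $\sigma_{\mu\nu}$ vanish, otherwise it is positive. Similarly, the
magnitude of vorticity is given by $\omega$, where

$$\omega^2 = \tfrac12\omega_{\mu\nu}\omega^{\mu\nu}.$$

The kinematic quantities $\sigma_{\mu\nu}$, $\omega_{\mu\nu}$, $\theta$, and $\dot u^{\mu}$ allow one to write some elegant
relations using the field equations of general relativity. Recall that the
Riemann-Christoffel tensor can be defined by the commutator of the second covariant
derivative of any vector. Taking the velocity vector $u^{\mu}$,

$$u^{\mu}{}_{;\nu\beta} - u^{\mu}{}_{;\beta\nu} = -R^{\mu}{}_{\alpha\nu\beta}\,u^{\alpha}.$$

Contracting the above equation by putting $\mu = \beta$, and then with $u^{\nu}$, we get

$$(u^{\mu}{}_{;\nu\mu} - u^{\mu}{}_{;\mu\nu})\,u^{\nu} = -R_{\alpha\nu}u^{\alpha}u^{\nu}$$

or

$$\dot u^{\mu}{}_{;\mu} - u^{\mu}{}_{;\nu}u^{\nu}{}_{;\mu} - \dot\theta = -R_{\mu\nu}u^{\mu}u^{\nu}.$$

Using (4), the above gives

$$\dot u^{\mu}{}_{;\mu} - (\sigma^{\mu}{}_{\alpha} + \omega^{\mu}{}_{\alpha} + \dot u^{\mu}u_{\alpha} + \tfrac13\theta h^{\mu}_{\alpha})(\sigma^{\alpha}{}_{\mu} + \omega^{\alpha}{}_{\mu} + \dot u^{\alpha}u_{\mu} + \tfrac13\theta h^{\alpha}_{\mu}) - \dot\theta = -R_{\mu\nu}u^{\mu}u^{\nu}$$

or, finally,

$$\dot u^{\mu}{}_{;\mu} - 2\sigma^2 + 2\omega^2 - \tfrac13\theta^2 - \dot\theta = -R_{\mu\nu}u^{\mu}u^{\nu}. \tag{11}$$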

So far, this is just a geometric identity where the only assumption is that $u^{\mu}$ is
a unit time-like vector. We now plug in the field equations of general relativity

$$R_{\mu\nu} - \tfrac12 R\,g_{\mu\nu} = -8\pi T_{\mu\nu}$$

or

$$R_{\mu\nu} = -8\pi\left[T_{\mu\nu} - \tfrac12 T g_{\mu\nu}\right].$$

For the energy-momentum tensor $T_{\mu\nu}$, the simplest case is that due to a perfect
fluid (which includes the case of thermalized radiation)

$$T^{\mu}_{\nu} = (\rho + p)u^{\mu}u_{\nu} - p\,\delta^{\mu}_{\nu},$$

so that

$$T \equiv T^{\mu}_{\mu} = \rho - 3p \qquad\text{and}\qquad T_{\mu\nu}u^{\mu}u^{\nu} = \rho.$$

Thus, (11) now becomes

$$\dot\theta + \tfrac13\theta^2 - \dot u^{\mu}{}_{;\mu} + 2(\sigma^2 - \omega^2) + 4\pi(\rho + 3p) = 0. \tag{12}$$
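A quick consistency check of Equation (12), read as $\dot\theta + \tfrac13\theta^2 - \dot u^{\mu}{}_{;\mu} + 2(\sigma^2 - \omega^2) + 4\pi(\rho + 3p) = 0$, is to insert a known isotropic solution. The sketch below uses the flat dust (Einstein-de Sitter) model, for which $\theta = 2/t$ and $\rho = 1/(6\pi t^2)$ in units with $G = c = 1$. These two expressions are standard results assumed here rather than quoted from the text, and the function names are illustrative.

```python
import math


def theta(t):
    """Expansion of the flat dust (Einstein-de Sitter) model, R ~ t**(2/3)."""
    return 2.0 / t


def rho(t):
    """Its density in units with G = c = 1."""
    return 1.0 / (6.0 * math.pi * t * t)


def raychaudhuri_residual(t, sigma2=0.0, omega2=0.0, div_acc=0.0, p=0.0, h=1e-6):
    """Left-hand side of Equation (12), with d(theta)/dt from a central difference."""
    theta_dot = (theta(t + h) - theta(t - h)) / (2.0 * h)
    return (theta_dot + theta(t) ** 2 / 3.0 - div_acc
            + 2.0 * (sigma2 - omega2) + 4.0 * math.pi * (rho(t) + 3.0 * p))


residual = raychaudhuri_residual(1.5)
with_shear = raychaudhuri_residual(1.5, sigma2=0.1)
print(residual, with_shear)
```

The residual vanishes for the isotropic model. Adding a shear term without changing $\theta(t)$ leaves a positive remainder, so the same density would require a more negative $\dot\theta$, which is the sense in which shear assists collapse.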

The above is a scalar equation obtained by contracting $R_{\mu\nu}$ twice with $u^{\mu}$ and
$u^{\nu}$. A vector equation may be obtained by contracting $R_{\mu\nu}$ only once with $u^{\nu}$.
Again, replacing $u_{\mu;\nu}$ by the shear, vorticity, acceleration and expansion, we get,
after some calculation,

ie . hp =

cr P;.

+ crPu. + 2cr v u P +

+ IJllvafJ(WIl.'UV -

2w ll uvp;./(p

+ p)).

(13)

(Note that the above contains only three independent relations as, on contracting
with $u^{\beta}$, both sides vanish identically.) Two other simple relations may be
obtained from the divergence relation $T^{\mu\nu}{}_{;\mu} = 0$, namely

$$\dot\rho \equiv \rho_{,\mu}u^{\mu} = -(\rho + p)\theta, \tag{14}$$

$$\dot u_{\mu} = \frac{p_{,\nu}(\delta^{\nu}_{\mu} - u^{\nu}u_{\mu})}{\rho + p}. \tag{15}$$

Equation (15) also contains only three independent relations, as both sides vanish
on contracting with $u^{\mu}$.
The set of equations (12)-(15) contains eight independent equations and, thus,
falls short of the complete set of field equations by two. Nevertheless, their
comparative simplicity allows one to obtain a number of interesting results.
Let us begin with Equation (14). Writing

$$\theta = \frac{1}{v}\frac{dv}{ds},$$

where $v$ is a proper spatial volume comoving with the fluid, we get

$$\frac{d}{ds}(\rho v) = -p\,\frac{dv}{ds}.$$

Hence, for the expanding case, $dv/ds > 0$ and $d(\rho v)/ds \le 0$ (the sign of equality
holding for pressureless dust). Hence, in an expanding space, the total energy of
the fluid remains constant for dust and decreases for positive pressure. (Note that
conservation of energy in the naive sense does not hold in general relativity.)
For negative pressure, however, the energy goes on increasing. This was the
way in which McVittie sought to 'explain' the creation of energy postulated in an
ad hoc manner by the protagonists of the steady-state cosmology. More recently,
negative pressures have made their appearance as a component of the energy
stress tensor of false vacuum and, in this way, a provision for the so-called 'free
lunch' has been made in the inflationary universe scenario.
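The three regimes just described (dust, positive pressure, negative pressure) can be illustrated by integrating $d(\rho v) = -p\,dv$ for a comoving volume. The sketch below assumes a linear equation of state $p = w\rho$ with constant $w$, which is an illustrative choice, and normalizes $\rho v = 1$ at $v = 1$. The function name and step count are likewise arbitrary.

```python
def energy_after_expansion(w, v_final, steps=20000):
    """Integrate d(rho*v) = -p dv with p = w*rho, from v = 1 (where rho*v = 1) to v_final.

    Uses the midpoint (second-order Runge-Kutta) rule in v.
    """
    energy, v = 1.0, 1.0
    dv = (v_final - 1.0) / steps
    for _ in range(steps):
        k1 = -w * energy / v
        k2 = -w * (energy + 0.5 * dv * k1) / (v + 0.5 * dv)
        energy += dv * k2
        v += dv
    return energy


dust = energy_after_expansion(0.0, 8.0)            # stays constant
radiation = energy_after_expansion(1.0 / 3.0, 8.0)  # decreases as v**(-1/3)
false_vacuum = energy_after_expansion(-1.0, 8.0)    # grows in proportion to v
print(dust, radiation, false_vacuum)
```

For an eightfold increase in volume the energy content is unchanged for dust, halved for $w = 1/3$, and multiplied by eight for $w = -1$.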
From Equation (15), we see that the fluid world lines will be geodesic if either
$p$ is just a constant (this includes the case of dust) or $p_{,\alpha} = \lambda u_{\alpha}$. In the second case,
$u_{\alpha}$ will be hypersurface orthogonal and, thus, the motion will be irrotational.
Hence, we have the theorem.

For a perfect fluid if the vorticity does not vanish, the motion will not be geodesic
unless $p_{,\alpha} = 0$, while for irrotational motion, the motion will be geodesic if $p_{,\alpha} = \lambda u_{\alpha}$
or the level surfaces of $p$ are orthogonal to the velocity vector.
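We return to Equation (12). Introducing a length scale $R$ by $v \propto R^{3}$, we get

$$\frac{\dot R}{R} = \tfrac13\theta,$$

so that Equation (12) becomes

$$\frac{\ddot R}{R} = -\frac{4\pi}{3}(\rho + 3p) - \tfrac23(\sigma^2 - \omega^2) + \tfrac13\dot u^{\mu}{}_{;\mu}. \tag{16}$$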

For the isotropic model, (16) reduces to Equation (2) of Section 1, leading to the
conclusion of a collapse singularity. Equation (16) shows that the influence of any
shear is to hasten the march towards the collapse, while the tendency to collapse
is opposed by vorticity. The acceleration term is a dark horse - apparently it can
be of either sign.
The role of vorticity in opposing collapse raised the expectation at one time
that one can have singularity-free rotating universes. This expectation has not
been fulfilled. Identifying the singularity with incompleteness of time-like or null
geodesics, Hawking and Penrose have shown that unless we are ready to make
some serious departure from physics as we know it, the occurrence of singularities
is inevitable. The departures they spelt out were a possibility for the existence of
negative energy or repulsive gravitational interaction or a breakdown of
causality in the form of the existence of closed time-like lines. All these seem
unacceptable, so that the occurrence of singularities stands out as a basic problem
of cosmology or indeed of physics in general. However, the singularity may not be
associated with infinity of density, pressure or other physical variables. Cases of
spatially homogeneous universes showing such singularities (christened whimper
singularities as distinct from big bang singularities) have been worked out by Ellis
and King.

4. Perfect Fluid Solutions Classified According to Kinematic Properties

We consider two possibilities for each kinematic variable - it is either zero or not.
As there are four such variables, we have 16 possibilities altogether. We mention
the more important solutions for each possibility as known to us.
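The count of 16 is simply $2^4$: each of the four kinematic variables is either zero or nonzero. A minimal enumeration, with English labels standing in for $\omega^{\mu}$, $\sigma_{\mu\nu}$, $\dot u^{\mu}$ and $\theta$, is sketched below. The ordering produced by `itertools.product` is not the Roman-numeral ordering used in this section, and the variable names are illustrative.

```python
from itertools import product

VARIABLES = ("vorticity", "shear", "acceleration", "expansion")

# True means "nonzero", False means "vanishes".
classes = [dict(zip(VARIABLES, flags))
           for flags in product((False, True), repeat=len(VARIABLES))]

# Example lookups for two entries discussed in the text.
static_case = [c for c in classes if not any(c.values())]   # all four vanish
general_case = [c for c in classes if all(c.values())]      # none vanish
irrotational = [c for c in classes if not c["vorticity"]]
print(len(classes), len(irrotational))
```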
I. $\omega^{\mu} = \sigma_{\mu\nu} = \dot u^{\mu} = \theta = 0$.

The only known solution of this type is the Einstein static universe. However,
Equation (12) shows that in this case $\rho + 3p = 0$, so that $\rho$ and $p$ cannot both be
nonnegative. Einstein avoided this difficulty by introducing the cosmological
term in the field equations. The Einstein metric is

$$ds^2 = dt^2 - \frac{dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\varphi^2}{[1 + r^2/4R^2]^2}.$$

The group of isometry admitted is $G_7$, corresponding to the Bianchi type
IX + rotation group + translation along the $t$ axis.

II. $\omega^{\mu} = \sigma_{\mu\nu} = \dot u^{\mu} = 0$, $\theta \neq 0$.

To this class belong the Friedmann metrics. The isometry is $G_6$.
One may consider the de Sitter metric as also belonging to this class:

$$ds^2 = dt^2 - e^{\alpha t}[dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\varphi^2].$$

The isometry, however, is now $G_{10}$ as it is of a constant curvature. As in this case,
with the Einstein equations, $\rho + p = 0$, the velocity vector is undefined and so it
is not meaningful to talk about kinematic quantities.
III. $\omega^{\mu} = \sigma_{\mu\nu} = \theta = 0$, $\dot u^{\mu} \neq 0$.

In this case $u^{\mu}$ is hypersurface orthogonal. In the simplest case, there is a Killing
vector of the form $\lambda u^{\mu}$. This corresponds to static fluid distributions. There is a
belief that static fluid distributions are necessarily spherically symmetric but
there does not seem to be any formal proof. In the spherically symmetric case, the
field equations may be reduced to Oppenheimer-Volkoff equations. Nonstatic
spherically symmetric solutions of the above type are also known.
IV. $\omega^{\mu} = \dot u^{\mu} = \theta = 0$, $\sigma_{\mu\nu} \neq 0$.

A reference to Equation (12) in Section 3 shows that no solution with nonnegative
pressure and density exists. (Does any solution exist with a cosmological
constant?)
V. $\sigma_{\mu\nu} = \dot u^{\mu} = \theta = 0$, $\omega^{\mu} \neq 0$.

We have the Godel universe and the Van Stockum nonhomogeneous solution.*
We shall discuss the Godel universe in some detail later on.

* Both these solutions have closed time-like lines. Are closed time-like lines inevitable in rigid
rotation?

VI. $\omega^{\mu} = \sigma_{\mu\nu} = 0$, $\theta \neq 0$, $\dot u^{\mu} \neq 0$.

All solutions of this class with $p = p(\rho)$ are known, namely, a spherically
symmetric metric due to Wyman (Phys. Rev. 70, 396 (1946)) and a spatially
homogeneous solution in which the velocity vector, although irrotational, is not
orthogonal to the homogeneous varieties (i.e. a tilted universe solution) (Collins
and Wainwright, Phys. Rev. D27, 1209 (1983)).
Both these solutions, however, possess physically undesirable features.
VII. $\omega^{\mu} = \dot u^{\mu} = 0$, $\sigma_{\mu\nu} \neq 0$, $\theta \neq 0$.

All the anisotropic nonrotating homogeneous untilted solutions belong to this
class. We shall discuss some of them later on.
VIII. $\sigma_{\mu\nu} = \dot u^{\mu} = 0$, $\omega^{\mu} \neq 0$, $\theta \neq 0$.

No such solution exists if $dp/d\rho \neq$ or if $p =$ (Ellis, J. Math. Phys. 8, 1171
(1967)).

IX. $\omega^{\mu} = \theta = 0$, $\sigma_{\mu\nu} \neq 0$, $\dot u^{\mu} \neq 0$.

Some spherically symmetric solutions are known (Kramer et al., Exact Solutions,
p. 172; the density $\rho$ is assumed constant).
X. $\sigma_{\mu\nu} = \theta = 0$, $\omega^{\mu} \neq 0$, $\dot u^{\mu} \neq 0$.

Solutions with these properties have been given by Wahlquist (Phys. Rev. 172,
1291 (1968)), Vaidya (Pramana 8, 512 (1977)) and Prasad (PhD thesis, Gorakhpur
Univ., 1984). However, the equation of state in these solutions gives $dp/d\rho <$
except perhaps in one case studied by Prasad.

XI. $\dot u^{\mu} = \theta = 0$, $\omega^{\mu} \neq 0$, $\sigma_{\mu\nu} \neq 0$.

A cylindrically symmetric solution admitting a time-like Killing vector distinct
from the velocity vector was given by Maitra (J. Math. Phys. 7, 1025 (1966)) (also,
Vishveshwara and Winicour, J. Math. Phys. 18, 1280 (1977)). The Maitra metric
has no closed time-like lines and is singularity free.
XII. $\omega^{\mu} \neq 0$, $\sigma_{\mu\nu} \neq 0$, $\dot u^{\mu} \neq 0$, $\theta = 0$.

No such solution is known to us.


XIII. $\omega^{\mu} \neq 0$, $\sigma_{\mu\nu} \neq 0$, $\theta \neq 0$, $\dot u^{\mu} = 0$.

Lukash (JETP Letts. 19, 267 (1974)) gives a solution of this type admitting
a Bianchi type-VII group of motions. The rotating, shearing, and expanding dust
is associated with gravitational waves.
XIV. $\omega^{\mu} \neq 0$, $\dot u^{\mu} \neq 0$, $\theta \neq 0$, $\sigma_{\mu\nu} = 0$.

Such a solution seems unlikely to exist, but no proof is available.


XV. $\omega^{\mu} = 0$, $\sigma_{\mu\nu} \neq 0$, $\dot u^{\mu} \neq 0$, $\theta \neq 0$.

This is the general case of irrotational motion. Allnutt (Gen. Rel. Grav. 13, 1017
(1981)):
ds 2 = =+=(3e,+2x-2y
=+=

+ 2n2e31+6X)du2

et

2x 2 (e 2Y _1)- l dy2

=+=

2e3t+4xdx2

=+=

+ 2e2t+3X-Ydu(dt + 3dx + dy).

In the region $e^{2y} > 1$ and $4n^2 e^{2t+4x} > 3$, the signature is correct and one finds
27n 2

p = -_e- t
16n
p

21n
= --e-

16n

I -

9 e- 3t - 4x

64n

'

9
_e- 3t - 4x .
64n

In the region mentioned, $\rho$ and $\rho + p$ are $> 0$ but $\rho + 3p < 0$. Thus, the weak
and dominant energy conditions are satisfied and the strong energy condition is
violated. Only one space-like Killing vector $\xi^{\mu}(\partial/\partial x^{\mu}) = \partial/\partial u$ is admitted by this
metric. Other solutions are due to Oleson (J. Math. Phys. 12, 666 (1971)). For
some spherically symmetric solutions, see Vaidya, Phys. Rev. 174, 1615 (1968).

XVI. $\omega^{\mu} \neq 0$, $\sigma_{\mu\nu} \neq 0$, $\dot u^{\mu} \neq 0$, $\theta \neq 0$.

This is the type of solution given by Lukash, as referred to under XIII. Lukash
also gives solutions with $p = a\rho$, where $a$ is a constant and, for $a = 0$, the solutions
are of type XIII, while for $a \neq 0$, they are of type XVI.
5. Some Anisotropic Cosmological Solutions

Anisotropic cosmological solutions range all the way from the inhomogeneous
model discovered by Szekeres (Commun. Math. Phys. 41, 55 (1975)), which has no
Killing vectors (Bonnor et al., Gen. Rel. Grav. 8, 549 (1977)), to the completely
homogeneous model of Godel. We shall discuss only some very selected models
here.

5.1. The Godel Cosmos

Although the Godel universe has hardly any place as a realistic model, it
nevertheless had a great impact on our ideas. The metric given by Godel is

$$ds^2 = a^2\left[(dt + e^{x}dy)^2 - dx^2 - \tfrac12 e^{2x}dy^2 - dz^2\right]. \tag{1}$$

It admits a five-parameter group of motions: Bianchi type VIII + translation
along the $t$ axis + a rotation. The solution is singularity free and the pressure is
equal to the energy density. At the time when the solution was discovered, the
existence of a universal rotation, which contradicted Mach's principle, seemed
rather puzzling. One found, however, that the solution contained closed time-like
lines which involved a possibility of violating causal relations. This might be
taken either as a defect of the particular solution or as a limitation of general
relativity itself, as it admitted such solutions. The easiest way of seeing the
existence of closed time-like lines is to consider the Godel metric in a transformed
form given by Godel himself

$$ds^2 = 4a^2\left[dt^2 - dr^2 - dy^2 + (\sinh^4 r - \sinh^2 r)\,d\varphi^2 + 2\sqrt{2}\,\sinh^2 r\,d\varphi\,dt\right]. \tag{2}$$

One notes that $g_{\varphi\varphi}$ vanishes at $r = 0$, as also $\det|g_{\mu\nu}|$. This is not a singularity,
however, if $\varphi$ is considered as an angular coordinate. The commonly used term
'elementary flatness' requires that in order to be singularity free, an infinitesimal
circle must have its circumference given by $2\pi$ times its radius. In the present case,
$(\sinh^4 r - \sinh^2 r) \approx -r^2$ as $r \to 0$ and the above condition is satisfied.
Now, we note that for $r > \log(1 + \sqrt{2})$, $\varphi$ is time-like and, hence, in this region
any $\varphi$ line (i.e. a circle) is a closed time-like line. Thus, it is possible to return to
a spacetime point after some wandering, and so one cannot have any unique
ordering in the time sequence of events. We thus run into conflict with
cause-effect relationships. Later, from the researches of Hawking and Penrose, it
has become clear that the singularity-free property of the Godel metric depends
essentially on this unphysical feature.
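The change of character of the $\varphi$ circles can be checked directly. The sketch below assumes that the coefficient of $d\varphi^2$ in the transformed Godel metric is $4a^2(\sinh^4 r - \sinh^2 r)$, which is the combination whose small-$r$ limit $-r^2$ is quoted above, and uses the signature in which time-like intervals have $ds^2 > 0$. The function name and the sample radii are illustrative.

```python
import math


def g_phiphi(r, a=1.0):
    """Coefficient of d(phi)^2 in the transformed form of the Godel metric."""
    s2 = math.sinh(r) ** 2
    return 4.0 * a * a * (s2 * s2 - s2)


r_crit = math.log(1.0 + math.sqrt(2.0))   # sinh(r_crit) = 1

inside = g_phiphi(0.5 * r_crit)    # negative: phi circles are space-like here
outside = g_phiphi(1.5 * r_crit)   # positive: phi circles are time-like here
near_axis = g_phiphi(1e-3) / (-4.0 * (1e-3) ** 2)   # approaches 1 as r -> 0
print(r_crit, inside, outside, near_axis)
```

The sign of the coefficient flips exactly where $\sinh r = 1$, and near the axis it tends to $-4a^2r^2$, consistent with elementary flatness for a circle of proper radius $2ar$.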
It seems that the occurrence of closed time-like lines is related to the rigidity of
rotation. Thus, it occurs in the Van Stockum solution as well, but is absent in the
solution discovered by Maitra.

5.2. Bianchi Type I Cosmology

In this case, all the structure constants vanish so that the group is Abelian. The
metric can be written in the form

$$ds^2 = dt^2 - A^2dx^2 - B^2dy^2 - C^2dz^2, \tag{3}$$

where $A$, $B$, $C$ are functions of $t$. Rotation of the cosmic matter is not allowed but,
in general, the shear is nonvanishing. (If the shear vanishes, $A = B = C$ and the
metric goes over to the Friedmann flat space metric.) The explicit solution in this
case for pressure-free dust is fairly simple

A=R (-

t2/3)~

t2/3)fJ
B=R ( -

'

(4)

'

(5)


t2 /3)Y

C=R ( -

(6)

'

where a, f3, yare constants subject to the restriction a + f3 + y =

and
(7)

to being a constant related to the a's by

t5 = i(a 2 + f32 + y2).

(8)

Thus, in the form given, there are essentially two arbitrary constants. (Some
constants have been absorbed by coordinate transformations.) The density and
shear are given by

$$\rho = \text{Const.}\;t^{-1}(t + t_0)^{-1},$$

$$\sigma^2 = \text{Const.}\;R^{-6} = \text{Const.}\;t^{-2}(t + t_0)^{-2},$$

$$\theta = \frac{1}{t} + \frac{1}{t + t_0}.$$

The solution has a collapse singularity at $t = 0$ ($R \to 0$, $\rho \to \infty$, $\sigma^2, \theta \to \infty$).
Different types of behaviour in the three principal directions are, however,
possible.
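The time dependences quoted for $\rho$, $\sigma^2$ and $\theta$ fit together if $R^3$ is proportional to $t(t + t_0)$. That proportionality is an assumption made for this sketch (it is the form consistent with $\theta = 1/t + 1/(t + t_0)$ and $\theta = 3\dot R/R$). The code checks that consistency numerically and shows how the ratio $\rho/\sigma^2$ behaves as $t \to 0$, with both constants of proportionality set to 1.

```python
def scale_cubed(t, t0):
    """R**3 up to a constant factor (an assumption matching the quoted theta)."""
    return t * (t + t0)


def theta_quoted(t, t0):
    return 1.0 / t + 1.0 / (t + t0)


def theta_numeric(t, t0, h=1e-6):
    """theta = 3 Rdot / R = d ln(R**3) / dt, by a central difference."""
    return (scale_cubed(t + h, t0) - scale_cubed(t - h, t0)) / (2.0 * h * scale_cubed(t, t0))


def density_to_shear(t, t0):
    """rho / sigma**2 with both constants of proportionality set to 1."""
    rho = 1.0 / (t * (t + t0))
    sigma2 = 1.0 / (t * (t + t0)) ** 2
    return rho / sigma2


t0 = 2.0
ratios = [density_to_shear(t, t0) for t in (1e-1, 1e-2, 1e-3)]
print(theta_numeric(0.7, t0), theta_quoted(0.7, t0), ratios)
```

The ratio falls in proportion to $t(t + t_0)$, so the matter density becomes negligible compared with the shear close to the singularity.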
Solutions of this type which have an additional symmetry in the form $A = B$
(i.e. admitting a rotation in the $xy$ plane) were studied by Thorne (Astrophys. J.
148, 51 (1967)) in connection with primordial magnetic fields and nucleosynthesis.
He found two distinct types of behaviour as $t \to 0$: (i) a collapse
of the $x$-$y$ space along with an expansion in the $z$ direction, $A = B \sim t^{2/3}$,
$C \sim t^{-1/3}$, giving a cigar-shaped form for the collapsed universe; (ii) $A = B \approx (1 + \alpha t)$,
$C \sim t$, as $t \to 0$, so that any finite region in the $x$-$y$ two-space remains finite,
while the perpendicular direction collapses (pancake collapse).
The most striking point is that near the singularity, $\rho/\sigma^2 \to 0$ so that, in the
early stages, the dynamics is governed by the shear rather than the matter density.
Indeed, the behaviour near the singularity is independent of whether we consider
an empty, dust, or radiation universe. (In the last case, $\rho_{\gamma} \sim R^{-4}$; only for the
Zel'dovich limiting pressure density relation $p = \rho$ would $\rho \sim R^{-6}$ and thus be of
the same order as the shear.) Consequently, whereas in the early stages of the
isotropic universe, the temperature falls off as $t^{-1/2}$, in the present case the
temperature will fall off as $t^{-1/3}$, so long as the shear remains dominating. This
slower fall of temperature and density would affect nucleogenesis in the early
universe and, thus, while on the one hand, discrepancies between observed and
calculated He abundance (calculated on the basis of isotropic models) can be
sought to be explained in this manner, alternatively the same considerations may
be used to set bounds to the present shear.
The other point to be noted is the asymptotic approach to isotropy for $t \gg t_0$.

It should be noted, however, that the behaviour of shear may not be typical for
other Bianchi types, not to mention general nonhomogeneous spaces.
By a suitable choice of $\alpha$, $\beta$, $\gamma$, one may have that one of the three, $A$, $B$, $C$,
behaves as $t^{1}$. Then as the integral $\int_0 dt/t$ diverges, the horizon in that direction
will be abolished. In the present case, however, there is no general abolition of
horizons; this may be obtained for a Bianchi type-IX solution, which also shows
a novel temporal behaviour near the singularity.
The dominating behaviour of the shear may have an impact on the appearance
of the so-called inflationary state of the universe. There it is supposed that at one
stage in the early universe, the vacuum energy stress ($T^{i}_{k} = \rho\,\delta^{i}_{k}$) dominates over
the matter-energy density and this leads to the universe passing through a de
Sitter stage with the metric

$$ds^2 = dt^2 - e^{\alpha t}(dx^2 + dy^2 + dz^2).$$

However, if the universe is anisotropic and of the Bianchi type I, then Barrow and
Turner (Nature 292, 35 (1981)) have shown that the dominating shear may
prevent the onset of the inflationary condition.*
Another simple anisotropic model belonging to the Bianchi type V is due to
Schucking and Heckmann (1958)
ds 2 dt 2 R2 [dx 2 + S2 e2x dy2 + S- 2 eX dz 2 ],

IP = 1 + 2MR- 1 + ta 2 R- 4 ,

where

M = !npR3 = const.,
a 2 = 30'2 R6 = const.,
S = exp [aIR - 3 dt].

In the early stages, the shear also dominates and

5.3. Bianchi Type IX Model

The group structure is given by $C^{1}_{23} = C^{2}_{31} = C^{3}_{12} = 1$ and all those having two
or more equal indices vanish. The line element can be written in the form

$$ds^2 = dt^2 - l^2(d\psi + \cos\theta\,d\varphi)^2 - m^2(\sin\psi\,d\theta - \cos\psi\sin\theta\,d\varphi)^2 - n^2(\cos\psi\,d\theta + \sin\psi\sin\theta\,d\varphi)^2. \tag{9}$$

Here $l$, $m$, $n$ are functions of $t$, and $\psi$, $\theta$, $\varphi$, the space coordinates, are restricted by
the following conditions

$$0 \le \psi \le 4\pi, \qquad 0 \le \theta \le \pi, \qquad 0 \le \varphi \le 2\pi.$$

* Note added in proof: Later studies indicate this may not be true, see, e.g., I. G. Moss and V. Sahni,
Phys. Lett. 178B, 159 (1986).


We write out the field equations, neglecting the $T^{\mu}_{\nu}$ components:
(10)
(11)
(12)

$$\frac{\ddot l}{l} + \frac{\ddot m}{m} + \frac{\ddot n}{n} = 0, \tag{13}$$

where

$$a = l^2 + m^2 - n^2, \qquad b = l^2 - m^2 + n^2, \qquad c = -l^2 + m^2 + n^2.$$
It is not possible to integrate the above set of equations in a straightforward
manner. In the neighbourhood of the singularity, however, one can obtain some
interesting results about the behaviour of the three functions $l$, $m$ and $n$. One can
make the ansatz: $l \sim t^{p}$, $m \sim t^{q}$, $n \sim t^{r}$, where although $p$, $q$, $r$ are no longer
constants, one can consider intervals in which their change is small. These
intervals are comparatively long if $p$, $q$, $r$ are the triplet 1, 0, 0, which is associated
with the abolition of the horizon in the direction having the exponent 1. This
situation, however, does not continue indefinitely and at some other stage, $p$, $q$,
$r$ again have the values 1, 0, 0, but now the index 1 occurs in a different direction.
Thus, one finds an abolition of horizons in one direction at any stage, with the
particular direction, however, changing from stage to stage. Over a large number
of stages, there is the possibility of the complete abolition of the horizons.
Although the horizons may be abolished, however, a smoothing out of the
anisotropies does not generally occur.
Leaving aside the case of the triplet 1, 0, 0, in other cases two of the $p$, $q$, $r$ are
positive and the third negative. Thus, not all directions are expanding and there is
an oscillatory change of expanding and contracting directions as we approach the
singularity. All these conclusions were arrived at as a result of the investigations
of Misner in the U.S.A. and the Soviet group working with Lifshitz.
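The statement about the signs of the exponents can be illustrated with the Kasner relations $p + q + r = 1$ and $p^2 + q^2 + r^2 = 1$, which hold within each approximately power-law interval. These relations and the one-parameter form used below are standard results assumed for this sketch and are not written out in the text. The function name is illustrative.

```python
def kasner_exponents(u):
    """Kasner exponents in the standard one-parameter form (u >= 0)."""
    d = 1.0 + u + u * u
    return (-u / d, (1.0 + u) / d, u * (1.0 + u) / d)


cigar = kasner_exponents(1.0)       # (-1/3, 2/3, 2/3)
special = kasner_exponents(0.0)     # the triplet made of 0, 1, 0
generic = kasner_exponents(2.5)
print(cigar, special, generic)
```

For every finite $u > 0$ exactly one exponent is negative and two are positive; the exceptional triplet 1, 0, 0 appears only at $u = 0$ (and in the limit $u \to \infty$). The value $u = 1$ reproduces the exponents $2/3$, $2/3$, $-1/3$ of the cigar-type collapse met in the Bianchi type-I discussion.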
6. Problems

1. The spherically symmetric metric

$$ds^2 = A(r,t)\,dt^2 - B(r,t)(dx^2 + dy^2 + dz^2),$$

where $r = (x^2 + y^2 + z^2)^{1/2}$, admits rotations about the $x$, $y$ and $z$ axes. Find the
corresponding Killing vectors and express them in spherical polar coordinates
$r$, $\theta$, $\varphi$.

2. In the Schwarzschild field

$$ds^2 = \left(1 - \frac{2m}{r}\right)dt^2 - \left(1 - \frac{2m}{r}\right)^{-1}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$$

the translation $t \to t + \alpha$ maintains the metric unchanged. Find the
corresponding Killing vector and the region in which it is time-like. Calculate the
shear, vorticity, and acceleration for the unit vector along the $t$ axis.

3. Find the condition that the metric of the form

$$ds^2 = dt^2 - dr^2 - dz^2 + 2m\,d\varphi\,dt - l\,d\varphi^2$$

may admit a Killing vector

~a

with

~z

~a, Z

= O.

4. Transform the metric

$$ds^2 = f^2\,d\xi\,d\eta - g^2(dx^2 + dy^2),$$

where $f$ and $g$ are arbitrary functions of $(\xi + \eta)$, to the canonical Bianchi type-I
form.

5. Calculate the shear, vorticity, expansion, and acceleration for the vector $u^{\mu}$
having components $u^{t} = 1$, $u^{\varphi} = 2a$, $u^{r} = u^{z} = 0$ in the metric

$$ds^2 = dt^2 - e^{-a^2r^2}(dr^2 + dz^2) - r^2d\varphi^2 + 2ar^2\,d\varphi\,dt.$$

6. Find the time-like Killing vector for the de Sitter metric

$$ds^2 = dt^2 - e^{\alpha t}[dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\varphi^2]$$

and transform it into a static form.

7. In the Bianchi type-I metric

$$ds^2 = dt^2 - A^2dx^2 - B^2dy^2 - C^2dz^2$$

show that if $A$, $B$, $C$ are unequal, there are no shear-free unit time-like vectors, but if
$A = B$, there is such a vector. Find it.

8. Show that if $\lambda v^{\mu}$ is a Killing vector, where $v^{\mu}$ is the velocity vector, then the
shear and expansion vanish and the acceleration vector is a gradient vector.

9. Show that in a metric $ds^2 = dt^2 - l\,d\varphi^2 + 2m\,d\varphi\,dt - e^{2\psi}(dr^2 + dz^2)$ [where
$l$, $m$ and $\psi$ are functions of $r$], there is a set of time-like geodesics with

$$\frac{dr}{ds} = \frac{dz}{ds} = 0, \qquad \frac{d\varphi}{ds} \neq 0.$$

Calculate the expansion and shear for these geodesics and describe the nature of
the shift of spectral lines that will be observed when both the source and observer
are describing such geodesics.

7. Topics In Spacetime Structure


P. S. JOSHI
Theoretical Astrophysics Group, Tata Institute of Fundamental Research,
Homi Bhabha Road, Bombay 400005, India

1. Introduction

The idea in this chapter is to provide an introduction to several topics concerning
the spacetime structure which have proved to be of considerable importance in
the analysis of problems in gravitation physics. Most of the theories of gravitation
today model the spacetime as a four-dimensional differentiable manifold on
which a Lorentzian metric tensor is globally defined. After setting up this basic
framework in Section 2, we discuss spacetime diffeomorphisms in Section 3. Lie
derivatives and Killing vector fields, which help to define and classify spacetime
symmetries are discussed in Section 4. Finally, in Section 5, we discuss boundary
construction and the useful technique of conformal compactification for
spacetimes which attaches a null boundary to the physical manifold. Whereas
conformal compactification becomes most relevant in the case of asymptotically
flat spacetimes, the procedure is also useful in cosmological situations. Examples
to illustrate the conformal compactification procedure are given.

2. The Manifold Model

We will take the spacetime to be a four-dimensional differentiable manifold
$(M, g)$, where $g$ is a Lorentzian metric tensor of indefinite signature. Apart from
the differential structure on $M$, the spacetime has a topological structure induced
from the companion Euclidean space by the requirement that the open sets in $\mathbb{R}^4$
are pulled back into open sets in $M$ by the mappings $\phi_a^{-1}$, where $\{U_a, \phi_a\}$
denotes the complete atlas covering $M$. It will be assumed that $M$ is Hausdorff,
paracompact, and connected in the induced topology and, further, it is
time-orientable in the sense that there exists a unique, global choice between past
and future in $M$. It can be shown that $M$ will be time-orientable if and only if there
exists a global nonvanishing line-element field (i.e. an assignment of two vectors
$(X, -X)$ at each point of $M$) on $M$ [1].
The indefinite metric,
(1)

completely specifies the causal structure of $M$, giving rise to future and past light
cones in $M$. Since the null geodesics of a spacetime are invariant under a conformal
transformation of the metric tensor, given the above light cone structure, the
metric $g$ is determined up to a conformal factor [2]. Most of the work on
singularity theorems in relativity assumes that the spacetime satisfies certain
reasonable causality conditions. A primary condition in this connection would be
that $(M, g)$ contains no closed non-spacelike curves, which is known as the
causality assumption. Actually, there exists a hierarchy of these regularity
assumptions on $M$ [3]. However, the most important condition after causality
comes from the consideration that, in view of the quantum principles, it is not
possible to determine with absolute certainty the components $g_{ab}$ at a point and,
hence, one would actually demand what is called stable causality for the
spacetime: whenever $(M, g)$ is causal, $(M, \tilde g)$ is also causal for some $\tilde g > g$. (Here
$\tilde g > g$ means that all the non-spacelike vectors with respect to $g$ are timelike with
respect to $\tilde g$.) It should be noted that the causality assumptions are no longer seen
as absolutely essential as far as establishing the existence of spacetime singularities
is concerned. In fact, it has been shown that causality violations are as bad as
singularities in the sense that they themselves generate geodesic incompleteness
in the spacetime in certain situations [4, 5].

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 107-117.
© 1989 by Kluwer Academic Publishers.

Finally, in this connection we would like to comment that the causal and
topological structures of spacetimes are intimately connected. For example, if
$M$ is to satisfy the causality assumption, then it cannot be topologically compact.
To see this, we first note that the family $\{I^+(p);\ p \in M\}$ covers $M$. (Here $I^+(p)$ denotes
all those points $q$ of $M$ which are connected from $p$ by means of future directed
timelike curves. This is also denoted as $p \ll q$. Here we follow the notation and
terminology of [2].) Now suppose, if possible, that $M$ is compact, which implies
that there exists a finite set $p_1, p_2, \ldots, p_n$ such that $M$ is covered by the union
$I^+(p_1) \cup I^+(p_2) \cup \cdots \cup I^+(p_n)$. Then $p_1$ lies in the future of one of the points $p_{i_1}$ of
$p_1, \ldots, p_n$; i.e. $p_{i_1} \ll p_1$, where $p_1 \neq p_{i_1}$ by causality. Continuing this chain we get

$$p_{i_{n-1}} \ll \cdots \ll p_{i_2} \ll p_{i_1} \ll p_1. \tag{2}$$

Since $p_{i_{n-1}}$ must again be in the future of one of the $p_1, p_2, \ldots, p_n$ by assumption,
the chain can always be continued; as only $n$ points are available, some point must
eventually repeat, and by transitivity this gives

$$p_{i_k} \ll p_{i_k} \tag{3}$$

for some $k$, i.e. a closed timelike curve through $p_{i_k}$, which is the violation of causality.

3. Spacetime Diffeomorphisms

We shall work here with spacetimes which are endowed with reasonable
topological and causal properties, as specified in the previous section. The
covering of spacetime with local neighbourhoods endows it with local coordinates.
Let $\phi$ be a map from a spacetime $M$ to another spacetime $N$, i.e. $\phi$ makes
a unique assignment of a point in $N$ for each point in $M$. Then $\phi$ is called
a $C^\infty$-map if the coordinates of $\phi(p)$ are $C^\infty$ functions of the coordinates of $p$. Such
a map induces a pull-back map $\phi^*$ of all functions on $N$ to functions on $M$,
defined by

$$(\phi^* f)(p) = f(\phi(p)). \tag{4}$$

If we denote the set of all tangent vectors at $p$ by the tangent space $T_p$, then $\phi$
induces another map $\phi_*$ which maps tangent vectors of $T_p$ into tangent vectors
of $T_{\phi(p)}$. Effectively, if $V$ is a tangent vector at $p$ to a curve $\lambda(t)$ passing through $p$,
then $\phi_* V$ is tangent to the image curve $\phi(\lambda(t))$ in $N$ at $\phi(p)$, i.e.

$$(\phi_* V)(f) = V(f \circ \phi) = V(\phi^* f), \tag{5}$$

where $f$ is any function on $N$ and $V(f)$ denotes the directional derivative. One
can also pull back a one-form $\omega$ on $N$ into $M$ using $\phi^*$, which can be defined in
coordinate language by

$$(\phi^* \omega)_i V^i = \omega_i (\phi_* V)^i \quad \text{for all tangent vectors } V \text{ at } p. \tag{6}$$
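The relation (5) can be checked numerically on a concrete example. The sketch below is only illustrative: the map `phi`, the test function `f`, the point `p` and the vector `V` are hypothetical choices not taken from the text, and derivatives are approximated by central differences.

```python
import math

def phi(x, y):
    """Hypothetical smooth map phi: R^2 -> R^2 (an assumption for illustration)."""
    return (x * math.cos(y), x * math.sin(y))

def f(a, b):
    """Hypothetical test function on the target manifold N."""
    return a * a + 3.0 * b

def directional(func, point, vec, h=1e-6):
    """Central-difference directional derivative V(func) at `point`."""
    fwd = [p + h * v for p, v in zip(point, vec)]
    bwd = [p - h * v for p, v in zip(point, vec)]
    return (func(*fwd) - func(*bwd)) / (2.0 * h)

def pushforward(point, vec, h=1e-6):
    """Components of phi_* V at phi(point): the Jacobian of phi applied to V."""
    fwd = phi(*[p + h * v for p, v in zip(point, vec)])
    bwd = phi(*[p - h * v for p, v in zip(point, vec)])
    return [(a - b) / (2.0 * h) for a, b in zip(fwd, bwd)]

p = (1.5, 0.4)    # point of M, in local coordinates
V = (0.7, -1.2)   # tangent vector at p

lhs = directional(f, phi(*p), pushforward(p, V))        # (phi_* V)(f)
rhs = directional(lambda x, y: f(*phi(x, y)), p, V)     # V(f o phi)
print(abs(lhs - rhs) < 1e-6)
```

Both sides agree up to finite-difference error, which is just the chain rule expressed in the language of push-forwards.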

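In fact, the maps $\phi^*$ and $\phi_*$ can be extended to carry tensors of the $(r, s)$ type
between $M$ and $N$; however, now we confine ourselves to only those maps $\phi$
which are diffeomorphisms on $M$, i.e. $\phi$ is a one-one, onto map from $M$ to $M$ such
that $\phi^{-1}$ is also a $C^\infty$ map. Then, if $T^{i_1 \ldots i_r}{}_{j_1 \ldots j_s}$ is a tensor of type $(r, s)$ at $p$, we
define the 'carry over' tensor $[\phi_* T]^{i_1 \ldots i_r}{}_{j_1 \ldots j_s}$ at $\phi(p)$ by

$$[\phi_* T]^{i_1 \ldots i_r}{}_{j_1 \ldots j_s} (\omega_1)_{i_1} \cdots (\omega_r)_{i_r} (V_1)^{j_1} \cdots (V_s)^{j_s}
= T^{i_1 \ldots i_r}{}_{j_1 \ldots j_s} (\phi^* \omega_1)_{i_1} \cdots (\phi^* \omega_r)_{i_r} [(\phi^{-1})_* V_1]^{j_1} \cdots [(\phi^{-1})_* V_s]^{j_s}, \tag{7}$$

where $\omega_1, \ldots, \omega_r$ and $V_1, \ldots, V_s$ are one-forms and vectors, respectively, at $\phi(p)$. It
should be noted that for diffeomorphisms

$$\phi^* = (\phi^{-1})_* \quad \text{or} \quad \phi_* = (\phi^{-1})^*, \tag{8}$$

which is not difficult to see, and any one of them can carry over tensors in $M$.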
Amongst the set of all possible mappings from $M$ into $M$, the spacetime
diffeomorphisms have a special significance. It can be seen that if $\phi: M \to N$ is
a diffeomorphism, then $M$ and $N$ have identical manifold structures and the
solutions $(M, T^{(i)})$ and $(N, \phi_* T^{(i)})$ have identical physical properties, where $T^{(i)}$
denote tensor fields on $M$. On the other hand, if $(M, T^{(i)})$ and $(N, T^{(i)})$ are not
diffeomorphically related, then these spacetimes have different and physically
nonequivalent properties. In this sense, spacetime diffeomorphisms represent the
gauge freedom of general relativity, or for that matter, of any spacetime theory
formulated in terms of tensor fields on a spacetime manifold.
It is possible to view diffeomorphisms locally in terms of local coordinate
transformations. Suppose $\phi(p) = q$ and $U$ and $V$ are disjoint coordinate
neighbourhoods of $p$ and $q$, respectively, such that $\phi^{-1}(V) = U$. Let $x^i$ and $y^i$ be
the local coordinate systems in $U$ and $V$. We use $\phi$ to set up a new coordinate
system $x^{i'}$ in the neighbourhood of $p$, given for any $q$ in $U$ by

$$x^{i'}(q) = y^i(\phi(q)). \tag{9}$$

Thus, the effect of $\phi$ now is to induce a coordinate transformation $x^i \to x^{i'}$ while
the tensor fields at $p$ are left invariant. Of course, here again the components of
a tensor field $T$ at $p$ in $x^{i'}$ will be the same as the components of $\phi_* T$ at $\phi(p)$
discussed above. Thus, the local and the coordinate-free approaches are
actually equivalent.
For a diffeomorphism $\phi: M \to M$ and for any tensor field $T$ on $M$, we can
compare $T$ and $\phi_* T$. If

$$T = \phi_* T, \tag{10}$$

then even if we have moved $T$ with $\phi$, it has remained the same. In such a case, $\phi$ is
called a symmetry transformation for the tensor field $T$. In the case of the metric
tensor, such a symmetry transformation, i.e.

$$\phi_* g_{ab} = g_{ab}, \tag{11}$$

is called an isometry. Thus, an isometry is a diffeomorphism on $M$ that leaves the
metric tensor invariant and, hence, distance measurements in $M$ will be invariant
under $\phi$.

4. Killing Vector Fields

We have mentioned symmetry transformations for a spacetime. Such symmetries
are characterized by the existence of the so-called Killing vector fields in $M$. We
will begin here by introducing the notion of the Lie derivative.
Let $X$ be a vector field on $M$, i.e. the assignment of a vector at each point of $M$.
Then the existence theorem in the theory of ordinary differential equations
implies the existence of integral curves for $X$ [6], i.e. there exists a unique maximal
curve $\lambda(t)$ through each point $p \in M$, such that

$$\lambda(0) = p, \qquad \left(\frac{\partial}{\partial t}\right)_{\lambda}\bigg|_{t=0} = X_p. \tag{12}$$

Thus, the tangent vector at the point $\lambda(t)$ is $X|_{\lambda(t)}$.


Next, for each $q \in M$, there exists an open neighbourhood $U$ containing $q$ and
an $\epsilon > 0$ such that $X$ defines a one-parameter family of diffeomorphisms
$\phi_t: U \to M$ for $|t| < \epsilon$, obtained by taking each $p \in U$ a parameter distance $t$
along the integral curves of $X$. These diffeomorphisms form a one-parameter
local group

$$\phi_{t+s} = \phi_t \circ \phi_s \quad \text{for } |t|, |s|, |t+s| < \epsilon, \qquad \phi_{-t} = (\phi_t)^{-1}, \qquad \phi_0(p) = p. \tag{13}$$

These diffeomorphisms map any tensor $T$ at $p$ as

$$T_p \to \phi_{t*} T\big|_{\phi_t(p)}. \tag{14}$$

The Lie derivative of a tensor field $T$ along $X$ is now defined as

$$\mathcal{L}_X T\big|_p = \lim_{t \to 0} \frac{1}{t}\left[(\phi_{-t})_* T - T\right]. \tag{15}$$

Given a derivative operator $\nabla_a$ on $M$, it is possible to write down the components
of the Lie derivative of a tensor $T$, which is again a tensor of the same type. We
shall not go into details, but note in particular that, for the metric tensor $g_{ab}$,

$$\mathcal{L}_X g_{ab} = \nabla_a X_b + \nabla_b X_a, \tag{16}$$

where $\nabla_a$ is the connection compatible with the metric, i.e. $\nabla_a g_{bc} = 0$. The operator
$\nabla_a$ is often termed the covariant derivative on $M$, and (16) can also be written as

$$\mathcal{L}_X g_{ab} = X_{b;a} + X_{a;b}. \tag{17}$$

Suppose now that the local one-parameter group of diffeomorphisms $\phi_t$
generated by a vector field $K$ is a group of isometries, i.e. for every $t$, $\phi_t$ is an
isometry,

$$(\phi_t)_* g_{ab} = g_{ab} \quad \text{for all } t, \tag{18}$$

then $K$ is called a Killing vector field and we have

$$\mathcal{L}_K g = \lim_{t \to 0} \frac{1}{t}\left[(\phi_{-t})_* g - g\right] = 0. \tag{19}$$

Thus, the necessary and sufficient condition for $K$ to be a Killing vector field is
that the Lie derivative of $g$ along $K$ vanishes.
Next, since

$$\mathcal{L}_K g_{ab} = K_{b;a} + K_{a;b}, \tag{20}$$

for a Killing vector field we get

$$K_{b;a} + K_{a;b} = 0, \tag{21}$$

which is called the Killing equation.
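As a worked illustration of the Killing equation, consider flat Minkowski spacetime in Cartesian coordinates, where covariant derivatives reduce to partial derivatives. The symbols $c_a$ and $\omega_{ab}$ below are names introduced here for the integration constants; this is a sketch of the standard argument rather than a derivation taken from the text.

```latex
% Killing equation in Minkowski spacetime, Cartesian coordinates x^a.
\begin{align*}
  \partial_a K_b + \partial_b K_a &= 0 , \\
  \partial_c \partial_a K_b &= 0
    \quad \text{(differentiate once more and permute the indices cyclically)} , \\
  K_a &= c_a + \omega_{ab}\, x^b , \qquad \omega_{ab} = -\omega_{ba} .
\end{align*}
% c_a: 4 translations; \omega_{ab}: 6 rotations and boosts; 4 + 6 = 10 in all.
```

The four constants $c_a$ generate translations and the six independent components of the antisymmetric $\omega_{ab}$ generate rotations and boosts, giving ten independent Killing vector fields.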


As pointed out in Chapter 6 of this volume [7], Killing vector fields play a very
important role in the study of cosmologies which are spatially homogeneous but
anisotropic. In fact, introducing anisotropies or inhomogeneities in a cosmological model, mathematically amounts to simply reducing the number of
independent Killing vectors that might be admitted by the spacetime. Again,
solving Einstein's equations for a completely general spacetime is almost an
impossible task, and normally one has to consider spacetimes admitting various
symmetries which are usually characterized by the existence of Killing vector
fields in the spacetime.
There are certain important implications of having a Killing vector field in
a spacetime which we now discuss:

(1) Since the one-parameter group of diffeomorphisms generated by a Killing
field are all isometries, they all leave the metric invariant. Hence, if a small but
finite displacement is made of all spacetime points by means of $\phi_t$ at each point of
$M$, then the resultant spacetime structure is exactly the same as the original
spacetime; the spacetime is invariant under such a change.
(2) Let $\xi^a$ be a Killing vector field and $\gamma$ be a geodesic with tangent vector $u^a$.
Then $\xi_a u^a$ is constant along $\gamma$. For this, consider

$$u^b \nabla_b (\xi_a u^a) = u^b \xi_a \nabla_b u^a + u^b u^a \nabla_b \xi_a. \tag{22}$$

Now the Killing equation

$$\nabla_a \xi_b + \nabla_b \xi_a = 0, \tag{23}$$

when contracted with $u^a u^b$, implies that the second term of (22) is zero, whereas
the first term vanishes by the geodesic equation, which proves the result.
In general relativity, timelike geodesics represent the particle trajectories,
whereas null geodesics represent light rays. The above result then implies that
every one-parameter group of symmetries gives rise to a conserved quantity for
particles and light rays. Conserved quantities are physically significant and, at the
same time, they help in integrating geodesic equations.
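The conservation law can be illustrated numerically. The sketch below assumes the unit two-sphere with metric $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ and its standard Killing vectors $\xi^\theta = A\sin\varphi + B\cos\varphi$, $\xi^\varphi = (A\cos\varphi - B\sin\varphi)\cot\theta + C$. It integrates a geodesic with a classical Runge-Kutta scheme and monitors $\xi_a u^a$; the initial data, step size and function names are arbitrary illustrative choices.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of the geodesic equations on the unit two-sphere."""
    theta, phi, dtheta, dphi = state
    return (
        dtheta,
        dphi,
        math.sin(theta) * math.cos(theta) * dphi ** 2,
        -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi,
    )

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the affine parameter."""
    def shift(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shift(k1, h / 2))
    k3 = geodesic_rhs(shift(k2, h / 2))
    k4 = geodesic_rhs(shift(k3, h))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def killing_charge(state, A, B, C):
    """xi_a u^a = g_ab xi^a u^b for the Killing vector labelled by (A, B, C)."""
    theta, phi, dtheta, dphi = state
    xi_theta = A * math.sin(phi) + B * math.cos(phi)
    xi_phi = (A * math.cos(phi) - B * math.sin(phi)) / math.tan(theta) + C
    return xi_theta * dtheta + math.sin(theta) ** 2 * xi_phi * dphi

def max_drift(A, B, C, steps=2000, h=1e-3):
    """Largest change of the charge along a geodesic with fixed initial data."""
    state = (1.0, 0.2, 0.3, 0.5)  # (theta, phi, dtheta/ds, dphi/ds), arbitrary
    start = killing_charge(state, A, B, C)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        worst = max(worst, abs(killing_charge(state, A, B, C) - start))
    return worst

drifts = [max_drift(*abc) for abc in [(0, 0, 1), (1, 0, 0), (0, 1, 0)]]
print(all(d < 1e-8 for d in drifts))
```

Each of the three independent Killing vectors yields a quantity that stays constant along the geodesic up to integration error; on the sphere these are the three components of angular momentum.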
(3) Next, consider the following equation defining the Riemann tensor [8]
(24)

If now $\xi^a$, appearing in the above, is a Killing vector field, then using the Killing
equation (23) and the symmetries of the Riemann tensor (24) gives
(25)
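For orientation, the relations of this step can be written out in one commonly used convention (that of Wald's textbook). This convention is an assumption made here: the sign and index placement used in [8] for equations (24) and (25) may differ.

```latex
% Assumed (Wald-style) convention; the chapter's own (24)-(26) may differ
% in sign or index placement.
\begin{align*}
  \nabla_a \nabla_b \xi_c - \nabla_b \nabla_a \xi_c &= R_{abc}{}^{d}\, \xi_d
     && \text{(definition of the Riemann tensor)} \\
  \nabla_a \nabla_b \xi_c &= -R_{bca}{}^{d}\, \xi_d
     && \text{(for a Killing field } \xi^a \text{)} \\
  v^a \nabla_a \xi_b = v^a L_{ab}, \qquad
  v^a \nabla_a L_{bc} &= -R_{bca}{}^{d}\, \xi_d\, v^a
     && \text{(transport along a curve with tangent } v^a \text{)}
\end{align*}
```

The last line is a closed system of ordinary differential equations for the pair $(\xi_a, L_{ab})$ with $L_{ab} = \nabla_a \xi_b$ antisymmetric, so the values of the pair at a single point fix the Killing field everywhere.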
As a consequence of (25), a Killing vector field $\xi^a$ is completely determined by the
values of $\xi^a$ and $L_{ab} = \nabla_a \xi_b$ at any $p \in M$. This means that given $(\xi^a, L_{ab})$ at $p$, the
values $(\xi^a, L_{ab})$ at any other $q$ are determined by integrating the ordinary
differential equations

$$v^a \nabla_a \xi_b = v^a L_{ab}, \tag{26}$$

along any curve joining $p$ to $q$, where $v^a$ denotes the tangent vector to that curve.
This consideration now provides a clue to finding the number of
independent Killing vector fields in a manifold of dimension $n$. This would be the
dimension of the space of initial data $(\xi^a, L_{ab})$, which is, at most, $n + \tfrac{1}{2} n(n - 1)$; i.e.
there can be, at most, $\tfrac{1}{2} n(n + 1)$ independent Killing vector fields in
a spacetime of dimension $n$. As an example, it is possible to solve the Killing
equation for the Minkowski space to see that this spacetime admits 10 Killing
vectors, which is the maximum number possible for a four-dimensional
spacetime. Another interesting example for solving the Killing equations is that of
a two-sphere, where the metric is given by

$$ds^2 = d\theta^2 + \sin^2\theta \, d\varphi^2 \tag{27}$$

and the Killing vectors turn out to be

$$\xi^\theta = A \sin\varphi + B \cos\varphi, \qquad \xi^\varphi = (A \cos\varphi - B \sin\varphi) \cot\theta + C, \tag{28}$$

where $A$, $B$, $C$ are constants.
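That the vector field with components $\xi^\theta = A\sin\varphi + B\cos\varphi$, $\xi^\varphi = (A\cos\varphi - B\sin\varphi)\cot\theta + C$ satisfies the Killing equation on the two-sphere can be verified numerically. The sketch below evaluates the coordinate expression $(\mathcal{L}_\xi g)_{ab} = \xi^c \partial_c g_{ab} + g_{cb}\partial_a \xi^c + g_{ac}\partial_b \xi^c$ with central differences; the constants, sample points and helper names are arbitrary choices made for illustration.

```python
import math

A, B, C = 0.8, -0.3, 1.7  # arbitrary illustrative constants

def xi(theta, phi):
    """Contravariant components (xi^theta, xi^phi) of the candidate Killing field."""
    return (
        A * math.sin(phi) + B * math.cos(phi),
        (A * math.cos(phi) - B * math.sin(phi)) / math.tan(theta) + C,
    )

def metric(theta, phi):
    """Components g_ab of ds^2 = dtheta^2 + sin^2(theta) dphi^2."""
    return ((1.0, 0.0), (0.0, math.sin(theta) ** 2))

def partial(fn, x, a, h=1e-5):
    """Central-difference partial derivative of scalar fn along coordinate a."""
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    return (fn(*xp) - fn(*xm)) / (2.0 * h)

def lie_derivative_of_metric(x):
    """(L_xi g)_ab = xi^c d_c g_ab + g_cb d_a xi^c + g_ac d_b xi^c."""
    v, g = xi(*x), metric(*x)
    result = [[0.0, 0.0], [0.0, 0.0]]
    for a in range(2):
        for b in range(2):
            for c in range(2):
                result[a][b] += (
                    v[c] * partial(lambda t, p: metric(t, p)[a][b], x, c)
                    + g[c][b] * partial(lambda t, p: xi(t, p)[c], x, a)
                    + g[a][c] * partial(lambda t, p: xi(t, p)[c], x, b)
                )
    return result

points = [(1.1, 0.6), (0.7, 2.5), (2.0, -1.3)]  # sample points away from the poles
max_residual = max(
    abs(entry)
    for x in points
    for row in lie_derivative_of_metric(x)
    for entry in row
)
print(max_residual < 1e-7)
```

All components of the Lie derivative vanish to within finite-difference error for any choice of the three constants, consistent with the maximum of $\tfrac{1}{2}n(n+1) = 3$ Killing vectors for $n = 2$.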


In a cosmological situation, it is possible to characterize the physical
conditions of homogeneity and isotropy in terms of the existence of Killing
symmetries on a spacetime [8]. The two-sphere mentioned above would then be
homogeneous, as it admits two linearly independent Killing vectors at each point.
In such a context, the spatial sections of the standard Friedmann-Robertson-Walker
cosmological models [8] would be maximally symmetric in the sense that they admit all
the $\tfrac{1}{2} n(n + 1)$ different Killing vector fields possible for $n = 3$.
5. Boundary Attachment and Conformal Compactification for Spacetimes
One would like to study, within a general spacetime framework, the behaviour of
fields in the limit of going away to infinity, which may be only in space or in both
space and time. Such a task presents no special problems when the spacetime
under consideration is the flat Minkowski spacetime. Since this spacetime admits
a global inertial coordinate system, it is possible to give an unambiguous meaning
to the notion of infinity here. For example, one can define an isolated charge
distribution in a precise manner by requiring the charge-current density $j$ to
vanish outside a world tube of compact spatial support and that the
electromagnetic field tensor $F = O(1/r^2)$ as $r \to \infty$ in space and $F = O(1/r)$ as $r \to \infty$
along any null geodesic. In a general curved background, however, we have no
natural global inertial coordinate systems to define a preferred radial coordinate,
such as $r$, whose limit can be used in specifying falloff rates. In order that
asymptotic analysis becomes possible in a general background, one might require
that there exists some coordinate system $x^i$ such that the metric components
become Minkowskian for large values of $r$, where $r$ is defined to be
$[(x^1)^2 + (x^2)^2 + (x^3)^2]^{1/2}$. The problem with this kind of criterion, however, is that it is
extremely difficult to check its coordinate independence. Somehow one would
like to make meaningful statements regarding infinity in a coordinate invariant
way.
It is possible to satisfactorily resolve this question in a coordinate
independent way by realizing that for a given, general curved spacetime, there
exists a natural boundary, consisting of points at infinity, which is not part of the
spacetime. There are several ways in which such a boundary could be attached to
a general spacetime, an excellent example being the causal boundary construction
given by Geroch, Kronheimer and Penrose [9], which we will briefly
describe here, as it will help to clarify the basic ideas involved.
Given a future directed timelike curve $\gamma$ with the future end point $p$, we have
$I^-(p) = I^-(\gamma)$, i.e. both $p$ and $\gamma$ have the same past. In such a situation, the
spacetime point $p$ can be said to be characterized by the past set $I^-(\gamma)$. On the
other hand, if $\gamma$ were future endless, $I^-(\gamma)$ would define a point of future infinity,
whereas if $\gamma$ were a null geodesic, $I^-(\gamma)$ defines a point at future null infinity. In
general, a nonempty subset $P$ of $M$ is said to be a past set if there is some $A \subset M$
such that $I^-(A) = P$. If $P$ cannot be expressed as the union of two proper past sets,
then it is called an indecomposable past set (IP). If $P$ is an IP and if there is some
$x \in M$ with $I^-(x) = P$, then $P$ is known as a proper IP or a PIP. If an IP is not
a PIP, then it is termed a terminal IP or TIP. For example, for the future endless
timelike $\gamma$ mentioned above, $I^-(\gamma)$ would be a TIP. The definitions of future sets,
indecomposable future sets (IF), PIF, and TIF are dual. Now let $M^\#$ be the union
of $\hat{M}$ and $\check{M}$, which are the collections of all IPs and IFs in $M$, respectively. One can
avoid duplications in $M^\#$ by defining $M^*$ as the quotient space $M^\#/R_h$, where $R_h$ is
the intersection of all equivalence relations $R \subset M^\# \times M^\#$ for which $M^\#/R$ is
Hausdorff [9]. Now, $M^*$ can be viewed as a spacetime with boundary, $M \subset M^*$,
and the topology on $M$ can be looked upon as the induced topology of $M^*$.
A point $x \in M^*$ is said to be a regular point if it is represented by a PIP or PIF. All
other points in $M^*$, represented by TIPs or TIFs, are the boundary points for
$M$ which form the infinity attached to $M$, called the ideal point boundary. Since the
construction described above uses only the causal structure of $M$, it is clearly
coordinate invariant.
Although the boundary construction above attaches both timelike and null
ideal points to $M$, when one is interested in analyzing radiation fields only in
a background containing a single, isolated source (such as the Schwarzschild or
Kerr-Newman geometries), the special technique of conformal compactification
of spacetimes becomes very relevant. Here, one attaches a null infinity to the
spacetime $M$ by transforming the original spacetime metric by means of
a suitable conformal factor. The resulting unphysical manifold $\bar{M}$ with boundary
is causally equivalent to $M$ since light cones remain unaltered under a conformal
transformation and, hence, the construction is again coordinate independent.
We shall first perform such an attachment of a null boundary in the case of
Minkowski spacetime and then analyze the structure of this null infinity. This
also provides an answer to the question of when to call a general spacetime
asymptotically flat. Since the Minkowski spacetime is asymptotically flat by
definition, the criterion is that a general spacetime will be called asymptotically
flat if it admits a null infinity whose structure is similar to that of the
Minkowskian null infinity.
In spherical polar coordinates, the Minkowski line element takes the form

$$ds^2 = dt^2 - dr^2 - r^2 (d\theta^2 + \sin^2\theta \, d\varphi^2). \tag{29}$$

Introducing advanced and retarded null coordinates given by

$$v = t + r, \qquad u = t - r, \tag{30}$$

takes us to a reference frame based on null cones which is most suitable for
analyzing the radiation fields, and the metric becomes

$$ds^2 = du \, dv - \tfrac{1}{4} (v - u)^2 (d\theta^2 + \sin^2\theta \, d\varphi^2). \tag{31}$$

Now the information at future null infinity corresponds to taking limits as $v \to \infty$,
i.e. we go along $u = \text{const.}$ light cones all the way into the future. Similarly, past null
infinity corresponds to $u \to -\infty$. But this is still a coordinate-dependent description
which may not easily generalize to general spacetimes. Such a generalization can be
achieved by introducing a conformal transformation of the original metric

$$d\bar{s}^2 = \Omega^2 \, ds^2 \tag{32}$$

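with

$$\Omega^2 = (1 + v^2)^{-1} (1 + u^2)^{-1}. \tag{33}$$

We also introduce new coordinates $p$ and $q$ by

$$v = \tan p, \qquad u = \tan q. \tag{34}$$

Then the coordinate ranges for $p$ and $q$ are

$$-\pi/2 < p < \pi/2, \qquad -\pi/2 < q < \pi/2 \tag{35}$$

(with $p \geq q$, since $r \geq 0$) and the metric becomes

$$d\bar{s}^2 = dp \, dq - \tfrac{1}{4} \sin^2(p - q) \, (d\theta^2 + \sin^2\theta \, d\varphi^2). \tag{36}$$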

It is now possible to see that the metric (36) with coordinate ranges (35) is just
a manifold embedded in the Einstein static universe. To see this, write

$$T = p + q, \qquad R = p - q. \tag{37}$$

Then (36) becomes, in $(T, R, \theta, \varphi)$ coordinates,

$$d\bar{s}^2 = \tfrac{1}{4} \left[ dT^2 - dR^2 - \sin^2 R \, (d\theta^2 + \sin^2\theta \, d\varphi^2) \right] \tag{38}$$

with coordinate ranges

$$-\pi < T + R < \pi, \qquad -\pi < T - R < \pi. \tag{39}$$

Up to the constant factor $\tfrac{1}{4}$, this is precisely the natural Lorentz metric on
$S^3 \times \mathbb{R}$, which is the Einstein static universe, except that the coordinate ranges
are restricted by (39). The future null infinity here is given by $T = \pi - R$ for
$0 < R < \pi$, whereas the past null infinity is given by $T = -\pi + R$ for $0 < R < \pi$.
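The compactifying coordinate change can be tried out numerically. The sketch below assumes the chain of definitions $v = t + r$, $u = t - r$, $v = \tan p$, $u = \tan q$, $T = p + q$, $R = p - q$; the function name `compactify` and the sample events are illustrative choices.

```python
import math

def compactify(t, r):
    """Map Minkowski (t, r) with r >= 0 to the compactified coordinates (T, R)."""
    v, u = t + r, t - r                 # advanced and retarded null coordinates
    p, q = math.atan(v), math.atan(u)   # p, q lie in (-pi/2, pi/2)
    return p + q, p - q

# The origin maps to the origin.
origin = compactify(0.0, 0.0)

# Far along the outgoing null ray u = 0 (t = r) we approach T + R = pi,
# i.e. a point of future null infinity.
T_far, R_far = compactify(1e8, 1e8)

# Far out on the slice t = 0 we approach spatial infinity (T, R) = (0, pi).
T_sp, R_sp = compactify(0.0, 1e8)

print(origin, T_far + R_far, (T_sp, R_sp))
```

The whole of Minkowski spacetime is thus squeezed into the finite triangle $-\pi < T \pm R < \pi$, $R \geq 0$, and the points at infinity appear as its edges and corners.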
We now introduce the notation $\mathscr{I}^+$ ($\mathscr{I}^-$) for the future (past) null infinity. To analyse
the structure of $\mathscr{I}^+$, we note that, as mentioned earlier, given any event along the
time axis in the Minkowski spacetime, the future light cone at that event can be
labelled by a constant value of the retarded time $u$. Next, the set of all possible
directions of light rays is in one-one correspondence with the set of points of the
two-sphere $S^2$. Hence, $\mathscr{I}^+$ is a three-dimensional manifold which has the topology
of $S^2 \times \mathbb{R}$.
To see several features related to $\mathscr{I}^+$ clearly, it is very useful to write the metric
(29) in the $(u, r, \zeta, \bar\zeta)$ coordinate system, where $u$ is the retarded time defined earlier and
$\zeta, \bar\zeta$ are complex stereographic coordinates on the sphere defined by

$$\zeta = e^{i\varphi} \cot(\theta/2). \tag{40}$$

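The metric could then be written as [10]

$$ds^2 = du^2 + 2 \, du \, dr - r^2 \frac{d\zeta \, d\bar\zeta}{P_0^2}, \tag{41}$$

where $P_0 = (1 + \zeta\bar\zeta)/2$. Introducing a new coordinate $l = r^{-1}$ and a conformal
factor $\Omega = l$, the metric becomes

$$d\bar{s}^2 = l^2 \, du^2 - 2 \, du \, dl - \frac{d\zeta \, d\bar\zeta}{P_0^2}. \tag{42}$$

As $r \to \infty$, $l \to 0$ and the null infinity $\mathscr{I}^+$ is defined by the condition $l = 0$. Future
directed null cones are characterized by the values $u, \zeta, \bar\zeta$ and these can be used as
coordinates on $\mathscr{I}^+$. In these coordinates, a $u = \text{const.}$ section of $\mathscr{I}^+$ has the metric

$$ds^2 = -\frac{d\zeta \, d\bar\zeta}{P_0^2}. \tag{43}$$

In the coordinate system $(u, l, \zeta, \bar\zeta)$, it is easy to see that

$$\frac{\partial \Omega}{\partial x^i} = (0, 1, 0, 0) \tag{44}$$

and

$$g^{ij} \frac{\partial \Omega}{\partial x^i} \frac{\partial \Omega}{\partial x^j} \bigg|_{\mathscr{I}^+} = 0. \tag{45}$$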

Thus, $\Omega$ is differentiable on $\bar{M}$, the new unphysical manifold with boundary, and
$\partial\Omega/\partial x^i$ is a null vector. Since $\Omega = 0$ at $\mathscr{I}^+$, this gives that $\mathscr{I}^+$ is a null hypersurface
and its null generators are defined by $\zeta, \bar\zeta = \text{const.}$
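The null character of the gradient of the conformal factor can be made explicit by inverting the metric. The sketch below assumes the rescaled Minkowski metric in the form $d\bar s^2 = l^2\,du^2 - 2\,du\,dl - d\zeta\,d\bar\zeta/P_0^2$ with $\Omega = l$, and only the $(u, l)$ block is needed.

```latex
% Assumed rescaled metric in coordinates x^i = (u, l, \zeta, \bar\zeta):
%   d\bar s^2 = l^2 du^2 - 2\,du\,dl - d\zeta\, d\bar\zeta / P_0^2 .
\begin{align*}
  \bar g_{ij}\big|_{(u,l)} &= \begin{pmatrix} l^2 & -1 \\ -1 & 0 \end{pmatrix},
  \qquad
  \bar g^{ij}\big|_{(u,l)} = \begin{pmatrix} 0 & -1 \\ -1 & -l^2 \end{pmatrix}, \\
  \bar g^{ij}\,\partial_i \Omega\, \partial_j \Omega &= \bar g^{ll} = -l^2
  \;\longrightarrow\; 0 \quad \text{as } l \to 0 .
\end{align*}
```

The normal to the surface $l = 0$ therefore becomes null there, which is the statement that the boundary is a null hypersurface.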
As a final comment on the null infinities $\mathscr{I}^\pm$ and the conformal compactification
of spacetimes to attach these null boundaries, we consider here a somewhat more
general metric, which is that for a Schwarzschild spacetime. This metric,
describing the geometry exterior to an isolated, nonrotating body, can be
given in the $(u, r, \theta, \varphi)$ coordinates as

$$ds^2 = \left(1 - \frac{2m}{r}\right) du^2 + 2 \, du \, dr - r^2 (d\theta^2 + \sin^2\theta \, d\varphi^2), \tag{46}$$

where $u = t - r - 2m \log(r - 2m)$ is the retarded time. As before we use
stereographic coordinates $\zeta, \bar\zeta$ and conformally transform the metric by

$$\Omega = l = r^{-1},$$

which gives

$$d\bar{s}^2 = l^2 \, ds^2 = (l^2 - 2 m l^3) \, du^2 - 2 \, du \, dl - \frac{d\zeta \, d\bar\zeta}{P_0^2}. \tag{47}$$

The new coordinate $l$ is finite for $r = \infty$ and the metric (47) is regular at $\mathscr{I}^+$,
defined by $l = 0$, $u$ finite. Thus, the metric on a $u = \text{const.}$ section of $\mathscr{I}^+$ is again
given by (43) and, hence, the topology of the null hypersurface $\mathscr{I}^+$ is $S^2 \times \mathbb{R}$ again,
as in the case of Minkowski spacetime. Thus, according to the criteria set up
earlier in this section, Schwarzschild spacetime will be considered to be
asymptotically flat. The conformal compactification procedure can also be
carried out for nonasymptotically flat, cosmological situations, such as Friedmann
models, with useful consequences. We refer to [11] for details of such an
approach.
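The steps leading to the rescaled Schwarzschild metric can be written out as follows. This is a sketch assuming the retarded form of the metric, $ds^2 = (1 - 2m/r)\,du^2 + 2\,du\,dr - r^2\, d\zeta\, d\bar\zeta / P_0^2$, and the conformal factor $\Omega = l = r^{-1}$.

```latex
\begin{align*}
  ds^2 &= \Bigl(1 - \frac{2m}{r}\Bigr) du^2 + 2\,du\,dr
          - r^2\,\frac{d\zeta\,d\bar\zeta}{P_0^2},
  \qquad r = \frac{1}{l},\quad dr = -\frac{dl}{l^2}, \\
  d\bar s^2 = l^2\, ds^2
  &= l^2 (1 - 2ml)\, du^2 - 2\,du\,dl - \frac{d\zeta\,d\bar\zeta}{P_0^2} \\
  &= (l^2 - 2m l^3)\, du^2 - 2\,du\,dl - \frac{d\zeta\,d\bar\zeta}{P_0^2}.
\end{align*}
```

Setting $m = 0$ recovers the rescaled Minkowski metric, and the mass term $-2ml^3\,du^2$ vanishes faster than the other terms as $l \to 0$, which is why the structure of the null boundary is unchanged.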

References
1. R. P. Geroch and G. Horowitz, in S. W. Hawking and W. Israel (eds.), General Relativity, an
Einstein Centenary Survey, CUP, Cambridge (1979).
2. S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Spacetime, CUP, Cambridge
(1973).
3. B. Carter, Gen. Relat. Grav. 1, 349 (1971).
4. F. J. Tipler, Ann. Phys. 108, 1 (1977).
5. P. S. Joshi, Phys. Lett. 85A, 319 (1981).
6. J. C. Burkill, Theory of Ordinary Differential Equations, Oliver and Boyd, Edinburgh (1956).
7. A. K. Raychaudhuri, An approach to anisotropic cosmologies, chapter 6 in this volume.
8. J. V. Narlikar, General Relativity and Cosmology, Macmillan, London (1978).
9. R. Geroch, E. Kronheimer and R. Penrose, Proc. Roy. Soc. 327A, 545 (1972).
10. P. S. Joshi, C. Kozameh and E. T. Newman, J. Math. Phys. 24, 2490 (1983).
11. R. Penrose, in C. M. DeWitt and J. A. Wheeler (eds.), Battelle Rencontres, W. A. Benjamin, New
York (1968); R. Penrose, in General Relativity, an Einstein Centenary Survey of Ref. [1] above.

8. Differential Forms and Einstein-Cartan Theory

A. R. PRASANNA
Physical Research Laboratory, Navrangpura, Ahmedabad 380009, India

Notation

In the following we use the Lorentz metric with signature $-2$, and Latin letters
for vector fields and Greek letters for covector fields or forms. The usual
conventions $\in$ (belonging to), $\cup$ (union) and $\cap$ (intersection) are adopted. The
brackets $[\ldots]$ around a set of indices denote antisymmetrization, $\otimes$ denotes the Cartesian
product, $\delta$ denotes the variational derivative, $\partial$ the partial derivative and $D$, $\nabla$,
or $;$ the covariant derivative. Einstein's summation convention is adopted
wherever necessary.

1. Basic Definitions

Manifold: A differentiable manifold $M$ is a collection of open neighbourhoods
$U_i$ on $M$ such that (1) $\bigcup_i U_i$ covers $M$ and (2) if $\phi_i$ are one-to-one mappings of $U_i$
onto open subsets of $\mathbb{R}^n$, then for any two neighbourhoods $U_i$ and $U_j$ with
nonempty intersection, the mappings $\phi_i$ and $\phi_j$ are differentiable functions of
each other in the overlapping region $U_i \cap U_j$.
Chart and Atlas: The set $U_i$ along with its mapping $\phi_i$, $(U_i, \phi_i)$, is called a chart
and a collection of charts $\{(U_i, \phi_i)\}$ is called an atlas. If $x^i$ are a set of local
coordinates defined by the map $\phi_i$ on $U_i$, then $U_i$ is called a local coordinate
neighbourhood. The set of all possible coordinate systems covering $M$ is called
a complete atlas.
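A manifold is said to be orientable if there exists an atlas $\{U_i, \phi_i\}$ in the
complete atlas such that in every nonempty intersection $U_i \cap U_j$, the Jacobian
determinant $\det(\partial x^i / \partial x'^j)$ of the coordinate transformation, where $x^i$ and $x'^j$
are the local coordinates on $U_i$ and $U_j$, is positive.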
$M$ is said to be paracompact if, for every atlas $\{U_i, \phi_i\}$, there exists a locally
finite atlas $\{V_j, \psi_j\}$ with each $V_j$ contained in some $U_i$. (An atlas is said to be
locally finite if, for every point $p \in M$, there exists an open neighbourhood which
intersects only a finite number of sets in the atlas.)

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 119-132.
© 1989 by Kluwer Academic Publishers.

A function $f$ on $M$ is a mapping from $M$ to $\mathbb{R}^1$. Given a chart $(U_i, \phi_i)$ of $M$, where
$U_i$ is any local coordinate neighbourhood of a point $p \in M$, if the expression
$f \circ \phi_i^{-1}$ is a $C^r$ function of the local coordinates at $p$, then $f$ is said to be of class $C^r$
at $p$. If it is so for every $p \in M$, then $f$ is a $C^r$ function.
A curve $\lambda(\tau)$ in $M$ is a map of an interval in $\mathbb{R}^1$ into $M$.
The set of all real-valued functions on $M$ forms a ring. If $F(M)$ denotes a subring
of the ring of all basic functions on $M$, then a vector $V$ (also known as a tangent
vector) at a point $p$ on $M$ is defined as a linear mapping of $F(M)$ to $\mathbb{R}$ such that for
$f$ and $g \in F$,

$$V(fg) = V(f) \, g(p) + V(g) \, f(p).$$

In particular, if $\lambda(\tau)$ is a $C^1$ curve on $M$, then the tangent vector $V$ at the point
$p = \lambda(\tau_0)$ is the mapping that takes each $C^1$ function $f$ at $p$ to the number

$$\left(\frac{\partial f}{\partial \tau}\right)_{\lambda(\tau_0)}.$$

Equivalently, $Vf$ is the derivative of $f$ in the direction of the curve $\lambda(\tau)$ at $\tau = \tau_0$.


If $U$ is a coordinate neighbourhood of the point $p$ with a local coordinate
system $x^i$, then by definition

$$Vf \equiv \left(\frac{\partial f}{\partial \tau}\right)_{\tau_0} = \frac{dx^i}{d\tau} \frac{\partial f}{\partial x^i}\bigg|_{\tau_0},$$

which means that the set of elements $(\partial/\partial x^i)_p$ forms a basis for every vector $V$ at $p$.
It follows, therefore, that the set of vectors or tangent vectors at $p$ forms a vector
space called the tangent space $T_p$ at $p$. The union of all the tangent spaces for all
$p \in M$ forms what is known as the tangent bundle $T(M)$ on $M$,

$$T(M) = \bigcup_{p \in M} T_p.$$

If we consider the basis elements of the tangent space denoted by $e_i$, then there
exists a dual basis $e^j$ such that $\langle e^j, e_i \rangle = \delta^j_i$, the Kronecker delta. The elements
constructed using these objects $e^j$ are called covectors or one-forms or differential
forms.
The space spanned by the covectors is called the cotangent space $T_p^*$.
By definition, a differential form is a real-valued linear function on the tangent
space. If at $p \in M$, $V$ is a vector and $\omega$ is a one-form, then $\omega$ maps $V$ onto a number
in the same way as $V$ maps a function at $p$ onto a number:

$$\langle \omega, V \rangle = \langle P_j e^j, a^i e_i \rangle = P_i a^i.$$

In particular, as we are using a coordinate basis, we will have $e_i = \partial/\partial x^i$ and then
obviously the dual basis $e^j = dx^j$. Thus, we have that every one-form $\omega$ can be
expressed in terms of its components with respect to a local coordinate system $x^i$
as

$$\omega = P_i \, dx^i.$$

2. Algebra and Calculus of Forms

As the one-forms or covectors are the elements of the dual space $T_p^*$ of $T_p$, which
is a linear vector space, they satisfy the same rules of addition and scalar
multiplication as vectors,

$$\langle \omega, aU + bV \rangle = a \langle \omega, U \rangle + b \langle \omega, V \rangle, \tag{2.1}$$

where $\omega \in T_p^*$, $U, V \in T_p$, and $a, b \in \mathbb{R}$.

Given two linear one-forms $\omega$ and $\sigma$, their product form is given by the exterior
product or wedge product defined by

$$\omega \wedge \sigma = (\omega \otimes \sigma - \sigma \otimes \omega) = -\sigma \wedge \omega, \tag{2.2}$$

which is a two-form. This product is distributive and associative. With this rule
we can form covectors or differential forms of any higher order as follows.
If $\lambda$ is an $r$-form and $\mu$ is an $s$-form, then

$$\lambda \wedge \mu = (-1)^{rs} \mu \wedge \lambda. \tag{2.3}$$
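In components, the wedge product of two one-forms is the antisymmetrized outer product, $(\omega \wedge \sigma)_{ij} = \omega_i \sigma_j - \sigma_i \omega_j$. The sketch below, with hypothetical helper names and arbitrary integer components, checks the antisymmetry, the vanishing of $\omega \wedge \omega$, and the value of the two-form on a pair of vectors.

```python
def wedge(omega, sigma):
    """Components (omega ^ sigma)_ij = omega_i sigma_j - sigma_i omega_j."""
    n = len(omega)
    return [[omega[i] * sigma[j] - sigma[i] * omega[j] for j in range(n)]
            for i in range(n)]

def evaluate(two_form, U, V):
    """Value of a two-form on the pair of vectors (U, V)."""
    n = len(U)
    return sum(two_form[i][j] * U[i] * V[j] for i in range(n) for j in range(n))

def pair(form, vec):
    """Contraction <form, vec> of a one-form with a vector."""
    return sum(f * v for f, v in zip(form, vec))

omega = [1, 2, 3]   # components of a one-form (arbitrary)
sigma = [4, 5, 6]   # components of another one-form (arbitrary)

w = wedge(omega, sigma)
s = wedge(sigma, omega)
antisymmetric = all(w[i][j] == -s[i][j] for i in range(3) for j in range(3))
self_wedge_zero = all(c == 0 for row in wedge(omega, omega) for c in row)

U, V = [1, 0, 2], [0, 1, 1]
lhs = evaluate(w, U, V)
rhs = pair(omega, U) * pair(sigma, V) - pair(sigma, U) * pair(omega, V)
print(antisymmetric, self_wedge_zero, lhs, rhs)
```

The last two numbers agree because evaluating $\omega \otimes \sigma - \sigma \otimes \omega$ on $(U, V)$ gives $\omega(U)\sigma(V) - \sigma(U)\omega(V)$.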

Thus, in terms of a coordinate basis, an $r$-form, which is also known as a completely
skew-symmetric covariant tensor of type $(0, r)$, can be expressed as

$$\lambda = \frac{1}{r!} \lambda_{[a_1 \ldots a_r]} \, dx^{a_1} \wedge dx^{a_2} \wedge \cdots \wedge dx^{a_r}. \tag{2.4}$$

The space of forms constitutes an associative but noncommutative algebra of
dimension $2^n$ ($n$ being the dimension of the manifold) over the set of reals, called
the Grassmann algebra associated with the vector space.
2.1. Exterior Differentiation

The operation of exterior differentiation is a mapping that takes an $r$-form to
an $(r + 1)$-form, such that
(a) $d(\lambda + \mu) = d\lambda + d\mu$,
(b) $d(\lambda \wedge \mu) = d\lambda \wedge \mu + (-1)^{\deg \lambda} \lambda \wedge d\mu$,
(c) $d(d\omega) = 0$ for every $\omega$ (the Poincaré lemma).

Since functions are zero-form fields, by operating with $d$ we get the one-form fields
$df$, the well known total differential. In terms of a coordinate basis we have

$$df = \frac{\partial f}{\partial x^i} \, dx^i. \tag{2.5}$$

Acting on an $r$-form field $\lambda$, we get the $(r + 1)$-form field

$$d\lambda = \frac{1}{r!} \lambda_{[a_1 \ldots a_r], i} \, dx^i \wedge dx^{a_1} \wedge \cdots \wedge dx^{a_r}. \tag{2.6}$$


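2.2. Mapping of Forms

Suppose we have two manifolds $M$ and $N$ with $f$ as a mapping of $U \subset M$ into
$V \subset N$. Let $y^i$ and $x^i$ be the local coordinates on $U$ and $V$, respectively. If $\lambda$ is an
$r$-form on $N$, its pull back on $M$, $f^* \lambda$, is given by using the differential relations
between the two local coordinate systems $x^i$ and $y^i$ as follows:

$$\lambda = \frac{1}{r!} \lambda_{[i_1 \ldots i_r]} \, dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_r},$$

$$f^* \lambda = \frac{1}{r!} \lambda_{[i_1 \ldots i_r]} \frac{\partial x^{i_1}}{\partial y^{j_1}} \cdots \frac{\partial x^{i_r}}{\partial y^{j_r}} \, dy^{j_1} \wedge dy^{j_2} \wedge \cdots \wedge dy^{j_r}. \tag{2.7}$$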

This pull-back operation ($f^*$) is independent of the nature of the coordinate systems
and commutes with the operations $+$ and $\wedge$; that is, it satisfies
(a) $f^*(\lambda + \mu) = f^* \lambda + f^* \mu$,
(b) $f^*(\lambda \wedge \mu) = (f^* \lambda) \wedge (f^* \mu)$,
(c) for every $m$-form $\omega$ on $V$, we have $d(f^* \omega) = f^* d\omega$,
(d) if $f: U \to V$ and $g: V \to W$, then

$$(g \circ f)^* = f^* \circ g^*. \tag{2.8}$$

Property (c), which says that the exterior derivative $d$ and pull back $f^*$
commute, is essentially a reassertion of the chain rule for the partial
derivative, for the exterior derivative of a differential form is independent of the
coordinate system in which it is computed.
The algebra of forms on a manifold with the operations sum ($+$), product ($\wedge$),
and differentiation ($d$) is called the Cartan differential algebra.
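Simple examples on $E^3$:
Any function $f$ on $E^3$ is a zero-form. Any vector field of type $\omega = P \, dx +
Q \, dy + R \, dz$ is a one-form and a polar vector field of type $\eta = A \, dy \, dz +
B \, dz \, dx + C \, dx \, dy$ is a two-form.
Applying the exterior derivative operator $d$ on these yields

$$df = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy + \frac{\partial f}{\partial z} dz,$$

the gradient,

$$d\omega = dP \wedge dx + dQ \wedge dy + dR \wedge dz
= \left(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\right) dy \, dz
+ \left(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\right) dz \, dx
+ \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx \, dy,$$

which is the curl or rotation, and

$$d\eta = \left(\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} + \frac{\partial C}{\partial z}\right) dx \, dy \, dz,$$

the divergence.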
With these identifications, it is trivial to verify that the Poincaré lemma
$d(d\omega) = 0$ is simply a reexpression of the familiar identities

$$\operatorname{curl}(\operatorname{grad} f) = 0 \quad \text{and} \quad \operatorname{div}(\operatorname{curl} V) = 0.$$
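For a zero-form the Poincaré lemma can be checked in one line; the sketch below uses only the coordinate expression of $d$ and the symmetry of mixed partial derivatives.

```latex
\begin{align*}
  df &= \frac{\partial f}{\partial x^i}\, dx^i, \\
  d(df) &= \frac{\partial^2 f}{\partial x^j\, \partial x^i}\, dx^j \wedge dx^i
        = \frac{1}{2}\left(\frac{\partial^2 f}{\partial x^j\, \partial x^i}
          - \frac{\partial^2 f}{\partial x^i\, \partial x^j}\right) dx^j \wedge dx^i = 0 .
\end{align*}
```

The symmetric second derivatives are contracted with the antisymmetric $dx^j \wedge dx^i$, which on $E^3$ is exactly the statement that the curl of a gradient vanishes.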

One other operation of interest in the space of forms is the interior product
defined between a vector $V$ and an $m$-form $p$, denoted variously as
$V \lrcorner\, p$ or $i_V p$ or $i(V) p$.
If $p$ is a one-form,

$$V \lrcorner\, p = V^i e_i \lrcorner\, p_j e^j = V^i p_j \delta^j_i = V^i p_i.$$

The $i_V$ operation thus reduces the degree of the form. For an $m$-form

$$p = \frac{1}{m!} p_{[l_1 \ldots l_m]} \, e^{l_1} \wedge \cdots \wedge e^{l_m},$$

$$V \lrcorner\, p = \frac{1}{(m - 1)!} V^{l_1} p_{[l_1 l_2 \ldots l_m]} \, e^{l_2} \wedge \cdots \wedge e^{l_m}. \tag{2.9}$$

The operator $i_V$ satisfies
(a) $i_V f = 0$,
(b) $i_V e^i = V^i$,
(c) $i_V^2 = 0$.    (2.10)

Most interesting of all, the operations $i_V$ and d together define the Lie
derivative of forms with respect to the vector field V, given by
\[ \mathcal{L}_V = i_V\, d + d\, i_V. \tag{2.11} \]


With this it can be verified that
and
\[ dp(V, W) = \mathcal{L}_V\, p(W) - \mathcal{L}_W\, p(V) - p([V, W]). \]
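As a small worked check of the Lie derivative formula (2.11), one may take a coordinate basis $e^i = dx^i$ (an assumption of this sketch; the text itself also allows anholonomic frames) and apply the formula to a function f and to a one-form $p = p_j\,dx^j$:

```latex
% Function: i_V f = 0, so only the first term of (2.11) contributes.
\mathcal{L}_V f = i_V\, df = V^i\, \partial_i f .

% One-form p = p_j dx^j, with dp = \partial_i p_j \, dx^i \wedge dx^j
% and i_V (dx^i \wedge dx^j) = V^i dx^j - V^j dx^i :
i_V\, dp   = V^i \partial_i p_j \, dx^j - V^j \partial_i p_j \, dx^i ,
d(i_V p)   = d(V^j p_j) = p_j\, \partial_i V^j \, dx^i + V^j \partial_i p_j \, dx^i ,
\mathcal{L}_V p = i_V\, dp + d(i_V p)
               = \left( V^i \partial_i p_j + p_i\, \partial_j V^i \right) dx^j .
```

The last line is the familiar component expression for the Lie derivative of a covector field, which is what one expects (2.11) to reproduce.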

3. Connection and Curvature Forms

As our aim in dealing with differential forms is to use them in the study of the theory
of gravitation, we need to consider the objects on a four-dimensional spacetime
manifold M which we assume to be $C^\infty$, connected, Hausdorff, and oriented,
having a Lorentz metric g defined on it. Since at each point of the manifold we can
define a set of one-forms in the cotangent space of M, there exists a field of
coframes $\theta^i$ which are linearly independent at each point of M and which, in
general, are anholonomic. If $x^i$ denotes a set of local coordinates, then the $\theta^i$ are
determined by functions $\lambda^i{}_k$,
\[ \theta^i = \lambda^i{}_k\, dx^k, \tag{3.1} \]
which are often called tetrads or vierbeins.


As M is Hausdorff and possesses a Lorentz metric, it is paracompact (due to
a theorem of Geroch) and, thus, there exists a connection $\omega$. Now the metric g
and the connection $\omega$ are described, with respect to the chosen frame $\theta^i$, by
\[ g = g_{ij}\, \theta^i \theta^j, \qquad \omega^i{}_j = \Gamma^i{}_{kj}\, \theta^k, \tag{3.2} \]
wherein $g_{ij}$ and $\Gamma^i{}_{kj}$ are functions on M, denoting the components of the metric
and the connection.
If $A^k{}_l$ denotes a mapping $M \to GL(n, R)$, the general linear group, then a change
of frame $\theta^i \to \bar\theta^i$: $\theta^k = A^k{}_l\, \bar\theta^l$, induces changes in the components of the metric
tensor $g_{ij}$ and the connection one-forms $\omega^i{}_j$ as given by
\[ \bar g_{lm} = g_{ij}\, A^i{}_l A^j{}_m. \tag{3.3} \]
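For the connection one-forms, the corresponding change can be sketched as follows. This is the standard result, written here in matrix notation under the assumption that the torsion two-form of (3.6) transforms like the frame itself; it is offered as a consistency sketch rather than as the author's own expression.

```latex
% Frames related by \theta = A \bar\theta (indices suppressed).
d\theta + \omega \wedge \theta
  = dA \wedge \bar\theta + A\, d\bar\theta + \omega A \wedge \bar\theta
  = A \left[ d\bar\theta + \left( A^{-1} \omega A + A^{-1} dA \right) \wedge \bar\theta \right] .

% Hence \Theta = A \bar\Theta holds provided
\bar\omega^k{}_l = (A^{-1})^k{}_m\, \omega^m{}_n\, A^n{}_l + (A^{-1})^k{}_m\, dA^m{}_l .
```

The inhomogeneous term $A^{-1} dA$ is what distinguishes a connection from a tensor-valued one-form, and it is the same term that reappears as $d\alpha$ in the infinitesimal law $\delta\omega^i{}_j = D\alpha^i{}_j$ of (4.7).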

If $\pi$ is a homomorphism of Lie groups, $GL(n, R) \to GL(N, R)$, then for any
$a \in GL(n, R)$, $\pi(a)$ is a nonsingular $N \times N$ matrix with elements denoted by $\pi^A_B(a)$,
$A, B = 1, \ldots, N$. The derived homomorphism of the corresponding Lie algebras
$\pi': L(R^n) \to L(R^N)$ may be represented by the matrix
\[ \pi'^{\,k}{}_l = \big( \pi^{A\,k}_{B\,l} \big) = \left. \frac{\partial \pi^A_B(a)}{\partial a^l{}_k} \right|_{a^m{}_n = \delta^m_n}. \tag{3.4} \]

An m-form of type $\pi$ on M may be defined as the set $\phi = (\phi^A)$ of N fields of
m-forms associated with each field of frames $\theta^i$, such that with $\bar\theta^k$ the associated
field of forms $\bar\phi^A$ is given by
\[ \phi = (\pi \circ A)\, \bar\phi, \]
$(\pi \circ A)$ being the composition map $M \to GL(n, R) \to GL(N, R)$.


The Covariant Exterior Derivative D of an m-form $\psi^A$ of type $\pi$, with respect to
the connection one-forms $\omega^i{}_j$, is an (m + 1)-form of type $\pi$ defined as
(3.5)
For a scalar-valued form $\phi$, $D\phi = d\phi$, and for a tensor field, $D\phi_A = \theta^k \nabla_k \phi_A$, the
usual covariant derivative.
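For orientation, the simplest instances of the covariant exterior derivative can be written out explicitly. The forms below are a sketch of the standard expressions for vector-, covector- and mixed-tensor-valued forms; they are chosen to agree with the way D is used later in (3.6), (3.12) and (4.6), and are not a reconstruction of the general definition (3.5).

```latex
% Vector-valued m-form v^i (this is the case used for the torsion, (3.6)):
D v^i = d v^i + \omega^i{}_j \wedge v^j ,

% Covector-valued m-form u_i:
D u_i = d u_i - \omega^j{}_i \wedge u_j ,

% Mixed tensor-valued m-form T^i{}_j (the case used for D\Omega and D\alpha):
D T^i{}_j = d T^i{}_j + \omega^i{}_k \wedge T^k{}_j - \omega^k{}_j \wedge T^i{}_k .
```

Each upper index contributes a term with $+\omega$ and each lower index a term with $-\omega$, which is the form-language counterpart of the usual rule for covariant derivatives of tensors.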


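With this we can now define the torsion and curvature two-forms given by
\[ \Theta^i \equiv D\theta^i = d\theta^i + \omega^i{}_j \wedge \theta^j \qquad \text{(torsion form)}, \tag{3.6} \]
\[ \Omega^k{}_l = d\omega^k{}_l + \omega^k{}_m \wedge \omega^m{}_l \qquad \text{(curvature form)}, \tag{3.7} \]
which, in terms of their respective tensor-valued zero-forms $Q^i{}_{jk}$ and $R^k{}_{lmn}$, are
written as
(3.8)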
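As the connection one-form $\omega^i{}_j = \Gamma^i{}_{kj}\, \theta^k$, one can easily verify that the torsion
tensor $Q^i{}_{jk}$ and the curvature tensor $R^k{}_{lmn}$ are given in terms of the connection
coefficients $\Gamma^i{}_{jk}$ as
\[ Q^i{}_{jk} = \Gamma^i{}_{jk} - \Gamma^i{}_{kj} + C^i{}_{jk}, \tag{3.9} \]
\[ R^k{}_{lmn} = \Gamma^k{}_{nl,m} - \Gamma^k{}_{ml,n} + \Gamma^k{}_{mp} \Gamma^p{}_{nl} - \Gamma^k{}_{np} \Gamma^p{}_{ml} - C^p{}_{mn} \Gamma^k{}_{pl}. \tag{3.10} \]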

In the above, $C^i{}_{jk}$, called the 'object of anholonomity', is a function that depends
upon the tetrad functions $\lambda^i{}_k$ and their derivatives. However, because Q and R are
both tensors, we can always choose C to be zero without loss of generality.
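Now considering the exterior derivatives of $\Theta^i$ and $\Omega^k{}_l$, we get
(3.11)
and
\[ d\Omega^k{}_l = d\omega^k{}_m \wedge \omega^m{}_l - \omega^k{}_m \wedge d\omega^m{}_l, \]
\[ d\Omega^k{}_l + \omega^k{}_m \wedge \Omega^m{}_l - \omega^m{}_l \wedge \Omega^k{}_m = 0, \]
i.e.
\[ D\Omega^k{}_l = 0, \tag{3.12} \]
which are, indeed, the Bianchi identities.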
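The identity for the torsion form follows in the same way from (3.6) and (3.7). The short derivation below is a sketch of the standard first Bianchi identity, using only $d^2 = 0$ and the definitions of $\Theta^i$ and $\Omega^k{}_l$; it is given for orientation and is not claimed to be the exact form in which (3.11) was printed.

```latex
% Start from \Theta^i = d\theta^i + \omega^i{}_j \wedge \theta^j and apply d:
d\Theta^i = d\omega^i{}_j \wedge \theta^j - \omega^i{}_j \wedge d\theta^j .

% Substitute d\theta^j = \Theta^j - \omega^j{}_k \wedge \theta^k :
d\Theta^i = \left( d\omega^i{}_k + \omega^i{}_j \wedge \omega^j{}_k \right) \wedge \theta^k
            - \omega^i{}_j \wedge \Theta^j
          = \Omega^i{}_k \wedge \theta^k - \omega^i{}_j \wedge \Theta^j .

% Therefore
D\Theta^i \equiv d\Theta^i + \omega^i{}_j \wedge \Theta^j = \Omega^i{}_j \wedge \theta^j .
```

For vanishing torsion this reduces to $\Omega^i{}_j \wedge \theta^j = 0$, the cyclic identity of the Riemann tensor.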
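In relation to the four-dimensional manifold, we know that there exists
a completely antisymmetric tensor $\eta_{ijkl}$ of rank four such that $\eta_{1234} = |\det g_{ij}|^{1/2}$,
which is a zero-form. This zero-form, along with the forms
\[ \eta_{ijk} = \eta_{ijkl}\, \theta^l, \qquad \eta_{ij} = \tfrac{1}{2}\, \eta_{ijk} \wedge \theta^k, \qquad \eta_i = \tfrac{1}{3}\, \eta_{ij} \wedge \theta^j, \qquad \eta = \tfrac{1}{4}\, \eta_i \wedge \theta^i, \tag{3.13} \]
spans the Grassmann algebra of M and, further, we have
\[ \begin{aligned}
\theta^m\, \eta_{ijkl} &= \delta^m_l \eta_{ijk} - \delta^m_k \eta_{lij} + \delta^m_j \eta_{kli} - \delta^m_i \eta_{jkl}, \\
\theta^l \wedge \eta_{ijk} &= \delta^l_k \eta_{ij} + \delta^l_j \eta_{ki} + \delta^l_i \eta_{jk}, \\
\theta^k \wedge \eta_{ij} &= \delta^k_j \eta_i - \delta^k_i \eta_j, \\
\theta^i \wedge \eta_j &= \delta^i_j\, \eta.
\end{aligned} \tag{3.14} \]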

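Using the metric tensor $g_{ij}$, we can raise and lower the indices of $\eta_{i \ldots l}$.
If the frames $\theta^i$ and $\bar\theta^i$ differ infinitesimally, such that $\bar\theta^i = \theta^i + \delta\theta^i$ with
$\delta\theta^i = -\alpha^i{}_m \theta^m$, where $\alpha^i{}_m: M \to \mathcal{L}(R^4)$, $\mathcal{L}(R^4)$ being the associated Lie algebra, then
the corresponding changes induced on $g_{ij}$ and $\omega^k{}_l$ are given by
\[ \delta g_{ij} = \alpha_{ij} + \alpha_{ji}, \qquad \delta\omega^k{}_l = D\alpha^k{}_l, \]
D being the covariant exterior derivative.
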
4. Einstein-Cartan Theory - The Gauge Theory of Gravity

The special theory of relativity, which has the Poincaré group as its global
symmetry group, is the most fundamental of all physical theories. In fact, the Lie
algebra of the Poincaré group has two basic invariants, physically associated with
mass and spin, the two fundamental quantum numbers of elementary
particle phenomena. If we now gauge this group and make it the local symmetry
group, it is only natural to look for the corresponding set of invariants in the more
general theory.
We start with a four-dimensional manifold endowed with all the usual
properties and consider the field of frames $\theta^i$. Supposing that we start with a flat
manifold, then an infinitesimal change of frames $\theta^i \to \bar\theta^i$, as defined earlier, gives
us the relation
(4.1)
with $\alpha_{ik} + \alpha_{ki} = 0$, where the $\alpha^i{}_k$ are constants. This essentially represents the usual
Lorentz transformation and, under this, the Cartan derivative also yields a similar
transformation
(4.2)
and we can recover all the results of special relativity. Let us suppose that, in
a manner similar to gauge theories, we now say that the $\alpha_{ik}$'s are functions of the local
coordinates $x^i$, meaning that the corresponding Lie group is now a local
symmetry group. It is obvious that $\delta\, d\theta^i$ no longer preserves its transformation
properties and, instead, we will have
(4.3)
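The three steps just described can be made explicit with the infinitesimal change of frame $\delta\theta^i = -\alpha^i{}_m \theta^m$ introduced at the end of Section 3. The lines below are a sketch of what that definition implies, assuming only that $\delta$ commutes with d; the labels in the comments indicate which numbered relation each line is meant to illustrate.

```latex
% Change of frame, cf. (4.1):
\delta\theta^i = -\alpha^i{}_k\, \theta^k , \qquad \alpha_{ik} + \alpha_{ki} = 0 .

% Constant \alpha (global symmetry), cf. (4.2): d\theta^i transforms like \theta^i.
\delta(d\theta^i) = d(\delta\theta^i) = -\alpha^i{}_k\, d\theta^k .

% Position-dependent \alpha (local symmetry), cf. (4.3): an extra term appears.
\delta(d\theta^i) = -\,d\alpha^i{}_k \wedge \theta^k - \alpha^i{}_k\, d\theta^k .
```

The extra term $-d\alpha^i{}_k \wedge \theta^k$ is what the connection one-forms are introduced to absorb, exactly as a gauge potential absorbs the derivative of a local phase.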

Further, under these circumstances $\alpha_{ik} + \alpha_{ki} = \delta g_{ik}$, which is not necessarily zero
and, in fact, $g_{ik}$ could now be functions of ($x^i$). This calls for the introduction of
extra fields, which is done through the definition of a covariant exterior derivative
having connection one-forms $\omega^i{}_j$. Using the definition of D, we have
\[ D\theta^i = d\theta^i + \omega^i{}_j \wedge \theta^j \]
and so
\[ \delta D\theta^i = \delta\, d\theta^i + \delta\omega^i{}_j \wedge \theta^j + \omega^i{}_j \wedge \delta\theta^j. \tag{4.4} \]

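Requiring that
\[ \delta(D\theta^i) = -\alpha^i{}_k\, (D\theta^k), \tag{4.5} \]
we get the condition that
\[ \delta\omega^i{}_j \wedge \theta^j = \left( d\alpha^i{}_j + \omega^i{}_k\, \alpha^k{}_j - \omega^k{}_j\, \alpha^i{}_k \right) \wedge \theta^j. \tag{4.6} \]
As this must be true for all $\theta^j$, we have the condition
\[ \delta\omega^i{}_j = D\alpha^i{}_j, \tag{4.7} \]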

which is the same as the one we obtained earlier for the general infinitesimal
transformations of $\theta$. Hence, we now have the result that on a general differential
manifold, if the field of frames is to be invariant under transformations from
point to point, then we need to introduce the connection form, to be determined
by 64 functions $\Gamma^i{}_{kj}$ ($\omega^i{}_j = \Gamma^i{}_{kj}\, \theta^k$). These may be considered as the gauge fields. It
is clear that, so far, no particular relation between the metric g and the connection
$\omega^i{}_j$ is assumed.
At this stage we have 90 functions (10 $g_{ij}$, 16 $\lambda^i{}_k$, 64 $\Gamma^i{}_{kj}$) to be determined as
self-consistently representing the physical field on the spacetime manifold,
namely, the free gravitational field, without any matter fields. Again, in the same
spirit as that of gauge theories, we look for a Lagrangian constructed from the
fields to be determined and their derivatives, viz., g, $\theta$, $\omega^i{}_j$ and $d\omega^i{}_j$. Amongst the
existing set of forms, the only object that fills the bill is the Ricci four-form
(4.8)
R being a scalar-valued zero-form and $\eta$ the volume four-form. Writing explicitly,
we have
\[ K = \tfrac{1}{4} \left[ g^{lm}\, \eta_{kmpq}\, \theta^q \wedge \theta^p \wedge \left( d\omega^k{}_l + \omega^k{}_p \wedge \omega^p{}_l \right) \right] \tag{4.9} \]
and under the change of frames $\theta^l = A^l{}_m \bar\theta^m$, as $g_{lm}$ and $\omega^k{}_l$ have corresponding
transformations, it can be easily verified that K remains invariant and, thus, is
globally defined.
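If we now vary the frame, the metric, and the connection independently, we get
\[ \begin{aligned}
\delta K = \Big\{ & \tfrac{1}{4} \left( -g^{lr} g^{ms}\, \delta g_{rs}\, \eta_{kmpq} + \tfrac{1}{2}\, \eta_k{}^l{}_{pq}\, g^{rs} \delta g_{rs} \right) \theta^q \wedge \theta^p \wedge \Omega^k{}_l \\
& + \tfrac{1}{4}\, \eta_k{}^l{}_{pq} \left( \delta\theta^q \wedge \theta^p + \theta^q \wedge \delta\theta^p \right) \wedge \Omega^k{}_l \\
& + \tfrac{1}{2} \left[ d(\eta_k{}^l \wedge \delta\omega^k{}_l) - d\eta_k{}^l \wedge \delta\omega^k{}_l - \eta_k{}^l \wedge \omega^m{}_l \wedge \delta\omega^k{}_m + \eta_k{}^l \wedge \omega^k{}_m \wedge \delta\omega^m{}_l \right] \Big\},
\end{aligned} \tag{4.10} \]
which, on simplification, yields
(4.11)
with
\[ E^{rs} = \tfrac{1}{2} \left( g^{rs} \eta_k{}^l - g^{rl} \eta_k{}^s - g^{sl} \eta_k{}^r \right) \wedge \Omega^k{}_l, \qquad e_p = \tfrac{1}{2}\, \eta_{pk}{}^l \wedge \Omega^k{}_l, \qquad C_k{}^l = D\eta_k{}^l. \tag{4.12} \]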

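If the variation in K were to be induced only by a mere change of frames ($\theta \to \bar\theta$),
keeping the metric g and the connection $\omega^i{}_j$ fixed, then according to the induced
changes in $g_{ij}$ and $\omega^i{}_j$, we get
\[ \begin{aligned}
\delta K &= E^{rs} \alpha_{rs} - e_p \wedge \alpha^p{}_l\, \theta^l - \tfrac{1}{2}\, C_k{}^l \wedge D\alpha^k{}_l + \tfrac{1}{2}\, d(\eta_k{}^l \wedge D\alpha^k{}_l) \\
&= \left( E_r{}^s - e_r \wedge \theta^s - \tfrac{1}{2}\, DC_r{}^s \right) \alpha^r{}_s.
\end{aligned} \tag{4.13} \]
As $\alpha^r{}_s$ is arbitrary, $\delta K = 0$ leads to the identity
\[ E_r{}^s = e_r \wedge \theta^s + \tfrac{1}{2}\, DC_r{}^s. \tag{4.14} \]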

Since the basic field quantities to be determined are the metric and the
connection, the field equations may be obtained by applying the variational
principle that the action is stationary with respect to small variations of the
dynamical variables. As the metric $g = g_{ij}\, \theta^i \theta^j$, a variation in g can be brought
about either by fixing $g_{ij}$ and varying $\theta^i$ or vice versa.
Thus, considering the action integral $\delta \int K = 0$ for infinitesimal variations in $g_{ij}$
and $\omega^i{}_j$ keeping $\theta^i$ fixed, we get the set of equations
\[ E^{ij} = 0 \quad \text{and} \quad C_k{}^l = 0, \tag{4.15} \]
and, similarly, if we vary $\theta^i$ and $\omega^i{}_j$ keeping $g_{ij}$ fixed, we get the set of equations
\[ e_i = 0 \quad \text{and} \quad C_k{}^l = 0. \tag{4.16} \]
In view of the identity (4.14), these two sets of equations are equivalent and, thus,
all the information has to come out of either one of them.
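Writing explicitly, we get the equations
\[ R_i{}^j - \tfrac{1}{2} R\, \delta_i^j = 0, \tag{4.17} \]
\[ 2\, Dg^{jk} \wedge \eta_{ik} + \eta_i{}^j \wedge g^{pq} Dg_{pq} + 2\, \Theta^l \wedge \eta_i{}^j{}_l = 0. \tag{4.18} \]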

So far, we have kept all objects as general as possible. Since our intention is to
look for physical fields, we now make a choice of gauge in restricting our field of
frames to only orthonormal frames, which satisfy the relation
\[ g_{ij} = \eta_{ij}, \tag{4.19} \]
which means that we can write our metric as
\[ g = g_{kl}(\lambda)\, dx^k dx^l = \eta_{ij}\, \lambda^i{}_k \lambda^j{}_l\, dx^k dx^l. \tag{4.20} \]

This means that we either have 10 functions $g_{kl}$ or 16 functions $\lambda^i{}_k$ to be
determined to get the information regarding the metric. Equations (4.18) tell us
that if in the manifold we had a priori chosen $\Theta^i = 0$, that is, the torsion is zero
($\Gamma^i{}_{jk} = \Gamma^i{}_{kj}$), then we get $Dg_{ij} = 0$, meaning that the metric is a covariant constant,
which explicitly gives
(4.21)
or
\[ \Gamma^i{}_{jk} = \tfrac{1}{2}\, g^{im} \left( g_{jm,k} + g_{km,j} - g_{jk,m} \right). \tag{4.22} \]
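Formula (4.22) is easy to exercise numerically. The following Python sketch, not part of the original lectures, evaluates it by central differences for the unit two-sphere in a coordinate frame; the choice of metric, the step size `h` and the function names are assumptions made only for illustration.

```python
import math

def metric(x):
    """Metric g_ij of the unit two-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def d_metric(x, k, h=1e-5):
    """Central-difference estimate of the derivative g_ij,k at the point x."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)] for i in range(n)]

def christoffel(x):
    """Gamma[i][j][k] = 1/2 g^{im} (g_{jm,k} + g_{km,j} - g_{jk,m}), as in (4.22)."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [d_metric(x, k) for k in range(n)]   # dg[k][i][j] approximates g_ij,k
    return [[[0.5 * sum(g_inv[i][m] * (dg[k][j][m] + dg[j][k][m] - dg[m][j][k])
                        for m in range(n))
              for k in range(n)]
             for j in range(n)]
            for i in range(n)]

G = christoffel((1.0, 0.3))
print(G[0][1][1], -math.sin(1.0) * math.cos(1.0))   # Gamma^theta_{phi phi}
print(G[1][0][1], math.cos(1.0) / math.sin(1.0))    # Gamma^phi_{theta phi}
```

The two printed pairs should agree to within the finite-difference error, and the symmetry of the result in its lower indices reflects the vanishing torsion assumed in deriving (4.22).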

Thus, we have the gauge fields $\Gamma$ uniquely determined in terms of the metric
components $g_{ij}$. The other set of equations yields the familiar Einstein equations
$R_{ij} = 0$, which exactly determine the metric. Thus, if the torsion is zero, we have
the familiar Einstein theory of gravity on the Riemannian manifold with a
Levi-Civita connection. On the other hand, if we had a priori chosen that the
connection $\omega^i{}_j$ is a metric connection, then $Dg_{ij} = 0$ would lead to $\Theta^i = 0$ from
Equations (4.18), thus implying that the torsion is zero, and the gauge fields $\Gamma$ are
determined from $g_{ij}$ which, in turn, are determined by Einstein's equations. Thus,
for the free gravitational field, we have the metric and connection uniquely
determined with vanishing torsion, which is exactly Einstein's general relativity.
If instead of the variation of $g_{ij}$, we had considered a variation of $\theta^i$ with $g_{ij} = \eta_{ij}$,
then we would have to determine 16 components of $\lambda^i{}_k$ with only 10 equations,
which means that the tetrads are uniquely determined up to Lorentz rotations
(six degrees of freedom) which, again, is the familiar general relativity.
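The counting "16 tetrad components, 10 metric components, 6 Lorentz rotations" can be illustrated with a few lines of Python. The sketch below is not from the original text: it builds the metric from a tetrad as in (4.20) and checks that a local Lorentz rotation of the frame index leaves the metric unchanged. The signature of `ETA`, the sample `tetrad` and the particular `boost` are assumptions made for the example; exact rational arithmetic is used so that the comparison is exact.

```python
from fractions import Fraction as F

# Minkowski metric eta_ij; the signature (-, +, +, +) is an assumption of this sketch.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def metric_from_tetrad(lam):
    """g_kl = eta_ij lam[i][k] lam[j][l], as in (4.20)."""
    return [[sum(ETA[i][j] * lam[i][k] * lam[j][l]
                 for i in range(4) for j in range(4))
             for l in range(4)]
            for k in range(4)]

def lorentz_rotate(Lam, lam):
    """Rotate the frame index: lam'[i][k] = Lam[i][j] lam[j][k]."""
    return [[sum(Lam[i][j] * lam[j][k] for j in range(4)) for k in range(4)]
            for i in range(4)]

# An arbitrary (hypothetical) tetrad with exact rational entries.
tetrad = [[F(2), F(1, 2), F(0), F(0)],
          [F(0), F(3), F(1), F(0)],
          [F(0), F(0), F(1, 3), F(0)],
          [F(1), F(0), F(0), F(5)]]

# A boost along x with velocity 3/5: gamma = 5/4 and gamma * beta = 3/4.
boost = [[F(5, 4), F(-3, 4), 0, 0],
         [F(-3, 4), F(5, 4), 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]

g = metric_from_tetrad(tetrad)
g_rotated = metric_from_tetrad(lorentz_rotate(boost, tetrad))
print(g == g_rotated)   # the metric does not see the local Lorentz rotation
```

Since the metric is symmetric it carries 4·5/2 = 10 independent components, while the tetrad carries 16; the difference of 6 matches the three boosts and three rotations that, like the boost used here, drop out of $g_{kl}$.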
What happens within the matter distribution is the more interesting part of this
development. It is obvious that if we include the matter Lagrangian, then the
variation with respect to the connection would yield a nonzero quantity on the
right-hand side of Equation (4.18) and, thus, even when $Dg_{ij} = 0$, we have $\Theta^i \neq 0$:
the torsion does not vanish.

5. Gravitation in the Presence of Fermionic Matter

As is well known, mass and spin (the two fundamental entities of elementary
particle phenomena) are respectively associated with the groups of translations
and rotations in special relativity. However, in Einstein's general relativity, mass
expressed as the energy-momentum of matter has a dynamical role and
determines the curvature of spacetime, whereas spin has no such influence on
spacetime. As the symmetric energy-momentum tensor is directly related to the
variations in the matter Lagrangian induced by the infinitesimal variations of the
metric tensor $g_{ij}$, $T_{ij} = 2\, \delta\mathcal{L}_M / \delta g^{ij}$, it is clear that the only conserved quantity
with dynamical properties is the one that corresponds to the group of
translations. On the other hand, in the above 'gauge approach', one needs to
consider conserved quantities under both translations and rotations and, thus,
look for a positive role for spin in the description of spacetime dynamics.
While discussing the free fields, we saw that the dynamical variables arise both
from the metric and the connection and, a priori, there is no relation between
these two. Hence, while introducing the matter Lagrangian $\mathcal{L}_M$, a function of
spinorial fields $\psi_A$ and their covariant derivative $D\psi_A$, we need to distinguish
separately the variation of $\mathcal{L}_M$ with respect to the metric g and the connection $\omega^i{}_j$.
We shall first consider the meaning of the variation in the connection as follows.
As we are discussing spinor fields, it is necessary that we restrict ourselves to
orthonormal frames $\theta^i$, for which one can define parallel propagation. If at a point
$P \in M$, we consider the frame $\theta^i$ and propagate it parallelly to the neighbouring
point P' (which can be done through the use of D) and compare it with the natural
frame $\bar\theta^i$ at P', the difference between the two is given by
(5.1)
If, further, we have the connection to be a metric linear connection, then
(5.2)

which implies, for (5.1), a rotation of frame. Hence, we find that a parallel
propagation of a frame to a neighbouring point induces a rotation in the frame
through the connection. Thus, an infinitesimal variation in the connection also induces
a rotation of the orthonormal frames if the connection is metric. In the usual
Einstein theory, since we can always introduce geodesic normal coordinates
for which we can choose $\Gamma$ to be zero, by a proper choice of frames we can make
this effect vanish. But in the present theory, the torsion $\Theta^i$ associated with
the antisymmetric part of the connection, being a tensor, cannot arbitrarily be
made to vanish and, thus, the rotation induced on frames cannot be removed.
Thus, one has to associate this feature of variation of the connection with some
intrinsic property of the system, and define a new entity conserved under
rotations. From the analogy of special relativity, if one takes the total angular
momentum as being associated with the rotational invariance of the action
Lagrangian $\mathcal{L}_M$, then the only new dynamical entity that arises is spin, as the
orbital part is expressible as a cross-product of the energy-momentum tensor and
the radius vector ($T^i{}_k x^j - T^j{}_k x^i$). With this, one can define the 'spin density' as
given by the three-form:
(5.3)

As the variations in the metric may also be introduced through $\theta^i$, keeping $g_{ij}$
fixed, one can have the canonical energy-momentum vector-valued three-form
$t_i = \delta\mathcal{L}_M / \delta\theta^i$. If we now consider the combined action $\mathcal{L} = K + \kappa \mathcal{L}_M$ and use
the variational principle $\delta\mathcal{L} = 0$, we get, for variations in $g_{ij}$, $\omega^i{}_j$, and $\psi_A$ with $\theta^i$
fixed, the set of equations
\[ \ldots, \qquad C_k{}^l = -\kappa S_k{}^l, \qquad \ldots \tag{5.4} \]
or, equivalently, for variations in $\theta^i$, $\omega^i{}_j$ and $\psi_A$ with $g_{ij}$ fixed, the equations
\[ \ldots, \qquad C_k{}^l = -\kappa S_k{}^l, \qquad \frac{\delta\mathcal{L}_M}{\delta\psi_A} = 0. \tag{5.5} \]

Further, using (5.4) and (5.5) in (4.14), we get the identity
(5.6)
Thus, we have the complete set of equations written in terms of tensor-valued
zero-forms,
\[ R_i{}^j - \tfrac{1}{2} R\, \delta_i^j = -\kappa\, t_i{}^j, \tag{5.7} \]
and
\[ Q^i{}_{jk} - \delta^i_j Q^l{}_{lk} - \delta^i_k Q^l{}_{jl} = -\kappa\, S^i{}_{jk}, \tag{5.8} \]
with the tensors $t_i{}^j$ and $S^i{}_{jk}$ being defined through the respective three-forms
(5.9)

Equations (5.7) are the well-known Einstein equations, whereas (5.8) are the
Cartan equations. The set (5.7) and (5.8) form the equations of Einstein-Cartan
theory, the natural generalization of general relativity that includes spinorial
matter. Analogous to general relativity, one can get the conservation laws by
using Bianchi identities with the above field equations.
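Considering the covariant derivative of the three-forms $e_i$ and $C_{kl}$, after using
(3.11) and (3.12) we have
\[ De_i = \tfrac{1}{2}\, D(\eta_{ijk} \wedge \Omega^{jk}) = \tfrac{1}{2}\, \eta_{ijkl}\, \Theta^l \wedge \Omega^{jk}, \tag{5.10} \]
\[ DC_{kl} = -\eta_{km} \wedge \Omega^m{}_l + \eta_{lm} \wedge \Omega^m{}_k = e_l \wedge \theta_k - e_k \wedge \theta_l, \tag{5.11} \]
which, on using (5.5) and after simplification, yield
\[ Dt_j = \eta \left[ -Q^k{}_{jm}\, t^m{}_k + \tfrac{1}{2}\, S^k{}_{lm} R^{lm}{}_{jk} \right], \tag{5.12} \]
\[ DS_{kl} = t_l \wedge \theta_k - t_k \wedge \theta_l, \tag{5.13} \]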

the conservation laws for energy-momentum and spin, which give the usual
conservation of canonical energy-momentum, $Dt_j = 0$, in the absence of spin.
It is indeed clear from the above that if there is no spin associated with the
matter, then the spin density $S^i{}_{jk} = 0$, so that the field equations yield vanishing
torsion, $Q^i{}_{jk} = 0$, and the theory reduces to the usual Einsteinian relativity.


Thus, the Einstein-Cartan theory provides a natural framework for considering the geometrization of mass and spin with their respective roles in influencing
the curvature and torsion of the manifold. As it associates conserved quantities
under both translation and rotation, it seems to be the correct generalization of
special relativity for all observers. As torsion does not propagate outside the
material distribution, the tests of general relativity which have all been for the
behaviour of spacetime outside the material distribution, are perfectly valid.
Acknowledgement
The material presented in these lectures depends heavily on the work of A.
Trautman, the references for which are listed below, along with a few other
references considered to be useful.
References
1. H. Flanders, Differential Forms, Academic Press, New York (1963).
2. S. W. Hawking and G. F. R. Ellis, Large Scale Structure of Space Time, CUP, Cambridge (1972),
Chapt. 2.1-2.3.
3. A. Trautman, On the Structure of Einstein-Cartan Equations, Symposia Mathematica, Vol. 12,
(Bologna) (1973).
4. A. Trautman, Theory of gravitation, in J. Mehra (ed.), The Physicists Concepts of Nature (1973).
5. F. W. Hehl et al., General relativity with spin and torsion, Rev. Mod. Phys. 48 (1976).
6. A. R. Prasanna, Proc. Int. Symp. Relativity and Unified Field Theory, S. N. Bose Inst., Calcutta
(1976), p. 149.
7. A. R. Prasanna, Phys. Letts. 54A, 17 (1975).
8. R. de Ritis et al., Phys. Letts. 98A, 411 (1983).
9. N. Mukunda, Gravitation as a gauge theory, in A. R. Prasanna, J. V. Narlikar, and C. V.
Vishveshwara (eds.), Gravitation and Relativistic Astrophysics, World Scientific (1984).

Part II:
Introduction to Particle Physics
and Gauge Field Theories
This part of the book contains material put together specifically keeping in mind
the needs of those whose background is basically in general relativity and
cosmology. It seeks to present, in a compact form, the theoretical framework and
methods of calculation, and a description of the factual situation in elementary
particle physics.
R. P. Saxena (Chapter 9) introduces the basic principles of relativistic
Lagrangian field theory, first in the classical context and later in the quantized
form. He discusses various free fields, their quantization, Lorentz invariance and
the important discrete symmetries. Going on to interacting quantum fields, the
invariant perturbation theory and Feynman graphs are succinctly discussed.
Renormalizability and renormalization methods are covered, with emphasis on
the method of dimensional regularization.
The chapter by J. Pasupathy (Chapter 10) gives a description of the
phenomenology of particle physics, more or less in a historical sequence. The
various interactions, their strengths and symmetries, associated selection rules,
and details of the particle spectrum are discussed, with frequent presentation of
orders of magnitude of physical quantities. Isotopic spin, strangeness, baryon and
lepton numbers, and the discrete symmetries C, P, T and their combinations are
explained. The presentation includes $\gamma_5$ invariance of the V-A form of weak
interactions; the quark-gluon picture for strong interactions based on SU(3)c; and
the gauge principle for electrodynamics and for non-Abelian theory. The
Glashow-Weinberg-Salam model is briefly sketched.
The next chapter, by G. Rajasekaran (Chapter 11), carefully builds up, step by
step, the standard gauge model of particle physics based on the group
SU(3)c x SU(2) x U(1). It is expressly written for those without prior exposure to
these ideas. Spontaneous symmetry breaking via the Nambu-Goldstone mode,
and then via the Higgs mode for gauge theories, are presented via examples, first
for the Abelian U(1) and then for the non-Abelian SU(2) case. The physically
interesting SU(2) x U(1) model is then taken up. The emergence of massive
vector bosons is demonstrated. After this preparation, the 'standard model' of the
late 60's prior to the gauge theory revolution, based on the V-A current-current
weak interactions, minimal electromagnetism, and an unspecified strong
interaction, all in quark-lepton language, is set up. It is then compared to the
standard gauge model of SU(3) x SU(2) x U(1). The compelling reasons for
QCD as the theory of strong interactions are spelt out. An introduction to
renormalization group methods as the main calculational tool for QCD,
asymptotic freedom, infrared problems, and physically motivated reasons for
going beyond the standard model are presented.

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 133-134.
1989 by Kluwer Academic Publishers.

In Chapter 12, K. C. Wali begins more or less at the point reached by the
previous chapter and gives a pedagogical introduction to Grand Unified
Theories (GUT). After discussing the general features to be expected in any such
theory, as well as the motivations for them, a detailed presentation of SU(5)
theory is given. The group structures, particle multiplets, gauge and Higgs bosons
are well explained. The two stages of spontaneous symmetry breaking via the
Higgs model are calculated individually and in combination. Fermion mass
matrices and relations between quark and lepton masses are derived. Predictions
of SU(5) theory, calculated using renormalization group methods, are derived.
The chapter ends with discussions that bring together particle physics and
cosmology, including the baryon asymmetry problem, phase transitions in the
very early universe, and singularities like domain walls, vortex lines, and
monopoles.
All in all, the material included should give a good idea of the current scene in
particle physics and particle theory, and the ways in which it merges with
cosmology in the understanding of the early universe.
Two brief mathematical supplements complete this part: on topology and
homotopy by B. R. Sitaram (Chapter 13), and on compact Lie groups and their
representations, by N. Mukunda (Chapter 14). These chapters are not intended to
be exhaustive, but just to indicate the main ideas in these areas.

9. Introduction to Classical and Quantum Lagrangian Field Theory

R. P. SAXENA
Department of Physics and Astrophysics, Delhi University, New Delhi 110 007, India

1. Classical Lagrangian Field Theory

Classical field theory may be regarded as a generalization of Lagrangian
mechanics in the sense that generalized coordinates which are functions of
a parameter (time) get replaced by fields which are functions of local parameters
in a four-dimensional continuum, viz. spacetime coordinates. These local
functions or local fields, being the generalized coordinates of the classical field
theory, satisfy Euler-Lagrange equations of motion which are called field
equations. The field equations result from an appropriate action principle, just as
in classical mechanics.
The starting point in classical field theory is an action functional which is an
integral over a suitable four-dimensional volume of a certain Lagrangian density
$\mathcal{L}$. The Lagrangian density $\mathcal{L}$ must be
(i) a function of all the local fields (say $\phi, \psi, \ldots, A_\mu, \ldots$) and their spacetime
derivatives ($\partial_\mu \phi, \partial_\mu \psi, \partial_\nu A_\mu$),
(ii) invariant under orthochronous Lorentz transformations,
(iii) a scalar or a pseudoscalar under spatial reflections, and
(iv) invariant under time reversal.
In addition, it may satisfy other internal symmetry requirements. For a single real
scalar field, one may write the Lagrangian density as $\mathcal{L}(\phi(x), \partial_\mu \phi(x))$ and the
action integral as
\[ I = \int_\Omega d^4x\; \mathcal{L}(\phi(x), \partial_\mu \phi(x)). \tag{1} \]

Here $\Omega$ is a four-dimensional volume with temporal extremities $t_2$ and $t_1$,
respectively.

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 135-162.
1989 by Kluwer Academic Publishers.

1.1. Action Principle

The action functional I must be an extremum for arbitrary variations $\delta\phi$ in the fields
such that
(2)

Then

(3)
after partially integrating. Here $\Sigma$ is the three-dimensional hypersurface of the
volume $\Omega$ with temporal extremities $t_2$ and $t_1$, and $n_\mu(x)$ is the normal to the
hypersurface. However, this hypersurface is located at spatial infinity. We assume
that our fields vanish as $|\mathbf{x}| \to \infty$ and, therefore, the surface integral in (3)
vanishes.
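For the action functional to be an extremum for arbitrary variations $\delta\phi$ in the field,
we must have
\[ \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu \phi)} = 0. \tag{4} \]
This is the Euler-Lagrange field equation.
It must be noted that $\mathcal{L}$ is uncertain to the extent of a four-divergence,
\[ \mathcal{L}' = \mathcal{L} + \partial_\mu F^\mu(\phi(x)), \tag{5} \]
with $\mathcal{L}'$ and $\mathcal{L}$ giving rise to the same equation of motion.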
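The steps leading to the Euler-Lagrange equation can be sketched as follows for a single scalar field. This is the standard textbook variation, written under the assumption that the variations vanish on the temporal boundaries, $\delta\phi(\mathbf{x}, t_1) = \delta\phi(\mathbf{x}, t_2) = 0$; the surface element $d\sigma$ and the function $F^\mu$ are notational assumptions of the sketch.

```latex
% Vary the action and integrate the second term by parts:
\delta I = \int_\Omega d^4x \left[ \frac{\partial\mathcal{L}}{\partial\phi}\, \delta\phi
          + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\, \partial_\mu(\delta\phi) \right]
        = \int_\Omega d^4x \left[ \frac{\partial\mathcal{L}}{\partial\phi}
          - \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} \right] \delta\phi
          + \int_\Sigma d\sigma\; n_\mu(x)\, \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\, \delta\phi .

% A four-divergence only contributes a surface term,
\int_\Omega d^4x\; \partial_\mu F^\mu(\phi) = \int_\Sigma d\sigma\; n_\mu(x)\, F^\mu(\phi) ,
% whose variation involves \delta\phi on \Sigma only and therefore vanishes.
```

With the surface integral dropped, $\delta I = 0$ for arbitrary $\delta\phi$ forces the bracket to vanish, which is the Euler-Lagrange equation; the second line shows why adding $\partial_\mu F^\mu$ to the Lagrangian density leaves that equation unchanged.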

1.2. Canonical Momentum, Hamiltonian

The momentum variable of the field theory that is canonically conjugate to $\phi(x)$
is defined via the functional derivative
\[ \pi(x) = \frac{\partial\mathcal{L}}{\partial\dot\phi(x)} \qquad (\dot\phi \text{ being the time derivative}). \tag{6} \]
The Hamiltonian density of the field is now defined to be
\[ \mathcal{H} = \pi(x)\, \dot\phi(x) - \mathcal{L}. \tag{7} \]
It would be nice if one could arrange that $\mathcal{H}$ is positive semidefinite.

1.3. Examples

(a) Complex Klein-Gordon Field

This field has two independent degrees of freedom, $\phi(x)$ and $\phi^*(x)$ which, in the
sense of complex variable theory, are complex conjugates of each other. Both $\phi(x)$
and $\phi^*(x)$ satisfy the free Klein-Gordon equation
\[ (\Box + \mu^2)\, \phi(x) = 0. \tag{8} \]
The Lagrangian density which yields (8) via an appropriate action principle is
\[ \mathcal{L} = \partial_\mu \phi^*(x)\, \partial^\mu \phi(x) - \mu^2 \phi^*(x) \phi(x). \tag{9} \]
The canonically conjugate momenta are
\[ \pi(x) = \dot\phi^*(x), \qquad \pi^*(x) = \dot\phi(x), \tag{10} \]
and the Hamiltonian density is
\[ \mathcal{H} = \pi(x) \pi^*(x) + \nabla\phi(x) \cdot \nabla\phi^*(x) + \mu^2 \phi(x) \phi^*(x) \geq 0. \tag{11} \]
We shall discuss this example in greater detail later on.


(b) Dirac Field

The standard choice for the Lagrangian density for a classical Dirac field is
\[ \mathcal{L} = \bar\psi(x) (i\gamma^\mu \partial_\mu - m) \psi(x). \tag{12} \]
The resulting Euler-Lagrange equations for the independent components $\psi(x)$
and $\bar\psi(x)$ are
\[ (i\gamma^\mu \partial_\mu - m)\, \psi(x) = 0, \qquad \bar\psi(x) (i\gamma^\mu \overleftarrow{\partial}_\mu + m) = 0. \tag{13} \]
From one-particle quantum mechanics, we know that $\bar\psi = \psi^\dagger \gamma^0$, implying
a relationship between $\psi$ and $\bar\psi$. However, such a relationship in field theory
would result in a symmetry of the theory which we shall discuss later.
The canonical momenta are
\[ \pi(x) = i \bar\psi(x) \gamma^0, \qquad \bar\pi(x) = 0. \tag{14} \]
This could have caused serious trouble, but an anticipation on our part that the
quantization of the Dirac field will not follow the canonical path enables us to
postpone a discussion of this point.
The Hamiltonian density is
\[ \mathcal{H} = \bar\psi(x) (-i \boldsymbol{\gamma} \cdot \nabla + m) \psi(x). \tag{15} \]
This expression is also not trouble-free inasmuch as it is not manifestly positive
definite. We shall discuss the resolution of both these problems in quantum
theory.


(c) Electromagnetic Field

A description of an electromagnetic (e.m.) field at the classical as well as quantum
level, although very old, is beset with many problems. It is well known that the
e.m. field possesses only two independent degrees of polarization; thus, if one
wishes to describe an e.m. field by a four-vector function $A_\mu(x)$, one has to find two
equations of constraint. Hopefully, the Maxwell-Lorentz subsidiary condition
along with gauge invariance of the second kind does the job, but to carry out these
formal procedures, to maintain manifest Lorentz covariance at each stage of the
theory, and to follow the canonical path of mechanics is, to say the least, quite
difficult.
In 1932, Fermi wrote a paper in Reviews of Modern Physics which overcame
the difficulties mentioned above. Since great men admit to having learnt
electrodynamics from this paper, we shall follow it for a while. The Fermi
Lagrangian is
\[ \mathcal{L} = -\tfrac{1}{2}\, \partial_\mu A_\nu(x)\, \partial^\mu A^\nu(x). \tag{16} \]
This differs from the standard Maxwell Lagrangian in a term proportional to
$\partial_\mu A^\mu(x)$. The equation of motion satisfied by the electromagnetic potential is
\[ \frac{\partial\mathcal{L}}{\partial A_\mu} - \partial_\nu \frac{\partial\mathcal{L}}{\partial(\partial_\nu A_\mu)} = 0, \]
implying
\[ \Box A^\mu = 0. \tag{17} \]
The supplementary condition is
\[ \partial_\mu A^\mu(x) = 0 \tag{18} \]
and all four components of $A^\mu$ can be treated as independent dynamical variables
subject to (18). The canonical momenta are
\[ \pi^\mu(x) = -\partial^0 A^\mu(x). \tag{19} \]
We shall return to a fuller discussion of the e.m. field later.

2. Canonical Quantization

The canonical quantization of fields results from changing the Poisson brackets
of classical mechanics to Heisenberg's commutators, thereby assigning to the
fields and their canonically conjugate momenta the significance of linear
operators in a suitable Hilbert space (to be constructed). The Hamiltonian of the
system of fields acquires the significance of a quantized operator. To discuss all
the ramifications of the procedure of canonical quantization of fields, we take
a simple model and discuss it in some detail.

2.1. Complex Klein-Gordon Field

The canonical quantization of this field is achieved through postulating the
following equal time commutators (ETC)
\[ [\pi(\mathbf{x}, x_0), \phi(\mathbf{y}, y_0)]_{x_0 = y_0} = [\pi^*(\mathbf{x}, x_0), \phi^*(\mathbf{y}, y_0)]_{x_0 = y_0} = -i\, \delta^3(\mathbf{x} - \mathbf{y}), \tag{20} \]
all other equal time commutators between $\phi$, $\phi^*$, $\pi$, $\pi^*$ being zero. Using (20) and
(11), it is easy to prove that
\[ [H, F(\phi(x), \pi(x))] = -i\, \partial_0 F(\phi(x), \pi(x)), \tag{21} \]
where F is any Taylor expandable function of $\phi$ and $\pi$. Thus, canonical
quantization implies that the Hamiltonian of the field acts like the generator of
time translations in an operator sense. Are there other operators which
implement Lorentz transformations or translations in space, in our theory?
Further, do these operators in some sense describe invariances of the theory? In
what follows, we shall attempt to answer these questions.

2.2.

Meaning of Lorentz Invariance

In quantum theory it is possible to ascertain Lorentz invariance (or for that


matter, any invariance) without any reference to the particular form of equations
of motion. Let us consider a fixed coordinate system and imagine an apparatus
that serves to prepare a physical state t/J A' Consider now another physical
apparatus related to the first by means of a Lorentz transformation, which
prepares the state t/J A" Consider a measuring apparatus M which performs
measurements on t/J A while M'(similarly related to M) performs measurements on

t/J A"

We recall that in a quantum mechanical measurement, one measures only


probabilities, e.g. the probability that the system t/J A on the action of M (i.e., on
measurement) provides an electron of momentum p described by the state
cf>p, is given by 1(cf>p, t/J A) 12 . The corresponding measurement on A' by M' finds this
probability to be I( cf>p, t/J A' W, The requirement of Lorentz invariance is that these
two must be equal,

Since the Hilbert space of states contains all states, if $A'$ and $A$ are related by a Lorentz transformation (LT), $A' = \Lambda A$, then
$$\psi_{A'} = U(\Lambda)\psi_A \quad\text{and}\quad \phi'_p = U(\Lambda)\phi_p. \tag{22}$$
Equality of probabilities implies that $U(\Lambda)$ is either unitary or antiunitary.

For inhomogeneous LTs,
$$x' = \Lambda x + a, \tag{23}$$
such an operator looks like $U(\Lambda, a)$, where $\Lambda$ is a homogeneous LT and $a_\mu$ represents translations.

R. P. Saxena

Invariance of the probability
$$|(\psi_{A'}, \phi(x')\psi_{A'})|^2 = |(\psi_A, \phi(x)\psi_A)|^2 \tag{24}$$
under the transformation (23), yields the transformation property of the field
$$\phi(\Lambda x + a) = U(\Lambda, a)\,\phi(x)\,U^{-1}(\Lambda, a). \tag{25}$$

The existence of such a unitary (antiunitary) operator in our theory implies the
implementation of LI. Such operators can be explicitly constructed.

2.3. Particle Interpretation, Fock Space

The physical structure of quantum field theory (QFT) becomes transparent if one performs the following Fourier expansion
$$\phi^*(x) = \int \frac{d^3 q}{2\omega_q}\,[f_q^*(x)A^\dagger(q) + f_q(x)B(q)], \tag{26}$$
where
$$f_q(x) = (2\pi)^{-3/2} e^{-iq\cdot x}, \qquad q_0 = (\mathbf{q}^2 + \mu^2)^{1/2} = \omega_q, \qquad q = (q_0, \mathbf{q}). \tag{27}$$
The Fourier amplitudes $A$, $B$ (and their complex conjugates) of the classical field now take on the significance of operators.
It is now easy to work out that
$$[H, A(K)] = -\omega_K A(K), \qquad [H, A^\dagger(K)] = \omega_K A^\dagger(K),$$
$$[H, B(K)] = -\omega_K B(K), \qquad [H, B^\dagger(K)] = \omega_K B^\dagger(K). \tag{28}$$
From (28), it is clear that the operators $A$ and $B$ act as annihilation operators of energy, i.e. if there exists an eigenstate of $H$ with energy $E$ called $\psi_E$, then $A(K)\psi_E$ or $B(K)\psi_E$ are both eigenstates of energy $E - \omega_K$. In a similar manner, $A^\dagger(K)\psi_E$ and $B^\dagger(K)\psi_E$ turn out to be eigenstates of energy $E + \omega_K$, thus enabling us to call $A^\dagger$ and $B^\dagger$ creation operators of energy.
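The ladder structure in (28) can be checked numerically on a single mode. The sketch below is only an illustration under stated assumptions: it replaces the continuum operators $A(K)$, $A^\dagger(K)$ by one truncated harmonic-oscillator mode with unit normalization (the covariant $2\omega_q$ normalization of the text is not reproduced), and the value of `omega` is hypothetical.

```python
import numpy as np

def ladder_ops(n_levels):
    """Return the truncated annihilation operator a and the number operator a^dagger a."""
    # a|n> = sqrt(n)|n-1>, so the entries sqrt(1), sqrt(2), ... sit on the superdiagonal.
    a = np.diag(np.sqrt(np.arange(1, n_levels)), k=1)
    number = a.conj().T @ a
    return a, number

def commutator(x, y):
    return x @ y - y @ x

omega = 1.7  # hypothetical mode energy, not taken from the text
a, number = ladder_ops(6)
H = omega * number  # normal-ordered single-mode Hamiltonian

# [H, a] = -omega a (a lowers the energy) and [H, a^dagger] = +omega a^dagger.
lowers = np.allclose(commutator(H, a), -omega * a)
raises = np.allclose(commutator(H, a.conj().T), omega * a.conj().T)
print(lowers, raises)
```

Both relations hold exactly in the truncated space because the number operator stays diagonal after truncation.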
If we consider the momentum operator of the field

(29)

defined as the generator of infinitesimal spatial translations, we find that
$$[P_j, A(K)] = -K_j A(K), \qquad [P_j, A^\dagger(K)] = K_j A^\dagger(K) \tag{30}$$
(similar relations being valid for $B$ and $B^\dagger$). If we now repeat the arguments given above, it is clear that the operators $A^\dagger(K)$ ($A(K)$) also create (annihilate) momentum. Thus, $A^\dagger(K)$ ($A(K)$) creates (destroys) a quantum (packet) of energy $\omega_K$ and momentum $K$ which satisfies $\omega_K^2 = K^2 + \mu^2$ and, thus, may be said to have particle-like attributes or, in other words, the operators $A^\dagger(K)$ ($A(K)$) create (destroy) particles. A similar interpretation also holds for B-type quanta, created (destroyed) by $B^\dagger$ ($B$).
We now define the ground state of the quantized field as
$$A(q)\psi_0 = B(q)\psi_0 = 0. \tag{31}$$

This, however, immediately leads to the first problem with field theories, i.e.

(32)

Thus, to maintain $\psi_0$ as the state of zero energy (usually chosen for the ground state), one needs to perform an infinite shift in the energy of the system. This procedure corresponds to arranging all creation operators to the left of the destruction operators and is called normal ordering. Now
$$H = \int \frac{d^3 q}{2\omega_q}\,\omega_q\,\bigl(A^\dagger(q)A(q) + B^\dagger(q)B(q)\bigr). \tag{33}$$

Normal ordering at each stage removes these infinite vacuum expectation values
of physical quantities.
The states of the quantized field are now constructed through successive
operations of creation operators on the vacuum. These are labelled as
(34)

(35)
The states (35) are orthonormal and complete and form the basic set in a Hilbert
space designated by the number of quanta of a given type. This space is
remembered by the name of its inventor V. Fock.
It is clear that a number density operator can be defined by
(36)


which counts the number of quanta per unit volume in momentum space.
Commutators between fields at unequal times can be worked out using (20) and the KG equation. These general commutators are LI and were first discussed by Pauli and Jordan.
$$[\phi(x), \phi^*(y)] = \int \frac{d^3 q}{(2\pi)^3\,2\omega_q}\left(e^{-iq\cdot(x-y)} - e^{iq\cdot(x-y)}\right) = i\Delta(x - y, \mu) = \frac{1}{(2\pi)^3}\int d^4 q\;\delta(q^2 - \mu^2)\,\epsilon(q_0)\,e^{-iq\cdot(x-y)}. \tag{37}$$

The invariant $\Delta$-function satisfies the following conditions
(1) $(\Box + \mu^2)\Delta(x, \mu) = 0$,
(2) $\Delta(\mathbf{x}, 0, \mu) = 0$,
(3) $\partial_0 \Delta(x, \mu)|_{x_0 = 0} = -\delta^3(\mathbf{x})$,
(4) $\Delta(x) = -\Delta(-x)$,
(5) $\Delta(x)$ is LI,
(6) for $x^2 < 0$, $\Delta(x) = 0$,
i.e., this function is causal.


Remark. The vanishing of any commutator means the possibility of the precise
measurement of both the dynamical variables in the sense of uncertainty
principle.
The above requirement is also called 'microcausality' or 'local commutativity'.

3. Discrete Symmetries

3.1. Gauge Invariance and Charge Conservation

The theory outlined above is invariant under gauge transformations of the first kind:
$$\phi(x) \to \phi(x)\,e^{i\alpha}, \qquad \phi^*(x) \to \phi^*(x)\,e^{-i\alpha}, \tag{38}$$

leading to a conserved current density


(39)

The integrated fourth component of the current leads to the global invariant charge
$$Q = \int J_0\,d^3x = -i\int d^3x\,[\pi(x)\phi(x) - \pi^*(x)\phi^*(x)]. \tag{40}$$
Expressed in terms of the creation and destruction operators, the charge operator becomes (in normal ordered form)
$$Q = \int \frac{d^3 q}{2\omega_q}\,\bigl(A^\dagger(q)A(q) - B^\dagger(q)B(q)\bigr). \tag{41}$$
It is now clear that the charges carried by A- and B-type quanta are equal and opposite:
$$[Q, A(q)] = -A(q); \qquad [Q, B(q)] = B(q). \tag{42}$$

This charge need not necessarily mean electrical charge. Further, the precise
relationship between the charges of A and B-type quanta leads one to suspect
some symmetry of the theory:

3.2. Charge Conjugation

If there exists an operator $C$ such that
$$C\phi(x)C^{-1} = \phi^*(x), \qquad C\pi(x)C^{-1} = \pi^*(x) \tag{43}$$
and
$$CQC^{-1} = -Q, \tag{44}$$
$$CC^\dagger = C^\dagger C = 1. \tag{45}$$

This operator is obviously unitary and could represent a symmetry of the theory. The definitions given above are equivalent to the statement
$$CA(q)C^{-1} = B(q), \qquad CB(q)C^{-1} = A(q), \tag{46}$$

implying that the operator C converts A-type quanta into B-type and vice-versa.
Further, it is easy to see that C commutes with the Hamiltonian of the
Klein-Gordon field (Eqs. (46), (11)), thus it represents a true symmetry of the
system. The A and B-type quanta are, therefore, called antiparticles of each other.
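The relations (42) and (44) to (46) can be illustrated on a toy model with one A-type and one B-type mode. This is a sketch under stated assumptions, not the field-theoretic construction of $C$: each mode is truncated to `n` levels with unit normalization, `omega` is a hypothetical common energy, and $C$ is taken to be the operator that swaps the two tensor factors.

```python
import numpy as np

n = 4        # levels kept per mode (truncation, an assumption of this sketch)
omega = 0.9  # hypothetical common mode energy

a = np.diag(np.sqrt(np.arange(1, n)), k=1)  # single-mode annihilation operator
eye = np.eye(n)
A = np.kron(a, eye)  # acts on the first (A-type) factor
B = np.kron(eye, a)  # acts on the second (B-type) factor

# Swap operator: C |i>|j> = |j>|i>, with |i>|j> stored at index i*n + j.
C = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        C[j * n + i, i * n + j] = 1.0

Q = A.T @ A - B.T @ B            # charge, as in (41)
H = omega * (A.T @ A + B.T @ B)  # normal-ordered Hamiltonian, as in (33)

checks = {
    "C unitary": np.allclose(C @ C.T, np.eye(n * n)),
    "C A C^-1 = B": np.allclose(C @ A @ C.T, B),
    "C Q C^-1 = -Q": np.allclose(C @ Q @ C.T, -Q),
    "C H C^-1 = H": np.allclose(C @ H @ C.T, H),
    "[Q, A] = -A": np.allclose(Q @ A - A @ Q, -A),
    "[Q, B] = +B": np.allclose(Q @ B - B @ Q, B),
}
print(checks)
```

Since the swap is a real permutation matrix, its inverse is its transpose, which is why `C.T` stands in for $C^{-1}$ above.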
A little trial and error is sufficient to guess that
(47)

3.3. Parity

If a QFT is invariant under spatial inversion or, in other words, parity is conserved, the states $\psi$ and $\psi'$ (in a mirror-reflected frame) must be related by a unitary transformation $U_s$, i.e. $\psi' = U_s\psi$. This unitary transformation must also be such that
$$(\psi, P\psi) = -(\psi', P\psi') \;\Rightarrow\; U_s^{-1} P U_s = -P,$$
$$(\psi, J\psi) = +(\psi', J\psi') \;\Rightarrow\; U_s^{-1} J U_s = J,$$
$$(\psi, Q\psi) = (\psi', Q\psi') \;\Rightarrow\; U_s^{-1} Q U_s = Q. \tag{48}$$

From these physical requirements, it is obvious that certain bilinear combinations of $\phi$ and $\phi^*$ or, equivalently, $A$ and $B$, should appropriately transform. In particular, the number operators should remain invariant
(49)

It is obvious now that in the transformation law of the creation and destruction operators, there can be an uncertainty of a phase, i.e.
$$U_s^{-1} A(q) U_s = \eta A(-q), \qquad U_s^{-1} B(q) U_s = \eta^* B(-q), \qquad \eta\eta^* = 1. \tag{50}$$

Thus
(51 )
and so on. These phases are not measurable. For the field operators, the
transformation law looks like
(52)

For free fields, $\eta$ can be chosen arbitrarily. However, for an interacting system of fields, the $\eta$'s for various fields must be chosen so that, in addition to (48), one must also have
$$U_s^{-1} H U_s = H. \tag{53}$$

This may not always be possible.


3.4. Time Reversal

Under time reversal, the states of the system undergo a reversal of momenta and spins. Wigner, in his pioneering paper in 1932, showed that for time reversal, the relevant operator $U$ in quantum theory must be chosen to be an antiunitary one,
$$U = VK, \tag{54}$$
where $V$ is unitary and $K$ is a c-number operation denoting complex conjugation. $K$ satisfies
$$K^\dagger = K, \qquad K^2 = 1, \qquad K(\alpha\psi_1 + \beta\psi_2) = \alpha^* K\psi_1 + \beta^* K\psi_2, \qquad (K\psi_1, K\psi_2) = (\psi_2, \psi_1). \tag{55}$$
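The algebraic properties of $K$ listed in (55) can be checked on finite-dimensional vectors. The following is a minimal sketch, assuming that states are represented by complex NumPy arrays in a fixed basis and that the inner product is antilinear in its first argument; the vectors and coefficients are arbitrary test data, not quantities from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
psi1 = rng.normal(size=3) + 1j * rng.normal(size=3)
psi2 = rng.normal(size=3) + 1j * rng.normal(size=3)
alpha, beta = 0.3 + 0.7j, -1.2 + 0.4j  # arbitrary complex coefficients

def K(psi):
    """Complex conjugation in the chosen basis."""
    return np.conj(psi)

def inner(x, y):
    """Inner product (x, y), antilinear in x."""
    return np.vdot(x, y)

# K is an involution: K^2 = 1.
involution = np.allclose(K(K(psi1)), psi1)
# K is antilinear: coefficients come out complex conjugated.
antilinear = np.allclose(
    K(alpha * psi1 + beta * psi2),
    np.conj(alpha) * K(psi1) + np.conj(beta) * K(psi2),
)
# K swaps the arguments of the inner product: (K psi1, K psi2) = (psi2, psi1).
swaps = np.isclose(inner(K(psi1), K(psi2)), inner(psi2, psi1))
print(involution, antilinear, swaps)
```

The last property is what distinguishes an antiunitary map from a unitary one, which would preserve the order of the arguments.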

We shall now see that the equations of motion of our theory force an antiunitary time-reversal operator. Let $T$, the time-reversal operator, be defined by
$$T\phi(\mathbf{x}, t)T^{-1} = \phi(\mathbf{x}, -t), \tag{56}$$
or
$$T\phi(\mathbf{x}, t)T^{-1} = \phi^*(\mathbf{x}, -t). \tag{57}$$
The equation of motion is
$$\frac{d}{dt}\phi(\mathbf{x}, t) = i[H, \phi(\mathbf{x}, t)],$$
$$T\frac{d}{dt}\phi(\mathbf{x}, t)T^{-1} = i[THT^{-1}, T\phi(\mathbf{x}, t)T^{-1}].$$
If $THT^{-1}$ is equal to $H$, then, using (56),
$$-\frac{d}{d(-t)}\phi(\mathbf{x}, -t) = i[H, \phi(\mathbf{x}, -t)]$$
and, using (57),
$$-\frac{d}{d(-t)}\phi^*(\mathbf{x}, -t) = i[H, \phi^*(\mathbf{x}, -t)]. \tag{58}$$
Neither of these alternatives leaves the equation of motion form invariant. Clearly, the time-reversal operator must include a c-number operation which performs complex conjugation. If one chooses
$$T = VK, \qquad V\phi(\mathbf{x}, t)V^{-1} = \phi^*(\mathbf{x}, -t) \tag{59}$$
and $K$ performs complex conjugation, then
$$T\phi(\mathbf{x}, t)T^{-1} = \phi(\mathbf{x}, -t), \tag{60}$$
thus restoring form invariance to the equations of motion. In general, there may be a phase in (60).

4. Interacting Fields

When the Lagrangian of a system of fields involves polynomials in fields of degree greater than 2 (i.e. the resulting equations of motion are no longer linear), then one is dealing with an interacting system of fields. Clearly, such Lagrangian systems must
(i) satisfy Lorentz invariance,
(ii) involve as few derivatives of the fields as possible, and
(iii) satisfy all other internal symmetry requirements.


Examples
(i)
$$\mathcal{L} = \bar\psi(x)\,(i\not\partial - e\not A(x) - m)\,\psi(x) + \mathcal{L}_{\text{e.m.}}, \tag{61}$$
where $\mathcal{L}_{\text{e.m.}}$ is the Fermi Lagrangian for the e.m. field (Equation (16)). This Lagrangian describes the interaction of a Dirac field $\psi(x)$ with the electromagnetic field.
(ii)
$$\mathcal{L} = \partial_\mu\phi^*\,\partial^\mu\phi - \mu^2\phi^*\phi + \tfrac{1}{4}\lambda(\phi^*\phi)^2. \tag{62}$$
This Lagrangian describes the self-interaction of a complex KG field.


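(iii)
$$\mathcal{L} = \tfrac{1}{2}(\partial_\mu\sigma)^2 + \tfrac{1}{2}(\partial_\mu\boldsymbol{\pi})^2 - \tfrac{1}{2}\mu^2(\sigma^2 + \boldsymbol{\pi}^2) + \tfrac{1}{4}\lambda(\sigma^2 + \boldsymbol{\pi}^2)^2. \tag{63}$$
This Lagrangian describes the self-interaction of two scalar fields $\sigma$ and $\boldsymbol{\pi}$ as well. The $\sigma$ field is a scalar and $\boldsymbol{\pi}$ is a vector under a suitable internal symmetry transformation, e.g. the isospin symmetry.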
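(iv)
$$\mathcal{L} = \sum_{i=1}^{4}\bar\psi_i(i\not\partial - m)\psi_i + \frac{g}{\sqrt{2}}\,\bar\psi_1\gamma_\mu(1 + \gamma_5)\psi_2\;\bar\psi_3\gamma^\mu(1 + \gamma_5)\psi_4. \tag{63a}$$
This Lagrangian describes the V - A interaction of four fermions.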

5. Invariant Perturbation Theory

It has been found profitable to perform an expansion in powers of the coupling parameter to discuss the behaviour of interacting fields. An invariant perturbation theory was developed by Dyson which described, in a transparent manner, the earlier results obtained by Feynman, Schwinger, and Tomonaga. The time development of the fields in Dyson's theory is performed neither in the Schrödinger nor in the Heisenberg representation, but in an intermediate representation called the interaction picture.
Let the free fields be denoted by $\phi_{in}(\mathbf{x}, t)$ and their conjugates by $\pi_{in}(\mathbf{x}, t)$ ($t$ here tends to $-\infty$, corresponding to incoming fields) and the interacting fields by $\phi$ and $\pi$. Further, let there exist an operator $U(t)$ such that
$$\phi(\mathbf{x}, t) = U^{-1}(t)\,\phi_{in}(\mathbf{x}, t)\,U(t), \qquad \pi(\mathbf{x}, t) = U^{-1}(t)\,\pi_{in}(\mathbf{x}, t)\,U(t). \tag{64}$$
The equations of motion satisfied by $\phi$ and $\phi_{in}$ are
$$\partial_t\phi(\mathbf{x}, t) = i[H(\phi, \pi), \phi], \qquad \partial_t\phi_{in}(\mathbf{x}, t) = i[H_{in}(\phi_{in}, \pi_{in}), \phi_{in}]. \tag{65}$$
$H_{in}$ is the Hamiltonian of free fields with physical properties (i.e. masses and charge).

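$$\partial_t\phi_{in}(\mathbf{x}, t) = \partial_t\bigl(U(t)\phi(\mathbf{x}, t)U^{-1}(t)\bigr) = \dot U\phi U^{-1} + U\dot\phi U^{-1} + U\phi\,\frac{d}{dt}U^{-1}$$
$$= \dot U U^{-1}\phi_{in} + i[H(\phi_{in}, \pi_{in}), \phi_{in}] + \phi_{in}\,U\frac{d}{dt}U^{-1}$$
$$= [\dot U U^{-1} + iH(\phi_{in}, \pi_{in}), \phi_{in}]. \tag{66}$$
Here we have used $U\,\frac{d}{dt}U^{-1} = -\dot U U^{-1}$. Clearly,
$$H(\phi_{in}, \pi_{in}) = H_{in}(\phi_{in}, \pi_{in}) + H_I(t) \tag{67}$$
and, comparing (66) with (65),
$$i\,\frac{\partial U(t)}{\partial t} = \bigl(H_I(t) + C(t)\bigr)U(t). \tag{68}$$
Here $H_I$ represents the interaction Hamiltonian ($C(t)$ is a c-number chosen as zero).

Solution of (68)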

Define
$$U(t, t') = U(t)\,U^{-1}(t'), \tag{69}$$
then
$$i\,\frac{\partial U(t, t')}{\partial t} = H_I(t)\,U(t, t') \tag{70}$$
subject to the boundary condition $U(t, t) = 1$. We convert (70) into an integral equation
$$U(t, t') = 1 - i\int_{t'}^{t} H_I(t_1)\,U(t_1, t')\,dt_1 \quad\text{for } t \ge t'. \tag{71}$$
The operator $U(t, t')$ has certain group properties
$$U(t, t') = U(t, t'')\,U(t'', t'), \qquad U(t, t') = U^{-1}(t', t). \tag{72}$$

A formal iterative solution of (71) is written as
$$U(t, t') = 1 + \sum_{n=1}^{\infty}(-i)^n\int_{t'}^{t}dt_1\int_{t'}^{t_1}dt_2\int_{t'}^{t_2}dt_3\cdots\int_{t'}^{t_{n-1}}dt_n\;T\bigl(H_I(t_1)H_I(t_2)\cdots H_I(t_n)\bigr), \tag{73}$$

where the time ordered product T is defined as


(74)
Utilizing certain symmetry properties of T products under integration, we get
(75)

or, alternatively,
$$U(t, t') = T\exp\left[-i\int_{t'}^{t}H_I(t_1)\,dt_1\right] = T\exp\left[-i\int d^4x\,\mathcal{H}_I(x)\right], \tag{76}$$
where $H_I(t) = \int d^3x\,\mathcal{H}_I(x)$ or, in covariant language, the integral extends over a spacetime region bounded by $t' < x_0 < t$.
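The group property (72) and the iterative solution (73) can be illustrated on a two-level system, where the time-ordered exponential is built as an ordered product of short-time propagators. This is a sketch under stated assumptions: the interaction Hamiltonian `h_int` is a hypothetical $2\times 2$ matrix chosen only so that it does not commute with itself at different times, and the coupling `g` is small so that truncating (73) after $n = 2$ is a reasonable approximation.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def h_int(t, g=0.05):
    """Hypothetical interaction Hamiltonian H_I(t); not taken from the text."""
    return g * (np.cos(t) * SX + np.sin(2 * t) * SZ)

def short_step(h, dt):
    """exp(-i h dt) for a Hermitian matrix h, via its eigen-decomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def evolve(t, t0, steps):
    """Time-ordered product of short-time propagators, later times to the left."""
    dt = (t - t0) / steps
    u = np.eye(2, dtype=complex)
    for k in range(steps):
        u = short_step(h_int(t0 + (k + 0.5) * dt), dt) @ u
    return u

def dyson_second_order(t, t0, steps=400):
    """Series (73) truncated after n = 2, with midpoint-rule integrals."""
    dt = (t - t0) / steps
    first = np.zeros((2, 2), dtype=complex)
    second = np.zeros((2, 2), dtype=complex)
    inner = np.zeros((2, 2), dtype=complex)  # integral of H_I(t2) from t0 up to t1
    for k in range(steps):
        h = h_int(t0 + (k + 0.5) * dt)
        second += h @ (inner + 0.5 * h * dt) * dt  # H_I(t1) stands to the left
        inner += h * dt
        first += h * dt
    return np.eye(2) - 1j * first - second

u_full = evolve(2.0, 0.0, steps=2000)
u_split = evolve(2.0, 1.0, steps=1000) @ evolve(1.0, 0.0, steps=1000)
u_dyson = dyson_second_order(2.0, 0.0)

print(np.allclose(u_full, u_split))       # group property (72)
print(np.max(np.abs(u_full - u_dyson)))   # size of the neglected higher orders
```

The residual between `u_full` and `u_dyson` is of third order in the coupling, which is the sense in which (73) is an expansion in powers of $H_I$.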
5.1. Perturbation Expansion of $\tau$-Functions

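Define the $\tau$-function as the vacuum expectation value of the time-ordered product of a set of field operators
$$\tau(x_1 \ldots x_n) = \langle 0|T\bigl(\phi(x_1)\phi(x_2)\cdots\phi(x_n)\bigr)|0\rangle \tag{77}$$
$$= \langle 0|T\bigl(U^{-1}(t_1)\phi_{in}(x_1)U(t_1, t_2)\phi_{in}(x_2)\cdots U(t_{n-1}, t_n)\phi_{in}(x_n)U(t_n)\bigr)|0\rangle. \tag{78}$$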

We now write
(79)

such that $t$ is a reference time larger than all $t_1, t_2, \ldots, t_n$. Thus, if we take the limit $t \to \infty$, one can write
$$\tau(x_1, x_2, \ldots, x_n) = \langle 0|U^{-1}(t)\,T\bigl(U(t, t_1)\phi_{in}(x_1)U(t_1, t_2)\phi_{in}(x_2)\cdots U(t_{n-1}, t_n)\phi_{in}(x_n)U(t_n, -t)\bigr)U(-t)|0\rangle \tag{80}$$
$$= \langle 0|U^{-1}(t)\,T\bigl(\phi_{in}(x_1)\phi_{in}(x_2)\cdots\phi_{in}(x_n)\,U(t, -t)\bigr)U(-t)|0\rangle \tag{81}$$
$$= \langle 0|U^{-1}(t)\,T\Bigl(\phi_{in}(x_1)\phi_{in}(x_2)\cdots\phi_{in}(x_n)\exp\Bigl(-i\int_{-t}^{t}H_I(t')\,dt'\Bigr)\Bigr)U(-t)|0\rangle. \tag{82}$$
The operators $U^{-1}(t)$ and $U(-t)$ on the extreme left and right are such that the vacuum state is an eigenstate of these operators. After a little computation in the limit $t \to \infty$, it can be proved that
$$\tau(x_1 \ldots x_n) = \frac{\langle 0|T\bigl(\phi_{in}(x_1)\phi_{in}(x_2)\cdots\phi_{in}(x_n)\exp\bigl(-i\int d^4x\,\mathcal{H}_I(x)\bigr)\bigr)|0\rangle}{\langle 0|T\bigl(\exp\bigl(-i\int d^4x\,\mathcal{H}_I(x)\bigr)\bigr)|0\rangle}. \tag{83}$$

The perturbation expansion for a $\tau$-function consists of expanding the exponentials in (83) and retaining up to a given power of $\mathcal{H}_I$ in the numerator and denominator.
It can be shown that the vacuum-to-vacuum amplitudes, or the so-called disconnected graphs, cancel the contribution to a given $\tau$-function up to a given order coming from the denominator. Thus, the only physically interesting parts of the $\tau$-functions are the ones coming from connected graphs and they are given by

(84)
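Wick's Theorem

$$T\bigl(\phi(x_1)\cdots\phi(x_n)\bigr) = {:}\phi(x_1)\cdots\phi(x_n){:}$$
$$+ \bigl[\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle\,{:}\phi(x_3)\cdots\phi(x_n){:} + \text{permutations}\bigr]$$
$$+ \bigl[\tau(x_1x_2)\,\tau(x_3x_4)\,{:}\phi(x_5)\cdots\phi(x_n){:} + \text{permutations}\bigr] + \cdots$$
$$+ \bigl[\tau(x_1x_2)\,\tau(x_3x_4)\cdots\tau(x_{n-1}x_n) + \text{permutations}\bigr] \quad (\text{if } n \text{ even})$$
or
$$+ \bigl[\tau(x_1x_2)\,\tau(x_3x_4)\cdots\tau(x_{n-2}x_{n-1})\,{:}\phi(x_n){:} + \text{permutations}\bigr] \quad (\text{if } n \text{ odd}). \tag{85}$$
Basic result:
$$T\bigl(\phi(x_1)\phi(x_2)\bigr) = {:}\phi(x_1)\phi(x_2){:} + \tau(x_1x_2), \tag{86}$$
where $\tau(x_1x_2)$ is also called the Feynman propagator.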
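The "permutations" in Wick's theorem run over all distinct ways of pairing up the spacetime points. A short enumeration makes the counting concrete: for $n$ even, the fully contracted term contains $(n-1)!!$ products of propagators, and for $n$ odd there is none. The sketch below only counts pairings; the labels are placeholders for the points $x_1, \ldots, x_n$ and no propagator is evaluated.

```python
def pairings(points):
    """Yield every way of grouping the points into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def double_factorial(n):
    """n!! = n (n-2) (n-4) ... down to 1 or 2."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

four_point = list(pairings([1, 2, 3, 4]))
six_point_count = sum(1 for _ in pairings(list(range(1, 7))))
odd_count = sum(1 for _ in pairings([1, 2, 3]))

for term in four_point:
    # Each term corresponds to one product tau(x_i x_j) tau(x_k x_l).
    print(term)
print(six_point_count, double_factorial(5), odd_count)
```

For four points this lists the three products $\tau(x_1x_2)\tau(x_3x_4)$, $\tau(x_1x_3)\tau(x_2x_4)$ and $\tau(x_1x_4)\tau(x_2x_3)$.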

5.2. Feynman Graphs

Every contraction of two fields represents a world line joining two spacetime points. With the help of such lines, every term in the Dyson-Wick expansion of a $\tau$-function can be given a graphical representation which is of great help in understanding the physical mechanism underlying a process.
Example

Consider the interaction of a Dirac field with an e.m. field (see Equation (61)) and the second-order contribution to the $\tau$-function
(87)

Here $\alpha$ and $\beta$ are the spinor indices of the Dirac fields and $\mu$ and $\nu$ the polarization indices of the e.m. field. The second-order contribution is
$$\tau^{(2)}(x_1x_2z_1z_2) = \frac{(-ie)^2}{2!}\int d^4y_1\,d^4y_2\;\langle 0|T\bigl(\psi^{in}_\beta(z_2)\,\bar\psi^{in}_\alpha(z_1)\,A^{in}_\mu(x_1)\,A^{in}_\nu(x_2)\;\bar\psi_{in}(y_1)\gamma_\lambda\psi_{in}(y_1)A^{\lambda}_{in}(y_1)\;\bar\psi_{in}(y_2)\gamma_\sigma\psi_{in}(y_2)A^{\sigma}_{in}(y_2)\bigr)|0\rangle. \tag{88}$$

If one denotes the e.m. field by a dotted line and the Dirac (electron) field by a solid one, the following contractions are possible (Figures 1 and 2).

Fig. 1. Disconnected diagrams.

As argued before, only the diagrams labelled (A) and (B) need to be considered. The T-product of the four photon operators in (88) is factorized as
$$\langle 0|T\bigl(A^{in}_\mu(x_1)A^{in}_\nu(x_2)A^{\lambda}_{in}(y_1)A^{\sigma}_{in}(y_2)\bigr)|0\rangle = \langle 0|T\bigl(A^{in}_\mu(x_1)A^{\lambda}_{in}(y_1)\bigr)|0\rangle\,\langle 0|T\bigl(A^{in}_\nu(x_2)A^{\sigma}_{in}(y_2)\bigr)|0\rangle$$
$$+ \langle 0|T\bigl(A^{in}_\mu(x_1)A^{\sigma}_{in}(y_2)\bigr)|0\rangle\,\langle 0|T\bigl(A^{in}_\nu(x_2)A^{\lambda}_{in}(y_1)\bigr)|0\rangle, \tag{89}$$

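leading to the two diagrams. In a similar manner, if one does not permit contraction of $z_1$ and $z_2$, a typical contraction is
$$T(\text{Fermion operators}) = \psi_\beta(z_2)\,\bar\psi_a(y_2)(\gamma^\sigma)_{ab}\,\psi_b(y_2)\,\bar\psi_c(y_1)(\gamma^\lambda)_{cd}\,\psi_d(y_1)\,\bar\psi_\alpha(z_1) = \bigl(S_F(z_2 - y_2)\gamma^\sigma S_F(y_2 - y_1)\gamma^\lambda S_F(y_1 - z_1)\bigr)_{\beta\alpha}. \tag{90}$$
Thus, one obtains
$$\tau^{(2)A}(x_1x_2z_1z_2) = -e^2\int d^4y_1\,d^4y_2\;D_{F\nu\sigma}(x_2 - y_2)\,D_{F\mu\lambda}(y_1 - x_1)\,\bigl(S_F(z_2 - y_2)\gamma^\sigma S_F(y_2 - y_1)\gamma^\lambda S_F(y_1 - z_1)\bigr)_{\beta\alpha}, \tag{91}$$
$$\tau^{(2)B}(x_1x_2z_1z_2) = -e^2\int d^4y_1\,d^4y_2\;D_{F\nu\lambda}(x_2 - y_1)\,D_{F\mu\sigma}(y_2 - x_1)\,\bigl(S_F(z_2 - y_2)\gamma^\sigma S_F(y_2 - y_1)\gamma^\lambda S_F(y_1 - z_1)\bigr)_{\beta\alpha}. \tag{92}$$

5.3. S-matrix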

Using LSZ formalism, it is possible to prove that for the scattering of an electron of momentum $p_1$ and a photon of momentum $q_1$ into an electron and photon state with momenta $p_2$ and $q_2$, respectively, in the second order of perturbation theory
$$S^{(2)}(p_2q_2\nu, p_1q_1\mu) = \int d^4x_1\,d^4x_2\,d^4z_1\,d^4z_2\;f^{*\nu}_{q_2}(x_2)\,\bar u_{p_2}(z_2)\,(i\not\partial_{z_2} - m)\,\Box_{x_2}\,\tau^{(2)}(x_1x_2z_1z_2)\,(-i\overleftarrow{\not\partial}_{z_1} - m)\,\Box_{x_1}\,u_{p_1}(z_1)\,f^{\mu}_{q_1}(x_1). \tag{93}$$
Here
(94)

is the free photon wavefunction of polarization $\epsilon_\mu(q_1)$ and
$$u_p(x) = \sqrt{\frac{m}{E_p}}\;\frac{1}{(2\pi)^{3/2}}\,u(p)\,e^{-ip\cdot x} \tag{95}$$
is the free particle solution of the Dirac equation.


Several trivial but routine steps have to be performed. Let us substitute $\tau^{(2)A}$ in (93). The operations by Dirac and Klein-Gordon operators on the respective Green functions (i.e. propagators) yield four four-dimensional $\delta$-functions, which can be used to integrate over $x_1$, $x_2$, $z_1$ and $z_2$. The remaining propagator $S_F(y_2 - y_1)$ is written as a Fourier transform
$$S_F(y_2 - y_1) = \int\frac{d^4p}{(2\pi)^4}\,e^{-ip\cdot(y_2 - y_1)}\,\frac{\not p + m}{p^2 - m^2 + i\epsilon}. \tag{96}$$
The integrations over $y_1$ and $y_2$ yield two four-dimensional $\delta$-functions among the momenta $p_1$, $q_1$, $p_2$, $q_2$ and $p$. The final integration over $p$ yields the $\delta$-function denoting energy momentum conservation. Finally,
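$$S^{(2)A} = -\frac{e^2}{(2\pi)^2}\,\frac{1}{\sqrt{4\omega_1\omega_2}}\left(\frac{m^2}{E_{p_1}E_{p_2}}\right)^{1/2}\bar u(p_2)\,\gamma^\nu\,\frac{\not p_1 + \not q_1 + m}{(p_1 + q_1)^2 - m^2 + i\epsilon}\,\gamma^\mu\,u(p_1)\;\epsilon_\nu(q_2)\,\epsilon_\mu(q_1)\,\delta^4(p_1 + q_1 - p_2 - q_2) \tag{97}$$
and
$$S^{(2)B} = -\frac{e^2}{(2\pi)^2}\,\frac{1}{\sqrt{4\omega_1\omega_2}}\left(\frac{m^2}{E_{p_1}E_{p_2}}\right)^{1/2}\bar u(p_2)\,\gamma^\mu\,\frac{\not p_1 - \not q_2 + m}{(p_1 - q_2)^2 - m^2 + i\epsilon}\,\gamma^\nu\,u(p_1)\;\epsilon_\nu(q_2)\,\epsilon_\mu(q_1)\,\delta^4(p_1 + q_1 - p_2 - q_2). \tag{98}$$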

Looking at expressions (97) and (98), it appears that given any process, one simply draws the allowed Feynman graphs and, through a recipe, directly obtains the S-matrix for the process in momentum space. This recipe is a set of Feynman rules.
Quantum electrodynamics is a highly successful theory. A vast amount of experimental data involving e.m. interaction can be explained to a good degree of accuracy. However, in calculating certain higher-order diagrams, it is found that their contribution to the S-matrix is a meaningless divergent quantity. The essence of the renormalization program is in the interpretation of these divergent quantities. It is successfully argued that the quanta of the fields or particles in the absence of interactions are in an unphysical, totally unobservable state and their physical attributes, like mass and charge etc., are the so-called unrenormalized quantities. Nobody has seen them, so their magnitude can be anything, even divergent. The divergent contributions to the S-matrix can now be lumped together in the form of a few distinct divergent quantities called charge, mass, and coupling constant renormalization constants etc. These terms can then be used to redefine the charge, mass etc. of physical particles, which should be finite. If the number of such divergent quantities in a theory is finite and does not increase with the order of perturbation, the theory is called a renormalizable one. Quantum electrodynamics is a renormalizable theory; the V-A theory of weak interactions is nonrenormalizable. In what follows, we shall attempt to illustrate these.
6. Primitive Divergences in QED

There are three basic diagrams in QED which are divergent (Figure 3). All further divergences are iterations of these.

Fig. 3.

These diagrams are called electron self-energy (a), vacuum polarization or photon self-energy (b), and the vertex graph (c). The contributions to the S-matrix from these diagrams may be written as
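$$S^{(a)} = \bar u(p)\,\Sigma(p)\,u(p), \tag{99}$$
$$\Sigma(p) = \frac{e^2}{(2\pi)^4}\int d^4k\;\frac{g_{\mu\nu}}{k^2 + i\epsilon}\;\frac{\gamma^\mu(\not p - \not k + m)\gamma^\nu}{(p - k)^2 - m^2 + i\epsilon} \tag{99a}$$
(called the electron self-energy part),
$$S^{(b)} = \epsilon^\mu(k)\,\Pi_{\mu\nu}(k)\,\epsilon^\nu(k), \tag{100}$$
$$\Pi_{\mu\nu}(k) = \frac{e^2}{(2\pi)^4}\int d^4p\;\mathrm{Tr}\bigl(\gamma_\mu(\not p + m)\gamma_\nu(\not p - \not k + m)\bigr)\,\bigl[(p^2 - m^2)((p - k)^2 - m^2)\bigr]^{-1} \tag{100a}$$
is called the vacuum polarization term,
$$S^{(c)} = \bar u(p')\,\Lambda_\mu(p, p')\,u(p)\,\epsilon^\mu(q), \tag{101}$$
$$\Lambda_\mu(p, p') = \frac{e^2}{(2\pi)^4}\int\frac{g_{\lambda\rho}}{k^2}\,\gamma^\lambda\,\frac{1}{\not p' - \not k - m}\,\gamma_\mu\,\frac{1}{\not p - \not k - m}\,\gamma^\rho\,d^4k \tag{101a}$$
is called the vertex part.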


To evaluate these divergent integrals in the past, one had to perform a regularization. These days, one performs the so-called dimensional regularization due to 't Hooft.
The final result for the primitively divergent integral for electron self-energy is
$$\Sigma(p) = A + (\not p + m)B, \tag{102}$$
$$A = -\frac{i\alpha}{4\pi}\left[\frac{3m}{\epsilon} - 2m\int_0^1 dx\,(1 + x)\ln\left(\frac{m^2x + p^2x(1 - x)}{4\pi\mu^2}\right) - m(1 + 3\gamma) + O(\epsilon)\right], \tag{103}$$
$$B = [\ldots] + (1 + \gamma) + O(\epsilon)]. \tag{104}$$

Here, $\epsilon = 4 - d$, $d$ being the dimension of space, $\mu$ is the mass introduced to make the effective charge dimensionless in $d$ dimensions and $\gamma$ is the Euler constant. The divergent quantity $A$ is the self-mass counterterm and $B$ is related to the wavefunction renormalization constant $Z_2$, as we shall see later.
In a similar manner, the vacuum polarization term and the vertex part can be evaluated. It is seen that
(i) the result is automatically gauge invariant, a feature not found in other regularization schemes;
(ii) the only singularity of these expressions occurs as a pole as $\epsilon \to 0$ or $d \to 4$, thus rendering the extraction of divergent parts unambiguous;
(iii) the finite part of the vertex part yields a measurable contribution to the so-called 'anomalous' magnetic moment of the electron in second-order perturbation (i.e. from (101)) equal to $\alpha/2\pi$. This 'anomalous' magnetic moment is now a very accurately measured quantity. It has also been calculated up to the sixth order in perturbation theory. The comparative numbers are:

Theory:      1 159 652 359 (282) × 10^{-12}
Experiment:  1 159 652 410 (210) × 10^{-12}      (105)
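A quick arithmetic check puts the numbers in (105) in perspective: the second-order term $\alpha/2\pi$ already reproduces the measured anomaly to about one part in a thousand, and the quoted theoretical and experimental values agree within their combined uncertainty. In the sketch below the numerical value of the fine-structure constant is an assumption (it is not quoted in the text), and combining the two bracketed uncertainties in quadrature is an illustrative choice.

```python
import math

# Fine-structure constant; this numerical value is assumed here, not taken from the text.
ALPHA = 1 / 137.035999

schwinger = ALPHA / (2 * math.pi)  # second-order anomalous moment, alpha / 2 pi

theory = 1159652359e-12            # values quoted in (105)
experiment = 1159652410e-12
theory_err = 282e-12
experiment_err = 210e-12

# Fraction of the anomaly that is left for the higher orders to explain.
relative_gap = (schwinger - theory) / theory

# Theory-experiment difference in units of the combined (quadrature) uncertainty.
pull = (experiment - theory) / math.hypot(theory_err, experiment_err)

print(f"alpha/2pi       = {schwinger:.9f}")
print(f"relative gap    = {relative_gap:.2e}")
print(f"theory vs expt  = {pull:.2f} combined standard deviations")
```

The higher-order terms therefore only need to account for a correction at the $10^{-3}$ level relative to the leading term.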

Another experimental number which tests the theories to a similar degree of accuracy is the Lamb shift. The self-energies of electrons bound in atomic orbits turn out to be dependent on the environment in which they are bound. The divergent part is independent of the environment, leading to the same mass renormalization effect for all electrons. The finite parts of the self-energy depend on the environment, thus leading to a lifting of degeneracy (present in the absence of self-interaction). The difference in any two such levels is called the Lamb shift. Historically, it was observed between the $2S_{1/2}$ and $2P_{1/2}$ levels of the hydrogen atom by Lamb and Retherford in 1947. The current status of the Lamb shift, vis-à-vis theory and experiment, is similar to (105).
7. QED as a Renormalizable Theory

Quantum electrodynamics as defined by the Lagrangian (see Equation (61)) is a meaningless theory on account of the divergences present in the theory. However, through a series of redefinitions (called renormalizations), it is possible to absorb these infinities in the definition of physical quantities. An additional Lagrangian incorporating these so-called counterterms makes the net or renormalized QED a finite theory, order-by-order in perturbation theory. In what follows, we illustrate this procedure in second-order perturbation theory.
Let the unrenormalized Lagrangian be written in terms of renormalized quantities $\psi$, $A_\mu$, $m$, $e$ and $a$, a gauge fixing parameter (not needed for theories with $\partial_\mu A_\mu = 0$). This Lagrangian is (to make use of the results of dimensional regularization, we work in a Euclidean space of dimension $2\omega$)
$$\mathcal{L}_{un} = \tfrac{1}{4}F_{\mu\nu}F_{\mu\nu} + \frac{1}{2a}(\partial_\mu A_\mu)^2 + \bar\psi(\not\partial + im)\psi + ie\mu^{2-\omega}\,\bar\psi\not A\psi. \tag{106}$$

We now perform the renormalizations
$$\psi_0 = Z_2^{1/2}\psi, \qquad A_{0\mu} = Z_3^{1/2}A_\mu, \qquad e_0 = e\mu^{2-\omega}\,\frac{Z_1}{Z_2Z_3^{1/2}},$$
$$m_0 = m + \delta m = m\,\frac{Z_m}{Z_2}, \qquad a_0 = Z_a^{-1}Z_3\,a. \tag{107}$$
Here, the divergent quantities $Z_1$, $Z_2$, $Z_3$, $Z_m$ and $Z_a$ all are pieces of the three primitive divergent diagrams represented by $\Sigma$, $\Pi_{\mu\nu}$, and $\Lambda_\mu$. We work them out to be

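$$Z_1 = Z_2 = 1 - \frac{e^2}{16\pi^2}\left(\frac{1}{\epsilon} + \text{finite}\right) = 1 + K_1 = 1 + K_2,$$
$$Z_3 = 1 - \frac{e^2}{12\pi^2}\left(\frac{1}{\epsilon} + \text{finite}\right) = 1 + K_3,$$
$$Z_m = 1 - \frac{e^2}{4\pi^2}\left(\frac{1}{\epsilon} + \text{finite}\right) = 1 + K_m,$$
$$Z_a = 1 + \frac{e^2}{12\pi^2}\left(\frac{1}{\epsilon} + \text{finite}\right) = 1 + K_a. \tag{108}$$
Here $\epsilon = 4 - 2\omega$.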


The structure of the primitive divergent terms suggests that the counterterm Lagrangian must have a renormalization of the kinetic energy of the electron, the mass of the electron, the photon self-energy, charge renormalization and, possibly, a modification of the gauge fixing term. The coefficients of such terms can now be written down as
$$\mathcal{L}_{c.t.} = K_2\,\bar\psi\not\partial\psi + imK_m\,\bar\psi\psi + \tfrac{1}{4}K_3F_{\mu\nu}F_{\mu\nu} + ie\mu^{2-\omega}K_1\,\bar\psi\not A\psi + \frac{K_a}{2a}(\partial_\mu A_\mu)^2. \tag{109}$$
The net renormalized (physical) Lagrangian is now
$$\mathcal{L}_{ren} = Z_2\,\bar\psi\not\partial\psi + imZ_m\,\bar\psi\psi + \tfrac{1}{4}Z_3F_{\mu\nu}F_{\mu\nu} + ie\mu^{2-\omega}Z_1\,\bar\psi\not A\psi + \frac{Z_a}{2a}(\partial_\mu A_\mu)^2. \tag{110}$$

The Lagrangian written above has no infinities up to the second order of perturbation theory.
The equality of $Z_1$ and $Z_2$ is called the Ward identity and is true order by order. It has been proved that the divergences in QED to all orders can be absorbed in the multiplicative renormalization constants, the $Z$'s.

8. V-A Theory as a Nonrenormalizable Theory

The clue to the renormalizability of QED came from the failure of naive power counting. By naive power counting, $\Sigma(p)$ should have been linearly divergent, $\Pi_{\mu\nu}$ quadratically divergent, and $\Lambda_\mu$ logarithmically divergent. Evaluated properly, all these diagrams turn out to be only logarithmically divergent (in the language of dimensional regularization, a pole at $d = 4$). The reason is gauge invariance. In addition, the number of primitively divergent diagrams in QED is finite and, thus, the degree of divergence in QED does not increase with order, enabling us to perform the renormalizations.
In V-A theory one has neither a gauge principle nor is the number of primitive divergent graphs finite. Further, the degree of divergence of diagrams contributing to a physical process increases with the order of perturbation. The electron self-energy for the Lagrangian (63a) is quartically divergent to order $g^2$ and diverges like the eighth power of the cutoff in the fourth order (Figure 4). With no gauge principle to provide cancellations among diagrams of the same order, the degree of divergence just keeps increasing. Further, the structure of counterterms needed in V-A theory, firstly, does not correspond to the pieces of the original Lagrangian and, secondly, goes on increasing with the order of perturbation. Thus, V-A theory is not renormalizable.

Fig. 4.
9. Dimensional Regularization

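For performing dimensional regularization of a divergent integral, we continue to a $2\omega$-dimensional Euclidean continuum. In such a continuum, spin-½ fields have dimension $-\omega + \tfrac{1}{2}$ and spin-1 fields have dimension $-\omega + 1$. The Euclidean space Feynman rules are shown in Figure 5, with the factor $\mu^{2-\omega}$ being necessary to make the charge dimensionless in $2\omega$ dimensions.

Fig. 5. Euclidean space Feynman rules: photon propagator $\delta_{\mu\nu}/p^2$; electron propagator $-i/(\not p + m)$; vertex $-ie\mu^{2-\omega}\gamma_\rho$.

Fig. 6.

$$\Sigma(p) = -(e\mu^{2-\omega})^2\int\frac{d^{2\omega}l}{(2\pi)^{2\omega}}\;\gamma_\mu\,\frac{-i}{\not l + \not p + m}\,\gamma_\nu\,\frac{\delta_{\mu\nu}}{l^2}. \tag{111}$$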

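In Euclidean spaces, the Dirac matrices satisfy
$$\{\gamma_\mu, \gamma_\nu\} = -2\delta_{\mu\nu}, \qquad \gamma_\mu\gamma_\mu = -2\omega, \qquad \gamma_\mu\gamma_\rho\gamma_\mu = (2 - 2(2 - \omega))\gamma_\rho. \tag{112}$$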
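The identities (112), and the five-matrix identity quoted later as (136a), can be verified with explicit matrices in four dimensions ($\omega = 2$). The sketch below assumes the sign convention $\{\gamma_\mu, \gamma_\nu\} = -2\delta_{\mu\nu}$ and builds one particular (hypothetical, but valid) representation from Pauli matrices; any other representation of the same algebra would give the same results.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
i2 = np.eye(2, dtype=complex)

# Hermitian matrices with {G_mu, G_nu} = +2 delta_mu_nu ...
hermitian = [np.kron(s1, s) for s in (s1, s2, s3)] + [np.kron(s2, i2)]
# ... multiplied by i, so that {gamma_mu, gamma_nu} = -2 delta_mu_nu.
gamma = [1j * g for g in hermitian]
eye4 = np.eye(4)
omega = 2  # 2*omega = 4 dimensions

clifford_ok = all(
    np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m], -2 * (m == n) * eye4)
    for m in range(4) for n in range(4)
)

# gamma_mu gamma_mu = -2 omega
contraction_ok = np.allclose(sum(g @ g for g in gamma), -2 * omega * eye4)

# gamma_mu gamma_rho gamma_mu = (2 - 2(2 - omega)) gamma_rho
sandwich_ok = all(
    np.allclose(sum(g @ gamma[r] @ g for g in gamma), (2 - 2 * (2 - omega)) * gamma[r])
    for r in range(4)
)

# gamma_mu g_a g_b g_r gamma_mu = 2 g_r g_b g_a - 2(2 - omega) g_a g_b g_r
five_ok = all(
    np.allclose(
        sum(g @ gamma[a] @ gamma[b] @ gamma[r] @ g for g in gamma),
        2 * gamma[r] @ gamma[b] @ gamma[a]
        - 2 * (2 - omega) * gamma[a] @ gamma[b] @ gamma[r],
    )
    for a in range(4) for b in range(4) for r in range(4)
)
print(clifford_ok, contraction_ok, sandwich_ok, five_ok)
```

At $\omega = 2$ the terms proportional to $(2 - \omega)$ drop out, so this check fixes the signs and the leading coefficients but not the $\omega$-dependence itself.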

We rewrite the Feynman propagator as
$$-i/(\not q + m) = i(\not q - m)/(q^2 + m^2) \tag{113}$$
and introduce the Feynman parameter integral to obtain

(114)

Since the integral over the momentum variable $l$ is now a convergent one, we introduce a new variable of integration
$$l \to l - px \tag{115}$$
in terms of which

The term linear in $l$ in the numerator in (115) vanishes on integration because of oddness and the others can be simplified by using (112). Thus, we get
$$\Sigma(p) = -i(e\mu^{2-\omega})^2\int_0^1 dx\int\frac{d^{2\omega}l}{(2\pi)^{2\omega}}\,\bigl[2\omega m + \not p(1 - x)(2 - 2(2 - \omega))\bigr]\,(l^2 + M^2)^{-2}, \tag{116}$$
where
(116a)
For an N-dimensional Euclidean space, one can write
(117a)

for
$$0 \le l \le \infty, \qquad 0 \le \phi \le 2\pi. \tag{117b}$$
Using the beta function representation
$$\int_0^{\pi/2}(\sin t)^{2l-1}(\cos t)^{2m-1}\,dt = \frac{\Gamma(l)\Gamma(m)}{2\Gamma(l + m)}, \tag{117c}$$
one can show that
$$\int d^Nl\,F(l) = \frac{2\pi^{N/2}}{\Gamma(N/2)}\int_0^\infty l^{N-1}\,dl\,F(l). \tag{117d}$$
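The angular factor $2\pi^{N/2}/\Gamma(N/2)$ in (117d) and the beta-function integral (117c) are easy to test numerically. The sketch below uses only the standard library; the midpoint-rule quadrature, the step count and the sample arguments `l = 1.5`, `m = 2.0` are illustrative choices, not part of the text.

```python
import math

def solid_angle(n_dim):
    """Total solid angle 2 pi^(N/2) / Gamma(N/2) that multiplies the radial integral in (117d)."""
    return 2 * math.pi ** (n_dim / 2) / math.gamma(n_dim / 2)

def beta_integral(l, m, steps=20000):
    """Midpoint-rule estimate of the integral of sin^(2l-1) t cos^(2m-1) t over [0, pi/2]."""
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.sin(t) ** (2 * l - 1) * math.cos(t) ** (2 * m - 1)
    return h * total

def beta_closed_form(l, m):
    """Right-hand side of (117c)."""
    return math.gamma(l) * math.gamma(m) / (2 * math.gamma(l + m))

print(solid_angle(2), 2 * math.pi)        # circumference of the unit circle
print(solid_angle(3), 4 * math.pi)        # area of the unit sphere
print(solid_angle(4), 2 * math.pi ** 2)   # the case relevant near 2*omega = 4
print(beta_integral(1.5, 2.0), beta_closed_form(1.5, 2.0))
```

Because the formula is written in terms of $\Gamma$-functions, it remains meaningful for non-integer $N$, which is what allows the continuation to $2\omega$ dimensions.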

Using results like (117d) and introducing the notation $2 - \omega = \epsilon$, one can evaluate the integral over $l$ in (116) completely and get
$$\Sigma(p) = -ie^2\mu^{2\epsilon}\,2^{-2\omega+1}\pi^{-\omega}\,\Gamma(\epsilon)\int_0^1 dx\,(M^2)^{-\epsilon}\bigl((2 - \epsilon)m + (1 - \epsilon)\not p(1 - x)\bigr) \tag{118}$$
or, alternatively,
$$\Sigma(p) = \frac{-ie^2}{8\pi^2}\,\Gamma(\epsilon)\int_0^1 dx\left(\frac{M^2}{4\pi\mu^2}\right)^{-\epsilon}\bigl[(\not p(1 - x) + 2m) - \epsilon(\not p(1 - x) + m)\bigr]. \tag{119}$$

The only singularities of this expression are the poles of the $\Gamma$-function as $\epsilon \to 0, -1$, etc., corresponding to the number of dimensions equal to 4, 6, etc. The pole at $\epsilon = 0$ is the manifestation of the ultraviolet divergence of the electron self-energy diagram. This pole term can be readily identified using
$$\Gamma(\epsilon) = \frac{1}{\epsilon} + \psi(1) + O(\epsilon), \qquad \psi(1) = -\gamma, \qquad \psi'(1) = \frac{\pi^2}{6}, \tag{120}$$
and performing an expansion in powers of $\epsilon$ in (119). Thus, we get
$$\Sigma(p) = \frac{-ie^2}{16\pi^2}\left[\frac{1}{\epsilon}(\not p + 4m) - \{\not p(1 + \gamma) + 2m(1 + 2\gamma)\} - 2\int_0^1 dx\,\ln\left(\frac{M^2}{4\pi\mu^2}\right)\{\not p(1 - x) + 2m\} + O(\epsilon)\right]. \tag{121}$$
This is precisely the expression previously obtained.
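The expansion of $\Gamma(\epsilon)$ around its pole, which is what isolates the divergent part of (119), can be checked directly with the standard library. In the sketch below the sample values of `eps` are arbitrary small numbers, and the coefficient of the $O(\epsilon)$ term, $\tfrac{1}{2}(\gamma^2 + \pi^2/6)$, is the standard next term of the Laurent series, included only as a consistency check.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant, i.e. -psi(1)

def pole_remainder(eps):
    """Gamma(eps) - 1/eps, which should tend to -gamma as eps -> 0."""
    return math.gamma(eps) - 1.0 / eps

# Next coefficient of the Laurent series: (gamma^2 + pi^2 / 6) / 2.
next_coeff = 0.5 * (EULER_GAMMA ** 2 + math.pi ** 2 / 6)

for eps in (1e-2, 1e-3, 1e-4):
    remainder = pole_remainder(eps)
    slope = (remainder + EULER_GAMMA) / eps
    print(f"eps={eps:g}  Gamma(eps) - 1/eps = {remainder:.6f}  slope = {slope:.4f}")
print(-EULER_GAMMA, next_coeff)
```

The remainder converges to $-\gamma$, which is the origin of the $\gamma$-dependent finite terms in (121).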


9.1. Vacuum Polarization

Fig. 7.

Using the Euclidean space Feynman rules in $2\omega$ dimensions, we get:


(122)

(123)


Introducing Feynman parameter integration and a new loop momentum $l \to l - px$, we get
$$\Pi_{\mu\nu} = -(e\mu^{2-\omega})^2\int_0^1 dx\int\frac{d^{2\omega}l}{(2\pi)^{2\omega}}\;\frac{\mathrm{Tr}\bigl[\gamma_\mu(\not l + \not p(1 - x) - m)\gamma_\nu(\not l - \not px - m)\bigr]}{[l^2 + m^2 + p^2x(1 - x)]^2}. \tag{124}$$

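In $2\omega$ dimensions we can choose the $\gamma$-matrices to be $2^\omega \times 2^\omega$ dimensional. Their properties (112) now tell us
$$\mathrm{Tr}(\gamma_\mu\gamma_\nu) = -2^\omega\delta_{\mu\nu}, \qquad \mathrm{Tr}(\gamma_\mu\gamma_\nu\gamma_\lambda\gamma_\rho) = 2^\omega(\delta_{\mu\nu}\delta_{\lambda\rho} - \delta_{\mu\lambda}\delta_{\nu\rho} + \delta_{\mu\rho}\delta_{\nu\lambda}). \tag{125}$$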

If we invoke the rule of integration over odd functions and use (125), we get (after
some algebra)
rr~v = _(ejl2-w)22W

1f
1
0

dx

d 2W l

(2n)2W x

b~vI[12 -+ m2 -+ p2x(1 -

X)]}.

(126)

Evaluating the integrations over $l$ by using the previously outlined procedure we get
$$\Pi_{\mu\nu} = \frac{e^2}{2\pi^2}\,(p_\mu p_\nu - \delta_{\mu\nu}p^2)\left[\frac{1}{6\epsilon} - \frac{\gamma}{6} - \int_0^1 dx\,x(1 - x)\ln\left(\frac{m^2 + p^2x(1 - x)}{2\pi\mu^2}\right) + O(\epsilon)\right]. \tag{127}$$
The pole at $\epsilon = 0$ and the manifest gauge invariance of the result are obvious.

Vertex Part

Fig. 8.

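$$\Gamma_\rho(p, q) = -i(e\mu^{2-\omega})^3\int\frac{d^{2\omega}l}{(2\pi)^{2\omega}}\;\gamma_\mu(\not p + \not l + m)^{-1}\gamma_\rho(\not q + \not l + m)^{-1}\gamma_\nu\,(\delta_{\mu\nu}/l^2) \tag{128}$$
$$= -i(e\mu^{2-\omega})^3\int\frac{d^{2\omega}l}{(2\pi)^{2\omega}}\;N_\rho(p, q, l)\,\bigl[l^2((p + l)^2 + m^2)((q + l)^2 + m^2)\bigr]^{-1}, \tag{129}$$
where
$$N_\rho(p, q, l) = \gamma_\mu(\not p + \not l - m)\gamma_\rho(\not q + \not l - m)\gamma_\mu. \tag{130}$$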

Use a two-parameter Feynman formula
$$\frac{1}{a_1a_2a_3} = 2!\int_0^1 dx\int_0^{1-x}dy\;[a_1x + a_2y + a_3(1 - x - y)]^{-3}, \tag{131}$$
$$\Gamma_\rho(p, q) = -2i(e\mu^{2-\omega})^3\int_0^1 dx\int_0^{1-x}dy\int\frac{d^{2\omega}l}{(2\pi)^{2\omega}}\;\frac{N_\rho(p, q, l)}{D^3}, \tag{132}$$
where
$$D = l^2 + 2p\cdot l\,x + (p^2 + m^2)x + 2l\cdot q\,y + (q^2 + m^2)y.$$
Substitute $l \to l - px - qy$ and perform the translation to obtain
$$D = l^2 + m^2(x + y) + p^2x(1 - x) + q^2y(1 - y) - 2p\cdot q\,xy \tag{133}$$
and
$$N_\rho = \gamma_\mu[\not l - \not qy + \not p(1 - x) - m]\,\gamma_\rho\,[\not l + \not q(1 - y) - \not px - m]\,\gamma_\mu. \tag{134}$$
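Feynman parameter formulas such as (131) are straightforward to confirm by direct numerical integration. The sketch below checks both the familiar one-parameter identity $1/(ab) = \int_0^1 dx\,[ax + b(1-x)]^{-2}$ and the two-parameter formula (131); the midpoint rule, the step counts and the sample values of $a$, $b$, $a_1$, $a_2$, $a_3$ are arbitrary positive test inputs, not quantities from the text.

```python
def one_parameter(a, b, steps=2000):
    """Midpoint estimate of the integral of [a x + b (1 - x)]^(-2) over 0 <= x <= 1."""
    h = 1.0 / steps
    return h * sum((a * x + b * (1 - x)) ** -2
                   for x in ((k + 0.5) * h for k in range(steps)))

def two_parameter(a1, a2, a3, steps=400):
    """Midpoint estimate of the right-hand side of (131), including the factor 2!."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # Inner integral over 0 <= y <= 1 - x, with its own step size.
        hy = (1.0 - x) / steps
        for j in range(steps):
            y = (j + 0.5) * hy
            total += hy * (a1 * x + a2 * y + a3 * (1 - x - y)) ** -3
    return 2.0 * h * total

lhs_one, rhs_one = 1.0 / (2.0 * 5.0), one_parameter(2.0, 5.0)
lhs_two, rhs_two = 1.0 / (1.0 * 2.0 * 3.0), two_parameter(1.0, 2.0, 3.0)
print(lhs_one, rhs_one)
print(lhs_two, rhs_two)
```

In the vertex calculation the three factors $a_1$, $a_2$, $a_3$ are the two electron denominators and the photon denominator of (129), which is how the single combined denominator $D$ arises.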

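Only the term quadratic in $l$ in $N_\rho$ leads to a divergent integral; the rest of the integral is convergent. One can write
$$\Gamma_\rho = \Gamma_\rho^{(1)} + \Gamma_\rho^{(2)},$$
where $\Gamma_\rho^{(1)}$, the divergent part after evaluating the $l$ integral, can be written as
$$\Gamma_\rho^{(1)}(p, q) = -i\,\frac{(e\mu^{2-\omega})^3}{(4\pi)^\omega}\,\frac{\Gamma(2 - \omega)}{2}\int_0^1 dx\int_0^{1-x}dy\;\gamma_\mu\gamma_\tau\gamma_\rho\gamma_\tau\gamma_\mu\,\bigl[m^2(x + y) + p^2x(1 - x) + q^2y(1 - y) - 2p\cdot q\,xy\bigr]^{\omega-2}. \tag{135}$$
The convergent part is
$$\Gamma_\rho^{(2)}(p, q) = -\frac{i(e\mu^{2-\omega})^3}{16\pi^2}\int_0^1 dx\int_0^{1-x}dy\;\gamma_\mu[\not p(1 - x) - \not qy - m]\,\gamma_\rho\,[\not q(1 - y) - \not px - m]\,\gamma_\mu\,\bigl[m^2(x + y) + p^2x(1 - x) + q^2y(1 - y) - 2p\cdot q\,xy\bigr]^{-1}. \tag{136}$$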

A useful Dirac identity is
$$\gamma_\mu\gamma_\alpha\gamma_\beta\gamma_\rho\gamma_\mu = 2\gamma_\rho\gamma_\beta\gamma_\alpha - 2(2 - \omega)\gamma_\alpha\gamma_\beta\gamma_\rho. \tag{136a}$$
Using this and the previous identity (Equation (112)), we can perform a Gordon decomposition of the numerator $N_\rho$, keeping in mind that we had $\bar u(p)\Lambda_\rho(p, q)u(q)$, namely the momenta $p$ and $q$ are on the mass shell. Thus $\Lambda_\rho$ can be shown to have a piece proportional to $\gamma_\rho$ and another piece proportional to $\sigma_{\rho\tau}(q - p)_\tau$. It is this latter term that we are interested in at the moment. Making use of $p^2 = -m^2$ and $q^2 = -m^2$, and the Dirac equation for momenta $p$ and $q$, one can prove, after some algebra, that
$$\Gamma_\rho^{mag}(p, q) = -\frac{e^2}{8m\pi^2}\,\sigma_{\rho\tau}(q - p)_\tau. \tag{137}$$
This is precisely the induced magnetic interaction of the electron and the numerical value of the anomalous magnetic moment to this order turns out to be $\alpha/2\pi$.

Further Reading
1. C. Itzykson and J. B. Zuber, Quantum Field Theory, McGraw-Hill, 1985.
2. L. H. Ryder, Quantum Field Theory, Cambridge Univ. Press, 1985.
3. J. D. Bjorken and S. D. Drell, Relativistic Quantum Fields, McGraw-Hill, 1965.

10. Introduction to Particle Physics, Symmetries and Conservation Laws

J. PASUPATHY
Centre for Theoretical Studies, Indian Institute of Science, Bangalore 560 012, India

1. Introduction

'I want to know how God created this world. I am not interested in this or that
phenomenon, in the spectrum of this or that element. I want to know His
thoughts; the rest are details.' - A. Einstein.

Symmetry principles play a fundamental role in our understanding of elementary
particles and their interactions. These concepts have their origin in what is almost
a commonplace observation. Thus, for example, time translation invariance is an
expression of the fact that the outcome of a collision experiment is independent of
when it is performed. Similarly, invariance under spatial translations reflects the
independence of the results from the location of the apparatus, and invariance
under rotation arises from the independence of the results from the orientation of the
apparatus. In other words, there is no privileged position in space and time or
orientation in space. The special theory of relativity extends this equivalence class
to all inertial observers. Thus, we have invariance under Lorentz transformations.
The extended symmetry which includes Lorentz transformations and spacetime
translations is called Poincaré invariance.
Symmetry principles lead to various conservation laws and selection rules. For
example, energy and momentum conservation follow from, respectively, time
translation and space translation invariance. As an illustration of a selection rule,
consider the decay of the strange particle, the charged kaon $K^+$, into a charged pion
$\pi^+$ and a photon:

$$K^+ \to \pi^+ + \gamma; \quad \text{Forbidden.}$$

This transition is forbidden by angular momentum conservation. The photon is
a massless spin 1 particle and has helicity, or component of angular momentum
along its direction of motion, equal to $\pm 1$, while $K^+$ and $\pi^+$ are spin zero
particles.
The principle of gauge invariance, which we shall describe at length later, not
only leads to conservation of electric charge but also dictates the structure of the
interaction between particles that carry electric charge.
B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 163-184.
© 1989 by Kluwer Academic Publishers.

At the present time, quarks and leptons are regarded as elementary particles. It
is good to recall that the set of particles which are regarded as elementary has
always changed with the course of time. Prior to 1930, one tended to regard the
proton and electron as elementary particles. The discovery of the neutron by
Chadwick in 1932 added to the list. In 1933, Pauli introduced, purely on
theoretical grounds, the neutrino, whose existence was established by Cowan and
Reines nearly two decades later. The pi meson or pion was introduced in 1935 by
Yukawa to explain nuclear forces and was discovered by C. F. Powell and
collaborators in cosmic rays about a decade later. It is interesting to note that Powell's
discovery is based on using the photographic emulsion technique, first used by
D. M. Bose to obtain tracks of cosmic rays. The number of elementary particles
has proliferated very rapidly with the increasingly higher-energy particle beams that
have become available for experimental study. Returning to quarks and leptons,
they are regarded as elementary since, at the currently available energies, we are
able to probe distances down to $10^{-15}$ cm and at these length scales there is no
evidence for any substructure.

1.1. Units, Dimensions and Orders of Magnitude

We shall use the so-called natural units in which $c = \hbar = 1$. In CGS units we
recall $c = 2.998 \times 10^{10}$ cm/sec and $\hbar = h/2\pi = 1.0546 \times 10^{-27}$ erg sec. The
electron volt, denoted by eV, is the energy unit. We have 1 eV $= 1.602 \times 10^{-12}$ erg.
The charge on the proton is $e = 4.803 \times 10^{-10}$ esu. Useful conversion factors to
remember are

$\hbar = 6.582 \times 10^{-22}$ MeV sec,
$\hbar c = 197.3$ MeV fm, 1 fm $= 10^{-13}$ cm,
$(\hbar c)^2 = 0.389$ (GeV)$^2$ mb, 1 mb $= 10^{-27}$ cm$^2$,

and Boltzmann's constant $k = 1.3807 \times 10^{-16}$ erg/K $= 8.62 \times 10^{-5}$ eV/K.
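The conversion factors listed above follow from the CGS values quoted in the text. The short Python sketch below redoes that arithmetic as a consistency check. The constant and function names are illustrative choices, not notation from the text, and Boltzmann's constant is taken as $1.3807 \times 10^{-16}$ erg/K, the CGS value consistent with $8.62 \times 10^{-5}$ eV/K.

```python
# CGS inputs as quoted in the text; all names here are illustrative.
C_CM_PER_S = 2.998e10        # speed of light, cm/sec
HBAR_ERG_S = 1.0546e-27      # hbar = h / (2 pi), erg sec
ERG_PER_EV = 1.602e-12       # 1 eV expressed in erg
K_ERG_PER_K = 1.3807e-16     # Boltzmann's constant, erg/K (assumed CGS value)
CM_PER_FM = 1e-13            # 1 fm in cm
CM2_PER_MB = 1e-27           # 1 mb in cm^2


def hbar_in_mev_sec():
    """hbar converted from erg sec to MeV sec."""
    return HBAR_ERG_S / ERG_PER_EV / 1e6


def hbar_c_in_mev_fm():
    """hbar * c in MeV fm."""
    return hbar_in_mev_sec() * C_CM_PER_S / CM_PER_FM


def hbar_c_squared_in_gev2_mb():
    """(hbar * c)^2 in GeV^2 mb, used to convert GeV^-2 into a cross-section."""
    hbar_c_gev_cm = hbar_in_mev_sec() * C_CM_PER_S / 1e3
    return hbar_c_gev_cm ** 2 / CM2_PER_MB


def boltzmann_in_ev_per_kelvin():
    """Boltzmann's constant converted from erg/K to eV/K."""
    return K_ERG_PER_K / ERG_PER_EV


if __name__ == "__main__":
    print("hbar       [MeV sec] :", hbar_in_mev_sec())
    print("hbar c     [MeV fm]  :", hbar_c_in_mev_fm())
    print("(hbar c)^2 [GeV^2 mb]:", hbar_c_squared_in_gev2_mb())
    print("k          [eV/K]    :", boltzmann_in_ev_per_kelvin())
```

In natural units a quantity quoted in GeV$^{-2}$ becomes a cross-section on multiplying by $(\hbar c)^2$, and a width in MeV becomes a lifetime through $\hbar/\Gamma$, which is why these particular combinations are the ones worth remembering.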

1.2. Types of Interactions


The universal character of gravitational and electromagnetic interactions is
described by the Einstein and Maxwell equations. The success of Fermi's theory of
$\beta$-decay also established the nuclear weak interaction as being universal in character.
Writing the transition probability as

$$G_F^2 \times |\text{nuclear matrix element}|^2 \times \text{phase space},$$

the enormous variations in the observed decay rates can be entirely described by the
second and third factors, with a universal value for $G_F$:

$$G_F = 1.166 \times 10^{-5}\ \mathrm{GeV}^{-2}. \tag{1.1}$$

Therefore, the $\beta$ interaction is really weak compared to electromagnetic
interactions. In contrast, nuclear or strong interactions are characterized by large
coupling strengths. For example, the pion-nucleon coupling constant $g^2/\hbar c \approx 15$
is much larger than even the electromagnetic coupling constant $e^2/\hbar c = \alpha =
1/137$. The existence of yet another interaction, CP violation (cf. Section 6.6), was
discovered in 1964. Empirical information is sparse and, theoretically, it remains
very much of an enigma.

1.3. Cross-sections and Lifetimes

The strength of different interactions is reflected in the cross-sections of various
reactions. As an illustration, consider the neutron-proton cross-section at
$E_{\mathrm{lab}} \approx 1$ GeV:

$$\sigma(np) \approx 30\ \mathrm{mb} = 30 \times 10^{-27}\ \mathrm{cm}^2. \tag{1.2}$$

The dominant contribution to the cross-section, of course, comes from the strong
interaction between the neutron and proton. This is to be contrasted with the
cross-section for an electron-positron pair to annihilate and materialize into
a muon pair, which is given by the formula

$$\sigma(e^+e^- \to \mu^+\mu^-) = \frac{4\pi\alpha^2}{3s} \approx \frac{86 \times 10^{-33}\ \mathrm{cm}^2}{s\ \text{in (GeV)}^2}. \tag{1.3}$$

Here $s$ is the square of the energy in the centre of mass of the electron and
positron. For $\sqrt{s} = 1$ GeV, it is seen that

$$\sigma(e^+e^- \to \mu^+\mu^-) \ll \sigma(np).$$

The cross-section for scattering neutrinos of energy $E_\nu$ on protons is given by

$$\sigma(\nu p) \approx G_F^2\, m_p E_\nu \approx 0.6 \times 10^{-38} \left(\frac{E_\nu}{\mathrm{GeV}}\right) \mathrm{cm}^2.$$

The scattering here is due to weak interaction and, hence, the cross-section is
much smaller than in (1.3). Since the neutrino couples to the rest of the matter
only through its weak interaction, it is a clean probe to study the structure of
matter. In passing, we note that Equation (1.3) exhibits scaling with energy. There
are no dimensional parameters and, therefore, its experimental validity indicates
that the electron and muon have no substructures which can be seen at currently
available energies.
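As a rough numerical illustration of the comparison made above, the following Python sketch evaluates the lowest-order expression $4\pi\alpha^2/3s$ for $e^+e^- \to \mu^+\mu^-$ and sets it against the neutron-proton cross-section of Equation (1.2). The function and constant names are illustrative, and the inputs ($\alpha = 1/137$, $(\hbar c)^2 = 0.389$ GeV$^2$ mb, $\sigma(np) = 30$ mb) are the rounded values quoted in the text.

```python
import math

ALPHA = 1.0 / 137.0           # fine-structure constant, rounded as in the text
HBARC2_GEV2_MB = 0.389        # (hbar c)^2 in GeV^2 mb
CM2_PER_MB = 1e-27            # 1 mb in cm^2
SIGMA_NP_CM2 = 30.0 * CM2_PER_MB   # neutron-proton cross-section near 1 GeV


def sigma_mu_pair_cm2(sqrt_s_gev):
    """Lowest-order e+ e- -> mu+ mu- cross-section in cm^2 (masses neglected)."""
    s = sqrt_s_gev ** 2
    sigma_natural = 4.0 * math.pi * ALPHA ** 2 / (3.0 * s)   # in GeV^-2
    return sigma_natural * HBARC2_GEV2_MB * CM2_PER_MB


def strong_to_electromagnetic_ratio(sqrt_s_gev):
    """Ratio sigma(np) / sigma(e+ e- -> mu+ mu-) at the given energy."""
    return SIGMA_NP_CM2 / sigma_mu_pair_cm2(sqrt_s_gev)


if __name__ == "__main__":
    print("sigma(mu pair) at 1 GeV [cm^2]:", sigma_mu_pair_cm2(1.0))
    print("sigma(np) / sigma(mu pair)    :", strong_to_electromagnetic_ratio(1.0))
```

Because the only dimensional quantity in the formula is $s$, the cross-section falls as $1/s$, which is the scaling behaviour referred to in the text.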
With the exception of the electron, proton and neutrino, the other known
particles are unstable. Their lifetimes are a reflection of the strength of the
interaction responsible for their decay. Thus, for example, the charged pion $\pi^+$,
which mostly decays into a muon and a neutrino through weak interactions, has
a lifetime $\tau_{\pi^+} = 2.6 \times 10^{-8}$ sec, while its neutral partner $\pi^0$, which decays mostly
into a pair of photons, has a lifetime $\tau_{\pi^0} = 0.87 \times 10^{-16}$ sec.

2. Charge Independence of Nuclear Forces - Isotopic Spin

Soon after the discovery of the neutron, the concept of isotopic spin invariance
was introduced by Heisenberg. In this picture, the proton and neutron are
regarded as two components of a single entity called the nucleon, just as the two
states of an electron with spin projections $m_s = \pm\tfrac{1}{2}$ are two different states of the
same particle. The two charge states of the nucleon, the proton and neutron, are
different projections of the third component of the isotopic spin. Further,
generalizing the concept of rotation invariance, one demands that the
nucleon-nucleon interaction be invariant under rotations in isotopic spin space. The
empirical motivation for this hypothesis comes from an examination of nuclear
binding energies. Compare, for example, the binding energies (BE) of He$^3$ and H$^3$.

Fig. 1.

Experimentally, we have BE(H$^3$) $-$ BE(He$^3$) $\approx 0.7$ MeV, while each nucleus has
a BE of the order of 8 MeV. This strongly suggests the equality of the p-p, n-n and
n-p forces.

2.1. Consequences of Isotopic Spin Invariance - Illustrative examples

Recall that rotation invariance implies that the angular momentum generators
$J_x$, $J_y$, $J_z$ commute with the Hamiltonian $H$:

$$[H, J_i] = 0.$$

This implies that the eigenstates of $H$, $H\psi = E\psi$, form a representation of the
rotation group. We have degenerate multiplets designated by the eigenvalues of

$$J^2 = J_x^2 + J_y^2 + J_z^2 = j(j+1)$$

and the $(2j+1)$ eigenvalues of

$$J_z = -j,\ -j+1,\ \ldots,\ j-1,\ j.$$

Mathematically, the isospin group is identical to the rotation group. We
introduce $T_x$, $T_y$, $T_z$ as the generators of isotopic spin with the commutation
properties

$$[T_i, T_j] = i\epsilon_{ijk}T_k, \qquad [H, T^2] = 0, \qquad [H, T_i] = 0,$$

and we characterize the physical states by the eigenvalues of $T^2$ and $T_z$. The
nucleon has $T = \tfrac{1}{2}$, with the proton being identified with the state with $T_z = \tfrac{1}{2}$ and
the neutron with the state with $T_z = -\tfrac{1}{2}$. The rules for adding isotopic spin are
identical to the ones for angular momentum addition. Thus, combining two
nucleons, we can produce both $T = 1$ and $T = 0$ states as follows:

$$\begin{array}{lll}
|p\rangle|p\rangle; & T = 1,\ T_z = 1 & \\
\frac{1}{\sqrt{2}}\{|p\rangle|n\rangle + |n\rangle|p\rangle\}; & T = 1,\ T_z = 0 & \text{(triplet)} \\
|n\rangle|n\rangle; & T = 1,\ T_z = -1 & \\
\frac{1}{\sqrt{2}}\{|p\rangle|n\rangle - |n\rangle|p\rangle\}; & T = 0,\ T_z = 0 & \text{(singlet)}
\end{array}$$
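The assignment of total isospin to the four two-nucleon states can be checked directly. The following Python sketch is a minimal illustration, not part of the original text: it applies $T^2 = T_1^2 + T_2^2 + 2T_{1z}T_{2z} + T_{1+}T_{2-} + T_{1-}T_{2+}$ to a state written as a dictionary of amplitudes and reads off $T$ from the eigenvalue $T(T+1)$. The representation of states and all function names are assumptions made for the example.

```python
import math

TZ = {"p": 0.5, "n": -0.5}      # third component of isospin for each nucleon
RAISE = {"n": "p"}              # T+ |n> = |p>,  T+ |p> = 0
LOWER = {"p": "n"}              # T- |p> = |n>,  T- |n> = 0


def apply_t_squared(state):
    """Apply T^2 to a two-nucleon state {(a, b): amplitude}."""
    out = {}
    for (a, b), amp in state.items():
        # T1^2 + T2^2 + 2 T1z T2z acts diagonally on product states.
        diagonal = 0.75 + 0.75 + 2.0 * TZ[a] * TZ[b]
        out[(a, b)] = out.get((a, b), 0.0) + diagonal * amp
        if a in RAISE and b in LOWER:        # T1+ T2- term
            key = (RAISE[a], LOWER[b])
            out[key] = out.get(key, 0.0) + amp
        if a in LOWER and b in RAISE:        # T1- T2+ term
            key = (LOWER[a], RAISE[b])
            out[key] = out.get(key, 0.0) + amp
    return out


def total_isospin(state, tol=1e-12):
    """Return T if the state is an eigenstate of T^2, otherwise None."""
    image = apply_t_squared(state)
    ref = max(state, key=lambda k: abs(state[k]))
    eigenvalue = image.get(ref, 0.0) / state[ref]
    for key in set(state) | set(image):
        if abs(image.get(key, 0.0) - eigenvalue * state.get(key, 0.0)) > tol:
            return None
    # Solve T (T + 1) = eigenvalue for T >= 0.
    return (-1.0 + math.sqrt(1.0 + 4.0 * eigenvalue)) / 2.0


if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    print(total_isospin({("p", "p"): 1.0}))                 # triplet member
    print(total_isospin({("p", "n"): s, ("n", "p"): s}))    # triplet member
    print(total_isospin({("p", "n"): s, ("n", "p"): -s}))   # singlet
    print(total_isospin({("p", "n"): 1.0}))                 # not an eigenstate
```

A product state such as $|p\rangle|n\rangle$ on its own is not an eigenstate of $T^2$, which is why the symmetric and antisymmetric combinations appear in the list above.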

From the Pauli principle, we know that the wavefunction of an assembly of
identical fermions is antisymmetric with respect to the exchange of their spatial
and spin coordinates. It follows, then, that the diproton state when $L = 0$ must be
a spin singlet, i.e. must be in a ${}^1S_0$ state. A priori, there is no Pauli principle
restriction for the n-p system, since n and p are nonidentical. However, when we invoke
isotopic spin invariance, we have a generalized Pauli principle, namely we
require the two-nucleon wavefunction to be antisymmetric under the exchange
of spatial, spin and isospin coordinates. Consequently, the n-p system in $T = 0$
can exist only in ${}^3S_1$ states, while the $T = 1$ state can exist only in the ${}^1S_0$
configuration. Experimentally, we find that there is only one bound state, the
deuteron, in $T = 0$, while there are no bound states in $T = 1$ (dineutron bound
states are not seen). The triplet of charged and neutral pions $\pi^+$, $\pi^-$ and $\pi^0$ belongs
to the $T = 1$ representation of the isospin group.

T = 1           $\pi^+$     $\pi^-$     $\pi^0$
$T_z$             1        -1         0
Mass (MeV)      139.6     139.6     135.0

It is seen that the masses are nearly degenerate. Just as there is a slight difference
in the masses of the proton and neutron, there is also a slight difference in the
masses of charged and neutral pions ($\pi^+$ and $\pi^-$, being antiparticles of each other,
have identical masses). The differences $m_n - m_p$, $m_{\pi^+} - m_{\pi^0}$ arise from effects
other than strong interactions, i.e. electromagnetism. (The difference in the
masses of the up and down quarks (see later) is also a contributing factor.)
As an illustration of the experimental implication of isotopic spin invariance,
consider the nucleon-nucleon scattering into a pion and a deuteron:

$$n + p \to \pi^0 + d, \qquad p + p \to \pi^+ + d.$$

Since, as we have seen, the deuteron has $T = 0$, the final state of $\pi d$ is a $T = 1$ state.
How about the initial states? In the second reaction, the initial state has
$T_z = \tfrac{1}{2} + \tfrac{1}{2} = 1$, so that it is in a purely $T = 1$ state. On the other hand, in the first
reaction, the initial state is a superposition of $T = 1$ and $T = 0$ states.

Because of isospin conservation, the reaction proceeds only through the $T = 1$
component so that, for the scattering amplitudes, we have the relation

$$\text{amplitude}(np \to \pi^0 d) = \frac{1}{\sqrt{2}}\,\text{amplitude}(pp \to \pi^+ d).$$

Expressed in terms of cross-sections, this implies that they are in the ratio of 1:2,
which is verified experimentally.
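The factor $1/\sqrt{2}$ is just the overlap of the $np$ state with the $T = 1$, $T_z = 0$ combination. A minimal Python sketch of this projection is given below; the dictionary representation of the states and the function names are illustrative assumptions, and the sketch takes for granted (as in the text) that only the $T = 1$ component contributes and that the $pp$ state is purely $T = 1$.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Two-nucleon isospin states with T_z = 0, written as {(nucleon1, nucleon2): amplitude}.
TRIPLET_0 = {("p", "n"): S, ("n", "p"): S}     # T = 1, T_z = 0
SINGLET = {("p", "n"): S, ("n", "p"): -S}      # T = 0, T_z = 0


def overlap(a, b):
    """Inner product of two states with real amplitudes."""
    return sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))


def np_isospin_components():
    """Amplitudes of the T = 1 and T = 0 parts of the |n>|p> state."""
    initial = {("n", "p"): 1.0}
    return overlap(TRIPLET_0, initial), overlap(SINGLET, initial)


def cross_section_ratio():
    """sigma(np -> pi0 d) / sigma(pp -> pi+ d) from the T = 1 projection alone."""
    c_triplet, _ = np_isospin_components()
    return c_triplet ** 2


if __name__ == "__main__":
    print("T = 1 and T = 0 amplitudes of |np>:", np_isospin_components())
    print("cross-section ratio:", cross_section_ratio())
```

The squared amplitude of the $T = 1$ component is one half, which reproduces the 1:2 ratio of cross-sections.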
In the early fifties, with the construction of large accelerators, pion beams became
available. The first experiments on pion-nucleon scattering were carried out by
Fermi and collaborators, who discovered a resonance in $\pi^+ p$ scattering at a mass
of 1230 MeV. Here the mass is obtained from the invariant mass of the initial pion
and the proton.

This resonance is clearly a doubly charged $\Delta^{++}$. This state, which is called the
isobar, has $T = \tfrac{3}{2}$. Using $\pi^-$ beams and nuclear targets which have neutrons, the
existence of the other charge states $\Delta^+$, $\Delta^0$ and $\Delta^-$ has been established. It has
a width of about 110 MeV or a lifetime

$$\tau = \frac{\hbar}{\Gamma} \approx 6 \times 10^{-24}\ \text{sec}.$$

This is roughly the time taken by light to traverse a particle with a radius equal to
a fermi. In the last three decades, a very large number of resonant states have
been discovered in pion-nucleon scattering in different isotopic spin and angular
momentum channels.
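The lifetime quoted for the isobar follows from its width through $\tau = \hbar/\Gamma$. The Python sketch below repeats this estimate and compares it with the time light needs to cross a diameter of 2 fm. The function names are illustrative, and the inputs are the rounded values of $\hbar$ and $c$ used in this chapter.

```python
HBAR_MEV_SEC = 6.582e-22     # hbar in MeV sec
C_CM_PER_S = 2.998e10        # speed of light in cm/sec
CM_PER_FM = 1e-13            # 1 fm in cm


def lifetime_from_width(width_mev):
    """Lifetime in seconds of a state with total width Gamma (in MeV)."""
    return HBAR_MEV_SEC / width_mev


def light_crossing_time(radius_fm):
    """Time in seconds for light to cross a diameter of 2 * radius."""
    return 2.0 * radius_fm * CM_PER_FM / C_CM_PER_S


if __name__ == "__main__":
    print("isobar lifetime for a 110 MeV width [sec]:", lifetime_from_width(110.0))
    print("light-crossing time for radius 1 fm [sec]:", light_crossing_time(1.0))
```

Both numbers come out at a few times $10^{-24}$ sec, the natural time scale of the strong interactions.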

3. Strange Particles

In the same year, 1947, when the existence of the pion was experimentally
established, a strange cosmic ray event was observed by Rochester and Butler. It
corresponded to the decay of a heavy neutral particle with a mass of around
$1000\,m_e$. The next few years saw the discovery of a variety of such events. The
most puzzling aspect of the new particles, or 'strange particles', was the following.
The cross-section for the production of these particles through collisions of
cosmic rays with nuclei was comparable to that for the production of pions,
suggesting, therefore, that the production takes place through nuclear or strong
interactions. On the other hand, these particles had a relatively long lifetime, in
the range $10^{-8}$ to $10^{-10}$ sec. If the decay mechanism were also due to strong
interactions, then, as we have seen in the case of the nucleon isobar $\Delta$, we would expect
the lifetime to be in the range $\sim 10^{-23}$ sec. In 1952, Pais resolved this problem by
suggesting that strange particles are produced in pairs in strong interactions,
while they decay singly. One can then associate with them a new quantum number which is
conserved during production processes, while it is violated in the decay process,
which accounts for the apparent inhibition of the latter. The strangeness
classification for the new particles was introduced by Nakano and Nishijima in
1953 and independently by Gell-Mann. For example, the neutral meson $K^0$,
produced in the reaction

$$\pi^- p \to \Lambda K^0,$$

is assigned the strangeness quantum number $+1$, while the corresponding heavy
particle, or hyperon, $\Lambda$ is assigned $-1$. Thus strangeness is conserved in the production
process, which takes place through the strong interaction. On the other hand, in the
decay of the $\Lambda$ hyperon,

$$\Lambda \to p\pi^-,$$

strangeness is not conserved and, therefore, it is not a strong interaction. In fact,
the decay proceeds only by the weak interaction, which does not respect the
conservation of strangeness. Similarly, in the decay of $K^0$,

$$K^0 \to \pi^+ + \pi^-,$$

again strangeness is not conserved and it is a weak decay.


The concept of isospin can be extended to strange particles by using the
Gell-Mann-Nishijima formula

$$Q = T_3 + \frac{Y}{2}, \qquad Y = N + S,$$

where $N$ is the baryon number (cf. Section 4), which is assigned the value $+1$ for
nucleons and hyperons and zero for mesons, both strange and nonstrange. The
nucleons, pions, kaons and hyperons have the quantum numbers displayed in
Table I.
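The Gell-Mann-Nishijima formula can be verified particle by particle. The Python sketch below does this for the mesons and baryons discussed in this section, using the standard assignments of $T_3$, baryon number, strangeness and charge; the particle labels and function names are illustrative choices made for the example.

```python
from fractions import Fraction as F

# name: (T3, baryon number, strangeness, electric charge)
PARTICLES = {
    "pi+":    (F(1),     0,  0,  1),
    "pi-":    (F(-1),    0,  0, -1),
    "pi0":    (F(0),     0,  0,  0),
    "eta":    (F(0),     0,  0,  0),
    "K+":     (F(1, 2),  0,  1,  1),
    "K0":     (F(-1, 2), 0,  1,  0),
    "K0bar":  (F(1, 2),  0, -1,  0),
    "K-":     (F(-1, 2), 0, -1, -1),
    "p":      (F(1, 2),  1,  0,  1),
    "n":      (F(-1, 2), 1,  0,  0),
    "Lambda": (F(0),     1, -1,  0),
    "Sigma+": (F(1),     1, -1,  1),
    "Sigma0": (F(0),     1, -1,  0),
    "Sigma-": (F(-1),    1, -1, -1),
    "Xi0":    (F(1, 2),  1, -2,  0),
    "Xi-":    (F(-1, 2), 1, -2, -1),
}


def charge_from_formula(t3, baryon_number, strangeness):
    """Q = T3 + Y/2 with hypercharge Y = N + S."""
    hypercharge = baryon_number + strangeness
    return t3 + F(hypercharge, 2)


def failures():
    """Names of particles whose charge disagrees with the formula."""
    return [name for name, (t3, b, s, q) in PARTICLES.items()
            if charge_from_formula(t3, b, s) != q]


if __name__ == "__main__":
    for name, (t3, b, s, q) in PARTICLES.items():
        print(f"{name:7s} Q = {q:+d}   formula gives {charge_from_formula(t3, b, s)}")
    print("failures:", failures())
```

Exact fractions are used so that the half-integer values of $T_3$ are compared without rounding.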

Table I

Pseudoscalar mesons, B = 0

Particle          T       T3       S     Mass in MeV
$\pi^+$           1       +1       0       139.6
$\pi^-$           1       -1       0       139.6
$\pi^0$           1        0       0       135
$\eta$            0        0       0       548.8
$K^+$            1/2     +1/2     +1       493.7
$K^0$            1/2     -1/2     +1       497.7
$\bar{K}^0$      1/2     +1/2     -1       497.7
$K^-$            1/2     -1/2     -1       493.7

Spin-half baryons, B = 1

Particle          T       T3       S     Mass in MeV
$p$              1/2     +1/2      0       938.3
$n$              1/2     -1/2      0       939.6
$\Lambda$         0        0      -1      1115.6
$\Sigma^+$        1       +1      -1      1189.4
$\Sigma^0$        1        0      -1      1192.5
$\Sigma^-$        1       -1      -1      1197.3
$\Xi^0$          1/2     +1/2     -2      1314.9
$\Xi^-$          1/2     -1/2     -2      1321.3

4. Nucleon Number Conservation

It has been experimentally observed in pp collisions that the total number of
nucleons (counting antinucleons negatively) in the final state is the same as in the
initial state. Thus, for example, the reactions

$$pp \to nn\pi^+\pi^+, \qquad pp \to nnp\bar{p}\pi^+\pi^+, \qquad pp \to ppn\bar{n}, \quad \text{etc.}$$

are seen while, for example, $pp \to ppnn$, $pnn\pi^+$ are never seen. This
observation has led to the introduction of a quantum number, called the
nucleon number or baryon number, which is assigned a value $+1$ for nucleons, $-1$
for their anti-particles and zero for pions. It is interesting to ask whether
baryon-number conservation is on the same footing as electric charge conservation. For example, while

$$e^+ p \to e^+ p$$

is allowed, a reaction in which the total electric charge changes is never seen,
owing to electric-charge conservation. As we shall see in Section 7, this
is associated with gauge invariance and, in fact, the existence of an electromagnetic
field, or the light quantum with zero mass, can be deduced from this invariance
principle. This led Lee and Yang in 1955 to ask whether there exists a long-range
gauge field corresponding to the baryon number. If this were so, we would expect
the gravitational force law to be modified according to

$$\text{Force} = -\frac{G m_1 m_2}{r^2} + \frac{\eta^2}{4\pi}\,\frac{N_1 N_2}{r^2},$$

where $m_1$, $m_2$ are the usual masses of the particles while $N_1$, $N_2$ are the nucleon
numbers of the particles. $\eta^2/4\pi$ will be the strength of this new force associated
with the baryon number. One knows from Eötvös experiments that $\eta^2/(4\pi G m_p^2) <
10^{-5}$, suggesting, therefore, that if there is any gauge field associated with
baryon-number conservation, then its coupling must be quite weak, even
compared to gravitation. Nowadays, it is widely believed that baryon-number
conservation is not an absolute law. It was pointed out in 1967 by Sakharov that
the observed asymmetry of the baryon number of the Universe in fact demands
that it cannot be absolutely conserved. Experimentally, it is estimated that the
ratio of the number of baryons $N_B$ to the number of photons $N_\gamma$ is

$$\frac{N_B}{N_\gamma} \approx 10^{-9}.$$

5. Lepton Number Conservation

It is known from the weak decays that the number of light mass particles (leptons)
is also conserved. Thus, for example, in

$$n \to p\,e^-\,\bar{\nu} \quad \text{(beta decay)}$$

the electron and the neutrino are assigned a lepton number $+1$, while the positron
and anti-neutrino are assigned $-1$, so that the lepton number is conserved. As an
example of lepton-number conservation, the process $e^- p \to e^+ n \pi^-$, which
conserves charge and baryon number, is not seen.
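The bookkeeping behind these selection rules is simple enough to automate. The Python sketch below assigns electric charge, baryon number and a single lepton number to each particle and reports which of the three totals change in a given reaction. The particle labels, the table of quantum numbers and the function names are illustrative; only one overall lepton number is used, as in the discussion so far.

```python
# name: (electric charge Q, baryon number B, lepton number L)
QUANTUM_NUMBERS = {
    "p": (1, 1, 0),      "n": (0, 1, 0),
    "pbar": (-1, -1, 0), "nbar": (0, -1, 0),
    "pi+": (1, 0, 0),    "pi-": (-1, 0, 0),   "pi0": (0, 0, 0),
    "e-": (-1, 0, 1),    "e+": (1, 0, -1),
    "mu-": (-1, 0, 1),   "mu+": (1, 0, -1),
    "nu": (0, 0, 1),     "nubar": (0, 0, -1),
    "gamma": (0, 0, 0),
}
LABELS = ("Q", "B", "L")


def totals(particles):
    """Sum Q, B and L over a list of particle names."""
    return tuple(sum(QUANTUM_NUMBERS[name][i] for name in particles)
                 for i in range(len(LABELS)))


def violated(initial, final):
    """Labels of the additive quantum numbers that differ between the two sides."""
    before, after = totals(initial), totals(final)
    return [label for label, b, a in zip(LABELS, before, after) if b != a]


if __name__ == "__main__":
    print(violated(["p", "p"], ["n", "n", "pi+", "pi+"]))   # allowed
    print(violated(["p", "p"], ["p", "p", "n", "n"]))       # baryon number changes
    print(violated(["e-", "p"], ["e+", "n", "pi-"]))        # lepton number changes
    print(violated(["n"], ["p", "e-", "nubar"]))            # beta decay, allowed
```

An empty list means that the reaction is not excluded by these three conservation laws; it may still be forbidden or suppressed for other reasons.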
The lepton numbers can also be extended to muons. So $\mu^-$ has $L = +1$ and $\mu^+$
has $L = -1$. An immediate question then is why, while the weak decay

$$\mu^- \to e^-\,\bar{\nu}\,\nu$$

is seen (and is, in fact, the dominant decay mode of the $\mu$), the decay

$$\mu^- \to e^-\gamma$$

has not been observed so far. If the neutrinos in $\mu^- \to e^-\bar{\nu}\nu$ were particles and
antiparticles of the same field, then one expects

$$\frac{\text{B.R.}(\mu^- \to e^-\gamma)}{\text{B.R.}(\mu^- \to e^-\bar{\nu}\nu)} \approx 10^{-3}\ \text{to}\ 10^{-4},$$
while the experimental limit is much below this value. The current limit is
2 x 10- 1 . This prompted Schwinger in 1957 to postulate the existence of two
types of neutrinos, one associated with the muons, denoted by $\nu_\mu$, and the one with
electrons, denoted by $\nu_e$. This hypothesis of Schwinger, the nonidentity of the
muon and electron neutrinos, neatly explains the absence of $\mu^- \to e^-\gamma$. In 1962,
Danby et al. directly established the distinction between $\nu_e$ and $\nu_\mu$, as follows.
Consider the neutrinos from the $\pi^-$ decay

$$\pi^- \to \mu^-\,\bar{\nu}_\mu.$$

The corresponding electronic mode, $\pi^- \to e^-\bar{\nu}_e$, is inhibited by the V-A theory
(see Section 7). In fact

$$\frac{\text{B.R.}(\pi^- \to e^-\bar{\nu}_e)}{\text{B.R.}(\pi^- \to \mu^-\bar{\nu}_\mu)} \approx 10^{-4},$$

so that, to a very good approximation, the neutral particles, or the neutrino beams,
obtained from $\pi^-$ decay consist mainly of $\bar{\nu}_\mu$. Using this beam against a proton target,
Danby et al. found that while the reaction

$$\bar{\nu}_\mu\,p \to \mu^+ n$$

takes place at the expected cross-section level, the reaction

$$\bar{\nu}_\mu\,p \to e^+ n$$

does not take place. This demonstrates

$$\nu_\mu \neq \nu_e.$$

6. Discrete Symmetries

In the early days of quantum mechanics, Wigner pointed out that the famous
Laporte's rule in atomic spectroscopy can be understood as a consequence of the
conservation of parity. Unlike symmetries such as rotation or translation, parity
$P$ is a discrete symmetry. Under $P$, we have

$$P:\quad x_1 \to -x_1, \qquad x_2 \to -x_2, \qquad x_3 \to -x_3.$$

$P$ is represented by a unitary operator in the space of physical states. Clearly
$P^2 = 1$ and, hence, the eigenvalues of $P$ are $\pm 1$, corresponding to even and odd
parity, respectively. Apart from the parity associated with the orbital part of the
wavefunction, there is also an intrinsic parity associated with the fields
corresponding to each particle. For example, the electromagnetic field, described
by the four-vector gauge potential $A_\mu(x)$, has the transformation law

$$\mathbf{A}(\mathbf{x},t) \to -\mathbf{A}(-\mathbf{x},t) \qquad\text{and}\qquad A_0(\mathbf{x},t) \to A_0(-\mathbf{x},t).$$

Since the pion has spin zero, it is described by a field transforming as a scalar under
rotation. However, in principle, it could be either a pseudoscalar or a scalar under
parity. Consider

$$\pi^0 \to 2\gamma.$$

As pointed out by Landau and Yang in 1948, this immediately proves that a pion
cannot be a spin 1 particle. Let $\boldsymbol{\epsilon}_1$ and $\boldsymbol{\epsilon}_2$ denote the polarization vectors of the two
photons. In the rest frame of $\pi^0$, the two photons emerge in opposite directions
with equal momenta, $\mathbf{k}$ and $-\mathbf{k}$.

Now the transversality condition requires $\boldsymbol{\epsilon}_1\cdot\mathbf{k} = \boldsymbol{\epsilon}_2\cdot\mathbf{k} = 0$. Using the vectors $\mathbf{k}$,
$\boldsymbol{\epsilon}_1$ and $\boldsymbol{\epsilon}_2$, we can form vectors such as

$$\boldsymbol{\epsilon}_1\times\boldsymbol{\epsilon}_2, \qquad (\boldsymbol{\epsilon}_1\cdot\boldsymbol{\epsilon}_2)\,\mathbf{k}.$$

These are forbidden by Bose statistics, which demands symmetry under particle
exchange. On the other hand, if the pion has spin zero, we can form two objects,

$$\boldsymbol{\epsilon}_1\cdot\boldsymbol{\epsilon}_2 \qquad\text{and}\qquad (\boldsymbol{\epsilon}_1\times\boldsymbol{\epsilon}_2)\cdot\mathbf{k},$$

which transform under parity as scalar and pseudoscalar, respectively. If the
intrinsic parity of the pion were even, then the matrix element must be of the form
$\boldsymbol{\epsilon}_1\cdot\boldsymbol{\epsilon}_2$, while if it were odd, i.e. the pion were a pseudoscalar, then the matrix
element must be of the form $(\boldsymbol{\epsilon}_1\times\boldsymbol{\epsilon}_2)\cdot\mathbf{k}$. With the scalar, the configuration in
which $\boldsymbol{\epsilon}_1$ and $\boldsymbol{\epsilon}_2$ are parallel is preferred, while with the pseudoscalar the polarizations are
preferentially orthogonal. By studying the materialization of the
photons into $e^+e^-$ pairs, their polarization can be found. By this method, it was
established that $\pi^0$ is indeed a pseudoscalar.
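The symmetry properties used in this argument can be checked numerically. In the Python sketch below, exchanging the two photons means swapping their polarization vectors and reversing $\mathbf{k}$, while parity reverses all three polar vectors. The vector helpers, the particular choice of vectors and the function names are illustrative assumptions made for the example.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(c, a):
    return tuple(c * x for x in a)


def exchange(e1, e2, k):
    """Swap the two photons: polarizations are interchanged and k -> -k."""
    return e2, e1, scale(-1.0, k)


def parity(e1, e2, k):
    """Space inversion: every polar vector changes sign."""
    return scale(-1.0, e1), scale(-1.0, e2), scale(-1.0, k)


def scalar_form(e1, e2, k):
    return dot(e1, e2)


def pseudoscalar_form(e1, e2, k):
    return dot(cross(e1, e2), k)


def vector_forms(e1, e2, k):
    return [cross(e1, e2), scale(dot(e1, e2), k)]


if __name__ == "__main__":
    angle = 0.7   # arbitrary angle between the two transverse polarizations
    config = ((1.0, 0.0, 0.0), (math.cos(angle), math.sin(angle), 0.0), (0.0, 0.0, 1.0))
    swapped, inverted = exchange(*config), parity(*config)
    print("scalar form:      ", scalar_form(*config), scalar_form(*swapped), scalar_form(*inverted))
    print("pseudoscalar form:", pseudoscalar_form(*config), pseudoscalar_form(*swapped),
          pseudoscalar_form(*inverted))
    print("vector forms:     ", vector_forms(*config), vector_forms(*swapped))
```

Both spin-zero forms are unchanged by photon exchange, whereas the two vector forms change sign, and only the triple product changes sign under parity.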

6.1. Parity of Charged Pions

Consider the reaction of the capture of negative pions by deuterium,

$$\pi^- d \to nn.$$

For slow pions, the capture takes place from rest, i.e. from the atomic S-orbit
($l = 0$). Since the deuteron has spin 1, the total angular momentum of the initial state
is $1 + 0 = 1$. It follows, therefore, that the total angular momentum of the final
state is also 1 by conservation of the total angular momentum. This allows the
following four possibilities for the two-neutron wavefunction: ${}^3S_1$, ${}^1P_1$, ${}^3P_1$ and ${}^3D_1$.
As neutrons are spin $\tfrac{1}{2}$ particles and obey Fermi statistics, the only allowed
possibility is ${}^3P_1$. The above reaction conserves parity. Using the parity of
the deuteron as $+1$, and the fact that ${}^3P_1$ has negative parity, we notice that the pion
must have negative parity.
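The counting of two-neutron states can be spelled out explicitly. The Python sketch below enumerates the ${}^{2S+1}L_J$ states with a given $J$, keeps those that are antisymmetric under exchange of the two identical fermions (which requires $L + S$ to be even), and then deduces the pion parity from parity conservation. The function names and the spectroscopic-letter table are illustrative.

```python
L_LETTERS = "SPDFGH"


def two_neutron_states(j_total):
    """All (name, L, S, pauli_allowed) two-neutron states with total angular momentum J."""
    states = []
    for s in (0, 1):
        for l in range(0, j_total + s + 1):
            if abs(l - s) <= j_total <= l + s:
                # Space part gives (-1)^L, spin part (-1)^(S+1); the product must be -1.
                allowed = (l + s) % 2 == 0
                states.append((f"{2 * s + 1}{L_LETTERS[l]}{j_total}", l, s, allowed))
    return states


def pion_parity_from_capture():
    """Intrinsic parity of the pion from pi- d -> n n with S-wave capture."""
    deuteron_parity = +1
    allowed = [state for state in two_neutron_states(1) if state[3]]
    (name, l, s, _), = allowed              # exactly one state survives
    final_parity = (-1) ** l                # the two neutron intrinsic parities cancel
    return final_parity // deuteron_parity  # initial orbital parity is +1 for l = 0


if __name__ == "__main__":
    for name, l, s, allowed in two_neutron_states(1):
        print(name, "allowed" if allowed else "excluded by Fermi statistics")
    print("pion parity:", pion_parity_from_capture())
```

Only the ${}^3P_1$ state passes the antisymmetry test, and its orbital parity $(-1)^1$ fixes the intrinsic parity of the pion as negative.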

Question: How do we know that parity is conserved in strong interactions?


One can study, for example, the scattering of pions on an unpolarized proton
target and study the polarization of the final proton. It is easy to see that if parity
is conserved, then this polarization must always be perpendicular to the
scattering plane.

6.2. Parity Violation in Weak Interactions

We have discussed strange particles (Section 3) and mentioned the existence of
heavy mesons with mass $\approx 1000\,m_e$. The theoretical situation was quite puzzling
in the early fifties, since there were apparently two strange mesons, called $\theta$ and
$\tau$, which have the decay modes

$$\theta^+ \to \pi^+\pi^0, \qquad \tau^+ \to \pi^+\pi^+\pi^-.$$

The masses of $\theta^+$ and $\tau^+$ were nearly the same and so were their lifetimes. On the
other hand, consider their parities. Assuming the conservation of parity in the
above decay processes, if both $\theta^+$ and $\tau^+$ were spin zero particles, clearly $\theta^+$ has
positive parity, while $\tau^+$ has negative parity. By an analysis of the final three-pion
energy spectrum, Dalitz established that $\tau^+$ is indeed a spin zero particle.
Here then is the puzzle: How can two particles having nearly identical masses and
lifetimes have different parities? In other words, are $\theta^+$ and $\tau^+$ the same or
different particles? This puzzle was brilliantly resolved in 1956 by T. D. Lee and
C. N. Yang who, after carefully examining the then existing experimental data,
concluded that indeed there was no evidence for parity conservation in weak
decays. Their suggestion that parity is violated in weak decays was confirmed by
C. S. Wu et al. by their experiment on the $\beta$-decay of polarized ${}^{60}$Co.

6.3. Charge Conjugation

As is well known, Dirac solved the problem of the negative energy states
predicted by his equation by postulating the existence of positrons, which are the
antiparticles of electrons. Formally, one can introduce an operator called charge
conjugation and show that the Dirac equation is invariant under this operation.
As an illustration of the selection rule following from charge conjugation
invariance, consider the decay of positronium (a loosely bound state of $e^+$ and
$e^-$). It exists in ortho and para forms. Ortho-positronium is a ${}^3S_1$ state, while
para-positronium is a ${}^1S_0$ state. Since charge conjugation interchanges
$e^+ \leftrightarrow e^-$, the same effect can also be obtained by interchanging the spatial and
spin coordinates of the electron and positron. Therefore, the eigenvalue of $C$ is
$(-1)^{L+S}$ and is equal to $-1$ for ortho-positronium and $+1$ for para-positronium.
The photon field is odd under charge conjugation. Consequently, ortho-positronium
can only decay into three photons, while para-positronium decays into two
photons.
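The selection rule can be stated compactly: a state of $n$ photons has $C = (-1)^n$, so the photon number must match the $C$ eigenvalue $(-1)^{L+S}$ of the positronium state. The Python sketch below finds the smallest allowed number of photons; the function names are illustrative, and the sketch assumes that decay into a single photon is excluded by energy and momentum conservation.

```python
def positronium_c_parity(l, s):
    """C eigenvalue of an e+ e- bound state with orbital L and total spin S."""
    return (-1) ** (l + s)


def photon_c_parity(n_photons):
    """C eigenvalue of a state of n photons, each of which is C-odd."""
    return (-1) ** n_photons


def minimum_photons(l, s):
    """Smallest photon number (at least 2) compatible with C invariance."""
    target = positronium_c_parity(l, s)
    n = 2   # a single photon cannot conserve both energy and momentum
    while photon_c_parity(n) != target:
        n += 1
    return n


if __name__ == "__main__":
    print("ortho-positronium (L = 0, S = 1):", minimum_photons(0, 1), "photons")
    print("para-positronium  (L = 0, S = 0):", minimum_photons(0, 0), "photons")
```

The extra photon costs a further power of $\alpha$, which is why ortho-positronium is much longer lived than para-positronium.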
As another example, consider the experimentally seen decay

$$\pi^0 \to 2\gamma.$$

It follows that $\pi^0$ has $C = +1$ and, therefore, $\pi^0 \to 3\gamma$ is forbidden. Experimentally,

$$\frac{\Gamma(\pi^0 \to 3\gamma)}{\Gamma(\pi^0 \to \text{all})} < 3.8 \times 10^{-7}.$$

For purely strong interactions, useful selection rules can be derived by
combining charge conjugation invariance and isospin invariance (Michel, Lee
and Yang). Since under charge conjugation particles go over into anti-particles,
clearly particles which have nonzero electric charge, hypercharge, etc., cannot be
its eigenstates. Thus, for example, the charged pions $\pi^\pm$ are not eigenstates of $C$.
On the other hand, we note that by using a rotation in the isospin space by
$\pi$ radians about the second axis, $\exp(-i\pi T_2)$, we can form the G-parity operator

$$G = C\exp(-i\pi T_2).$$

Now

$$\pi^+ = (\pi_1 + i\pi_2)/\sqrt{2}, \qquad \pi^- = (\pi_1 - i\pi_2)/\sqrt{2}, \qquad \pi^0 = \pi_3.$$

Therefore, under $\exp(-i\pi T_2)$ clearly $\pi_2 \to \pi_2$, $\pi_3 \to -\pi_3$ and $\pi_1 \to -\pi_1$, while $C$
takes $\pi_1 \to \pi_1$, $\pi_2 \to -\pi_2$ and $\pi_3 \to \pi_3$. It follows then that

$$G|\pi^+\rangle = -|\pi^+\rangle, \qquad G|\pi^-\rangle = -|\pi^-\rangle, \qquad G|\pi^0\rangle = -|\pi^0\rangle.$$

All the charge states of the pion are eigenstates of G-parity with eigenvalue $-1$.
As an application, consider the annihilation of antiprotons by neutrons at rest.
The G-parity of the $\bar{p}n$ system is

$$G(\bar{p}n) = (-1)^{L+S+1}.$$

For $L = 0$, $G = (-1)^{S+1}$. This implies that in the triplet state annihilation will
proceed through an even number of pions, while in the singlet state it proceeds into an odd
number of pions.

6.4. C-violation in Weak Interactions

Along with the recognition of parity violation, it was also recognized that charge
conjugation invariance is violated in weak interactions. Consider

$$\pi^+ \to \mu^+\nu_\mu \qquad\text{and}\qquad \pi^- \to \mu^-\bar{\nu}_\mu.$$

The $\nu_\mu$ in the first decay is always left-handed, while the anti-neutrino $\bar{\nu}_\mu$ in the $\pi^-$
decay is always right-handed. From charge conjugation on the first one, we
would conclude that the anti-neutrino in $\pi^-$ decay should also be left-handed,
since charge conjugation leaves the helicity unaltered. However, note that
under a parity operation, a left-handed particle becomes right-handed. Therefore,
although C and P are not individually conserved, the product CP is conserved in
this decay.

6.5. T-invariance

Consider the classical collision of two billiard balls. We know that the equations
of motion (Newton's laws) are invariant under time inversion, $t \to -t$. That is to
say, if one reverses the momenta, then the system will retrace its trajectory. In
quantum mechanics, however, one cannot talk about trajectories. Symmetries
are implemented by corresponding symmetry operators. We know that physically
measurable quantities are the various scalar products of the vectors in
Hilbert space, the space of state vectors. The invariance of probabilities under
a symmetry operation $R$ means

$$|(\psi,\phi)|^2 = |(R\psi,R\phi)|^2.$$

It was pointed out by Wigner that this implies that either $(\psi,\phi) = (R\psi,R\phi)$ or
$(\psi,\phi)^* = (R\psi,R\phi)$. We are familiar with the first one, which means $R$ is a unitary
operator, $RR^\dagger = R^\dagger R = 1$; the symmetries we encountered earlier are, in fact,
implemented in terms of unitary operators. On the other hand, if $(\psi,\phi)^* =
(R\psi,R\phi)$, then $R$ is said to be antiunitary. It was pointed out by Wigner that
time-reversal invariance can be represented only in terms of an antiunitary operator.

An excellent discussion of time-reversal invariance can be found in the book by


T. D. Lee. In particular, this book explains clearly why it is impossible to find
direct tests of time-reversal invariance unlike tests for the symmetries C and P.
This is because testing time-reversal invariance would require that we not only
reverse the momenta and spins of the interacting particles, but also maintain the
coherent phase relations that exist between scattered particles - a practically
impossible task. So most of the tests of time-reversal are only in the restricted
sense, for checking the consequences of the principle of detailed balance.
As an example of the principle of detailed balance, consider the reactions

$$pp \to \pi^+ d \qquad\text{and}\qquad \pi^+ d \to pp.$$

It is possible to perform both the above reactions. The differential cross-section
for unpolarized initial beams is given by

$$\frac{d\sigma}{d\Omega}(pp \to \pi^+ d) = \frac{2\pi}{v_{pp}}\,\frac{1}{4} \sum_{\text{spins}} |M_{pp\to\pi^+ d}|^2\, \frac{p_\pi^2}{(2\pi)^3\, v_{\pi d}}$$

and

$$\frac{d\sigma}{d\Omega}(\pi^+ d \to pp) = \frac{2\pi}{v_{\pi d}}\,\frac{1}{3(2S_\pi+1)} \sum_{\text{spins}} |M_{\pi^+ d\to pp}|^2\, \frac{p_p^2}{(2\pi)^3\, v_{pp}}.$$

Here $v_{pp}$ and $v_{\pi d}$ refer to the relative velocities in the $pp$ and $\pi^+ d$ systems,
respectively. The factor $\tfrac{1}{4} = \tfrac{1}{2} \times \tfrac{1}{2}$ in the first reaction is the average over the
initial polarizations of the target ($p$) and projectile ($p$). Similarly, the factor
$1/[3(2S_\pi+1)]$ occurs for the ($\pi^+ d$) initial state, with $S_\pi$ referring to the spin of the pion
($S_\pi = 0$). Now, using time-reversal invariance, we can show that the matrix
elements $M_{pp\to\pi^+ d}$ and $M_{\pi^+ d\to pp}$ satisfy the relation

$$\sum_{\text{spins}} |M_{pp\to\pi^+ d}|^2 = \sum_{\text{spins}} |M_{\pi^+ d\to pp}|^2.$$

It follows, therefore, that the two cross-sections are related at the same
centre-of-mass energy. This was initially used to determine the spin of the pion.
Alternatively, knowing $S_\pi$, this can be used as a test for time-reversal
invariance, in the restricted sense of detailed balance.
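Dividing the two differential cross-sections above, with the spin-summed matrix elements set equal, the velocity factors cancel and one is left with $3(2S_\pi+1)\,p_\pi^2/(4p_p^2)$. The Python sketch below evaluates this ratio and inverts it to recover the pion spin from a measured ratio. The function names are illustrative, the momenta are centre-of-mass values in any common unit, and the expression is taken directly from the two formulas as written, without any additional identical-particle factor.

```python
def detailed_balance_ratio(p_pi, p_p, pion_spin):
    """[dsigma/dOmega (pp -> pi+ d)] / [dsigma/dOmega (pi+ d -> pp)] at equal CM energy."""
    return 3.0 * (2.0 * pion_spin + 1.0) * p_pi ** 2 / (4.0 * p_p ** 2)


def pion_spin_from_ratio(ratio, p_pi, p_p):
    """Invert the detailed-balance relation to obtain the pion spin."""
    multiplicity = 4.0 * ratio * p_p ** 2 / (3.0 * p_pi ** 2)   # equals 2 S + 1
    return (multiplicity - 1.0) / 2.0


if __name__ == "__main__":
    p_pi, p_p = 1.0, 2.0   # illustrative centre-of-mass momenta
    for spin in (0, 1):
        ratio = detailed_balance_ratio(p_pi, p_p, spin)
        print("spin", spin, "-> ratio", ratio,
              "-> recovered spin", pion_spin_from_ratio(ratio, p_pi, p_p))
```

The ratio differs by a factor of three between spin 0 and spin 1, which is what made the comparison a practical way of fixing the spin of the pion.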
6.6. CP-Violation

We pointed out that although C and P are individually violated, the combination
CP is still a symmetry of weak interactions. Consider the neutral $K^0\bar{K}^0$ complex.
In weak interactions, we know that strangeness is not conserved. Therefore,
through second-order transitions the states $|K^0\rangle$ and $|\bar{K}^0\rangle$ can mix. Consider
now the combinations

$$|K_1\rangle = \frac{1}{\sqrt{2}}\big(|K^0\rangle + |\bar{K}^0\rangle\big), \qquad |K_2\rangle = \frac{1}{\sqrt{2}}\big(|K^0\rangle - |\bar{K}^0\rangle\big).$$

With the phase convention $CP|K^0\rangle = |\bar{K}^0\rangle$, these are eigenstates of the CP operator
with eigenvalues $+1$ and $-1$, respectively. If CP were conserved in weak interactions,
then the decay of the long-lived neutral kaon, essentially $K_2$,

$$K_L \to \pi^0\pi^0,\ \pi^+\pi^-$$

would be forbidden, since the final pions are in a relative $l = 0$ or S state and so have
$P = +1$ and $C = +1$, thus $CP = +1$. In 1964, Cronin and Fitch found that, in
fact, the long-lived kaon decays into $\pi^+\pi^-$ and $\pi^0\pi^0$ pairs, thus establishing
violation of CP. Experimentally,

$$|\eta_{+-}| = \left|\frac{\text{amplitude}(K_L \to \pi^+\pi^-)}{\text{amplitude}(K_S \to \pi^+\pi^-)}\right| = 2.27 \times 10^{-3}.$$

CP violation is one of the least understood topics in particle physics. We note in passing that, as remarked earlier, the existence of baryon asymmetry in the universe also demands CP violation, which may or may not be the same as the one seen above in the $K^0$-$\bar K^0$ complex. In addition, there is a potential source of CP violation in quantum chromodynamics (QCD).

6.7. CPT-Theorem

Although C, P and T may not be individually good symmetries, it is expected that the overall product CPT (applied in any order) is a good symmetry. This is a general consequence of local field theory and a proof can be found in the book by T. D. Lee, along with some of the consequences of this symmetry. The most important consequence of CPT symmetry is the equality of the masses and lifetimes of particles and anti-particles. From the $K^0$-$\bar K^0$ complex, one can deduce that CPT symmetry holds to an accuracy of better than 1 part in $10^{14}$.

7. $\gamma_5$-Invariance and Weak Interactions

Soon after Fermi's theory of $\beta$-decay, it was recognized that the theory was inadequate. It was extended by Gamow and Teller who suggested that, in addition to the non-spin-flip transitions in nuclear $\beta$-decay, there is also an interaction inducing spin-flip transitions ($0^+ \to 1^+$). In field theory language, the interaction responsible for the decay is represented by the local coupling of the four spin-$\tfrac12$ fields (fermions), namely $n$, $p$, $e$ and $\nu_e$. If one insists only on Lorentz invariance, then there are, in principle, five arbitrary couplings: vector, scalar, pseudoscalar, axial vector and tensor, denoted by V, S, P, A and T, respectively. At low energies, i.e. in the nonrelativistic limit, V and S have the same structure and correspond to the Fermi transitions, while A and T reduce to the Gamow-Teller transitions. With the discovery of the muon and the strange particles, the realm of weak interactions was extended. Further, with the discovery of parity violation, the number of couplings, even in ordinary $\beta$-decay, doubled to ten!
Towards the end of 1956, the theoretical picture of weak interactions was quite confusing, since the available data were varied and sometimes even conflicting. For example, did the weak interaction responsible for $\pi$ decay have the same strength and Lorentz structure as that for nuclear $\beta$-decay? We are, of course, familiar with the universal character of electromagnetic and gravitational interactions. Are weak interactions also described by a universal theory?
The decisive analysis was made in early 1957 by Sudarshan and Marshak who postulated the notion of $\gamma_5$-invariance. Under this, the universal Fermi interaction is invariant under an arbitrary chiral rotation or $\gamma_5$ rotation of each of the participating Fermi fields. Using the anti-commutation properties of $\gamma_5$ with the $\gamma_\mu$, it is easy to see that only an equal mixture of V and A interactions is invariant under this transformation. We already know that it is only the left-handed neutrino that participates in weak interactions. In other words, it is $\psi_\nu^{(L)} = (1 + \gamma_5)\psi_\nu$, the left-handed projection of the neutrino field $\psi_\nu$, that enters in the interaction Lagrangian. We can now write
\[ L_{W.I.} = \frac{G}{\sqrt2}\left(\bar\psi_L^{(a)}\gamma_\mu\psi_L^{(b)}\right)\left(\bar\psi_L^{(c)}\gamma_\mu\psi_L^{(d)}\right). \]

Subsequent experiments have confirmed the correctness of the universal V-A theory of weak interactions. The above form for $L_{W.I.}$ forms the very basis of the modern electroweak gauge theory.
8. Strong Interactions, Quarks and Gluons

It has been known for a long time now that the proton is an extended object. Already, the magnetic moment
\[ \mu_p = 2.793\,\frac{e\hbar}{2 m_p c} \]
is different from the Dirac value, suggesting that the proton is not a point particle like the electron. The experiments at Stanford by Hofstadter and his colleagues in the late fifties, using electron scattering, measured the distribution of the charge in protons and neutrons. These experiments gave a value of about 0.8 fermi for the charge radius of the proton. We have also seen that a large number of resonant states have been discovered in $\pi$-nucleon scattering, strongly suggesting the possibility that the proton is only the ground state of a system with internal structure, the resonant states being the excited states. In the late fifties, Sakata generalized the notion of isospin to SU(3) so that strange particles could also be included in the classification scheme. Just as proton and neutron form the $T = \tfrac12$ representation of the isospin group (SU(2)), one would expect particles to fall (degenerate or nearly so) into multiplets belonging to various representations of SU(3), when strange particles are included.
It was Gell-Mann and Ne'eman who introduced the eight-fold way, according to which the baryons $p$, $n$, $\Lambda$, $\Sigma$, $\Xi$ belong to the eight-dimensional representation of the SU(3) group. This theory had many successes; however, it was somewhat of a puzzle as to why none of the known hadrons could be assigned to the fundamental triplet representation of the SU(3) group. A little later, in 1964, Gell-Mann and Zweig independently postulated the existence of quarks belonging to the triplet representation of SU(3). The known hadrons (protons, neutrons, etc.) are to be regarded as bound states of these quarks, denoted by $u$, $d$ and $s$. Thus,
\[ p = (uud), \qquad n = (ddu), \qquad \Lambda = (uds), \ \text{etc.}, \qquad 3 \times 3 \times 3 = 10 + 8 + 8 + 1. \]
Using group theory, this implies that the up quark ($u$) must carry an electric charge $+\tfrac23$ in units of the proton charge, and the down ($d$) and strange ($s$) quarks, $-\tfrac13$. In subsequent years, much effort has been devoted to experimentally discovering these objects. The failure to detect them was considered to imply that either they are too heavy or they are mathematical artifacts.

9. Need for Colour

Consider the doubly charged isobar $\Delta^{++}$. According to the quark model
\[ \Delta^{++} = (uuu). \]
Since the spin of $\Delta^{++}$ is $\tfrac32$, the magnetic quantum number $m_j$ ranges from $-\tfrac32$ to $+\tfrac32$. We expect the ground state to correspond to zero relative orbital angular momentum between any pair of quarks. So the state with $m_j = \tfrac32$ will arise from the combination of the quark spin projections, each of which must now be $+\tfrac12$. But this puts three identical fermions in the same state, in contradiction with the Pauli exclusion principle. This catastrophe can be avoided by introducing the colour degree of freedom for the quark, as suggested by Greenberg.
The decisive turning point in accepting quarks as fundamental constituents of hadrons came with the famous discovery of Bjorken scaling in 1969 by experiments at Stanford on deep inelastic scattering. This experiment consists of scattering highly energetic electrons on protons and studying the cross-sections as a function of the energy loss of the electron, $\nu = (E_i - E_f)$, and the momentum transfer of the electron, $q^2 = (p_i - p_f)^2$. Earlier, Bjorken had predicted that, for large values of $\nu$ and $q^2$, the scattering cross-section, apart from kinematical factors, becomes purely a function of the dimensionless variable
\[ x = \frac{-q^2}{2 m_p \nu}. \]

This was interpreted by Feynman, Bjorken and Paschos as scattering of the electron by point-like constituents called partons, with $x$ being interpreted as the fraction of the proton's momentum carried by the parton that is hit by the virtual photon.
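The scaling variable is simple enough to evaluate directly. The following is a minimal sketch, assuming natural units with energies in GeV and an approximate proton mass; the function name `bjorken_x` and the sample kinematics are illustrative choices, not taken from the text.

```python
# Bjorken scaling variable x = -q^2 / (2 m_p nu), as defined in the text.
# Units are GeV-based natural units (an assumption of this sketch).

PROTON_MASS_GEV = 0.938  # approximate proton mass, used for illustration only


def bjorken_x(q2: float, nu: float, m_p: float = PROTON_MASS_GEV) -> float:
    """Return x for a space-like momentum transfer q2 <= 0 and energy loss nu > 0."""
    if nu <= 0:
        raise ValueError("energy loss nu must be positive")
    if q2 > 0:
        raise ValueError("q^2 must be space-like (non-positive) in electron scattering")
    return -q2 / (2.0 * m_p * nu)


if __name__ == "__main__":
    # Hypothetical kinematics: q^2 = -2 GeV^2, nu = 4 GeV.
    print(f"x = {bjorken_x(q2=-2.0, nu=4.0):.4f}")
```

In this convention elastic scattering off the whole proton corresponds to $x = 1$, and scattering off a constituent carrying a fraction of the momentum gives $0 < x < 1$.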

10. Gauge Invariance

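Charge conservation: Consider the Lagrangian density $\mathcal{L}$ involving a set of complex fields $\phi_i(x)$ and let us assume that $\mathcal{L}$ is invariant under a global change of phase of the field $\phi_i(x)$,
\[ \phi_i(x) \to e^{-i q_i\theta}\,\phi_i(x), \tag{1} \]
where $\theta$ is an arbitrary constant and $q_i$ is the charge carried by the field $\phi_i(x)$. Under an infinitesimal ($\theta \ll 1$) change of the fields $\phi_i(x)$ and their gradients $\partial^\mu\phi_i(x)$, the change in the Lagrangian density is given by
\[ \delta\mathcal{L} = \frac{\delta\mathcal{L}}{\delta\phi_i(x)}\,\delta\phi_i(x) + \frac{\delta\mathcal{L}}{\delta(\partial^\mu\phi_i(x))}\,\delta(\partial^\mu\phi_i(x)). \]
Using the equation of motion in the first term, we have
\[ \delta\mathcal{L} = \partial^\mu\left[\frac{\delta\mathcal{L}}{\delta(\partial^\mu\phi_i(x))}\,\delta\phi_i(x)\right]. \tag{2} \]
By assumption, when $\delta\phi_i$ is given by (1), i.e.
\[ \delta\phi_i(x) = -i q_i\theta\,\phi_i(x), \]
$\delta\mathcal{L} = 0$. Therefore, we find
\[ \partial^\mu J_\mu(x) = 0, \]
where
\[ J_\mu(x) = -i\sum_i \frac{\delta\mathcal{L}}{\delta(\partial^\mu\phi_i(x))}\,q_i\,\phi_i(x). \tag{3} \]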

This is called the Noether current associated with the transformation Equation (1). It is easy to see that the transformations associated with (1) form a U(1) group, because if $\theta_1$ and $\theta_2$ are elements of the transformation, then so is $\theta_1 + \theta_2$; the identity corresponds to $\theta = 0$ and the inverse is given by replacing $\theta$ by $-\theta$.

10.1. Local Gauge Invariance

This concept was introduced by Weyl, originally in the context of general relativity in his unsuccessful attempt to unify gravity with electromagnetism. Weyl's idea was based upon the requirement that the scale for measuring distances, or the gauge, could be chosen independently at different spacetime points. Soon after the discovery of quantum mechanics, this idea was revived by Weyl and London independently in the form of the principle that the phase of the wavefunction can be chosen arbitrarily at each spacetime point. Returning to Equation (1), we shall now demand that $\mathcal{L}$ be invariant, not only for constant $\theta$, but also for an arbitrary spacetime dependent function $\theta = \theta(\mathbf{x}, t)$. One can see that the invariance of $\mathcal{L}$ under transformations for which $\theta$ is a constant does not guarantee the invariance when $\theta = \theta(\mathbf{x}, t)$. As an example, let
\[ \mathcal{L} = -\tfrac12\,\partial_\mu\phi^*\,\partial^\mu\phi + V(\phi, \phi^*). \tag{4} \]

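The offending term is the one involving the gradient $\partial_\mu$, for if
\[ \phi_i(x) \to e^{-i q_i\theta(x)}\,\phi_i(x), \tag{5} \]
then
\[ \partial_\mu\phi_i(x) \to e^{-i q_i\theta(x)}\left[\partial_\mu\phi_i(x) - i q_i\,(\partial_\mu\theta(x))\,\phi_i(x)\right], \tag{6} \]
i.e. $\partial_\mu\phi_i(x)$ transforms inhomogeneously. To restore the invariance of $\mathcal{L}$ under the above local gauge transformation, one introduces a vector field $A_\mu(x)$ and replaces $\partial_\mu$ by the covariant derivative
\[ D_\mu\phi_i(x) = (\partial_\mu - i e q_i A_\mu(x))\,\phi_i(x). \]
The transformation property of $A_\mu(x)$ under a local gauge transformation is discovered by requiring
\[ (D_\mu\phi_i)' = (\partial_\mu - i e q_i A'_\mu)\,\phi'_i(x) = e^{-i q_i\theta(x)}\,D_\mu\phi_i(x), \tag{7} \]
i.e.
\[ A'_\mu(x) = A_\mu(x) - \frac{1}{e}\,\partial_\mu\theta(x). \tag{8} \]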

Since we have introduced a new dynamical field $A_\mu(x)$, we ought to include its kinetic energy term, which is $-\tfrac14 F_{\mu\nu}F^{\mu\nu}$, with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Therefore, the complete local gauge invariant Lagrangian becomes
\[ \mathcal{L} = -\tfrac12\,(D_\mu\phi)^* D^\mu\phi + V(\phi, \phi^*) - \tfrac14 F_{\mu\nu}F^{\mu\nu}. \]

Now consider the invariance under a transformation belonging to a compact semi-simple non-Abelian Lie group $G$. Let $t^a$ be its generators with the Lie algebra
\[ [t^a, t^b] = i\,C^{abc}\,t^c, \tag{9} \]
where $C^{abc}$ are the structure constants of $G$. For $G = SU(2)$, $C^{abc} = \epsilon^{abc}$. Let us make the transformation
\[ \phi(x) \to \phi'(x) = \exp(-i T^a\theta^a(x))\,\phi(x) = U(\theta)\,\phi(x). \tag{10} \]

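The $T^a$'s are the representations of Equation (9) appropriate for the field $\phi(x)$; thus in the SU(2) case, if $\phi(x)$ belongs to the spin 1/2 representation, then the $T^a$'s are essentially the Pauli matrices. The field derivatives transform as
\[ \partial_\mu\phi(x) \to U(\theta)\,\partial_\mu\phi(x) + (\partial_\mu U(\theta))\,\phi(x). \tag{11} \]
To restore local gauge invariance, we proceed as in the U(1) case. That is, $\partial_\mu\phi(x)$ is replaced by $D_\mu\phi(x)$, where
\[ D_\mu\phi(x) = (\partial_\mu - i g T^a A^a_\mu(x))\,\phi(x). \]
As before, by requiring
\[ (D_\mu\phi(x))' = \exp(-i\theta^a T^a)\,D_\mu\phi(x), \]
one determines the transformation property of $A_\mu(x)$. A little algebra gives
\[ T^a A'^a_\mu = U(\theta)\,T^a A^a_\mu\,U^{-1}(\theta) - \frac{i}{g}\,(\partial_\mu U(\theta))\,U^{-1}(\theta). \tag{12} \]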

It is fairly straightforward to show that the transformation (12) is independent of the representation $T^a$. The kinetic energy term is constructed from the covariant curl
\[ F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g\,C^{abc} A^b_\mu A^c_\nu. \]

The complete Lagrangian is given by
\[ \mathcal{L} = -\tfrac14 F^a_{\mu\nu}F^{\mu\nu a} + \mathcal{L}(D_\mu\phi_i, \phi_i). \tag{13} \]

10.2. Illustration - QCD

The Lagrangian for this theory is given by
\[ \mathcal{L} = -\tfrac14 F^a_{\mu\nu}F^{\mu\nu a} + \sum_{\text{flavours}} \bar\psi\,(i\slashed{D} + m)\,\psi, \qquad a = 1, 2, \ldots, 8. \tag{14} \]
Here the gauge group is colour SU(3)$_c$, which has $(3^2 - 1) = 8$ generators. The quarks are in the fundamental 3 representation; $m$ is the quark mass matrix and $\slashed{D} = \slashed{\partial} - i g\,\slashed{A}^a\lambda^a/2$. The quanta of the gauge field $A^a_\mu(x)$ are called the gluons, and $\lambda^a$ are the familiar Gell-Mann matrices.

10.3. Glashow, Weinberg and Salam model (leptons only)

As we have seen, it is only the left-handed fields that enter weak interactions. The gauge group here is a direct product of the weak isospin SU(2) and an Abelian hypercharge group U(1). The left-handed electron and neutrino are assigned to the doublet representation of SU(2)$_L$, while the right-handed electron is assigned to the singlet representation of SU(2)$_L$:
\[ L = \begin{pmatrix}\nu_e \\ e\end{pmatrix}_L, \qquad R = e_R. \]

The hypercharge $Y$ of the fields is determined from
\[ Q = T_3 + \frac{Y}{2}. \]
So we have

Field      Q      T_3      Y
$\nu_e$    0      1/2     -1
$e_L$     -1     -1/2     -1
$e_R$     -1      0       -2
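The assignments above can be checked mechanically against $Q = T_3 + Y/2$. This is a small sketch using exact fractions; the dictionary keys and function name are illustrative, and the value $T_3 = 0$ for the right-handed electron follows from its being an SU(2) singlet.

```python
from fractions import Fraction

# (T3, Y) assignments for the lepton fields of the model.
# T3 = 0 for e_R because it is an SU(2) singlet.
LEPTONS = {
    "nu_e": (Fraction(1, 2), Fraction(-1)),
    "e_L": (Fraction(-1, 2), Fraction(-1)),
    "e_R": (Fraction(0), Fraction(-2)),
}


def electric_charge(t3: Fraction, y: Fraction) -> Fraction:
    """Electric charge from weak isospin and hypercharge: Q = T3 + Y/2."""
    return t3 + y / 2


CHARGES = {name: electric_charge(t3, y) for name, (t3, y) in LEPTONS.items()}

if __name__ == "__main__":
    for name, charge in CHARGES.items():
        print(name, charge)
```

Both chiral components of the electron come out with the same charge, as they must, even though their hypercharges differ.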

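The Lagrangian is therefore given by
\[
\begin{aligned}
\mathcal{L}_{\text{leptons}} ={}& \bar R\,\gamma^\mu(\partial_\mu + i g' B_\mu)R + \bar L\,\gamma^\mu\left(\partial_\mu + \frac{i}{2}\,g' B_\mu - i g\,\frac{\tau^a}{2}A^a_\mu\right)L \\
& - \tfrac14 F^a_{\mu\nu}F^{\mu\nu a} - \tfrac14 B_{\mu\nu}B^{\mu\nu} \\
& - \tfrac12\left(\partial_\mu\phi - \frac{i g'}{2}B_\mu\phi - i g\,\frac{\tau^a}{2}A^a_\mu\phi\right)^{\dagger}\left(\partial^\mu\phi - \frac{i g'}{2}B^\mu\phi - i g\,\frac{\tau^a}{2}A^{a\mu}\phi\right) + U(\phi^\dagger, \phi) \\
& + f\,\bar L\phi R + \text{h.c.}
\end{aligned}
\]
Here $\phi$ is the Higgs field which transforms as a doublet under SU(2)$_L$. $B_\mu$ is the gauge field of the U(1) group and $B_{\mu\nu}$ its field tensor, while $A^a_\mu$ are the gauge fields of SU(2)$_L$ and $F^a_{\mu\nu}$ are the associated field tensors. This Lagrangian will be discussed in detail in later chapters.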

Further Reading
1. T. D. Lee, Particle Physics and Introduction to Field Theory, Harwood Academic Publishers, 1981.
2. Y. Nambu, Quarks: Frontiers in Elementary Particle Physics, World Scientific, 1985.

11. Building up the Standard Gauge Model of High Energy Physics
G. RAJASEKARAN
Institute of Mathematical Sciences, Madras 600 113, India

1. Introduction
The standard model based on the gauge group SU(3) x SU(2) x U(1) describes all that is presently known of high energy physics. Our aim in this chapter is to build it up from the beginning.
We start with the simplest notions of Abelian gauge field theory and the breakdown of symmetry and gradually build up the various strands that make up the SU(2) x U(1) electroweak theory. We then take up the strong interaction sector and show how deep-inelastic scattering, asymptotic freedom and colour lead up to quantum chromodynamics (QCD). The renormalization group equation is shown to provide the foundation for asymptotic freedom and the justification for QCD. Combining the electroweak and QCD sectors, the complete standard model is then constructed. Its strengths and weaknesses are briefly discussed and some views beyond the standard model are presented in the final section.
This chapter is mainly intended for physicists who have not had much
exposure to high energy physics although it may also benefit other beginners in
high energy physics. The level is very elementary and technical details are
omitted. The other chapters in this volume on (pre-gauge-theoretic) particle
physics and on elements of quantum field theory may be regarded as prerequisites
for the understanding of the present chapter.
2. U(1) Gauge Theory

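Consider* the Lagrangian of a complex scalar field $\phi$:
\[ \mathcal{L} = \partial_\mu\phi^*\,\partial^\mu\phi - V(\phi^*\phi). \tag{1} \]
Here $V$, called a potential, is a function of $\phi^*\phi$ and, in particular, for a renormalizable theory, it is a quadratic function of $\phi^*\phi$. We take
\[ V(\phi^*\phi) = \mu^2\phi^*\phi + \lambda(\phi^*\phi)^2. \tag{2} \]
Then the Euler-Lagrange equation becomes
\[ (\Box + \mu^2)\phi = -2\lambda(\phi^*\phi)\phi. \]

* We use the metric $g_{00} = 1$; $g_{11} = g_{22} = g_{33} = -1$.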

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 185-236.
© 1989 by Kluwer Academic Publishers.

Fig. 1.

For $\lambda = 0$, this is the Klein-Gordon equation. Thus, $\mu^2\phi^*\phi$ in $V$ represents the mass term while $\lambda(\phi^*\phi)^2$ represents the quartic interaction vertex shown in Figure 1. The quanta of the complex scalar field $\phi$ represent charged particles of spin zero.
The Lagrangian in (1) is invariant under the global gauge transformations:
\[ \phi(x) \to e^{-i\alpha}\phi(x); \qquad \phi^*(x) \to e^{+i\alpha}\phi^*(x), \tag{3} \]
where $\alpha$ is an arbitrary constant. By applying Noether's theorem, one finds a conserved current:
\[ j^\mu = i\left[\phi^*\,\partial^\mu\phi - (\partial^\mu\phi^*)\,\phi\right], \tag{4} \]
\[ \partial_\mu j^\mu = 0 \quad \text{(or)} \quad \frac{d}{dt}\int d^3x\, j^0 = 0, \tag{5} \]
where $\int d^3x\, j^0$ is the total charge.


We now try to enlarge the symmetry. Instead of the constant phase $\alpha$, we envisage a spacetime dependent phase $\alpha(x)$ in the transformation:
\[ \phi(x) \to e^{-i\alpha(x)}\phi(x); \qquad \phi^*(x) \to e^{+i\alpha(x)}\phi^*(x). \tag{6} \]
However, the important point is that the Lagrangian in (1) is not invariant under this more general symmetry transformation, since the derivative term in (1) has a more complicated transformation
\[ \partial_\mu\phi \to e^{-i\alpha}\,\partial_\mu\phi - i(\partial_\mu\alpha)\,e^{-i\alpha}\phi. \tag{7} \]
In order to ensure invariance, we now have to add a vector field $A_\mu$ to the system with the transformation
\[ A_\mu \to A_\mu + \frac{1}{e}\,\partial_\mu\alpha, \tag{8} \]
where $e$ is a constant. Using (7) and (8), we find that the combination $(\partial_\mu + ieA_\mu)\phi$ and its complex conjugate have simple transformation properties*:
\[ (\partial_\mu + ieA_\mu)\phi \to e^{-i\alpha}(\partial_\mu + ieA_\mu)\phi, \tag{9} \]
\[ (\partial_\mu - ieA_\mu)\phi^* \to e^{+i\alpha}(\partial_\mu - ieA_\mu)\phi^*, \tag{10} \]
and their product $(\partial_\mu - ieA_\mu)\phi^*\,(\partial^\mu + ieA^\mu)\phi$ is invariant. This product now replaces the derivative term $\partial_\mu\phi^*\,\partial^\mu\phi$ in the Lagrangian (1).

* Hence $(\partial_\mu + ieA_\mu)$ is called the gauge-covariant derivative.

Since we have introduced a new field $A_\mu$ into the system, we need derivative terms (kinetic energy) for $A_\mu$. If we define
\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{11} \]
we see that $F_{\mu\nu}$ is invariant under the transformation in Equation (8). Thus, we have the complete Lagrangian:
\[ \mathcal{L} = -\tfrac14 F_{\mu\nu}F^{\mu\nu} + (\partial_\mu - ieA_\mu)\phi^*\,(\partial^\mu + ieA^\mu)\phi - V(\phi^*\phi). \tag{12} \]

The beauty of this Lagrangian is that it is invariant under the transformation defined by equations (6) and (8). We have thus achieved invariance under spacetime dependent phase transformations and learnt that this necessitates the introduction of a vector field. This vector field $A_\mu$ is just the electromagnetic vector potential and $F_{\mu\nu}$ is just the electromagnetic field. So, the Lagrangian describes a charged scalar field interacting with the electromagnetic field.
It must be noted that a mass term of the form $\tfrac12 M^2 A_\mu A^\mu$ added to the Lagrangian (12) would violate the invariance under (8). So, the 'gauge field' $A_\mu$ describes a massless vector boson, and this is just what is needed for the photon, which is known to be massless.
The transformations of the type in Equation (3) with constant $\alpha$ belong to the group U(1). The more general transformations defined through equations (6) and (8) are usually called gauge transformations. Sometimes the case of constant $\alpha$ is called global or rigid gauge transformation, while that of spacetime dependent $\alpha$ is called local gauge transformation. At this point, it is appropriate to note the analogy with relativity. Rigid transformations of the coordinates lead to special relativity while (arbitrary) $x$-dependent transformations lead to general relativity. Following this analogy, we shall call the transformation with constant $\alpha$ the special U(1) transformation and the case of $x$-dependent $\alpha$ the general U(1) transformation.
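As a check on the construction of this section, the covariance of the derivative and the non-invariance of a vector mass term can be verified in a few lines. The sketch below assumes the gauge-field transformation has the form $A_\mu \to A_\mu + (1/e)\,\partial_\mu\alpha$, which is the form that the covariance requirement forces; the primes denoting transformed quantities are notation introduced here.

```latex
\begin{align*}
\phi' &= e^{-i\alpha(x)}\phi, \qquad A'_\mu = A_\mu + \frac{1}{e}\,\partial_\mu\alpha, \\
(\partial_\mu + ieA'_\mu)\phi'
  &= e^{-i\alpha}\left[\partial_\mu\phi - i(\partial_\mu\alpha)\phi\right]
   + ie\left(A_\mu + \frac{1}{e}\,\partial_\mu\alpha\right)e^{-i\alpha}\phi \\
  &= e^{-i\alpha}(\partial_\mu + ieA_\mu)\phi .
\end{align*}
% A vector mass term is not invariant under the same transformation:
\begin{align*}
A'_\mu A'^\mu = A_\mu A^\mu + \frac{2}{e}\,A^\mu\partial_\mu\alpha
  + \frac{1}{e^2}\,\partial_\mu\alpha\,\partial^\mu\alpha \neq A_\mu A^\mu .
\end{align*}
```

The inhomogeneous term generated by the derivative is cancelled exactly by the shift of $A_\mu$, whereas nothing is available to cancel the extra pieces in $A_\mu A^\mu$.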
3. Spontaneous Breakdown of Symmetry - Goldstone Model

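The canonical momenta and Hamiltonian corresponding to the Lagrangian in Equation (1) can be easily worked out (a dot denotes $\partial/\partial t$):
\[ \pi = \frac{\partial\mathcal{L}}{\partial\dot\phi} = \dot\phi^*; \qquad \pi^* = \frac{\partial\mathcal{L}}{\partial\dot\phi^*} = \dot\phi, \tag{13} \]
\[ \mathcal{H} = \pi\dot\phi + \pi^*\dot\phi^* - \mathcal{L} = \pi^*\pi + \nabla\phi^*\cdot\nabla\phi + V(\phi^*\phi). \tag{14} \]
So, the total energy of the system is
\[ H = \int\mathcal{H}\,d^3x = \int d^3x\left[\pi^*\pi + \nabla\phi^*\cdot\nabla\phi + V(\phi^*\phi)\right]. \tag{15} \]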

Consider the nature of the potential function $V(\phi^*\phi)$ given in Equation (2). The constant $\lambda$ has to be positive, otherwise $V$ will become negative for sufficiently large $\phi$ and, hence, the energy in Equation (15) will become unbounded from below. So, there are only two cases to be considered, both of which are plotted in Figure 2.

Fig. 2. The potential $V$ as a function of $\phi$: (a) $\mu^2 > 0$; (b) $\mu^2 < 0$.

In case (a), $V$ is always positive, while in case (b) $V$ is negative for small values of $\phi$ (where the $\mu^2\phi^*\phi$ term dominates); however, $V$ becomes positive for sufficiently large values of $\phi$. Case (a) corresponds to normal particles with positive (mass)$^2$, and there is nothing more to be said about it. Case (b) is the interesting case. Although this apparently corresponds to tachyons (particles with negative (mass)$^2$), this is not the correct interpretation, as is clear by looking at Figure 2. Whereas for case (a) the state with $\phi = 0$ is a state of stable equilibrium and, hence, it is the ground state, in case (b) $\phi = 0$ corresponds to a maximum of the potential and, hence, is a state of unstable equilibrium. For this latter case, excitations around $\phi = 0$ have a tachyonic mass, corresponding to the negative curvature $\partial^2 V/\partial\phi\,\partial\phi^*$. The true ground state must be identified with the minimum of the potential, where $\phi$ has a nonvanishing value $\phi_0$ (see Figure 2b). The curvature is positive here and so the tachyons do not exist. In quantum field theory, $\phi_0$ must be interpreted as the vacuum expectation value of $\phi$, written as $\langle\phi\rangle$. This must be independent of $x$, otherwise Poincaré invariance of the theory will be lost.
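The two cases of Figure 2 can be checked numerically. The sketch below treats $V$ of Equation (2) as a function of $x = \phi^*\phi \ge 0$ and locates its minimum by setting $dV/dx = 0$; the function names and the sample values of $\mu^2$ and $\lambda$ are illustrative assumptions, not taken from the text.

```python
def potential(x: float, mu2: float, lam: float) -> float:
    """V as a function of x = phi* phi: V = mu^2 x + lambda x^2 (Equation (2))."""
    return mu2 * x + lam * x * x


def minimum_location(mu2: float, lam: float) -> float:
    """Value of x = phi* phi >= 0 that minimises V when lambda > 0."""
    if lam <= 0:
        raise ValueError("lambda must be positive for the energy to be bounded below")
    # dV/dx = mu^2 + 2 lambda x vanishes at x = -mu^2 / (2 lambda).
    # For mu^2 >= 0 that point lies outside the allowed region x >= 0,
    # so the minimum sits at x = 0 (case (a)).
    return max(0.0, -mu2 / (2.0 * lam))


if __name__ == "__main__":
    for mu2 in (1.0, -1.0):  # case (a) and case (b), illustrative numbers
        x0 = minimum_location(mu2, lam=0.5)
        print(mu2, x0, potential(x0, mu2, 0.5))
```

For $\mu^2 < 0$ the minimum sits at a nonzero value of $\phi^*\phi$, which is the nonvanishing ground-state value described above; for $\mu^2 > 0$ it stays at the origin.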
We must now recognize that $V$ is actually a function of two fields, the real and imaginary parts $\phi_1$ and $\phi_2$ of $\phi$:
\[ \phi = \phi_1 + i\phi_2. \tag{16} \]

So, in Figure 2 the abscissa may be regarded as $\phi_1$ and the full shape of the potential $V$ is obtained by rotating the figure around the ordinate. Thus, we obtain Figure 3 for the interesting case of (b). (We have added a constant to $V$ so that $V \ge 0$.) We see that the minimum of the potential occurs all along a circle of radius $\phi_0$ in the $\phi_1$-$\phi_2$ plane. We can choose any one point along this circle as the ground state of the system; however, once we choose it, the circular symmetry (which is the U(1) or SO(2) symmetry) of the system is broken. This is the mechanism of spontaneous breaking of symmetry.

Fig. 3. The potential for case (b), showing the massless mode (along the circle of minima) and the massive mode (radial, $\rho$).

An important consequence follows. Since it does not cost any energy to move around the circular trough of minimum potential, there exists a massless particle. As can be seen from Figure 3, movement along a direction normal to this circle costs positive potential energy and this corresponds to a normal particle with positive (mass)$^2$. Thus, the choice of a proper ground state eliminates the two tachyonic quanta (corresponding to $\phi_1$ and $\phi_2$) and, instead, we end up with a massless mode and a normal massive mode. This massless mode is called the Nambu-Goldstone boson and this result is called the Goldstone theorem (proved by Goldstone, Salam and Weinberg), which states that spontaneous breakdown of any continuous symmetry is accompanied by a massless Nambu-Goldstone boson.
[It is worth noting that if the symmetry which is broken is a discrete symmetry, we do not get any massless Goldstone boson. Consider, for instance, the case of a single-component real field $\phi$. If we choose $V$ to be
\[ V = \mu^2\phi^2 + \lambda\phi^4, \tag{17} \]
the system has a discrete reflection symmetry $\phi \to -\phi$. For case (b), illustrated by Figure 2b, there are just two possible ground states, corresponding to $\phi = \phi_0$ and $\phi = -\phi_0$. For either choice, the reflection symmetry is spontaneously broken, but there is no Goldstone boson.]
We shall now transcribe the above physical description of spontaneous symmetry breaking to analytical form. Adding a suitable constant to the potential $V$ in Equation (2), it can be rewritten as
\[ V = \lambda(\phi^*\phi - \phi_0^2)^2, \tag{18} \]
where $\phi_0^2 = -(\mu^2/2\lambda) > 0$ for case (b). We put
\[ \phi = \rho\,e^{i\theta}, \tag{19} \]
where $\rho$ and $\theta$ are real fields. In the $\phi_1$-$\phi_2$ plane of Figure 3, $\theta$ corresponds to angle, while $\rho$ corresponds to length. These correspond, respectively, to the modes along the circle and perpendicular to it. In terms of these real fields, the Lagrangian in (1) becomes
\[ \mathcal{L} = \partial_\mu\rho\,\partial^\mu\rho + \rho^2\,\partial_\mu\theta\,\partial^\mu\theta - \lambda(\rho^2 - \phi_0^2)^2. \tag{20} \]

Note that the ground-state value or the vacuum expectation value of $\rho$ is given by
\[ \langle\rho\rangle = \phi_0 = \text{constant}. \tag{21} \]
The excitations around this ground state are described by the field $\eta$ defined by
\[ \rho(x) - \phi_0 = \eta(x). \tag{22} \]
In view of (21), $\eta$ has zero vacuum expectation value. Substituting (22) into Equation (20), we get
\[ \mathcal{L} = \partial_\mu\eta\,\partial^\mu\eta + (\eta + \phi_0)^2\,\partial_\mu\theta\,\partial^\mu\theta - \lambda(4\phi_0^2\eta^2 + 4\phi_0\eta^3 + \eta^4). \tag{23} \]
This Lagrangian describes two real scalar fields $\eta$ and $\theta$. Their masses can be read off from the coefficients of $\eta^2$ and $\theta^2$, respectively:
\[ m_\eta = 2\sqrt{\lambda}\,\phi_0, \qquad m_\theta = 0. \tag{24} \]
Thus, $\eta$ is the massive mode and $\theta$ is the Nambu-Goldstone boson.
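The expansion of the potential around the ground state, and the reading-off of the masses, can be made explicit. The short derivation below uses only the shift $\rho = \phi_0 + \eta$; the comparison with a term of the form $-m^2\eta^2$ assumes the normalisation in which a real field with kinetic term $\partial_\mu\eta\,\partial^\mu\eta$ (no factor 1/2) has mass term $-m^2\eta^2$.

```latex
\begin{align*}
\rho^2 - \phi_0^2 &= (\phi_0 + \eta)^2 - \phi_0^2 = 2\phi_0\eta + \eta^2, \\
\lambda(\rho^2 - \phi_0^2)^2 &= \lambda\left(4\phi_0^2\eta^2 + 4\phi_0\eta^3 + \eta^4\right), \\
\mathcal{L}_{\text{quadratic}} &= \partial_\mu\eta\,\partial^\mu\eta - 4\lambda\phi_0^2\,\eta^2
   + \phi_0^2\,\partial_\mu\theta\,\partial^\mu\theta, \\
m_\eta^2 &= 4\lambda\phi_0^2 \;\Rightarrow\; m_\eta = 2\sqrt{\lambda}\,\phi_0, \qquad m_\theta = 0 .
\end{align*}
```

No term quadratic in $\theta$ without derivatives appears, which is the analytical statement that motion along the circle of minima costs no potential energy.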

4. Higgs Model
This is the model for spontaneous breakdown of general U(1). We now take the Lagrangian of Equation (12) and again assume case (b) for the potential $V$. Using the form of $V$ given in Equation (18), we have
\[ \mathcal{L} = (\partial_\mu - ieA_\mu)\phi^*\,(\partial^\mu + ieA^\mu)\phi - \tfrac14(\partial_\mu A_\nu - \partial_\nu A_\mu)^2 - \lambda(\phi^*\phi - \phi_0^2)^2. \tag{25} \]
We again introduce the two real fields $\rho$ and $\theta$ defined by
\[ \phi = \rho\,e^{i\theta}, \tag{26} \]
but the new trick is to transform $A_\mu$ also:
\[ A_\mu = B_\mu - \frac{1}{e}\,\partial_\mu\theta. \tag{27} \]
We substitute these into the Lagrangian of Equation (25). Since $\theta$ looks like the gauge function $\alpha$ of equations (6) and (8) and since $\mathcal{L}$ is now gauge invariant, the result of the calculation is obvious. We get the same form of $\mathcal{L}$ as in (25) but with $\rho$ and $B_\mu$ replacing $\phi$ and $A_\mu$:
\[ \mathcal{L} = (\partial_\mu - ieB_\mu)\rho\,(\partial^\mu + ieB^\mu)\rho - \tfrac14(\partial_\mu B_\nu - \partial_\nu B_\mu)^2 - \lambda(\rho^2 - \phi_0^2)^2. \tag{28} \]

The 'gauge function' $\theta$ does not appear and, hence, there is no massless boson! Again, translating the field $\rho$ by its vacuum expectation value $\phi_0$ and defining $\eta$ by Equation (22), we get
\[ \mathcal{L} = \partial_\mu\eta\,\partial^\mu\eta + e^2(\phi_0 + \eta)^2 B_\mu B^\mu - \tfrac14(\partial_\mu B_\nu - \partial_\nu B_\mu)^2 - \lambda(4\phi_0^2\eta^2 + 4\phi_0\eta^3 + \eta^4). \tag{29} \]
The masses of the fields $\eta$ and $B_\mu$ can be identified:
\[ m_\eta = 2\sqrt{\lambda}\,\phi_0, \tag{30} \]
\[ m_B = \sqrt2\,e\,\phi_0. \tag{31} \]
Thus, not only has the massless Goldstone boson disappeared, but the vector boson also has acquired mass. The original massless vector boson $A_\mu$ had only two (transverse) components; what has happened is that $A_\mu$ swallowed the massless Goldstone boson $\theta$ and thus became the massive vector boson $B_\mu$. The Goldstone boson has supplied the longitudinal component required by the massive vector boson. This is the celebrated Higgs mechanism.
The moral is that if the symmetry which is broken is a general (i.e. gauge) symmetry, then there is no Goldstone boson left in the system.
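Two quick checks of the Higgs mechanism are sketched below: the vector boson mass read off from the term quadratic in $B_\mu$, assuming the conventional normalisation $\tfrac12 m_B^2 B_\mu B^\mu$ for a vector mass term, and a count of field degrees of freedom before and after the symmetry breaking.

```latex
\begin{align*}
e^2\phi_0^2\,B_\mu B^\mu \equiv \tfrac12 m_B^2\,B_\mu B^\mu
  \;&\Rightarrow\; m_B = \sqrt{2}\,e\,\phi_0, \\
\text{before:}\quad \underbrace{2}_{\phi\ (\rho,\,\theta)} + \underbrace{2}_{\text{massless } A_\mu} &= 4, \\
\text{after:}\quad \underbrace{1}_{\eta} + \underbrace{3}_{\text{massive } B_\mu} &= 4 .
\end{align*}
```

The total number of degrees of freedom is unchanged; the angular mode has only been moved from the scalar sector into the longitudinal polarization of the vector boson.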

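5. SU(2) Gauge Theory

The U(1) transformations considered so far form an Abelian (i.e. commuting) group of transformations. Our aim is to generalize the U(1) theory to symmetries based on non-Abelian groups such as SU(2) or SU(3). We proceed in parallel steps. We take $\phi$ to be a complex doublet of scalars
\[ \phi = \begin{pmatrix}\phi_1 \\ \phi_2\end{pmatrix}. \tag{32} \]
Under an SU(2) rotation, $\phi$ transforms as
\[ \phi \to e^{i\frac{\boldsymbol\tau}{2}\cdot\boldsymbol\alpha}\,\phi, \tag{33} \]
where $\boldsymbol\tau = (\tau_1, \tau_2, \tau_3)$ are the three Pauli matrices and $\boldsymbol\alpha = (\alpha_1, \alpha_2, \alpha_3)$ are three real constants. Note that $\phi^\dagger\phi$ is invariant under the transformation in (33). A Lagrangian invariant under this 'special SU(2) transformation' is
\[ \mathcal{L} = \partial_\mu\phi^\dagger\,\partial^\mu\phi - V(\phi^\dagger\phi), \tag{34} \]
where $\phi^\dagger$ refers to the Hermitian conjugate $\phi^\dagger = (\phi_1^*, \phi_2^*)$.
Next, let us try to generalize the above to the 'general SU(2) transformation':
\[ \phi \to e^{i\frac{\boldsymbol\tau}{2}\cdot\boldsymbol\alpha(x)}\,\phi, \tag{35} \]
where the $\boldsymbol\alpha(x)$ are now functions of spacetime. In order to achieve general SU(2) invariance for our Lagrangian, quite a bit of nontrivial algebra is necessary. First note
\[ \partial_\mu\phi \to e^{i\frac{\boldsymbol\tau}{2}\cdot\boldsymbol\alpha}\,\partial_\mu\phi + \left(\partial_\mu e^{i\frac{\boldsymbol\tau}{2}\cdot\boldsymbol\alpha}\right)\phi. \tag{36} \]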

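In order to cancel the second term in (36), we have to introduce vector fields. We introduce a triplet of vector fields $W^a_\mu$ ($a = 1, 2, 3$) which transform as
\[ W^a_\mu \to \left[e^{i\mathbf{I}\cdot\boldsymbol\alpha}\right]_{ab} W^b_\mu + \frac{1}{2g}\,\epsilon_{abc}\left[\left(\partial_\mu e^{i\mathbf{I}\cdot\boldsymbol\alpha}\right)e^{-i\mathbf{I}\cdot\boldsymbol\alpha}\right]_{bc}, \tag{37} \]
where $I_a$ ($a = 1, 2, 3$) are the SU(2) generators in the $3 \times 3$ matrix representation given by their matrix elements
\[ (I_a)_{bc} = -i\,\epsilon_{abc}, \tag{38} \]
where $\epsilon_{abc}$ is $+1$ or $-1$ if $abc$ is an even or odd permutation of 123, respectively, and it is zero if any two of the indices $abc$ are the same. Combining (36) and (37), one finds
\[ \left(\partial_\mu - ig\,\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}_\mu\right)\phi \to e^{i\frac{\boldsymbol\tau}{2}\cdot\boldsymbol\alpha}\left(\partial_\mu - ig\,\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}_\mu\right)\phi, \tag{39} \]
where we have introduced the vector notation for the SU(2) triplet: $\mathbf{W}_\mu = (W^1_\mu, W^2_\mu, W^3_\mu)$. It is clear that this combination has a simpler transformation property and so can be used for forming the invariant Lagrangian.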
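That the $3 \times 3$ matrices $(I_a)_{bc} = -i\,\epsilon_{abc}$ really represent the SU(2) algebra can be confirmed by brute force. The sketch below builds the matrices in plain Python and checks $[I_a, I_b] = i\,\epsilon_{abc} I_c$ for every pair; the helper names are illustrative and the indices run over 0, 1, 2 rather than 1, 2, 3.

```python
from itertools import product


def levi_civita(a: int, b: int, c: int) -> int:
    """Totally antisymmetric symbol epsilon_abc for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


# (I_a)_{bc} = -i * epsilon_abc, with indices shifted to 0..2.
GENERATORS = [
    [[-1j * levi_civita(a, b, c) for c in range(3)] for b in range(3)]
    for a in range(3)
]


def matmul(x, y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def commutator(x, y):
    """Matrix commutator [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(3)] for i in range(3)]


def satisfies_su2_algebra() -> bool:
    """Check [I_a, I_b] = i * epsilon_abc * I_c for all a, b."""
    for a, b in product(range(3), repeat=2):
        lhs = commutator(GENERATORS[a], GENERATORS[b])
        for i, j in product(range(3), repeat=2):
            rhs = sum(1j * levi_civita(a, b, c) * GENERATORS[c][i][j] for c in range(3))
            if abs(lhs[i][j] - rhs) > 1e-12:
                return False
    return True


if __name__ == "__main__":
    print("SU(2) algebra satisfied:", satisfies_su2_algebra())
```

The same check with $\tau_a/2$ in place of $I_a$ would confirm that the doublet and triplet transformations are two representations of one and the same algebra.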
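We next need the kinetic terms for $W^a_\mu$. Define
\[ G^a_{\mu\nu} \equiv \partial_\mu W^a_\nu - \partial_\nu W^a_\mu + g\,\epsilon_{abc} W^b_\mu W^c_\nu. \tag{40} \]
This transforms as
\[ G^a_{\mu\nu} \to \left[e^{i\mathbf{I}\cdot\boldsymbol\alpha}\right]_{ab} G^b_{\mu\nu}. \tag{41} \]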
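Hence, the generally invariant Lagrangian is*
\[ \mathcal{L} = -\tfrac14\left(\partial_\mu\mathbf{W}_\nu - \partial_\nu\mathbf{W}_\mu + g\,\mathbf{W}_\mu\times\mathbf{W}_\nu\right)^2 + \left[\left(\partial_\mu - ig\,\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}_\mu\right)\phi\right]^\dagger\left[\left(\partial_\mu - ig\,\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}_\mu\right)\phi\right] - V(\phi^\dagger\phi). \tag{42} \]
Just as in the case of the Abelian gauge theory, the gauge field $\mathbf{W}_\mu$ is massless. The mass term $\tfrac12 m_W^2\,\mathbf{W}_\mu\cdot\mathbf{W}_\mu$, if added to the Lagrangian in Equation (42), would violate the general SU(2) invariance.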
The theory of the non-Abelian gauge field $\mathbf{W}_\mu$ was first constructed by Yang and Mills in 1954. Note that, even in the absence of other fields such as $\phi$, the Yang-Mills field $\mathbf{W}_\mu$ is self-interacting. The Lagrangian (42) contains terms cubic and quartic in $\mathbf{W}_\mu$, describing the cubic and quartic vertices of Figure 4. In this respect, the Yang-Mills field differs from the electromagnetic field and is more like gravitation. Since the gravitational field couples to everything which carries energy-momentum and since the gravitational field itself carries energy-momentum, it has to be coupled to itself. Similarly, the Yang-Mills field $\mathbf{W}_\mu$ couples to everything which carries SU(2) quantum numbers and since $\mathbf{W}_\mu$ is a vector under SU(2), it has to interact with itself.

Fig. 4.

The SU(2) non-Abelian gauge theory given above can be easily generalized to any compact Lie group such as SU(n), SO(n), Sp(n) or even an exceptional group, or direct products of these.

* Henceforth, we will not be very careful in raising or lowering indices; $W_\mu W_\mu$ really stands for $W_\mu W^\mu$.

6. Spontaneous Breakdown of SU(2) Symmetry

Special SU(2): We take the potential $V$ to be always of the (b) form. The analogue of Figure 3 must now be plotted in terms of the four real fields contained in the complex doublet $\phi$. We again separate these into the length type and angular type of fields by using
\[ \phi = e^{i\frac{\boldsymbol\tau}{2}\cdot\boldsymbol\theta}\begin{pmatrix}0 \\ \rho\end{pmatrix}, \tag{43} \]
where $\rho$ and $\boldsymbol\theta = (\theta_1, \theta_2, \theta_3)$ are four real fields, taking the place of the two complex fields $\phi_1$ and $\phi_2$. Since $V$ is an SU(2)-invariant function of $\phi$, it depends only on $\rho$ and not on $\boldsymbol\theta$. It is clear that the region of minimum potential (the analogue of the one-dimensional minimum circle of Figure 3) is now a three-dimensional manifold, corresponding to the three angles $\theta_1, \theta_2, \theta_3$. Thus, there are three massless Goldstone bosons in this case and one massive boson corresponding to $\rho$, or rather, to the shifted field $\rho - \langle\rho\rangle$.

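General SU(2): We take the generally invariant Lagrangian of Equation (42) and make the substitution of $\phi$ in terms of $\rho$ and $\boldsymbol\theta$ through Equation (43). We also transform $\mathbf{W}_\mu$ into $\mathbf{W}'_\mu$ with the gauge function chosen to be $\boldsymbol\theta$:
\[ \frac{\boldsymbol\tau}{2}\cdot\mathbf{W}_\mu = U\,\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}'_\mu\,U^{-1} - \frac{i}{g}\,(\partial_\mu U)\,U^{-1}, \qquad U = e^{i\frac{\boldsymbol\tau}{2}\cdot\boldsymbol\theta}. \tag{44} \]
As a consequence of general invariance, the Lagrangian has an identical form to that in Equation (42), except that $\mathbf{W}_\mu$ is replaced by $\mathbf{W}'_\mu$ and $\phi$ is replaced by $\binom{0}{\rho}$, and the 'gauge function' $\boldsymbol\theta$ disappears:
\[ \mathcal{L} = -\tfrac14\left(\partial_\mu\mathbf{W}'_\nu - \partial_\nu\mathbf{W}'_\mu + g\,\mathbf{W}'_\mu\times\mathbf{W}'_\nu\right)^2 + \left[\left(\partial_\mu - ig\,\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}'_\mu\right)\begin{pmatrix}0 \\ \rho\end{pmatrix}\right]^\dagger\left[\left(\partial_\mu - ig\,\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}'_\mu\right)\begin{pmatrix}0 \\ \rho\end{pmatrix}\right] - \lambda(\rho^2 - \phi_0^2)^2. \tag{45} \]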

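Hence, the Goldstone bosons have disappeared and all the three vector bosons have become massive. The mass terms for the vector bosons are easily obtained from the relevant part of Equation (45) by the replacement of $\rho$ by its vacuum expectation value $\phi_0$:
\[ g^2\,(0\ \ \phi_0)\left(\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}\right)\left(\frac{\boldsymbol\tau}{2}\cdot\mathbf{W}\right)\begin{pmatrix}0 \\ \phi_0\end{pmatrix} = \frac{g^2\phi_0^2}{4}\,\mathbf{W}\cdot\mathbf{W}. \tag{46} \]
We have ignored the Lorentz vector index $\mu$ as well as the prime on the $W$ fields. We thus find that all the three vector fields acquire the same mass given by
\[ m_W = \frac{1}{\sqrt2}\,g\,\phi_0. \tag{47} \]

7. One More Model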

In the SU(2) model considered above, the scalar field was a complex doublet field and this led to a system with all three vector bosons gaining mass after symmetry breakdown. We next consider an SU(2) model with a real triplet scalar field. In this case, not all the vector bosons become massive.
Special SU(2)
\[ \mathcal{L} = \tfrac12\,\partial_\mu\boldsymbol\phi\cdot\partial_\mu\boldsymbol\phi - V(\boldsymbol\phi\cdot\boldsymbol\phi), \tag{48} \]
where $\boldsymbol\phi = (\phi_1, \phi_2, \phi_3)$ is a triplet (vector) representation of SU(2) and it is taken to be real. We put
\[ \boldsymbol\phi = e^{i(I_1\theta_1 + I_2\theta_2)}\begin{pmatrix}0 \\ 0 \\ \rho\end{pmatrix}, \tag{49} \]

Standard Gauge Model of High Energy Physics

195

where we have used the three-dimensional (column) matrix notation for 4>, and Ia
are the 3 x 3 matrix representation of the SU(2) matrices, already given in
Equation (38). The fields p, ()1 and ()2 are three real fields replacing 4>1,4>2' and 4>3'
By following the same reasoning as before, when SU(2) symmetry is broken
as a consequence of the nonvanishing vacuum expectation value of the scalar
field, ()1 and ()2 will become massless Goldstone bosons, while p will become
massive.
General SU(2)

(50)

where we have used the SU(2) vector notation for both WI' and 4>. Again,
following the same argument, we see that ()1 and ()2 will become the longitudinal
components of two of the vector bosons which will, therefore, emerge as massive
vector bosons. The third vector boson will remain massless. This can be worked
out from the piece !g2(W I' x 42 contained in the above Lagrangian by replacing
4> a by its vacuum expectation value 4>0 ba3' Thus,

(51)

(52)

where we have dropped the Lorentz index on the vector field W for notational
convenience. Thus, $W_1$ and $W_2$ have masses equal to $g\phi_0$, while $W_3$ remains
massless.
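The mass pattern quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the original derivation: it assumes the mass term has the form $\tfrac{1}{2}g^2(W\times\phi)^2$ with $\phi$ replaced by $(0, 0, \phi_0)$, and reads off the matrix $M^2_{ab} = g^2 (e_a\times\phi)\cdot(e_b\times\phi)$. The function names are hypothetical.

```python
def cross(a, b):
    """Ordinary three-dimensional vector product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triplet_mass_squared(g, phi_vev):
    """Mass-squared matrix of the three W fields for a real triplet.

    Assumes the term (1/2) g^2 (W x phi)^2, so that
    M2[a][b] = g^2 (e_a x phi) . (e_b x phi).
    """
    basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    rotated = [cross(e, phi_vev) for e in basis]
    return [[g * g * sum(x * y for x, y in zip(rotated[a], rotated[b]))
             for b in range(3)] for a in range(3)]


# Illustrative numbers only: g = 2, phi_0 = 3, vacuum value along the third axis.
M2 = triplet_mass_squared(2, (0, 0, 3))
```

With these illustrative numbers the matrix is diagonal with entries $g^2\phi_0^2$, $g^2\phi_0^2$ and 0, which is the statement that $W_1$ and $W_2$ have mass $g\phi_0$ while $W_3$ stays massless.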
8. General Case of Non-Abelian Symmetry Breakdown

Let us consider any compact Lie group and work out the symmetry breaking.
Let g be the number of generators of the group, which is also the number of gauge
bosons, and let $\phi$ contain n real components.
Writing $\phi$ in the form
(53)


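where $\Gamma$ are the generators in the representation of $\phi$, we take $\nu$ nonvanishing
components for $\rho$ and $r = n - \nu$ nonvanishing components for $\theta$:

$$ \rho = \begin{pmatrix} \rho_1 \\ \vdots \\ \rho_\nu \\ 0 \\ \vdots \\ 0 \end{pmatrix} \ (n \text{ components}), \qquad \theta = (\underbrace{\theta_1, \ldots, \theta_r}_{r = n - \nu}, 0, \ldots, 0) \ (g \text{ components}). \tag{54} $$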

This split-up between the angle-type variable $\theta$ and the length-type variable $\rho$ is
completely determined by the representation to which $\phi$ belongs. The number of
length-type variables $\nu$ is, in fact, equal to the number of independent invariants
one can construct out of $\phi$. This number $\nu$ is called the canonical number of the
representation. The $\theta$ fields are massless while the $\rho$ fields lead to massive
excitations. Hence, the number of Goldstone bosons is given by the difference
$r = n - \nu$. This is also the number of gauge bosons which will become massive
and so the number of massless gauge bosons is $g - n + \nu$.
In the examples already considered above, for the doublet $\phi$, the only invariant
is $\phi^\dagger\phi$ and for the real triplet also, there is only one invariant $\phi\cdot\phi$ and so $\nu = 1$
for both. Hence there is only one $\rho$ field in both these examples of SU(2) breaking
and the rest of the field components must be accommodated in the angle-type
variables $\theta$, each leading to a Goldstone boson.
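The counting rule of this section is simple enough to tabulate. The following is a minimal sketch, assuming only the relations stated in the text (Goldstone bosons $= n - \nu$, massless gauge bosons $= g - n + \nu$); the function name and the dictionary keys are hypothetical.

```python
def count_modes(g, n, nu):
    """Apply the counting rule of the text.

    g  : number of generators (= number of gauge bosons)
    n  : number of real components of the scalar field
    nu : canonical number (number of independent invariants)
    """
    goldstone = n - nu
    if goldstone < 0 or goldstone > g:
        raise ValueError("inconsistent input: need 0 <= n - nu <= g")
    return {
        "goldstone": goldstone,          # angle-type fields
        "massive_scalars": nu,           # length-type fields
        "massive_gauge": goldstone,      # one per absorbed Goldstone boson
        "massless_gauge": g - goldstone,
    }


# The examples discussed in the text.
su2_doublet = count_modes(g=3, n=4, nu=1)      # complex doublet of SU(2)
su2_triplet = count_modes(g=3, n=3, nu=1)      # real triplet of SU(2)
su2_u1_doublet = count_modes(g=4, n=4, nu=1)   # doublet of SU(2) x U(1)
```

The three calls reproduce the cases worked out in the text: no massless gauge boson for the SU(2) doublet, one for the real triplet, and one (the photon) for the SU(2) × U(1) doublet.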
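We may also write down the general mass matrix for the vector bosons,
resulting from a spontaneous breakdown of symmetry:
$$ M^2_{ab} = g^2\,\Gamma^a_{\alpha\beta}\,\Gamma^b_{\alpha\gamma}\,\langle\phi_\beta\rangle\langle\phi_\gamma\rangle. \tag{55} $$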

This is a generalization of the mass calculation in Equations (51) and (52).

9. SU(2) × U(1) Model

We are now ready to face a more realistic model (needed in high energy physics),
which is obtained by combining the U(1) and SU(2) models already discussed
above.
We start with the Lagrangian of a scalar field $\phi$, which is a doublet under SU(2)
and, being complex, has a nonvanishing U(1) charge also:
(56)

(57)

This Lagrangian has special SU(2) × U(1) invariance. Substitution of the form

(58)

reveals the presence of three massless Goldstone bosons $\theta_1$, $\theta_2$ and $\theta_3$ and
a massive scalar boson $\rho$. (Since $\phi^\dagger\phi$ is the only invariant, the canonical number
$\nu = 1$.)
To achieve general SU(2) × U(1) invariance, we need a triplet of SU(2) gauge
bosons $W_\mu$ and a singlet U(1) gauge boson $B_\mu$. The generally-invariant
Lagrangian is
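$$ \mathcal{L} = -\tfrac{1}{4}(\partial_\mu W_\nu - \partial_\nu W_\mu + gW_\mu\times W_\nu)^2 - \tfrac{1}{4}(\partial_\mu B_\nu - \partial_\nu B_\mu)^2 + \phi^\dagger\Big(\partial_\mu + ig\frac{\tau}{2}\cdot W_\mu + i\frac{g'}{2}B_\mu\Big)\Big(\partial^\mu - ig\frac{\tau}{2}\cdot W^\mu - i\frac{g'}{2}B^\mu\Big)\phi - \lambda(\phi^\dagger\phi - \phi_0^2)^2, \tag{59} $$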

which is a combination of the Lagrangians in Equations (25) and (42). We have
called the SU(2) gauge coupling constant g and the U(1) gauge coupling
constant g'. Now, since there are four gauge bosons, whereas the number of
Goldstone bosons is only three, one massless gauge boson survives and along
with that a general U(1) symmetry (which need not be the same one we started
with) remains unbroken.
After making the gauge transformation with the gauge function $\theta$, the field
$\theta$ gets eliminated and the final form of the Lagrangian is obtained from Equation
(59) by the simple substitution
$$ \rho = \phi_0 + \eta, \tag{60} $$
where $\phi_0$ is a constant and $\eta$ is the massive scalar field. Thus, we have


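$$ \mathcal{L} = -\tfrac{1}{4}(\partial_\mu W_\nu - \partial_\nu W_\mu + gW_\mu\times W_\nu)^2 - \tfrac{1}{4}(\partial_\mu B_\nu - \partial_\nu B_\mu)^2 + \tfrac{1}{4}\,(0 \ \ \phi_0)\begin{pmatrix} -gW_3 + g'B & -g(W_1 - iW_2) \\ -g(W_1 + iW_2) & gW_3 + g'B \end{pmatrix}^2\begin{pmatrix} 0 \\ \phi_0 \end{pmatrix} + \eta\text{-dependent terms}. \tag{61} $$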

The vector boson mass term, which can be read off from this equation, is
$$ \tfrac{1}{4}g^2\phi_0^2\,(W_1 + iW_2)(W_1 - iW_2) + \tfrac{1}{4}\phi_0^2\,(gW_3 + g'B)^2. \tag{62} $$

We thus identify the massive fields and their masses as follows:
$$ W^\pm_\mu = \frac{W^1_\mu \mp iW^2_\mu}{\sqrt{2}}, \qquad Z_\mu = \frac{gW^3_\mu + g'B_\mu}{\sqrt{g^2 + g'^2}}. \tag{63} $$


The fields defined here are the normalized fields. $W^\pm_\mu$ are complex fields and
correspond to massive charged vector bosons, while $Z_\mu$ is a real field and
corresponds to a massive neutral vector boson. The combination orthogonal to
$Z_\mu$ remains massless and so we shall identify it with the photon field $A_\mu$
(electromagnetic vector potential):
$$ A_\mu = \frac{-g'W^3_\mu + gB_\mu}{\sqrt{g^2 + g'^2}}. \tag{64} $$

It is convenient to define the weak mixing angle $\theta_W$ by
$$ \tan\theta_W = \frac{g'}{g}, \tag{65} $$
so that part of Equations (63) and (64) can be rewritten as
$$ Z_\mu = \cos\theta_W\,W^3_\mu + \sin\theta_W\,B_\mu, \qquad A_\mu = -\sin\theta_W\,W^3_\mu + \cos\theta_W\,B_\mu \tag{66} $$

and
(67)
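The identification of the massive and massless combinations can be verified numerically. The sketch below is an illustration under stated assumptions: it takes the mass term of Equation (62) to be $\tfrac{1}{4}g^2\phi_0^2(W_1^2 + W_2^2) + \tfrac{1}{4}\phi_0^2(gW_3 + g'B)^2$, reads real-field mass terms as $\tfrac{1}{2}m^2(\text{field})^2$, and uses the combinations of Equation (66). The function name and the sample couplings are hypothetical.

```python
import math


def electroweak_spectrum(g, g_prime, phi0):
    """Masses implied by the assumed mass term of Equation (62)."""
    norm = math.hypot(g, g_prime)
    cos_w, sin_w = g / norm, g_prime / norm          # tan(theta_W) = g'/g

    # Charged sector: (1/4) g^2 phi0^2 (W1^2 + W2^2) = (1/2) m_W^2 (W1^2 + W2^2).
    m_w = g * phi0 / math.sqrt(2)

    # Neutral sector in the (W3, B) basis: (1/2) v^T M2 v = (1/4) phi0^2 (g W3 + g' B)^2.
    half = phi0 ** 2 / 2
    m2 = [[half * g * g, half * g * g_prime],
          [half * g * g_prime, half * g_prime ** 2]]

    def quadratic_form(v):
        return sum(v[i] * m2[i][j] * v[j] for i in range(2) for j in range(2))

    z_direction = (cos_w, sin_w)        # Z = cos(theta) W3 + sin(theta) B
    photon_direction = (-sin_w, cos_w)  # A = -sin(theta) W3 + cos(theta) B
    return {
        "theta_w": math.atan2(g_prime, g),
        "m_w": m_w,
        "m_z": math.sqrt(quadratic_form(z_direction)),
        "m_photon_squared": quadratic_form(photon_direction),
    }


# Illustrative couplings only (not taken from the text).
spectrum = electroweak_spectrum(g=0.65, g_prime=0.36, phi0=174.0)
```

Under these assumptions the photon direction has vanishing mass squared (up to rounding) and the Z mass equals $m_W/\cos\theta_W$.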

This model is, in fact, the successful electroweak model of Glashow, Salam and
Weinberg, which unifies the weak and electromagnetic interactions through
SU(2) × U(1) gauge theory. After symmetry breakdown, a U(1) gauge symmetry
remains unbroken and it is identified with electromagnetic U(1), with the
corresponding massless gauge boson, namely the photon. The three massive
vector bosons $W^\pm_\mu$ and $Z_\mu$ mediate the short-ranged weak interactions such as
β-decay.
In terms of $W^\pm_\mu$, $Z_\mu$ and $A_\mu$, the Lagrangian becomes
fL'

= -tw;v W-';v + m~ W; W-,; - -l:F~v - -l:Z~v +!m~ZIlZIl - [2ig sin ()w {AiW-';v W: - W; W;v)} - g2 sin 2 ()w{AIlAv W; W; - All All W: W;}] -

+ 2ig cos ()w{Z/W-,;v W: - W; W;v)} +


()W{ZIlZV W; W; - ZIlZIl W: W;} +

- 2ig sin ()wFllv W; W;

+ g2
+ 2ig cos ()WZIlV W; W; - g2 cos ()w sin ()w
{AIlZvW;W; + AvZIlW;W; -2AIl Z Il W:W;} +
cos 2

+ g2

{W; W; W; W; - W; W: w-,; W;}

+ ,,-dependent terms,
(68)

where we have put


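$$ W^\pm_{\mu\nu} \equiv \partial_\mu W^\pm_\nu - \partial_\nu W^\pm_\mu; \qquad F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{69} $$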


We make the following observations on the structure of Equation (68).


(1) The coupling of the charged vector bosons $W^\pm_\mu$ to the electromagnetic field
$A_\mu$ is automatically contained in the Lagrangian, provided we identify
$$ g\sin\theta_W = e. \tag{70} $$

(2) In particular, all the terms within the square brackets [ ... ] in the above
equation arise from the so-called 'minimal' electromagnetic coupling
arising from the replacement
(71 )

(3) However, there is a nonminimal term also. This is the piece $F_{\mu\nu}W^+_\mu W^-_\nu$
which, in fact, ascribes an anomalous magnetic moment to the W bosons.
The value of the anomalous magnetic moment $\kappa_W$ is unity, thus giving 2 for
the g-factor of the W boson:
$$ g_W = 1 + \kappa_W = 2. \tag{72} $$

This feature is a consequence of the symmetry of the cubic Yang-Mills


vertex between the three vector bosons and is a characteristic of any theory
in which charged vector bosons are incorporated into a Yang-Mills
theory.
(4) There exists a perfect $A_\mu \leftrightarrow Z_\mu$ symmetry. As a consequence, the charged
particles are coupled to $Z_\mu$ exactly in the same manner as to $A_\mu$, the only
difference being the replacement of $g\sin\theta_W$ by $-g\cos\theta_W$ (see Figure 5).
(5) Our last comment is on the W+ W+ W- W- term, which implies a direct
coupling among the charged bosons without involving the electromagnetic
field. It is, in fact, the presence of this term which makes this theory of
massive charged vector bosons a consistent one; without such a term, the
theory of massive charged vector bosons was known to be an inconsistent
theory [1].

Fig. 5. (vertex label: $-g\cos\theta_W$)

10. 'Standard Model' before Gauge Theory

Our aim is to construct the standard model of gauge theory. Before doing that, it
is useful to have a brief glance at the standard model of high energy physics that
existed (say, in the late 60's) before the advent of gauge theory. This pre-gauge
theoretic standard model can be described by the Lagrangian


+ e[iyioll- ieAI") - meJe + iveyllollv e +


+ ,u[iy)JO..l - ieA..\) - mllJjl + ivp y..\ 0). VI" +

!i' = -iFI"VFIlV

+ u[iyp(oP + iieA'")

- muJu -+- a[iYpW

+ s[iYI"(oP - tieAP) - msJs +

- ~eAP) -

md}-+-

ji1{J: ,J;}-+G

-+- strong interactions among the quarks,

(73)

where

r; = 1eh(1 - Ys)v e + 1jlY..\(1- Y5)V Il +


_.

(l-ys)

+(acosBe + sSlllBJY).--2-- u,

J; = (J;(

(74)
(75)

This Lagrangian describes the electromagnetic and weak interactions of the
quarks u, d, s, with respective electric charges 2/3, −1/3 and −1/3 (in units of the
electronic charge e) and the leptons $e$, $\mu$, $\nu_e$, $\nu_\mu$, with electric charges −1, −1, 0, 0,
respectively. The existence of these quarks as the constituents of the hadrons had
already been guessed from hadron spectroscopy. However, nobody knew the
precise form of the 'strong interaction' among the quarks which is responsible for
the binding of the quarks inside the composite hadrons. So, we have left it
unspecified in Equation (73).
The weak interaction, however, was rather precisely known to be the current
× current form of Feynman and Gell-Mann given in Equation (73), with the
weak current being given by the V−A form of Sudarshan and Marshak given in
Equation (74). In this equation, the strength of the weak interaction has been
distributed among the ordinary β-decay transition (described by the $\bar d u$ piece of
the weak current) and the strangeness-changing decay (described by the $\bar s u$ piece
of the weak current) in the proportion $\cos\theta_c$ and $\sin\theta_c$, respectively. This is called
Cabibbo universality and the empirical value of the Cabibbo angle is given by
$\sin\theta_c \approx 0.22$.
Violation of CP invariance was experimentally well-established by that time,
but not theoretically understood and so the above Lagrangian in Equation (73)
does not incorporate CP violation. It is also worth pointing out that the standard
axioms of quantum field theory require the symmetrized form of the current
× current interaction, given by the anticommutator of currents in Equation (73);
otherwise even the CPT theorem will be violated [2].
What is the connection of the weak and electromagnetic interaction given in
Equation (73) to the SU(2) × U(1) model developed in the earlier section? This
connection is made through the algebra of the weak and electromagnetic currents
which we discuss below.

11. Current Algebra and SU(2) × U(1) Charges of the Fermions

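The electromagnetic interaction contained in the covariant derivatives of
Equation (73) can be regrouped in the form $eJ^{e.m.}_\lambda A_\lambda$, where the current $J^{e.m.}_\lambda$ is
given by
$$ J^{e.m.}_\lambda = -\bar e\gamma_\lambda e - \bar\mu\gamma_\lambda\mu + \tfrac{2}{3}\bar u\gamma_\lambda u - \tfrac{1}{3}\bar d\gamma_\lambda d - \tfrac{1}{3}\bar s\gamma_\lambda s. \tag{76} $$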

We shall now show that the weak and electromagnetic currents of the quarks and
leptons given by Equations (74)-(76) satisfy the SU(2) × U(1) algebra. To do this,
let us split the currents into the leptonic and hadronic (quark) parts
$$ J^\pm_\lambda = j^\pm_\lambda(e) + j^\pm_\lambda(\mu) + j^\pm_\lambda(q), \tag{77} $$
(78)

and let us write these currents in matrix notation with the lepton pairs and quark
pairs collected into doublets

(01 ~)G")
._( )-(-d') (1-1'5) (0
q Yl
2
1 ~)G}

.(-)( ) _ (- -)_ (1 - 1'5)


J A. e - ve e y,-----'-'=A
2

(79)

(80)

(0

h e

. + ()

__ ) (1 - Y5)
= (Vee
Yl--2.:....:::.c.- 0

jt(q)

= (iid')Yl (1

j1'm. (e) = (ve e)y l

~ 1'5) (~

~)Ge}

(81)

~)G}

(82)

(~ _~)Ce}

j1'm'(q) = (uJ')Yla

-~)(~)

(83)
(84)

where we have defined the Cabibbo-rotated quark
$$ d' \equiv d\cos\theta_c + s\sin\theta_c. \tag{85} $$
The muonic currents are similar to the electronic currents and, hence, are not
written separately. We, thus, see that the weak currents involve the raising and
lowering matrices of SU(2) algebra
$$ \tau^+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad \tau^- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{86} $$
The electromagnetic current involves a diagonal matrix which is not the third
SU(2) matrix
$$ \tau^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{87} $$

but can be written as a linear combination of $\tau^3$ and the unit matrix. So, by taking
the difference between the electric charge matrix Q and the 'weak isospin' matrix
$I_3 = \tfrac{1}{2}\tau^3$, we get a unit matrix (multiplied by a number), which we shall call the
weak hypercharge Y:
$$ Y = Q - I_3. \tag{88} $$

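So, for leptons, we have
$$ Y = \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix} - \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = -\frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{89} $$
while for quarks
$$ Y = \begin{pmatrix} \tfrac{2}{3} & 0 \\ 0 & -\tfrac{1}{3} \end{pmatrix} - \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{1}{6}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{90} $$
This hypercharge matrix Y commutes with all the SU(2) matrices $\tau^+$, $\tau^-$ and $\tau^3$ and,
hence, can be taken to be the generator of the independent U(1) symmetry. We
thus have the SU(2) × U(1). The hypercharge values for the leptonic and quark
doublets can be inferred from Equations (89) and (90) to be −1/2 and 1/6, respectively.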
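The hypercharge assignments follow from $Y = Q - I_3$ by exact arithmetic. The sketch below assumes only that relation and the electric charges quoted in the text; the helper name is hypothetical.

```python
from fractions import Fraction as F


def hypercharges(charges, isospins):
    """Return Y = Q - I3 for each member of a multiplet."""
    return [q - i3 for q, i3 in zip(charges, isospins)]


DOUBLET_I3 = [F(1, 2), F(-1, 2)]   # upper and lower members of an SU(2) doublet

lepton_doublet = hypercharges([F(0), F(-1)], DOUBLET_I3)        # (nu_e, e)
quark_doublet = hypercharges([F(2, 3), F(-1, 3)], DOUBLET_I3)   # (u, d')

# Right-handed fields are SU(2) singlets (I3 = 0), so Y equals Q.
right_handed = hypercharges([F(2, 3), F(-1, 3), F(-1)], [F(0)] * 3)  # u_R, d_R, e_R
```

Both members of each doublet come out with the same Y, as they must if Y is to commute with the SU(2) generators.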
To be more precise, we must split the leptonic and quark fields into their
left-handed and right-handed parts by the definition
$$ f_{L,R} = \tfrac{1}{2}(1 \mp \gamma_5) f \tag{91} $$
for all the fermionic fields. The weak currents involve only the left-handed fields
and so these fields form the doublets under SU(2) while the right-handed fields
must be regarded as singlets under SU(2). The right-handed fields have
nonvanishing hypercharge, however, and their values are equal to their electric
charges Q (by Equation (88)).
The SU(2) and U(1) quantum numbers of all the fermions are given in Table I.
The right-handed neutrino $\nu_R$ is a singlet under SU(2) and has Y = 0 and so does
not participate in the weak or the electromagnetic interactions. Hence, it
has been dropped from the table. It may not even exist; in any case it has not been
detected so far.
Table I.

Fermion                   SU(2)      Y
q_L ≡ (u, d)_L            doublet    1/6
l_L(e) ≡ (ν_e, e)_L       doublet    −1/2
u_R                       singlet    2/3
d_R                       singlet    −1/3
e_R                       singlet    −1

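We may now write the zeroth components of the leptonic and quark currents in
the form
$$ j^i_0(e) = \tfrac{1}{2}\, l_L^\dagger\tau^i l_L, \tag{92} $$
$$ j^i_0(q) = \tfrac{1}{2}\, q_L^\dagger\tau^i q_L, \tag{93} $$
$$ j^Y_0(e) = -\tfrac{1}{2}\, l_L^\dagger l_L - e_R^\dagger e_R, \tag{94} $$
$$ j^Y_0(q) = \tfrac{1}{6}\, q_L^\dagger q_L + \tfrac{2}{3}\, u_R^\dagger u_R - \tfrac{1}{3}\, d_R^\dagger d_R, \tag{95} $$
where i = 1, 2, 3 and we have defined the Cartesian components
$$ j^1_\lambda = (j^+_\lambda + j^-_\lambda); \qquad j^2_\lambda = -i(j^+_\lambda - j^-_\lambda) \tag{96} $$
and also defined the hypercharge current
$$ j^Y_\lambda = j^{e.m.}_\lambda - j^3_\lambda. \tag{97} $$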

In quantum field theory, in general one has the commutation relation
$$ [\psi^\dagger(x)A\psi(x), \psi^\dagger(y)B\psi(y)] = \psi^\dagger(x)[A, B]\psi(x)\,\delta^3(x - y), \tag{98} $$
which follows from the canonical equal-time anticommutation relation for Dirac
fields
(99)

In Equation (98), $\psi(x)$ is a multicomponent Dirac field, A and B are matrices and
matrix multiplication is implied. Use of Equation (98) allows one to trivially
verify the SU(2) × U(1) algebra for the leptonic currents and quark currents
separately and also for the total currents:
$$ J_\lambda = j_\lambda(e) + j_\lambda(\mu) + j_\lambda(q), \tag{100} $$

$$ [J^i_0(x), J^j_0(y)] = i\epsilon_{ijk}\,J^k_0(x)\,\delta^3(x - y), \qquad [J^Y_0(x), J^i_0(y)] = 0. \tag{101} $$

By integrating these equations over x and y, we also get the algebra of charges:
$$ [I^i, I^j] = i\epsilon_{ijk}\,I^k, \qquad [Y, I^i] = 0, \tag{102} $$
where we have defined the SU(2) × U(1) fermionic charges:
$$ I^i = \int d^3x\, J^i_0(x) \quad (i = 1, 2, 3), \qquad Y = \int d^3x\, J^Y_0(x). \tag{103} $$

12. The Electroweak Gauge Theory

We are now ready to discuss the Glashow-Salam-Weinberg electroweak theory.

In fact, the Lagrangian of this theory is simply obtained by adding the leptonic


and quark terms to the Lagrangian of the SU(2) x U(l) model given in Equation
(59). Thus, we get

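$$ \begin{aligned} \mathcal{L} = {} & -\tfrac{1}{4}(\partial_\mu W_\nu - \partial_\nu W_\mu + gW_\mu\times W_\nu)^2 - \tfrac{1}{4}(\partial_\mu B_\nu - \partial_\nu B_\mu)^2 \\ & + i\bar q_L\gamma^\mu\Big(\partial_\mu + ig\frac{\tau}{2}\cdot W_\mu + \frac{i}{6}g'B_\mu\Big)q_L + i\bar u_R\gamma^\mu\Big(\partial_\mu + \frac{2i}{3}g'B_\mu\Big)u_R + i\bar d_R\gamma^\mu\Big(\partial_\mu - \frac{i}{3}g'B_\mu\Big)d_R \\ & + i\bar l_L\gamma^\mu\Big(\partial_\mu + ig\frac{\tau}{2}\cdot W_\mu - \frac{i}{2}g'B_\mu\Big)l_L + i\bar e_R\gamma^\mu(\partial_\mu - ig'B_\mu)e_R \\ & + \Big|\Big(\partial_\mu + ig\frac{\tau}{2}\cdot W_\mu + i\frac{g'}{2}B_\mu\Big)\phi\Big|^2 - \lambda(\phi^\dagger\phi - \phi_0^2)^2 \\ & - (h_u\,\bar q_L\phi_c u_R + h_d\,\bar q_L\phi\, d_R + h_e\,\bar l_L\phi\, e_R + \text{h.c.}), \end{aligned} \tag{104} $$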

where
$$ \phi_c = i\tau^2\phi^* = \begin{pmatrix} \phi_2^* \\ -\phi_1^* \end{pmatrix}, \tag{105} $$
(106)

$h_u$, $h_d$ and $h_e$ are arbitrary coupling constants and h.c. refers to Hermitian
conjugate.
There are two groups of additional terms in the above Lagrangian: invariant
kinetic energy terms for the quarks and leptons, and terms of the type $\bar f f\phi$ which
couple the Fermi fields $f$ with the Higgs fields and which are called Yukawa
terms. The former contain the couplings of the fermions with the gauge fields
W and B with couplings specified by their SU(2) and U(1) quantum numbers
given in Table I and, hence, are invariant under the general SU(2) × U(1) group.
The Yukawa couplings with the Higgs field also are invariant under the same
group. By construction, they are SU(2) scalars and they also conserve the U(1)
quantum number. In fact, the Lagrangian in Equation (104) contains all the terms
allowed by general SU(2) × U(1) invariance and renormalizability. The term
renormalizability will be explained below.
Note an important omission: the fermionic mass terms $m\bar f f$ are missing. It is
impossible to add any fermionic mass terms without violating SU(2) × U(1)
symmetry. The only SU(2) × U(1) invariant terms are of the type $\bar q_L q_L$, $\bar u_R u_R$, etc.,
but these are zero:

$$ \bar f_L f_L = \bar f\,\Big(\frac{1 + \gamma_5}{2}\Big)\Big(\frac{1 - \gamma_5}{2}\Big) f = 0 \tag{107} $$

and similarly for $\bar f_R f_R$. There is no term of the $\bar f_L f_R$ type which conserves
SU(2) × U(1) quantum numbers. Hence, all the fermions at this stage are
massless. Fermion masses will be generated by the spontaneous breaking of
symmetries.
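The vanishing of $\bar f_L f_L$ rests on the projector identity $(1 + \gamma_5)(1 - \gamma_5) = 0$, which holds because $\gamma_5^2 = 1$. A minimal check is sketched below; it assumes a representation in which $\gamma_5$ is diagonal, chosen here purely for convenience, and the helper names are hypothetical.

```python
def diag(entries):
    """Square matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


IDENTITY = diag([1, 1, 1, 1])
GAMMA5 = diag([-1, -1, 1, 1])   # assumed diagonal (chiral) representation


def chiral_projector(sign):
    """(1 + sign * gamma5) / 2; sign = -1 gives P_L, sign = +1 gives P_R."""
    return [[(IDENTITY[i][j] + sign * GAMMA5[i][j]) / 2 for j in range(4)]
            for i in range(4)]


P_L = chiral_projector(-1)
P_R = chiral_projector(+1)
ZERO = diag([0, 0, 0, 0])
```

Here `matmul(P_R, P_L)` is the zero matrix, which is the content of Equation (107), while `matadd(P_L, P_R)` gives the identity, so that $\bar f_L f_R + \bar f_R f_L$ can reproduce the full $\bar f f$.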
The muonic terms, which have been omitted in the above Lagrangian, are
exactly similar to the electronic terms. Note that we have used the d quark rather
than the Cabibbo-rotated quark $d' = d\cos\theta_c + s\sin\theta_c$. In effect, we have put
$\theta_c = 0$ and omitted the strange quarks. This omission of the strange quark terms
in the Lagrangian is deliberate. If we had used d' instead of d, this would have led
to the derivative terms in the Lagrangian

(108)

as compared to the correct derivative terms for full-fledged Dirac particles d and
s:
(109)

This defect is due to the fact that the other orthogonal combination,
$$ s' \equiv -d\sin\theta_c + s\cos\theta_c, \tag{110} $$

has so far been ignored. Including this in the Lagrangian would restore the
derivative terms in full measure for the two particles d and s
(111 )

However, what about the SU(2) × U(1) invariance of $\bar s'\gamma^\mu\partial_\mu s'$? One simple way
of ensuring this is to assume that s' is a singlet under SU(2), since there is no
partner for s' to make up a doublet. Its Y value must be assumed to be equal to Q,
for consistency with the relation:
$$ Q = I_3 + Y. \tag{112} $$

However, we shall not pursue this rather asymmetrical assignment of quantum
numbers, for there is a more serious phenomenological problem with the strange
quarks, which we shall discuss soon. At that point, we shall give the correct
treatment for s quarks. For the present, we shall carry on with $\theta_c = 0$ and ignore s.
The Higgs potential $\lambda(\phi^\dagger\phi - \phi_0^2)^2$ in the Lagrangian of Equation (104)
implies a nonvanishing vacuum expectation value $\phi_0$ for $\phi$, which leads to
breaking of SU(2) × U(1) symmetry and generation of mass for three of the vector
bosons, leaving the fourth vector boson massless, exactly as in the earlier section.
For the present Lagrangian, nonvanishing $\phi_0$ has one more consequence arising
from the Yukawa terms $\bar f f\phi$. It is clear that the replacement of $\phi$ with its constant


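vacuum expectation value $\phi_0$ leads to mass terms for the fermions. We have

$$ \begin{aligned} h_u\,\bar q_L\phi_c u_R + h_d\,\bar q_L\phi\, d_R + h_e\,\bar l_L\phi\, e_R + \text{h.c.} & \to h_u(\bar u_L\ \bar d_L)\begin{pmatrix} \phi_0 \\ 0 \end{pmatrix}u_R + h_d(\bar u_L\ \bar d_L)\begin{pmatrix} 0 \\ \phi_0 \end{pmatrix}d_R + h_e(\bar\nu_L\ \bar e_L)\begin{pmatrix} 0 \\ \phi_0 \end{pmatrix}e_R + \text{h.c.} \\ & = h_u\phi_0\,\bar u_L u_R + h_d\phi_0\,\bar d_L d_R + h_e\phi_0\,\bar e_L e_R + \text{h.c.} \\ & = h_u\phi_0(\bar u_L u_R + \bar u_R u_L) + h_d\phi_0(\bar d_L d_R + \bar d_R d_L) + h_e\phi_0(\bar e_L e_R + \bar e_R e_L) \\ & = h_u\phi_0\,\bar u u + h_d\phi_0\,\bar d d + h_e\phi_0\,\bar e e, \end{aligned} \tag{113} $$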

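where we have assumed that $\phi_0$ and the Yukawa coupling constants h are all real
and used the following relations for the chiral Fermi fields $f_L$ and $f_R$:
$$ (\bar f_L f_R)^\dagger = \bar f_R f_L, \qquad \bar f_L f_R = \bar f\Big(\frac{1 + \gamma_5}{2}\Big)\Big(\frac{1 + \gamma_5}{2}\Big)f = \bar f\Big(\frac{1 + \gamma_5}{2}\Big)f, $$
$$ \bar f_R f_L = \bar f\Big(\frac{1 - \gamma_5}{2}\Big)\Big(\frac{1 - \gamma_5}{2}\Big)f = \bar f\Big(\frac{1 - \gamma_5}{2}\Big)f, \qquad \bar f_L f_R + \bar f_R f_L = \bar f f. \tag{114} $$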

We thus identify the masses of the quarks and leptons:


(115)

where the last equation has been added to make it more complete. The moral is
that spontaneous breaking of SU(2) × U(1) generates masses not only for the
gauge bosons, but also for the fermions.
This completes the construction of the electroweak gauge theory.

13. Consequences of the Electroweak Theory

The interactions of the quarks and leptons with the gauge bosons $W_\mu$ and $B_\mu$ are
all contained in the covariant derivatives occurring in the Lagrangian of
Equation (104). They can be collected together and written in the alternate form:
$$ \mathcal{L}' = g\,W_\mu\cdot J_\mu + g'B_\mu(J^{e.m.}_\mu - J^3_\mu), \tag{116} $$
where $J_\mu$ and $J^Y_\mu$ are, respectively, the currents of the weak isospin group SU(2)
and weak hypercharge group U(1):

Il

(u) + (-vee-)

= (uo)" (1 - 1'5)
fll
2 t" d

I'll

(Ve)

(1 - 1'5)
2 t" e '

(117)
(118)

On re-expressing the fields $W_\mu$ and $B_\mu$ in terms of the physical fields $W^\pm_\mu$, $Z_\mu$ and
$A_\mu$, using Equations (63) and (66), we get

"

= g

sin ewJ~m. AI' +

gF,{J;

2y 2

W; + J; W;} +

+-g-{J3-2sin 2 e rm}Z
cos ew I'
w I'
I'

(119)

The first piece in $\mathcal{L}'$ is the familiar electromagnetic interaction with the
identification already made (70)
$$ g\sin\theta_W = e. \tag{120} $$
The second piece, containing the interaction of the 'charged currents' $J^\pm$ with the
charged bosons $W^\pm$, must be compared to the old current × current form of the
weak interaction in Equation (73).
We see that the Fermi contact interaction of the old form of the weak
interaction describing processes like β-decay is replaced by the W-exchange
form (see Figure 6). The Fermi coupling constant $G_F$ gets related to $g^2$ multiplied
by $1/m_W^2$, which is the propagator of the W boson for small momentum transfers. The
relation is
$$ \Big(\frac{g}{2\sqrt{2}}\Big)^2\frac{1}{m_W^2} = \frac{G_F}{\sqrt{2}}. \tag{121} $$

Hence, combining Equations (70), (121) and (67) and using the known values of
$G_F$ and e, we get
$$ m_W = \frac{37.4\ \text{GeV}}{\sin\theta_W}, \tag{122} $$

(123)
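The numerical coefficient in Equation (122) can be reproduced from Equations (70) and (121). The sketch below is a rough check under stated assumptions: natural Heaviside-Lorentz units with $e^2 = 4\pi\alpha$, present-day values of $\alpha$ and $G_F$ (not necessarily those used by the author), and the relation $m_Z = m_W/\cos\theta_W$ referred to in the footnote to Equation (124). The value $\sin^2\theta_W = 0.22$ is purely illustrative and is not taken from the text.

```python
import math

ALPHA = 1 / 137.036      # fine-structure constant (assumed input)
G_FERMI = 1.16637e-5     # Fermi constant in GeV^-2 (assumed input)


def mw_times_sin_theta(alpha=ALPHA, g_fermi=G_FERMI):
    """m_W sin(theta_W) in GeV.

    From g^2 / (8 m_W^2) = G_F / sqrt(2) and g sin(theta_W) = e, e^2 = 4 pi alpha:
    m_W^2 sin^2(theta_W) = pi alpha / (sqrt(2) G_F).
    """
    return math.sqrt(math.pi * alpha / (math.sqrt(2) * g_fermi))


def boson_masses(sin2_theta_w):
    """Tree-level (m_W, m_Z) in GeV, assuming m_Z = m_W / cos(theta_W)."""
    sin_w = math.sqrt(sin2_theta_w)
    cos_w = math.sqrt(1 - sin2_theta_w)
    m_w = mw_times_sin_theta() / sin_w
    return m_w, m_w / cos_w


coefficient = mw_times_sin_theta()
m_w, m_z = boson_masses(0.22)   # illustrative mixing angle only
```

With these inputs the coefficient comes out close to 37.3 GeV, in line with the 37.4 GeV of Equation (122); the masses obtained for the illustrative angle are tree-level estimates and need not coincide with the values quoted later in the text.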

The third piece in Equation (119) describes the 'neutral current' {J~2 sin 2 Ow J~.m.} interacting with the neutral vector boson Z II" This is a new weak
interaction predicted by the electroweak gauge theory which was not present in
the old 'standard model' Lagrangian of Equation (73). This leads to the current
× current form:*
$$ \mathcal{L}^{\text{Neutral}}_{\text{effective}} = \frac{G_F}{\sqrt{2}}\,J^N_\mu J^N_\mu. \tag{124} $$

Fig. 6.

* We have used $1/m_Z^2$ for the Z propagator at low momentum-transfers and replaced $m_Z^2\cos^2\theta_W$ by
$m_W^2$ (by Equation (123)).

Fig. 7. ν-scattering.

The existence of neutral-current weak interaction, which will lead to processes
such as elastic ν-scattering (Figure 7) with a strength comparable to that of the
usual charged current weak interaction responsible for β-decay (Figure 7), can be
regarded as a natural consequence of unifying weak interaction with electrodynamics. The neutral current acts something like a bridge between conventional
weak and electromagnetic phenomena. Hence, the discovery of the neutral-current
weak interaction in the neutrino reactions in 1973 and the subsequent detailed
studies which showed the properties of the neutral-current interaction to be
exactly those predicted by the SU(2) × U(1) model, helped to confirm the model.
Note that the neutral current is not of the V−A form, the relative strengths of
V and A being determined by the mixing angle $\theta_W$. Detailed analyses have shown
that all the neutral-current interactions among the leptons and quarks, so far
studied, are in agreement with the predictions of the form in Equation (124), with
(125)
In this volume devoted to the interface between astrophysics and high energy
physics, it is particularly relevant to point out the astrophysical significance of
the neutral-current interaction of the neutrinos. This interaction (Figure 7) leads
to coherent scattering of neutrinos on nuclei and, hence, to neutrino pressure.
(Without neutral currents, such coherent scattering of neutrinos is not possible.) The possible importance of this neutrino pressure for supernova explosions has
been considered in recent literature.
Let us now go back to the expressions in Equations (122, 123) for $m_W$ and $m_Z$.
Determination of the weak mixing angle $\theta_W$ in neutral current processes allows us
to determine the masses of the W and Z bosons. Using Equation (125), we get
$$ m_W \approx 82\ \text{GeV}, \tag{126} $$
$$ m_Z \approx 94\ \text{GeV}. \tag{127} $$

We thus see that the weak bosons are very massive, almost 100 times the mass of
the nucleon. This is the reason for the apparent weakness of the weak interaction
at low energies. (See Equation (121) for $G_F$.) At energies much larger than
100 GeV, the strength of the weak interaction is measured by $g^2$ and so becomes
comparable to that of the electromagnetic interaction.
A proton-antiproton collider with centre-of-mass energy of 540 GeV was
specially constructed for the discovery of the weak bosons W and Z and the
search culminated in their actual discovery in 1983 with masses predicted in
Equations (126, 127), thus providing a spectacular confirmation of the electroweak SU(2) × U(1) gauge theory.
Let us now come back to the problem of the strange quark encountered in the
last section. Introduction of the Cabibbo-rotated quark $d' = d\cos\theta_c + s\sin\theta_c$
into the charged currents $J^\pm_\mu$ will describe the decays of strange hadrons correctly,
but the problem is in the neutral current. The contribution of d' to the neutral
current $J^3_\mu$ is
$$ \bar d'\,O\,d' = (\bar d\cos\theta_c + \bar s\sin\theta_c)\,O\,(d\cos\theta_c + s\sin\theta_c), \tag{128} $$

where O is some linear combination of $\gamma_\mu$ and $\gamma_\mu\gamma_5$. The cross-terms $\bar d\,O\,s$ and
$\bar s\,O\,d$ lead to strangeness-changing neutral-current weak decays such as

with the same strength as the usual charged-current weak decays. Experimentally,
such strangeness-changing decays are not seen and, hence, the problem.
The solution of this phenomenological problem was provided by Glashow,
Iliopoulos and Maiani (GIM). They suggested that the unused orthogonal
combination $s' = -d\sin\theta_c + s\cos\theta_c$ be combined with a yet-to-be-discovered
charmed quark c to form a new SU(2) doublet

in addition to the old SU(2) doublet of quarks:

So, the neutral-current contribution from both d' and s' is
$$ \bar d'\,O\,d' + \bar s'\,O\,s' = \bar d\,O\,d + \bar s\,O\,s. \tag{129} $$

This equality can be regarded as a manifestation of the invariance of the norm of
the two-dimensional vector with components d and s under a two-dimensional
(Cabibbo) rotation:
$$ \begin{pmatrix} d' \\ s' \end{pmatrix} = \begin{pmatrix} \cos\theta_c & \sin\theta_c \\ -\sin\theta_c & \cos\theta_c \end{pmatrix}\begin{pmatrix} d \\ s \end{pmatrix}. \tag{130} $$

The important point is that in Equation (129), the strangeness-changing pieces
$\bar d\,O\,s$ and $\bar s\,O\,d$ have disappeared. This is the famous GIM mechanism.
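The cancellation behind the GIM mechanism can be seen by collecting the coefficients of the four bilinears $\bar d O d$, $\bar d O s$, $\bar s O d$ and $\bar s O s$. The sketch below assumes only the rotation of Equation (130) and the value $\sin\theta_c \approx 0.22$ quoted earlier; the function name is hypothetical.

```python
import math


def bilinear_coefficients(rotated_quarks):
    """Coefficient matrix C with C[i][j] multiplying qbar_i O q_j, q = (d, s).

    Each rotated quark is given by its (d, s) components; its contribution
    to the neutral current is (components) x (components).
    """
    return [[sum(q[i] * q[j] for q in rotated_quarks) for j in range(2)]
            for i in range(2)]


theta_c = math.asin(0.22)
c, s = math.cos(theta_c), math.sin(theta_c)

d_prime = (c, s)     # d' =  d cos(theta_c) + s sin(theta_c)
s_prime = (-s, c)    # s' = -d sin(theta_c) + s cos(theta_c)

without_charm = bilinear_coefficients([d_prime])
with_gim = bilinear_coefficients([d_prime, s_prime])
```

With d' alone, the off-diagonal (strangeness-changing) entries equal $\cos\theta_c\sin\theta_c$, roughly 0.21; once s' is included they cancel and the matrix reduces to the identity, as in Equation (129).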
But, then, where is the hypothesized charmed quark? Remarkably enough,
hadrons with certain peculiar properties which could be interpreted if they were
identified as bound states of charmed quark and charmed antiquark ($c\bar c$) (just as
π, K, φ, etc. are bound states of the form $u\bar u$, $u\bar s$, $s\bar s$, etc.) were discovered in a series
of exciting experiments in October 1974. Subsequent analysis established the
correctness of this identification and this, in turn, established the correctness of
the GIM conjecture. These $c\bar c$ bound states are called ψ. More will be said on
ψ particles in Section (24).
Apart from the four quark 'flavours' u, d, s and c, a fifth flavour b (called
'bottom' or 'beauty') was discovered in 1977-78 by a repetition of history, namely
through the observation of the bound state $b\bar b$. To complete the SU(2) doublet
structure, one more quark flavour t (called 'top' or 'truth') must exist. If it exists
then, the three quark doublets (referred to as three generations):

would be in parallel with the three generations of lepton doublets which are
already known to exist:

The τ lepton (with mass 1.78 GeV) was discovered in 1975; the existence of its
associated neutrino $\nu_\tau$ has been inferred indirectly from the decay properties of τ.

14. Renormalizability

The Lagrangian in Equation (104) describing electroweak theory is exactly
invariant under the general SU(2) × U(1) symmetry. Of course, the physical
solutions of the theory describe massive W and Z and a massless photon and,
hence, the general SU(2) × U(1) is broken. But the distinguishing feature of the
mechanism of spontaneous breaking of symmetry through the nonvanishing
vacuum expectation value of $\phi$ is that although the solutions break the
symmetry, the Lagrangian as well as the equations of motion remain invariant.
This symmetry is not merely a matter of aesthetics. It turns out that it is this
invariance under general transformations which is directly responsible for the
renormalizability of this theory.
What is renormalizability? Relativistic local quantum field theory is, in
general, afflicted with ultraviolet divergences, i.e. the higher-order loop diagrams
give divergent contributions from the ultraviolet end ($k \to \infty$) of the virtual
momenta. However, fortunately there is a class of quantum field theories in which
finite meaningful results can be obtained for physical quantities in spite of the
presence of these divergences in the intermediate steps of the calculation. This is
done by absorbing these divergences into a few parameters of the theory such as
masses and coupling constants occurring in the theory and, thus, renormalizing
these parameters. In the class of renormalizable theories, after this renormalization of the parameters, no more divergences remain; but for nonrenormalizable
theories infinitely more types of divergences remain.

Examples of renormalizable theories are:

QED:
Yukawa coupling:
Self coupling of $\phi$:

and examples of nonrenormalizable theories are:

Fermi theory: $\bar\psi\psi\,\bar\psi\psi$,
Derivative coupling:
Massive vector boson theory:

In perturbation theory, the elementary criterion of renormalizability is simply
that the degree* of divergence D of any Feynman diagram be independent of the
number of vertices or of the number of internal lines. For instance, in the case of
QED as well as Yukawa coupling, the degree of divergence is
(131)

where $F_e$ and $B_e$ are, respectively, the number of external fermion and boson lines.
This is independent of the number of vertices or the number of internal lines in the
diagram. After renormalization of a few simple processes with small values for $F_e$
and $B_e$, for which D is positive, we see that D becomes negative for the rest of the
theory, thus leading to a renormalizable theory.
On the other hand, for Fermi theory,
(132)

where V is the number of vertices in the diagram. Here D increases with the
number of interaction vertices which is the same as the order of perturbation
theory. A finite number of renormalizations is not enough and so this is
a nonrenormalizable theory.
Our interest is in the massive vector boson theory. The coupling for this theory,
$\bar\psi\gamma_\mu\psi V_\mu$, is the same as in QED. Why is this nonrenormalizable then? The reason
lies in the difference between the propagators:
massive boson:

$(g_{\mu\nu} - k_\mu k_\nu/m_V^2)/(k^2 - m_V^2)$,

(133)
(134)

For $k \to \infty$, because of the extra term involving $k_\mu k_\nu$, the massive-boson
propagator has two additional powers of momenta as compared to the photon
case. Hence, for the massive boson case, we have to add two times the number of
internal boson lines $B_i$ to the degree of divergence D in QED given by Equation

* Defined as the overall power of momenta in the numerator minus that in the denominator in the
Feynman integral.

(131) and we get
$$ D = 4 - \tfrac{3}{2}F_e - B_e + 2B_i. \tag{135} $$

Into any given Feynman diagram with specified numbers of external lines $F_e$ and
$B_e$, we can easily introduce any number of additional internal boson lines $B_i$
which will correspond to higher-order processes. Thus, the degree of divergence
again increases arbitrarily and we end up with a nonrenormalizable theory.
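The power counting can be made concrete with a short function. The sketch below takes Equation (135) in the form $D = 4 - \tfrac{3}{2}F_e - B_e + 2B_i$ and treats the QED value as the special case $B_i = 0$; that reading, and the function name, are assumptions of this illustration.

```python
from fractions import Fraction


def degree_of_divergence(f_e, b_e, b_i=0):
    """Superficial degree of divergence.

    f_e : number of external fermion lines
    b_e : number of external boson lines
    b_i : number of internal massive-boson lines whose k_mu k_nu term is kept
          (b_i = 0 reproduces the QED counting)
    """
    return 4 - Fraction(3, 2) * f_e - b_e + 2 * b_i


# QED-like counting: fixed by the external lines alone.
vertex_qed = degree_of_divergence(f_e=2, b_e=1)
self_energy_qed = degree_of_divergence(f_e=2, b_e=0)

# Massive vector boson with the k_mu k_nu term: grows with internal lines.
vertex_massive = [degree_of_divergence(2, 1, b_i) for b_i in (1, 2, 3)]
```

For fixed external lines the QED value never changes, whereas each additional internal massive-boson line raises D by two, which is the sense in which the divergence increases arbitrarily.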
General invariance comes to our rescue here. In a generally-invariant theory,
one can change the gauge. (We had an example of this while doing the Higgs
mechanism.) There exists a gauge in which the $k_\mu k_\nu$ term of the massive vector
propagator can be dropped so that the propagator becomes $g_{\mu\nu}/(k^2 - m_V^2)$ whose
high-momentum behaviour is the same as that of the photon propagator $g_{\mu\nu}/k^2$.
Thus, the theory becomes renormalizable.
If we had added explicit mass terms $\tfrac{1}{2}m_W^2 W_\mu W_\mu$ to the Lagrangian in
Equation (104), this would break the general SU(2) invariance of the Lagrangian
and we would not be able to remove the $k_\mu k_\nu$ term in the propagator by a gauge
transformation. It is only because we left the Lagrangian generally-invariant and
brought masses for the vector bosons through spontaneous symmetry breaking,
that we are able to remove the $k_\mu k_\nu$ term and achieve renormalizability*. Hence,
the importance of spontaneous symmetry breaking in the construction of the
electroweak theory.
The proof of renormalizability of non-Abelian gauge theory with spontaneous
symmetry breaking is not as simple as we have indicated; ours is only a heuristic
argument. The proof was first given in 1971 by 't Hooft. In fact, it was 't Hooft's
work which revived interest in the locally-invariant SU(2) × U(1) electroweak
model, which had been ignored by most physicists although it had been
constructed four years earlier. The subsequent experimental discovery of the
neutral current gave a further boost to the theory, as we have already discussed.
As mentioned above, Fermi's theory, which was the basis of weak interaction
physics, belongs to the class of nonrenormalizable theories, and the construction
of a renormalizable weak interaction theory had remained one of the
fundamental problems in high energy physics. Local invariance followed by its
spontaneous breaking has solved this problem.
However, there is an obstacle. The axial vector coupling of fermions, which is
a chief feature of weak interactions, creates a quantum-field-theoretical anomaly
in the higher orders of perturbation theory (see Figure 8) and destroys the
renormalizability of the theory. This subject of the axial vector anomaly, as well as
other anomalies, has become an important topic of research in modern quantum
field theory and we cannot do justice to that topic here. For our purposes, it is
sufficient to note that although the anomaly exists for leptons and quarks
separately, it turns out that the SU(2) and U(1) quantum numbers of the leptons
and quarks are so arranged that the coefficient of the anomalous term is equal
and opposite for the leptons and quarks and, hence, the total contribution to the
anomaly is zero. Hence, renormalizability of the theory is saved. Note that for this
to be valid the exact correspondence between leptons and quarks is essential; the
number of generations of the leptons and quarks has to be equal and the top
quark must exist!

Fig. 8. (Diagram labels: gauge fields; leptons or quarks.)

* It turns out that for a massive neutral vector boson coupled to a conserved current, the $k_\mu k_\nu$ term
can be dropped, even if the mass term $\tfrac{1}{2}m_V^2 V_\mu V^\mu$ is introduced explicitly (i.e. not by spontaneous
symmetry breaking). However, for weak interactions involving charged massive vector bosons, an
explicit mass term would lead to a nonrenormalizable theory.

15. Spontaneous Symmetry Breaking and Phase Transitions

There exists a similarity between the spontaneous breakdown of symmetry and
the phenomenon of phase transition. In particular, Kirzhnitz and Linde [3] in
1972 pointed out the close analogy between the Goldstone-Higgs Lagrangian of
Sections 3 and 4, with V chosen to be of type (b), and the free energy expression in the
Landau-Ginzburg phenomenological theory of phase transitions. As a consequence of this
analogy, there exists a critical temperature $T_c$, above which the
symmetry between weak and electromagnetic interactions is restored. So,
a collection of leptons and quarks with conventional weak and electromagnetic
interactions will behave entirely differently if their temperature is raised above $T_c$.
The striking physical differences are as given in Table II.
Table II.

                 T < T_c                                T > T_c
m_W              82 GeV                                 0
m_Z              94 GeV                                 0
Interactions     Weak interactions weak and             Both weak and electromagnetic
                 short-ranged; electromagnetism         interactions have the same strength
                 stronger and long-ranged               and are long-ranged


However, the critical temperature $T_c$ is of the order

$$T_c \sim \langle\phi\rangle \sim \frac{m_W}{g} \sim 500\ \mathrm{GeV} \sim 10^{16}\ \mathrm{K}. \tag{136}$$

This is certainly too hot for terrestrial physics, but not for the physics of the early
universe. In fact, phase transitions in the early universe are now a hot topic of
research where high energy physics and astrophysics come together.
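The conversion behind the last step of Equation (136) can be checked with a short sketch. It assumes only the Boltzmann constant in eV per kelvin; the helper name is hypothetical.

```python
K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV per kelvin


def gev_to_kelvin(energy_gev):
    """Temperature T (in kelvin) for which k_B * T equals the given energy."""
    return energy_gev * 1.0e9 / K_B_EV_PER_K


t_c_kelvin = gev_to_kelvin(500.0)
print(f"T_c ~ {t_c_kelvin:.1e} K")
```

An energy of 500 GeV corresponds to a few times $10^{15}$ K, which agrees with the order-of-magnitude estimate of $10^{16}$ K quoted above.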
Let us return to the analogy between spontaneous symmetry breaking and
phase transitions and consider, in particular, the phase transition of a normal metal
to the superconducting state. Here, the correspondence is very close and, in fact, the
Higgs Lagrangian of Equation (25) can be regarded as the relativistic generalization
of the Landau-Ginzburg model for the superconductor. The superconducting
state with a nonvanishing order parameter is the analogue of the broken
symmetry state with nonvanishing vacuum expectation value for the Higgs field.
It is known that magnetic fields cannot penetrate inside a superconductor for
large distances beyond the London penetration length. This is known as the
Meissner effect. An equivalent statement is that the photon has become massive
inside a superconductor (the mass being given by the inverse of the penetration
length), which agrees with our result in Section 4 that the U(1) gauge boson of the
Higgs model becomes massive as a result of symmetry breaking.
This analogy with superconductivity may throw further light on the mechanism
of spontaneous symmetry breaking, which is a crucial ingredient in our
construction of the electroweak theory. In this construction, the spontaneous
breaking of symmetry was facilitated by the introduction of the 'elementary'
Higgs scalar field $\phi$. The analogue of $\phi$ in superconductivity is the 'Cooper pair'
formed by the composite of two electrons. Can the elementary field $\phi$ of the
electroweak theory also be replaced by some composite $\bar{f}f$ (where $f$ is a fermion
field)? We do not know at present.
We may also raise here a related question. Note that the electroweak theory
described by the Lagrangians of Equations (104) or (68) contains the 'physical
Higgs boson' $\eta$, which is the remnant of $\phi$. Does this $\eta$ particle exist in nature?
Again, we do not know the answer at present. Results of searches for $\eta$ in the
ongoing experiments, as well as experiments projected for the future, may lead us
to a better understanding of the electroweak symmetry breaking.
So much for electroweak theory. We now turn to QCD.

16. Deep Inelastic Scattering, Asymptotic Freedom and Colour SU(3)

Remember the gap in Equation (73) of Section 10, namely, the unspecified 'strong
interactions among the quarks'. We now specify that these interactions are to be
described by a non-Abelian gauge theory based on SU(3), the so-called colour
group. The theory is known as quantum chromodynamics (QCD). According to
this theory, each of the quarks (u, d, ... etc.) is a triplet under colour SU(3); since
SU(3) has eight generators, there are eight colour gauge vector bosons and they
are called gluons. The QCD Lagrangian is

$$\mathcal{L} = -\tfrac{1}{4}\left(\partial_\mu G^i_\nu - \partial_\nu G^i_\mu - g f^{ijk} G^j_\mu G^k_\nu\right)^2 + \bar{q}\left\{ i\gamma^\mu\left(\partial_\mu - i g \frac{\lambda^i}{2} G^i_\mu\right) - m \right\} q, \tag{137}$$

where $q$ is a quark field (u, d, ... etc.), $G^i_\mu$ is the gluon field, $g$ is the gauge coupling
constant, $i$ goes over 1 to 8, $f^{ijk}$ are the structure constants of the SU(3) group and $\lambda^i/2$
are the SU(3) generators in the triplet representation of the quarks. The colour
index of the quark (going over 1, 2, 3) is suppressed. The QCD Lagrangian
contains the interaction vertices shown in Figure 9.

Fig. 9. Gluon-gluon vertices and the quark-gluon vertex.

Colour denotes a new degree of freedom which actually had its origin in old
quark physics, namely the conflict of the apparent total symmetry of the
three-quark wave function in the baryonic ground state with Fermi-Dirac
statistics. As a simple example, consider the baryon $\Delta^{++}(1238)$, which is
a doubly-charged spin-3/2 baryon occurring as a resonance in the $p\pi^+$ system at
a mass of 1238 MeV. It is made up of three u quarks each of electric charge 2/3, so
that the total charge is 2. The wave function of the three u quarks in the ground
state contains a spatial part which is symmetric, corresponding to zero relative
orbital angular momenta, and a spin part which is also symmetric, corresponding
to total spin 3/2. There is good phenomenological support for this assumption.
But then the total wave function of the three quarks is symmetric under
interchange of their space and spin labels, thus violating the antisymmetry
requirement of fermionic wave functions. Antisymmetry is restored by the
invention of a new quantum number, called colour, which is three-valued, and by
assigning an antisymmetric colour wave function to the three bound quarks.
Now the total wave function made up of spatial, spin and colour parts is
antisymmetric.

Why QCD? For a long time, physicists had given up field theory as a useful
approach for understanding strong interactions and taken to the S-matrix
approach. So, what caused the resurgence of field theory in strong interaction
physics and what is the reason for going for this non-Abelian gauge field theory
(QCD)?


Fig. 10. (Diagram labels: e; photon of large $q^2$.)

The reason comes from an experiment, the so-called deep inelastic scattering of
leptons on the nucleon (see Figure 10).
It was found that, as observed by a high-$q^2$ probe, the nucleon behaves as if it
were composed of free, point-like constituents (called partons by Feynman). The
lepton scatters off each parton, elastically and incoherently. The incoherent sum
of all parton cross-sections gives a very good description of the experimental
results. Thus, the complete cross-section for the electron scattering off the
nucleon can be written (schematically) as

$$\sigma_N = \sum_i \int_0^1 dx\, f_i(x)\,\sigma_i(x), \tag{138}$$

where $\sigma_i(x)$ is the electron-scattering cross-section of the $i$th parton with
fractional longitudinal momentum $x$ and $f_i(x)$ is the probability for finding the $i$th
parton with fractional longitudinal momentum $x$ inside the nucleon. Integrating
over all the fractions and summing over all the partons $i$ incoherently, we get the
electron-nucleon cross-section. It was a remarkable discovery that such a complicated
process could be described by such a simple formula. This simple
behaviour of deep inelastic scattering is also known as Bjorken scaling and,
naively, it is related to the absence of a length-scale or momentum-scale at high
energies in local quantum field theory. Similar results were found also for the
neutrino-nucleon scattering processes:

$$\nu_\mu + N \to \mu + \text{hadrons} \quad \text{(charged-current weak interaction)}$$
$$\nu_\mu + N \to \nu_\mu + \text{hadrons} \quad \text{(neutral-current weak interaction).}$$
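The incoherent parton sum, $\sigma_N = \sum_i \int_0^1 dx\, f_i(x)\,\sigma_i(x)$, can be sketched numerically. Everything specific below is a toy assumption made for illustration only: the parton distribution $6x(1-x)$, the $x$-independent parton cross-sections set equal to squared charges in arbitrary units, and the helper names.

```python
def integrate(func, lower=0.0, upper=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


def nucleon_cross_section(partons):
    """Incoherent sum over partons of the integral of f_i(x) * sigma_i(x).

    partons: list of (f_i, sigma_i) pairs, each a function of x.
    """
    return sum(integrate(lambda x, f=f, s=s: f(x) * s(x)) for f, s in partons)


# Toy inputs (assumptions, not data): every parton has the same normalised
# distribution f(x) = 6 x (1 - x), and an x-independent cross-section equal
# to its squared charge in arbitrary units.
def toy_distribution(x):
    return 6.0 * x * (1.0 - x)


charges = [2.0 / 3.0, 2.0 / 3.0, -1.0 / 3.0]
partons = [(toy_distribution, lambda x, e=e: e * e) for e in charges]
sigma_n = nucleon_cross_section(partons)
print(sigma_n)
```

Because the contributions are added as cross-sections rather than amplitudes, no interference terms between different partons appear; that is what "incoherent" means here.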

This phenomenon has a rather close resemblance to Rutherford's famous
$\alpha$-particle scattering experiments, which led to the discovery of the nucleus inside
the atom. Thomson's spread-out atomic model would lead to soft scattering (i.e.
small scattering angles) only. Experimentally, Rutherford and collaborators
found hard scattering (i.e. large scattering angles), thus showing the presence of
the point-nucleus inside the atom. In the same way, in deep inelastic
lepton-nucleon scattering, even for large $q^2$ (i.e. large scattering angle), scattering
was observed to take place, in contrast to what would be expected for a spread-out
nucleon. This led to the discovery of point-like constituents deep inside the
nucleon.
More detailed study of the experimental data revealed that these partons are in
fact quarks; they seemed to have the same spins and charges as expected for
quarks.
Attention should now be drawn to the adjective 'free'. In addition to being
point-like, the quark-partons behave as if they are free. If they are interacting, the
cross-section formula would not be so simple.
Now, the quarks are bound by tremendous attractive forces to make up the
nucleon. So, the interaction between quarks should really be superstrong. And
yet, when observed through high-$q^2$ probes, this superstrong interaction weakens
to such an extent that the quarks behave as free particles.
For quite some time this was a mystery. On the other hand, this provided an
important clue about the nature of the strong interaction itself. We can now say
that any theory of strong interactions should satisfy this property, namely, it
should tend to a free particle theory or a free field theory at high $q^2$. Is there any
such theory?
Consider nonrelativistic potential scattering, i.e. nonrelativistic particles
interacting through well-defined smooth potentials. Since the total energy can be
written as E = T + V, as the kinetic energy T increases, the potential energy
V becomes less and less important in comparison, so that for high energies the
theory does tend to a theory of free particles, for properly defined smooth
potentials.
But, of course, this is not useful for high energy physics which has to be
described by relativistic quantum mechanics. Here, particle-production dominates at high energies and potential description fails.
So, we should ask the same question in the realm of relativistic quantum field
theories. Here it is the renormalization group which provides the required technique.
By using the renormalization group, one can define a momentum-dependent
coupling constant $g(q^2)$, also called the effective coupling constant. So, what we need
is a theory in which

$$g(q^2) \to 0 \quad \text{for} \quad q^2 \to \infty. \tag{139}$$

Such a theory is called asymptotically free, i.e. the theory tends to a free field
theory for asymptotic momenta.
To cut a long story short, it was soon discovered that none of the
conventional field theories such as $\phi^4$, the Yukawa interaction $\bar\psi\psi\phi$ or QED $\bar\psi\gamma^\mu\psi A_\mu$
is asymptotically free. Of all the renormalizable quantum field theories, only
non-Abelian gauge theory was found to possess the unique distinction of being
asymptotically free. The characteristic triple gluon vertex shown in Figure 11 is
the essential ingredient that makes this theory asymptotically free.
So, asymptotically free non-Abelian gauge theory emerged as a good choice for
a theory of strong interactions. Since the colour degree of freedom with three
colours was already available, as explained at the beginning of this section, the
gauge group was taken as the colour SU(3) and QCD was born.
In the next few sections some details on the theory of asymptotic freedom will
be given.

Fig. 11. (The triple gluon vertex.)

17. The Renormalization Group Equation [4]

Consider a renormalizable field theory such as $\phi^4$ theory described by the
Lagrangian

$$\mathcal{L} = \frac{1}{2}\frac{\partial\phi}{\partial x_\mu}\frac{\partial\phi}{\partial x^\mu} - \frac{1}{2}\mu^2\phi^2 - g\phi^4, \tag{140}$$

where $\phi$ is a real scalar field. This theory is characterized by a single dimensionless
coupling constant $g$ and a single mass $\mu$.
Let $\Gamma(p_1 \ldots p_n)$ be a renormalized n-point Green's function of this theory for
n external particles of momenta $p_1 \ldots p_n$. Pictorially, $\Gamma(p_1 \ldots p_n)$ is represented by
the sum of all the Feynman diagrams of the type indicated in Figure 12.
We take the Green's function to be single-particle irreducible and external-line
truncated, i.e. diagrams of the type in Figure 13, in which a single-particle line
connects two parts of the diagram, are not included in $\Gamma(p_1 \ldots p_n)$ and, further, the
external lines are not provided with propagators.

Fig. 12.

Fig. 13.

It is possible to show that for asymptotic momenta, i.e. for $p_1 \ldots p_n \to \infty$,
$\Gamma$ satisfies the renormalization group equation

$$\left[\mu\frac{\partial}{\partial\mu} + \beta(g)\frac{\partial}{\partial g} - n\gamma(g)\right]\Gamma = 0. \tag{141}$$

(This is also the asymptotic version of the so-called Callan-Symanzik equation.)
A quick derivation of this equation goes as follows. In the asymptotic region
($p_1, p_2 \ldots p_n \gg \mu$) one might be tempted to think that all memory of the actual mass
$\mu$ would be lost and all Green's functions would be independent of $\mu$. This is
wrong. $\Gamma$ is a Green's function for renormalized fields expressed as a function of
the renormalized coupling constant. But the normalization of the field and the
value of the renormalized coupling constant are defined on the mass shell. So, the
Green's functions remember the mass shell, no matter how far we go into the
asymptotic region. Therefore, the correct statement should be that all memory of
the actual value of $\mu$ is lost, except for that which is contained in the scale of the
fields and the value of $g$. In other words, in the asymptotic region, a small change
in mass can always be compensated for by an appropriate small change in $g$ and
an appropriate rescaling of the fields (n fields for the n-point function). Equation
(141) is just the mathematical expression of this statement.
Another way of looking at the renormalization group equation is to observe
that for a renormalizable theory, once the (infinite) renormalizations of the bare
quantities render the theory finite, any further finite renormalizations do not
change the predictive content of the theory. The renormalization group equation
(141) simply expresses that fact.
For renormalizable theories, the renormalized Green's function $\Gamma$ expressed as
a function of the renormalized mass $\mu$ and renormalized coupling constant $g$ is
a finite function of $g$ and $\mu$ and, hence, the coefficient functions $\beta(g)$ and $\gamma(g)$ in the
partial differential equation (141) should also be finite functions of $g$. That they
are functions of $g$ alone follows from a dimensional argument. $\beta(g)$ is the so-called
Callan-Symanzik function and it characterizes the field theory in a very
important way, and $\gamma(g)$ is the anomalous dimension of the field operator $\phi$.
In (141), $\mu$ need not be the actual mass of the particle; more generally, it is an
arbitrary mass at which fields and coupling constants are normalized. In this
form, Equation (141) is applicable even to a massless theory.

18. Formal Derivation of the Renormalization Group Equation

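Let $g_0$ be the bare coupling constant and $\mu$ the arbitrary mass at which fields and
coupling constants are normalized. The renormalized coupling constant $g$ is
a function of these:

$$g = g(g_0, \mu). \tag{142}$$

$g$ is actually a function of the ultraviolet cut-off $\Lambda$ also, but we shall suppress this
dependence. We hold $\Lambda$ fixed and do not consider variations in $\Lambda$. The
unrenormalized Green's function $\Gamma_0$ is a function of $g_0$; expressing $g_0$ in terms of
$g$ using the inverse of Equation (142) and performing a multiplicative
renormalization, we get the renormalized Green's function $\Gamma$ (which is, in fact,
independent of $\Lambda$ in renormalizable $\phi^4$ field theory):

$$\Gamma(g, \mu) = Z^{n/2}\,\Gamma_0(g_0), \tag{143}$$

where $Z$ is the field-renormalization constant. Since the $\mu$ dependence enters only
after renormalization (we are considering either the asymptotic region or
a massless theory), $\Gamma_0$ does not depend on $\mu$. Hence,

$$\left(\frac{\partial\Gamma_0}{\partial\mu}\right)_{g_0} = 0. \tag{144}$$

Using (143) and multiplying by $\mu$, (144) becomes

$$\mu\left(\frac{\partial Z^{-n/2}}{\partial\mu}\right)_{g_0}\Gamma + \mu Z^{-n/2}\left(\frac{\partial\Gamma}{\partial\mu}\right)_{g_0} = 0. \tag{145}$$

For the renormalized Green's function $\Gamma(g(g_0, \mu), \mu)$, we convert the partial
derivative in the following way:

$$\left(\frac{\partial\Gamma}{\partial\mu}\right)_{g_0} = \left(\frac{\partial\Gamma}{\partial\mu}\right)_{g} + \left(\frac{\partial g}{\partial\mu}\right)_{g_0}\left(\frac{\partial\Gamma}{\partial g}\right)_{\mu}. \tag{146}$$

Thus, we get the desired equation:

$$\left\{\mu\left(\frac{\partial}{\partial\mu}\right)_{g} + \beta(g)\left(\frac{\partial}{\partial g}\right)_{\mu} - n\gamma(g)\right\}\Gamma = 0,$$

where we have defined

$$\beta(g) = \mu\left(\frac{\partial g}{\partial\mu}\right)_{g_0} \tag{147}$$

and

$$\gamma(g) = \frac{\mu}{2}\left(\frac{\partial \ln Z}{\partial\mu}\right)_{g_0}. \tag{148}$$

Thus, the above derivation has also yielded the definitions of $\beta(g)$ and $\gamma(g)$.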
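The intermediate algebra between (145) and the final equation can be spelled out as follows. This is a sketch that assumes, as the form of (145) suggests, that $Z$ enters through $\Gamma_0 = Z^{-n/2}\Gamma$; momenta are suppressed, and the definitions of $\beta(g)$ and $\gamma(g)$ in (147) and (148) are used in the last line.

```latex
\begin{align*}
0 &= \mu\left(\frac{\partial Z^{-n/2}}{\partial\mu}\right)_{g_0}\Gamma
   + \mu Z^{-n/2}\left(\frac{\partial\Gamma}{\partial\mu}\right)_{g_0} \\
  &= Z^{-n/2}\left[-\frac{n}{2}\,\mu\left(\frac{\partial\ln Z}{\partial\mu}\right)_{g_0}\Gamma
   + \mu\left(\frac{\partial\Gamma}{\partial\mu}\right)_{g}
   + \mu\left(\frac{\partial g}{\partial\mu}\right)_{g_0}
     \left(\frac{\partial\Gamma}{\partial g}\right)_{\mu}\right] \\
  &= Z^{-n/2}\left[\mu\left(\frac{\partial}{\partial\mu}\right)_{g}
   + \beta(g)\left(\frac{\partial}{\partial g}\right)_{\mu} - n\gamma(g)\right]\Gamma .
\end{align*}
```

Since $Z^{-n/2}$ does not vanish, the bracket acting on $\Gamma$ must be zero, which is the renormalization group equation.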


19. Solution of the Renormalization Group Equation


Let us rewrite the RG equation:

$$\left\{\mu\frac{\partial}{\partial\mu} + \beta(g)\frac{\partial}{\partial g} - n\gamma(g)\right\}\Gamma(g, \lambda p_i, \mu) = 0, \tag{149}$$

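where we have put back all the dependences into $\Gamma$. $p_i$ denotes the set of n external
momenta, all of which are multiplied by a scale factor $\lambda$, and our aim is to
determine the behaviour of $\Gamma$ for large $\lambda$.
If $d$ is the canonical (or naive) dimension* of the field (for a scalar field, $d = 1$),
then the mass-dimension of our n-point Green's function $\Gamma$ is $nd$. Hence, $\Gamma$ can be
written as a product of $\mu^{nd}$ and a function of dimensionless quantities only:

$$\Gamma(g, p_i, \mu) = \mu^{nd} f\left(g, \frac{p_i}{\mu}\right). \tag{150}$$

So,

$$\Gamma(g, \lambda p_i, \mu) = \lambda^{nd}\left(\frac{\mu}{\lambda}\right)^{nd} f\left(g, \frac{\lambda p_i}{\mu}\right) \equiv \lambda^{nd}\,\Phi\left(g, \frac{\mu}{\lambda}, p_i\right), \tag{151}$$

therefore

$$\mu\frac{\partial}{\partial\mu}\Gamma(g, \lambda p_i, \mu) = \lambda^{nd}\mu\frac{\partial\Phi}{\partial\mu} = -\lambda^{nd}\lambda\frac{\partial\Phi}{\partial\lambda}, \tag{152}$$

where we have used the fact that the dependence of $\Phi$ on $\mu$ and $\lambda$ is through $\mu/\lambda$
only. We now define the variable $t$:

$$t = \ln\lambda. \tag{153}$$

Combining (149), (152) and (153), we get

$$\left\{-\frac{\partial}{\partial t} + \beta(g)\frac{\partial}{\partial g}\right\}\Phi = n\gamma(g)\,\Phi. \tag{154}$$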

Equation (154) can be solved by making use of its similarity to hydrodynamic
equations, $t$ and $g$ playing the roles of time $t$ and position $x$, and $\beta(g)$ playing the
role of the velocity function at the point $g$. In hydrodynamics, one defines the
moving coordinate $\bar{x}$ in terms of which the partial differential equation gets
converted into a total differential equation which can then be solved. The same
trick is used here. One defines the moving coupling constant $\bar{g}(g, t)$ by

$$\frac{\partial\bar{g}}{\partial t}(g, t) = \beta(\bar{g}); \qquad \bar{g}(g, 0) = g. \tag{155}$$

* By dimension, we mean the mass-dimension, which is equal to the negative of the length-dimension
since $\hbar = c = 1$.

$\bar{g}(g, t)$ is also called the effective or momentum-dependent coupling constant
(note that $t$ contains the scaling factor for the momenta), and plays a very
important role in renormalization group analysis. In terms of $\bar{g}(g, t)$, the solution
of (154) can be directly obtained (see next section) and multiplication by $\lambda^{nd}$ then
gives $\Gamma$:

$$\Gamma(g, \lambda p_i, \mu) = \lambda^{nd}\,\Gamma(\bar{g}(g, t), p_i, \mu)\exp\left\{n\int_0^t \gamma(\bar{g}(g, t'))\,dt'\right\}. \tag{156}$$

It can be seen that the essential dependence of $\Gamma$ on $\lambda$ or $t$ has been isolated;
apart from the factor $\lambda^{nd}$, it is contained in the exponent. The main point is that
the $\Gamma$ on the right-hand side of (156) contains $p_i$ and not the unknown dependence
on $\lambda p_i$. It is true that there is still a $\lambda$ or $t$ dependence of this $\Gamma$ through $\bar{g}(g, t)$, but
this is a mild dependence as will be clear from what follows. The crucial
dependence is in the exponent.

20. Hydrodynamic Analogy

This is essentially an appendix to the last section. Consider the hydrodynamic
equation

$$\frac{\partial\rho}{\partial t}(x, t) + v(x)\frac{\partial\rho}{\partial x}(x, t) = S(x)\rho(x, t), \tag{157}$$

where

$\rho(x, t)$ = density of bacteria in a fluid moving in a pipe,
$v(x)$ = velocity of the fluid in the pipe,
$S(x)$ = some external influence (such as illumination) affecting the bacterial population.

To solve such an equation, one first defines the 'moving coordinate' $\bar{x}(x, t)$ by the
equations:

$$\frac{\partial\bar{x}}{\partial t}(x, t) = v(\bar{x}); \qquad \bar{x}(x, 0) = x. \tag{158}$$

Then, Equation (157) can be thrown into the form

$$\frac{d}{dt}\rho(\bar{x}(x, t), t) = S(\bar{x}(x, t))\,\rho(\bar{x}(x, t), t), \tag{159}$$

where the left-hand side now contains the total derivative in time. Equation (159)
can be integrated to give

$$\rho(\bar{x}(x, t), t) = \rho(x, 0)\exp\int_0^t S(\bar{x}(x, t'))\,dt'. \tag{160}$$

A similar technique can be used to get the result quoted in Equation (156).
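The moving-coordinate method can be tried numerically. The sketch below follows a single characteristic: it integrates $d\bar{x}/dt = v(\bar{x})$ from $\bar{x} = x$ at $t = 0$, accumulates the integral of $S$ along the path, and compares the resulting density with a closed-form solution. The choices $v(x) = Ax$, $S(x) = Cx$ and $\rho(x, 0) = e^{-x^2}$, the constants, and all function names are assumptions made only for this illustration.

```python
import math


def follow_characteristic(x0, t_end, v, s, rho0, steps=2000):
    """Follow one characteristic of d(rho)/dt + v d(rho)/dx = S rho.

    Returns (xbar, rho) where xbar solves d(xbar)/dt = v(xbar) with
    xbar = x0 at t = 0, and rho = rho0(x0) * exp(integral of S(xbar) dt).
    """
    h = t_end / steps
    x = x0
    exponent = 0.0
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step for d(xbar)/dt = v(xbar).
        k1 = v(x)
        k2 = v(x + 0.5 * h * k1)
        k3 = v(x + 0.5 * h * k2)
        k4 = v(x + h * k3)
        x_next = x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        # Trapezoidal rule for the integral of S along the moving coordinate.
        exponent += 0.5 * h * (s(x) + s(x_next))
        x = x_next
    return x, rho0(x0) * math.exp(exponent)


# Toy choices (assumptions): v(x) = A x, S(x) = C x, rho(x, 0) = exp(-x^2).
A, C = 0.5, 0.3


def exact_density(y, t):
    """Closed-form solution of the toy problem at position y and time t."""
    x_start = y * math.exp(-A * t)
    return math.exp(-x_start ** 2) * math.exp(C * y * (1.0 - math.exp(-A * t)) / A)


x_end, rho_end = follow_characteristic(
    1.0, 2.0, lambda x: A * x, lambda x: C * x, lambda x: math.exp(-x * x)
)
print(x_end, rho_end, exact_density(x_end, 2.0))
```

In the renormalization group application, the position along the characteristic is replaced by the moving coupling constant and the velocity by the $\beta$ function.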
21. Fixed Points and Asymptotic Freedom

From (155) and (156), it is clear that it is $\beta(g)$ which controls the asymptotic
behaviour of $\Gamma$; for, $\beta(g)$ determines $\bar{g}$, which then is used to construct the solution
in (156). Actually, it is the zeroes of $\beta(g)$, called the fixed points, which control the
asymptotic behaviour of $\Gamma$ in a crucial way. This can be seen as follows.
For illustration, consider the example shown in Figure 14a where $\beta(g)$ is
positive, but has a zero at $g = g^*$. With this form of $\beta(g)$, Equation (155) leads to
the behaviour of $\bar{g}$ shown in Figure 14b. $\bar{g}$ starts with the value $g$ at $t = 0$ (as
demanded by the boundary condition in (155)) and increases with $t$ since the
'velocity' $\partial\bar{g}/\partial t = \beta(\bar{g})$ is positive. But, as $g^*$ is approached, the velocity becomes
smaller and smaller and so $\bar{g}$ changes less and less. At $g^*$, the velocity $\beta(g^*)$ is zero
and that is the asymptotic value of $\bar{g}$:

$$\bar{g}(g, t) \xrightarrow[t \to \infty]{} g^*. \tag{161}$$

Fig. 14. (a) $\beta(g)$; (b) $\bar{g}(t)$.

As $t \to \infty$, $\lambda$ and, hence, $\lambda p_i \to \infty$. So, $g^*$ is called the ultraviolet fixed point of $\beta(g)$.
We may next consider the infrared limit $\lambda \to 0$, which corresponds to $t \to -\infty$
(see (153)). By running the above argument for negative $t$, one can convince
oneself that

$$\bar{g}(g, t) \xrightarrow[t \to -\infty]{} 0. \tag{162}$$

Hence, in Figure 14a, the origin $g = 0$ is an infrared fixed point of $\beta(g)$.
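The flow towards the two fixed points can be illustrated by integrating $d\bar{g}/dt = \beta(\bar{g})$ with $\bar{g} = g$ at $t = 0$. The $\beta$ function used below, $\beta(g) = g\,(g^* - g)$ with $g^* = 1$, is a hypothetical shape chosen only because it is positive between the origin and a single zero at $g^*$; it is not the $\beta$ function of any particular theory, and the function names are likewise assumptions.

```python
G_STAR = 1.0  # hypothetical position of the non-trivial zero of beta


def beta(g):
    """Toy beta function: positive for 0 < g < G_STAR, zero at 0 and G_STAR."""
    return g * (G_STAR - g)


def running_coupling(g, t, steps=4000):
    """Solve d(gbar)/dt = beta(gbar) with gbar = g at t = 0, using RK4."""
    h = t / steps  # a negative step integrates towards the infrared
    gbar = g
    for _ in range(steps):
        k1 = beta(gbar)
        k2 = beta(gbar + 0.5 * h * k1)
        k3 = beta(gbar + 0.5 * h * k2)
        k4 = beta(gbar + h * k3)
        gbar += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return gbar


for t in (-20.0, -5.0, 0.0, 5.0, 20.0):
    print(t, running_coupling(0.3, t))
```

For large positive $t$ the coupling settles at $g^*$ (the ultraviolet fixed point), while for large negative $t$ it falls to zero (the infrared fixed point), whatever starting value between 0 and $g^*$ is chosen.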

Fig. 15. (a) $\beta(g)$; (b) $\bar{g}(t)$.

Just for fun, we may consider a theory with a number of fixed points as shown
in Figure 15a. The corresponding behaviour of $\bar{g}$ is indicated in Figure 15b.
The fixed points of $\beta(g)$ alternate between ultraviolet and infrared. As shown in
Figure 15b, the ultraviolet ($t \to \infty$) and infrared ($t \to -\infty$) asymptotic limits of
$\bar{g}$ depend on the starting values of $g$ at $t = 0$.
Let us now go back to (156). The ultraviolet asymptotic behaviour of $\Gamma$ can
now be easily obtained by making the replacement $\bar{g} \to g^*$, where $g^*$ is an ultraviolet
fixed point:

$$\Gamma(g, \lambda p_i, \mu) \xrightarrow[\lambda \to \infty]{} \lambda^{nd}\,\Gamma(g^*, p_i, \mu)\,e^{n\gamma(g^*)t}. \tag{163}$$

So, after all this, we recover the power-behaviour in the scale parameter $\lambda$, but the
important point is that the exponent is not the naive or canonical dimension $d$,
but the dynamical dimension $d + \gamma(g^*)$ evaluated at the UV fixed point $g^*$. The
ultraviolet asymptotic behaviour of the Green's function for fields is dictated by the
anomalous dimension $\gamma$ of the field at the UV fixed point $g^*$ of the $\beta$ function.
This anomalous dimension would spoil the Bjorken scaling of deep inelastic
structure functions. But, as already noted in Section 16, experiment suggests that
Bjorken scaling is valid. What is the way out? The answer is that we need a theory
in which the origin $g = 0$ is an ultraviolet fixed point. In contrast to Figures 14
and 15, our $\beta$ function should start negatively near the origin, as shown in Figure
16. In this case, the 'velocity' is negative and $\bar{g}$ decreases to zero asymptotically in
the ultraviolet region:

$$\bar{g}(g, t) \xrightarrow[t \to \infty]{} 0. \tag{164}$$

Fig. 16. (a) $\beta(g)$; (b) $\bar{g}(t)$.

This is the 'asymptotically free' theory. The asymptotic behaviour of $\Gamma$ is now
governed by free field theory (i.e. $g = 0$). The asymptotic anomalous dimension is
zero: $\gamma(0) = 0$. Such a theory can provide a framework for understanding Bjorken
scaling and parton-structure.
As already mentioned, non-Abelian gauge theory alone possesses the unique
distinction of being asymptotically free and, hence, QCD.
22. Asymptotic Freedom of QCD

The QCD Lagrangian is given in Equation (137). Our aim is to calculate $\beta(g)$ and
$\gamma(g)$ to the lowest nontrivial order in $g$. Rather than use the formal definitions of
(147) and (148), we proceed as follows. We ignore the quark fields first and
calculate the renormalized Green's functions in perturbation theory for a few
values of $n$, say $n = 2$ and $n = 3$ (number of external gauge boson lines). For
$n = 2$, one gets*

$\Gamma^{(2)ab}_{\mu\nu}(p)$ = (sum of diagrams) + higher order diagrams   (165)

Here, $C_G$ is the quadratic Casimir operator for the adjoint representation of the
group and it is defined by

$$f^{acd} f^{bcd} = C_G\,\delta^{ab}. \tag{166}$$

Since the Green's functions are truncated ones, two inverse propagators have
been multiplied into our Green's function and so $\Gamma^{(2)}$ is actually the inverse of the
propagator. For $n = 3$,

$\Gamma^{(3)}$ = (sum of diagrams) + other diagrams   (167)

* The dotted lines in the fourth diagram represent the so-called 'ghosts' whose existence in the virtual
states of non-Abelian gauge theory was discovered by Faddeev and Popov. Another technical point is
that the Landau gauge has been chosen in the calculations.

These perturbative expressions for $\Gamma^{(2)}$ and $\Gamma^{(3)}$ are substituted into the
renormalization group equation

$$\left\{\mu\frac{\partial}{\partial\mu} + \beta(g)\frac{\partial}{\partial g} - n\gamma_G(g)\right\}\Gamma^{(n)} = 0 \qquad (n = 2, 3),$$

where $\gamma_G(g)$ is the anomalous dimension of the gauge field and the values of $\beta(g)$
and $\gamma_G(g)$ are determined from the requirement that the renormalization group
equation be satisfied. The results are the following:

(168)

Note that $C_G$ defined by (166) is positive. As already advertised, $\beta$ is negative near the
origin $g = 0$ and, hence, non-Abelian gauge theory is asymptotically free. It is the
cubic vertex characteristic of the non-Abelian gauge field that is responsible for
the negative $\beta$.
We may now include the quarks. The quark-gluon vertex is of the asymptotically
nonfree type like the electron-photon vertex in QED, and adds a positive
contribution to $\beta$. The results are

YG=

r:

Yq= 0

CG

~ Cq}(:nY + O(g4),

+ O(g4)

(169)

where $C_q$ is defined by

$$\mathrm{Tr}\left(\frac{\lambda^i}{2}\frac{\lambda^j}{2}\right) = 2C_q\,\delta_{ij}. \tag{170}$$

The anomalous dimension of the quark field $\gamma_q$ remains zero in order $g^2$.
From the expression for $\beta$ in (169), it follows that the condition for asymptotic
freedom or ultraviolet stability of the origin is
(171)

For the SU(3) group relevant for QCD,

$$C_G = 3, \qquad C_q = \frac{N_f}{4}, \tag{172}$$

where $N_f$ is the number of flavours ($N_f$ enters because the trace in (170) should be
taken over all the quark degrees of freedom including flavour). Hence, (171)
becomes
(173)

In other words, as long as the number of flavour quantum numbers is less than or
equal to 16, the quark contribution does not destroy the asymptotic freedom of
QCD. (At present, we have three generations of quarks, which implies $N_f = 6$,
and so asymptotic freedom appears to be safe.)
The situation with respect to Higgs bosons is more complicated. Once the
Higgs scalar field is added to the system, further independent coupling constants
such as the $\phi^4$ coupling constant enter the picture and the origin is generally
unstable with respect to these coupling constants. So Higgs bosons are avoided in
the QCD sector and we have the central dogma of high energy physics, namely that
SU(3)$_{\text{colour}}$ symmetry is exact. The price we have to pay for keeping asymptotic
freedom is a theory with massless gluons, which leads to terrible infrared
divergences. We shall come back to this a little later.
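Let us write (using (169) and (172))

$$\beta(g) = -\frac{b g^3}{4\pi}; \qquad b = \frac{1}{12\pi}(33 - 2N_f) \tag{174}$$

and solve the equation for the effective coupling constant

$$\frac{d\bar{g}}{dt} = -\frac{b}{4\pi}\bar{g}^3. \tag{175}$$

To solve this, it is better to rewrite it in the form

$$\frac{d}{dt}\left(\frac{1}{\bar{g}^2(t)}\right) = \frac{b}{2\pi}. \tag{176}$$

The solution is

$$\frac{1}{\bar{g}^2(t)} = \frac{1}{\bar{g}^2(0)} + \frac{b}{2\pi}t \tag{177}$$

or

$$\bar{g}^2(t) = \frac{\bar{g}^2(0)}{1 + \dfrac{b}{2\pi}\bar{g}^2(0)\,t}. \tag{178}$$
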
For deep inelastic lepton-hadron scattering, we may make the following
identification:

$$t = \ln\lambda = \frac{1}{2}\ln\frac{q^2}{q_0^2}, \tag{179}$$

where $q^2$ is the momentum transfer to the hadron and $q_0^2$ is a reference value. Let
us also define

$$\alpha_s(q^2) = \frac{\bar{g}^2(t)}{4\pi}. \tag{180}$$

Then, Equation (178) can be rewritten as

$$\alpha_s(q^2) = \frac{\alpha_s(q_0^2)}{1 + b\,\alpha_s(q_0^2)\ln\dfrac{q^2}{q_0^2}}. \tag{181}$$

Thus, the effective coupling constant goes to zero for $q^2 \to \infty$; however, the
approach to zero is rather slow, only logarithmic. Defining

$$\Lambda_c^2 = q_0^2\exp\left(-\frac{1}{b\,\alpha_s(q_0^2)}\right), \tag{182}$$

we get

$$\alpha_s(q^2) = \frac{1}{b\ln(q^2/\Lambda_c^2)}. \tag{183}$$

Since $b$ is a known constant (apart from the slight uncertainty in the number of
flavours), we see that QCD is characterized by one unknown constant $\Lambda_c$, which is
to be determined by experiment. Unfortunately, there is considerable uncertainty
in the empirical determinations of this parameter. A recent analysis gives
Ac = 150~i6gMeV.

(184)

For $\Lambda_c = 100$ MeV,

$$\alpha_s(q^2 = 1\ \mathrm{GeV}^2) \approx 0.2. \tag{185}$$

Thus, even at 1 GeV, the QCD coupling constant is fairly small, thus justifying
perturbative calculations.
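The running described by Equations (174) and (181) to (183) can be sketched in a few lines. The function names and the sample inputs ($\alpha_s = 0.2$ at $q_0^2 = 10\ \mathrm{GeV}^2$, four flavours) are assumptions chosen for illustration, not values taken from the text; only the one-loop formulas themselves come from the equations above.

```python
import math


def b_coefficient(n_flavours):
    """b of Equation (174): (33 - 2 N_f) / (12 pi)."""
    return (33 - 2 * n_flavours) / (12.0 * math.pi)


def alpha_s_running(q2, q0_2, alpha_0, n_flavours):
    """Equation (181): alpha_s(q2) from its value alpha_0 at q0_2."""
    b = b_coefficient(n_flavours)
    return alpha_0 / (1.0 + b * alpha_0 * math.log(q2 / q0_2))


def lambda_c_squared(q0_2, alpha_0, n_flavours):
    """Equation (182): the scale Lambda_c^2 equivalent to (q0_2, alpha_0)."""
    return q0_2 * math.exp(-1.0 / (b_coefficient(n_flavours) * alpha_0))


def alpha_s_from_lambda(q2, lambda2, n_flavours):
    """Equation (183); meaningful only for q2 > lambda2."""
    return 1.0 / (b_coefficient(n_flavours) * math.log(q2 / lambda2))


# Hypothetical inputs: alpha_s = 0.2 at q0^2 = 10 GeV^2 with four flavours.
lam2 = lambda_c_squared(10.0, 0.2, 4)
for q2 in (10.0, 100.0, 1000.0):
    print(q2, alpha_s_running(q2, 10.0, 0.2, 4), alpha_s_from_lambda(q2, lam2, 4))
```

The two forms agree at every $q^2$, which shows that the pair ($q_0^2$, $\alpha_s(q_0^2)$) carries the same information as the single scale $\Lambda_c$. The coefficient $b$ stays positive up to sixteen flavours and changes sign at seventeen.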

23. Infrared Problem and Colour Confinement

It is illuminating to consider the contrasting behaviour of the effective coupling
constant in QED and QCD as a function of $t = \tfrac{1}{2}\ln(q^2/q_0^2)$. This is illustrated in
Figures 17 and 18.
In QED, the $\beta$ function is positive near $e \approx 0$ and so the effective coupling
constant $e(t)$ increases with $t$ or with $q^2$. So, it is not asymptotically free. But for
$q^2 \to 0$, i.e. $t \to -\infty$, the problem is very well controlled. This means that in the
infrared region, there is no real difficulty with QED, as is well-known.

Fig. 17. (QED: effective coupling as a function of $t$.)

But in a non-Abelian gauge theory such as QCD, the behaviour is completely
reversed. The theory is asymptotically free in the ultraviolet region and, thus, is
a good candidate for a theory of strong interactions, as we have already discussed.
However, the infrared region is really catastrophic for non-Abelian gauge theory.
Hence, QCD does not really exist as a theory on the mass-shell.
It is hoped this can be turned to our advantage. The infrared catastrophe can
perhaps be used to solve another problem, namely the problem of colour
confinement. What the sketch in Figure 18 shows is that a single non-Abelian
quantum on the mass shell ($q^2 = 0$) has infinitely large colour charge,
$\bar{g}(-\infty) \to \infty$, and so will copiously emit virtual quanta. These virtual quanta may
surround and completely screen the original quantum. So, a single non-Abelian
quantum (i.e. the gluon) with $q^2 = 0$ cannot exist as a free particle outside the
hadron.
The situation inside the hadron is different; short distances correspond to high
$q^2$ for which the effective coupling goes to zero because of asymptotic freedom,
and so the quanta do behave as massless particles inside hadrons.
What is described above is the infrared mechanism for colour confinement.
This mechanism can also work for quarks, since the interaction of the quarks with
gluons is governed by the same effective coupling constant $\bar{g}(t)$.
There are many other mechanisms which have been discussed in the recent
literature for confining colour. However, in spite of much work, the dogma of
colour confinement remains an unproved hypothesis.

Fig. 18. QCD: effective coupling constant versus $t$, for $t$ running from $-\infty$ to $+\infty$.

G. Rajasekaran

24. Tests of QCD

This is still an important area of experimental and theoretical activity in high
energy physics, but we shall be very brief.
(i) Parton Model and Scaling. As already mentioned in Section 16, the
motivation for QCD came from observed scaling in deep inelastic lepton-hadron
scattering. Quantum chromodynamics, via asymptotic freedom, provides the
theoretical foundation for scaling and the parton model. So, the many early
successes achieved in the confrontation of the parton model with experimental
data can be regarded as tests of QCD.
(ii) Logarithmic Corrections. Since the approach of the QCD coupling
constant to zero for asymptotic momenta is logarithmic (see Equation (183)),
there are logarithmic corrections to the parton model and scaling. Such
corrections appear to have received experimental support. However, by their
very nature, logarithmic variations are hard to see clearly, as is evidenced by the
large uncertainties in the experimental determination of the QCD scale
parameter $\Lambda_c$ occurring in the logarithm (see Equation (184)).
(iii) Narrow Widths of $\psi$ and $\Upsilon$. As remarked in Section 13, the charmed
quark $c$ was discovered through a certain peculiar property observed for the
particle which was being interpreted as a bound state of $c$ and $\bar{c}$. This peculiar
property is the strikingly narrow decay width observed for $\psi$:

$$\Gamma_\psi \approx 60\ \mathrm{keV}, \qquad (186)$$

which is in sharp contrast to the large widths expected for strongly interacting
hadrons. For instance, the $\rho$ meson has width

$$\Gamma_\rho = 150\ \mathrm{MeV}. \qquad (187)$$

Since the mass of $\psi$, which is 3.1 GeV, is much higher than the mass of $\rho$, which is
770 MeV, the available phase space is much larger for $\psi$ decay and, hence, the
expected width of $\psi$ is several hundred MeV. This was the puzzle of the $\psi$ particle.
It was resolved by asymptotic freedom; the momentum-dependent coupling
constant of QCD evaluated at 3.1 GeV is small enough to provide an explanation
for the small width of $\psi$. This is because the decay width (or decay probability) is a
product of the coupling constant and phase space, apart from other kinematic factors.
Thus, the correct interpretation of $\psi$ and its properties not only requires
a crucial ingredient of electroweak theory, namely the existence of a new quark $c$,
but also asymptotic freedom, which is a characteristic of QCD.
The phenomenon of small width repeats itself for the $b\bar{b}$ bound states called $\Upsilon$
(upsilon), occurring at mass $\sim 10$ GeV, where the width is

$$\Gamma_\Upsilon \approx 42\ \mathrm{keV}. \qquad (188)$$

The same phenomenon may occur with a vengeance for $t\bar{t}$ bound states (called
toponium), whose mass is expected to be very high ($> 80$ GeV). At such high
energies or momenta, the strong QCD coupling constant would have become so
small that weak and electromagnetic decays may dominate over strong decays!

(iv) Jets and Gluon Radiation. At high energies, electron-positron annihilation
is known to produce hadrons in the form of two jets, and this has been understood
to be due to the production of a quark-antiquark pair which subsequently
materializes in the form of a pair of jets made up of hadrons (see Figure 19a). If
QCD is right, one must also see events with three jets, the third jet coming from
a gluon radiated away from a quark or an antiquark (see Figure 19b). Such
a three-jet phenomenon was discovered at the $e^+e^-$ collider, PETRA, at
Hamburg (with a c.m. energy $\sim 30$ GeV or higher). This is generally taken to be
the evidence for the existence of the gluon which, in turn, supports QCD. After the
advent of the $p\bar{p}$ collider at CERN (with a c.m. energy of the order of 500 GeV or
higher), jet phenomena and gluon physics have received further experimental
support.

Fig. 19. (a) Two-jet event from a quark-antiquark pair; (b) three-jet event with a radiated gluon.
However, one must keep in mind that all the above tests of QCD are indirect
and are to be contrasted with the direct tests of electroweak theory, such as the
discovery of the neutral current or the discovery of the W and Z bosons. In fact,
because of the dogma of colour confinement, QCD is doomed to indirect verification
only.
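
The logarithmic running invoked in items (ii) and (iii) above can be made concrete with a short numerical sketch. It assumes the standard one-loop expression $\alpha_s(q^2) = 12\pi / [(33 - 2n_f)\ln(q^2/\Lambda_c^2)]$, whose normalization may differ from that of Equation (183); the function name, the choice $n_f = 4$ and the sample scales are illustrative assumptions rather than values fixed by the text.

```python
import math

def alpha_s_one_loop(q_gev, lambda_gev=0.1, n_flavours=4):
    """One-loop QCD coupling at momentum scale q (in GeV).

    Assumed textbook form:
        alpha_s(q^2) = 12*pi / ((33 - 2*n_f) * ln(q^2 / Lambda^2))
    """
    if q_gev <= lambda_gev:
        raise ValueError("the perturbative formula requires q > Lambda")
    log_term = math.log(q_gev**2 / lambda_gev**2)
    return 12.0 * math.pi / ((33.0 - 2.0 * n_flavours) * log_term)

# Mass scales quoted in the text for psi, upsilon and (hypothetical) toponium.
scales_gev = [3.1, 10.0, 80.0]
couplings = [alpha_s_one_loop(q) for q in scales_gev]

for q, a in zip(scales_gev, couplings):
    print(f"q = {q:5.1f} GeV  ->  alpha_s ~ {a:.3f}")

# Sensitivity to the scale parameter: doubling Lambda changes alpha_s only mildly.
ratio = alpha_s_one_loop(10.0, lambda_gev=0.2) / alpha_s_one_loop(10.0, lambda_gev=0.1)
print(f"alpha_s(Lambda = 200 MeV) / alpha_s(Lambda = 100 MeV) at 10 GeV: {ratio:.2f}")
```

In this sketch the coupling falls steadily from the $\psi$ scale to the toponium scale, while doubling $\Lambda_c$ shifts the coupling at 10 GeV by less than 20 per cent, which is one way to see why the scale parameter is hard to pin down from logarithmic effects.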

25. The Standard Model of High Energy Physics

We have now built up all the elements of the standard model and we assemble
them here. The standard model is based on the gauge group SU(3) × SU(2) ×
U(1). While SU(3) leads to quantum chromodynamics (QCD) and describes strong
interactions among the quarks, SU(2) × U(1) leads to quantum flavour dynamics
(QFD) and describes electroweak interactions among the quarks and leptons.
The gauge bosons of QCD are the eight gluons $G^i_\mu$ ($i = 1, \ldots, 8$). The gauge bosons
of QFD are $W^a_\mu$ ($a = 1, 2, 3$) and $B_\mu$.
The colour symmetry SU(3) is supposed to be unbroken, leaving the gluons
massless, paving the way for colour confinement. The electroweak symmetry
SU(2) × U(1), on the other hand, is broken and the breaking is presumed to be
induced by the nonvanishing vacuum expectation value of the Higgs scalar field $\phi$,
which is chosen to be a doublet under SU(2). The vacuum expectation value
$\langle\phi\rangle$ is chosen to be $\sim 300$ GeV in order to obtain consistency with the observed
value of $G_F$ and Equations (63) and (121).
The particle sector comprising leptons and quarks is taken to be the three
generations of fermions

$$(\nu_e,\ e,\ u_\alpha,\ d_\alpha), \qquad (\nu_\mu,\ \mu,\ c_\alpha,\ s_\alpha), \qquad (\nu_\tau,\ \tau,\ t_\alpha,\ b_\alpha).$$

In the above, $\alpha$ denotes the colour index of the quarks. Among these fermions, the
top quark, $t$, which is the heaviest, has remained elusive, although the UA1
experiment at the $p\bar{p}$ collider has obtained some evidence for its existence around
a mass of about 40 GeV.
Since the weak interaction is helicity-dependent, it is necessary to separate the
helicities of fermions, as already explained. Counting the L and R helicities as
distinct particles, we have 15 particles for each generation. For the first
generation, they are the same as before:

$$\begin{pmatrix}\nu_e\\ e\end{pmatrix}_L,\quad e_R,\quad \begin{pmatrix}u_\alpha\\ d_\alpha\end{pmatrix}_L,\quad u_{\alpha R},\quad d_{\alpha R},$$

where the doublets and singlets under SU(2) are explicitly indicated.
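
The count of 15 states per generation can be checked with a few lines of arithmetic. The sketch below lists the first-generation multiplets with their colour multiplicity and weak isospin; the hypercharge values are an assumption, taken in the convention $Q = T_3 + Y$, and the variable names are purely illustrative.

```python
from fractions import Fraction as Fr

# (name, number of colours, weak isospin T3 of each component, hypercharge Y).
# Hypercharges are an assumption in the convention Q = T3 + Y.
FIRST_GENERATION = [
    ("q_L", 3, [Fr(1, 2), Fr(-1, 2)], Fr(1, 6)),   # (u, d) doublet
    ("u_R", 3, [Fr(0)], Fr(2, 3)),
    ("d_R", 3, [Fr(0)], Fr(-1, 3)),
    ("l_L", 1, [Fr(1, 2), Fr(-1, 2)], Fr(-1, 2)),  # (nu_e, e) doublet
    ("e_R", 1, [Fr(0)], Fr(-1)),
]

# Each colour and each isospin component counts as a separate state.
n_states = sum(colours * len(t3s) for _, colours, t3s, _ in FIRST_GENERATION)

# Electric charge of every component from Q = T3 + Y.
charges = {name: [t3 + y for t3 in t3s] for name, _, t3s, y in FIRST_GENERATION}

# Sum of the electric charges over all states of the generation.
total_charge = sum(
    colours * sum(t3 + y for t3 in t3s) for _, colours, t3s, y in FIRST_GENERATION
)

print("states per generation:", n_states)
for name, qs in charges.items():
    print(name, [str(q) for q in qs])
print("sum of electric charges:", total_charge)
```

The quark doublet contributes six states, the two quark singlets three each, the lepton doublet two and the charged-lepton singlet one.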
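Now let us write down the Lagrangian of the standard model, which is
obtained by combining our Lagrangians of QCD and QFD (Equations (137) and
(104)):

$$
\begin{aligned}
\mathcal{L} ={}& -\tfrac{1}{4}\left(\partial_\mu G^i_\nu - \partial_\nu G^i_\mu - g_3 f^{ijk} G^j_\mu G^k_\nu\right)^2 \\
& -\tfrac{1}{4}\left(\partial_\mu W^a_\nu - \partial_\nu W^a_\mu - g_2 \epsilon^{abc} W^b_\mu W^c_\nu\right)^2 - \tfrac{1}{4}\left(\partial_\mu B_\nu - \partial_\nu B_\mu\right)^2 \\
& -\sum_n \bar{q}_{nL}\gamma^\mu\left(\partial_\mu + i g_3 \tfrac{\lambda^i}{2} G^i_\mu + i g_2 \tfrac{\tau^a}{2} W^a_\mu + i \tfrac{g_1}{6} B_\mu\right) q_{nL} \\
& -\sum_n \bar{u}_{nR}\gamma^\mu\left(\partial_\mu + i g_3 \tfrac{\lambda^i}{2} G^i_\mu + i \tfrac{2}{3} g_1 B_\mu\right) u_{nR} \\
& -\sum_n \bar{d}_{nR}\gamma^\mu\left(\partial_\mu + i g_3 \tfrac{\lambda^i}{2} G^i_\mu - i \tfrac{g_1}{3} B_\mu\right) d_{nR} \\
& -\sum_n \bar{l}_{nL}\gamma^\mu\left(\partial_\mu + i g_2 \tfrac{\tau^a}{2} W^a_\mu - i \tfrac{g_1}{2} B_\mu\right) l_{nL} \\
& -\sum_n \bar{e}_{nR}\gamma^\mu\left(\partial_\mu - i g_1 B_\mu\right) e_{nR} \\
& +\left|\left(\partial_\mu + i g_2 \tfrac{\tau^a}{2} W^a_\mu + i \tfrac{g_1}{2} B_\mu\right)\phi\right|^2 - \lambda\left(\phi^\dagger\phi - \langle\phi\rangle^2\right)^2 \\
& -\sum_{m,n}\left(\Gamma^u_{mn}\,\bar{q}_{mL}\phi^c u_{nR} + \Gamma^d_{mn}\,\bar{q}_{mL}\phi\, d_{nR} + \Gamma^e_{mn}\,\bar{l}_{mL}\phi\, e_{nR} + \mathrm{h.c.}\right).
\end{aligned}
\qquad (189)
$$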

The first three terms describe the pure gauge field part of the SU(3) × SU(2) ×
U(1) non-Abelian gauge theory. We now use $g_3$, $g_2$ and $g_1$ to denote the gauge
coupling constants for these three gauge groups, respectively. In the fermionic
terms, the SU(3) colour and the SU(2) flavour indices have been suppressed and,
instead, the index $n$ is used to denote the generation number. The left-handed
SU(2) doublet quark of the $n$th generation is denoted by $q_{nL}$ and the
corresponding right-handed SU(2) singlets are denoted by $u_{nR}$ and $d_{nR}$. For the
leptons, $l_{nL}$ is the doublet while $e_{nR}$ is the singlet.
The last group of terms describes the Higgs field $\phi$ and its interactions with
itself, with the gauge bosons and with the fermions which are, respectively,
responsible for the spontaneous symmetry breaking, generation of W and
Z masses and generation of the masses for the quarks and leptons. Note that

$$\phi^c = i\tau_2\phi^* = \begin{pmatrix}\phi^{0*}\\ -\phi^{-}\end{pmatrix}. \qquad (190)$$

The masses of the quarks and leptons arise from the Yukawa couplings of
$\phi$ given in the last part of the Lagrangian in Equation (189), as briefly explained in
an earlier section. In contrast to the rest of the Lagrangian, the Yukawa coupling
constants $\Gamma^u_{mn}$, $\Gamma^d_{mn}$ and $\Gamma^e_{mn}$ mix the generations. In fact, the mass terms are
nondiagonal with respect to parity, as well as flavour quantum numbers such as
strangeness, charm etc. These mass matrices can be diagonalized*, but then the
mixing between the generations enters through the charged-current weak
interactions (mediated by the W bosons). The mixing is described by a unitary
matrix $V$ called the Cabibbo-Kobayashi-Maskawa matrix, which is a 3 × 3
generalization of the 2 × 2 Cabibbo rotation matrix already introduced earlier:

$$\begin{pmatrix}\cos\theta_c & \sin\theta_c\\ -\sin\theta_c & \cos\theta_c\end{pmatrix}. \qquad (191)$$

For the three-generation case, $V$ can be written in terms of three rotation angles
$\theta_1$, $\theta_2$ and $\theta_3$ and a CP-violating phase $\delta$. Such a CP-violating phase exists only if
the number of generations is $\geq 3$, as was first pointed out by Kobayashi and
Maskawa. Thus, CP violation also can be introduced into the standard model.
The charged-current weak interaction is therefore modified to

$$\bar{U}\gamma^\mu(1 - \gamma_5)\,V D\, W_\mu + \mathrm{h.c.}, \qquad (192)$$

where $\bar{U}$ stands for $(\bar{u}\ \bar{c}\ \bar{t})$ and $D$ stands for

$$\begin{pmatrix} d\\ s\\ b\end{pmatrix},$$

and the elements of the mixing matrix $V$ control the various flavour-changing
charged-current weak transitions among the quarks. A similar mixing matrix will
also exist in the leptonic sector if neutrinos have masses. For massless left-handed
neutrinos, suitable redefinition of the neutrino fields removes leptonic mixing (see
[5]). (If neutrinos have masses, they will mix and this will lead to neutrino
oscillations.)

* For details of this diagonalization procedure, see for instance reference [5].
The neutral-current weak interaction, on the other hand, is not modified and
remains diagonal in the generation space, because of the unitarity of the mixing
matrix. This is a generalization of the original Glashow-Iliopoulos-Maiani
mechanism which achieved the cancellation of the strangeness-changing neutral
current sector.
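
Two statements made above lend themselves to a quick check: the counting of angles and CP-violating phases in the mixing matrix, and the fact that unitarity of the mixing matrix leaves the neutral current diagonal. The sketch below uses the standard counting for an n × n unitary mixing matrix, n(n-1)/2 angles and (n-1)(n-2)/2 physical phases, and verifies $V^T V = 1$ for the two-generation Cabibbo rotation. The function names, the numerical angle and the sign convention of the off-diagonal entries are illustrative assumptions.

```python
import math

def mixing_parameters(n_generations):
    """Return (rotation angles, CP-violating phases) of an n x n quark mixing matrix."""
    n = n_generations
    return n * (n - 1) // 2, (n - 1) * (n - 2) // 2

def cabibbo_matrix(theta_c):
    """2 x 2 Cabibbo rotation (sign convention of the off-diagonal terms assumed)."""
    c, s = math.cos(theta_c), math.sin(theta_c)
    return [[c, s], [-s, c]]

def transpose_times(v):
    """Compute V^T V for a real square matrix given as nested lists."""
    n = len(v)
    return [[sum(v[k][i] * v[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

for n in (2, 3, 4):
    angles, phases = mixing_parameters(n)
    print(f"{n} generations: {angles} angle(s), {phases} CP phase(s)")

theta_c = 0.23  # illustrative Cabibbo angle in radians (assumed value)
product = transpose_times(cabibbo_matrix(theta_c))
# Vanishing off-diagonal entries mean no flavour-changing neutral current.
print("off-diagonal element of V^T V:", product[0][1])
```

For two generations the count gives one angle and no phase, and for three generations three angles and one phase, in line with the observation of Kobayashi and Maskawa quoted above.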
The diagonalization procedure finally yields expressions for the mixing matrix
V and the diagonal mass matrices in terms of the Yukawa coupling matrices
occurring in the standard model Lagrangian. So far, there is no theoretical
framework for fixing the values of the Yukawa coupling constants and, hence,
there exists no theoretical understanding of the values of the elements of the
mixing matrix or the diagonal mass matrices. They are purely empirically
determined.
As already mentioned, the standard model Lagrangian given in Equation (189)
is supposed to describe all that is known in high energy physics. That is the
achievement of two decades of work (the 60's and 70's).
Dirac, referring to his relativistic wave equation of the electron, is supposed to
have said that it describes all of chemistry and almost all of physics. In the same
vein, we are tempted to say that the standard model Lagrangian describes all of
physics except gravitation. However, note the contrast in complexity. Whereas
the Dirac equation can be written down on one single line and there is no
adjustable constant, the standard model Lagrangian occupies almost half a
page and, further, there are more than 20 constants to be fixed by experiment. The
model lacks the simplicity which is the hallmark of any truly fundamental theory.
This supplies the chief motivation for going beyond the standard model.

26. Beyond the Standard Model

Let us first spell out in more detail the standard reasons usually given for
attempting to go beyond the standard model.
(i) Too many parameters: Counting the coupling constants, the boson masses,
the various quark and lepton masses and the quark mixing parameters, the total
number of independent parameters in the standard model is about 20. They are
all empirically determined and there is no fundamental theoretical understanding
of these numbers. As we have already mentioned, this is one of the weakest points
of the standard model and the strongest motivation for considering possible next
steps.
(ii) Generation puzzle: The standard model contains no explanation for the
existence of several generations of quarks and leptons, nor any clue as to the
actual number of generations existing in nature.

(iii) Pattern within one generation: The model does not explain why quark and
lepton charges are quantized in a related way: why does integer charge come with
colour singlets and noninteger charge with colour triplets? Also, the model does
not explain the apparent quark-lepton universality: why do the quarks and
leptons possess identical SU(2) properties? Since the same pattern repeats at least
three times (for the three generations), there must be a particularly good reason
for this pattern.
(iv) Unification: Theoretical physicists have an innate urge for unification. It
is felt that, in nature, the three interactions must be unified in some manner so
that the three gauge coupling constants $g_1$, $g_2$ and $g_3$ are replaced by a single
unified coupling constant. The energy scale of this so-called grand unification
turns out to be $\sim 10^{14}$ GeV.
(v) Inclusion of gravitation: Unification of other forces with gravitation is, of
course, an important aim of physics. This becomes all the more compelling if the
other forces are already unified and that unification scale ($10^{14}$ GeV) is so close to
the gravitational scale, given by the Planck mass ($10^{19}$ GeV).
(vi) Hierarchy problem: Assuming that there is an important energy scale
beyond the standard model, such as the grand unification scale or the Planck
scale, it is difficult to understand how particles with masses corresponding to the
low energy scales of the standard model can survive the enormous self-energy
correction. Vector bosons and fermions may be protected from such corrections
by gauge symmetry or chiral symmetry, respectively. Scalars and their vacuum
expectation values (which generate the masses of the vector bosons and fermions)
are not generally protected. In the presence of the large energy scale ($> 10^{14}$ GeV),
the small scale of the standard model ($\sim 100$ GeV) cannot be maintained.
The various avenues open to high energy physicists in going beyond the
standard model are the following:
(a) Grand unification,
(b) Preons,
(c) Induced gravity,
(d) Supersymmetry and supergravity,
(e) Higher dimensional unification,
(f) Superstrings.

For a brief introduction to these ideas, see, for instance, [6].


Grand unification solves problems (iii) and (iv) mentioned above. Preons do
not solve any of the problems (nevertheless, they may turn out to be the correct
next step!). Induced Gravity may provide a revolutionary solution to problem (v);
it claims that gravity and the geometry of spacetime may be derived from the
quantum effects of matter interacting through the other forces of nature (weak,
electromagnetic, strong, etc.), quite the opposite to what Einstein strove to
achieve.
The chief virtue of supersymmetry is that it provides an elegant solution to
problem (vi). Supergravity coupled with grand unification is capable of solving
problems (iii), (iv), (v) and (vi). Higher dimensions offer a beautiful geometrical
understanding of the forces contained in the standard model; gravitation in
an 11-dimensional spacetime unifies four-dimensional gravitation with
four-dimensional SU(3) × SU(2) × U(1) forces.
Finally, superstrings in 10 dimensions offer the tantalizing hope of achieving
a finite or renormalizable theory of gravity, in which case superstring theory may
turn out to be the correct theory of quantum gravity. Current advertisements
claim that, as a bonus, superstrings may solve all the problems of high-energy
physics, (i)-(vi), mentioned at the beginning of this section.
To conclude, we must bear in mind that everything beyond the standard model
is a speculative idea. None of these ideas has an iota of experimental support at
present. In fact, many of these theories beyond the standard model have a bearing
on the super high-energy scales $10^{14}$-$10^{19}$ GeV and so their direct experimental
confrontation is not expected soon. This is very unfortunate. However, indirect
clues coming from lower-energy experiments in the immediate future may be of
great value in deciding the future course of the subject.

References
1. T. D. Lee and C. N. Yang, Phys. Rev. 128, 885 (1962);
   N. Nakamura, Prog. Theor. Phys. 33, 279 (1965);
   K. H. Tzou, Nuovo Cim. 33, 286 (1964).
2. G. Rajasekaran, Phys. Rev. 160, 1427 (1967).
3. D. A. Kirzhnits and A. D. Linde, Phys. Lett. 42B, 471 (1972).
4. S. Coleman, Lectures at the 'Ettore Majorana' School, 1971.
5. P. Langacker, Phys. Rep. 72C, 185 (1981).
6. G. Rajasekaran, in R. Ramachandran (ed.), Recent Advances in Theoretical Physics, World
   Scientific, Singapore, 1985, p. 89.

12. Introduction to Grand Unified Theories

K. C. WALI
Physics Department, Syracuse University, Syracuse, NY 13244, U.S.A.

1. Grand Unification - A Survey of Basic Ideas

1.1. Introduction

Grand unified theories (GUTs) postulate that the description of the elementary
particles and their interactions will simplify enormously at some very high
energy $E$,

$$E > M_{GUM} \quad \text{(grand unification mass)}.$$

The fundamental interactions which we distinguish and recognize as weak,
electromagnetic, and strong at low or laboratory energies, will become the
manifestations of one gauge interaction above $M_{GUM}$. This unified interaction
will operate between a set of basic constituents of all matter. In its most pristine
form, such a unified theory is described by a Lagrangian $\mathcal{L}$, where

$$\mathcal{L} = -\tfrac{1}{4}\,G_{\mu\nu}G^{\mu\nu} - \tfrac{1}{2}\,\bar{F}\,\gamma^\mu D_\mu F, \qquad (1)$$

with $F$ being a set of fermions belonging to an irreducible representation of
a grand unification group $G$. $G_{\mu\nu}$ represents the field tensor corresponding to the
gauge bosons belonging to the adjoint representation of $G$ and $D$ is the covariant
derivative such that $\mathcal{L}$ is invariant under a set of local, non-Abelian gauge
transformations on $F$ and $G_{\mu\nu}$.
The simple-looking Lagrangian (1) ideally contains the secret to everything - the
unfolding of the universe from the instant of its creation to its present
state. Unfortunately, we do not have the necessary calculational tools. In the
absence of the latter, it becomes necessary to add further terms to $\mathcal{L}$ in (1) to
make it amenable to practical calculations. It is clear that the full symmetry of
$G$ does not manifest itself in the low energy region. The symmetry of $G$ has to be
broken. This symmetry breaking could, in principle, arise from the dynamics
described by the Lagrangian (1) (dynamical symmetry breaking), but in the
absence of reliable calculational techniques, at least as a practical recourse, it
becomes necessary to introduce fundamental scalars (Higgs mesons) and add $\mathcal{L}_\phi$
to $\mathcal{L}$, where

(2)

in which the first term represents the kinetic energy of the scalar fields, including
the interactions of the scalar mesons with the gauge bosons. The second term
represents the fermion-Higgs couplings and the third term, the Higgs potential,
contains the self-interactions of the $\phi$'s. Initially, $(\mathcal{L} + \mathcal{L}_\phi)$ is constructed so that
it is gauge invariant under local gauge transformations belonging to $G$, but by
giving nonvanishing vacuum expectation values to appropriate $\phi$'s, the
symmetry of $G$ is broken in one or several stages to SU(3)$_c$ × SU(2) × U(1) and
eventually to SU(3)$_c$ × U(1)$_{EM}$. The gauge bosons other than those that belong to
the residual symmetry acquire mass at the expense of the would-be Goldstone
bosons, and their effects at low energies generally become suppressed by inverse
powers of their masses. Fermions and the surviving Higgs mesons can also
acquire masses in the process of symmetry breaking. However, the implementation
of symmetry breaking this way is the most troublesome feature of the
theory, as it introduces considerable arbitrariness. Given the fermionic content,
gauge invariance principles establish uniquely the gauge boson content and the
gauge interactions in the fundamental Lagrangian $\mathcal{L}$. This cannot be said of the
Higgs part of the Lagrangian. The choice of the Higgs representations, the form of
the Higgs potential, and the fermion-Higgs coupling parameters are not fixed by
gauge principles alone. Consequently, the particle spectrum and the particle
masses are not completely predictable.

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 237-280.
© 1989 by Kluwer Academic Publishers.
In addition, at the outset, one may raise several questions of a fundamental
nature. Are the basic fermions $F$ quarks and leptons as we know them in the low
energy region, or are they ultimately composites of even more 'elementary'
constituents called 'preons'? If the scalars are fundamental, should one regard
them in a symmetric relation with the fundamental fermions as in supersymmetric
models, or should they be regarded as composites or condensates of
the fundamental fermions as in the dynamical symmetry breaking approach? At
present, we believe there are three families or generations of fermions (quarks and
leptons) which have identical SU(3)$_c$ × SU(2) × U(1) quantum numbers. What is
the origin of this degenerate fermionic family structure?
In spite of these fundamental questions, we shall assume for the most part of
this article that quarks and leptons are the fundamental constituents and explore
some general consequences of GUT ideas in this section. Subsequently, we shall
focus our attention on a specific GUT model based on $G \equiv$ SU(5) to illustrate
some of the calculational details leading to quantitative results.

1.2. Motivation for Grand Unification

The strong motivation for grand unification arises mainly from the remarkably
successful theory of Glashow, Salam and Weinberg, which describes
electromagnetic and weak interactions (electroweak) in a 'unified' manner. This
theory, for energies $E > 100$ GeV, has as its starting point a Lagrangian of the
form (1) in which the relevant group is $G_{EW}$ = SU(2) × U(1). The left-handed
(right-handed) components of known quarks and leptons transform as doublets
(singlets) under the 'weak iso-spin' group SU(2) of $G_{EW}$; they carry a 'weak
hyper-charge' corresponding to the U(1) of $G_{EW}$. The four gauge bosons of $G_{EW}$
are also massless to begin with.
A single complex doublet of Higgs scalars is used to break the symmetries of
$G_{EW}$ (retaining the renormalizability of the theory). When the neutral component
of the Higgs doublet is given a nonvanishing vacuum expectation value, three of
the four gauge bosons and the fermions acquire mass. The remaining massless
gauge boson can be identified with the photon. The chosen mass scale at
which the symmetry breaking occurs distinguishes weak and electromagnetic
interactions at low energies. The weak interactions are described by effective
four-fermion, V-A, charged and neutral current interactions. The electromagnetic
interactions are described in the usual fashion by the surviving U(1)$_{EM}$ gauge
symmetry. Combining this with quantum chromodynamics (QCD), which
attributes strong interactions to the gauge bosons of an exact SU(3)-color
(SU(3)$_c$) symmetry group, we have the completed picture of elementary
interactions at low energies ($E < 100$ GeV):

$$SU(3)_c \times SU(2) \times U(1) \ \xrightarrow[\ 100\ \mathrm{GeV}\ ]{\ M_{EW}\ }\ SU(3)_c \times U(1)_{EM}.$$

The enormous simplicity and success of the above-described theory leads one
naturally to ask, as Pati and Salam first did: can one obtain a unified description
of quarks and leptons and the three low energy interactions? It is then
straightforward to seek a group $G$ which, in one or several stages of symmetry
breaking, leads to

$$G \to \cdots \to SU(3)_c \times SU(2) \times U(1) \to SU(3)_c \times U(1)_{EM}.$$

Such an idea, besides the aesthetic appeal of unification, will help to reduce the
number of parameters (more than 25 of them) required to describe the new
physics in the low energy region associated with the old and new generations of
quarks and leptons. The embedding of U(1)$_{EM}$ in a simple or semi-simple group
could explain the long-standing puzzle of charge quantization. Quark-lepton
unification could counter questions like: why is $|Q_p + Q_e| < 10^{-21}$?
The known low energy interactions will all have a common strength $g$ at
$E \geq M_G$. Hence, the first requisite of a grand unified theory is to explain the
differing strengths and other distinguishing features of the observed strong, weak
and electromagnetic interactions in the low energy region. The non-Abelian
nature of the theory makes the coupling strength $g$ energy dependent and, at
$E < M_G$, $g$ evolves via renormalization group equations in different ways in the
SU(3), SU(2) and U(1) sectors (depending on the spectrum in each sector), giving
the low energy values $g_3$, $g_2$, $g_1$, respectively. We can, of course, consider this in
the reverse order. Knowing $g_3$, $g_2$ and $g_1$ in the low energy region, we can study
their evolution as a function of energy. If no additions to the low energy particle
spectrum are postulated, one finds that $g_3$, $g_2$ and $g_1$ merge to a common value at
$M_G \approx 10^{14}$-$10^{15}$ GeV.
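
The upward evolution of the three couplings described above can be sketched at one loop. The code below is a rough illustration only: it assumes the one-loop standard model coefficients $b = (41/10, -19/6, -7)$, approximate present-day inputs at the Z mass, and a U(1) coupling rescaled by the SU(5) factor 5/3 (so it is not identical to $g_1$ as normalized in the standard model Lagrangian). All names and input values are assumptions introduced here.

```python
import math

M_Z = 91.19  # GeV, reference scale (assumed input)

# Inverse couplings 1/alpha_i at M_Z; U(1) is in the SU(5) normalisation.
# Approximate present-day values, assumed here for illustration only.
ALPHA_INV_MZ = {"g1": 59.0, "g2": 29.6, "g3": 8.5}

# One-loop coefficients for three generations and one Higgs doublet.
B_COEFF = {"g1": 41.0 / 10.0, "g2": -19.0 / 6.0, "g3": -7.0}

def alpha_inv(name, mu_gev):
    """One-loop running: 1/alpha(mu) = 1/alpha(M_Z) - (b / 2 pi) ln(mu / M_Z)."""
    t = math.log(mu_gev / M_Z)
    return ALPHA_INV_MZ[name] - B_COEFF[name] / (2.0 * math.pi) * t

def crossing_scale(a, b):
    """Scale (GeV) at which couplings a and b become equal at one loop."""
    t = 2.0 * math.pi * (ALPHA_INV_MZ[a] - ALPHA_INV_MZ[b]) / (B_COEFF[a] - B_COEFF[b])
    return M_Z * math.exp(t)

for pair in (("g1", "g2"), ("g1", "g3"), ("g2", "g3")):
    mu = crossing_scale(*pair)
    print(f"{pair[0]} and {pair[1]} meet near 10^{math.log10(mu):.1f} GeV")
```

With these inputs the three pairwise crossings do not coincide exactly, but they fall within a few orders of magnitude of one another and bracket the range quoted in the text.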
This mass scale is relevant from another consideration. The quark-lepton
unification implies in general that quarks and leptons be put in one multiplet. The
unified NEW (nuclear-electro-weak) interactions will then cause quark-lepton
transitions leading to baryon and lepton number violations. The most dramatic
consequence of this is the possibility of observing proton decay! Viewed as an
effective four-fermion process, we can associate a mass scale with proton decay,
a mass scale which makes the decay rate consistent with the known experimental
bounds, but still large enough to be observable. In the simplest possible model,
this mass scale turns out to be $10^{15}$ GeV, roughly the same scale at which the low
energy coupling constants merge together. This remarkable coincidence is one of
the reasons why grand unification appears to be more than just a wild
speculation.
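
The link between the mass scale and the proton lifetime can be illustrated with a purely dimensional estimate. Treating proton decay as an effective four-fermion process suggests a rate of order $\alpha^2 m_p^5 / M^4$; the sketch below converts this into years. The unified coupling $\alpha = 1/40$, the omission of all hadronic matrix-element factors and the function name are assumptions made here for illustration, so the output is an order-of-magnitude guide at best.

```python
HBAR_GEV_S = 6.582e-25      # hbar in GeV * s
SECONDS_PER_YEAR = 3.156e7
M_PROTON_GEV = 0.938

def proton_lifetime_years(mass_scale_gev, alpha_gut=1.0 / 40.0):
    """Dimensional estimate tau ~ M^4 / (alpha^2 * m_p^5), returned in years."""
    rate_gev = alpha_gut**2 * M_PROTON_GEV**5 / mass_scale_gev**4
    return HBAR_GEV_S / rate_gev / SECONDS_PER_YEAR

for scale in (1e14, 1e15):
    print(f"M = {scale:.0e} GeV  ->  tau_p ~ {proton_lifetime_years(scale):.1e} years")
```

Because the lifetime grows as the fourth power of $M$, a factor of ten in the mass scale changes the estimate by four orders of magnitude, which is why the lifetime is such a sensitive probe of the unification scale.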

1.3. General Consequences of Grand Unification Ideas

From the previous subsection, it should be clear that the most natural and
important consequence of grand unification ideas is the possible violation of the
known low energy conservation laws such as the baryon number and lepton
number. Another consequence of grand unification is the existence of exotic
species of particles at very high energies. The surviving heavy Higgs scalars are one
such species, the non-Abelian magnetic monopole is another. Since U(1)$_{EM}$ is
embedded in a simple or semi-simple group $G$, electric charge is quantized. The
breakdown of $G$ then inevitably produces magnetic monopoles. All these are
general consequences of grand unification which are true in any model. They are
of relevance to both terrestrial and astrophysical measurements, some of which
we summarize in Table I.
Since baryon and lepton number violations are such important consequences
of grand unification, we shall consider a few details regarding them, concerning
ourselves with some model independent questions such as the selection rules,
mass scales and the underlying mechanisms. Since the interactions responsible
for these violations have a mass scale $M \gg M_{EW} \sim 10^2$ GeV, it is convenient to write
the transition matrix elements for $\Delta B \neq 0$, $\Delta L \neq 0$ processes in terms of an
effective Lagrangian composed of four-fermion-type operators involving quarks,
leptons and Higgs fields. Because of the mass scales involved, it is also convenient
to use the transformation properties of these fields under SU(3)$_c$ × SU(2) × U(1)
and construct Lorentz and gauge invariant products of the fields making up local
operators of different dimensions. The breaking of SU(2) × U(1) → U(1)$_{EM}$ can
afterwards be treated as perturbative corrections. Thus, one can write an effective
Lagrangian $\mathcal{L}_{\mathrm{eff}}$,

$$\mathcal{L}_{\mathrm{eff}} = \frac{c}{M^{d-4}}\, O_d, \qquad (3)$$

where $c$ is a dimensionless constant which will be related in specific models to the
gauge or Higgs couplings via group theory and renormalization group equations.
$M$ is the mass scale relevant to the transition. $O_d$ is a polynomial in quark, lepton
and Higgs fields of dimension $d$. Each $O_d$ has characteristic selection rules on
$\Delta B$, $\Delta L$ and represents a specific transition. Thus, we can classify the $O_d$'s by $d$, $\Delta B$,
$\Delta L$. A few examples of such operators are shown in Table II.

Table I. General consequences and their significance to terrestrial and astrophysical
measurements

Consequences of           Significance:                        Significance:
grand unification         Terrestrial                          Astrophysical
------------------------  -----------------------------------  -----------------------------------
Baryon-number violation   Proton decay,                        Baryon asymmetry of the universe
                          $N$-$\bar{N}$ oscillations,
                          $H$-$\bar{H}$ transitions

Lepton-number violation   Neutrinos with mass,                 Missing mass in galaxies, clusters;
                          Majorana neutrinos,                  formation of galaxies
                          neutrino oscillations

Monopoles                 Existence of monopoles,              Monopole problem
                          monopole-induced proton decay

Phase transitions         $n_B/n_\gamma$                       Inflationary universe, solution of
                                                               horizon and flatness problems
Table II. Examples of operators for $\Delta B \neq 0$ processes

Dim. d   Sel. rule                           Fields in $O_d$          Observable process                      Mass scale
-------  ----------------------------------  -----------------------  --------------------------------------  -------------------------
6        $\Delta B = \Delta L = -1$          $qqql$                   $p \to e^+\pi^0$, $n \to e^+\pi^-$      $10^{14}$-$10^{15}$ GeV
         ($\Delta S = 0, 1$)
7        $\Delta B = -\Delta L = -1$         $qqq\,l^c\phi$           $n \to e^- K^+$                         $10^{10}$-$10^{11}$ GeV
         ($\Delta S = 1$)
11       $\Delta B = \Delta L/3 = -1$        $qqq\,lll\,\phi\phi$     $p \to e^+\bar{\nu}\bar{\nu}$           $10^{4}$-$10^{5}$ GeV
9        $\Delta B = -2$                     $qqqqqq$                 $n \leftrightarrow \bar{n}$             $10^{5}$-$10^{6}$ GeV

$d = 6$: lowest dimensional operator for $\Delta B \neq 0$.

It is worth noting at this stage that to calculate real physical processes (proton
lifetime, $N$-$\bar{N}$ transition time) from $\mathcal{L}_{\mathrm{eff}}$ in (3), we need to make further dynamical
assumptions and approximations. The characteristic mass scales given in Table
II can, at best, be described as estimates. First of all, we note that for a fixed value
of the mass scale, if we have several $O_d$'s which contribute to a process, the
operator with the smallest dimension dominates, since it suffers the least
suppression by inverse powers of the mass. Conversely, if $O_d$'s of different
dimensions contribute comparably to a process, the associated mass scales will be
different, $M$ decreasing as the dimension increases. The values of the mass scales
given in Table II take these facts into consideration. They are also constrained by
existing limits on $\Delta B \neq 0$ processes. Further, one assumes, prompted by the
desire to have testable predictions, that although unobserved so far, these
processes will be observable in the present or next generation of experiments. In
other words, the mass scales given in Table II are only lower bounds and not rigid
constraints on the model. In addition, there are, in practice, other customary
approximations concerning the hadronic wave functions, questions regarding the
relative significance or insignificance of specific diagrams and different methods
of calculation. For these reasons, predictions of the theory carry some
uncertainty in quantitative details. For example, there is an uncertainty and
variation among calculations by different groups of about two orders of
magnitude for the proton lifetime. The current theoretical and experimental
values of $\tau_p$ and $\tau_{n\bar{n}}$ are

                    Experiment                              Theory
$\tau_p$            $7 \times 10^{30}$ years                $10^{29}$ to $10^{33}$ years
                    (Kolar gold field)
$\tau_{n\bar{n}}$   $> 3 \times 10^{7}$ seconds             $10^{7}$ seconds to anything
                    (from stability of nuclear matter)
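
The remark that the mass scale $M$ decreases as the operator dimension increases can be illustrated by dimensional analysis. For an operator of dimension $d$ suppressed by $1/M^{d-4}$, a rate of order $c^2 m^{2d-7}/M^{2(d-4)}$ follows, with $m$ a hadronic scale. The sketch below inverts this for a fixed lifetime; the choices $m = 1$ GeV, $c = 1$, the reference lifetime of $10^{31}$ years and the function name are all assumptions introduced here, and no hadronic matrix elements are included.

```python
HBAR_GEV_S = 6.582e-25      # hbar in GeV * s
SECONDS_PER_YEAR = 3.156e7

def mass_scale_for_lifetime(dimension, lifetime_years=1e31, c=1.0, hadron_scale_gev=1.0):
    """Mass scale M (GeV) giving the stated lifetime for an operator of dimension d.

    Dimensional estimate: rate ~ c^2 * m^(2d - 7) / M^(2(d - 4)), with m ~ 1 GeV.
    """
    if dimension <= 4:
        raise ValueError("only operators with d > 4 are suppressed by powers of M")
    lifetime_inverse_gev = lifetime_years * SECONDS_PER_YEAR / HBAR_GEV_S
    power = 2 * (dimension - 4)
    return hadron_scale_gev * (c**2 * lifetime_inverse_gev * hadron_scale_gev) ** (1.0 / power)

scales = {d: mass_scale_for_lifetime(d) for d in (6, 7, 9, 11)}
for d, m_scale in scales.items():
    print(f"d = {d:2d}:  M ~ {m_scale:.1e} GeV")
```

The steep fall of $M$ with $d$ in this estimate reflects the fact that each extra unit of dimension costs one more inverse power of the mass scale in the amplitude.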

Similar considerations hold in purely leptonic processes with $\Delta L \neq 0$. On the
experimental side, there are some positive indications about neutrino masses and
neutrino flavor oscillations. Other processes with $\Delta L \neq 0$ which will play an
important role in the future are the $\mu \to e\gamma$ transition and neutrinoless double
$\beta$-decay.

1.4. Astrophysics and Grand Unification

The important consequences of grand unification we have discussed so far are
the low energy relics of the spontaneously broken grand unification symmetry.
To obtain a full display of the NEW interactions implied by grand unification, we
need energies $> 10^{15}$ GeV, which are beyond present-day or future terrestrial
accelerators. Astrophysics and early cosmology are the natural arena for testing
grand unified theories. In the standard Friedmann-Robertson-Walker model of
the Universe, we start with a hot dense phase with temperatures exceeding
$10^{16}$ GeV in the first $10^{-35}$ sec after the big bang. In its very early stages, our
Universe was like a giant accelerator, the high temperatures and densities being
maintained for a relatively longer time than would be possible in terrestrial
accelerators. Under those conditions, there should have been copious production
of all the particles that we know and those we do not know - all the particles
including the superheavy ones that grand unified theories predict. Forces of
grand unification were in full operation. One can trace the effects of these forces
through the subsequent adiabatic cooling of the universe down to the present
epoch and compare with contemporary astrophysical measurements.
The most exciting possibility in this direction is the resolution of an old puzzle:
why is there very little antimatter in the Universe, in particular why is the net
baryon density $n_B \sim 10^{-8} n_\gamma$, where $n_\gamma$ is the number density of photons? One
could postulate this baryon asymmetry as an initial condition for the Universe,
but it would be more satisfying if we could start with an initial state of zero
asymmetry and produce the observed $n_B/n_\gamma$ in the course of the evolution of the
Universe. As noted long ago by Sakharov, this requires (a) $\Delta B \neq 0$ processes, (b)
CP-violation, and (c) a nonequilibrium state of the Universe when the $\Delta B \neq 0$
processes are being frozen out. The last requirement can be obtained in an
expanding Universe. In a grand unified version of the expanding Universe, we can
thus find a natural explanation for baryon asymmetry. In most grand unified
theories, we can produce, by a judicious choice of parameters, the number $\sim 10^{-8}$
for $n_B/n_\gamma$, but only just. We can by no means claim that currently available grand
unified models provide a quantitatively perfect solution of the problem.
The interaction with astrophysics provides several constraints on theories and
parameters in particle physics. Table III summarizes some of these constraints.

1.5. Theoretical Problems

Unification leads to theories with great predictive potential, since the number of
arbitrary parameters is reduced. But as we have noted earlier, this advantage is
partially sabotaged by the Higgs sector of the theory. There is arbitrariness in the
choice of the Higgs representation and the form and parameters of the Higgs
potential. Besides, we may have problems of gauge hierarchy and fine tuning.
(This is a strictly technical problem of perturbative computation but,
nevertheless, quite worrisome.) There have been several attempts at a more
satisfactory description of symmetry breaking. The dynamical breakdown of
symmetry is one of them, the use of extra symmetries like supersymmetry is
another. Studies of dynamical symmetry breaking are patterned after the

Table III. Examples of astrophysical constraints

Astrophysical measurement | Mechanism | Constraints on particle theory
Energy loss from stars | Bremsstrahlung and photoproduction of light particles like axion, Majoron, ...; rate $\lesssim$ rate of loss due to photons | Vacuum exp. values of Higgs fields; number of neutrino flavors $N_\nu \lesssim 760$ (red giants), $\lesssim 23$ (carbon burning stars), $\lesssim 50$ (neutron stars)
Deceleration parameter, expansion rate | Mass density of the universe $\rho \lesssim \rho_c$; mass density due to stable and long lived particles $\lesssim \rho_c$ | Scale invariant densities of grand unified monopoles, background sea of neutrinos, vacuum walls, spontaneous supersymmetry breaking scale, gravitino masses
He abundance | Particle species at T = 1 MeV, expansion rate, freeze out temp., n/p ratio | Light neutrino flavors $\lesssim 4$
Deuterium abundance | Decay of light Higgs particles, axions affects entropy per baryon | Masses of light Higgs particles, axions
Estimate of dark matter | Dark matter due to neutrinos? Axions? | Mass of neutrinos
Baryon asymmetry | Baryon-number violation, C, CP violation, departure from equilibrium necessary | Such interactions exist in grand unified theories. Gives constraints on parameters of the theory.

breakdown of chiral symmetries in QCD. If we ignore the current algebra masses
of the quarks, QCD has $SU(N_F)_L \times SU(N_F)_R \times U(1)_{L+R}$ chiral symmetry. The
strong binding forces of QCD produce $q\bar q$ condensates which break this chiral
symmetry to $SU(N_F)_{L+R} \times U(1)_{L+R}$. This can be thought of as effective Higgs
scalars being formed as $q\bar q$ bound states, which then acquire nonzero vacuum
values. Since the weak group $SU(2)_L \times U(1)$ is a nondiagonal subgroup of the
chiral group, this process gives a dynamical breakdown of $SU(2)_L \times U(1)$,
resulting in W-bosons with a mass $\sim 30$ MeV instead of the desired $\sim 100$ GeV.
In technicolor models, therefore, one repeats this scenario at TeV energies so that
the W-boson masses come out right. For this purpose, one needs extra fermions
(technifermions) to provide the effective scalars. One also needs massive
'extended technicolor' (ETC) gauge bosons to provide masses to ordinary quarks
and leptons. We may thus need scalars again to make the ETC bosons massive.
This perhaps can also be done dynamically, but one ends up with a baroque
theory of very little predictive power. The ETC bosons also lead to
flavor-changing neutral currents (FCNC) and this has made technicolor models
unpopular. Nevertheless, realistic models are possible where FCNC are
suppressed to acceptable levels. Technicolor, it should be noted, provides
a simple solution to the fine tuning problem, although there are still
unsatisfactory points when one gets into details.
Supersymmetry provides another resolution of the problem. In technicolor
models, there are no quadratic divergences and, hence, there are no radiative
corrections to W-boson masses which are of the grand unification scale in
magnitude. The SUSY answer to hierarchy is to cancel the unpleasant quadratic
divergences by contributions from the supersymmetric partners. An example is
the mixing of $M_G$ and $M_W$ by gauge boson loops, which is cancelled by the
gaugino loop contribution. Thus, the vastly different values of $M_G$ and $M_W$
can be naturally preserved in perturbation theory. Since we need to break
supersymmetry (based on phenomenological considerations), there will be some
mixing of the two scales, but this can be controlled by the scale of SUSY breaking.
There is a second level of the hierarchy problem. Beyond the question of
keeping $M_G$ and $M_W$ separate once they are chosen at the tree level, one might
ask: how can we, in an intrinsic sense, understand why $M_G \gg M_W$? SUSY models
might provide an answer through the inverted hierarchy scheme of Witten [5].
Some of the other facts about SUSY models are: (a) N = 4 SUSY Yang-Mills
theory is finite to all orders in perturbation theory (except gauge-dependent
parameters); (b) in supergravity models one may be able to predict $M_W$ from
$M_{Planck}$ and $M_G$; (c) N = 8 supergravity may be finite up to 7-loop order.
These provide powerful incentives to make realistic SUSY models. On the
experimental side, the most significant fact about SUSY models is the incredibly
rich particle spectrum. If SUSY is broken using an extra U(1) symmetry, the
scalar partners of quarks and leptons should have masses < 40 GeV. For
radiative mass generation mechanisms with no extra U(1), these masses should be
$\sim M_W$. The next generation of accelerators should thus provide a definitive
answer on SUSY models.

1.6. Concluding Remarks

Grand unified theories are an important advance in the quest for a unified
description of elementary particle interactions. Beyond the appeal of unification,
they have also raised the possibility of interesting and new phenomena, e.g.
proton decay, $N\bar N$ oscillations, neutrino masses. Ideas for understanding old
puzzles, like the baryon asymmetry of the Universe, have also been generated.


While our experience with successful gauge theories, like electrodynamics and
the standard electro-weak theory, gives confidence in the use of gauge principles
as a unique basis for constructing grand unified theories, the choice of gauge group,
representations, symmetry breaking mechanism, etc. has a great deal of
flexibility. And the very idea that the unification (if it occurs at all) must occur at
very high energies, unreachable by terrestrial accelerators, makes it very difficult
to provide detailed tests on the basis of which grand unification models can be
discriminated. Proton decay experiments currently in progress - if they
establish proton decay conclusively - may come closest to providing
strong support for the underlying ideas of unification. Precise data on the various
branching ratios, observation of $N\bar N$ oscillations, and rare decay processes will
certainly help to make further progress.
In view of the high energies of the order of $10^{15}$ GeV or greater needed to
directly test the grand unification ideas, one may look to astrophysics and our
ideas of the very early Universe. Astrophysical measurements do provide
constraints on models of grand unification, but these involve matching numbers
which depend on cosmological and astrophysical models with the predictions of
models of unification. Astrophysics can only give us guidelines. We need
terrestrial experiments to sort out the various models. The SU(5) model, because
of its minimality, has been analyzed to the fullest extent. If it is not the right model,
it can be ruled out most easily because of the least ambiguity in its predictions.
There are also new theoretical directions towards a fundamental theory of
elementary particle interactions. The proliferation of the number of degrees of
freedom that we should consider, viz. color, flavor, technicolor, etc., and the
perplexing mass spectrum of quarks and leptons, have prompted investigations
of composite models of quarks, leptons and gauge bosons. However, most of
these 'preon' models do not have any verifiable predictions. The question of
compositeness for quarks and leptons can be discussed in a serious way only after
high energy regimes become experimentally accessible.
We have discussed supersymmetric grand unification only briefly.
Supersymmetry is, in some sense, the maximal symmetry of the S-matrix and, as
such, would be a nice starting point for a theory. Furthermore, as we have
mentioned, this may be the way to incorporate several desirable features like
hierarchy and finiteness in our theories. The full potential of SUSY models of
grand unification and their embedding in supergravity theories is only now being
revealed by current investigations.
Kaluza-Klein-type theories are another direction of investigation. One starts
with dynamics in a higher-dimensional (> 4) spacetime and comes down to
ordinary spacetime by 'dimensional-reduction', i.e. the extra dimensions are
frozen out by subsidiary conditions. Ordinary 'internal' degrees of freedom
correspond to the 'frozen' degrees of freedom which are the relics of spacetime
symmetries associated with these extra dimensions. However, general
considerations seem to indicate that it is not possible to obtain chiral fermions in
four-dimensional space in this way. Currently, it is believed, most fervently in


some quarters, that the best way to implement such ideas is through superstring
theories. In the latter theories, elementary particles are one-dimensional objects
in 10- or 26-dimensional spacetime. An effective field theory resulting from such
a starting point is free of anomalies for only two internal symmetry groups, namely,
SO(32) and $E_8 \times E_8$. The resulting renormalizable field theory includes gravity,
has supersymmetry and is believed to be capable of predicting the number of
generations. We have to wait and see whether this highly ambitious approach
leads to the ultimate solution we are seeking in our understanding of elementary
particles and their interactions.

2. Grand Unified Theory Based on G = SU(5) (Minimal SU(5))

In this section, we shall study a grand unified theory based on the gauge group
SU(5). Generalization to any SU(n) should be apparent. By decomposing SU(5)
with respect to $SU(3)_c \times SU(2)_L \times U(1)_w$, we see how the particle content and
various interactions appear in the theory. In this exact SU(5) limit, all the
interactions are governed by one gauge coupling strength. In addition to the
known strong (QCD) and electroweak ($SU(2) \times U(1)$) interactions, new
interactions appear which are due to the additional gauge bosons (lepto-quarks)
contained in SU(5) and, as their name implies, they can effect transitions between
quarks and leptons. They are responsible for proton decay.

2.1. Background; the Standard Model

Let us first recall the salient features of the standard model of 'low energy' strong
and electroweak interactions.
(1) It is a non-Abelian gauge theory based on the semi-simple Lie algebraic
structure H,

$H = SU(3)_c \times SU(2)_L \times U(1)_w$,

where SU(3)-color, $SU(3)_c$, is an exact symmetry describing the strong
interactions. The electroweak interactions are governed by the $SU(2)_L \times U(1)_w$
non-Abelian gauge theory, which is also exact at energies $E > M_W \approx 100$ GeV but,
for $E < M_W$, the theory breaks spontaneously to the usual quantum
electrodynamics of $U(1)_{EM}$ and the V-A theory of weak interactions.
(2) A family or generation of quarks and leptons consists of 15 helicity objects.
Quarks (leptons) transform according to the 3 (1) representation of $SU(3)_c$. Both
quarks and leptons transform as doublets or singlets under left-handed $SU(2)_L$
and their electric charge Q obeys the relation

$Q = T_3 + Y$,

where $T_3$ refers to the weak isospin $SU(2)_L$ and $Y$ to the weak hypercharge $U(1)_w$.

At present, we think that there are three families of quarks and leptons. These,
along with their quantum numbers, are shown in Table IV.

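Table IV. The three families of quarks and leptons. The subscript i denotes the color index
and L the left-handedness of the particles. The superscript c denotes the charge-conjugate field.

$SU(3)_c$ | SU(2) | U(1) | First family | Second family | Third family
$3$   | 2 | $\tfrac16$  | $(u_i, d_i)_L$ | $(c_i, s_i)_L$ | $(t_i, b_i)_L$
$3^*$ | 1 | $-\tfrac23$ | $(u^c_i)_L$ | $(c^c_i)_L$ | $(t^c_i)_L$
$3^*$ | 1 | $\tfrac13$  | $(d^c_i)_L$ | $(s^c_i)_L$ | $(b^c_i)_L$
$1$   | 2 | $-\tfrac12$ | $(\nu_e, e^-)_L$ | $(\nu_\mu, \mu^-)_L$ | $(\nu_\tau, \tau^-)_L$
$1$   | 1 | $1$         | $(e^+)_L$ | $(\mu^+)_L$ | $(\tau^+)_L$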

In gauge theories with spontaneous symmetry breaking, it is convenient to use
two-component Weyl spinors, instead of the conventional four-component Dirac
spinors, to describe the fermions, since they are all massless at the beginning and
acquire mass only through spontaneous symmetry breaking. Further, it is also
more convenient to use only the left- or right-handed components. Here, we shall
use only left-handed fields, denoting each right-handed component by the
left-handed component of the corresponding charge-conjugate field. (For details
see the Appendix.)
(3) The quantum numbers satisfy certain 'anomaly constraints' which are
necessary in order that the theory be renormalizable. In the case of the standard
model these constraints are

(a) $\mathrm{Tr}'\,Y = 0$,
(b) $\mathrm{Tr}''\,Y = 0$,
(c) $\mathrm{Tr}\,Y^3 = 0$,

where the single prime denotes the sum over all the colored states, the double
prime denotes the sum over the weak iso-spin states and, in (c), we have to sum
over both color and iso-spin states. Thus,

(a) $3\cdot\tfrac16\cdot 2 - 3\cdot\tfrac23 + 3\cdot\tfrac13$ (contribution from quarks) $= 2(-\tfrac12) + 1$ (contribution from leptons) $= 0$,
(b) $2\cdot\tfrac16 - \tfrac23 + \tfrac13$ (quarks) $= 2(-\tfrac12) + 1$ (leptons) $= 0$,
(c) $6(\tfrac16)^3 + 3(-\tfrac23)^3 + 3(\tfrac13)^3$ (quarks) $+\; 2(-\tfrac12)^3 + 1$ (leptons) $= 0$.

Note that each family is anomaly-free and, while the first two relations (a) and
(b) are satisfied for quarks and leptons separately, (c) is satisfied iff both quarks
and leptons are included in the family. Thus, anomaly constraints establish a link
between quarks and leptons. The existence of the $(\nu_\tau, \tau)$ leptons demands the
existence of the pair (t, b) quarks.
(4) There are 12 gauge bosons in the theory: eight gluons and four vector
bosons, out of which three acquire masses and become the intermediate vector
bosons $W^\pm$, $Z^0$. The fourth remains massless to be the photon.
(5) The spontaneous symmetry breaking is implemented by introducing
a complex Higgs doublet.
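The anomaly sums in item (3) can be checked with exact rational arithmetic. The following is only a minimal sketch: the field list and hypercharges are read off Table IV under the convention $Q = T_3 + Y$, and the names `FAMILY` and `trace_y` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction as F

# One family of left-handed fields (assumed from Table IV):
# (label, color multiplicity, weak-isospin multiplicity, hypercharge Y)
FAMILY = [
    ("(u, d)_L",  3, 2, F(1, 6)),
    ("(u^c)_L",   3, 1, F(-2, 3)),
    ("(d^c)_L",   3, 1, F(1, 3)),
    ("(nu, e)_L", 1, 2, F(-1, 2)),
    ("(e^+)_L",   1, 1, F(1)),
]

def trace_y(fields, power=1):
    """Sum of Y**power over every color and isospin state of the given fields."""
    return sum(nc * nw * y ** power for _, nc, nw, y in fields)

quarks = [f for f in FAMILY if f[1] == 3]
leptons = [f for f in FAMILY if f[1] == 1]

# Linear sums vanish for quarks and leptons separately.
sum_y_quarks = trace_y(quarks)
sum_y_leptons = trace_y(leptons)

# The cubic sum vanishes only when quarks and leptons are combined.
sum_y3_quarks = trace_y(quarks, 3)
sum_y3_leptons = trace_y(leptons, 3)
sum_y3_family = trace_y(FAMILY, 3)

print("Tr Y   (quarks, leptons):", sum_y_quarks, sum_y_leptons)
print("Tr Y^3 (quarks, leptons, family):", sum_y3_quarks, sum_y3_leptons, sum_y3_family)
```

The cubic sums for quarks and leptons are separately nonzero and cancel only in the full family, which is the link between quarks and leptons noted in the text.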
The standard model based on the above considerations has proved immensely
successful in the low energy regime accessible so far. Why, then, go further and
'unify'? The reasons are manifold:
(1) The theory, as it stands, contains too many arbitrary parameters. There are
three distinct coupling parameters corresponding to the three gauge groups
$SU(3)_c$, $SU(2)_L$, and $U(1)_w$. The weak mixing angle $\theta_w$ is arbitrary in the theory,
and so are the quark and lepton masses, the generalized Cabibbo-type mixing
angles, the CP-violating phase, the parameters in the Higgs potential and so on.
(2) The phenomenon of 'superfluous replication', that is, the occurrence of
families with identical electroweak quantum numbers, is unaccounted for. It
suggests additional symmetries and/or selection rules.
(3) The quark-lepton correspondence (i.e. the identical classification of quarks
and leptons under SU(2)L and U(1)w) and their linking in the anomaly
cancellation constraints, suggest the possibility of further closer correspondence
between quarks and leptons. They may be parts of a single representation of
a higher symmetry group.
(4) In non-Abelian gauge theories, the coupling parameters are functions of
energy or mass scale. The strong coupling parameter of QCD decreases with
energy, whereas the SU(2) coupling parameter increases with energy. It is only
natural, then, to imagine that, at some high energy, these coupling parameters
attain the same strength and one may have a single parameter instead of three
describing the various interactions.
(5) In the standard model, there is no explanation for the charge quantization
(i.e., only the discrete values for electrical charges, $\tfrac23$, $\tfrac13$, 1, 0 in units of e).
Also, there is no explanation for such relations as Q(electron) = -Q(proton),

$Q(\nu_e) - Q(e^-) = Q(u) - Q(d)$, $Q(e^-) = 3Q(d)$, etc.

These and other shortcomings of the standard model lead us to consider
further the higher symmetries which unify all three interactions as a first step.
A complete unification, of course, would be required to include gravity. We,
therefore, seek a group G,

$G \supset H \equiv SU(3)_c \times SU(2)_L \times U(1)_w$,    (1)

preferably simple so that, if it describes all the three interactions in a unified
manner, there will be a single coupling parameter g corresponding to the gauge
group G. Further, since H has rank 4, the rank of G is prescribed to be at least 4. If
G = SU(N), the smallest group with rank 4 is SU(5).

2.2. SU(5): Group Theory Aspects

The fundamental representation of SU(5) consists of 5 x 5 unitary, unimodular
matrices U,

$U^\dagger U = 1$, $\det U = 1$.    (2)

A general SU(5) transformation can be written in the form


(3)

where $T_j$, called the generators, are 5 x 5, Hermitian, traceless matrices and $\theta_j$ are
real, continuously variable parameters. The rank of SU(5) is 4 and, hence, there
are four generators which can be simultaneously diagonalized.
A convenient basis to represent the generators is provided by the non-Hermitian
matrices $T^a_b$, a, b = 1, ..., 5,

$(T^a_b)^c_d = \delta^a_d\delta^c_b - \tfrac15\delta^a_b\delta^c_d$.    (4)

These matrices satisfy the commutation relations

$[T^a_b, T^c_d] = \delta^a_d T^c_b - \delta^c_b T^a_d$    (5)

and they correspond to the usual raising and lowering operators. The Hermitian
generators are the linear combinations $\tfrac12(T^a_b + T^b_a)$, $\tfrac{i}{2}(T^a_b - T^b_a)$.
It is evident from (4) that the nondiagonal matrices have only one nonvanishing
element, $(T^a_b)_{ba} = 1$, and zero everywhere else. There are five diagonal generators
$T^a_a$ (no sum),

$(T^a_a)_{aa} = \tfrac45$, $(T^a_a)_{bb} = -\tfrac15$, $a \neq b$,    (6)

but $\sum_{a=1}^{5} T^a_a = 0$ and, hence, there are only four independent, diagonal
generators, as there should be, since the rank of the group is 4. Therefore, a more
convenient choice for the independent diagonal operators is the following:

$\tfrac12\,\mathrm{diag}(1, -1, 0, 0, 0)$, $\tfrac{1}{2\sqrt3}\,\mathrm{diag}(1, 1, -2, 0, 0)$, $\tfrac{1}{2\sqrt6}\,\mathrm{diag}(1, 1, 1, -3, 0)$, $\tfrac{1}{2\sqrt{10}}\,\mathrm{diag}(1, 1, 1, 1, -4)$.    (7)

The normalization of the generators is arbitrary, but we shall stick to the
convention

$\mathrm{Tr}(T_i T_j) = \tfrac12\delta_{ij}$.    (8)
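The group-theory statements of this subsection can be verified numerically. The sketch below is illustrative only and rests on two assumptions: that the non-Hermitian generator $T^a_b$ has a single 1 in row b, column a, with $\delta_{ab}/5$ removed from the diagonal, and that the normalization convention is $\mathrm{Tr}(T_iT_j) = \tfrac12\delta_{ij}$, which is the value the diagonal generators of (7) satisfy. All function names are hypothetical.

```python
from fractions import Fraction as F

N = 5  # SU(5)

def T(a, b):
    """Assumed form of T^a_b (indices 1..5): one unit entry in row b, column a,
    with delta_ab/5 subtracted on the diagonal so that the matrix is traceless."""
    m = [[F(0)] * N for _ in range(N)]
    m[b - 1][a - 1] += 1
    if a == b:
        for k in range(N):
            m[k][k] -= F(1, N)
    return m

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(N)] for i in range(N)]

def scale(c, x):
    return [[c * x[i][j] for j in range(N)] for i in range(N)]

def trace(x):
    return sum(x[i][i] for i in range(N))

def commutation_relation_holds(a, b, c, d):
    """Check [T^a_b, T^c_d] == delta^a_d T^c_b - delta^c_b T^a_d."""
    lhs = sub(matmul(T(a, b), T(c, d)), matmul(T(c, d), T(a, b)))
    rhs = sub(scale(int(a == d), T(c, b)), scale(int(c == b), T(a, d)))
    return lhs == rhs

idx = range(1, N + 1)
algebra_ok = all(commutation_relation_holds(a, b, c, d)
                 for a in idx for b in idx for c in idx for d in idx)
all_traceless = all(trace(T(a, b)) == 0 for a in idx for b in idx)

# The five diagonal generators sum to zero, leaving four independent ones.
diag_sum = [[sum(T(a, a)[i][j] for a in idx) for j in range(N)] for i in range(N)]
diag_sum_is_zero = all(v == 0 for row in diag_sum for v in row)

# Diagonal generators of (7), stored as (d, n) meaning diag(d) / (2*sqrt(n)),
# so that Tr(T^2) = sum(d_k^2) / (4 n) can be computed without square roots.
CARTAN = [((1, -1, 0, 0, 0), 1), ((1, 1, -2, 0, 0), 3),
          ((1, 1, 1, -3, 0), 6), ((1, 1, 1, 1, -4), 10)]
norms = [F(sum(v * v for v in d), 4 * n) for d, n in CARTAN]
orthogonal = all(sum(p * q for p, q in zip(CARTAN[i][0], CARTAN[j][0])) == 0
                 for i in range(4) for j in range(i))

print(algebra_ok, all_traceless, diag_sum_is_zero, norms, orthogonal)
```

Exact fractions are used instead of floating point so that every equality is tested without rounding tolerances.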

Now our task is to label the generators so that the embedding of SU(3) x
SU(2) x U(1) is self-evident. To this end, let us denote the SU(5) generators as

$T^a_b$, a, b = 1, ..., 5.    (9)

Then define the subset of the first eight generators as SU(3) generators. Assign
them the Greek labels $\alpha, \beta, \ldots$, which are assumed to take only the values 1, 2, 3, so
that
(10)

generate the SU(3) subgroup. Similarly, by prescribing r, s to take only the values
4, 5, define the generators
(11)

which will generate the SU(2) group. The U(1) group can be identified as
(12)
These make twelve of the 24 SU(5) generators. The remaining twelve SU(5)
generators are $T^\alpha_r$ and $T^r_\alpha$ ($T^1_4, T^1_5, \ldots$; $T^4_1, T^5_1, \ldots$).
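The two $SU(3)_c$ diagonal operators $T_3$, $T_8$ can be identified as

$T_3 = \tfrac12\,\mathrm{diag}(1, -1, 0, 0, 0) = \tfrac12(T^1_1 - T^2_2)$, $T_8 = \tfrac{1}{2\sqrt3}\,\mathrm{diag}(1, 1, -2, 0, 0) = \tfrac{1}{2\sqrt3}(T^1_1 + T^2_2 - 2T^3_3)$.    (13)

The SU(2) diagonal operator, which we shall designate as $T^3_L$ (third
component of weak iso-spin), is

$T^3_L = \tfrac12\,\mathrm{diag}(0, 0, 0, 1, -1) = \tfrac12(T^4_4 - T^5_5)$.    (14)

The U(1) generator, which we shall identify with $T^0$ given by (12) and
normalized according to (8), is

$T^0 = \sqrt{\tfrac35}\,\mathrm{diag}(-\tfrac13, -\tfrac13, -\tfrac13, \tfrac12, \tfrac12)$.    (15)

Let the electric charge operator Q be defined so that, in the fundamental
representation, it is

$Q = \mathrm{diag}(-\tfrac13, -\tfrac13, -\tfrac13, 1, 0)$,    (16)

so that

$Q = T^3_L + \sqrt{\tfrac53}\,T^0$.    (17)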

2.3. Particles; Their Representations

2.3.1. Fermions. Suppose we define the fundamental representation 5 of SU(5),
decomposed with respect to the subgroup SU(3) x SU(2) x U(1), as follows:

$5 = (3, 1, -\tfrac13) + (1, 2, \tfrac12)$.    (18)

(That the SU(3)-triplet must be an SU(2)-singlet and the SU(3)-singlet must be an
SU(2)-doublet is obvious. The U(1) quantum numbers follow from (15) and the
charge operator defined in (16) and (17).)
Then, the 5*-representation contains

$5^* = (3^*, 1, \tfrac13) + (1, 2, -\tfrac12)$,    (19)

leading to charges of 1/3 for the SU(3)-triplet and charges 0 and -1 for the
SU(3)-singlet, SU(2)-doublet components.

The quarks $(d^c_i)_L$ and the leptons $(\nu_e, e^-)_L$ of the 15-plet family have exactly the
quantum numbers contained in 5*. Hence, we can fit them into the 5*
representation.
The anti-symmetrized direct product 5 x 5 gives rise to a 10 representation of
SU(5). Its decomposition with respect to SU(3) x SU(2) x U(1) yields

$10 = (3^*, 1, -\tfrac23) + (3, 2, \tfrac16) + (1, 1, 1)$,    (20)

with charges $Q = -\tfrac23$, $Q = (\tfrac23, -\tfrac13)$, and $Q = 1$, respectively.

The three $(u^c_i)_L$ quarks, the six $(u_i, d_i)_L$ quarks, and the lepton $(e^+)_L$, altogether
ten, exhaust the quantum numbers of the 10. Thus, the 15-plet family can be made
to belong to a reducible combination of 5* and 10 representations of SU(5),

$15 = 5^* + 10$.    (21)

That each family belongs to a reducible and not to an irreducible representation is
considered one of the unsatisfactory features of SU(5). But it is not a very
serious objection at this stage. On the contrary, what is worth noting is that the
combination is anomaly free and, hence, it can lead to a renormalizable gauge
theory.
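In tensor notation, let $\psi^a$ be fields that transform according to the 5 representation,
and $\chi_a = (\psi^a)^\dagger$ transform according to 5*. Higher representations transform as
the tensor products of the above representations. Denoting the anti-symmetric
tensor belonging to 10 by $\psi_L^{ab}$, we have for the 15-plet family of fermions,

$5^*: \quad \chi_L = \begin{pmatrix} d^c_1 \\ d^c_2 \\ d^c_3 \\ e^- \\ -\nu_e \end{pmatrix}_L$,    (22)

$10: \quad \psi_L = \frac{1}{\sqrt2}\begin{pmatrix} 0 & u^c_3 & -u^c_2 & -u^1 & -d^1 \\ -u^c_3 & 0 & u^c_1 & -u^2 & -d^2 \\ u^c_2 & -u^c_1 & 0 & -u^3 & -d^3 \\ u^1 & u^2 & u^3 & 0 & -e^+ \\ d^1 & d^2 & d^3 & e^+ & 0 \end{pmatrix}_L$.    (23)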

The 10 representation is written in the form of an anti-symmetric matrix with
the factor $1/\sqrt2$ introduced for convenience of normalization (Langacker). It may
be worthwhile at this stage to see clearly how the identifications of various
particles in (23) arise:

$5: \quad \psi^a = \begin{pmatrix} \psi^\alpha \\ \psi^r \end{pmatrix}$, $\alpha = 1, 2, 3$; $r = 4, 5$.

Then, when we take the tensor product 5 x 5 and antisymmetrize, we obtain

$10: \quad \psi^{ab} = (\psi^{\alpha\beta}, \psi^{\alpha r}, \psi^{rs})$.    (24)

From the definition of the charge operator (16) in the fundamental representation,
we see that $\psi^{\alpha\beta}$ has charge -2/3, and $\psi^{\alpha 4}$, $\psi^{\alpha 5}$ have charges +2/3 and -1/3,
respectively. Hence, the identifications $\psi^{12} \propto u^c_3$, $\psi^{\alpha 4} \propto u^\alpha$, $\psi^{\alpha 5} \propto d^\alpha$, etc.
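The charge assignments just described follow from additivity of the charge over tensor indices. As a hedged illustration, the sketch below assumes the fundamental-representation charge operator $Q = \mathrm{diag}(-\tfrac13, -\tfrac13, -\tfrac13, 1, 0)$ and computes the charges carried by the 5* and by the antisymmetric components $\psi^{ab}$ (a < b) of the 10; the variable names are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

# Assumed charge eigenvalues on the fundamental 5 (indices 1..3 color, 4..5 weak).
Q5 = [F(-1, 3), F(-1, 3), F(-1, 3), F(1), F(0)]

# The 5* carries the opposite charges: d^c (x3), e^-, nu_e.
Q5BAR = [-q for q in Q5]

# Each component psi^{ab} of the antisymmetric 10 carries charge Q_a + Q_b.
Q10 = {(a + 1, b + 1): Q5[a] + Q5[b] for a, b in combinations(range(5), 2)}

color_pairs = sorted({Q10[p] for p in Q10 if p[1] <= 3})           # psi^{alpha beta}
up_like = sorted({Q10[p] for p in Q10 if p[0] <= 3 and p[1] == 4})  # psi^{alpha 4}
down_like = sorted({Q10[p] for p in Q10 if p[0] <= 3 and p[1] == 5})  # psi^{alpha 5}
singlet = Q10[(4, 5)]                                               # psi^{45}

# Sum of Q and of Q^3 over the 15 states of 5* + 10.
all_charges = Q5BAR + list(Q10.values())
total_charge = sum(all_charges)
total_cubic = sum(q ** 3 for q in all_charges)

print(color_pairs, up_like, down_like, singlet, total_charge, total_cubic)
```

The cubic sum vanishes only for the combination 5* + 10, not for either representation alone, which is consistent with the remark that the combination is anomaly free.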
2.3.2. Gauge Bosons. In a non-Abelian gauge theory, the gauge bosons belong
to the adjoint representation, which in our case is the 24 contained in the direct
product of 5 and 5*,

$5 \times 5^* = 24 + 1$.    (25)

Decomposing 24 with respect to SU(3) x SU(2) x U(1), we obtain

$24 = (8, 1, 0) + (1, 3, 0) + (1, 1, 0) + (3, 2^*, -\tfrac56) + (3^*, 2, \tfrac56)$.    (26)

The (8, 1, 0) components belong to the adjoint representation of SU(3). The QCD
gluons can be identified with them. Likewise the (1, 3, 0) and (1, 1, 0) components
belong to the adjoint representation of SU(2) x U(1) and they will ultimately be
the intermediate vector bosons ($W^+$, $W^-$, $Z^0$) and the photon. The remaining
twelve gauge bosons $(3, 2^*, -\tfrac56)$ and $(3^*, 2, \tfrac56)$ are the new bosons which can cause
transitions between color triplet and color singlet fermions, that is, between
quarks and leptons. They are therefore called lepto-quarks. Note that their
electric charges are fractions, (4/3, 1/3), which they have to be in order to cause
a transition between a quark and a lepton.
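The charges of the 24 gauge bosons can be tabulated in the same spirit. This sketch assumes that a gauge field component with upper index a (transforming like the 5) and lower index b (transforming like the 5*) carries charge $Q_a - Q_b$, with $Q = \mathrm{diag}(-\tfrac13, -\tfrac13, -\tfrac13, 1, 0)$ on the fundamental representation; the names are hypothetical.

```python
from collections import Counter
from fractions import Fraction as F

# Assumed charge eigenvalues on the fundamental 5.
Q5 = [F(-1, 3), F(-1, 3), F(-1, 3), F(1), F(0)]
COLOR = (1, 2, 3)
WEAK = (4, 5)

def boson_charge(a, b):
    """Charge of the component with upper index a and lower index b."""
    return Q5[a - 1] - Q5[b - 1]

gluon_charges = {boson_charge(a, b) for a in COLOR for b in COLOR}
electroweak_charges = sorted({boson_charge(r, s) for r in WEAK for s in WEAK})

# Lepto-quarks mix a color index with a weak index (either order).
leptoquarks = [(a, r) for a in COLOR for r in WEAK] + [(r, a) for a in COLOR for r in WEAK]
leptoquark_counts = Counter(abs(boson_charge(a, b)) for a, b in leptoquarks)

print(gluon_charges, electroweak_charges, dict(leptoquark_counts))
```

Twelve off-diagonal color-weak components appear, six with |Q| = 4/3 and six with |Q| = 1/3, matching the count of lepto-quarks in the decomposition of the 24.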
Let the corresponding fields, the non-Abelian gauge vector potentials, be
denoted by $A^a$ in the Hermitian basis and by $A^a_b$ in the non-Hermitian basis. (For
simplicity of notation, we are suppressing the spacetime index.) The index a (b) will
transform like the 5 (5*) representation. Define the matrix A to be

$\frac{1}{\sqrt2}A = \sum_{a=1}^{24} T_a A^a = \sum_{a=1}^{24} \frac{\lambda_a}{2}A^a$,    (27)

where $T_a$ are the Hermitian generators defined earlier. The $\lambda$'s are the SU(5)
generators; they correspond to Pauli matrices for SU(2) or Gell-Mann matrices
for SU(3). Explicitly writing out the relations between the Hermitian and

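non-Hermitian forms of the generators, we can identify:

$SU(3)_c$ Generators

$T_1 = \tfrac12(T^1_2 + T^2_1)$, $T_2 = \tfrac{i}{2}(T^1_2 - T^2_1)$,
$T_4 = \tfrac12(T^1_3 + T^3_1)$, $T_5 = \tfrac{i}{2}(T^1_3 - T^3_1)$,
$T_6 = \tfrac12(T^2_3 + T^3_2)$, $T_7 = \tfrac{i}{2}(T^2_3 - T^3_2)$,
$T_3 = \tfrac12\,\mathrm{diag}(1, -1, 0, 0, 0) = \tfrac12(T^1_1 - T^2_2)$,
$T_8 = \tfrac{1}{2\sqrt3}\,\mathrm{diag}(1, 1, -2, 0, 0) = \tfrac{1}{2\sqrt3}(T^1_1 + T^2_2 - 2T^3_3)$.    (28)

SU(2) Generators

$T_9 = \tfrac12(T^4_5 + T^5_4)$, $T_{10} = \tfrac{i}{2}(T^4_5 - T^5_4)$,
$T_{11} = \tfrac12\,\mathrm{diag}(0, 0, 0, 1, -1) = \tfrac12(T^4_4 - T^5_5)$.    (29)

U(1) Generator

$T_{12} = \sqrt{\tfrac35}\,\mathrm{diag}(-\tfrac13, -\tfrac13, -\tfrac13, \tfrac12, \tfrac12) = \sqrt{\tfrac35}\,(-\tfrac13 T^\alpha_\alpha + \tfrac12 T^r_r)$.    (30)

Lepto-quark Generators

$T_{13} = \tfrac12(T^1_4 + T^4_1)$, $T_{14} = \tfrac{i}{2}(T^1_4 - T^4_1)$, $T_{15} = \tfrac12(T^2_4 + T^4_2)$,
$T_{16} = \tfrac{i}{2}(T^2_4 - T^4_2)$, $T_{17} = \tfrac12(T^3_4 + T^4_3)$, $T_{18} = \tfrac{i}{2}(T^3_4 - T^4_3)$,
$T_{19} = \tfrac12(T^1_5 + T^5_1)$, $T_{20} = \tfrac{i}{2}(T^1_5 - T^5_1)$, $T_{21} = \tfrac12(T^2_5 + T^5_2)$,
$T_{22} = \tfrac{i}{2}(T^2_5 - T^5_2)$, $T_{23} = \tfrac12(T^3_5 + T^5_3)$, $T_{24} = \tfrac{i}{2}(T^3_5 - T^5_3)$.    (31)

Substituting (28)-(31) in (26), we can write
(32)
where for $a \neq b$,

$A^2_1 = (A^1_2)^* = \frac{A^1 + iA^2}{\sqrt2}$, etc.,    (33)

and for a = b,

$A^4_4 = \frac{1}{\sqrt2}\left(A^{11} + \sqrt{\tfrac35}\,A^{12}\right)$, $A^5_5 = \frac{1}{\sqrt2}\left(-A^{11} + \sqrt{\tfrac35}\,A^{12}\right)$.    (34)

Note that $A^1_1 + A^2_2 + \cdots + A^5_5 = 0$.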


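$A^a_b$ thus defines the gauge field matrix, which is Hermitian and traceless. To
separate and identify the different interactions, let us denote them as follows:

$[A^\alpha_\beta] = [G^\alpha_\beta]$, $\alpha, \beta = 1, 2, 3$: $SU(3)_c$ gluons,
$[A^r_s] = [W, B]$, $r, s = 4, 5$: SU(2) x U(1) electroweak vector bosons,
$[A^\alpha_r, A^r_\alpha] = [L]$: lepto-quarks.

Then, the matrix A will have the structure

$A = \begin{pmatrix} G & L^\dagger \\ L & W \end{pmatrix}$,    (35)

where, written out explicitly,

Gluons: $G = \begin{pmatrix} G^1_1 - \frac{2B}{\sqrt{30}} & G^1_2 & G^1_3 \\ G^2_1 & G^2_2 - \frac{2B}{\sqrt{30}} & G^2_3 \\ G^3_1 & G^3_2 & G^3_3 - \frac{2B}{\sqrt{30}} \end{pmatrix}$,    (36)

Electroweak vector bosons: $W = \begin{pmatrix} \frac{W^3}{\sqrt2} + \frac{3B}{\sqrt{30}} & W^+ \\ W^- & -\frac{W^3}{\sqrt2} + \frac{3B}{\sqrt{30}} \end{pmatrix}$, $W^\pm = \frac{W^1 \mp iW^2}{\sqrt2}$,    (37)

Lepto-quarks: $L = \begin{pmatrix} X_1 & X_2 & X_3 \\ Y_1 & Y_2 & Y_3 \end{pmatrix}$.    (38)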

Note that for states belonging to 5 x 5*, the charge operator Q is given by

$Q = Q_5 + Q_{5^*}$,

where $Q_5$ acts on the upper index and is given by (16). $Q_{5^*}$ acts on the lower index
and is the negative of $Q_5$. Thus,

$QA^a_b = (Q^a_5 + Q^b_{5^*})A^a_b$

and it is straightforward to verify that the gluons have zero electric charge, the
electroweak bosons have charges $\pm 1$, 0, and the lepto-quarks have charges $\pm 4/3$,
$\pm 1/3$.
2.3.3. The desired scalar fields, the Higgs scalars, can be constructed along
similar lines. We shall defer their discussion to the next section.

2.4. Covariant Derivatives

In order to construct a Lagrangian invariant under a set of non-Abelian local
gauge transformations, it is necessary to generalize the notion of the ordinary
derivative $\partial_\mu$ to that of the covariant derivative $D_\mu$ in strict analogy with
Abelian electrodynamics. Recall that in electrodynamics (U(1) gauge theory),

$\partial_\mu \to D_\mu = \partial_\mu - igA_\mu(x)$,

where $A_\mu(x)$ is the vector potential. In the non-Abelian case

$\partial_\mu \to D_\mu = \partial_\mu \mathbb{1} - \frac{ig}{\sqrt2}A_\mu$,    (39)

leading to the following covariant derivatives of the fermion fields in the case of
SU(5), with gauge coupling $g_5$:
For the field $\psi^a$ transforming like the fundamental representation 5,

$(D_\mu\psi)^a = \partial_\mu\psi^a - \frac{ig_5}{\sqrt2}(A_\mu)^a_b\,\psi^b$.    (40)

For $\chi_a$ transforming like 5*,

$(D_\mu\chi)_a = \partial_\mu\chi_a + \frac{ig_5}{\sqrt2}(A_\mu)^b_a\,\chi_b$.    (41)

For $\psi^{ab}$ transforming like 10,

$(D_\mu\psi)^{ab} = \partial_\mu\psi^{ab} - \frac{ig_5}{\sqrt2}(A_\mu)^a_c\,\psi^{cb} - \frac{ig_5}{\sqrt2}(A_\mu)^b_c\,\psi^{ac}$.    (42)

Finally, the required field tensor $F_{\mu\nu}$ to represent the kinetic energy (KE) of the
gauge fields is

$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - \frac{ig_5}{\sqrt2}[A_\mu, A_\nu]$,    (43)

which can be easily shown to be gauge covariant.


2.5. SU(5) Gauge Invariant Lagrangian; Interactions

We can write the desired Lagrangian in the form

$\mathcal{L} = \mathcal{L}_G + \mathcal{L}_F + \mathcal{L}_H + \mathcal{L}_{FH}$,    (44)

where

$\mathcal{L}_G$ = KE term of the gauge fields $= -\tfrac14\,\mathrm{Tr}\,F_{\mu\nu}F^{\mu\nu}$, with $F^{\mu\nu} = g^{\mu\rho}g^{\nu\lambda}F_{\rho\lambda}$,

$\mathcal{L}_F = i(\bar\chi_L)^a(\slashed{D}\chi_L)_a + i(\bar\psi_L)_{ab}(\slashed{D}\psi_L)^{ab}$, with $\slashed{D} = \gamma^\mu D_\mu$.    (45)

Note that these 'kinetic' energy terms include gauge interactions of the fermions
along with the usual kinetic energy term $\bar\chi\slashed{\partial}\chi$. The $\mathcal{L}_H$ and $\mathcal{L}_{FH}$ terms in (44)
involve the Higgs fields. We shall consider them in the next section.
From (45), using (41) and (42), we can obtain the gauge interaction Lagrangian
$\mathcal{L}_I$,

$\mathcal{L}_I = -\frac{g_5}{\sqrt2}(\bar\chi_L)^a(\slashed{A})^b_a\,\chi_{Lb} + \sqrt2\,g_5(\bar\psi_L)_{ab}(\slashed{A})^a_c\,\psi_L^{cb}$.    (46)

It is straightforward to expand (46) and regroup the various interactions
(Langacker (1981), Eq. (3.39), p. 263). Note that a single gauge coupling governs
the various interactions. This is in the limit of exact SU(5).

2.6. Appendix

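The SU(5)-gauge invariant Lagrangian (44) does not contain any mass terms.
Both the fermions and the gauge bosons acquire masses through the spontaneous
symmetry breaking mechanism. In such situations, it is convenient to introduce
two-component Weyl spinors to represent the fermions instead of the
conventional four-component spinors which obey the Dirac equation,

$(i\gamma^\mu\partial_\mu - m)\psi = 0$,    (A.1)

where

$\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, $\{\gamma^i\} = \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}$, $\gamma^5 = \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.    (A.2)

If we denote

$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$ and $\psi_\pm = 2^{-1/2}(\psi_1 \pm \psi_2)$,

Equation (A.1) reduces to

$i(\partial_0 - \boldsymbol{\sigma}\cdot\nabla)\psi_- = m\psi_+$, $i(\partial_0 + \boldsymbol{\sigma}\cdot\nabla)\psi_+ = m\psi_-$,    (A.3)

so that if m = 0, the two equations are decoupled. While $\psi$ has a definite parity,
$\psi_\pm$ do not. Note that if we define the projection operators

$P_L = \tfrac12(1 - \gamma_5)$, $P_R = \tfrac12(1 + \gamma_5)$

and

$\psi_{L,R} = P_{L,R}\psi$, $(\bar\psi)_{L,R} = \bar\psi P_{L,R}$,

then $\psi = \psi_L + \psi_R$.

This shows that it is more convenient to introduce the Weyl representation of the
$\gamma$-matrices in which $\gamma_5$ is diagonal,

$\gamma^0 = \begin{pmatrix} 0 & -I \\ -I & 0 \end{pmatrix}$, $\boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}$.    (A.4)

Then, $\psi_L = P_L\psi = \tfrac12(1 - \gamma_5)\psi$ has only two nonvanishing components,
so that $\psi_L$ and $\psi_R$ are effectively two-component spinors. Note that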
(A.5)

and if $\psi^c$ is the charge conjugate field,

$\psi^c = C\bar\psi^T$,

where C is the charge conjugation operator,

$C = -C^T = -C^\dagger = -C^{-1} = i\gamma^2\gamma^0$,

$\psi^c_{L,R} \equiv P_{L,R}\psi^c = C(\bar\psi_{R,L})^T$,    (A.6)

$\bar\psi^c_{L,R} = -(\psi_{R,L})^T C^{-1}$.    (A.7)


That is,

or
(A.8)

which enable us to express the total four-component field in terms of only
left-handed two-component fields $\psi_L$, $(\psi^c)_L$.
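The role of the chirality projectors in a representation with diagonal $\gamma_5$ can be made concrete with a very small sketch. The specific choice $\gamma_5 = \mathrm{diag}(1, 1, -1, -1)$ below is an assumption made for illustration (the sign and ordering depend on the representation), and the sample spinor components are arbitrary numbers.

```python
from fractions import Fraction as F

# Assumed diagonal form of gamma_5; only the fact that it is diagonal with
# eigenvalues +1 and -1 (each twice) matters for this illustration.
GAMMA5 = [1, 1, -1, -1]

# Diagonal entries of P_L = (1 - gamma_5)/2 and P_R = (1 + gamma_5)/2.
P_L = [F(1 - g, 2) for g in GAMMA5]
P_R = [F(1 + g, 2) for g in GAMMA5]

def apply(projector, spinor):
    """Apply a diagonal projector to a four-component spinor."""
    return [p * s for p, s in zip(projector, spinor)]

idempotent = all(p * p == p for p in P_L) and all(p * p == p for p in P_R)
orthogonal = all(l * r == 0 for l, r in zip(P_L, P_R))
complete = all(l + r == 1 for l, r in zip(P_L, P_R))

psi = [F(2), F(-3), F(5), F(7)]  # arbitrary sample components
psi_L = apply(P_L, psi)
psi_R = apply(P_R, psi)
recombined = [l + r for l, r in zip(psi_L, psi_R)]

print(psi_L, psi_R, recombined == psi)
```

With this assumed sign convention the left-handed part occupies the lower two components and the right-handed part the upper two, so each is effectively a two-component spinor.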
3. Spontaneous Symmetry Breaking

The SU(5) symmetric Lagrangian discussed in the previous section clearly does
not correspond to the real world. The strong, electroweak, and the new baryon
number violating gauge interactions are all governed by the same gauge coupling
parameter $g_5$. The 24 gauge bosons corresponding to the adjoint representation,
and the 15-plet fermions are all massless. The high degree of symmetry has to be
broken so that the strong, electromagnetic, and weak interactions manifest
themselves as distinct from each other, as we see them in the laboratory.
From the Glashow-Salam-Weinberg theory of electroweak interactions, we
know that the appropriate way of breaking the symmetry is to utilize the
spontaneous symmetry breaking mechanism (SSB) à la Higgs. This is the only
successful way we know at present of introducing symmetry breaking by giving
masses to some of the gauge bosons, yet maintaining the renormalizability of the
theory. In our case we want
$$SU(5) \;\xrightarrow[\text{1st stage}]{M_X}\; SU(3)_c \times SU(2) \times U(1) \;\xrightarrow[\text{2nd stage}]{M_W}\; SU(3)_c \times U(1)_{EM}.$$

That is,

$E > M_X$: All gauge bosons and all fermions are massless. The theory is SU(5) symmetric.

$M_W < E < M_X$: The X, Y gauge bosons acquire mass. SU(5) is broken to $SU(3)_c \times SU(2) \times U(1)$. The gluons, the electroweak bosons W, B, and the photon are massless. So are the fermions. This represents the 1st stage of SSB.

$E < M_W$: W, B acquire mass. The symmetry is broken further to $SU(3)_c \times U(1)_{EM}$, with only the gluons and the photon remaining massless. The fermions acquire mass as well. This represents the 2nd stage of SSB.

In this section, we shall study how this is accomplished. The procedure involves,
firstly, the choice of a suitable irreducible representation (in general, a combination


of irreducible representations) and secondly a pattern of vacuum expectation


values (VEV's) which minimizes a chosen Higgs potential.

3.1. Spontaneous Symmetry Breaking

3.1.1. 1st Stage. As we have stated before, there is no rule or principle that
determines the choice of the Higgs representation, except that the desired
symmetry breaking pattern is possible in that representation. We shall begin by
examining the adjoint representation. Let a Higgs scalar multiplet $\Phi$ belong to
the adjoint representation. In analogy with Equation (26) of Section 2, let
(1)

Like the gauge boson matrix, we can write $\Phi$ as a 5 x 5, traceless, Hermitian
matrix. Further, by using SU(5) transformations, we can bring the matrix into
a diagonal form to consider what kind of symmetry breaking pattern it can give
rise to. Recall the transformation property and the definition of the covariant
derivative of a field belonging to the adjoint representation,
(2)

Written out explicitly,


(3)

Now consider the Lagrangian

$$\mathcal{L}_\Phi = \mathrm{Tr}\,(D_\mu\Phi)^\dagger (D^\mu\Phi) - V(\Phi),$$ (4)

where $V(\Phi)$, the Higgs potential, is an SU(5) invariant polynomial in $\Phi$, restricted
to be at most of the fourth degree. Higher than fourth degree polynomials ruin the
renormalizability of the theory.
For generating gauge boson masses, we need terms that are quadratic in the
gauge fields. From (3), the terms of interest are given by

$$\frac{g_5^2}{2}\,\mathrm{Tr}\,[A_\mu, \Phi]^\dagger [A^\mu, \Phi].$$ (5)

Assuming $\Phi$ to be in the diagonal form

(6)

we ask what pattern of VEV breaks SU(5) -> SU(3) x SU(2) x U(1). It should be
clearly of the form

$$\langle\Phi\rangle = V\,\mathrm{diag}(1, 1, 1, -3/2, -3/2) = -\sqrt{15}\,V\,T_0,$$ (7)

where V is the VEV and $T_0$ is the generator corresponding to U(1) (Equation (15),
Section 2). $T_0$, being a multiple of the identity in G and W spaces, commutes with the
SU(3) and SU(2) generators. Noncommuting parts involve only the X and
Y bosons, which will then acquire mass. Thus $[A_\mu, \langle\Phi\rangle]$ is nonvanishing only in the
off-diagonal 3 x 2 and 2 x 3 blocks, where the X and Y fields appear multiplied by
$\mp\tfrac{5}{2}V$, and

$$\mathrm{Tr}\,[A_\mu, \langle\Phi\rangle]^\dagger [A^\mu, \langle\Phi\rangle] = \frac{25}{4}\,V^2\,(\bar X_i X^i + \bar Y_i Y^i),$$ (8)

implying that the X and Y bosons are no longer massless. Their masses are given
by

$$M_X^2 = M_Y^2 = \frac{25}{8}\,g_5^2 V^2.$$ (9)

$M_X$, $M_Y$ set the scale of symmetry breaking at this stage. By choosing V
appropriately, we can make the scale as high as we please. In this process of
symmetry breaking, 12 Higgs scalars disappear in giving masses to the 12
gauge bosons. Twelve survive as physical, scalar particles. It should also be
noted that the fermions belonging to 5* and 10 remain massless along with the
gluons and the electroweak bosons, since 24 does not occur in 5* x 10 nor
10 x 10 and, hence, we cannot form Yukawa-like couplings of the Higgs scalars
with the fermions. Recall that only such terms, if they exist, can give rise to the
fermion masses when the Higgs scalars take on a nonvanishing VEV.
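
The statement that only the X and Y bosons pick up a mass at the first stage can be checked numerically. The following is a minimal sketch, not the author's calculation: it assumes a generic Hermitian, traceless 5 x 5 matrix as a stand-in for the gauge boson matrix (the helper names are hypothetical) and the VEV pattern V diag(1, 1, 1, -3/2, -3/2). It confirms that the commutator vanishes inside the SU(3) and SU(2) blocks and that its squared norm is (5V/2)^2 times the squared norm of the off-diagonal blocks.

```python
import random

def commutator_with_diagonal(A, phi):
    """[A, Phi] for Phi = diag(phi): the (i, j) entry is A[i][j] * (phi[j] - phi[i])."""
    n = len(phi)
    return [[A[i][j] * (phi[j] - phi[i]) for j in range(n)] for i in range(n)]

def trace_dagger_product(M):
    """Tr(M^dagger M), i.e. the sum of |M_ij|^2 over all entries."""
    return sum(abs(x) ** 2 for row in M for x in row)

def random_hermitian_traceless(n, rng):
    """Generic Hermitian, traceless matrix (assumed stand-in for the gauge boson matrix)."""
    A = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            A[i][j] = z
            A[j][i] = z.conjugate()
    diag = [rng.uniform(-1, 1) for _ in range(n)]
    mean = sum(diag) / n
    for i in range(n):
        A[i][i] = complex(diag[i] - mean, 0.0)
    return A

V = 1.0  # illustrative value of the VEV
phi = [V, V, V, -1.5 * V, -1.5 * V]
A = random_hermitian_traceless(5, random.Random(0))
comm = commutator_with_diagonal(A, phi)

# Entries inside the 3 x 3 and 2 x 2 diagonal blocks: these must vanish.
block_norm = sum(abs(comm[i][j]) ** 2
                 for i in range(5) for j in range(5) if (i < 3) == (j < 3))

# Only the 3 x 2 block and its conjugate survive, each weighted by (5V/2)^2.
offdiag = sum(abs(A[i][j]) ** 2 for i in range(3) for j in range(3, 5))
lhs = trace_dagger_product(comm)
rhs = 2 * (2.5 * V) ** 2 * offdiag
```

If the off-diagonal entries are normalized as $X_i/\sqrt{2}$ and $Y_i/\sqrt{2}$ (an assumption about the convention of Section 2), the right-hand side becomes $(25/4)V^2(\bar X_i X^i + \bar Y_i Y^i)$, in line with the mass formula quoted above.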
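3.1.2. 2nd Stage. To break the symmetry further, that is, $SU(3)_c \times SU(2) \times U(1) \to SU(3)_c \times U(1)_{EM}$, consider a Higgs multiplet H belonging to the fundamental representation 5 of SU(5), which consists of $(3, 1, -\tfrac13)$ and $(1, 2, \tfrac12)$, and can
be denoted by

$$H = \begin{pmatrix} \chi^1 \\ \chi^2 \\ \chi^3 \\ \varphi^+ \\ \varphi^0 \end{pmatrix}, \qquad \chi = (3, 1, -\tfrac13), \qquad \varphi = (1, 2, \tfrac12).$$ (10)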


The full Lagrangian would have the added terms $(D_\mu H)^\dagger (D^\mu H)$ and V(H), which are
the KE term and the Higgs potential term, respectively. The KE term contains

$$\frac{g_5^2}{2}\,(A_\mu H)^\dagger_a\,(A^\mu H)^a$$

and, hence, it has the potential to give masses to the gauge bosons. Suppose we
choose

$$\langle H\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ v \end{pmatrix}.$$ (11)

Then

$$\frac{g_5^2}{2}\,(A_\mu H)^\dagger_a\,(A^\mu H)^a = \frac{g_5^2 v^2}{4}\left\{\tfrac12\left(W_1^2 + W_2^2\right) + \tfrac12\cdot\tfrac85\,Z^2\right\},$$ (12)

where

(13)

which shows that the Weinberg angle $\theta_W$ is given by

$$\sin^2\theta_W = \frac{3}{8}$$ (14)

and

$$M_W^2 = M_Z^2\cos^2\theta_W = \frac{g_5^2 v^2}{4}.$$ (15)

Thus, three of the four electroweak gauge bosons acquire mass leaving only the
SU(3)c gluons and the photon massless. The desired symmetry breaking is,
therefore, attainable in principle. The question is: is it realizable by a Higgs
potential as its stable absolute minimum? We shall consider this problem next, first
by examining only the adjoint representation and then the fundamental and the
adjoint together.

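3.2. Higgs Potential

3.2.1. Potential with the adjoint representation $\Phi$. The most general quartic
potential involving $\Phi$ belonging to the adjoint representation is of the form

$$V(\Phi) = -\frac{\mu_a^2}{2}\,\mathrm{Tr}(\Phi^2) + \frac{\lambda_a}{4}\,\bigl(\mathrm{Tr}(\Phi^2)\bigr)^2 + \frac{a}{2}\,\mathrm{Tr}(\Phi^4) + \frac{\beta}{3}\,\mathrm{Tr}(\Phi^3),$$ (16)

where $\mu_a^2$, $\lambda_a$, a, $\beta$ are constant, arbitrary parameters. By an appropriate SU(5)
transformation, we can bring $\Phi$ to the diagonal form

$$\Phi = \mathrm{diag}(\phi_1, \phi_2, \phi_3, \phi_4, \phi_5), \qquad \sum_i \phi_i = 0,$$ (17)

and, hence,

$$V = -\frac{\mu_a^2}{2}\sum_i \phi_i^2 + \frac{\lambda_a}{4}\Bigl(\sum_i \phi_i^2\Bigr)^2 + \frac{a}{2}\sum_i \phi_i^4 + \frac{\beta}{3}\sum_i \phi_i^3.$$ (18)

For the extrema, we require

$$\frac{\partial V}{\partial \phi_i} = 0.$$ (19)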

There is a general theorem due to Ruegg (H. Ruegg, Phys. Rev. D22, 2040
(1980)) which states that (19) can be satisfied if and only if there are two distinct
eigenvalues of $\Phi$. That is, the diagonal form (17) is essentially

either (i) $\mathrm{diag}(\phi, \phi, \phi, \phi', \phi')$, with $3\phi + 2\phi' = 0$,
or (ii) $\mathrm{diag}(\phi, \phi, \phi, \phi, \phi')$, with $4\phi + \phi' = 0$.

The alternative (ii) leads to SU(5) -> SU(4) x U(1). Confining our attention to
(i), we have the desired pattern (7) if $\phi = V$, $\phi' = -\tfrac32 V$. Then the extremum
Equation (19) leads to

$$\frac{\partial V(\Phi)}{\partial\phi} = 0 = -\frac{15}{2}\,\mu_a^2\,\phi + \Bigl(\frac{15}{2}\Bigr)^2\lambda_a\,\phi^3 + \frac{105}{4}\,a\,\phi^3 - \frac{15}{4}\,\beta\,\phi^2.$$ (20)

For simplicity, let us assume $\beta = 0$. Then (20) yields

$$\phi^2 = V^2 = \frac{2\mu_a^2}{15\lambda_a + 7a}$$ (21)

and

$$V_{\min} = -\frac{15\,\mu_a^4}{4\,(15\lambda_a + 7a)}.$$ (22)

This shows that the desired symmetry breaking pattern SU(5) -> SU(3) x
SU(2) x U(1) can be realized by a general Higgs potential constructed from the
adjoint representation $\Phi$.
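
The extremum condition and the value of the potential at the minimum can be verified numerically. The sketch below is only an illustration: it assumes the quartic potential normalized as $V = -\tfrac12\mu_a^2\,\mathrm{Tr}\Phi^2 + \tfrac14\lambda_a(\mathrm{Tr}\Phi^2)^2 + \tfrac12 a\,\mathrm{Tr}\Phi^4 + \tfrac13\beta\,\mathrm{Tr}\Phi^3$, restricted to the direction diag(1, 1, 1, -3/2, -3/2) with $\beta = 0$, and the parameter values are arbitrary illustrative numbers.

```python
def adjoint_potential(phi, mu2, lam, a, beta=0.0):
    """V evaluated on Phi = phi * diag(1, 1, 1, -3/2, -3/2) (assumed normalization)."""
    eig = [phi, phi, phi, -1.5 * phi, -1.5 * phi]
    tr2 = sum(x ** 2 for x in eig)
    tr3 = sum(x ** 3 for x in eig)
    tr4 = sum(x ** 4 for x in eig)
    return -0.5 * mu2 * tr2 + 0.25 * lam * tr2 ** 2 + 0.5 * a * tr4 + beta * tr3 / 3.0

# Illustrative parameter values only; they are not taken from the text.
mu2, lam, a = 1.0, 0.3, 0.2

v_squared = 2.0 * mu2 / (15.0 * lam + 7.0 * a)               # candidate V^2
v_min = -15.0 * mu2 ** 2 / (4.0 * (15.0 * lam + 7.0 * a))     # candidate V_min
phi0 = v_squared ** 0.5

# Central finite difference: the slope should vanish at the extremum.
h = 1e-5
slope = (adjoint_potential(phi0 + h, mu2, lam, a)
         - adjoint_potential(phi0 - h, mu2, lam, a)) / (2.0 * h)
value_at_extremum = adjoint_potential(phi0, mu2, lam, a)
```

With these inputs the slope vanishes to numerical precision and the value of the potential at $\phi = V$ reproduces the closed-form minimum, which is a useful cross-check on the coefficients 1/2, 1/4, 1/2 assumed in the potential.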

3.2.2. Potential with the fundamental representation H. The most general
potential involving H has the form

$$V(H) = -\frac{\mu_f^2}{2}\,H^\dagger H + \frac{\lambda_f}{4}\,(H^\dagger H)^2.$$ (23)

It is straightforward to add this term to $V(\Phi)$ in (16) and carry out the general
analysis. If $SU(3)_c$ survives, breaking $SU(2) \times U(1) \to U(1)_{EM}$,

$$\langle H^i\rangle = \delta^{i5}\,\frac{v}{\sqrt{2}},$$ (24)

where

$$v^2 = \frac{2\mu_f^2}{\lambda_f} \qquad\text{and}\qquad M_W^2 = \frac{g_5^2 v^2}{4}.$$ (25)

However, $V(\Phi) + V(H)$ contains two neutral color triplets, one from $\Phi$ and the
other from H. In the process of symmetry breaking, one combination is gauged
away, contributing to the mass of Y, but the other remains massless. This is
unacceptable, as it can mediate proton decay, resulting in too short a lifetime for
the proton.

3.2.3. Potential involving both $\Phi$ and H. Finally let us consider the most
general potential constructed from $\Phi$ and H including cross-terms between $\Phi$ and
H given by $V(\Phi, H)$ where
(26)

If we omit the last term in (26) (invoking for instance the additional symmetry
$\Phi \to -\Phi$, which also leads to $\beta = 0$), the analysis of the minima of

$$V = V(\Phi) + V(H) + V(\Phi, H)$$

leads to the conclusion that an absolute minimum of V can be obtained provided

$$\langle\Phi\rangle = V\,\mathrm{diag}\Bigl(1, 1, 1, -\tfrac32 - \tfrac{\epsilon}{2}, -\tfrac32 + \tfrac{\epsilon}{2}\Bigr), \qquad \langle H^a\rangle = \frac{v}{\sqrt{2}}\,\delta^{a5},$$

with $\epsilon$, V, and v determined from the following equations,

$$\epsilon = \frac{3}{20}\,\frac{\gamma_2}{a}\,\Bigl(\frac{v}{V}\Bigr)^2 + O\Bigl(\frac{v^4}{V^4}\Bigr),$$ (27)

$$\mu_a^2 = \frac{15}{2}\,\lambda_a V^2 + \frac{7}{2}\,a V^2 + \gamma_1 v^2 + \frac{9}{30}\,\gamma_2 v^2,$$ (28)

(29)

(For details see M. Magg and Q. Shafi, Z. Physik C, Particles and Fields 4, 63
(1980).)
It turns out that the mass of the surviving neutral colored triplet that can
mediate proton decay is given by

$$\mu_\chi^2 = -\tfrac{5}{2}\,\gamma_2 V^2 + O(v^2),$$

which requires that $\gamma_2 < 0$ and not have too small a value, since it is necessary
that $\mu_\chi^2 \sim M_X^2$ in order that the proton decay rate is not too fast. We also should
note that the mass hierarchy between $M_X$ and $M_W$ requires that $v^2 \lll V^2$
($v/V \sim 10^{-13}$). The question arises whether such a hierarchy can be achieved with
a reasonable choice of parameters in the potential.
A close look at this problem reveals one of the fundamental difficulties of grand
unified theories which require, in general, widely different scales of symmetry
breaking. To achieve the desired hierarchy in such scales, the parameters (in
Equation (20), for instance) require a delicate adjustment or fine tuning to
a fantastic degree of accuracy. For most reasonable values of the parameters, one
obtains $M_W \sim M_X$. Further, even if the parameters are adjusted at the tree level,
radiative corrections introduce large corrections to the scalar meson mass terms
requiring fine tuning again. As a result the whole procedure of SSB has become
suspect. A partial solution to this hierarchy and fine tuning problem is to be found
in SUSY models in which one can arrange cancellation between symmetric boson
and fermion contributions, thereby avoiding quadratically divergent contributions to the scalar meson mass terms. But SUSY models give rise to new
problems.

3.3. Fermion Masses

If the original Lagrangian is to be SU(5) gauge invariant, there cannot be bare


fermion mass terms in the Lagrangian. Masses for the fermions arise after SSB
due to fermion-Higgs, Yukawa-type couplings. It is straightforward to verify that,
in the minimal SU(5) model with only 24 and 5 Higgs representations, it is only
the 5 representation that couples to the fermions in the model. The relevant part
of the Lagrangian $\mathcal{L}_{FH}$ is given by

(30)
where, for the sake of completeness, we have introduced the generation structure
of fermions in the model with the index i, j = 1, 2, 3. The first term in (30)
corresponds to (5* x 10) fermion couplings and the second corresponds to those
of 10 x 10 fermions. When the symmetry is broken with

$$\langle H_a\rangle = \delta_{a5}\,\frac{v}{\sqrt{2}},$$

we have mass matrices

$$M^{(d)}_{ij} = \frac{v}{\sqrt{2}}\,\Gamma_{ij} \qquad\text{and}\qquad M^{(u)}_{ij} = \frac{v}{\sqrt{2}}\,\Gamma'_{ij}$$

in the down- and up-charge sectors, respectively. When these mass matrices are
diagonalized, we can define physical, mass eigenstates for the quarks. In minimal
SU(5), the bare masses thus obtained satisfy the relations,

$$m_d = m_e, \qquad m_s = m_\mu, \qquad m_b = m_\tau.$$ (31)

These relations between quark and lepton masses will be modified by radiative
corrections. We shall study this aspect in the next section.

4. Predictions of Minimal SU(5)

In this section, we shall discuss three of what are considered to be the classical
predictions of SU(5). These include (i) the value of $\sin^2\theta_W$ in the measured low
energy region, (ii) the mass of the b-quark, (iii) the proton decay and its lifetime.
The first two predictions combine the exact SU(5) limit with renormalization group
ideas to calculate the energy dependence of the coupling constants and the
effective mass of the fermions. The last prediction is the most dramatic one. That
these predictions follow more or less unambiguously has made minimal SU(5) the
most popular GUT model. Other GUT models contain additional parameters
and therefore do not lead to clear-cut predictions.

4.1. The Weinberg Angle; Coupling Constant Renormalization

In the standard theory of electroweak interactions,

$$\sin^2\theta_W = \frac{e^2}{g^2} = \frac{g'^2}{g^2 + g'^2},$$

where g, g' are the SU(2) and U(1) gauge couplings, respectively. Thus $\sin^2\theta_W$ is
a totally free parameter. In SU(5), there is only one gauge coupling parameter $g_5$,
and in the exact SU(5) limit,

$$g_3 = g_2 = g_1 = g_5,$$ (1)

where $g_3$ is the $SU(3)_c$ gauge coupling, $g_2 = g$, and $g_1 \propto g'$. The above equality
follows when the generators of all the relevant subgroups are normalized in the
same way, namely

$$\mathrm{Tr}(T_i T_j) = \tfrac12\,\delta_{ij}.$$ (2)

Recall that the electric charge generator Q is

$$Q = T_3 + \sqrt{\tfrac{5}{3}}\,T_0$$

with

$$Q = \mathrm{diag}(-\tfrac13, -\tfrac13, -\tfrac13, 1, 0)$$

and

$$T_0 = \frac{1}{\sqrt{15}}\,\mathrm{diag}(-1, -1, -1, \tfrac32, \tfrac32).$$

Since $\sqrt{5/3}\,T_0$ is the weak hypercharge generator in the standard theory, it follows
that

$$g' = \sqrt{\tfrac{3}{5}}\,g_1$$ (3)

and, hence,

$$\sin^2\theta_W = \frac{3/5}{1 + 3/5} = \frac{3}{8} = 0.375.$$ (4)
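
The value 3/8 can be recovered with exact rational arithmetic from the diagonal generators on the fundamental 5. This is a small sketch under stated assumptions: the ordering of the entries (three colour components followed by the SU(2) doublet) and the charge assignments are taken as Q = diag(-1/3, -1/3, -1/3, 1, 0) and $T_3$ = diag(0, 0, 0, 1/2, -1/2), and the trace relation $\sin^2\theta_W = \mathrm{Tr}(T_3^2)/\mathrm{Tr}(Q^2)$, which holds when a single coupling multiplies identically normalized generators, is used.

```python
from fractions import Fraction as F

# Diagonal generators on the 5 of SU(5); the ordering of entries is an assumption.
T3 = [F(0), F(0), F(0), F(1, 2), F(-1, 2)]
Q = [F(-1, 3), F(-1, 3), F(-1, 3), F(1), F(0)]

def trace_sq(generator):
    """Trace of the square of a diagonal generator."""
    return sum(x * x for x in generator)

# Hypercharge part of the charge operator, Q - T3, proportional to T_0.
hypercharge_part = [q - t for q, t in zip(Q, T3)]

# If Q - T3 = c * T_0 with Tr(T_0^2) = 1/2, then c^2 = Tr((Q - T3)^2) / (1/2).
c_squared = trace_sq(hypercharge_part) / F(1, 2)

# With one unified coupling, sin^2(theta_W) = Tr(T3^2) / Tr(Q^2).
sin2_theta_w = trace_sq(T3) / trace_sq(Q)

# Equivalent route through g'^2 / g^2 = 1 / c^2.
ratio = 1 / c_squared
sin2_from_couplings = ratio / (1 + ratio)
```

Both routes give the same rational number, and $c^2 = 5/3$ is the factor that appears in the charge operator above.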

The experimental value of 0.215 ± 0.015 indicates a clear discrepancy until we
realize that the relation (4) holds in the exact SU(5) limit, that is, for energies $E \gg M_X$.
The measured value of $\sin^2\theta_W$ is at much lower energies $E \lesssim M_W$, where SU(5) is
broken. The couplings have departed, have 'run' away, from their asymptotic
values. We must evaluate them at lower energies using Equation (1) as the
asymptotic boundary condition. The renormalization group (RNG) analysis
provides the necessary tool and the equations that govern the energy-dependence
of the coupling parameters. In what follows, we shall note a few salient features of
RNG ideas, before we apply them to our problem.
(a) Perturbative calculations in relativistic field theories lead to infinities
which need to be properly subtracted by absorbing them into the redefinition or
renormalization of physical quantities such as mass and charge.
(b) Physical quantities are defined by choosing certain, convenient boundary
conditions at physical (or sometimes unphysical) points. In the $\lambda\phi^4$ self-interacting scalar meson theory, for instance, one defines the renormalized
four-point coupling $\lambda$ by

$$\Gamma_R^{(4)}(p_1, p_2, p_3, p_4) = -i\lambda \quad\text{at } p_i = 0 \quad\text{or at}\quad p_i^2 = \mu^2.$$

(c) Different ways of defining the physical quantities clearly give rise to different
renormalized physical quantities. However, they must be related in a definite way
if physics is not to depend upon the arbitrary choice of defining the parameters.
Consider, for instance, two different renormalization schemes, R and R',

$$\phi_R = Z_\phi^{-1/2}(R)\,\phi_0, \qquad \phi_{R'} = Z_\phi^{-1/2}(R')\,\phi_0,$$ (5)

where $Z_\phi(R)$ and $Z_\phi(R')$ are infinite, wavefunction renormalization constants in
the two schemes. Hence, eliminating $\phi_0$,

$$\phi_R = \frac{Z_\phi^{-1/2}(R)}{Z_\phi^{-1/2}(R')}\,\phi_{R'} = Z_\phi(R, R')\,\phi_{R'}.$$ (6)

Now $Z_\phi(R, R')$ must be finite since it relates two renormalized fields. Similar
considerations should hold for the coupling and mass parameters in the theory.
R -> R' can be viewed as a transformation. The set of all such transformations
forms a group which is called the renormalization group. The translation of the
above statement into an analytic form leads to the Gell-Mann-Low, or Callan-Symanzik, equations which give the energy dependence of the coupling parameters. We shall not derive these equations, but use the results.
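(d) Coupling parameters of a non-Abelian gauge theory are governed by the
equations

$$\mu\,\frac{d g_n(\mu)}{d\mu} = b_n\,g_n^3(\mu).$$ (7)

Therefore,

$$\frac{1}{g_n^2(\mu)} = \text{const} - 2 b_n \ln\mu$$ (8)

and considering the value at $\mu = M_X$ = grand unification mass,

$$\frac{1}{g_n^2(\mu)} = \frac{1}{g_n^2(M_X)} + 2 b_n \ln\frac{M_X}{\mu} \qquad \text{for } \mu \le M_X.$$ (9)

If at $M_X$, $g_3 = g_2 = g_1 = g_5$, we have the basic equations in the minimal SU(5)
model,

$$\frac{1}{\alpha_3(\mu)} = \frac{1}{\alpha_5(M_X)} + 8\pi b_3 \ln\frac{M_X}{\mu},$$
$$\frac{1}{\alpha_2(\mu)} = \frac{1}{\alpha_5(M_X)} + 8\pi b_2 \ln\frac{M_X}{\mu},$$
$$\frac{1}{\alpha_1(\mu)} = \frac{1}{\alpha_5(M_X)} + 8\pi b_1 \ln\frac{M_X}{\mu},$$ (10)

where $\alpha_i = g_i^2/4\pi$.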

The coefficients $b_n$ are calculated in perturbation theory using 'dimensional'
regularization. If we neglect the contribution due to the exchange of Higgs
scalars,

$$b_n = \frac{2N_F - 11n}{48\pi^2} \qquad \text{for } n \ge 2.$$ (11)

For n = 1,

$$b_1 = \frac{2N_F}{48\pi^2},$$ (12)

where $N_F$ is the number of quark flavors ($N_F = 6$ for three generations).
Using (10), (11), and (12), we can derive the following three important relations:

(i)
$$\sin^2\theta_W = \frac{1}{6} + \frac{5}{9}\,\frac{\alpha(\mu)}{\alpha_s(\mu)},$$ (13)

which relates $\sin^2\theta_W$ to the measurable, electromagnetic (fine-structure) coupling
parameter $\alpha(\mu)$ and the strong interaction QCD coupling $\alpha_3(\mu) = \alpha_s(\mu)$.

(ii)
$$\ln\frac{M_X}{\mu} = \frac{\pi}{11}\left(\frac{1}{\alpha(\mu)} - \frac{8}{3}\,\frac{1}{\alpha_s(\mu)}\right),$$ (14)

which is a very important relation as it enables us to estimate the GUT mass scale
$M_X$ from the knowledge of $\alpha(\mu)$ and $\alpha_s(\mu)$.

(iii)
$$\sin^2\theta_W = \frac{3}{8} - \frac{55}{24\pi}\,\alpha(\mu)\ln\frac{M_X}{\mu}.$$ (15)

This is not an independent, but a useful relation that follows from (13) and (14),
which exhibits an alternate way of estimating $M_X$ from the knowledge of $\sin^2\theta_W$
and $\alpha(\mu)$. Incidentally, it also shows that $\sin^2\theta_W < \tfrac38$.
Recall that $M_W^2 = M_Z^2\cos^2\theta_W$. An accurate determination of $M_W$ and $M_Z$
gives an accurate determination of $\sin^2\theta_W$, which then provides a precise
determination of $M_X$.
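
The three relations can be exercised with a few lines of arithmetic. The sketch below assumes the one-loop forms $\sin^2\theta_W = 1/6 + (5/9)\,\alpha/\alpha_s$, $\ln(M_X/\mu) = (\pi/11)(1/\alpha - 8/(3\alpha_s))$ and $\sin^2\theta_W = 3/8 - (55/24\pi)\,\alpha\ln(M_X/\mu)$; the input couplings at 80 GeV are illustrative assumptions, not values quoted in the text, and Higgs and threshold effects are ignored.

```python
import math

def log_unification_ratio(alpha_em, alpha_s):
    """ln(M_X / mu) from the one-loop relation between alpha and alpha_s."""
    return (math.pi / 11.0) * (1.0 / alpha_em - 8.0 / (3.0 * alpha_s))

def sin2_theta_w_from_couplings(alpha_em, alpha_s):
    """sin^2(theta_W) expressed through alpha and alpha_s."""
    return 1.0 / 6.0 + (5.0 / 9.0) * alpha_em / alpha_s

def sin2_theta_w_from_scale(alpha_em, log_ratio):
    """sin^2(theta_W) expressed through alpha and ln(M_X / mu)."""
    return 3.0 / 8.0 - 55.0 / (24.0 * math.pi) * alpha_em * log_ratio

# Illustrative inputs at mu = 80 GeV (assumed values, for demonstration only).
mu = 80.0               # GeV
alpha_em = 1.0 / 128.0
alpha_s = 0.12

log_ratio = log_unification_ratio(alpha_em, alpha_s)
m_x = mu * math.exp(log_ratio)
s_from_couplings = sin2_theta_w_from_couplings(alpha_em, alpha_s)
s_from_scale = sin2_theta_w_from_scale(alpha_em, log_ratio)

print(f"M_X ~ {m_x:.2e} GeV, sin^2(theta_W) ~ {s_from_couplings:.3f}")
```

The two expressions for $\sin^2\theta_W$ agree identically, which illustrates that the third relation is not independent of the first two. Because $M_X$ depends exponentially on $1/\alpha_s$, small changes in the assumed strong coupling move the unification scale by large factors.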
These derivations, complicated as they are, involve several simplifying
assumptions. They neglect the scalar meson contributions and the proper
threshold behavior. When they are included, the formulae become even more
complicated and, to some extent, lose their predictive power. It is interesting,
however, that the qualitative behavior of $\sin^2\theta_W$ and $M_X$ that emerges from these
formulae is in accordance with experiments. From (13) and (14), we can
estimate

$$\sin^2\theta_W = 0.206^{+0.016}_{-0.004}, \qquad M_X = 3.6^{+3.4}_{-3.2} \times 10^{14}\ \text{GeV},$$

at $\mu$ = 80 GeV.

In estimating the above numbers, we need the values of $\alpha(\mu)$ and $\alpha_s(\mu)$ at
$\mu$ = 80 GeV. In addition, we need to assume a value for the top-quark mass. We
know that $\alpha^{-1}(0) = 137$. To evaluate it at 80 GeV, where strong interactions are
important, we need to use a dispersion theoretic treatment of the $e^+e^-$ total cross-section. To evaluate $\alpha_s(\mu)$, we need the value of the QCD parameter $\Lambda$. The
above-mentioned values for $\sin^2\theta_W$ and $M_X$ assume

$$50\ \text{MeV} \lesssim \Lambda_{\overline{MS}} \lesssim 500\ \text{MeV} \qquad\text{and}\qquad m_t = 20\ \text{GeV}.$$

There are currently uncertainties in the values of these parameters, but the values
for $\sin^2\theta_W$ and $M_X$ are reasonably stable. They are in good agreement with
experiments, and therefore constitute one of the successful predictions of the
model.

4.2. Mass of the b-Quark; Mass Renormalization and Effective Mass Operator

The mass relations (31) discussed in the previous section are, as remarked earlier,
consequences of exact SU(5) symmetry. Like the coupling constants, they
undergo renormalization. From RNG considerations, the equation for 'effective'
mass or 'running' mass is

$$\frac{d\ln m(\mu)}{d\ln\mu} = b_m^{(n)}\,g_n^2(\mu),$$ (16)

where $b_m^{(n)}$ is the coefficient (like $b_n$ in the case of coupling parameters) that arises in
the fermion self-energy calculation. Using Equation (9) for $g_n^2(\mu)$, we obtain

$$\frac{d\ln m(\mu)}{d\ln\mu} = \frac{g_n^2(\mu_0)\,b_m^{(n)}}{1 + 2 b_n\,g_n^2(\mu_0)\ln(\mu_0/\mu)}.$$ (17)

The solution of (17) gives

$$\frac{m(\mu)}{m(\mu_0)} = \left[\frac{g_n(\mu)}{g_n(\mu_0)}\right]^{b_m^{(n)}/b_n},$$ (18)

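where $b_n$ are given by Equations (11) and (12). If we consider the graph in Figure 1

Fig. 1.

and note

$$\sum_a (T^a T^a)_{ij} = \frac{n^2 - 1}{2n}\,\delta_{ij} \qquad \text{for SU}(n),\ n \ge 2,$$ (19)

$$(T^0)^2 = \frac{3}{5}\Bigl(\frac{Y}{2}\Bigr)^2 \qquad \text{for U(1)},$$ (20)

we get, in our case of SU(5) -> $SU(3)_c \times SU(2) \times U(1)$,

$$\frac{m_{d,s,b,\ldots}(\mu)}{m_{d,s,b,\ldots}(M_X)} = \prod_{n=1}^{3}\left[\frac{g_n(\mu)}{g_n(M_X)}\right]^{b_m^{(n)}/b_n},$$ (21)

$$\frac{m_{e,\mu,\tau,\ldots}(\mu)}{m_{e,\mu,\tau,\ldots}(M_X)} = \prod_{n=1}^{2}\left[\frac{g_n(\mu)}{g_n(M_X)}\right]^{b_m^{(n)}/b_n},$$ (22)

where the exponents follow from (18), with the coefficients $b_m^{(n)}$ fixed by (19) and (20) for the quark and the lepton representations, respectively,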

which lead to

$$\frac{m_b(\mu)}{m_\tau(\mu)} = \left[\frac{g_3(\mu)}{g_5(M_X)}\right]^{8/(11 - \frac{2}{3}N_F)} \left[\frac{g_1(\mu)}{g_5(M_X)}\right]^{3/N_F},$$ (23)

where we have used $g_3(M_X) = g_1(M_X) = g_5(M_X)$. At $\mu$ = 10 GeV, Equation (23)
yields

$$\frac{m_b}{m_\tau} = 2.9 \pm 0.2,$$

to be compared with the experimental value of 2.6 - 2.9, which clearly must be
regarded as a successful prediction of the minimal SU(5). However, the same
considerations lead to

$$\frac{m_s}{m_d} = \frac{m_\mu}{m_e},$$

which is totally in contradiction with known considerations. $m_\mu/m_e$ is approximately 200, whereas, according to current algebra estimates, $m_s/m_d$ is approximately 20. This discrepancy is generally blamed on the Renormalization Group
Equation. It is not expected to hold for very light quarks. (For a detailed
discussion of this topic, see the original paper of Buras et al., Nucl. Phys. B135, 66
(1978).)
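
As a rough illustration of how the running of the couplings lifts $m_b/m_\tau$ above its unification value of 1, the sketch below evaluates the leading-log ratio in terms of $\alpha_i = g_i^2/4\pi$. It assumes the one-loop form $m_b/m_\tau = (\alpha_3/\alpha_5)^{4/(11 - 2N_F/3)}(\alpha_1/\alpha_5)^{3/(2N_F)}$ with $m_b = m_\tau$ at $M_X$; the function name and the three coupling values at 10 GeV are illustrative assumptions, not numbers from the text.

```python
def mass_ratio_b_tau(alpha_3, alpha_1, alpha_5, n_flavors=6):
    """Leading-log m_b / m_tau, assuming m_b = m_tau at the unification scale."""
    strong_exponent = 4.0 / (11.0 - 2.0 * n_flavors / 3.0)
    hypercharge_exponent = 3.0 / (2.0 * n_flavors)
    return ((alpha_3 / alpha_5) ** strong_exponent
            * (alpha_1 / alpha_5) ** hypercharge_exponent)

# Assumed, illustrative couplings at mu = 10 GeV and at the unification scale.
ratio = mass_ratio_b_tau(alpha_3=0.18, alpha_1=0.017, alpha_5=0.024)

# At the unification scale all couplings coincide and the ratio is exactly 1.
ratio_at_unification = mass_ratio_b_tau(0.024, 0.024, 0.024)
```

The strong coupling dominates the enhancement, since $\alpha_3$ grows toward low energies while $\alpha_1$ decreases slightly. The SU(2) factor is common to quarks and leptons and drops out of the ratio.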

4.3. Proton Decay

The most dramatic prediction of any grand unified theory is the instability of the
proton due to baryon number-violating interactions. If the interactions are
unified, if the quarks and the leptons are in the same multiplets, it is to be expected
that there are B- and L-violating processes in the energy region E > M x. The
question is whether interactions leading to such processes can produce observable consequences in the low energy region and whether we can calculate their
effects with some measure of certainty. As a grand unified theory, minimal SU(5)
is well-defined; the most important feature being that there is only one GUT mass
scale M x. However, as we shall see, in calculating the life-time of the proton and
the various branching ratios, it is necessary to invoke other theoretical considerations over and above those of GUTs. Predicted life-time, even in the
minimal SU(5), is uncertain to the extent of two orders of magnitude.
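The interaction vertices responsible for B and L violations are shown in Figure 2.

Fig. 2. (B- and L-violating vertices of the X and Y bosons, labelled by their B and L values.)

They involve the lepto-quark gauge bosons X and Y, leading to the interaction
Lagrangian,

$$\mathcal{L} = \frac{g_5}{\sqrt{2}}\,\bar X_\mu^\alpha\left[\bar d_{R\alpha}\gamma^\mu e_R^+ + \bar d_{L\alpha}\gamma^\mu e_L^+ + \epsilon_{\alpha\beta\gamma}\,\bar u_L^{c\gamma}\gamma^\mu u_L^\beta\right] + \frac{g_5}{\sqrt{2}}\,\bar Y_\mu^\alpha\left[-\bar d_{R\alpha}\gamma^\mu \nu_R^c - \bar u_{L\alpha}\gamma^\mu e_L^+ + \epsilon_{\alpha\beta\gamma}\,\bar u_L^{c\gamma}\gamma^\mu d_L^\beta\right] + \text{h.c.}$$ (24)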

The above interaction Lagrangian gives rise to tree-level graphs (Figure 3)

Fig. 3.

leading to decays

$$p \to \nu^c + X, \quad\text{etc.}$$

The vertices in (24) satisfy (as can be seen from the baryon and lepton numbers
under the interaction vertices) $\Delta B \neq 0$, $\Delta L \neq 0$, but $\Delta(B - L) = 0$. When we extend
the model to include several families, we obtain the selection rules $\Delta S = 0$, $\Delta S = -\Delta B$.
These are selection rules characteristic of minimal SU(5). The tree-level graphs,
when extrapolated down to 1 GeV, become effectively four-fermion interactions
with coupling strength $\propto g_5^2/M_X^2$. The effective interaction $\mathcal{L}_{\rm eff}$ is given by

$$\mathcal{L}_{\rm eff} = \frac{g_5^2}{2M_X^2}\left[\epsilon_{\alpha\beta\gamma}\,(\bar u_L^{c\gamma}\gamma^\mu u_L^\beta)\,(2\,\bar e_L^+\gamma_\mu d_L^\alpha + \bar e_R^+\gamma_\mu d_R^\alpha) - \epsilon_{\alpha\beta\gamma}\,(\bar u_L^{c\gamma}\gamma^\mu d_L^\beta)\,(\bar\nu_R^c\gamma_\mu d_R^\alpha)\right] + \text{h.c.},$$ (25)

where $\alpha$, $\beta$, $\gamma$ denote the color indices. The family structure is not included, but
Fierz transformations and Fierz identities are used to obtain the effective
interaction in the above form.
In a more general analysis one, therefore, constructs all four-fermion operators
subject to SU(3) x SU(2) x U(1) invariance (S. Weinberg, Phys. Rev. Lett. 43,
1566 (1979); F. Wilczek and A. Zee, Phys. Rev. Lett. 43, 1571 (1979)). The lowest
dimension of such an operator which violates the baryon number is six (four
fermions), leading, almost on dimensional arguments, to

$$\Gamma_{\rm proton} \sim \frac{\alpha_5^2\,m_p^5}{M_X^4}.$$

Hence,

$$\tau_p \sim \frac{M_X^4}{\alpha_5^2\,m_p^5}.$$ (26)


To perform a detailed estimate, it is necessary (1) to renormalize the effective
Lagrangian written down at the mass scale $M_X$ to bring it down to the hadron
scale of 1 GeV. One has to resort to RNG techniques, as in the case of the
coupling constant and mass renormalizations we discussed in Sections 4.1 and 4.2. (2)
We have to find the matrix elements of the effective interaction (which is in terms
of quark and lepton fields) between hadronic states. A variety of phenomenological hadron-physics techniques, SU(6), relativistic bag models, current algebra
techniques based on chiral symmetry, have been used to evaluate the matrix
elements of $\mathcal{L}_{\rm eff}$ between hadron states such as the proton and the pi-meson.
Different methods give varying answers differing approximately by a factor of
25-30 in the proton lifetime. (For some of these details, see O. Kaymakcalan,
C. H. Lo and K. C. Wali, Phys. Rev. D29, 1962 (1984).)
The lifetime depends most sensitively on $M_X$ which, in turn, depends sensitively
on the QCD parameter $\Lambda$ in the RNG equations. The current uncertainty in the
value of the latter parameter leads to an uncertainty in the proton lifetime by
a factor of $10^2$. This situation may be improved in the future by the precise
determination of $M_W$, $M_Z$ and, hence, $\sin^2\theta_W$.
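
The sensitivity of the lifetime to $M_X$ can be made concrete with a purely dimensional estimate. The sketch below assumes $\tau_p \sim M_X^4/(\alpha_5^2 m_p^5)$ in natural units, converted to years; the value of $\alpha_5$ is an illustrative assumption, the function name is hypothetical, and hadronic matrix elements and renormalization factors are ignored, so only the order of magnitude and the scaling with $M_X$ are meaningful.

```python
HBAR_GEV_S = 6.582e-25      # hbar in GeV * s, converts 1/GeV to seconds
SECONDS_PER_YEAR = 3.156e7

def proton_lifetime_years(m_x_gev, alpha_5, m_p_gev=0.938):
    """Dimensional estimate tau_p ~ M_X^4 / (alpha_5^2 m_p^5), in years."""
    tau_inverse_gev = m_x_gev ** 4 / (alpha_5 ** 2 * m_p_gev ** 5)
    return tau_inverse_gev * HBAR_GEV_S / SECONDS_PER_YEAR

alpha_5 = 1.0 / 42.0        # assumed illustrative value of the unified coupling

central = proton_lifetime_years(3.6e14, alpha_5)
shifted = proton_lifetime_years(3.0 * 3.6e14, alpha_5)

print(f"tau_p ~ {central:.1e} years for M_X = 3.6e14 GeV")
print(f"tripling M_X scales the lifetime by {shifted / central:.0f}")
```

Because the lifetime scales as the fourth power of $M_X$, a factor of three in the unification mass changes the estimate by nearly two orders of magnitude, which is the origin of the large uncertainty quoted above.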

5. Baryon Asymmetry

As stated in the first section, astrophysics and the early Universe are the natural
arena for testing grand unified theories. In this and the next and last section, we
shall discuss two topics which have received a great deal of attention in recent
years to illustrate how astrophysics can and does provide guidelines for GUTs.
We live in a particle-antiparticle asymmetric universe. Lucky indeed! If there
were an equal amount of matter and antimatter in the Universe, we could infer from
annihilation processes that, the baryon number density $n_B$ and the anti-baryon
number density $n_{\bar B}$ being equal,
(1)

where $n_\gamma$ is the number density of the photons. Instead, we find $n_B \simeq 10^{-8}\,n_\gamma$,
which indeed is much, much larger than expected, suggesting an excess of matter
over antimatter in the Universe, which prevents complete annihilation.
How does this asymmetry originate? Three possible scenarios have been
suggested in the literature: (i) The asymmetry is an initial condition of the
Universe. The Universe was born with an asymmetry which was preserved
throughout its evolution. (ii) The whole Universe is symmetric, but it is spatially
nonuniform; some regions have an excess of baryons, other regions have an
excess of anti-baryons. (iii) The Universe was originally symmetric; the asymmetry
developed in the course of the evolution of the Universe.
The last alternative is regarded by most as the most satisfactory solution to the
problem. This requires, however, as noted by Sakharov a long time ago,
(a) Baryon number nonconserving processes,
(b) CP violation,
(c) Departure from thermal equilibrium.
As discussed earlier, it is natural to expect $\Delta B \neq 0$ processes in GUTs. Since CP
violation is observed in the $K$-$\bar K$ system, it should be incorporated in a complete
theory such as a GUT. The third condition (c) is provided by the expansion of the
Universe. Hence, GUTs combined with the expansion and consequent cooling of
the Universe as embodied in Big Bang Cosmology can, in principle, provide an
answer to the long standing puzzle.
To see this in more detail, let us begin considering our Universe at temperature
$kT \simeq M_P \simeq 10^{19}$ GeV ($M_P \equiv$ Planck mass). At this temperature and above,
gravitational interactions dominate. They, along with other interactions, bring
about thermal equilibrium for all particles containing equal numbers of quarks
and antiquarks. At this stage, the baryon number of the Universe is zero.
The Universe expands; it cools and the temperature falls below that
corresponding to the Planck mass. Consider the existence of some superheavy
particles such as the gauge bosons X, Y, or Higgs particles that survive in the
process of symmetry breaking in GUTs. Let X be the generic name for such
a superheavy particle.
Now, the rates $\Gamma_X$ for X decay, $\Gamma_c$ for collisions, at a temperature T of the
universe, are given by

(2)
where N is the number of helicity states in the theory, $M_X$ is the mass of X, and
$\alpha_X = g^2/4\pi$ if X is a gauge boson with gauge coupling g; if X is a Higgs boson,
$\alpha_X = \alpha_H$ is instead set by an average fermion mass m. Typically $N \simeq 100$,
$\alpha_X \simeq 10^{-2}$, $\alpha_H \simeq 10^{-5}, 10^{-6}$.
The expansion rate of the Universe, $\dot R(t)/R(t) = H$, the Hubble constant, obeys
the equation

$$H = \left[\frac{8\pi G}{3}\,\rho\right]^{1/2},$$ (3)

where $\rho$, the proper energy density, is given by

$$\rho = N\!\left(\frac{m}{T}\right)\frac{\pi^2}{30}\,T^4.$$ (4)

N(m/T) is the number of helicity states of particles of mass m at temperature T.
(For the derivation of these equations and other details of Big Bang Cosmology,
see Chapter 17.)
When the temperature T is just below the Planck temperature, $\Gamma_X$, $\Gamma_c$ are much
smaller than H. There is no significant decay of X. They are maintained in thermal
equilibrium. The decays of X and $\bar X$ start becoming significant when $\Gamma_X = H$.
Suppose this happens at a temperature $T_D$, that is, $T_D$ given by
(5)

(where we have used (3) and (4) to obtain the expression for H ($N(m/T) \simeq$ constant)). Now, if $T_D \gg M_X$, the X bosons will still be in equilibrium at $T_D$ and no
asymmetry will be produced. If $T_D \lesssim M_X$, however, the X-particle decays upset
the equilibrium distribution. For this to happen,
(6)

Gauge and Higgs bosons with masses satisfying the constraint (6) do exist in
GUTs. Their baryon number violating decays can, therefore, produce the desired
asymmetry.
Let X decay into two such channels. If the branching ratio of the first channel is
r, by unitarity, the branching ratio of the other is (1 - r). Let the baryon numbers
of the two channels be $B_1$ and $B_2$, respectively. If the corresponding branching
ratios of $\bar X$ are $\bar r$ and $(1 - \bar r)$, the average baryon number produced by the decays
of one pair of X and $\bar X$ is $\Delta B$, where

$$\Delta B = \tfrac12\left[r B_1 + (1 - r) B_2 + \bar r(-B_1) + (1 - \bar r)(-B_2)\right] = \tfrac12\,(r - \bar r)(B_1 - B_2).$$ (7)

Note that for $\Delta B \neq 0$,

$B_1 \neq B_2$ => Baryon number is not conserved,
$r \neq \bar r$ => CP is violated.
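
The bookkeeping behind the mean baryon number per pair can be sketched with exact fractions. The channel assignments below (a two-quark channel with B = 2/3 and an antiquark-antilepton channel with B = -1/3) and the branching ratios are hypothetical illustrations, not values from the text; the function simply sums the four contributions and compares the result with the factored form $\tfrac12(r - \bar r)(B_1 - B_2)$.

```python
from fractions import Fraction

def mean_baryon_number(r, r_bar, b1, b2):
    """Average baryon number per X, X-bar pair, summed channel by channel."""
    return Fraction(1, 2) * (r * b1 + (1 - r) * b2
                             + r_bar * (-b1) + (1 - r_bar) * (-b2))

# Hypothetical channels: X -> q q (B = 2/3) and X -> qbar lbar (B = -1/3).
b1, b2 = Fraction(2, 3), Fraction(-1, 3)
# Illustrative branching ratios for X and X-bar.
r, r_bar = Fraction(51, 100), Fraction(49, 100)

delta_b = mean_baryon_number(r, r_bar, b1, b2)
factored = Fraction(1, 2) * (r - r_bar) * (b1 - b2)

# No asymmetry without CP violation (r = r_bar) or without B violation (b1 = b2).
no_cp_violation = mean_baryon_number(r, r, b1, b2)
no_b_violation = mean_baryon_number(r, r_bar, b1, b1)
```

Setting either $r = \bar r$ or $B_1 = B_2$ makes the result vanish identically, which is the content of the two conditions listed above.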
Further analysis shows that tree-level graphs are not sufficient to produce the
asymmetry. The CPT theorem intervenes, cancelling the contributions from X and $\bar X$.
It is necessary to have higher-order loop graphs (Figure 4).

Fig. 4. (Tree-level graph, amplitude $A_0$, and higher-order graph.)

Then, we have a situation in which

X-decay amplitude: $A_0 + A_1\,I(s + i\epsilon)$,
$\bar X$-decay amplitude: $A_0^* + A_1^*\,I(s + i\epsilon)$,

where I is the loop integral, and $A_0$ and $A_1$ are effective coupling strengths in the tree
and one-loop graphs. And

$$\Delta B \propto \int d\omega\,\left[\,|A_0 + A_1 I(s + i\epsilon)|^2 - |A_0^* + A_1^* I(s + i\epsilon)|^2\,\right] = 4\int d\omega\,\left[\mathrm{Im}(A_0 A_1^*)\,\mathrm{Im}\,I(s + i\epsilon)\right],$$ (8)

which shows that in addition to complex couplings (CP-violation) one needs an
s-channel discontinuity to generate baryon asymmetry.
The observed asymmetry is determined by $\Delta B$ and the density of X-particles at
the time of decay. From the distribution function just before the decay, the final
formula is

$$\frac{n_B}{n_\gamma} \simeq \frac{45}{2\pi^4}\,\zeta(3)\,\frac{N_X}{N}\,\Delta B,$$ (9)

where

$N_X$ = number of X-particle states whose decays violate B and C,
N = total number of particle states,
$\zeta(3)$ = Riemann $\zeta$ function.

In minimal SU(5), the gauge couplings are real. One needs to go to
higher-order loop graphs. The calculated asymmetry is too small. The color
triplet of the 5-plet Higgs is superheavy. It can decay into (quark + lepton) as well
as two anti-quarks. Hence, it is a good candidate. But it turns out that interference
between tree and one loop does not generate CP-violation. It is necessary to
consider three loop graphs. The result is that the net baryon violation
$\Delta B \simeq 10^{-18}$, too small to account for the observed asymmetry. More Higgs
multiplets are necessary. Thus, one needs to go beyond minimal SU(5) in order to
obtain the observed value for $n_B/n_\gamma$. However, when one goes beyond minimal
SU(5), the predictive power is lost. One can tune in the desired value by adjusting
the parameters. The situation is the same with other GUT models such as SO(10).

6. Phase Transitions in the Early Universe

As we have seen, in GUTs the symmetries of a Lagrangian invariant under local,
non-Abelian gauge transformations belonging to a group G undergo, in one or
several stages, spontaneous symmetry breakdown:

$$G \to \cdots \to SU(3)_c \times SU(2) \times U(1) \to SU(3)_c \times U(1)_{EM}.$$ (1)

We may view this sequence of successive spontaneous symmetry breakings in the
Lagrangian describing the fundamental interactions as corresponding to successive phase transitions in the early Universe. Thus when $M_P \gtrsim kT \gtrsim M_X$, the
Lagrangian has G ≡ SU(5) symmetry; we may say the Universe is in the SU(5)
symmetric phase. When $M_W < T < M_X$, and G -> $SU(3)_c \times SU(2) \times U(1)$, the
Universe is in the H = $SU(3)_c \times SU(2) \times U(1)$ phase, and so on.
The Higgs potential described in earlier sections acquires a temperature dependence due to radiative corrections evaluated in the background of a hot Universe. The effective potential $V_{\rm eff}(\phi, T)$ takes the form

$$
V_{\rm eff}(\phi, T) = V(\phi, T = 0) + V'(\phi, T),
$$

where $V(\phi, T = 0)$ is the customary Higgs potential and $V'(\phi, T)$ contains the temperature dependence. The essential change is that

$$
-\frac{\mu^2}{2} \to -\frac{\mu_{\rm eff}^2}{2} = -\frac{\mu^2}{2} + \sigma T^2, \qquad (2)
$$

where $\sigma$ is a known function of all the parameters appearing in $V(\phi, T = 0)$. This modification of the Higgs potential leads to the concept of a critical temperature $T_c$ such that

if $T > T_c$: $\langle\phi\rangle = 0$, G is unbroken,
if $T < T_c$: $\langle\phi\rangle = v$, $G \to H$,

with a phase transition occurring at $T = T_c$. As T falls through $T_c$, $\phi$ will acquire a nonzero vacuum expectation value. The absolute minimum of V, however, is governed, not by the direction of $\phi$, but only by $|\phi|$. Consequently, the lowest energy state is not unique. There are degenerate vacua and different parts of the Universe may choose a different ground state. Fluctuations back and forth between the unique symmetric phase above $T_c$ and ground states below $T_c$ tend to produce spatial uniformity. However, this may be prevented by trapped singularities of one kind or another. The whole phenomenon is analogous to that of domain structure in ferromagnetism and defects in solid state. It leads to topological considerations and topological classification of singularities.
God, in creating the Universe and operating it, is not only a GUT-man, but also a good condensed matter physicist; He is a topologist as well.
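The emergence of a critical temperature can be illustrated with a small numerical sketch. It assumes, purely for illustration, a one-component field with potential $V = -\tfrac{1}{2}\mu_{\rm eff}^2\phi^2 + \tfrac{1}{4}\lambda\phi^4$ and the replacement $-\mu^2/2 \to -\mu^2/2 + \sigma T^2$; the quartic coupling `lam`, the function names and the numerical values are hypothetical and are not taken from the text.

```python
import math

def mu_eff_sq(mu, sigma, T):
    """Effective mass parameter, from -mu_eff^2/2 = -mu^2/2 + sigma*T^2 (assumed form)."""
    return mu**2 - 2.0 * sigma * T**2

def critical_temperature(mu, sigma):
    """Temperature at which mu_eff^2 changes sign."""
    return mu / math.sqrt(2.0 * sigma)

def vev(mu, sigma, lam, T):
    """Minimum of V = -(mu_eff^2/2) phi^2 + (lam/4) phi^4 (assumed quartic model).

    dV/dphi = 0 gives phi^2 = mu_eff^2 / lam when mu_eff^2 > 0, and phi = 0 otherwise.
    """
    m2 = mu_eff_sq(mu, sigma, T)
    return math.sqrt(m2 / lam) if m2 > 0.0 else 0.0

if __name__ == "__main__":
    mu, sigma, lam = 1.0, 0.5, 1.0   # illustrative units only
    print("T_c =", critical_temperature(mu, sigma))
    for T in (0.0, 0.5, 1.5):
        print("T =", T, " <phi> =", vev(mu, sigma, lam, T))
```

Above the critical temperature the only minimum is at the origin (G unbroken); below it the field settles at a nonzero value, which in the multi-component case fixes only $|\phi|$ and not its direction.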

Classification of Singularities; Topology of the manifold M = G/H

The equivalent, degenerate vacua form a manifold M = G/H, where G is the unbroken symmetry group above the critical temperature $T_c$ and H is the surviving symmetry group below $T_c$. When the phase transition occurs, the universe must choose a point on the manifold M. The choice is random and may be different in different regions of space, the correlations in different parts extending up to the appropriate 'horizons'.
Consider the case when M has two or more disconnected pieces. This happens when a discrete symmetry is broken spontaneously. We shall consider two examples of such cases briefly. Both give rise to surface singularities or domain walls and put strong constraints on the models.
(A) Model of spontaneous CP-violation (Zel'dovich, Kobzarev, Okun, Zh. Eksp. Teor. Fiz. 67, 3-11 (1974)). One starts with a Lagrangian which is CP-invariant. However, the vacuum in the model Lagrangian is characterized by $\langle\phi\rangle = \pm\eta$. In the two domains, CP is violated; the signs of all the CP-noninvariant effects are opposite to one another in the two domains. The boundary between the two domains is what is called a domain wall.
Such domain walls are seats of energy. For reasonable values of the parameters in the model, it turns out that the walls carry massive amounts of energy ($\eta \sim 100$ GeV, otherwise the particle associated with the SSB becomes superlight, contradicting the known decay properties of the K-meson. The domain wall then stretched across the Universe would be $10^{8}$ times more massive than all other known matter.) Clearly, if the model of spontaneous CP-violation is to be valid, there must be mechanisms which cause the domain walls formed to disappear at a sufficiently early stage in the evolution of the Universe. Otherwise, there will be contradictions between theory and some of the known facts about the Universe.

(B) Peccei-Quinn symmetry and its breakdown. Peccei-Quinn symmetry postulates a global, axial symmetry, $U(1)_{PQ}$, to solve the strong CP-violation problem that arises from the Lagrangian of quantum chromodynamics. However, when this global $U(1)_{PQ}$ breaks spontaneously, it leaves a discrete symmetry Z(N) unbroken (N is the number of flavors; the QCD Lagrangian has the flavor symmetry $SU_L(N) \times SU_R(N) \times U_V(1)$). The vacua are characterized by $\alpha = 2k\pi/N$, leading to domain structures separated by walls which carry massive amounts of energy. Unless appropriate constraints are put into force, contradictions between the particle theory and cosmology arise.
Other types of singularities that can arise are line singularities (strings) or point singularities (monopoles). Strings appear if M contains unshrinkable loops, and point singularities appear if M contains closed two-dimensional surfaces that cannot be shrunk to a point within M.
Topologically, the existence of singular structures in M requires that one of the homotopy groups of M be nontrivial. The elements of the nontrivial homotopy group then serve to classify the possible singularities - strings by the elements of $\pi_1(M)$, monopoles by the elements of $\pi_2(M)$, etc.
Structure       Dimension of singularity    Classified by
Domain walls    2                           $\pi_0(M)$
Strings         1                           $\pi_1(M)$
Monopoles       0                           $\pi_2(M)$

If the gauge group G is connected, $\pi_0(G)$, which counts the number of disconnected pieces, is trivial, that is, $\pi_0(G) = 1$. If it is simply connected (which can always be arranged by working with the covering group), $\pi_1(G) = 1$. If G is simple, there are isomorphisms between the relevant homotopy groups of M = G/H and the unbroken symmetry subgroup H,

$$
\pi_1(G/H) \cong \pi_0(H), \qquad \pi_2(G/H) \cong \pi_1(H),
$$

and

$$
\pi_1(H) = \mathbb{Z}^k \times K,
$$

where k is the number of U(1) factors in H, $\mathbb{Z}$ is the group of integers, and K is a finite group.
From the above mathematical facts, it immediately follows that if H contains at least one U(1) factor, monopoles must occur. Since we know that a $U(1)_{EM}$ has to exist, it follows that monopoles must make their appearance when phase transitions occur in the early Universe.
In the case G = SU(5), with

$$
G \to H_1 = [SU(3) \times SU(2) \times U(1)]/Z_6 \to H_2 = [SU(3)_c \times U(1)_{EM}]/Z_3,
$$

there are two phase transitions and the surviving symmetries contain U(1) factors. Monopoles do occur. Estimates show that

$$
f_m = \left[\frac{\text{number density of monopoles}}{\text{number of photons}}\right]_{\rm today} \approx 10^{-6},
$$

which is an absurdly large number. The experimental bound for $f_m$ is $10^{-24}$. This is the famous monopole problem.

Note added in proof: The above chapter was written in the summer of 1985. Since then a lower limit of $10^{32}$ years on proton lifetime has been set by various experiments. This certainly rules out minimal SU(5).

Bibliography
1. Review articles
P. Langacker, Phys. Rep. 72C, 185 (1981); R. Slansky, Phys. Rep. 79, 1 (1981); P. Nair and K. C. Wali, AIP Conference Proceedings, No. 98; Particles and Fields Subseries No. 29, edited by W. E. Caswell and G. A. Snow. Contains a partially annotated bibliography.

2. Books
Recently several good books have appeared. I have consulted the following:
Graham G. Ross, Grand Unified Theories, Benjamin/Cummings Publishing Co.
T. P. Cheng and L. F. Li, Gauge Theories of Elementary Particle Physics, Oxford University Press.

3. For brief introductions to various topics and a collection of reprints of original papers, Unity of Forces in the Universe, Vol. I and II (A. Zee (ed.)), World Scientific.
4. For the last section, I have consulted articles by Shafi and Vilenkin in The Very Early Universe, edited by G. W. Gibbons, S. W. Hawking, S. T. C. Siklos (Cambridge University Press). Also, T. W. B. Kibble, Phys. Rep. 67, 183-199 (1980); J. Phys. A: Math. Gen. 9, 1387 (1976).

13. Topology and Homotopy

B. R. SITARAM
Physical Research Laboratory, Navrangpura, Ahmedabad, 380009, India

1. What is Topology?

The best way to understand the notion of topology is perhaps to use the
viewpoint of Felix Klein's Erlangen Programme. According to this programme,
various branches of abstract mathematics can be classified by means of the
groups of transformations which preserve the properties of objects studied in that
branch. For example, Euclidean geometry can be considered to be the study of
the properties of objects which are left invariant under the Euclidean group of
motions, i.e., rigid rotations and rigid translations. (For some of the important
theorems of Euclidean geometry, the ones dealing with similar triangles, it may be
necessary to also include uniform scalings.) Projective geometry, on the other
hand, can be considered to be the study of properties left invariant under
projective transformations. In this sense, topology is just the study of properties
left invariant under homeomorphisms, i.e., bicontinuous mappings.
Let X, Y be topological spaces. As far as we are concerned, all this means is that X and Y have a certain structure, known as a topology, which allows us to decide when a function $f: X \to Y$ is continuous. Since most of the spaces that we shall be dealing with in the course of this chapter have an obvious topological structure, i.e., it is usually obvious what is meant by continuous functions on such spaces, we shall not go into the details of this structure. Given two such spaces, X and Y, a function $f: X \to Y$ is said to be a homeomorphism if the following are satisfied:
(i) the function f is continuous;
(ii) the inverse function $f^{-1}$ exists: this necessitates that f is one-to-one and onto;
(iii) the function $f^{-1}$ is also continuous.
If there exists a homeomorphism between two spaces X and Y, we shall say that the two spaces are homeomorphic to one another. Examples of homeomorphic spaces are
(i) a circle and a square;
(ii) a coffee cup and a doughnut.
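The first example can be made concrete. The sketch below is one possible choice of homeomorphism between the unit circle and the boundary of the square $\max(|x|, |y|) = 1$, namely radial projection; the function names are ours and are only illustrative. Both maps are continuous away from the origin, and each undoes the other.

```python
import math

def circle_to_square(p):
    """Radially project a point of the unit circle onto the square max(|x|,|y|) = 1."""
    x, y = p
    s = max(abs(x), abs(y))
    return (x / s, y / s)

def square_to_circle(q):
    """Inverse map: radially project a point of the square back onto the unit circle."""
    x, y = q
    r = math.hypot(x, y)
    return (x / r, y / r)

if __name__ == "__main__":
    for k in range(8):
        t = 2.0 * math.pi * k / 8 + 0.1
        p = (math.cos(t), math.sin(t))
        q = circle_to_square(p)
        print(p, "->", q, "->", square_to_circle(q))
```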

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 281-286.
1989 by Kluwer Academic Publishers.

2. Why the Recent Interest in Topology?

Studies in field theory conducted over the last two or three decades have shown the existence of two kinds of conserved currents:
(i) Noether currents - conservation follows from the invariance of the Lagrangian under some continuous group of transformations;
(ii) Topological currents - conservation is independent of the Lagrangian and is dependent only on the topological structure of the spacetime.
As an example of the latter, consider a scalar field $\phi$ in 1 + 1 dimensions and the current $j^\mu = \epsilon^{\mu\nu}\partial_\nu\phi$. Note that the conservation law $\partial_\mu j^\mu = 0$ follows from the structure of the current itself and not from any dynamics (we have not even written down a Lagrangian!). Note also that the corresponding conserved charge, defined by $Q = \int j^0\,dx = \phi(+\infty) - \phi(-\infty)$, depends only on the behaviour of the fields at the boundaries.
Now assume that the field $\phi$ actually satisfies the Klein-Gordon equation. For physical solutions (e.g., those with finite energy), the boundary conditions are $\phi(\pm\infty) = 0$, which imply that Q = 0! On the other hand, assume that $\phi$ satisfies the sine-Gordon equation, $\partial^2\phi + \sin\phi = 0$. The physical boundary conditions, as usual, are those for finite-energy solutions: the field should asymptotically approach a zero of the potential energy. In this case, this can be achieved by the conditions $\phi(\pm\infty) = 2\pi n$, n = integer, leading to nontrivial values for Q.
A clue as to why the conserved charge is nontrivial in the second case can be obtained by considering the space defined by $\phi$, x and t in the two cases. In the first case, the space is just $R^3$, while in the second, it is $R^2 \times S^1$. The latter space has nontrivial topological properties which, in a certain sense, can be considered to be responsible for the nontrivial nature of Q.
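As a numerical illustration of a nontrivial charge, the sketch below evaluates $Q = \int j^0\,dx$ for the standard static sine-Gordon kink $\phi(x) = 4\arctan(e^x)$, which satisfies $\phi'' = \sin\phi$. We assume the convention $\epsilon^{01} = +1$, so that $j^0 = \partial_x\phi$; the function names, the integration window and the step sizes are illustrative choices, not taken from the text.

```python
import math

def kink(x):
    """Static sine-Gordon kink, interpolating between the vacua 0 and 2*pi."""
    return 4.0 * math.atan(math.exp(x))

def vacuum(x):
    """Trivial solution sitting in the vacuum phi = 0."""
    return 0.0

def topological_charge(phi, half_width=40.0, steps=4000, eps=1e-5):
    """Q = integral of j^0 = d(phi)/dx over [-half_width, half_width].

    The derivative is taken by central differences and the integral by the
    trapezoidal rule; eps=+1 for epsilon^{01} is an assumed sign convention.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        j0 = (phi(x + eps) - phi(x - eps)) / (2.0 * eps)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * j0 * h
    return total

if __name__ == "__main__":
    print("Q(kink)/2pi   =", topological_charge(kink) / (2.0 * math.pi))
    print("Q(vacuum)/2pi =", topological_charge(vacuum) / (2.0 * math.pi))
```

The kink carries one unit of charge in multiples of $2\pi$, in agreement with $Q = \phi(+\infty) - \phi(-\infty)$, while the vacuum carries none.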
It is thus of interest to ask the question, what topological properties are likely
to lead to such nontrivial topological conserved charges? In this chapter, we shall
consider two of the most commonly arising properties of topological spaces,
namely nontrivial homotopy groups and nontrivial Chern classes.

3. Homotopy Theory

Let X, Y be given topological spaces and let I = [0, 1] be the closed unit interval considered as a subset of R. Let $f, g: X \to Y$ be two continuous maps; then we say that f is homotopic to g, denoted by $f \simeq g$, if there exists a function $F: X \times I \to Y$ such that
(i) F is continuous;
(ii) F(x, 0) = f(x), for all $x \in X$;
(iii) F(x, 1) = g(x), for all $x \in X$.
We then have the following Lemma.

LEMMA. $f \simeq f$; $f \simeq g \Rightarrow g \simeq f$; $f \simeq g,\ g \simeq h \Rightarrow f \simeq h$.

What this lemma implies is that the relation $\simeq$ is an equivalence relation. We now use a standard construction in mathematics, known as defining factor sets for a given equivalence relation. Pick any map $f: X \to Y$. Let (f) denote the set of all maps $f_1: X \to Y$ such that $f \simeq f_1$. Now pick some other map $g: X \to Y$ which is not contained in (f) and define (g) to be the set of all maps which are homotopic to g. The Lemma stated above assures us that the two sets (f) and (g) are completely disjoint, i.e., that they do not contain any common elements. Now proceed in the same manner with another map h, until all maps have been placed in one or other of the sets (·). Then, the set (X, Y) is defined to be the set ((f), (g), (h), ...) and we call this set 'the set of homotopy classes of maps from X to Y'. It is very important to note that each element of this set is itself a set of maps!
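As a simple worked instance of the definition of homotopy, take $Y = R^n$ (this particular choice of Y is ours, made for illustration). Any two continuous maps $f, g: X \to R^n$ are then homotopic through the straight-line homotopy:

```latex
F(x, t) = (1 - t)\, f(x) + t\, g(x), \qquad x \in X,\ t \in [0, 1],
\\[4pt]
F(x, 0) = f(x), \qquad F(x, 1) = g(x).
```

F is continuous because f and g are, so conditions (i)-(iii) hold and the set of homotopy classes of maps from X to $R^n$ has exactly one element. The construction fails as soon as the segment joining f(x) to g(x) can leave Y, which is what happens when Y has a hole.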
For our later work, it is useful to use a more restricted definition, namely homotopy classes of pointed maps. What we do is to assume that, in addition to the spaces X and Y, we are given two special points $x_0$ and $y_0$, with $x_0 \in X$ and $y_0 \in Y$. We shall call a function $f: X \to Y$ a pointed map if f is continuous and satisfies $f(x_0) = y_0$. Similarly, two pointed maps f, g are homotopic to one another if they are homotopic in the ordinary sense and, in addition, the function $F: X \times I \to Y$ satisfies $F(x_0, t) = y_0$ for all t. The construction of equivalence classes proceeds exactly as before, with the obvious difference that only pointed maps should be considered while forming the sets (f), (g), etc. Denote the set obtained by $((X, x_0), (Y, y_0))$.
Unfortunately, in the case of a general X, this set does not have additional structure. To bring in this structure, we specialize to the case where $X = S^1$ and $x_0$ is some fixed point, e.g., the North pole, (N).
We first try to understand what a pointed map means for this choice of $(X, x_0)$. It is clear that the image of a pointed map $f: X \to Y$ is just a loop in the space Y, starting and ending at $y_0$. Now loops can be multiplied with one another: given two loops f, g on Y, we multiply them by defining h = g * f, where

$$
h(\theta) = \begin{cases} f(2\theta), & 0 \le \theta \le \pi, \\ g(2\theta - 2\pi), & \pi \le \theta \le 2\pi, \end{cases}
$$

and where we have assumed that $S^1$ has been parametrized by an angle $\theta$, $0 \le \theta \le 2\pi$, with $\theta = 0$ corresponding to (N).

The important result of this definition is that this operation leads to a group structure for the set $((S^1, N), (Y, y_0))$: define (f) * (g) = (f * g). It is easy to see that the following are true:
(i) There exists an equivalence class (1) of maps homotopic to the constant map $1: S^1 \to Y$, defined by $1(\theta) = y_0$ for all $\theta$ in $S^1$;
(ii) For every (f), there is an inverse class $(f^{-1})$, where $f^{-1}(\theta) = f(2\pi - \theta)$;
(iii) The usual group rules are satisfied.

The set $((S^1, N), (Y, y_0))$, endowed with the group structure, is called the fundamental homotopy group of Y at $y_0$ and denoted by $\pi_1(Y, y_0)$ or just $\pi_1(Y)$. Let us now calculate the fundamental group for two spaces, $Y = R^2$, the plane, and $Y = R^2 - \{0\}$, the punctured plane. In the first case, it should be obvious that all loops on $R^2$ are homotopic to the identity map, 1. Thus, the homotopy group consists of only one element, the identity, (1). On the other hand, with $Y = R^2 - \{0\}$, consider a loop which circles the origin once in the positive direction. It is clear that such a loop can never be homotopic to the identity - the hole gets in the way! A little thought should convince us about the following:
(i) Any two maps that circle the origin the same number of times (where we include the sense of circling in our counting) are homotopic to one another;
(ii) Any two maps which are homotopic to one another circle the origin the same number of times (have the same winding number);
(iii) The product of two loops has a winding number which is just the sum of the winding numbers of the two loops.
All this implies that $\pi_1(R^2 - \{0\}) \cong Z$, the additive group of integers!
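The additivity of winding numbers under loop multiplication can be checked numerically. In the sketch below the punctured plane is identified with the nonzero complex numbers, the base point is taken to be $y_0 = 1$, and the loop product follows the rule h = g * f (first f at double speed, then g). The function names and the sampling resolution are illustrative assumptions of ours.

```python
import cmath
import math

def circle_loop(k):
    """Loop theta -> exp(i*k*theta) based at y0 = 1; it circles the origin k times."""
    return lambda theta: cmath.exp(1j * k * theta)

def product(f, g):
    """h = g*f: traverse f on [0, pi], then g on [pi, 2*pi]."""
    def h(theta):
        if theta <= math.pi:
            return f(2.0 * theta)
        return g(2.0 * theta - 2.0 * math.pi)
    return h

def inverse(f):
    """Inverse loop, f^{-1}(theta) = f(2*pi - theta)."""
    return lambda theta: f(2.0 * math.pi - theta)

def winding_number(loop, samples=2000):
    """Sum the small changes of polar angle along the loop and divide by 2*pi."""
    total = 0.0
    prev = loop(0.0)
    for i in range(1, samples + 1):
        cur = loop(2.0 * math.pi * i / samples)
        total += cmath.phase(cur / prev)   # angle step, assumed smaller than pi
        prev = cur
    return round(total / (2.0 * math.pi))

if __name__ == "__main__":
    f, g = circle_loop(1), circle_loop(2)
    print(winding_number(f), winding_number(g), winding_number(product(f, g)))
    print(winding_number(product(f, inverse(f))))
```

The product of a loop with its inverse has winding number zero, consistent with its being homotopic to the constant loop.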
We can generalize all this to the case where X is the n-sphere, $S^n$. The generalization of the notion of equivalence classes is trivial, but the group structure cannot be imposed in any intuitively clear fashion. Suffice it to say that a group structure can be given and the resulting group is known as $\pi_n(Y)$.
We now look at an application of homotopy theory, in particular the third homotopy group. Consider an SU(2) gauge theory on $R^4$. Assume that we are interested in solutions that are 'asymptotically pure gauge' - the field tensor should be asymptotically zero. Under these conditions we ask, do there exist solutions of the Yang-Mills equations which are different from one another in a topological sense? We solve this by a homotopy argument. Note that the asymptotic form of the gauge potential is known as soon as the asymptotic gauge function $\chi$ defined by

$$
A_\mu \xrightarrow{\ \text{asymptotically}\ } \chi^{-1}\,\partial_\mu \chi
$$

is known. Now, this gauge function is essentially defined on the 'infinity' of $R^4$, which is just $S^3$. Thus, our question can be rephrased as, do there exist nontrivial maps $\chi: S^3 \to SU(2)$? The answer is yes, as $\pi_3(SU(2)) \cong Z$! Solutions of the Yang-Mills equations whose asymptotic behaviour is defined by asymptotic gauge functions which are not in the same homotopy class will never be transformable into one another by continuous gauge transformations!
We will see later that the theory of Chern classes will explicitly give a current
and a conserved charge whose value will be different in different homotopy
sectors. We close this section with a list of homotopy groups taken from various
sources:
$\pi_1(SO(2)) = \pi_1(U(n)) = Z$,
$\pi_1(SO(n)) = Z_2$, $n \ge 3$;
$\pi_i(SO(2)) = 1$, $i > 1$,
$\pi_3(SU(m)) \cong \pi_3(U(m)) \cong Z$, $m \ge 2$,
$\pi_3(U(1)) = 1$,
$\pi_3(SU(1)) = 1$,
$\pi_3(Sp(k)) = Z$,
$\pi_3(SO(4)) = Z \times Z$.

4. Chern Classes

We shall adopt a completely pedagogical approach in this section while dealing with Chern classes. Properly speaking, the theory is formulated in terms of curvature forms defined on certain fibre bundles, but we shall, instead, deal with them in the more familiar language of Yang-Mills theory. As a result of this approach, a lot of the geometric flavour of the theory will be lost!
Assume that we have a Yang-Mills theory on a manifold M of dimension 4. Let $F = F^i_{\mu\nu} X_i\, dx^\mu \wedge dx^\nu$ be the field tensor, where $X_i$ denote the generators of the group in some matrix representation. There are two ways of looking at F:
(i) F is a Lie algebra-valued 2-form;
(ii) F is a matrix of 2-forms.
We shall adopt the second point of view in what follows. From our studies of differential geometry, we know that 2-forms obey commutative multiplication rules. As a result, it makes sense to calculate the determinants of matrices of 2-forms, using wedge products instead of ordinary products. In particular, it is possible to compute

$$
C(F) = \det\left(1 + \frac{i}{2\pi} F\right).
$$

For example, if the group happens to be SU(2) and the representation chosen is
the adjoint representation, then

and

while, if the 2 x 2 representation is chosen, then

and

In general, it is easy to see that

$$
C(F) = C_0(F) + C_1(F) + C_2(F) + \cdots = \text{0-form} + \text{2-form} + \text{4-form} + \cdots
$$

C(F) is known as the Chern class of F, while the forms $C_i(F)$ are known as Chern forms. For a four-dimensional manifold, it is obvious that the only nontrivial Chern forms are the first and the third. Now, there is a standard result in Chern class theory which says that if we take the top Chern class (in our case $C_2$) and integrate it over all of the manifold, then the result is an integer!
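The low-order Chern forms can be written out explicitly. The following is a sketch assuming the normalization $C(F) = \det(1 + \frac{i}{2\pi}F)$ and using only the expansion $\det(1 + A) = 1 + \mathrm{Tr}\,A + \frac{1}{2}[(\mathrm{Tr}\,A)^2 - \mathrm{Tr}\,A^2] + \cdots$, which is valid here because the entries of A are 2-forms and therefore commute under the wedge product. Overall signs and factors depend on this assumed normalization and on whether F is taken Hermitian or anti-Hermitian.

```latex
C_0(F) = 1, \qquad
C_1(F) = \frac{i}{2\pi}\,\mathrm{Tr}\,F, \qquad
C_2(F) = \frac{1}{8\pi^2}\left[\mathrm{Tr}(F \wedge F) - \mathrm{Tr}\,F \wedge \mathrm{Tr}\,F\right].
```

For SU(2) the generators are traceless, so $C_1(F) = 0$ and $C_2(F) = \frac{1}{8\pi^2}\mathrm{Tr}(F \wedge F)$; on a four-dimensional manifold all higher forms vanish identically, since they would be forms of degree six or more.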
We apply this result to get a conserved charge for the SU(2) case discussed above. We directly see that the integral over the top Chern class is a conserved charge which is also integral! Further, it can be shown that this integer is nothing but the integer which parametrizes $\pi_3(SU(2)) \cong Z$! For example, the choice of asymptotic gauge function $\chi = 0$ leads to the charge Q = 0, while the 't Hooft-Polyakov solution leads to the charge Q = 1, corresponding to the fact that the corresponding asymptotic gauge function covers SU(2) exactly once.

References
A large number of examples of topological charges and currents are discussed in
R. Rajaraman, Instantons and Solitons, North-Holland, Amsterdam, 1982.
A good discussion of homotopy and Chern classes (using fibre bundle language) is given in W. Drechsler and M. E. Mayer, Fibre Bundle Techniques in Gauge Theories, Springer Lecture Notes in Physics, No. 67 (1977).

14. Introduction to Compact Simple Lie Groups

N. MUKUNDA
Centre for Theoretical Studies, Indian Institute of Science, Bangalore 560012, India

The purpose of this chapter is to give a brief and simple-minded introduction to Lie groups, and particularly compact simple Lie groups, since these are of considerable importance in particle physics, grand unification etc. The basic concepts of groups and group representations are assumed to be known - these include the definitions of a group, subgroup, normal subgroup, factor group, simple and semi-simple groups, Abelian and non-Abelian groups; homomorphisms and isomorphisms etc., irreducible representations, reducible decomposable or indecomposable representations, equivalence of representations, unitary and real ones, etc. Even while considering a group in the abstract, it helps to keep in mind some faithful matrix representations of it.
A topological group is a group which is also a topological space, with both group multiplication and the taking of inverses being continuous operations. As a special case of such a group, we have the concept of a local Lie group - a topological group in which some neighbourhood of the identity is homeomorphic to a simply connected open set in $R^n$ for some n. Intuitively, the elements of the group are 'describable continuously' with n real independent essential coordinates. A Lie group is a local Lie group which also obeys the second axiom of countability, i.e. a topological group which is a local Lie group and, as a topological space, 'can be covered by a countable number of open sets'.
In a down-to-earth and practical sense, the elements of a Lie group G (in some neighbourhood of the identity) can be labelled with the help of n real coordinates, n being the order of the group: the element a has coordinates $\alpha^j$, j = 1 ... n; b has coordinates $\beta^j$, and so on. By convention, we may assign coordinates zero to the identity e in G. The multiplication law in G is then given by n real functions of 2n arguments each: if the product ab has coordinates $\gamma^j$, then

$$
\gamma^j = f^j(\alpha^1, \ldots, \alpha^n; \beta^1, \ldots, \beta^n), \qquad j = 1, \ldots, n. \qquad (1)
$$

It is a fact that in a Lie group, the coordinates can be chosen in such a way as to have real analytic functions $f^j$.
The main achievement of the analysis of a Lie group is that by exploiting the group properties, knowledge of $f(\alpha; \beta)$ for infinitesimal $\alpha$ and $\beta$ leads to a reconstruction of $f(\alpha; \beta)$ for finite $\alpha$ and $\beta$, at least in some neighbourhood of the identity (and in a restricted family of coordinate systems). In fact, 'all knowledge'

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 287-292.
1989 by Kluwer Academic Publishers.


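regarding $f^j(\alpha; \beta)$ is contained in the structure constants:

$$
c^l_{jk} = -c^l_{kj} = \left(\frac{\partial^2 f^l(\alpha; \beta)}{\partial\alpha^j\,\partial\beta^k} - \frac{\partial^2 f^l(\alpha; \beta)}{\partial\alpha^k\,\partial\beta^j}\right)_{\alpha = \beta = 0}. \qquad (2)
$$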

In the course of the analysis, we see that associated with the Lie group G with composition law $f^j(\alpha; \beta)$ is a Lie algebra written as $\underline{G}$. This is a real n-dimensional vector space whose elements u, v, w, ... have components $u^j$, $v^j$, $w^j$, ... (say), equipped with a Lie bracket operation:

$$
u, v \in \underline{G} \ \Rightarrow\ [u, v] \in \underline{G}. \qquad (3)
$$

This bracket has three properties:

(i) Antisymmetry: [u, v] = -[v, u],
(ii) Linearity: [u + u', v] = [u, v] + [u', v],
(iii) Jacobi identity: [u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0.

In a basis $e_j$ for $\underline{G}$, when $u = u^j e_j$ etc. and $[e_j, e_k] = c^l_{jk} e_l$, these properties translate into the antisymmetry evident in (2) and

$$
c^m_{jk}\, c^n_{ml} + c^m_{kl}\, c^n_{mj} + c^m_{lj}\, c^n_{mk} = 0. \qquad (4)
$$

A given Lie group G leads to a unique Lie algebra $\underline{G}$.

However, several Lie groups G, G', and G'', which are locally isomorphic, all share the same Lie algebra $\underline{G} = \underline{G}' = \underline{G}''$ .... What the Lie algebra determines uniquely is that Lie group, among all the locally isomorphic groups, which is simply connected - it is called the universal covering group of all the others.
When dealing with matrix representations, we can again pass from the Lie algebra to the Lie group. A representation of G with matrices D(a) leads to a representation of $\underline{G}$ in which the basic elements $e_j$ are represented by linear operators or matrices $T_j$ obeying suitable commutation relations:

$$
[T_j, T_k] = i\, c^l_{jk}\, T_l. \qquad (5)
$$

The appearance of the factor i is the result of quantum mechanical conventions. The $T_j$ are the infinitesimal generators of the representation D(a). At least in some neighbourhood of e, we have

$$
D(a) = \text{matrix representing } a \in G = \exp(i \times \text{real linear combination of } T_j). \qquad (6)
$$
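To make the role of the structure constants concrete, the following sketch checks commutation relations of the form $[T_j, T_k] = i\,c^l_{jk}T_l$ for the simplest non-Abelian example, SU(2), with the assumed choices $T_j = \sigma_j/2$ (half the Pauli matrices) and $c^l_{jk} = \epsilon_{jkl}$. It also checks the Jacobi identity for the matrix commutator. The helper names are ours and the matrices are handled with plain Python lists.

```python
def matmul(A, B):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

def eps(j, k, l):
    """Levi-Civita symbol for indices 0, 1, 2 (assumed SU(2) structure constants)."""
    return (j - k) * (k - l) * (l - j) // 2

# Assumed generators: T_j = sigma_j / 2
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]
ZERO = [[0, 0], [0, 0]]

def check_commutation_relations():
    for j in range(3):
        for k in range(3):
            rhs = ZERO
            for l in range(3):
                rhs = add(rhs, scale(1j * eps(j, k, l), T[l]))
            if not close(comm(T[j], T[k]), rhs):
                return False
    return True

def check_jacobi():
    for u in T:
        for v in T:
            for w in T:
                s = add(add(comm(u, comm(v, w)), comm(v, comm(w, u))),
                        comm(w, comm(u, v)))
                if not close(s, ZERO):
                    return False
    return True

if __name__ == "__main__":
    print(check_commutation_relations(), check_jacobi())
```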

A simple Lie group is one which has no nontrivial invariant subgroup.


A semisimple Lie group is one with no nontrivial invariant Abelian subgroup.
'Simple' implies 'semi-simple'. In the reverse direction, any semi-simple Lie group
is the direct product of simple non-Abelian factors.
A compact Lie group is one which is compact in the topological sense, in
addition to being a Lie group. Thus, every open cover of G has a finite subcover.

In particular, then, if $\mathcal{N}$ is some neighbourhood of the identity, we can find a finite number of elements $a_1, a_2, \ldots, a_N$ in G such that

$$
G = \bigcup_{j=1}^{N} a_j\,\mathcal{N}. \qquad (7)
$$

From a 'physical' point of view, a compact semi-simple Lie group admits a right
and left invariant volume element, such that the entire group has a finite total
volume.
Here are some of the important facts concerning compact simple Lie groups, based on a combination of algebraic and geometric methods mainly achieved by Killing, Cartan, Weyl, Schouten, and Van der Waerden:
(i) All compact simple Lie groups G can be classified because their Lie algebras $\underline{G}$ can be classified.
(ii) The irreducible representations of such groups are finite-dimensional and can be made unitary; the set of all UIR's (unitary irreducible representations) form, for each G, a complete set.
(iii) In any UIR, the generator matrices $T_j$ are Hermitian.
(iv) The maximal number of simultaneously diagonalizable Hermitian $T_j$'s is the rank of G (and of $\underline{G}$); the rank can, of course, be defined directly and intrinsically once $\underline{G}$ is given, though it is easier to grasp in the above sense.
The classification of compact simple Lie groups and their Lie algebras is by the order n and rank l of the concerned group. There are four classical families of groups and five exceptional groups, which are shown in Table I.
For low dimensions, we have some coincidences among the members of the four families: $A_1 \sim B_1 \sim C_1$; $B_2 \sim C_2$; $A_3 \sim D_3$; while $D_1$ is Abelian and $D_2$ is not simple.
(Among the groups relevant in relativistic problems, we find that locally SO(4, 2) ~ SU(2, 2); SO(3, 2) ~ Sp(4, R); SO(3, 1) ~ SL(2, C); SO(2, 1) ~ SL(2, R) ~ SU(1, 1) ~ Sp(2, R).)
Regarding connectivity of the groups $A_l, \ldots, D_l$ as given by the defining matrix representations listed in Table I, both $A_l$ and $C_l$, i.e. SU(l + 1) and USp(2l), are simply connected, while $B_l$ and $D_l$, i.e. SO(n) for n = 3, 5, 6, 7, ..., are doubly connected.
Let G be any of the above compact simple Lie groups. (Except for the exceptional groups, we have seen the defining matrix representation in each case.) The adjoint representation of G is a real orthogonal representation of dimension equal to the order of the group. It is irreducible, and uses the structure constants as matrix elements of the generators. If we write the adjoint representation matrices as $\mathcal{D}(a)$, then in any UIR D(a) with generators $T_j$ we have

$$
D(a)\, T_j\, D(a)^{-1} = \mathcal{D}_{kj}(a)\, T_k. \qquad (8)
$$

Thus, the generators always 'belong' to the adjoint representation. For SU(2) and SO(3), this is the three-dimensional vector representation, for SU(3) it is the octet, and so on.

Table I

The Four Classical Families

Name     Rank               Order        Defining faithful matrix representation via a Hermitian or bilinear invariant
$A_l$    l = 1, 2, 3, ...   l(l + 2)     SU(l + 1): (l + 1)-dimensional unitary unimodular matrices, $\sum_{j=1}^{l+1} x_j^* y_j$ = invariant.
$B_l$    l = 2, 3, ...      l(2l + 1)    SO(2l + 1): (2l + 1)-dimensional real orthogonal unimodular matrices, $\sum_{j=1}^{2l+1} x_j y_j$ = invariant.
$C_l$    l = 3, 4, ...      l(2l + 1)    USp(2l): 2l-dimensional unitary symplectic matrices, $\sum_{j=1}^{l} (x_{2j-1} y_{2j} - x_{2j} y_{2j-1})$ = invariant.
$D_l$    l = 4, 5, ...      l(2l - 1)    SO(2l): 2l-dimensional real orthogonal unimodular matrices, $\sum_{j=1}^{2l} x_j y_j$ = invariant.

Exceptional Groups

Rank    Order    Name     Smallest UIR
2       14       $G_2$    7-dimensional
4       52       $F_4$    26-dimensional
6       78       $E_6$    27, 27*-dimensional
7       133      $E_7$    56-dimensional
8       248      $E_8$    248-dimensional
Each G possesses precisely l basic or fundamental UIR's. Any other UIR is the 'largest' piece in the reduction of the direct product

$$
(\text{1st fundamental UIR})^{n_1} \times (\text{2nd fundamental UIR})^{n_2} \times \cdots \times (l\text{th fundamental UIR})^{n_l}
$$

and so can be uniquely designated by a set of nonnegative integers $(n_1, n_2, \ldots, n_l)$. We will describe the fundamental UIR's for $A_l$, $B_l$, $C_l$ and $D_l$ later.
If D(a) is a UIR of G, so then is its complex conjugate D(a)*. How are they possibly related? There are three mutually exclusive cases:

(i) D(a) and D(a)* may be inequivalent: then each is called a complex UIR of G.
(ii) D(a) and D(a)* may be equivalent and we may be able to bring D(a) to a real form via a suitable unitary transformation - then D(a) is said to be potentially real.
(iii) D(a) and D(a)* may be equivalent, but it may be impossible to bring D(a) to real form - this can only happen if the UIR is even-dimensional, and D(a) is said to be pseudo-real. This is the case for half-integer spin UIR's of SU(2).
It has been shown that only SU(l + 1) for $l \ge 2$, SO(4l + 2) for $l \ge 2$ and $E_6$ possess complex UIR's.
The system of commutation relations and Hermiticity properties of generators, Equation (5), have a neat appearance for $A_l$, $B_l$ and $D_l$, but are more complicated for $C_l$. For the former, we have

SU(l + 1): $(T^a_b)^\dagger = T^b_a$, $T^a_a = 0$, a, b, ... = 1, 2, ..., l + 1; (9)

SO(n): a, b, ... = 1, 2, ..., n.
We saw that any UIR of any G is uniquely characterized by a set of nonnegative integers $n_1, n_2, \ldots, n_l$. We can form independent polynomials in the generators $T_j$, such that their values determine the n's: they are the Casimir operators of G. Within a UIR, we have l Hermitian generators, whose simultaneous eigenvalues can be used to label the states of a basis. In addition, there is a need for $p = \frac{1}{2}(r - 3l)$ extra state labels, where r is the order of G. For SU(2), p = 0, while for SU(3), for example, p = 1.
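The count of extra labels is easy to tabulate for the classical families. The sketch below assumes that r in $p = \frac{1}{2}(r - 3l)$ is the order of the group and uses the orders l(l + 2), l(2l + 1), l(2l + 1) and l(2l - 1) for $A_l$, $B_l$, $C_l$ and $D_l$; the function names are illustrative.

```python
def order(family, l):
    """Order (number of generators) of the classical family member of rank l."""
    if family == "A":          # SU(l + 1)
        return l * (l + 2)
    if family in ("B", "C"):   # SO(2l + 1) and USp(2l)
        return l * (2 * l + 1)
    if family == "D":          # SO(2l)
        return l * (2 * l - 1)
    raise ValueError("family must be one of 'A', 'B', 'C', 'D'")

def extra_labels(family, l):
    """p = (r - 3l)/2, with r assumed to be the order of the group."""
    return (order(family, l) - 3 * l) // 2

if __name__ == "__main__":
    for name, family, l in [("SU(2)", "A", 1), ("SU(3)", "A", 2),
                            ("SU(5)", "A", 4), ("SO(10)", "D", 5)]:
        print(name, "order", order(family, l), "extra labels", extra_labels(family, l))
```

For SU(2) no extra label is needed, while SU(3) needs one, in agreement with the values quoted above.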
We see that the rank plays many important roles.
(i) It is crucial in the problem of classifying all possible G.
(ii) For a given G, it is the number of fundamental UIR's.
(iii) It is the number of Casimir operators i.e. the number of labels needed to
uniquely designate a general UIR.
(iv) It is the maximal number of simultaneously diagonal generators within
a UIR.
What do the fundamental UIR's look like in the nonexceptional cases? One finds

292

N. Mukunda

the following picture:

For Al == SU(l + 1), they are the defining (l + I)-dimensional (,vector') UIR; and
the UIR's of anti symmetric tensors of ranks 2, 3, ... , 1with respect to the defining
UIR. For BI == SO(21 + 1) they are the defining (21 + 1)-dimensional 'vector'
UIR; the UIR's of antisymmetric tensors of ranks 2, 3, ... ,1 - 1; and a spinor
UIR of dimension 21.
For CI == USp(2/), they are the defining 2/-dimensional 'vector', and antisymmetric 'traceless' (in the sense of the symplectic metric!) tensors of ranks 2,
3, ... ,1 - 1,1.
For DI == SO(2/), they are the defining 21-dimensional 'vector' UIR; the antisymmetric tensors of ranks 2, 3, ... , 1- 2; and two inequivalent spin or UIR's
each of dimension 21- 1.
We see that the real orthogonal groups $B_l = SO(2l + 1)$ and $D_l = SO(2l)$ are
characterized by the fact that they have spinor UIR's. The properties of such
UIR's are somewhat involved; they repeat in cycles of 8. We can exhibit this in
a table going from SO(8m) to SO(8m + 7) for any integer m (Table II). For $B_l$ there
is just one spinor, say $\Delta$; while for $D_l$ there are two inequivalent ones, say $\Delta^{(1)}$ and
$\Delta^{(2)}$.
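The eightfold pattern can be sketched as a lookup on n mod 8. The sketch assumes the standard classification (real for n = 0, 1, 7 mod 8; pseudo real for n = 3, 4, 5 mod 8; a mutually complex conjugate pair for n = 2, 6 mod 8) and the spinor dimensions quoted above; the function name is illustrative.

```python
def spinor_info(n):
    """Return (number of inequivalent spinor UIR's, reality type, dimension) for SO(n), n >= 3."""
    if n < 3:
        raise ValueError("n must be at least 3")
    l = n // 2                       # rank of SO(n)
    if n % 2:                        # B_l = SO(2l+1): a single spinor of dimension 2^l
        count, dim = 1, 2 ** l
    else:                            # D_l = SO(2l): two spinors of dimension 2^(l-1)
        count, dim = 2, 2 ** (l - 1)
    kind = {
        0: "real", 1: "real", 7: "real",
        3: "pseudo real", 4: "pseudo real", 5: "pseudo real",
        2: "complex conjugate pair", 6: "complex conjugate pair",
    }[n % 8]
    return count, kind, dim


if __name__ == "__main__":
    for n in range(3, 11):
        print("SO(%d):" % n, spinor_info(n))
```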

Table II.

Group         Spinor UIR's
SO(8m)        Each real, dim $2^{4m-1}$
SO(8m + 1)    Real, dim $2^{4m}$
SO(8m + 2)    Mutually complex conjugate, dim $2^{4m}$
SO(8m + 3)    Pseudo real, dim $2^{4m+1}$
SO(8m + 4)    Each pseudo real, dim $2^{4m+1}$
SO(8m + 5)    Pseudo real, dim $2^{4m+2}$
SO(8m + 6)    Mutually complex conjugate, dim $2^{4m+2}$
SO(8m + 7)    Real, dim $2^{4m+3}$

References
A. Salam, Formalism of Lie Groups, Trieste Lectures (1963).
G. Racah, Group Theory and Spectroscopy, Princeton Lectures (1951).
R. E. Behrends et al., Revs. Mod. Phys. 34, 1 (1962).
B. G. Wybourne, Classical Groups for Physicists, Wiley (1974).
H. Georgi, Lie Algebras in Particle Physics, Benjamin (1982).

Part III:
Quantum Effects in the Early Universe
and Approaches to the Unification
of Fundamental Forces
The dynamics of the universe is governed by gravity, the best classical description
of which is Einstein's general relativity discussed in Part I. As discussed in Part II,
the microscopic world - the constituents of all matter and the forces between
them - on the other hand is best described in the language of quantum field theory
which developed as a sort of unification of the principles of quantum mechanics
and special relativity. This unification also led to radically new concepts such as
that of electron spin, antiparticles, and so on. Though in all practical problems
discussed so far, the principles of general relativity and those of quantum
mechanics can each go their own way, for a consistent description of all
phenomena, one would like a unification of the two. Such a marriage is yet to be
and, in spite of a lot of hard work, one still does not have a quantum theory of the
gravitational field. In two concrete problems, at least, one would expect the
quantum effects of the gravitational field to play a very significant role. Firstly, in
the behaviour of the universe near the initial singularity and, secondly, in the
determination of the end state of the Hawking evaporation of black holes. In the
absence of a viable quantum theory of gravitation, 'one does what one can rather
than not do what one cannot'. And this leads to Part III of the book, in particular,
quantum field theory in curved spacetime wherein in the spirit of semiclassical
radiation theory, one quantizes the matter fields in the background of a fixed
classical gravitational field given by Einstein's equations. Canonical quantization
methods within this framework are presented in the article by B. R. Iyer. This is
then used to discuss particle creation effects in typical cosmological models and
define notions like conformal vacua and adiabatic vacua. As in the flat spacetime,
quantum field theories in arbitrary gravitational fields are also plagued by the
problem of divergences. Sophisticated regularization and renormalization
techniques are needed to extract finite answers and, in his article, D. Lohiya
expounds on one such technique in detail: the zeta function regularization and its
application to a variety of problems. The effective action in curved spacetime is
obtained and used to discuss phase transitions in the De Sitter Universe. One of
the exciting possibilities that emerged in recent years is the Inflationary Universe
scenario. This is discussed in detail in N. Panchapakesan's article which covers
the old, new and chaotic inflations, as also Hawking's constraints on inflationary
universe models. It concludes with a discussion of back reaction effects of
quantum particle creation in the early Universe using effective action techniques.

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 293-295.
© 1989 by Kluwer Academic Publishers.

Quantum gravity, as mentioned earlier, is as yet the stuff dreams are made of.
Another approach, much less ambitious, is that of quantum cosmology. It is to
quantum gravity what the Bohr model is to the full quantum mechanical
description of the hydrogen atom. In quantum cosmology, one attempts to give
a quantum-mechanical meaning to classical solutions of general relativity. This is
discussed in the article by T. Padmanabhan. The approach is illustrated by
quantizing only the conformal degree of freedom of the gravitational field, in
particular, the Friedmann-Robertson-Walker (FRW) models. And, as in the
hydrogen atom, the classical singularity of general relativity is avoided and one
has analogous stationary states in the quantum Universe. The section ends with
a model of the fundamental role that the Planck length may play as the universal
cutoff in all field theories, thus ridding the theory of ultraviolet divergences. Two
appendices introduce field theory in the Schrödinger representation and the
Schrödinger equation for quantum gravity, namely the Wheeler-DeWitt
equation.
The above articles summarize the new viewpoints of the general relativists. But
what of the particle theorists? The unification of electromagnetic and weak
interaction and its subsequent experimental verification has given credibility to
the grand unified theories of strong and electroweak interactions. Can this
success of gauge theories be extended to include the fourth force, gravity? An
approach dating back to the 1920s is the idea of Kaluza and Klein. In a series of
articles, A. Maheshwari discusses this aspect. The section begins with a pedagogic
introduction to spin 1, 2 and 3/2 fields and then proceeds to introduce the
mathematical machinery of vierbeins (tetrads) and spinors and their appropriate
generalizations to higher dimensions. He then proceeds to discuss the old
Kaluza-Klein theory in five dimensions for unification of electromagnetism and
gravitation and then puts it in a modern perspective so that one can now
generalize the approach to unify arbitrary (non-Abelian) gauge fields with
gravitation. The internal and spacetime symmetries are unified by treating the
internal symmetries as spacetime symmetries of 'unobservable' dimensions. This
necessitates the introduction of higher dimensions, eleven in particular, to
accommodate the standard theory. But one has to face up to the fate of the extra
dimensions. Spontaneous compactification is one solution and this is treated in
detail, as are the harmonic expansions necessary to obtain the particle
spectrum of Kaluza-Klein theories. The article ends with a discussion of the
problem of chiral fermions in these theories, which was first raised by E. Witten.
J. Samuel takes off from this point and in his article builds up the relevant
background to discuss applications of Kaluza-Klein theories to obtain
higher-dimensional cosmological models. The fate of the extra dimensions is governed
by dynamical evolution: dimensional reduction. A number of models and their
significant features are discussed, one, in particular, which 'explains' the
unexplained feature of high entropy in classical cosmologies.
All successful theories in particle physics are gauge theories today. Can all the
forces be unified by a gauge group? Can we get a clue by studying gravity itself,
which is also a gauge theory obtained by gauging the Poincaré group? The main
problems have been in the understanding of the role of the invariants of the Lie
algebra of the group if one has general covariance. One is led to theories more
general than general relativity in that, in addition to curvature, one also has
torsion. These and other aspects of gravitation as a gauge theory are treated in the
article by N. Mukunda, who, in particular, critically expounds on the
Utiyama-Kibble approach.
The main stumbling block to incorporating both internal symmetries and
spacetime symmetries in a unified framework is the Coleman-Mandula theorem
that forbids the mixing of the two symmetries. This theorem, whose proof depends
on the Lie properties of the algebra, is circumvented by the use of graded Lie
algebras where, in addition to commuting objects, one also has anticommuting
Grassmann variables. Such more general structures are discussed in the article by
B. Sitaram. This article introduces graded Lie algebras with examples and then
proceeds to discuss their representations and classifications. The extended
algebra, acting on local fields, has the effect of transforming a fermion field into
a boson field and vice versa and is, hence, called supersymmetry (SUSY). In
addition to theoretical elegance, if supersymmetry is extended to a local
symmetry one necessarily obtains general coordinate invariance, i.e. one gets
gravity for free! The various aspects of supersymmetry (SUSY) and supergravity
(SUGRA) are discussed in the article by R. K. Kaul. It also deals with
representations of the SUSY algebra, SUSY breaking schemes, and N = 1 SUGRA
in four, eleven and ten dimensions.
The last chapter of the book provides yet another approach to quantum
gravity. In recent years, the theory of superstrings (SST) has been a candidate for
the Theory of Everything (TOE). Strings are idealized one-dimensional extended
objects, a natural generalization of relativistic point particles. With SST, one may
have a finite quantum field theory whose internal consistency moreover requires
a unique number of spacetime dimensions: 26 for bosonic strings and 10 for
superstrings. This is the subject of Sharatachandra's overview, which proceeds
from dual models and the Veneziano formula to a discussion of the relativistic string.
Light cone and Hamiltonian quantization is then followed by a treatment of
Lorentz covariance and the spectrum of string excitations. The field theory limit
of interacting strings leads to higher derivative corrections to the Einstein action.
It ends with a discussion of superstrings, current problems and future prospects.
By the time this book is published, much will have happened in the exciting
arena of SST, e.g. new principles of conformal invariance, the relation of SUSY
and finiteness, the question of the reduction of idealized string theory in 10
dimensions to a realistic theory in four dimensions, Calabi-Yau ideology, and
orbifold compactification. Surprises will not cease. Even in the placid waters of
conventional canonical quantum gravity, there is ongoing excitement caused by
new developments such as the construction of spinorial variables which lead to
a more manageable set of cubic constraints. But all that, as every story teller
knows, is yet another story ....

15. Quantum Field Theory in Curved Spacetime: Canonical Quantization
B. R. IYER
Raman Research Institute, Bangalore 560 080, India

1. Quantum Field Theory in Curved Spacetime

The last decade has witnessed tremendous progress in the construction of
a unified theory of the forces of nature, e.g. the electroweak and grand unified
theories. The odd interaction out is gravity which, to date, resists quantization,
though hope eternal appears in theories of supergravity, Kaluza-Klein and,
more recently, superstrings.
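The Planck dimensions represent scales on which the quantum effects of the
gravitational field become important. They are given by

$$M_p = \left(\frac{\hbar c}{G}\right)^{1/2} = 2.2 \times 10^{-5}\ \text{gm},$$

$$L_p = \frac{\hbar}{M_p c} = \left(\frac{\hbar G}{c^3}\right)^{1/2} = 1.616 \times 10^{-33}\ \text{cm},$$

$$T_p = \frac{L_p}{c} = \left(\frac{\hbar G}{c^5}\right)^{1/2} = 5.39 \times 10^{-44}\ \text{sec},$$

$$\mathcal{T}_p = \frac{M_p c^2}{k} = \left(\frac{\hbar c^5}{G k^2}\right)^{1/2} = 10^{32}\ \text{K}. \qquad (1)$$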
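The Planck scales can be checked numerically. The sketch below works in CGS units; the constant values are rounded standard values supplied here as assumptions, and the function name is illustrative.

```python
import math

# Rounded CGS values of the constants (assumed inputs, not taken from the text).
HBAR = 1.0546e-27   # erg s
C = 2.9979e10       # cm / s
G = 6.674e-8        # cm^3 g^-1 s^-2
K_B = 1.3807e-16    # erg / K


def planck_units():
    """Planck mass, length, time and temperature built from hbar, c, G and k."""
    mass = math.sqrt(HBAR * C / G)          # (hbar c / G)^(1/2)
    length = HBAR / (mass * C)              # hbar / (M_p c)
    time = length / C                       # L_p / c
    temperature = mass * C ** 2 / K_B       # M_p c^2 / k
    return {"mass_g": mass, "length_cm": length,
            "time_s": time, "temperature_K": temperature}


if __name__ == "__main__":
    for name, value in planck_units().items():
        print("%-14s %.3e" % (name, value))
```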

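One way to see this is to recall that quantum effects for a system are important if
its action is comparable to $\hbar$ [2], i.e.

(i) $A_G \approx \hbar$,

(ii) $A_G = \frac{c^4}{16\pi G}\int\sqrt{-g}\,R\,d^4x$,

(iii) $R = \frac{8\pi G}{c^4}(\rho + 3p)$.

For a system, say dust of mass density $\rho$ confined to a length scale L,

$$A_G \sim \frac{c^4}{16\pi G}\cdot\frac{8\pi G\rho}{c^2}\cdot\frac{4\pi L^3}{3}\cdot\frac{L}{c} = \frac{2\pi\rho L^4 c}{3}.$$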

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 297-314.
1989 by Kluwer Academic Publishers.

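But

$$\rho L^2 \sim \frac{3c^2}{8\pi G}.$$

Consequently,

$$A_G \sim \frac{2\pi L^2 c}{3}\cdot\frac{3c^2}{8\pi G} = \frac{L^2c^3}{4G},$$

so that quantum effects of the gravitational field are significant if

$$\frac{L^2c^3}{4G} \lesssim \hbar, \qquad \text{i.e.}\quad L \lesssim 2L_p.$$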
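The last step can be verified numerically: setting the estimated action L^2 c^3 / 4G equal to hbar and solving for L should return twice the Planck length. This is a minimal sketch in CGS units; the constants are rounded standard values and the function names are illustrative.

```python
import math

# Rounded CGS values (assumed inputs).
HBAR = 1.0546e-27   # erg s
C = 2.9979e10       # cm / s
G = 6.674e-8        # cm^3 g^-1 s^-2


def gravitational_action(length):
    """Order-of-magnitude gravitational action L^2 c^3 / (4G) for a system of size L."""
    return length ** 2 * C ** 3 / (4 * G)


def critical_length():
    """Length at which the estimated action equals hbar."""
    return math.sqrt(4 * G * HBAR / C ** 3)


if __name__ == "__main__":
    planck_length = math.sqrt(HBAR * G / C ** 3)
    print("critical length / Planck length =", critical_length() / planck_length)
```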

If one goes along the lines of QED and tries to perturbatively quantize the
gravitational field, then $L_p^2 \sim G$ appears as the relevant coupling constant
analogous to $e^2/\hbar c$ in QED. Unlike QED, where $e^2/\hbar c$ is dimensionless (and
small), the coupling constant G is not dimensionless. The situation is reminiscent
of the four-Fermi theory of weak interactions you have heard about, where $G_F$ is
similarly dimensional. This leads to new and more virulent divergences at higher
order whose effects become comparable to those at lower order on scales smaller
than Planck scales. Thus, Planck scales mark the border at which a full theory
of quantum gravity (preferably nonperturbative) is essential. Still, one might
envisage scales much larger than Planck scales at which the quantum effects of
the gravitational field are negligible, but quantum effects of the matter field are
not. And the Planck scales are so much smaller that such a semi-classical theory
seems worthwhile. This is called quantum field theory in curved spacetime (QFT in
CST).
Quantum field theory in curved spacetime is analogous in spirit to the semiclassical
theory of electromagnetic radiation, where the background external
electromagnetic field is treated classically and given by Maxwell's equations,
whereas the atomic system is treated quantum mechanically. In QFT in CST, we
assume that the matter field is quantized but the gravitational field is classical and
given by Einstein's equations, where the source term is taken to be the
expectation value of the energy-momentum tensor for matter fields, i.e.

$$G_{\mu\nu} = 8\pi\langle T_{\mu\nu}\rangle. \qquad (2)$$

There is a problem with the above semi-classical Einstein equation that gives
the back reaction of the quantum field on the gravitational field. According to the
equivalence principle, which lies at the foundation of all metric theories of gravity,
all forms of matter and energy couple equally to gravity. This also includes the
gravitational energy according to the (very strong) equivalence principle, i.e. the
graviton is as much subject to an external gravitational field as a photon or any
other field. Hence, whenever a classical background gravitational field produces
significant effects involving real or virtual photons, for consistency, one must
allow for equally important effects involving gravitons. Thus, quantum gravity
will enter nontrivially at all scales of distance and time whenever interesting
quantum field effects occur and not only at Planck scales.
More precisely, (2) would not be expected to arise as the lowest approximation
to a QFT of gravity coupled to a matter field, because one would expect to have
$\langle G_{\mu\nu}\rangle = 8\pi\langle T_{\mu\nu}\rangle$ exactly, where $G_{\mu\nu}$ is the Einstein operator and the state
implicit in the expectation value now includes degrees of freedom of the
gravitational field [3]. One can expect $G_{\mu\nu}$ to be given in terms of the metric
operator by the classical formula $G_{\mu\nu} = G_{\mu\nu}(g_{\alpha\beta})$. But $G_{\mu\nu}$ is a nonlinear function
of $g_{\mu\nu}$ and, hence, we expect

$$\langle G_{\mu\nu}(g_{\alpha\beta})\rangle \neq G_{\mu\nu}(\langle g_{\alpha\beta}\rangle).$$

In fact, writing

$$g_{\alpha\beta} = g^{c}_{\alpha\beta}\,I + \gamma_{\alpha\beta},$$

where $g^{c}_{\alpha\beta}$ is a classical solution of Einstein's equation and $I$ the identity operator,
and keeping terms quadratic in $\gamma_{\alpha\beta}$ in $G_{\alpha\beta}$, one finds that $\langle G_{\alpha\beta}\rangle$ and $G_{\alpha\beta}[\langle g_{\mu\nu}\rangle]$
differ by $-8\pi\langle t_{\alpha\beta}\rangle$, where $t_{\alpha\beta}$ is given in terms of $\gamma_{\alpha\beta}$ by an expression very similar
to that of $T_{\alpha\beta}$ in terms of a scalar field $\phi$. Hence, in the lowest approximation to a full QFT of
gravity coupled to matter, one would expect to get an additional term appearing
on the right-hand side and comparable to $\langle T_{\alpha\beta}\rangle$. One can say that quantum back
reaction effects caused by gravitons are as important as those of any quantum field and
should not be neglected in (2). They can be neglected only if there are a large
number of other fields.
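The point that the expectation value of a nonlinear function differs from the function of the expectation value can be seen in a toy analogy. The sketch below is not a gravity calculation: it replaces the metric by a single number fluctuating symmetrically about a classical value g0 with amplitude gamma, and the Einstein operator by the quadratic function g squared. All names are illustrative.

```python
def expectation(values, probs, f=lambda x: x):
    """Expectation value of f over a discrete distribution."""
    return sum(p * f(v) for v, p in zip(values, probs))


def mean_gap(g0, gamma):
    """<f(g)> - f(<g>) for f(g) = g^2 and g = g0 +/- gamma with equal probability."""
    values = [g0 - gamma, g0 + gamma]
    probs = [0.5, 0.5]
    square = lambda g: g ** 2
    return expectation(values, probs, square) - square(expectation(values, probs))


if __name__ == "__main__":
    # The gap is quadratic in the fluctuation, like the term quadratic in the
    # metric perturbation that survives in the full expectation value.
    for gamma in (0.0, 0.1, 0.2):
        print(gamma, mean_gap(1.0, gamma))
```

In this toy model the gap equals gamma squared: it vanishes only when the fluctuation does, which mirrors the extra graviton term discussed above.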
A difficult problem in QFT is that of divergences. Physically, single closed
loops represent infinite vacuum or zero point energy that, in flat spacetime, is
removed by subtraction or normal ordering since, in flat spacetime, only
differences in energy are physical. When gravitational fields are present, this can
no longer be done, since all forms of energy gravitate or are sources of gravitation.
Thus, a more elaborate procedure involving the dynamics of the gravitational
field is essential. This is the renormalization and regularization of the stress-energy tensor and will be discussed elsewhere in Chapter 16. Thus, unlike in flat
spacetime, field theory expressions will not be normal ordered, as this is too naive
in QFT in CST.
QFT in CST is quantum gravity truncated at one loop level. There are no
higher loops for free matter fields and, for consistency, the gravitational field
should be taken to one loop only. Thus, it is the first-order quantum correction to
general relativity. For self-interacting fields, there exist multiple loop matter field
diagrams so that, for consistency, one needs to include graviton diagrams with an
arbitrary number of loops and, hence, be faced with the nonrenormalizability of
gravity. However, even in this case, there can be a domain of validity for QFT in
CST. If $l$ is a typical length scale for the system, the graviton diagrams introduce
$G$, while the matter field introduces $e^2$, say. If $G l^{-2} \ll e^2$, the higher loop graviton
contributions would be negligible compared to that of matter and one can treat
gravity at one loop level only.
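To get a feeling for the size of this domain, the following sketch compares the two couplings numerically. It assumes natural units in which G is the square of the Planck length and takes the matter coupling to be the fine structure constant; both identifications, the numerical values and the function name are assumptions made for illustration.

```python
PLANCK_LENGTH_CM = 1.616e-33      # Planck length, as quoted earlier in the chapter
ALPHA = 1.0 / 137.036             # fine structure constant, standing in for e^2


def graviton_to_matter_ratio(length_cm):
    """(G / l^2) / e^2 in units where G = L_p^2 and e^2 = alpha."""
    return (PLANCK_LENGTH_CM / length_cm) ** 2 / ALPHA


if __name__ == "__main__":
    for l in (1e-8, 1e-13, 1e-20, 1e-30, PLANCK_LENGTH_CM):
        print("l = %.3e cm   ratio = %.3e" % (l, graviton_to_matter_ratio(l)))
```

The ratio stays far below unity down to lengths only a little above the Planck length, which is the window in which gravity may be kept at one loop.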
(Linear) QFT in CST is described by fields satisfying generally covariant
(linear) field equations with specified boundary conditions. The fields and their
conjugate momenta obey canonical equal time or covariant commutation
(anticommutation) relations. The spacetime manifold is required to be globally
hyperbolic, ensuring the existence of global Cauchy hypersurfaces on which the
classical Cauchy problem is well posed. The main difficulty of the theory is that
there is, in general, an arbitrariness in the choice of the definition of positive
frequency field solutions (particles) and boundary conditions leading to alternative definitions of particle and vacuum states. Though physical considerations
can often be invoked to restrict the possible definitions, picking out the
appropriate physical definition of particle states is not, in general, unambiguous.
For instance, in a curved background spacetime, the notion of positive frequency
only makes sense for wavelengths smaller than the local radii of curvature, while
in a time-dependent field, the uncertainty principle leads to an intrinsic
ambiguity in particle number. In Minkowski spacetime, on the other hand,
a global definition of positive frequency is possible which is consistent with
Poincaré invariance and a unique QFT results. Of course, even in flat spacetime,
one can have an alternative quantization appropriate to a uniformly accelerated
observer and this is inequivalent to that of the inertial observer. However, even in
CST, if the spacetime is stationary, a natural quantization may exist via the
time-like Killing vector, which sometimes may not be globally time-like (Kerr spacetime). The
construction of quantization schemes is facilitated if there exist in- and out-like
regions (asymptotically flat regions) in which a natural definition of positive
frequency may be given. It also helps if the spacetime has some symmetries, e.g.,
conformally flat spacetime, De Sitter spacetime, etc.

2. Canonical Quantization of the Scalar Field in CST

Let us begin by assuming that spacetime is a $C^\infty$ n-dimensional globally
hyperbolic pseudo-Riemannian manifold. The differentiability conditions ensure
the existence of differential equations and global hyperbolicity ensures the
existence of Cauchy hypersurfaces (nonglobally hyperbolic spacetimes have also
been studied). The line element of the spacetime is given by

$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu; \qquad \mu, \nu = 0, 1, \ldots, (n - 1). \qquad (3)$$

The signature is $-(n - 2)$.


Formally, field quantization in CST proceeds as in Minkowski spacetime. We
start with the Lagrangian density

$$\mathcal{L}(x) = \sqrt{-g}\left[g^{\mu\nu}(x)\,\phi^*(x)_{,\mu}\,\phi(x)_{,\nu} - \left(m^2 + \xi R(x)\right)\phi^*\phi\right], \qquad (4)$$

where $\phi(x)$ is a complex scalar field, m is the mass parameter of the field quanta,
and the coupling between the scalar and gravitational field is given by $\xi R\phi^*\phi$,
where $\xi$ is a numerical factor and R(x) is the Ricci scalar curvature; this is the only
possible local scalar coupling of this kind with the right dimensions.
Note:

$$[\phi] = [m] = \frac{1}{L}; \qquad [R] = \frac{1}{L^2}.$$

The action is

$$S = \int\mathcal{L}(x)\,d^nx, \qquad (5)$$

where n = dimension of the spacetime. Setting the variation of the action with
respect to $\phi^*$ equal to zero yields the scalar wave equation

$$\left[\Box_x + m^2 + \xi R(x)\right]\phi(x) = 0, \qquad (6)$$

where

$$\Box_x\phi = \frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,g^{\mu\nu}\partial_\nu\phi\right). \qquad (7)$$

Similarly for $\phi^*$.
Two values of $\xi$ are of particular interest:

(i) Minimal coupling: $\xi = 0$, (8a)

(ii) Conformal coupling: $\xi = \xi(n) \equiv \frac{1}{4}\,\frac{n - 2}{n - 1}$. (8b)

In the latter case, if m = 0, the action and, hence, the field equations, are
invariant under conformal transformations of the metric, as we shall discuss later.
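As a quick check of the conformal coupling formula, the value of the coupling can be tabulated for a few spacetime dimensions. This is a minimal sketch using exact rational arithmetic; the function name is illustrative.

```python
from fractions import Fraction


def xi_conformal(n):
    """Conformal coupling (1/4)(n - 2)/(n - 1) in n spacetime dimensions."""
    if n < 2:
        raise ValueError("need at least two spacetime dimensions")
    return Fraction(n - 2, 4 * (n - 1))


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 10, 11):
        print("n = %2d   xi = %s" % (n, xi_conformal(n)))
```

In four dimensions this gives the familiar value 1/6, and in two dimensions the conformal and minimal couplings coincide.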
Given two solutions $\phi_1$ and $\phi_2$ of Equation (6), it can easily be shown, using (6),
that

$$J^\mu(\phi_1, \phi_2) = \text{const} \times g^{\mu\nu}\left(\phi^*_{1,\nu}\,\phi_2 - \phi_1^*\,\phi_{2,\nu}\right) \qquad (9)$$

is conserved, i.e.

$$J^\mu{}_{;\mu} = 0. \qquad (10)$$

Choosing the constant suitably so that $J^\mu(\phi_1, \phi_2)$ is Hermitian, we get

$$J^\mu = -i g^{\mu\nu}\left(\phi^*_{1,\nu}\,\phi_2 - \phi_1^*\,\phi_{2,\nu}\right). \qquad (11)$$

Equation (11) induces a natural inner product on the space of complex classical
solutions, the Klein-Gordon scalar product

$$(\phi_2, \phi_1) = -i\int_\Sigma \phi_1\overleftrightarrow{\partial_\mu}\phi_2^*\,\sqrt{-g}\;d\Sigma^\mu, \qquad (12)$$

where

$$f\overleftrightarrow{\partial_\mu}g = f\,\partial_\mu g - (\partial_\mu f)\,g \qquad (13)$$

and $d\Sigma^\mu$ is the future directed surface element of the (n - 1)-dimensional
hypersurface $\Sigma$, which is taken to be a complete Cauchy hypersurface for
Equation (6). If coordinates are chosen so that $\Sigma$ is a constant t hypersurface, then
in four dimensions, e.g.

$$d\Sigma^i = 0 \qquad (i = 1, 2, 3).$$

The above scalar product is conserved under displacements and deformations of
$\Sigma$ (i.e., it is independent of $\Sigma$). This results from the assumed self-adjointness of the
differential operator in (6) that yields a conserved current $J^\mu$. It follows by use of
Gauss's theorem and straightforward integration by parts. Note that the above
inner product satisfies, as required,

$$(\phi_1, \phi_2)^* = (\phi_2, \phi_1), \qquad (14)$$

but is not positive definite.
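The indefinite character of the Klein-Gordon product is easy to see numerically. The sketch below makes several simplifying assumptions that are not in the text: two-dimensional Minkowski spacetime, a massless field, a periodic box, and plane-wave modes normalized on that box. The product is taken antilinear in its first argument, (a, b) = -i times the integral of b d_t a* - (d_t b) a* over a constant t slice. All names are illustrative.

```python
import cmath
import math

BOX = 2 * math.pi   # periodic box length (illustrative choice)
N = 256             # number of grid points on the slice


def mode(k, t, x, conj=False):
    """Return (u, du/dt) for u = exp(-i(w t - k x)) / sqrt(2 w BOX), w = |k|, or its conjugate."""
    w = abs(k)
    u = cmath.exp(-1j * (w * t - k * x)) / math.sqrt(2 * w * BOX)
    du = -1j * w * u
    if conj:
        return u.conjugate(), du.conjugate()
    return u, du


def kg_product(a, b, t=0.3):
    """(a, b) on the slice t = const; a and b are (k, conj) pairs with integer k != 0."""
    dx = BOX / N
    total = 0j
    for j in range(N):
        x = j * dx
        ua, dua = mode(a[0], t, x, a[1])
        ub, dub = mode(b[0], t, x, b[1])
        total += (ub * dua.conjugate() - dub * ua.conjugate()) * dx
    return -1j * total


if __name__ == "__main__":
    u1, u1c, u2 = (1, False), (1, True), (2, False)
    print("(u1, u1)   =", kg_product(u1, u1))     # positive norm
    print("(u1*, u1*) =", kg_product(u1c, u1c))   # negative norm
    print("(u1, u2)   =", kg_product(u1, u2))     # orthogonal
    print("(u1, u1*)  =", kg_product(u1, u1c))    # orthogonal
```

Positive frequency modes come out with norm +1 and their complex conjugates with norm -1, and the result does not depend on the slice chosen through the parameter t.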


The conjugate momenta are given by

$$\pi \equiv \frac{\partial\mathcal{L}}{\partial\phi_{,t}} = \sqrt{-g}\,g^{\alpha t}\,\phi^*_{,\alpha}, \qquad (15a)$$

$$\pi^* \equiv \frac{\partial\mathcal{L}}{\partial\phi^*_{,t}} = \sqrt{-g}\,g^{t\beta}\,\phi_{,\beta}. \qquad (15b)$$

To quantize the field, one regards $\phi$ and $\pi$ as operators (operator-valued
distributions) and imposes equal time canonical commutation relations

$$[\phi(x), \phi(x')]_{t=t'} = 0; \qquad [\pi(x), \pi(x')]_{t=t'} = 0, \qquad [\phi(x), \pi^*(x')]_{t=t'} = 0, \qquad (16a)$$

$$[\phi(x), \pi(x')]_{t=t'} = i\delta(\mathbf{x} - \mathbf{x}') \qquad (16b)$$

and similarly for complex (Hermitian) conjugates.

The Hamiltonian H(t) is given by
(17)

For any function F written as a power series in $\phi$, $\pi$ and $\partial_j\phi$, one has the
Heisenberg equation of motion

$$[F, H] + i\frac{\partial F}{\partial t} = i\frac{dF}{dt}, \qquad (18)$$

where the second term contains explicit time dependence not included in $\phi$, $\pi$ and
$\partial_j\phi$. For $\pi^*$, one gets back the equation of motion (EOM) (6), while for $\phi^*$ one
gets (15a), and similarly for $\pi$ and $\phi$.

The canonical CR are propagated consistently by the EOM:

$$\frac{d}{dt}[\phi, \pi] = i[\phi, [H, \pi]] + i[\pi, [\phi, H]] = -i[H, [\pi, \phi]] = 0. \qquad (19)$$

Since there is freedom in choosing a coordinate system, it follows that if the


CCR holds on one space-like Cauchy hypersurface, then they hold on any other
such hypersurface. Thus, they hold in any coordinate system with constant time
hypersurfaces and, hence, the field algebra (16) and EOM form a generally
covariant system.
The field operators act on state vectors which describe possible states of the
system. The construction of a state vector space or the Fock space in CST is not as
straightforward or unambiguous as the construction of the field algebra. One has
difficulties in defining the physically relevant positive frequency solution. In
Minkowski space, there is a preferred set of modes associated with the
natural Cartesian coordinates, which are related to the Poincaré group
representing the isometries of Minkowski spacetime. More precisely, $\partial/\partial t$ is a Killing
vector of the spacetime orthogonal to the spacelike t = constant hypersurfaces
and the modes are eigenfunctions of the Killing vector with eigenvalues
$-i\omega$ ($\omega > 0$). Further, the Minkowski vacuum is invariant under the Poincaré
group.
In CST, the Poincare group is no longer a symmetry group of the spacetime.
Indeed, in general there will be no Killing vector at all which can be used to define
positive frequency modes. In some special cases, there may be symmetries under
certain restricted transformations, e.g. rotation and translations or the De Sitter
group. In these cases, natural coordinates may exist which are associated with the
Killing vector, like the rectangular coordinates in Minkowski spacetime. But
even if such coordinates exist, they do not have the same central physical status in
QFT as their Minkowski counterparts. In general, no privileged coordinate
system exists and no natural mode decomposition of the field, based on the
separation of the wave equation in these coordinates, will present itself.
A typical situation that may arise is the following. Suppose there exist two
regions of spacetime, 'in' and 'out', in which there is a natural choice of two different
orthonormal basis sets $\{e_i, e_i^*\}$ and $\{f_i, f_i^*\}$, e.g. early- and late-type basis
functions for collapsing black holes, basis sets with respect to the $\tau$ and $t$ Killing
vectors in Rindler spacetime, basis states with respect to $\eta$- and $\xi$-quantization
schemes in eternal black-hole models. Then

$$(e_i, e_j) = \delta_{ij} = (f_i, f_j), \qquad (20a)$$

$$(e_i^*, e_j^*) = -\delta_{ij} = (f_i^*, f_j^*), \qquad (20b)$$

$$(e_i, e_j^*) = 0 = (f_i, f_j^*), \qquad (20c)$$

where ( , ) is the inner product given by Equation (12).

Any arbitrary solution $\phi$ of Equation (6) can be expanded in terms of either the
basis $\{e_i, e_i^*\}$ or $\{f_i, f_i^*\}$:

$$\phi = \sum_i\left(a_i e_i + b_i^\dagger e_i^*\right), \qquad (21a)$$

$$\phi = \sum_i\left(c_i f_i + d_i^\dagger f_i^*\right). \qquad (21b)$$

Here, i represents the set of quantities necessary to label the modes. Using (20) and
(21), it follows that

$$a_i = (e_i, \phi), \qquad (22a)$$

$$b_i^\dagger = -(e_i^*, \phi), \qquad (22b)$$

$$c_i = (f_i, \phi), \qquad (23a)$$

$$d_i^\dagger = -(f_i^*, \phi). \qquad (23b)$$

The canonical commutation relations (CCR) therefore imply from (22)

$$[a_i, a_j^\dagger] = \delta_{ij} = [b_i, b_j^\dagger]. \qquad (24)$$

All other commutators vanish. Similarly,

$$[c_i, c_j^\dagger] = \delta_{ij} = [d_i, d_j^\dagger]. \qquad (25)$$

All other commutators vanish.

The e and f vacua are defined by

$$a_i|0\rangle_e = b_i|0\rangle_e = 0, \qquad (26)$$

$$c_i|0\rangle_f = d_i|0\rangle_f = 0. \qquad (27)$$

Now prepare the system in the state $|0\rangle_e$ and work in the Heisenberg
representation. How does this state appear in the out region? Is it empty? The key
point is that the e-vacuum will generally have a nonzero number of f-particles and
antiparticles, i.e.
(28)

To see this, note that $\{e_i, e_i^*\}$ is a complete set, so that any $f_i$ may be expressed as

$$f_i = \sum_j\left(\alpha_{ij}e_j + \beta_{ij}e_j^*\right), \qquad (29)$$

where $\alpha$ and $\beta$ are constants. Since both $\{e_i\}$ and $\{f_i\}$ are orthonormal sets, one
can show that the $\alpha$'s and $\beta$'s should satisfy

$$\alpha\alpha^\dagger - \beta\beta^\dagger = I, \qquad (30a)$$

$$\alpha\beta^T - \beta\alpha^T = 0. \qquad (30b)$$

From Equation (23), employing (29) and (22), we get

$$c_i = \sum_j\left(\alpha^*_{ij}a_j - \beta^*_{ij}b_j^\dagger\right), \qquad (31a)$$

$$d_i = \sum_j\left(\alpha^*_{ij}b_j - \beta^*_{ij}a_j^\dagger\right). \qquad (31b)$$

Thus, the f-particle number in the e-vacuum is given by

$${}_e\langle 0|N_i^{(c)}|0\rangle_e \equiv {}_e\langle 0|c_i^\dagger c_i|0\rangle_e = \sum_{j,k}\beta_{ij}\beta^*_{ik}\;{}_e\langle 0|b_j b_k^\dagger|0\rangle_e = \sum_j|\beta_{ij}|^2. \qquad (32a)$$

Similarly,

$${}_e\langle 0|N_i^{(d)}|0\rangle_e \equiv {}_e\langle 0|d_i^\dagger d_i|0\rangle_e = \sum_j|\beta_{ij}|^2. \qquad (32b)$$

Thus, the old or e-vacuum contains new or f-particles and antiparticles, i.e. the
in-vacuum is a superposition of out-particle-antiparticle states. Note that if
$\beta_{ij} = 0$, i.e. there is no mixing of positive and negative frequency solutions, the
vacuum is left unchanged. If $\beta_{ij} \neq 0$, we have a Bogolubov transformation:
positive and negative frequencies mix, leading to particle production. The
e-vacuum contains $\sum_j|\beta_{ij}|^2$ particles in the mode $f_i$ and the same number of
antiparticles. Thus, spontaneous creation occurs via production of
particle-antiparticle pairs. If ${}_e\langle 0|N_i^{(c,d)}|0\rangle_e$ diverges, then $|0\rangle_e$ and $|0\rangle_f$ are not even related
by a unitary transformation, and the two representations are inequivalent.
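A one-mode illustration makes the particle count concrete. The example below is an assumption brought in for illustration and is not taken from the text: a single oscillator mode whose frequency jumps suddenly from w_in to w_out. Matching the in-mode and the out-mode and their time derivatives at the jump gives real Bogolubov coefficients, and the number of created quanta is the square of beta. Function names are illustrative.

```python
import math


def bogoliubov_sudden(w_in, w_out):
    """Coefficients (alpha, beta) in f = alpha*e + beta*e^* for a sudden frequency jump.

    e = exp(-i w_in t)/sqrt(2 w_in) and f = exp(-i w_out t)/sqrt(2 w_out),
    matched in value and first derivative at t = 0.
    """
    norm = 2.0 * math.sqrt(w_in * w_out)
    alpha = (w_in + w_out) / norm
    beta = (w_in - w_out) / norm
    return alpha, beta


def created_particles(w_in, w_out):
    """Number of out-quanta found in the in-vacuum, |beta|^2."""
    _, beta = bogoliubov_sudden(w_in, w_out)
    return beta ** 2


if __name__ == "__main__":
    for w_in, w_out in [(1.0, 1.0), (1.0, 2.0), (1.0, 4.0)]:
        alpha, beta = bogoliubov_sudden(w_in, w_out)
        # alpha^2 - beta^2 = 1 is the one-mode form of condition (30a).
        print(w_in, w_out, alpha ** 2 - beta ** 2, created_particles(w_in, w_out))
```

No quanta are created when the frequency does not change, and the count grows with the size of the jump.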
In addition to particle states and Bogolubov coefficients, one needs various
Green's functions. The CST generalization of the Green's function equation, e.g.
for the Feynman propagator, is

$$\left[\Box_x + m^2 + \xi R\right]G_F(x, x') = -\frac{1}{\sqrt{-g}}\,\delta^n(x - x'), \qquad (33)$$

where

$$iG_F(x, x') = \langle 0|T(\phi(x)\phi(x'))|0\rangle. \qquad (34)$$

Equation (33) does not specify the state $|0\rangle$, nor does it ensure that the solution has
the properties of a time-ordered product. To fix the state and impose the time
ordering, boundary conditions must be imposed on the solution of (33). In flat
spacetime, this is done by the choice of the contour used in the integral
representation. In CST, specification of the boundary conditions will not be
simple and will depend on global features of the specific problem.

3. The Conformal Vacuum


A transformation

$$g_{\mu\nu} \to \bar g_{\mu\nu} = \Omega^2(x)\,g_{\mu\nu} \qquad (35)$$

is called a conformal transformation of the metric. The coordinates of an event
are not changed by a conformal transformation of the metric and null vectors
remain null in the transformed metric. (Light cones and, therefore, the causal
structure are unchanged.)
As mentioned earlier, the scalar wave equation with m = 0 and

$$\xi = \frac{1}{4}\left(\frac{n - 2}{n - 1}\right)$$

is invariant under conformal transformations of the above type if, under such
a transformation, the scalar field is assumed to transform as

$$\phi \to \bar\phi = \Omega^{(2-n)/2}\,\phi. \qquad (36)$$

$(2 - n)/2$ is called the conformal weight of the scalar field. This is because, under
the transformation (35), one has

$$\left[\bar\Box + \frac{1}{4}\,\frac{n-2}{n-1}\,\bar R\right]\bar\phi = \Omega^{-(n+2)/2}\left[\Box + \frac{1}{4}\,\frac{n-2}{n-1}\,R\right]\phi. \qquad (37)$$

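A spacetime is said to be conformally flat if

$$g_{\mu\nu} = \Omega^2(x)\,\eta_{\mu\nu}. \qquad (38)$$

All Robertson-Walker models are conformally flat. If spacetime is conformally
flat, then identifying the untransformed metric in (35) with $\eta_{\mu\nu}$, so that its scalar
curvature vanishes, and the transformed metric with $g_{\mu\nu}$ of (38), one has from (37),
for the corresponding flat-space field $\tilde\phi$,

$$\Box_\eta\tilde\phi \equiv \eta^{\mu\nu}\partial_\mu\partial_\nu\tilde\phi = 0, \qquad (39)$$

which, using (36), implies

$$\eta^{\mu\nu}\partial_\mu\partial_\nu\left(\Omega^{(n-2)/2}\phi\right) = 0. \qquad (40)$$

Equation (40) has the familiar Minkowski space form for the combination
$\Omega^{(n-2)/2}\phi$, so that the familiar positive frequency solutions are given by

$$u_k(x) = \frac{e^{-ik\cdot x}}{\left(2\omega(2\pi)^{n-1}\right)^{1/2}}, \qquad k^0 = \omega. \qquad (41)$$

Relative to the original spacetime, these modes are thus of positive frequency with
respect to the time-like Killing vector of the conformally related spacetime
(Minkowski spacetime) or conformal Killing vector. Thus, the field can be
expanded as

$$\phi(x) = \Omega^{(2-n)/2}\sum_k\left(a_k u_k(x) + a_k^\dagger u_k^*(x)\right). \qquad (42)$$

The vacuum associated with these modes, defined by

$$a_k|0\rangle = 0, \qquad (43)$$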

is called the conformal vacuum. Since our modes are just those of flat spacetime,
modes which are of positive frequency with respect to the conformal vacuum at
one time remain so for all time. Thus, a field satisfying the conformally invariant
wave equation in conformally flat spacetime, if prepared in the conformal
vacuum, remains so for all time and there is no particle production. (Note that we
always work in the Heisenberg picture.)
The k = 0 Robertson-Walker model is trivially seen to be conformally flat so
that the above result applies. For the k = ±1 cases, however, the
spacetime-dependent conformal transformation that transforms the metric of the spatially
curved Robertson-Walker solution to Minkowski spacetime is complicated and
has a singular point, so that a similar conclusion cannot be derived on the same
lines. However, the same conclusion results, as can be seen below.
The line element in these cases is

$$ds^2 = dt^2 - a^2(t)\,\gamma_{ij}\,dx^i dx^j, \qquad (44)$$

where $\gamma_{ij}\,dx^i dx^j$ is the line element of the three-dimensional space of constant
curvature (+ve or -ve). By transforming to a new time coordinate $\eta$ (the
conformal time), one has

$$ds^2 = a^2(t)\left[d\eta^2 - \gamma_{ij}\,dx^i dx^j\right], \qquad (45)$$

where

$$\eta = \int^t\frac{dt'}{a(t')}. \qquad (46)$$

Choosing the conformal factor $\Omega^2 = a^2(t)$, one sees that the conformally
transformed metric is static.
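The conformal time integral can be evaluated numerically for any given scale factor. The sketch below uses a simple midpoint rule; the radiation-like scale factor a(t) = sqrt(t) used in the demonstration is an illustrative assumption, chosen because the integral is then known in closed form, and the function name is hypothetical.

```python
import math


def conformal_time(a, t_start, t_end, steps=20000):
    """Elapsed conformal time, the integral of dt'/a(t') from t_start to t_end (midpoint rule)."""
    h = (t_end - t_start) / steps
    return sum(h / a(t_start + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    # For a(t) = sqrt(t) the exact result is 2*(sqrt(t_end) - sqrt(t_start)).
    numeric = conformal_time(math.sqrt, 1.0, 4.0)
    exact = 2.0 * (math.sqrt(4.0) - math.sqrt(1.0))
    print("numeric:", numeric, " exact:", exact)
```

For a static spacetime, with a constant, the same function simply returns the elapsed time divided by a.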
The Klein-Gordon equation can be solved exactly in static Robertson-Walker
spacetimes. The static spacetime with k = 0 is, of course, Minkowski
spacetime, while with k = 1, one has the closed Einstein static universe. k = -1 is
a static universe with hyperbolic spatial sections. We do not need to solve the
complete problem for our purpose, since we can note that, for static spacetime,
the conformal time is related to the time coordinate t by $\eta = t/a$ and also that,
since $\partial/\partial\eta$ is a Killing vector, one would have mode solutions that go like $\exp(-i\omega\eta)$.
Thus, one has solutions of the Klein-Gordon equation which are of positive
frequency with respect to the Killing vectors $\partial/\partial t$ and $\partial/\partial\eta$. Since the spacetimes
are static and admit a global time-like Killing vector, the definition of particles
is no more ambiguous than that of particles in Minkowski space and,
consequently, there is no mixing of positive and negative frequency solutions. An
analysis using particle detectors confirms that a detector at rest in the comoving
frame registers no particles. Going back to the general Robertson-Walker metric
by the conformal factor, one thus finds that there is no creation of particles
obeying the conformally invariant wave equation in the k = ±1
Robertson-Walker spacetimes.
Note: (1) The equations governing photons, massless neutrinos, and Dirac
particles are all conformally invariant with different conformal weights. Hence,
an isotropic expansion of the universe does not create these particles.
Renormalization effects do not alter this result.


(2) Yang-Mills equations are also conformally invariant, but since renormalization is more complicated in this case, the no-creation result for Yang-Mills
quanta may be altered.
(3) For gravitons treated as perturbations on a Robertson-Walker background, the linearized Einstein's equations are not conformally invariant. In fact,
each independent polarization satisfies a minimally coupled scalar wave
equation. Thus, one does have graviton production in an isotropic expansion.
(4) If expansion of the universe is anisotropic at early times, then particles
obeying conformally invariant field equations could be created, e.g. consider
(47)

This is, in general, not conformally static so that one cannot use the conformal
invariance of the field equation to identify a preferred set of positive frequency
solutions during the expansion.
(5) An important result of interest in cosmology is the following. It has been
shown in model calculations that the back reaction of the created particles tends
to equalize the expansion rates in different directions, thereby bringing about an
isotropic expansion from an initially anisotropic one dynamically.

4. A Toy Model With Particle Creation


Consider a two-dimensional Robertson-Walker universe with the line element

$$ds^2 = dt^2 - a^2(t)\,dx^2. \tag{48}$$

In terms of the conformal time,

$$\eta = \int^t \frac{dt'}{a(t')}, \tag{49}$$

and identifying $a^2(\eta) = C(\eta)$ as the conformal factor, the spacetime is conformal
to Minkowski spacetime. Let

$$C(\eta) = A + B\tanh(\rho\eta), \tag{50}$$

where $A$, $B$, $\rho$ are constants. Then, in the far past ($\eta \to -\infty$) and the far future
($\eta \to +\infty$), the spacetime becomes Minkowskian:

$$C(\eta) \to A \pm B, \qquad \eta \to \pm\infty. \tag{51}$$

Let us now study a massive minimally coupled scalar field in this spacetime. In
two dimensions, the conformal coupling is $\xi = 0$, so that minimal and conformal coupling are equivalent.
The curvature of this spacetime is spatially constant with a temporal variation
that, at large times, vanishes exponentially as required (the overall sign depends
on the curvature convention and on the limit taken):

$$R(\eta) \xrightarrow[\eta \to \pm\infty]{} \frac{8B\rho^2}{(A \pm B)^2}\exp(\mp 2\rho\eta). \tag{52}$$

The metric is independent of $x$, so the field equation is invariant under
translations in $x$. Thus, one can separate variables in the modes as

$$u_k(\eta, x) = \frac{1}{\sqrt{2\pi}}\, e^{ikx}\, \chi_k(\eta). \tag{53}$$

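The scalar equation in this case is

$$(\Box + m^2)\phi = 0, \qquad \Box = C^{-1}(\eta)\left(\partial_\eta^2 - \partial_x^2\right), \tag{54}$$

so that substituting from (53) yields

$$\left[C^{-1}\left(\partial_\eta^2 + k^2\right) + m^2\right]\chi_k = 0,$$

i.e.

$$\frac{d^2\chi_k}{d\eta^2} + \left(k^2 + C m^2\right)\chi_k = 0. \tag{55}$$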

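It can be shown that the above equation can be solved in terms of hypergeometric
functions. They can be chosen as normalized modes that behave like positive
frequency Minkowski modes ($\sim \exp(-i\omega_{\rm in}\eta)$) in the remote past ($\eta \to -\infty$;
$t \to -\infty$). These will be denoted as $u_k^{\rm in}(\eta, x)$. Similarly, modes that behave like
positive frequency Minkowski modes ($\sim \exp(-i\omega_{\rm out}\eta)$) as $\eta \to \infty$
will be denoted by $u_k^{\rm out}(\eta, x)$. Neglecting overall normalization factors,

$$u_k^{\rm in}(\eta, x) \sim \exp\left[ikx - i\omega_+\eta - \frac{i\omega_-}{\rho}\ln(2\cosh\rho\eta)\right]\, {}_2F_1\!\left(1 + \frac{i\omega_-}{\rho},\ \frac{i\omega_-}{\rho};\ 1 - \frac{i\omega_{\rm in}}{\rho};\ \tfrac12(1 + \tanh\rho\eta)\right) \xrightarrow[\eta \to -\infty]{} \exp(ikx - i\omega_{\rm in}\eta), \tag{56a}$$

$$u_k^{\rm out}(\eta, x) \sim \exp\left[ikx - i\omega_+\eta - \frac{i\omega_-}{\rho}\ln(2\cosh\rho\eta)\right]\, {}_2F_1\!\left(1 + \frac{i\omega_-}{\rho},\ \frac{i\omega_-}{\rho};\ 1 + \frac{i\omega_{\rm out}}{\rho};\ \tfrac12(1 - \tanh\rho\eta)\right) \xrightarrow[\eta \to \infty]{} \exp(ikx - i\omega_{\rm out}\eta), \tag{56b}$$

where

$$\omega_{\rm in} = \left[k^2 + m^2(A - B)\right]^{1/2}, \qquad \omega_{\rm out} = \left[k^2 + m^2(A + B)\right]^{1/2}, \tag{56c}$$

$$\omega_\pm = \tfrac12(\omega_{\rm out} \pm \omega_{\rm in}). \tag{56d}$$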

$u_k^{\rm in}$ and $u_k^{\rm out}$ are not equal; hence, they may be related by a Bogolubov
transformation. This can be achieved by using well-known properties of the
hypergeometric function, e.g.

$$F(a, b; c; z) = \frac{\Gamma(c)\Gamma(c - a - b)}{\Gamma(c - a)\Gamma(c - b)}\,F(a, b; a + b - c + 1; 1 - z) + (1 - z)^{c-a-b}\,\frac{\Gamma(c)\Gamma(a + b - c)}{\Gamma(a)\Gamma(b)}\,F(c - a, c - b; c - a - b + 1; 1 - z), \tag{57a}$$

$$F(a, b; c; z) = (1 - z)^{c-a-b}\,F(c - a, c - b; c; z), \tag{57b}$$

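so that one obtains

$$u_k^{\rm in}(\eta, x) = \alpha_k\,u_k^{\rm out}(\eta, x) + \beta_k\,u_{-k}^{{\rm out}*}(\eta, x), \tag{58}$$

where

$$\alpha_k = \left(\frac{\omega_{\rm out}}{\omega_{\rm in}}\right)^{1/2}\frac{\Gamma(1 - i\omega_{\rm in}/\rho)\,\Gamma(-i\omega_{\rm out}/\rho)}{\Gamma(-i\omega_+/\rho)\,\Gamma(1 - i\omega_+/\rho)}, \qquad \beta_k = \left(\frac{\omega_{\rm out}}{\omega_{\rm in}}\right)^{1/2}\frac{\Gamma(1 - i\omega_{\rm in}/\rho)\,\Gamma(i\omega_{\rm out}/\rho)}{\Gamma(i\omega_-/\rho)\,\Gamma(1 + i\omega_-/\rho)}. \tag{59}$$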

Comparing with our general formula


(60)

Thus,

$$|\beta_k|^2 = \frac{\sinh^2(\pi\omega_-/\rho)}{\sinh(\pi\omega_{\rm in}/\rho)\,\sinh(\pi\omega_{\rm out}/\rho)}, \tag{61}$$

and similarly for $|\alpha_k|^2 = 1 + |\beta_k|^2$.
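As a quick numerical illustration of (61), the following minimal sketch evaluates $|\beta_k|^2$ and the corresponding $|\alpha_k|^2$ (obtained by replacing $\omega_-$ with $\omega_+$ in the numerator) and checks the normalization $|\alpha_k|^2 - |\beta_k|^2 = 1$. The function names and the parameter values are illustrative assumptions, not part of the original text.

```python
import math

def mode_frequencies(k, m, A, B):
    """Asymptotic frequencies (56c) for C(eta) = A + B*tanh(rho*eta)."""
    w_in = math.sqrt(k**2 + m**2 * (A - B))
    w_out = math.sqrt(k**2 + m**2 * (A + B))
    return w_in, w_out

def bogolubov_moduli(k, m, A, B, rho):
    """Return (|alpha_k|^2, |beta_k|^2) from the closed-form expression (61)."""
    w_in, w_out = mode_frequencies(k, m, A, B)
    w_plus = 0.5 * (w_out + w_in)
    w_minus = 0.5 * (w_out - w_in)
    denom = math.sinh(math.pi * w_in / rho) * math.sinh(math.pi * w_out / rho)
    alpha_sq = math.sinh(math.pi * w_plus / rho) ** 2 / denom
    beta_sq = math.sinh(math.pi * w_minus / rho) ** 2 / denom
    return alpha_sq, beta_sq

# Illustrative (assumed) parameter values.
alpha_sq, beta_sq = bogolubov_moduli(k=1.0, m=1.0, A=2.0, B=1.0, rho=1.0)
print("normalization:", alpha_sq - beta_sq)   # should be close to 1
print("massless |beta|^2:", bogolubov_moduli(1.0, 0.0, 2.0, 1.0, 1.0)[1])
```

The normalization follows from the identity $\sinh^2 x - \sinh^2 y = \sinh(x+y)\sinh(x-y)$ with $x = \pi\omega_+/\rho$, $y = \pi\omega_-/\rho$.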


Apart from normalization factors, an initial plane wave solution $\exp(i(kx - \omega_{\rm in}\eta))$
evolves into the superposition

$$\alpha_k \exp(i(kx - \omega_{\rm out}\eta)) + \beta_k \exp(i(kx + \omega_{\rm out}\eta)).$$

Thus, a wave packet moving at early times in the $x$ direction splits, as a result of the
expansion, into two parts at late times: one moving in the $x$ direction and
the other in the $-x$ direction. One can say that the wave moving in the $x$ direction has grown in
intensity from the value 1 to $|\alpha_k|^2 = 1 + |\beta_k|^2$, and that a backward wave with intensity
$|\beta_k|^2$ has appeared. If the mean number of particles moving in the $x$ direction is 1, then after
the expansion the number is $1 + |\beta_k|^2$ in the $x$ direction and $|\beta_k|^2$ in the $-x$ direction. Thus, $|\beta_k|^2$ pairs
have been created by the expansion for each particle present early (stimulated
particle creation). Therefore, there must also be spontaneous pair creation. Vacuum
fluctuations correspond to a zero-point energy $\tfrac12\hbar\omega$ for each mode and will produce
the same effect as if the mean number of particles in mode $k$ were $\tfrac12$. Hence, in the
vacuum state there will be $\tfrac12|\beta_k|^2$ pairs produced by the zero-point energy of mode
$k$, and the zero-point energy of mode $-k$ produces another $\tfrac12|\beta_k|^2$ pairs. Hence, starting from the vacuum, the
mean number of particles at late times in mode $k$ is $|\beta_k|^2$.
Thus, as discussed in the general case, prepare the system in a state $|0\rangle_{\rm in}$ defined
via $u_k^{\rm in}$. In the remote past, spacetime is Minkowskian and all inertial particle
detectors will register an absence of particles. Thus, unaccelerated observers
would identify this quantum state with the physical vacuum. As the curvature
builds up, particles are created by the imposed gravitational field. In the out
region, spacetime is also Minkowskian and the system is still in the state $|0\rangle_{\rm in}$ but, in
contrast to the situation in the in region, $|0\rangle_{\rm in}$ is not regarded by inertial observers
in the out region as the physical vacuum. This role is given to $|0\rangle_{\rm out}$, defined
relative to $u_k^{\rm out}$. Thus, unaccelerated particle detectors will register the presence
of quanta, with the expected number in mode $k$ given by (61). This quantum
evolution is the creation of particles in mode $k$ as a result of cosmic expansion.
The initially empty Universe acquires, at late times, a background of massive
scalar particles such that each pair has zero net total momentum (as required by
the spatial homogeneity of the metric).
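The in/out picture above can be checked numerically. The sketch below integrates the mode equation (55) with a hand-written fourth-order Runge-Kutta step, starting from the positive-frequency in mode at large negative $\eta$, and reads off $|\beta_k|^2$ at late times from $|\beta_k|^2 = (|\chi'|^2 + \omega_{\rm out}^2|\chi|^2)/(2\omega_{\rm out}) - \tfrac12$, which follows from decomposing $\chi$ into normalized out modes. The integrator, step count, cutoff `eta_max`, and parameter values are all assumptions made for illustration.

```python
import cmath
import math

def beta_sq_exact(k, m, A, B, rho):
    """Closed-form |beta_k|^2 of equation (61)."""
    w_in = math.sqrt(k**2 + m**2 * (A - B))
    w_out = math.sqrt(k**2 + m**2 * (A + B))
    w_minus = 0.5 * (w_out - w_in)
    return (math.sinh(math.pi * w_minus / rho) ** 2
            / (math.sinh(math.pi * w_in / rho) * math.sinh(math.pi * w_out / rho)))

def beta_sq_numeric(k, m, A, B, rho, steps=20000):
    """Integrate chi'' + (k^2 + C(eta) m^2) chi = 0 from the in region to the out region."""
    eta_max = 10.0 / rho                  # assumed cutoff: tanh(10) is 1 to ~1e-8
    w_in = math.sqrt(k**2 + m**2 * (A - B))
    w_out = math.sqrt(k**2 + m**2 * (A + B))

    def omega_sq(eta):
        return k**2 + m**2 * (A + B * math.tanh(rho * eta))

    eta = -eta_max
    h = 2.0 * eta_max / steps
    chi = cmath.exp(-1j * w_in * eta) / math.sqrt(2.0 * w_in)   # in mode
    dchi = -1j * w_in * chi
    for _ in range(steps):
        # Classical RK4 for the first-order system (chi, dchi).
        k1c, k1d = dchi, -omega_sq(eta) * chi
        k2c, k2d = dchi + 0.5 * h * k1d, -omega_sq(eta + 0.5 * h) * (chi + 0.5 * h * k1c)
        k3c, k3d = dchi + 0.5 * h * k2d, -omega_sq(eta + 0.5 * h) * (chi + 0.5 * h * k2c)
        k4c, k4d = dchi + h * k3d, -omega_sq(eta + h) * (chi + h * k3c)
        chi += h * (k1c + 2 * k2c + 2 * k3c + k4c) / 6.0
        dchi += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        eta += h
    return (abs(dchi) ** 2 + w_out**2 * abs(chi) ** 2) / (2.0 * w_out) - 0.5

params = dict(k=0.2, m=1.0, A=2.0, B=1.5, rho=2.0)   # illustrative values
print(beta_sq_numeric(**params), beta_sq_exact(**params))
```

The two printed numbers should agree closely; the residual difference is set by the step size and by the finite cutoff in $\eta$.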
Note: (1) In the massless limit $\omega_- \to 0$, so that (61) gives no particle production
(conformally trivial situation).
(2) The particle production takes place when conformal symmetry is broken by
the presence of mass, which provides a length scale in the theory. The production
can be regarded as caused by the coupling of the spacetime expansion to the
quantum field, via the mass. The time-dependent gravitational field provides the
requisite energy.
(3) The particle number is a useful concept only if the creation rate is small or
the particle mass itself is very high.
(4) From Equation (61) for the $\beta$ coefficient, one can see that (i) $|\beta_k|^2 \propto B^2 \to 0$ as $B \to 0$,
i.e. as the total amount of expansion goes to zero, and the Minkowski space limit is obtained.
(ii) The particle creation falls more rapidly to zero if the expansion rate $\to 0$. The
rate is parametrized by $\rho$ and, as $\rho \to 0$, one obtains an exponential decline

$$|\beta_k|^2 \to \exp\left(-\frac{2\pi\omega_{\rm in}}{\rho}\right) \to 0.$$

The slowness parameter is $\rho/\omega_{\rm in}$, which becomes small if $\rho \ll k$ or $m$. Physically,
expansion will excite modes for which $\omega \lesssim$ (expansion rate). For larger $\omega$,
creation is exponentially suppressed and the modes are excited inefficiently. Thus, production of
high-mass particles is exponentially small.
(5) If a high energy in mode is to remain empty in the out region, it must be so
in the expansion epochs. However, this is meaningful only if the motion of the
detector is specified. One can show that the comoving detector fulfills this
intuitive requirement.
(6) The above remarks coming from the specific example apply to any
Robertson-Walker spacetime with a smooth ($C^\infty$) scale factor. The rapid decline of
quanta in $k \to \infty$ modes is a general feature.
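The two limits in Note (4) can be checked directly from (61). The sketch below works with $\log|\beta_k|^2$ so that very slow expansions (where the hyperbolic sines would overflow) remain representable; the helper `log_sinh` and the parameter values are assumptions introduced for this illustration, and the code requires $m > 0$ and $B > 0$ so that $\omega_- > 0$.

```python
import math

def log_sinh(x):
    """log(sinh(x)) for x > 0, written to stay finite for very small or very large x."""
    return x + math.log(-math.expm1(-2.0 * x)) - math.log(2.0)

def log_beta_sq(k, m, A, B, rho):
    """log |beta_k|^2 from equation (61); needs m > 0 and B > 0."""
    w_in = math.sqrt(k**2 + m**2 * (A - B))
    w_out = math.sqrt(k**2 + m**2 * (A + B))
    w_minus = 0.5 * (w_out - w_in)
    return (2.0 * log_sinh(math.pi * w_minus / rho)
            - log_sinh(math.pi * w_in / rho)
            - log_sinh(math.pi * w_out / rho))

# (i) Small total expansion: doubling B should multiply |beta|^2 by about 4.
ratio = math.exp(log_beta_sq(1.0, 1.0, 2.0, 0.02, 1.0)
                 - log_beta_sq(1.0, 1.0, 2.0, 0.01, 1.0))
print("B-scaling ratio:", ratio)

# (ii) Slow expansion: log |beta|^2 should approach -2*pi*w_in/rho.
rho = 0.01
w_in = math.sqrt(1.0 + 1.0 * (2.0 - 1.0))
print(log_beta_sq(1.0, 1.0, 2.0, 1.0, rho), -2.0 * math.pi * w_in / rho)
```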

5. The Adiabatic Vacuum

In spacetimes with static in or out regions, comoving detectors with high
probability detect no quanta in high energy modes in the in or out vacuum states.
In spacetimes without such regions, one would like to select those exact modes
that come closest to the Minkowski space limit. The idea of the adiabatic vacuum
is an attempt to construct a state with minimal particle production in a changing
geometry, and it involves a high mass expansion of the field modes. Consider
the spacetime

$$ds^2 = C(\eta)\left(d\eta^2 - \sum_{i=1}^{n-1}(dx^i)^2\right), \tag{62}$$

where $C(\eta)$ is a $C^\infty$ function of conformal time $\eta$. The modes of the Klein-Gordon
equation are

$$u_k = (2\pi)^{(1-n)/2}\,C^{(2-n)/4}\,e^{i\mathbf{k}\cdot\mathbf{x}}\,\chi_k(\eta); \qquad k = |\mathbf{k}|, \tag{63}$$

where for conformal coupling $\chi_k$ satisfies

$$\frac{d^2\chi_k(\eta)}{d\eta^2} + \omega_k^2(\eta)\,\chi_k(\eta) = 0, \tag{64}$$

$$\omega_k^2(\eta) = k^2 + C(\eta)\,m^2. \tag{65}$$

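Equation (64) represents a simple harmonic oscillator with a time-dependent
frequency, whose solution in the WKB approximation can be written as

$$\chi_k = (2W_k)^{-1/2}\exp\left[-i\int^\eta W_k(\eta')\,d\eta'\right], \tag{66}$$

where $W_k$ satisfies the nonlinear equation (dots denote derivatives with respect to $\eta$)

$$W_k^2(\eta) = \omega_k^2(\eta) - \frac12\left(\frac{\ddot{W}_k}{W_k} - \frac32\frac{\dot{W}_k^2}{W_k^2}\right). \tag{67}$$

If spacetime is slowly varying, the derivative terms in (67) are negligible ($\ll \omega_k^2$),
so a zeroth order approximation is

$$W_k^{(0)}(\eta) = \omega_k(\eta). \tag{68}$$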
This solution tends to Minkowski modes as $C(\eta) \to$ constant. Solutions may be
obtained by iteration using $W_k^{(0)}$ as the lowest order. To clarify the slowness
property, introduce a parameter $T$, the adiabatic parameter. Replace $\eta$ by
$\eta_1 = \eta/T$ in all expressions and take $T = 1$ at the end. The adiabatic limit of slow
expansion is then given by $T \to \infty$ and (64) becomes

$$\frac{d^2\chi_k}{d\eta_1^2} + T^2\omega_k^2(\eta_1)\,\chi_k = 0. \tag{69}$$

Trivially,

$$\frac{d}{d\eta}C(\eta_1) = \frac{1}{T}\frac{d}{d\eta_1}C(\eta_1), \tag{70}$$

so that as $T \to \infty$, $C(\eta_1)$ and all its derivatives with respect to $\eta$ vary infinitely
slowly. Thus, the effect of a slowly varying $C(\eta)$ is given by a large-$T$ approximation.
A term of the order $T^{-n}$ in inverse powers of $T$ is called the $n$th adiabatic order.
From (70), the adiabatic order is equivalent to the number of derivatives of $C$. The
$A$th order adiabatic approximation of $\chi_k$ is written as $\chi_k^{(A)}$ and the associated
modes as $u_k^{(A)}$.
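The counting of adiabatic orders can be made concrete with one iteration of the nonlinear WKB equation (67), $W^2 = \omega^2 - \tfrac12\left(\ddot{W}/W - \tfrac32\dot{W}^2/W^2\right)$, starting from $W^{(0)} = \omega_k$. The sketch below, which assumes the $\tanh$ profile of the toy model of Section 4 purely for illustration, evaluates the resulting second-order shift $W^{(2)2} - \omega_k^2$ at a fixed value of $\eta_1 = \eta/T$ and shows that it scales as $T^{-2}$, as expected for a term containing two derivatives of $C$.

```python
import math

def omega_and_derivs(eta, k, m, A, B, rho, T):
    """omega_k and its first two eta-derivatives for C = A + B*tanh(rho*eta/T)."""
    x = rho * eta / T
    th = math.tanh(x)
    sech2 = 1.0 - th * th
    C = A + B * th
    dC = B * (rho / T) * sech2
    d2C = -2.0 * B * (rho / T) ** 2 * sech2 * th
    w = math.sqrt(k * k + m * m * C)
    dw = m * m * dC / (2.0 * w)
    d2w = m * m * d2C / (2.0 * w) - (m * m * dC) ** 2 / (4.0 * w**3)
    return w, dw, d2w

def second_order_shift(eta1, T, k=1.0, m=1.0, A=2.0, B=1.0, rho=1.0):
    """W^(2)^2 - omega^2 after one iteration of the WKB equation, at eta = T*eta1."""
    w, dw, d2w = omega_and_derivs(T * eta1, k, m, A, B, rho, T)
    return -0.5 * (d2w / w - 1.5 * (dw / w) ** 2)

for T in (1.0, 2.0, 4.0):
    print(T, second_order_shift(0.3, T) * T**2)   # the product is independent of T
```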
In general, one can write the following relation

$$u_k = \alpha_k^{(A)}(\eta)\,u_k^{(A)} + \beta_k^{(A)}(\eta)\,u_k^{(A)*}, \tag{71}$$

defining an exact field mode in terms of the adiabatic approximation $u_k^{(A)}$.
Clearly, $\alpha_k^{(A)}$ and $\beta_k^{(A)}$ must be constant to order $A$, since $u_k^{(A)}$ and $u_k$ are solutions
of the field equations to this order. Make the particular choice

$$\alpha_k^{(A)}(\eta_0) = 1 + O(T^{-(A+1)}), \qquad \beta_k^{(A)}(\eta_0) = 0 + O(T^{-(A+1)}) \tag{72}$$

at some fixed time $\eta_0$. Then $\alpha$ and $\beta$ are given by (72) for all time. The modes
defined by (71) and (72) are then said to be 'adiabatic positive frequency modes' to
adiabatic order $A$.
Note: (1) The modes $u_k$ are not adiabatic approximations but exact, and they are
quantized. The associated vacuum is a good candidate for the vacuum state only
'less equal' than the adiabatic vacuum of $(A + 1)$th order.
(2) No unique $A$th order adiabatic vacuum exists, as the matching may take
place at any $\eta_0$. These will differ by terms of a higher adiabatic order and are
equally respectable. All these will have a similar high energy behaviour but differ
in the content of low energy modes.
(3) In static regions, all terms of an adiabatic order greater than zero in $u_k^{(A)}$
vanish. Hence, such modes are of adiabatic positive frequency to infinite order.
Thus, the $\beta$ Bogolubov coefficient must fall off faster than any inverse power of
$T$ as $T \to \infty$, so that the particle number associated with such modes is an
adiabatic invariant during the expansion.
(4) The adiabatic vacuum is the best that is available if the spacetime has no
static in or out regions, as in the real Universe.


References
1. N. D. Birrell and P. C. W. Davies, Quantum Fields in Curved Space, Cambridge Univ. Press, Cambridge (1982);
L. Parker, in F. P. Esposito and L. Witten (eds.), Asymptotic Structure of Spacetime, Plenum, NY (1977);
G. Gibbons, in S. W. Hawking and W. Israel (eds.), General Relativity - An Einstein Centenary Survey, Cambridge Univ. Press, Cambridge (1979);
C. J. Isham, in M. Papagiannis (ed.), Eighth Texas Symposium in Relativistic Astrophysics;
V. P. Frolov, Sov. Phys. Uspekhi 19, 244 (1976).
2. J. V. Narlikar, in A. R. Prasanna, J. V. Narlikar and C. V. Vishveshwara (eds.), Gravitation and Relativistic Astrophysics, World Scientific, Singapore (1984).
3. R. M. Wald, General Relativity, Univ. Chicago Press, Chicago (1984).
4. L. Parker, Fundamentals of Cosmic Physics 7, 201 (1982) and references therein.
5. C. Bernard and A. Duncan, Ann. Phys. NY 107, 201 (1977).
6. See N. D. Birrell and P. C. W. Davies, op. cit.

16. Zeta Function Regularization and Effective Action in Curved Spacetime
D. LOHIYA
Department of Physics and Astrophysics, Delhi University,
New Delhi 110007, India.

In most quantum field theories (QFTs), we encounter irritating problems of
divergences. Several techniques have been developed in the past to separate out
a finite 'physically sensible' part from the divergent mess. The mess is usually
absorbed into the theory by a suitable renormalization of the coupling constants.
Over the years, a number of elaborate methods have been developed to carry out
this procedure. Simple normal ordering, which removes a divergent zero-point
energy in flat spacetime, cannot be justified in general relativity. Unlike in the flat
spacetime case, the absolute value of energy does have a meaning in general relativity.
To be honest, although methods like mode-sum cutoff, De
Witt-Schwinger proper time expansion of the Green's function, adiabatic
regularization, covariant point splitting, dimensional regularization, etc., are
a lot more than mere psychedelic sounding expressions, I personally am very
biased and committed to the use of the zeta function regularization scheme. This
bias is perhaps due to my own subjective notion of the elegance and compactness that
this scheme offers. The plan of this article is as follows:
(1) Properties of Riemann zeta function (its 'regularization', i.e. its analytic
continuation).
(2) Applications to calculate vacuum energy (Casimir effect).
(3) Renormalization of the effective action.
(4) Conformal (trace) anomaly and the evaluation of the stress energy tensor.
(5) The effective potential of a quantum field in a de Sitter universe.
1. The Riemann Zeta Function

$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s}, \qquad s = \sigma + it; \quad \sigma, t \in \mathbb{R}.$$

This is a uniformly convergent series of analytic functions, converging for
$\sigma \ge 1 + \delta$. It is the special case $a = 1$ of $\zeta(s, a)$ ($0 < a \le 1$):

$$\zeta(s, a) = \sum_{n=0}^{\infty}\frac{1}{(a + n)^s}.$$

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 315-341.
© 1989 by Kluwer Academic Publishers.

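Now

$$\int_0^\infty x^{s-1}e^{-(n+a)x}\,dx = (a + n)^{-s}\int_0^\infty y^{s-1}e^{-y}\,dy = (a + n)^{-s}\,\Gamma(s); \qquad \sigma > 0,\ \arg(x) = 0\ (x \in \mathbb{R}).$$

Therefore,

$$\Gamma(s)\,\zeta(s, a) = \lim_{N\to\infty}\left\{\int_0^\infty\frac{x^{s-1}e^{-ax}}{1 - e^{-x}}\,dx - \int_0^\infty\frac{x^{s-1}e^{-(N+1+a)x}}{1 - e^{-x}}\,dx\right\}.$$

Now for $x \ge 0$, $e^x \ge 1 + x$. Therefore the

$$|\text{second integral}| \le \int_0^\infty x^{\sigma-2}e^{-(N+a)x}\,dx = (N + a)^{1-\sigma}\,\Gamma(\sigma - 1) \to 0 \quad\text{as } N \to \infty \quad\text{for } \sigma \ge 1 + \delta.$$

Hence, in this domain

$$\zeta(s, a) = \frac{1}{\Gamma(s)}\int_0^\infty\frac{x^{s-1}e^{-ax}}{1 - e^{-x}}\,dx.$$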

For this domain, consider a contour in the complex plane enclosing 0, not
containing the points $\pm 2n\pi i$ ($n = 1, 2, 3, \ldots$). Let the contour be a flexible string
and stretch the point P to infinity (see Figure 1).

Fig. 1.

Consider

$$\oint_C\frac{(-z)^{s-1}e^{-az}}{1 - e^{-z}}\,dz = \left[e^{i\pi(s-1)} - e^{-i\pi(s-1)}\right]\int_0^\infty\frac{x^{s-1}e^{-ax}}{1 - e^{-x}}\,dx.$$

Using $\Gamma(s)\Gamma(1 - s) = \pi/\sin\pi s$, this implies

$$\zeta(s, a) = -\frac{\Gamma(1 - s)}{2\pi i}\oint_C\frac{(-z)^{s-1}e^{-az}}{1 - e^{-z}}\,dz. \tag{A}$$

But this integral is an analytic function of $s$ for all values of $s$. Therefore, the
singularities of $\zeta(s, a)$ are at the singularities of $\Gamma(1 - s)$, i.e. at $s = 1, 2, \ldots$ Since
$\zeta(s, a)$ is analytic for $\sigma \ge 1 + \delta$, the only singularity of this analytic continuation
of $\zeta(s, a)$ is at the point $s = 1$.
The integral (A) offers a representation of $\zeta(s, a)$ valid over the whole of the
complex $s$-plane, except at the point $s = 1$. Now, at $s = 1$,

$$\oint_C\frac{e^{-az}}{1 - e^{-z}}\,dz = 2\pi i\left[\frac{1}{2\pi i}\oint_C\frac{e^{-az}}{1 - e^{-z}}\,dz\right].$$

Therefore, by the Cauchy theorem, the expression in the brackets is the residue of
the integrand at $z = 0$. But the residue is 1, since

$$\lim_{z\to 0}\frac{z\,e^{-az}}{1 - e^{-z}} = 1.$$

Thus,

$$\lim_{s\to 1}\frac{\zeta(s, a)}{\Gamma(1 - s)} = -1.$$

As $\Gamma(1 - s)$ has a simple pole at $s = 1$ with residue $-1$, $\zeta(s, a)$ has
a simple pole at $s = 1$ with residue $+1$. Finally, consider

$$-\frac{1}{2\pi i}\oint_{C'}\frac{(-z)^{s-1}e^{-az}}{1 - e^{-z}}\,dz.$$

The integrand is single valued and analytic except at $\pm 2n\pi i$, where it has simple
poles. Therefore

$$\frac{1}{2\pi i}\oint_C - \frac{1}{2\pi i}\oint_{C'} = \sum_{n=1}^{N}(R_n + R_n')$$

(see Figure 2).

Fig. 2.

$R_n$ and $R_n'$ being the residues at $2n\pi i$ and $-2n\pi i$, respectively,

$$R_n = (2n\pi)^{s-1}e^{-\frac12\pi i(s-1)}e^{-2n\pi ai}, \qquad R_n' = (2n\pi)^{s-1}e^{\frac12\pi i(s-1)}e^{2n\pi ai}.$$

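Now for $0 < a \le 1$, there exists $K$ such that $|e^{-az}(1 - e^{-z})^{-1}| < K$,
independent of $N$ ($N$ large), with $z$ on $C'$. Thus

$$\left|\frac{1}{2\pi i}\oint_{C'}\frac{(-z)^{s-1}e^{-az}}{1 - e^{-z}}\,dz\right| < \frac{K}{2\pi}\int_{-\pi}^{\pi}\left|\{(2N + 1)\pi\}^{s}e^{is\theta}\right|d\theta < K\{(2N + 1)\pi\}^{\sigma}e^{\pi|s|} \to 0 \quad\text{as } N \to \infty \quad\text{for } \sigma < 0.$$

Therefore, for $\sigma < 0$,

$$\zeta(s, a) = \frac{2\Gamma(1 - s)}{(2\pi)^{1-s}}\sum_{n=1}^{\infty}\frac{\sin\left(\tfrac12 s\pi + 2\pi an\right)}{n^{1-s}}.$$

Thus, for $a = 1$ (and after replacing $s$ by $1 - s$),

$$\zeta(s)\,2^{1-s}\,\Gamma(s)\cos\frac{\pi s}{2} = \pi^s\,\zeta(1 - s).$$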
Both sides being analytic functions of $s$ (except for isolated values of $s$ where they
have poles), this equation, proved for $\sigma < 0$, holds for all $s$ except those isolated
values. This expression can be used to determine $\zeta$ for negative arguments, given
$\zeta$ for a positive argument where the series converges. Special values of $\zeta$ are

$$\zeta(-2m) = 0; \qquad \zeta(0) = -\tfrac12; \qquad \zeta(1 - 2m) = (-1)^m\frac{B_m}{2m}, \quad m = 1, 2, 3, \ldots,$$

$$\lim_{s\to 1}\left\{\zeta(s) - \frac{1}{s - 1}\right\} = \gamma,$$

where the Bernoulli numbers follow the convention of [1] ($B_1 = 1/6$, $B_2 = 1/30$, ...).
These expressions, and the proof of the result that for $s = -m$, $\zeta(s)$ is the
coefficient of $z^{1-s}$ in the expansion of $(-1)^s\,m!\,z/(e^z - 1)$, are given in Whittaker
and Watson [1].
To conclude, we have not given a prescription to sum a divergent series.
However, we have seen that the series under consideration has a parameter $s$.
There is a domain of $s$ for which the series converges and is expressible as an
analytic function which has a finite value even for values of $s$ outside the
domain. We conjecture that whenever $\sum_n n^{-s}$ for $s < 1$ occurs in a physical
problem, we could reformulate the problem in terms of the analytic continuation
described. We now demonstrate some applications in the following problems.
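The functional equation above can be used numerically to obtain $\zeta$ at negative arguments from the convergent series. The sketch below is a minimal illustration under stated assumptions: the convergent side is evaluated as a partial sum plus the first Euler-Maclaurin tail terms (a standard acceleration that is not part of the original text), and the function names are hypothetical.

```python
import math

def zeta_convergent(s, N=1000):
    """zeta(s) for real s > 1: partial sum plus leading Euler-Maclaurin tail terms."""
    partial = sum(n ** (-s) for n in range(1, N + 1))
    tail = N ** (1 - s) / (s - 1) - 0.5 * N ** (-s) + s * N ** (-s - 1) / 12.0
    return partial + tail

def zeta_at_one_minus(s):
    """zeta(1 - s) for s > 1, from 2^(1-s) Gamma(s) zeta(s) cos(pi s/2) = pi^s zeta(1-s)."""
    return (2.0 ** (1 - s) * math.gamma(s) * zeta_convergent(s)
            * math.cos(math.pi * s / 2.0) / math.pi ** s)

print(zeta_at_one_minus(2))   # zeta(-1), to be compared with -1/12
print(zeta_at_one_minus(4))   # zeta(-3), to be compared with 1/120
```

These are the two values used in the applications that follow: $\zeta(-1) = -1/12$ and $\zeta(-3) = 1/120$.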
2. Applications

2.1. Casimir Energy Density in a Flat $R^1 \times S^1$ Spacetime Due to a Massless Scalar Field

The spacetime is the usual two-dimensional Minkowski spacetime with a closed
spatial section, the points $x$ and $x + L$ being identified. The effect of spatial
closure is to restrict the field modes to a discrete set

$$u_k = (2L\omega)^{-1/2}\exp i(kx - \omega t), \qquad k = \frac{2\pi n}{L}, \quad n = 0, \pm 1, \pm 2, \ldots$$

(Note that a general plane wave mode in a higher-dimensional space is
$u_k \sim (2\omega)^{-1/2}\exp i(\mathbf{k}\cdot\mathbf{x} - \omega t)$, with the normalization having a factor of $(2\pi)^{-1/2}$
for every continuum projection of $\mathbf{k}$ and a factor of $L^{-1/2}$ for every periodic
$k_i = 2\pi n_i/L$.)
Consider the expression

$$T_{tt} = T_{xx} = \tfrac12(\partial_t\phi)^2 + \tfrac12(\partial_x\phi)^2.$$

Expressing

$$\phi(t, x) = \sum_k\left[a_k u_k(t, x) + a_k^\dagger u_k^*(t, x)\right]$$

and using

$$[a_k, a_{k'}] = [a_k^\dagger, a_{k'}^\dagger] = 0, \qquad [a_k, a_{k'}^\dagger] = \delta_{kk'},$$

gives

$$\langle 0|T_{tt}|0\rangle = \frac{1}{2L}\sum_k |k| = \frac{2\pi}{L^2}\sum_{n=0}^{\infty} n.$$

This may be re-expressed as a $\zeta(s)$-function analytically continued back to
$s = -1$.
The vacuum energy density is

$$\frac{2\pi}{L^2}\,\zeta(-1) = -\frac{\pi}{6L^2};$$

thus a cloud of negative vacuum energy is distributed uniformly throughout this
$R^1 \times S^1$ universe, with total energy $-\pi/6L$.
2.2. Casimir Effect

Consider two large parallel conducting plates at a distance $a \ll L$. We want to
calculate the vacuum energy per unit surface area of the conductor. Its derivative
would be the force per unit surface area. By dimensional argument, this is
proportional to $\hbar c/a^4$.
The total energy is

$$\int d^3x\,\langle 0|T_{tt}|0\rangle = L^2 a\,\langle 0|T_{tt}|0\rangle,$$

since we will show that $\langle 0|T_{tt}|0\rangle$ is independent of $x$. The modes are

$$u_k = \frac{1}{(2\omega)^{1/2}\,2\pi\,a^{1/2}}\exp i(\mathbf{k}\cdot\mathbf{x} - \omega t), \qquad k_\perp = \frac{n\pi}{a}.$$

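(Recalling two factors of $1/(2\pi)^{1/2}$ for the continuous values of the $k$ projections along the
conductors.)
Repeating the same steps for $\langle 0|T_{tt}|0\rangle$, we get

$$E = \frac{\hbar c}{2}\,L^2\int\frac{d^2k_\parallel}{(2\pi)^2}\sum_{n=-\infty}^{\infty}\left[k_\parallel^2 + \left(\frac{n\pi}{a}\right)^2\right]^{1/2} = \frac{\hbar c}{2}\,L^2\int\frac{d^2k_\parallel}{(2\pi)^2}\left[|k_\parallel| + 2\sum_{n=1}^{\infty}\left(k_\parallel^2 + \left(\frac{n\pi}{a}\right)^2\right)^{1/2}\right].$$

The first term is independent of $a$ and is thus not relevant for calculating the force per
unit area. Further, in the second term, $k$ can be integrated over to yield

$$E \simeq \frac{\hbar c}{2}\,\frac{L^2}{(2\pi)^2}\,(2\pi)\,\frac23\sum_{n=1}^{\infty}\left.\left\{k^2 + \left(\frac{n\pi}{a}\right)^2\right\}^{3/2}\right|_0^\infty.$$

Now we hand-wave a bit and note that for large $n$ (small wavelength), the perfect
conductor approximation would break down when $\lambda \lesssim$ size of an atom. So $n$ has
a natural upper limit. This means that as $k \to \infty$, $E$ gets a contribution
independent of $a$. The only surviving $a$-dependent term is

$$E = -\frac{\hbar c}{2}\,L^2\,\frac{\pi^2}{3a^3}\sum_{n=1}^{\infty} n^3.$$

The infinite sum can be analytically continued to $\zeta(-3) = 1/120$. Thus, the
regularized energy per unit surface area is

$$\frac{E}{L^2} = -\frac{\hbar c\,\pi^2}{720\,a^3}$$

and the force per unit area becomes

$$-\frac{\partial}{\partial a}\left(\frac{E}{L^2}\right) = -\frac{\hbar c\,\pi^2}{240\,a^4}, \quad\text{of magnitude}\quad \frac{0.013}{(a/\mu{\rm m})^4}\ \frac{\rm dyn}{{\rm cm}^2}.$$

(Refer to reference [2] to see how complicated other methods are.)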
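As a numerical cross-check of the result quoted above, the following sketch evaluates the regularized energy per unit area and the force per unit area for a plate separation of one micron, and converts the pressure to dyn/cm$^2$. The SI values of $\hbar$ and $c$ and the function names are assumptions supplied for the illustration.

```python
import math

HBAR = 1.054571817e-34   # J s (assumed SI value)
C_LIGHT = 299792458.0    # m / s

def casimir_energy_per_area(a):
    """Regularized vacuum energy per unit plate area, E/L^2 = -hbar c pi^2 / (720 a^3), in J/m^2."""
    return -HBAR * C_LIGHT * math.pi**2 / (720.0 * a**3)

def casimir_pressure(a):
    """Force per unit area, -d(E/L^2)/da = -hbar c pi^2 / (240 a^4), in Pa (negative: attractive)."""
    return -HBAR * C_LIGHT * math.pi**2 / (240.0 * a**4)

a = 1.0e-6                                  # plate separation of 1 micron
pressure_cgs = casimir_pressure(a) * 10.0   # 1 Pa = 10 dyn/cm^2
print(pressure_cgs)                         # magnitude close to 0.013 dyn/cm^2

# Consistency check: the pressure is minus the derivative of the energy per area.
h = 1.0e-9
numeric = -(casimir_energy_per_area(a + h) - casimir_energy_per_area(a - h)) / (2.0 * h)
print(numeric / casimir_pressure(a))        # close to 1
```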

2.3. Vacuum Energy in Einstein Static Universe

Our next application of zeta-function regularization is to compute the vacuum
energy of a massless conformally coupled scalar field in a closed static Einstein
universe described by

$$ds^2 = dt^2 - a^2\,h_{ij}\,dx^i dx^j;$$

$$h_{ij}\,dx^i dx^j = (1 - r^2)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2) = d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2), \qquad 0 \le \chi \le \pi.$$

It can be shown that the time dependence is separable as

$$\chi_k = \frac{1}{(2\omega_k)^{1/2}}\exp(-i\omega_k t/a),$$

where $\omega_k = |k| = 1, 2, 3, \ldots$
Each such mode has an angular degeneracy labelled by $J$, the quantum number $J$ going
from zero to $k - 1$. Each angular mode has a further azimuthal degeneracy labelled by $m$,
going from $-J$ to $+J$. Evaluating

$$E = \int\langle 0|T_0^0|0\rangle\,d^3x\,h^{1/2}$$

after expressing $T_{\mu\nu}$ in terms of the field as before, we get

$$E = \sum_{\text{all modes}}\frac12\left(\frac{\omega_k}{a}\right) = \frac12\sum_k\sum_{J=0}^{k-1}(2J + 1)\,\frac{k}{a} = \frac12\sum_{k=1}^{\infty}\frac{k^3}{a}.$$

The sum is again analytically continuable to $\zeta(-3) = \tfrac{1}{120}$. Thus, $E = (240a)^{-1}$.
The proper volume of the space is $2\pi^2a^3$. Thus, the energy density is
$\rho = (480\pi^2a^4)^{-1}$. The stress tensor due to a scalar field has an anomaly (to be
shown) proportional to surface terms and derivatives of the scalar curvature.
These vanish for the closed Einstein universe. Therefore, the stress tensor is traceless and the complete renormalized stress tensor is

$$\langle 0|T_\mu^\nu|0\rangle = \frac{1}{480\pi^2a^4}\,{\rm diag}\left(1, -\tfrac13, -\tfrac13, -\tfrac13\right).$$

This was first obtained by Ford in 1976 using a mode sum cut-off method.

3. Path Integral Formulation for QFT in CST

Let us now come to the prescription for doing QFT in curved spacetime (CST).
We shall be interested in globally hyperbolic manifolds. These are manifolds
which contain a Cauchy surface, that is, a space-like submanifold such that each
causal curve in the manifold intersects it once and only once. Now the basic
prescription that carries over from flat spacetime analysis to our case is:
(a) Replace the flat space measure $d^nx$ for an $n$-dimensional spacetime by
$(-g(x))^{1/2}\,d^nx$.
(b) Replace $\delta^n(x - y)$ by $(-g(y))^{-1/2}\,\delta^n(x - y)$, i.e.

$$\int d^nx\,\delta^n(x - y) = 1 \quad\to\quad \int d^nx\,(-g(x))^{1/2}\,\delta^n(x - y)\,(-g(y))^{-1/2} = 1.$$

(c) In flat spacetime, the Feynman Green's function satisfies (for a scalar field)

$$(\Box_x + m^2 - i\epsilon)\,G_F(x - y) = -\delta^n(x - y).$$

If we formally introduce a symmetric matrix $K_{xy}$ with continuous indices
$x, y$, such that

$$K_{xy} = -[G_F(x, y)]^{-1},$$

then it follows that

$$K_{xy} = [\Box_x + m^2 - i\epsilon]\,\delta^n(x - y).$$

[$\delta^n(x - y)$ is treated as a diagonal matrix.]
In curved spacetime, we have

$$K_{xy} = -[G_F(x, y)]^{-1} \quad\text{or}\quad -G_F(x, y) = K_{xy}^{-1},$$

with

$$K_{xy} = [\Box_x + m^2 - i\epsilon + \xi R]\,\delta^n(x - y)\,(-g(y))^{-1/2}.$$

We shall also be interested in Euclideanizing by making a clockwise rotation in
the complex $t$-plane. The prescription then is to remove $i\epsilon$ and the negative sign in
$(-g)^{1/2}$.
The second-order operator would be different for fields of various spins.
The starting point of the path integral approach to any quantum field theory is
the amplitude

$$Z = \langle g_2, \phi_2, S_2|g_1, \phi_1, S_1\rangle$$

to go from a state with metric $g_1$ and matter fields $\phi_1$ on a surface $S_1$ to a state
with metric $g_2$ and matter fields $\phi_2$ on a surface $S_2$. This is expressed as a sum over
all field configurations $g$ and $\phi$ which take on the given values on the surfaces $S_1$
and $S_2$. More precisely,

$$\langle g_2, \phi_2, S_2|g_1, \phi_1, S_1\rangle = \int D[g, \phi]\exp[iI(g, \phi)],$$

where $D[g, \phi]$ is a measure on the space of all field configurations $g$ and $\phi$, $I(g, \phi)$ is
the action of the fields, and the integral is taken over all fields with the given values
on $S_1$ and $S_2$. For our purpose, $g$ is strictly fixed if we restrict ourselves to QFT in
a fixed curved spacetime, rather than doing quantum gravity. For real Lorentz
metrics (i.e. metrics with signature $- + + +$) and real matter fields $\phi$, the
action is real and so the path integral will oscillate and not converge. In ordinary
QFT in flat spacetime, this is dealt with by rotating the time axis $90^\circ$ clockwise in
the complex $t$ plane, i.e. $t \to -i\tau$, introducing a factor of $-i$ in the volume integral
for the action $I$, e.g. for a scalar field of mass $m$

$$I = \int d^4x\left[-\tfrac12\phi_{,a}\phi_{,b}\,g^{ab} - \tfrac12 m^2\phi^2\right]$$

implies

$$I = i\int d^4x_E\left[\tfrac12\phi_{,a}\phi_{,b}\,g_E^{ab} + \tfrac12 m^2\phi^2\right] \equiv i\hat{I}.$$

Thus, the path integral becomes

$$\int D[\phi]\exp[-\hat{I}(\phi)].$$

$\hat{I}$ is greater than or equal to zero for fields in the space $(\tau, x, y, z)$ and, thus,
a convergent integral results. At the end, we analytically continue back to the
Lorentzian section by rotating anticlockwise in the complex $t$ plane. In the
presence of gravity, the gravitational part of the action does not remain positive
definite and, in general, the Euclidean section also does not exist. These and
related problems will be discussed by Padmanabhan in Chapter 18.
Now that we have expressed the amplitude to propagate from an initial
configuration $|i, t_i\rangle$ at time $t_i$ to a final configuration $|f, t_f\rangle$, where $t_i$, $t_f$ label the
surfaces $S_1$ and $S_2$, it is tempting to point out, as a passing remark, an important
use of the Euclidean section, viz. to construct a canonical ensemble for the field in
question. In the Schrodinger picture, the amplitude is expressible as

$$\langle f|\exp(-iH(t_2 - t_1))|i\rangle.$$

Putting $t_2 - t_1 = -i\beta$, $i = f$, and summing over a complete orthonormal basis of
configurations $|n\rangle$ gives the partition function $Z = \sum_n\exp(-\beta E_n)$ of the fields at
a temperature $T = \beta^{-1}$ ($E_n$ being the energy of the state $|n\rangle$). Thus, the partition
function is expressible as a path integral over all fields that are real on the
Euclidean section and are periodic in the imaginary time coordinate with period $\beta$.
Now the Schwarzschild metric is

$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 - \frac{2M}{r}\right)^{-1}dr^2 + r^2\,d\Omega^2.$$

Putting $t = -i\tau$ converts it to a positive definite metric for $r > 2M$. We
know that $r = 2M$ is just a coordinate singularity. Under a transformation
$x = 4M(1 - 2M/r)^{1/2}$, the metric becomes

$$ds^2 = \left(\frac{x}{4M}\right)^2 d\tau^2 + \left(\frac{r^2}{4M^2}\right)^2 dx^2 + r^2\,d\Omega^2.$$

The coordinate singularity at $r = 2M$ gets transformed to $x = 0$, where it would appear as a conical
singularity. Regularity of the manifold at $x = 0$ is assured by identifying
$\tau$ with a period $8\pi M$. Thus, the Schwarzschild solution is periodic in imaginary
time with period $8\pi M$. It is a classical solution to the Euclidean equation and,
hence, contributes as a stationary-phase point in the path integral for the
partition function of an ensemble with temperature $T = 1/8\pi M$. This is the
essence of the black-hole radiance discovered by Hawking.
From the expression of the amplitude in terms of the path integral, one expects
the dominant contribution to the path integral to come from fields which are an
extremum of the action, i.e. a solution to the classical field equations. This must be
the case if we are to recover the classical results in the limit of a macroscopic
system. Neglecting the question of convergence, the action can be expanded in
a Taylor series about the classical field configurations,

$$\hat{I}[g, \phi] = \hat{I}[g_0, \phi_0] + \hat{I}_2[\tilde{g}, \tilde{\phi}] + \text{higher order terms},$$

where

$$g_{ab} = g_{0ab} + \tilde{g}_{ab}, \qquad \phi = \phi_0 + \tilde{\phi},$$

and $\hat{I}_2[\tilde{g}, \tilde{\phi}]$ is quadratic in the perturbations $\tilde{g}$ and $\tilde{\phi}$. Ignoring the higher-order
terms, we get

$$\ln Z = -\hat{I}[g_0, \phi_0] + \log\int D[\tilde{g}, \tilde{\phi}]\exp[-\hat{I}_2].$$

This is called the stationary phase, WKB or one-loop approximation. The first
term is the contribution of the classical (background) fields to $\log Z$ and the
second term is the one-loop term representing the effect of quantum fluctuations
around the background fields. To study a quantum field in curved spacetime, we
have to consider $\phi_0 = 0$.
$\hat{I}_2[\tilde{g}, \tilde{\phi}]$ splits into the gravitational and matter field quantum fluctuations, respectively,
$\hat{I}_2[\tilde{g}] + \hat{I}_2[\tilde{\phi}]$.
The one-loop term for the matter field can be expressed as

$$\hat{I}_2[\phi] = \frac12\int\phi\,A\,\phi\,(g_0)^{1/2}\,d^4x.$$

$A$ is a differential operator depending on the background metric. For boson
fields, $A$ is a second-order differential operator. (For a scalar field, for example, it
was just $\Box + m^2 + \xi R$.)
Let $\{\lambda_n, \phi_n\}$ be the eigenvalues and the corresponding eigenfunctions of $A$, with
$\phi_n = 0$ on the boundary of the manifold.
The eigenfunctions can be normalized so that

$$\int\phi_n\phi_m\,(g_0)^{1/2}\,d^4x = \delta_{mn}.$$

An arbitrary fluctuation, which is required (due to our boundary conditions) to
vanish on the boundaries $S_1$ and $S_2$, is expressible as a linear combination of the $\phi_n$:

$$\phi = \sum_n y_n\phi_n.$$

The measure on the space of fields is then expressible as

$$D[\phi] = \prod_n\mu\,dy_n$$

($\mu$: normalization factor with dimension of mass or $L^{-1}$). Then

$$Z_\phi = \int D[\phi]\exp[-\hat{I}_2(\phi)] = \prod_n\int\mu\,dy_n\exp\left(-\tfrac12\lambda_n y_n^2\right) = \prod_n\left(2\pi\mu^2\lambda_n^{-1}\right)^{1/2} = \left[\det\left(\frac{A}{2\pi\mu^2}\right)\right]^{-1/2}.$$

This is how the determinants of infinite-dimensional operators make their
appearance in QFT. And this is where our problems begin.
In the above, we have written the product of the eigenvalues (e.v.) of $A$ as
$\det A$. To explicitly exhibit the result as a determinant of a matrix, we introduce

$$K_{xy} = A_x\,\delta^n(x - y)\,(g(y))^{-1/2}.$$

(Recall $A_x = \Box_x + m^2 + \xi R$ for scalar fields.)

$$G_{xy} = -K_{xy}^{-1}.$$

This makes

$$\hat{I}_2 = \frac12\int d^nx\,d^ny\,(g_0(x))^{1/2}(g_0(y))^{1/2}\,\phi(x)\,K_{xy}\,\phi(y).$$

Introducing a space of vectors $|x\rangle$ normalized by

$$\langle x|x'\rangle = \delta^n(x - x')\,[g(x)]^{-1/2},$$

we define operators $K$, $G$ by

$$K_{xy} = \langle x|K|y\rangle, \qquad G_F(x, x') = \langle x|G_F|x'\rangle.$$

In terms of these quantities, the amplitude can be written as

$$Z \propto [\det(-G_F)]^{1/2}.$$

(Note

$$\hat{I}_2 \sim \sum_{x,y}\phi(x)\,K_{xy}\,\phi(y) \sim \phi^T K\,\phi$$

with $x, y$ continuously labelling the matrix element.) Expanding in terms of the
eigenfunctions of $K$ gives

$$Z \sim [\det\mu^{-2}K]^{-1/2} \sim [\det(-\mu^2G_F)]^{1/2},$$

i.e.

$$\log Z = \tfrac12\,{\rm Tr}(\ln(-\mu^2G_F)).$$

(${\rm Tr}(M)$, where $M$ is an operator on the space, is given by

$$\int d^nx\,(g)^{1/2}M_{xx} = \int d^nx\,(g(x))^{1/2}\langle x|M|x\rangle.$$

This can be used to prove that

$${\rm Tr}[\ln(-G_F)] = \log[\det(-G_F)].)$$

The determinant and trace of the operator $A$ and of the $G_F$ related to it diverge as
their eigenvalues increase without bound. One therefore has to adopt a regularization
technique. We form a generalized zeta function from the eigenvalues of the
operators,

$$\zeta_G(\nu) = \sum_n\lambda_n^{-\nu},$$

where $\lambda_n$ are the e.v.'s of $G$.

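Formally, this equals ${\rm Tr}\,(\text{operator})^{-\nu}$. To prove this, say for $G_F$, recall

$$K^{-1} = -G_F = \sum_m\frac{|m\rangle\langle m|}{\lambda_m},$$

where

$$\sum_m|m\rangle\langle m| = 1.$$

Then

$$K^\nu|n\rangle = \lambda_n^\nu|n\rangle.$$

Thus

$$(-G_F)^\nu = \sum_m\lambda_m^{-\nu}|m\rangle\langle m|,$$

$${\rm Tr}(-G_F)^\nu = \int d^4x\,(g(x))^{1/2}\sum_m\lambda_m^{-\nu}\langle x|m\rangle\langle m|x\rangle.$$

Using completeness gives

$${\rm Tr}(-G_F)^\nu = \sum_m\lambda_m^{-\nu} = \zeta(\nu).$$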

We shall now study representations of this function. It can be shown that this function can be analytically continued to a function of $\nu$ with poles only at $\nu = 2$ and $\nu = 1$. It is regular at $\nu = 0$:
$$\zeta'(\nu) = -\sum_n \ln\lambda_n\,[\lambda_n]^{-\nu},\qquad \zeta'(0) = -\sum_n \ln\lambda_n.$$
Therefore, one can formally define the regularized value of the determinant of an operator $A$ to be
$$\mathrm{Det}\,A = \exp(-\zeta_A'(0)).$$
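As a sanity check on this definition, the following minimal sketch applies it to a finite spectrum, where no analytic continuation is needed and $\exp(-\zeta'(0))$ must reduce to the ordinary product of eigenvalues. The spectrum and the function names are hypothetical, chosen only for illustration, and the derivative at $\nu = 0$ is estimated by a central difference.

```python
import math

def zeta(eigenvalues, nu):
    """Generalized zeta function: sum over n of lambda_n ** (-nu)."""
    return sum(lam ** (-nu) for lam in eigenvalues)

def zeta_prime_at_zero(eigenvalues, h=1e-5):
    """Central-difference estimate of d(zeta)/d(nu) at nu = 0."""
    return (zeta(eigenvalues, h) - zeta(eigenvalues, -h)) / (2.0 * h)

def zeta_determinant(eigenvalues):
    """Det A = exp(-zeta'(0)); for a finite spectrum this is the plain product."""
    return math.exp(-zeta_prime_at_zero(eigenvalues))

# Hypothetical positive spectrum, used purely as an illustration.
spectrum = [0.5, 1.0, 2.0, 3.5, 7.0]
direct = math.prod(spectrum)
via_zeta = zeta_determinant(spectrum)
print(direct, via_zeta)
```

For an infinite spectrum the sum defining $\zeta(\nu)$ only converges for sufficiently large $\mathrm{Re}\,\nu$, and the value at $\nu = 0$ has to be reached by analytic continuation, which this finite sketch does not attempt.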

The essence of the regularization procedure, as we saw for the Riemann zeta function, lies in one's belief that instead of defining the amplitude in the way we described, which, as we saw, gave rise to the divergent determinant of the operator $(-G_F)$, it is possible to formally define the amplitude as a suitable analytic continuation which gives a convergent result.
At this stage, we may find it useful to recall the prescription to switch from Euclidean to Lorentzian metrics. The $i$ in the action is restored, and the wave operator $A$ is replaced by its Lorentzian counterpart with a small parameter $i\epsilon$ sitting with the mass. In this section, it is useful to introduce the concept of effective action.
Recall that we are looking for a theory based on Einstein's field equation
$$R_{\mu\nu} - \tfrac12Rg_{\mu\nu} + \Lambda g_{\mu\nu} = -8\pi G\,T_{\mu\nu},$$
with $T_{\mu\nu}$ replaced by $\langle T_{\mu\nu}\rangle$, a quantum expectation value of fields treated quantum mechanically, the metric treated classically, and $\Lambda$, $G$ replaced by some bare constants $\Lambda_B$, $G_B$. These equations are derivable from the action $S = S_g + S_m$



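by the condition
$$\frac{2}{(-g)^{1/2}}\frac{\delta S}{\delta g^{\mu\nu}} = 0,$$
where
$$S_g = \int(-g)^{1/2}\,\frac{1}{16\pi G_B}\,[R - 2\Lambda_B]\,d^4x,\qquad T_{\mu\nu} = \frac{2}{(-g)^{1/2}}\frac{\delta S_m}{\delta g^{\mu\nu}}.$$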

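Thus, we need a quantity $W$, which we would call the effective action for quantized matter fields, such that
$$\frac{2}{(-g)^{1/2}}\frac{\delta W}{\delta g^{\mu\nu}} = \langle T_{\mu\nu}\rangle = \frac{\int D[\phi]\,T_{\mu\nu}\,e^{iS_m[\phi]}}{\int D[\phi]\,e^{iS_m[\phi]}}.$$
Recall that
$$Z = \langle 0_{\rm out}|0_{\rm in}\rangle = \int D[\phi]\,e^{iS_m[\phi]}$$
implies
$$\delta Z = i\int D[\phi]\,\delta S_m\,e^{iS_m[\phi]}.$$
Therefore
$$\frac{2}{(-g)^{1/2}}\frac{\delta\ln Z}{\delta g^{\mu\nu}} = i\,\langle T_{\mu\nu}\rangle,$$
i.e. $Z[0] = e^{iW}$. As
$$\log(-\mu^2G_F) = \lim_{\nu\to0}\frac{d}{d\nu}(-\mu^2G_F)^{\nu},$$
taking the trace of both sides and using $\mathrm{Tr}(-G_F)^{\nu} = \zeta_{G_F}(\nu)$ gives
$$W = -i\log Z = -\tfrac{i}{2}\left[\zeta'(0) + \zeta(0)\log\mu^2\right].$$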

As in the Riemann zeta function, the introduction of the generalized $\zeta$ function is understood as a purely formal operation. $\sum_m\lambda_m^{-\nu}$ does not converge in general for all values of $\nu$, but by analytic continuation from regions where it does converge, we can make sense of the $\zeta$-function. Recall
$$\Gamma(s)\,n^{-s} = \int_0^\infty x^{s-1}e^{-nx}\,dx.$$
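The integral representation just recalled can be checked numerically. The sketch below is only an illustration: the values of $s$ and $n$, the cutoff of the integration range, and the function name are arbitrary assumptions, and a simple midpoint rule stands in for a proper quadrature routine.

```python
import math

def mellin_integral(s, n, upper=40.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x**(s-1) * exp(-n*x) over [0, upper/n].

    The tail beyond upper/n is suppressed by exp(-upper) and is neglected.
    """
    x_max = upper / n
    h = x_max / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (s - 1.0) * math.exp(-n * x)
    return total * h

s, n = 2.5, 3.0                      # illustrative values with s > 1
lhs = math.gamma(s) * n ** (-s)      # Gamma(s) * n^(-s)
rhs = mellin_integral(s, n)
print(lhs, rhs)
```

The same identity, with $n$ replaced by the operator $K$ and $x$ by $is$, is what the text generalizes next.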

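By analogy, generalize (can be proved for matrix operators)
$$K^{-\nu}\,\Gamma(\nu) = \int_0^\infty (is)^{\nu-1}\,e^{-iKs}\,d(is).$$
In particular,
$$K^{-1} = \int_0^\infty e^{-iKs}\,d(is) = -G_F,$$
therefore,
$$K^{-\nu} = (-G_F)^{\nu} = [\Gamma(\nu)]^{-1}\int_0^\infty (is)^{\nu-1}\,e^{-iKs}\,i\,ds$$
and, therefore,
$$\mathrm{tr}(-G_F)^{\nu} = \zeta(\nu) = \int d^nx\,(-g(x))^{1/2}\,\langle x|(-G_F)^{\nu}|x\rangle = [\Gamma(\nu)]^{-1}\int d^nx\,(-g)^{1/2}\int_0^\infty (is)^{\nu-1}\,\langle x|e^{-iKs}|x\rangle\,i\,ds.$$
But
$$G_F(x,x') = \langle x|G_F|x'\rangle = -i\int_0^\infty ds\,\langle x|e^{-iKs}|x'\rangle.$$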

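This, as we shall see, gives us the expression for $\langle x|e^{-iKs}|x\rangle$ by looking at the coincidence limit of Green's function $G(x,x')$ as $x \to x'$. To consider this limit, consider Riemann normal coordinates $y^\mu$ at the point $x$ with origin at $x'$. By definition
$$g_{\mu\nu}(x') = \eta_{\mu\nu},\qquad \partial_\alpha g_{\mu\nu}(x') = 0,$$
$$g_{\mu\nu}(x) = \eta_{\mu\nu} + \tfrac13R_{\mu\alpha\nu\beta}\,y^\alpha y^\beta - \tfrac16R_{\mu\alpha\nu\beta;\gamma}\,y^\alpha y^\beta y^\gamma + \left[\tfrac1{20}R_{\mu\alpha\nu\beta;\gamma\delta} + \tfrac2{45}R_{\alpha\mu\beta\lambda}R^{\lambda}{}_{\gamma\nu\delta}\right]y^\alpha y^\beta y^\gamma y^\delta + \cdots,$$
all quantities being evaluated at the origin ($x'$). This is as far as one can go with general considerations and we now go over to a special case.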
For a massive scalar field for which Green's function satisfies
$$[\Box_x + m^2 + \xi R(x)]\,G_F(x,x') = -(-g(x))^{-1/2}\,\delta^4(x-x'),$$
define
$$\mathcal{G}_F(x,x') = (-g(x))^{1/4}\,G_F(x,x') = (2\pi)^{-4}\int d^4k\,e^{-iky}\,\mathcal{G}_F(k)$$
as its 'localized momentum space' Fourier expansion. $\mathcal{G}_F(k)$ can be solved for to any order of derivatives of the metric, e.g. to the first order
$$G_F(x,x') = \mathcal{G}_F(x,x') = (2\pi)^{-4}\int d^4k\,e^{-ik\cdot y}\,\mathcal{G}_F(k).$$
Operating by $\Box_x + m^2$, the left-hand side is just the four-dimensional $\delta$-function to the first order, so that $\mathcal{G}_F(k) = (k^2 - m^2)^{-1} = G_F(k)$ to the first order. To the second order, our wave equation reads
$$\left[(\eta^{\mu\nu} + \tfrac13R^{\mu}{}_{\alpha}{}^{\nu}{}_{\beta}\,y^\alpha y^\beta)\,\partial_\mu\partial_\nu + m^2 + \xi R\right]G(x,x') = -[-g(x)]^{-1/2}\,\delta^4(x-x').$$


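Now the diagonal terms of $g_{\mu\nu}$ are
$$\left(-1 + \tfrac13R_{0\alpha0\beta}\,y^\alpha y^\beta,\ \ 1 + \tfrac13R_{1\alpha1\beta}\,y^\alpha y^\beta,\ \ 1 + \tfrac13R_{2\alpha2\beta}\,y^\alpha y^\beta,\ \ 1 + \tfrac13R_{3\alpha3\beta}\,y^\alpha y^\beta\right),$$
the off-diagonal terms being $O(y^2)$. Thus, the determinant $(-g)$ (taking due care of the metric sign) is just $\approx 1 + \tfrac13y^\alpha y^\beta R_{\alpha\beta}$, so that
$$\mathcal{G}_F(x,x') = (1 + \tfrac1{12}R_{\alpha\beta}\,y^\alpha y^\beta)\,G_F(x,x'),$$
$$\mathcal{G}_F(k) = G_F(k) - \tfrac1{12}R_{\alpha\beta}\,\partial^\alpha\partial^\beta G_F(k),$$
with $\partial^\alpha = \partial/\partial k_\alpha$. Similarly
$$G_F(k) = \mathcal{G}_F(k) + \tfrac1{12}R_{\alpha\beta}\,\partial^\alpha\partial^\beta\mathcal{G}_F(k).$$
Now take the limit $y^\alpha \to 0$ and the Fourier transform of the $G(x,x')$ equation. Thus,
$$\mathcal{G}_F(k) = \frac{1}{k^2 - m^2} + (\xi - \tfrac16)\,R\,(k^2 - m^2)^{-2}.$$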

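Proceeding along these lines, with more algebra, we can get the expression up to the fourth-order derivatives of the metric:
$$\mathcal{G}_F(k) \approx (k^2-m^2)^{-1} + (\xi - \tfrac16)R\,(k^2-m^2)^{-2} + \tfrac{i}{2}(\tfrac16 - \xi)R_{;\alpha}\,\partial^\alpha(k^2-m^2)^{-2} - \tfrac13a_{\alpha\beta}\,\partial^\alpha\partial^\beta(k^2-m^2)^{-2} + \left[(\tfrac16-\xi)^2R^2 + \tfrac23a^{\lambda}{}_{\lambda}\right](k^2-m^2)^{-3},$$
with
$$a_{\alpha\beta} = \tfrac12(\xi - \tfrac16)R_{;\alpha\beta} + \tfrac1{120}R_{;\alpha\beta} - \tfrac1{40}R_{\alpha\beta;\lambda}{}^{\lambda} - \tfrac1{30}R_{\alpha}{}^{\lambda}R_{\lambda\beta} + \tfrac1{60}R^{\kappa}{}_{\alpha}{}^{\lambda}{}_{\beta}R_{\kappa\lambda} + \tfrac1{60}R^{\lambda\mu\kappa}{}_{\alpha}R_{\lambda\mu\kappa\beta}.$$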

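Substituting this in the Fourier expansion of $G(x,x')$ gives
$$\mathcal{G}_F(x,x') \approx \int\frac{d^4k}{(2\pi)^4}\,e^{-iky}\left[a_0(x,x') + a_1(x,x')\left(-\frac{\partial}{\partial m^2}\right) + a_2(x,x')\left(\frac{\partial}{\partial m^2}\right)^2\right](k^2-m^2)^{-1}$$
with
$$a_0(x,x') = 1,\qquad a_1(x,x') = (\tfrac16-\xi)R - \tfrac12(\tfrac16-\xi)R_{;\alpha}\,y^\alpha - \tfrac13a_{\alpha\beta}\,y^\alpha y^\beta,\qquad a_2(x,x') = \tfrac12(\tfrac16-\xi)^2R^2 + \tfrac13a^{\lambda}{}_{\lambda}.$$
As $y^\alpha \to 0$,
$$a_1 = (\tfrac16-\xi)R,$$
$$a_2 = \tfrac1{180}R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} - \tfrac1{180}R_{\alpha\beta}R^{\alpha\beta} - \tfrac16(\tfrac15-\xi)\Box R + \tfrac12(\tfrac16-\xi)^2R^2.$$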

Using the integral representation
$$(k^2 - m^2 + i\epsilon)^{-1} = -i\int_0^\infty ds\,e^{is(k^2 - m^2 + i\epsilon)},$$
the $d^4k$ and $ds$ integrations can be interchanged and the former can be performed explicitly: absorbing $(is)$ in $k^2$ gives $(is)^{-2}$ from $d^4k$, the Gaussian integrals give $\pi^2$, and the squares are completed by a factor
$$\exp(\tfrac12y_\alpha y^\alpha/2is) \equiv e^{\sigma/2is}.$$

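This gives
$$\mathcal{G}_F(x,x') = -i(4\pi)^{-2}\int_0^\infty i\,ds\,(is)^{-2}\exp\left[-im^2s + \frac{\sigma}{2is}\right]F(x,x';is),$$
where
$$F(x,x';is) = a_0(x,x') + a_1(x,x')\,is + a_2(x,x')\,(is)^2 + \cdots,$$
$$G_F(x,x') = -i(-g(x))^{-1/4}(4\pi)^{-2}\int_0^\infty i\,ds\,(is)^{-2}\,F(x,x';is)\,e^{-im^2s+\sigma/2is},$$
$$\langle x|e^{-iKs}|x'\rangle = i(4\pi)^{-2}(-g(x))^{-1/4}\,e^{-im^2s+\sigma/2is}\,(is)^{-2}\,F(x,x';is).$$
Taking the limit $x \to x'$ [$(-g)^{1/4} \to 1$, $\sigma \to 0$] and putting this in the expression for $\zeta(\nu)$ gives
$$\zeta(\nu) = i[\Gamma(\nu)]^{-1}(4\pi)^{-2}\int d^4x\,(-g(x))^{1/2}\int_0^\infty (is)^{\nu-3}\,F(x,x;is)\,e^{-im^2s}\,i\,ds.$$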


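Recalling that $m^2$ is understood as $(m^2 - i\epsilon)$, three integrations by parts may be performed for $\mathrm{Re}\,\nu > 2$ to give
$$\zeta(\nu) = \frac{-i(4\pi)^{-2}}{\Gamma(\nu+1)(\nu-1)(\nu-2)}\int d^4x\,(-g(x))^{1/2}\int_0^\infty (is)^{\nu}\,\frac{\partial^3}{\partial(is)^3}\left[F(x,x;is)\,e^{-im^2s}\right]i\,ds,$$
$$\zeta(0) = i(4\pi)^{-2}\int d^4x\,(-g(x))^{1/2}\left[\tfrac12m^4 - m^2a_1(x) + a_2(x)\right].$$
Further, $\zeta'(\nu)$ can be obtained to give
$$\zeta'(0) = i(4\pi)^{-2}\left(\gamma + \tfrac32\right)\int d^4x\,(-g)^{1/2}\left\{\tfrac12m^4 - m^2a_1(x) + a_2(x)\right\} - \tfrac{i}{2}(4\pi)^{-2}\int d^4x\,(-g)^{1/2}\int_0^\infty \ln(is)\,\frac{\partial^3}{\partial(is)^3}\left[F(x,x;is)\,e^{-im^2s}\right]i\,ds.$$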

Looking at the form of the gravitational action, it is clear that the $m^4$ and $m^2$ terms merely renormalize the cosmological and gravitational constants. The factor $a_2(x)$ is of the fourth order in derivatives of the metric and, thus, represents a higher-order correction to the general theory of relativity, which only contains terms up to the second derivatives of the metric. Now the $\Box R$ term is a total divergence which would not contribute to the field equations under a metric variation. Also
$$\int d^4x\,(-g)^{1/2}\left(R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} - 4R_{\alpha\beta}R^{\alpha\beta} + R^2\right)$$
is a topological invariant (generalized Gauss-Bonnet theorem due to Chern).
Therefore, one may only consider the $R^2$ and $R_{\alpha\beta}R^{\alpha\beta}$ terms. In case we wish to reproduce Einstein theory, all that we have to do is introduce these terms in the original Lagrangian with a bare coefficient and then the loop matter terms could be absorbed to yield renormalized coefficients whose values would be determined by experiment. In principle, there is no reason why one could not set these to zero. This leaves the second term in $\zeta'(0)$. Using
$$W = \int(-g(x))^{1/2}\,L_{\rm eff}(x)\,d^4x$$
implies
$$L_{\rm eff} = -\frac{1}{64\pi^2}\int_0^\infty \ln(is)\,\frac{\partial^3}{\partial(is)^3}\left[F(x,x;is)\,e^{-im^2s}\right]i\,ds + \text{any finite multiple of } a_0, a_1, a_2.$$

COMMENTS. (1) The choice of the scale $\mu$ changes the renormalized constants. The changes are finite. In practice, one chooses a fixed $\mu$ and uses the results of calculations with this value of $\mu$ to calibrate the instruments used to measure $\Lambda$, $G$ and the coefficients of the quartic terms. Once these are measured, further calculations using this value of $\mu$ and the measured values of the constants can be used to make predictions about the outcome of experiments. If $\mu$ is changed, one has to recalibrate the instruments or change $\Lambda$, $G$, etc. These changes leave the predictions for experiments invariant. Such an analysis of rescaling $\mu$ in interacting field theories in $M^4$ gives the renormalization group equations.
(2) $L_{\rm eff}$ has been obtained from an asymptotic expansion of $F$. Therefore, it cannot be regarded as a complete Lagrangian associated with the physical renormalized $\langle T_{\mu\nu}\rangle$. We shall come to this later.
(3) Note that we did not require any explicit infinite renormalization of the coupling constants, as is required in all other methods. The analytic continuation converts a manifestly infinite series into a finite result. Obviously an infinite term has been cleverly discarded in this formal procedure. The precise term can be exhibited by following precisely the same procedure as in the Riemann zeta function case done earlier. We note in passing that if, instead of representing the logarithm in
$$\log(-\mu^2G_F) = \lim_{\nu\to0}\frac{d}{d\nu}(-\mu^2G_F)^{\nu}$$
we represented it as
$$\lim_{\nu\to0}\left(\nu^{-1}(-\mu^2G_F)^{\nu} - \nu^{-1}\right),$$
we would have
$$W = -\tfrac{i}{2}\lim_{\nu\to0}\left[\nu^{-1}\zeta(0) + \zeta'(0) + \zeta(0)\log\mu^2\right].$$
The second term is the same as before, but the first term requires infinite renormalization. However, once this is done, the renormalized effective action is the same.
(4) For other (higher) spin fields, the procedure goes through without much change. From the expansion of the metric in terms of the normal coordinates and from the wave operator of the field, one obtains the coincident limit of Green's function. This then completely determines the $\zeta$-function and, hence, the one-loop term. The expansion of the Green's function (the $a_0$, $a_1$, $a_2$) for spin-$\tfrac12$ fields is exhibited in Birrell and Davies [3].
Finally, the in and out vacuum states enter here only formally. If one is interested in $\langle 0_{\rm out}|T_{\mu\nu}|0_{\rm out}\rangle$ or $\langle 0_{\rm in}|T_{\mu\nu}|0_{\rm in}\rangle$, then one can simply expand $\langle 0_{\rm out}|$ in terms of the in-states or vice versa, e.g.
$$\langle 0_{\rm in}|T_{\mu\nu}|0_{\rm in}\rangle = \frac{\langle 0_{\rm out}|T_{\mu\nu}|0_{\rm in}\rangle}{\langle 0_{\rm out}|0_{\rm in}\rangle} - i\sum_{i,j}\Lambda_{ij}\,T_{\mu\nu}(u^*_{{\rm in},i}, u^*_{{\rm in},j})$$
with
$$\Lambda_{ij} = -i\sum_k \beta_{kj}\,\alpha^{-1}_{ik}.$$
$\beta$, $\alpha$ are the Bogolubov coefficients discussed by Iyer in Chapter 15. The final term is finite, so that the divergences in $\langle 0_{\rm in}|T_{\mu\nu}|0_{\rm in}\rangle$ are the same as those for $\langle 0_{\rm out}|T_{\mu\nu}|0_{\rm in}\rangle$. It is interesting to note that the path integral for fermions, treated as anticommuting Grassmann variables, yields a determinant of the operator in the numerator rather than in the denominator as for boson fields. (For bosons, every Gaussian integration gives a factor $\lambda_n^{-1/2}$, which puts the determinant in the denominator.)
Thus, in the effective action, the traces occur with opposite signs, e.g.
$$W_{(0)} = -\tfrac{i}{2}\,\mathrm{Tr}(\ln(-G_F)),\qquad W_{(1/2)} = +\tfrac{i}{2}\,\mathrm{Tr}(\ln(-G_F)),$$
with $G_F$ the propagator of the respective field. This implies that if there are an equal number of boson and fermion spin states, the leading divergences due to $a_0$ (proportional to the volume of the space) cancel. Those due to $a_1$ would also cancel if the masses of bosons and fermions obey a simple relation. (The divergences are $-m^2a_1$; $a_1$ is the same for bosons and fermions, the only difference being the sign.) In practice, all $m$'s are zero in supergravity. The surviving term is just on account of the $a_2$ term. This cancellation of divergences is perhaps a good enough reason for taking the supersymmetric theories seriously.
To summarize, the $\zeta$-function gives a prescription for evaluating a finite effective action, yielding
$$R_{\mu\nu} - \tfrac12Rg_{\mu\nu} + \Lambda g_{\mu\nu} + a\,H^{(1)}_{\mu\nu} + b\,H^{(2)}_{\mu\nu} = -8\pi G\,\frac{\langle 0_{\rm out}|T_{\mu\nu}|0_{\rm in}\rangle}{\langle 0_{\rm out}|0_{\rm in}\rangle},$$
where
$$H^{(1)}_{\mu\nu} = \frac{1}{(-g)^{1/2}}\frac{\delta}{\delta g^{\mu\nu}}\int(-g)^{1/2}R^2\,d^4x$$
and
$$H^{(2)}_{\mu\nu} = \frac{1}{(-g)^{1/2}}\frac{\delta}{\delta g^{\mu\nu}}\int(-g)^{1/2}R_{\alpha\beta}R^{\alpha\beta}\,d^4x,$$
respectively.

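4. Conformal Anomalies

Consider a field theory which at the classical level is conformally invariant. A conformal transformation is defined by
$$g_{\mu\nu}(x) \to \Omega^2(x)\,g_{\mu\nu}(x) = \bar g_{\mu\nu}(x),$$
i.e.
$$\delta\bar g^{\mu\nu} = -2\,\Omega^{-1}\,\delta\Omega\;\bar g^{\mu\nu}.$$
From the definition of functional differentiation
$$S[\bar g] = S[g] + \int\frac{\delta S[\bar g]}{\delta\bar g^{\mu\nu}}\,\delta\bar g^{\mu\nu}\,d^4x$$
and
$$T_{\mu\nu} \equiv \frac{2}{(-g(x))^{1/2}}\frac{\delta S}{\delta g^{\mu\nu}}$$
we get
$$S[\bar g] = S[g] - \int(-\bar g)^{1/2}\,T_\rho{}^\rho[\bar g]\,\Omega^{-1}(x)\,\delta\Omega(x)\,d^4x.$$
Therefore
$$T_\rho{}^\rho[g(x)] = -\frac{\Omega(x)}{(-g)^{1/2}}\frac{\delta S[\bar g]}{\delta\Omega(x)}\bigg|_{\Omega=1}.$$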

Thus, if the classical action is invariant under conformal transformations, the stress-tensor is traceless. Conformal transformations being essentially a rescaling of lengths at each spacetime point, the presence of a mass or a length in the theory will break conformal invariance. Thus, one looks at the massless limit of the regularization procedures.
Consider the massless conformally invariant ($\xi = \tfrac16$) scalar field. Under
a change of scale, it can be easily argued that the effective action

transforms to

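Thus,
$$\langle T_\rho{}^\rho(x)\rangle = -\frac{\Omega(x)}{(-g)^{1/2}}\frac{\delta W}{\delta\Omega}\bigg|_{\Omega=1} = -\frac{\mu(x)}{(-g)^{1/2}}\frac{\delta W[\mu]}{\delta\mu}\bigg|_{\mu=1}.$$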

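In the massless limit
$$\zeta(0) = i(4\pi)^{-2}\int d^4x\,(-g)^{1/2}\,a_2(x),$$
giving
$$\langle T_\rho{}^\rho(x)\rangle = -\frac{a_2(x)}{16\pi^2}.$$
For $\xi = \tfrac16$,
$$a_2 = \tfrac1{180}\left[R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} - R_{\alpha\beta}R^{\alpha\beta} - \Box R\right],$$
$$\langle T_\rho{}^\rho(x)\rangle = -\frac{1}{2880\pi^2}\left[R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} - R_{\alpha\beta}R^{\alpha\beta} - \Box R\right].$$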

Again, this expression is much more easily derived by this method than it is by other methods. The trace for higher spin theories is again simply derived from the scaling behaviour of the determinant. The trace in arbitrary even dimensions is $\langle T_\rho{}^\rho\rangle = -a_{n/2}/(4\pi)^{n/2}$, e.g. for two dimensions, $\langle T_\rho{}^\rho\rangle = -a_1/4\pi = -R/24\pi$ (in two dimensions, $\xi = 0$ for conformal coupling). The trace anomaly becomes particularly important if the background spacetime is conformally flat. For a conformally invariant quantum field, the anomalous trace determines the entire stress energy tensor. Recall that under a conformal transformation
$$\delta\bar g^{\mu\nu} = -2\,\Omega^{-1}\,\delta\Omega\;\bar g^{\mu\nu};$$
using this and
$$\langle T_{\mu\nu}\rangle = \frac{2}{(-g)^{1/2}}\frac{\delta W}{\delta g^{\mu\nu}}$$
gives
$$W[\bar g] = W[g] - \int(-\bar g)^{1/2}\,\langle T_\rho{}^\rho[\bar g]\rangle\,\Omega^{-1}\,\delta\Omega\,d^4x.$$

Using
-pa

g bgP

_
(1

pfT

g bg P

(1

gives

Thus, if we know the stress energy tensor in a given background, together with its anomalous trace, then the stress energy tensor in a conformally related background is simply given by the variational integration over the conformal factor. In a general spacetime, such short cuts are not available and one has to resort to brute-force methods. An example is the Einstein static Universe, for which the anomalous trace vanishes.

5. Phase Transition in a De Sitter Universe [4]

According to the inflationary Universe scenario, the universe passed through an exponentially expanding phase provided by the vacuum energy of the Higgs scalar field in a grand unified theory. We shall now show how the vacuum energy of a gauge theory is calculable to one loop in de Sitter space, which is a 4-sphere of radius $a = \sqrt{3/\Lambda}$. The curvature tensors are
$$R_{\alpha\beta\gamma\delta} = \frac{1}{a^2}\left(g_{\alpha\gamma}g_{\beta\delta} - g_{\alpha\delta}g_{\beta\gamma}\right).$$

As shown earlier, for a constant background matter field $\phi_b$, the amplitude to one loop is given by
$$\log Z = -S[g_b,\phi_b] + \log\int d[\tilde\phi]\,\exp(-S_2[\tilde\phi]).$$
Defining an effective potential as
$$\exp[-\Omega\,V_{\rm eff}(\phi_b)] \equiv Z,$$
with $\Omega$ being the volume of the 4-sphere $= (8\pi^2a^4)/3$, and noting that for a constant background field the classical action is just the potential, e.g. for a scalar field
$$S[g_b,\phi_b] = \int\left[\tfrac12(\nabla^\mu\phi_b)(\nabla_\mu\phi_b) + V(\phi_b)\right]\sqrt{g}\,d^4x = V[\phi_b]\,\Omega.$$

If $Q$ is the second variational derivative operator ($\equiv -\Box + V''(\phi_b)$ for the scalar field), then following the previous analysis (expressing the fluctuations $\tilde\phi$ in terms of the eigenfunctions of the operator $Q$ and performing the Gaussian integration), we get
$$V_{\rm eff}(\phi_b) = V(\phi_b) \mp \frac{1}{\Omega}\log\left[\det\mu^{-2}Q(\phi_b)\right]^{1/2},$$
the $\mp$ being for fermions and bosons, respectively, and $\mu$ is the scale that enters in the definition of the measure.
The determinant of $Q$ can be written in terms of the eigenvalues $\lambda_n$ of $Q$ (with degeneracy $g_n$) as
$$\det Q = \prod_n(\lambda_n)^{g_n}.$$
This infinite product is given a sensible meaning by defining
$$\zeta(z) \equiv \sum_{n=0}^\infty g_n\lambda_n^{-z};$$
following analytic continuation procedures, this is defined for all $z$ except for a finite number of poles. Then
$$\zeta'(z)|_{z=0} = -\sum_n g_n\log\lambda_n,\qquad \mathrm{Det}\,Q = \exp[-\zeta'(z)|_{z=0}],\qquad \mathrm{Det}(\mu Q) = \mu^{\zeta(0)}\,\mathrm{Det}\,Q,$$
etc.
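On $S^4$, the eigenvalues and multiplicities of the various spin operators have very simple forms, which are given in Table I.

Table I.

Spin (L)   Eigenfunction     Condition                                                                                Operator                        $\lambda_n$
0          $\phi_n$          -                                                                                        $-\nabla^\mu\nabla_\mu + X$     $a^{-2}n(n+3) + X$
1/2        $\psi_n$          -                                                                                        $\gamma^\mu\nabla_\mu - X$      $\pm ia^{-1}(n+2) - X$
1          $A_n^\mu$         $\nabla_\mu A_n^\mu = 0$                                                                 $-\nabla^\nu\nabla_\nu + X$     $a^{-2}(n^2+5n+3) + X$
3/2        $\psi_n^\alpha$   $\nabla_\alpha\psi_n^\alpha = 0$, $\gamma_\alpha\psi_n^\alpha = 0$                       $\gamma^\mu\nabla_\mu - X$      $\pm ia^{-1}(n+3) - X$
2          $h_n^{\mu\nu}$    $h_n^{\mu\nu} = h_n^{\nu\mu}$, $h_{n\,\mu}{}^{\mu} = 0$, $\nabla_\mu h_n^{\mu\nu} = 0$   $-\nabla^\rho\nabla_\rho + X$   $a^{-2}(n^2+7n+8) + X$

Degeneracy: $g_n = \tfrac13(2L+1)(n+1)(n+L+\tfrac32)(n+2L+2)$.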

Note that the fermion eigenvalues occur in complex conjugate pairs. When multiplied to get the determinant, they would be 'squared up'. The general eigenvalues and multiplicities are
$$\lambda_n = n^2 + (2L+3)n + X = \left(n + L + \tfrac32 + \sqrt\Delta\right)\left(n + L + \tfrac32 - \sqrt\Delta\right),\qquad \Delta = (L+\tfrac32)^2 - X,$$
$$g_n = \tfrac13(2L+1)(n+1)(n+L+\tfrac32)(n+2L+2).$$
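The general formulas for the spectrum on $S^4$ lend themselves to a quick exact check. The sketch below assumes units with $a = 1$, the shift $L + \tfrac32$ and $\Delta = (L+\tfrac32)^2 - X$ as read from the factorized form, and uses hypothetical function names. It verifies that the factorized and expanded forms of $\lambda_n$ agree and that the degeneracy formula yields integers.

```python
from fractions import Fraction

def degeneracy(n, L):
    """g_n = (1/3)(2L+1)(n+1)(n+L+3/2)(n+2L+2) for spin L on the 4-sphere."""
    L = Fraction(L)
    return Fraction(1, 3) * (2 * L + 1) * (n + 1) * (n + L + Fraction(3, 2)) * (n + 2 * L + 2)

def eigenvalue(n, L, X):
    """lambda_n = n^2 + (2L+3) n + X, in units where the radius a = 1."""
    L = Fraction(L)
    return n * n + (2 * L + 3) * n + X

def factorised(n, L, X):
    """(n + L + 3/2)^2 - Delta, with Delta = (L + 3/2)^2 - X."""
    L = Fraction(L)
    shift = L + Fraction(3, 2)
    delta = shift ** 2 - X
    return (n + shift) ** 2 - delta

# Scalar (L = 0) degeneracies for the first few levels.
print([degeneracy(n, 0) for n in range(4)])
# Lowest transverse-vector (L = 1) level.
print(degeneracy(0, 1))
```

For $L = 0$ the first degeneracies come out as 1, 5, 14, 30, the dimensions of the scalar harmonics on $S^4$; the lowest $L = 1$ level has degeneracy 10.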

The constant $X$ is determined by the mass of the field and by the precise form of the operator in question. Thus,
$$\zeta(z) = \tfrac13(2L+1)\sum_{n=0}^\infty(n+1)(n+L+\tfrac32)(n+2L+2)\left[n+L+\tfrac32+\sqrt\Delta\right]^{-z}\left[n+L+\tfrac32-\sqrt\Delta\right]^{-z}.$$
Replacing $n + L + \tfrac32 \to n$, this identically equals
$$\zeta(z) = \tfrac13(2L+1)\sum_{n=L+3/2}^\infty\left(n^3 - n(L+\tfrac12)^2\right)n^{-2z}\left(1 - \frac{\Delta}{n^2}\right)^{-z}.$$

Defining the power series coefficients in the expansion
$$(1-x)^{-z} = \sum_{k=0}^\infty C_kx^k = 1 + zx + \frac{z(z+1)}{2}x^2 + \frac{z(z+1)(z+2)}{6}x^3 + \cdots,$$
we get
$$\zeta(z) = \tfrac13(2L+1)\sum_{k=0}^\infty C_k\,\Delta^k\left[\zeta_R(2z+2k-3, L+\tfrac32) - (L+\tfrac12)^2\,\zeta_R(2z+2k-1, L+\tfrac32)\right],$$
where $\zeta_R(z) = \zeta_R(z,1)$ and $\zeta_R(z,a)$ has a pole at $z = 1$. Near the pole
$$\zeta_R(z,a) = \frac{1}{z-1} - \psi(a) + O(z-1),\qquad \psi(z) = \frac{d}{dz}\log\Gamma(z).$$

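Thus
$$\zeta(0) = \tfrac13(2L+1)\left[\zeta_R(-3, L+\tfrac32) - (L+\tfrac12)^2\,\zeta_R(-1, L+\tfrac32) - \tfrac12(L+\tfrac12)^2\Delta + \tfrac14\Delta^2\right],$$
$$\zeta_R(-1,a) = -\tfrac12a^2 + \tfrac12a - \tfrac1{12},$$
$$\zeta_R(-3,a) = -\tfrac14a^4 + \tfrac12a^3 - \tfrac14a^2 + \tfrac1{120}.$$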
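The closed forms for the Hurwitz zeta function at negative integers make $\zeta(0)$ a rational number that can be evaluated exactly. The sketch below is an illustration under stated assumptions: it takes $\zeta(0) = \tfrac13(2L+1)[\zeta_R(-3,L+\tfrac32) - (L+\tfrac12)^2\zeta_R(-1,L+\tfrac32) - \tfrac12(L+\tfrac12)^2\Delta + \tfrac14\Delta^2]$ with $\Delta = (L+\tfrac32)^2 - X$, works in units $a = 1$, and uses hypothetical function names. As a test case it takes the conformally coupled massless scalar, for which $X = \xi R = \tfrac16\cdot12 = 2$.

```python
from fractions import Fraction

def hurwitz_minus1(a):
    """zeta_R(-1, a) = -a^2/2 + a/2 - 1/12."""
    a = Fraction(a)
    return -a ** 2 / 2 + a / 2 - Fraction(1, 12)

def hurwitz_minus3(a):
    """zeta_R(-3, a) = -a^4/4 + a^3/2 - a^2/4 + 1/120."""
    a = Fraction(a)
    return -a ** 4 / 4 + a ** 3 / 2 - a ** 2 / 4 + Fraction(1, 120)

def zeta_at_zero(L, X):
    """zeta(0) for spin L and constant X on the unit 4-sphere (assumed formula, see lead-in)."""
    L = Fraction(L)
    X = Fraction(X)
    a = L + Fraction(3, 2)
    delta = a ** 2 - X
    half_sq = (L + Fraction(1, 2)) ** 2
    bracket = (hurwitz_minus3(a) - half_sq * hurwitz_minus1(a)
               - half_sq * delta / 2 + delta ** 2 / 4)
    return Fraction(1, 3) * (2 * L + 1) * bracket

# Riemann zeta values recovered at a = 1.
print(hurwitz_minus1(1), hurwitz_minus3(1))
# Conformally coupled massless scalar on S^4: X = 2 in units a = 1.
print(zeta_at_zero(0, 2))
```

The value obtained for the conformal scalar agrees with integrating the trace-anomaly density $a_2/16\pi^2$ over the 4-sphere, which gives a useful cross-check between this section and the previous one.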

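Consider scalar electrodynamics described by a real vector field $A_\mu$ and a complex scalar field $\phi$. The Euclidean action is
$$S[A_\mu,\phi] = \int\left[\tfrac14F_{\mu\nu}F^{\mu\nu} + \tfrac12\{(\partial_\mu - ieA_\mu)\phi\}^*\{(\partial^\mu - ieA^\mu)\phi\} + V(\phi)\right]dV,$$
$$V(\phi) = \tfrac12(m^2 + \xi R)\,\phi^*\phi + \frac{\lambda}{4!}(\phi^*\phi)^2.$$
$e$, $\lambda$, $\xi$ are dimensionless; $\lambda$ = self-interaction coupling constant, $\xi$ = coupling to curvature. Defining $W = -g_{\mu\nu}\Box + R_{\mu\nu}$ and $Q = -\Box$ gives the following form to the one-loop $Z$:
$$Z = \left[\mathrm{Det}\,\mu^{-2}(W + e^2\phi_b^2)\right]^{-1/2}\,\mathrm{Det}\left[\mu^{-2}(Q + V''(\phi_b))\right]^{-1/2}.$$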

Using $R_{\mu\nu} = (3/a^2)\,g_{\mu\nu}$ and the eigenvalues of $-\Box$ on vectors as $a^{-2}(n^2+5n+3)$, the eigenvalues of $W$ are $(n^2+5n+6)/a^2$. Repeating the algebra described gives
$$\log\mathrm{Det}\,\mu^{-2}(W + e^2\phi_b^2) = -\zeta'(1, \tfrac14 - a^2e^2\phi_b^2) - \zeta(1, \tfrac14 - a^2e^2\phi_b^2)\log(\mu^2a^2).$$

The one-loop effective potential is then
$$V_{\rm eff}(\phi_b) = V(\phi_b) + \frac{1}{2\Omega}\left[\log\mathrm{Det}[\mu^{-2}(W + e^2\phi_b^2)] + \log\mathrm{Det}[\mu^{-2}(Q + V''(\phi_b))]\right].$$
The first determinant comes from single closed gauge field loops with coupling $= e^2$ (see Figure 3). The second comes from single scalar loops with coupling $= \lambda$ (see Figure 4). If $\lambda$ is large enough for the second set to contribute, then the one-loop approximation breaks down (Coleman and Weinberg [5]). Thus, we assume $\lambda = O(e^4)$ and ignore the second set.

Fig. 3.

Fig. 4.

The effective potential is therefore
$$V_{\rm eff} = V(\phi) - \frac{1}{2\Omega}\left[\zeta'(1, \tfrac14 - a^2e^2\phi^2) + \zeta(1, \tfrac14 - a^2e^2\phi^2)\log(\mu^2a^2)\right].$$
For large $a$, this gives
$$V_{\rm eff} = V(\phi) + \frac{3e^4\phi^4}{64\pi^2}\left[\log\frac{e^2\phi^2}{\mu^2} - \frac32\right] + \frac{3e^2\phi^2}{16\pi^2a^2}\left[\log\cdots\right] + O(a^{-4}\log a);$$
as $a \to \infty$, this gives the Coleman-Weinberg result
$$V_{\rm eff}(\phi) = V(\phi) + \frac{3e^4\phi^4}{64\pi^2}\left[\log\frac{e^2\phi^2}{\mu^2} - \frac32\right].$$
For $m^2 = 0$, this flat space potential has a minimum at $\phi = \phi_0$.
In terms of $\phi_0$, the mass of the vector boson in flat space is $M = e\phi_0$. We get
$$V_{\rm eff} = \frac{3e^4}{64\pi^2}\,\phi^4\left[\log\frac{e^2\phi^2}{M^2} - \frac12\right].$$

In general, in terms of a dimensionless parameter


32n 2 ~ 8n 2 A 3
p=-------e2
ge 4
2
We display the form of the effective potential.
For $P > -\tfrac16$, for different radii $a$ (see Figure 5), at a critical radius $a_t$ the two minima in the potential have equal energy and, thus, a first-order phase transition can take place.

Fig. 5. ($P > -1/6$; curves for increasing $aM$.)

For $P < -\tfrac16$, see Figure 6.
There is a second-order phase transition. For $a > a_c$, the Universe slides down $V_{\rm eff}(\phi)$, but as it does, the effective cosmological constant decreases and $a$ increases, thereby continuing the slide to flat space.
What I have sketched is just the tip of the iceberg and might be of use in
discussions on the inflationary universe. I strongly recommend a pedagogical
familiarity with the zeta function and I hope I have demonstrated its wide use.

Fig. 6. ($P < -1/6$; curves for increasing $aM$, with $a_cM$ marked.)

References

1. E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, CUP, Cambridge (1940).
2. C. Itzykson and J. B. Zuber, Quantum Field Theory, McGraw-Hill, Singapore (1985).
3. N. D. Birrell and P. C. W. Davies, Quantum Fields in Curved Space, CUP, Cambridge (1982).
4. B. Allen, Nucl. Phys. B226, 228 (1983).
5. S. Coleman and E. Weinberg, Phys. Rev. D7, 1888 (1973).

17. Inflationary Cosmology and Quantum Effects in the Early Universe

N. PANCHAPAKESAN
Department of Physics and Astrophysics, Delhi University,
New Delhi 110007, India

In this chapter we briefly introduce the Friedmann-Robertson-Walker (FRW) model of cosmology and discuss some of its difficulties, like the horizon and flatness problems. We then describe the inflationary model of Guth (1981) [7] and its modified version, the new inflationary scenario, which attempts to solve these problems with some success. The Coleman-Weinberg potential [5], which is necessary for the slow phase transition, is also introduced in the context of the grand unified theory of elementary particles (SU(5) GUTs).
We then present the alternative Linde 'chaotic inflation' model, which is also called 'primordial inflation'. Hawking has generally discussed the whole problem in the context of primordial, as well as GUT, inflation, and has obtained the constraints placed on such models by the observed entropy of the Universe and the observed isotropy ($10^{-4}$) of the relic microwave background radiation (MBR). We briefly describe this next.
In discussing the quantum effects in the early universe, we first come across the difficulties in handling quantum fields in curved space, characterized by the divergence of the stress-energy tensor evaluated in the curved space, viz. $\langle T_{\mu\nu}\rangle$. We discuss methods to calculate this by relating it to the effective action $W$ (or $\Gamma$) which, when functionally differentiated, gives $\langle T_{\mu\nu}\rangle$. The relation
$$W = -i\ln Z[0] = -\tfrac{i}{2}\,\mathrm{Tr}[\ln(-G_F)] = \int\sqrt{-g}\,L_{\rm eff}\,d^nx$$
enables us to use the De Witt-Schwinger expansion of $G_F$ (Green's function) to isolate the divergent parts of the effective action $W$ (or, correspondingly, those of $L_{\rm eff}$).

The divergent part of $L_{\rm eff}$ is purely geometric and so serves to renormalize and redefine the Newtonian and cosmological constants $G$ and $\Lambda$, as well as two other constants which can exist but are experimentally known to vanish. The renormalization is discussed using the method of dimensional regularization. $L_{\rm eff}$ is also the effective potential that determines the nature of the phase transition. So this approach leads to the identification of the Coleman-Weinberg [5] potential in a curved space, whose knowledge is essential for the inflationary models.
B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 343-371.
1989 by Kluwer Academic Publishers.

Finally, as an interesting aside, we indicate how nongravitational interactions ($\pi^0 \to 2\gamma$ in this case) can bring in a length scale which can avoid the singularity of cosmology, by considering a simple model due to Birrell, Davies and Ford [3].

1. Quantum Field Theory (QFT) in Curved Space Time (CST): A Short History

In the sixties, it was believed that field theory was dead or would fade away like an old soldier. Somehow, it was at that time, around 1969, that QFT in CST had a renaissance. Earlier, in the 1930s, Schrodinger had studied field theory in CST and there had been isolated attempts by others. In 1969, Parker calculated particle production in the early Universe in a systematic way, and it is generally claimed that the semiclassical theory got its final shape around 1978. There was a group of people, mostly from Princeton and the U.K., who did most of the work. But, as is usual in physics, the catalysts have been two (glamorous) specific developments: the prediction of Hawking radiation in 1974 and the 'inflationary Universe based on GUTs' in 1981 (and, more recently, 'supergravity and supersymmetry' in a distant way). By 1974, thanks to 't Hooft and asymptotic freedom, field theory was back in the picture (with a vengeance) and, thanks to the success of gauge theory, people are even now talking of taking on gravity.

The Standard Model

The Big Bang Model has been with us since 1965, and it is here to stay, as no other satisfactory explanation of the microwave background radiation is available. It is quite successful in explaining nucleosynthesis and other details beginning from about a second onwards. The metric used, the FRW metric, is derived on the basis of the cosmological principle (CP), i.e. the isotropy and homogeneity of the observed Universe. The MBR shows an anisotropy of $10^{-4}$ only. Another feature of the Universe is its density ($\rho$). There is a large amount of uncertainty about $\rho$: $\Omega \equiv \rho\rho_c^{-1} = 10^{-2}$ to 1 (0.02 to 1). Even so, this is a very surprising value, as we will see.
We write first the FRW metric
$$ds^2 = dt^2 - R^2(t)\left[\frac{dr^2}{1-kr^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right].$$
Using Einstein's equation,
$$G^{\mu\nu} = 8\pi G\,T^{\mu\nu},$$
and the perfect fluid approximation
$$T^{\mu\nu} = -p\,g^{\mu\nu} + (p+\rho)\,U^\mu U^\nu,$$

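we get the equations of cosmology
$$\left(\frac{\dot R}{R}\right)^2 + \frac{k}{R^2} = \frac{8\pi G}{3}\rho,\qquad \frac{d}{dt}(\rho R^3) = -3pR^2\dot R.$$
If $\rho \propto R^{-4}$,
$$\dot R \propto \sqrt{\frac{8\pi G}{3}}\,\frac1R,\quad\text{i.e. } R \propto t^{1/2}.$$
If $\rho$ = constant,
$$\dot R/R = \text{constant}\quad\text{or}\quad R \propto e^{at}.\qquad\text{(I)}$$

2. Problems in Standard Cosmology

2.1. Horizon Problem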

For a light beam $ds = 0$; let it also be travelling radially, $d\theta = 0 = d\phi$. Then
$$\int_0^r\frac{dr}{\sqrt{1-kr^2}} = \int_t^{t_0}\frac{dt}{R(t)}.$$

If we are at r = 0 and present time is to, then the above equation is satisfied. We
then have two possibilities.

(i) If the right-hand side is finite as $t \to 0$ (and it is finite for $R(t) \sim t^n$ with $n < 1$), then only signals from a finite $r$ will be received and we have a horizon.
(ii) If the right-hand side diverges, we can always find a $t$ for any given $r$ and there is no horizon.
In the early Universe $R \propto t^{1/2}$, so we have a horizon and the horizon distance is
$$h(t) = R(t)\int_0^r\frac{dr}{\sqrt{1-kr^2}} = R(t)\int_0^t\frac{dt}{R} = 2t.$$

Fig. 1. (Panels: Earlier, Now.)

The size of the Universe is
$$S(t) = R(t)\,r \propto r\,t^{1/2}.$$
Therefore, while $S \propto t^{1/2}$, $h \propto t$.
Going back in time, $S$ decreases much slower than $h$. Interaction is limited to the horizon $h$ (see Figure 1). So how was $S$ made isotropic? This is the horizon problem.
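The scaling argument can be illustrated numerically. The following sketch assumes $R(t) = t^n$ in arbitrary units with $k = 0$, uses hypothetical function names, and evaluates the horizon integral with the substitution $t' = u^2$ so that the integrable endpoint singularity of $1/R$ causes no trouble.

```python
def horizon_distance(t, n=0.5, steps=10_000):
    """h(t) = R(t) * integral_0^t dt'/R(t') for R = t**n with n < 1 (midpoint rule in u = sqrt(t'))."""
    u_max = t ** 0.5
    du = u_max / steps
    integral = 0.0
    for k in range(steps):
        u = (k + 0.5) * du
        # dt' = 2 u du and 1/R(t') = (u*u)**(-n)
        integral += 2.0 * u * (u * u) ** (-n) * du
    return t ** n * integral

def comoving_patch_size(t, r=1.0, n=0.5):
    """S(t) = R(t) * r for a fixed comoving coordinate r."""
    return r * t ** n

for t in (1.0, 1e-2, 1e-4):
    print(t, horizon_distance(t), comoving_patch_size(t), horizon_distance(t) / comoving_patch_size(t))
```

For $n = \tfrac12$ the computed $h(t)$ reproduces $2t$, and the ratio $h/S$ falls like $t^{1/2}$ toward early times, which is the statement that a fixed comoving patch was far larger than the horizon in the past.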

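2.2. Flatness Problem

Let values at the present time be denoted by the subscript 0. Define
$$H_0 \equiv \frac{\dot R}{R}\bigg|_{t=t_0},\qquad \rho_c \equiv \frac{3H_0^2}{8\pi G}.$$
Then
$$\dot R^2 + k = \frac{8\pi}{3}G\rho R^2$$
gives
$$H_0^2 + \frac{k}{R_0^2} = \frac{8\pi}{3}G\rho_0,$$
which leads to
$$\rho_c + \frac{3}{8\pi G}\frac{k}{R_0^2} = \rho_0.$$
If
$$\rho_0 > \rho_c,\quad k > 0 \Rightarrow \text{closed Universe},$$
and
$$\rho_0 \le \rho_c,\quad k \le 0 \Rightarrow \text{open Universe}.$$
Now
$$\rho_0 - \rho_c = \frac{3k}{8\pi G}\,\frac{1}{R_0^2}.$$
At any other time
$$\rho - \rho_c = \frac{3k}{8\pi G}\,\frac{1}{R^2}.$$
In a radiation dominated Universe,
$$\rho \propto \frac{1}{R^4}.$$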

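Therefore
$$\frac{\rho-\rho_c}{\rho} \propto R^2$$
or
$$1 - \frac1\Omega \propto R^2$$
or
$$(\Omega - 1) \propto R^2 \propto t.$$
Also
$$\frac{\rho-\rho_c}{\rho} = \frac{\Omega-1}{\Omega} = \frac{k}{H^2R^2\,\Omega}$$
or
$$R \propto \frac{1}{H_0\,|\Omega-1|^{1/2}}.$$
If $\Omega \sim O(1)$, then at $10^{-35}$ sec, $(\Omega - 1)$ is $10^{-35}$ or $\Omega = 1 + 10^{-35}$.
Such fine tuning that is required is called the flatness problem. It is also related to the entropy problem. Since the energy density $\rho \propto T^4$ and $dS = dQ/T$, the entropy $\propto T^3$ and the entropy density $s \propto NT^3$, with
$$N = \text{No. of photons/cc} = 10^2/\text{cc},\qquad \text{Temp.} \approx 3\ \text{K},\qquad s = 10^2\times3^3 \approx 10^3/\text{cc}.$$
Hence, the total entropy is
$$S_E = R^3s = 10^{84}\times10^3 = 10^{87}\quad\text{as } R \sim 10^{28}\ \text{cm}.$$
We expect $S_E \sim O(1)$, but $S_E \sim 10^{87}$.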
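The order-of-magnitude entropy estimate can be reproduced in a few lines. This sketch simply follows the arithmetic of the text, using the quoted input values; the variable names are illustrative and no physical constants beyond those quoted are assumed.

```python
import math

photon_density_per_cc = 1e2     # N, photons per cubic centimetre (value quoted in the text)
temperature_kelvin = 3.0        # present background temperature
radius_cm = 1e28                # present size R of the Universe

entropy_density = photon_density_per_cc * temperature_kelvin ** 3   # s ~ N T^3, per cc
total_entropy = radius_cm ** 3 * entropy_density                    # S_E = R^3 s

print(entropy_density, total_entropy, math.log10(total_entropy))
```

The result is a few times $10^{87}$, i.e. the same order of magnitude as quoted above.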

2.3. The Monopole Problem (Figure 2)

This is related to the horizon problem. In a monopole, the Higgs field is correlated. Correlations cannot exist outside the horizon distance. Since at the GUT epoch the Universe was made of a large number of causally disconnected regions, it must have had many monopoles. Since they cannot decay, there must be a lot of monopoles now. They are not seen, but that is not the main problem. Their mass is $\sim 10^{16}$ GeV and, hence, they lead to $\Omega \sim 3\times10^{11}$. In that case, the Universe must have collapsed long ago, on a time scale $\sim 3\times10^5$ yr.

Fig. 2.

3. Inflation

Meanwhile, the theory to solve all these problems had been born and is the familiar GUT, which implies a phase transition to a symmetry-breaking ground state. The present broken symmetry is restored at high temperature and so, in the early Universe, symmetry was unbroken.
What Guth [7] added was the idea of supercooling. While supercooling, the vacuum energy causes an exponential expansion, now called the inflationary scenario (cf. Equation (I)). What kind of potential does one require? A typical one is indicated in Figure 3. The system has to tunnel into the broken symmetric phase. After the transition, the vacuum energy is released as latent heat and raises the temperature $T$ to $10^{14}$ GeV.

Fig. 3. (Labels: $T > T_c$; $\langle\phi\rangle = 0$; $\langle\phi\rangle = 10^{15}$ GeV.)

Inflation solves almost all problems, the horizon problem and flatness problem in particular. Take
$$\rho = \text{constant} \approx T^4 \approx (10^{14}\ \text{GeV})^4.$$
If $R = e^{Ht}$ with
$$H^2 = \frac{8\pi G}{3}\rho,$$
the resulting space is called a De Sitter space, and the FRW metric can be approximated by $k = 0$ to give
$$dS^2 = dt^2 - R^2\,d\mathbf{x}^2.$$
We assume expansion takes place for a time $\Delta t$ and then the phase transition occurs instantaneously.
$$R = e^{H\Delta t} = 10^{29} \approx e^{67}$$
is required, as $S \propto R^3 = 10^{87}$. As
$$R \propto \frac{1}{H_0\,|\Omega-1|^{1/2}},$$
if $R$ is large, $\Omega \to 1$.
The horizon problem is solved as the early Universe was much smaller and well within the horizon. The present universe ($R = 10^{28}$ cm) was $R = 10$ cm at $T = 10^{14}$ GeV:
$$RT = 10^{28}\ \text{cm}\times(10^{-4}\times10^{-9})\ \text{GeV} = 10^{15}\ \text{GeV cm} = \text{constant},$$
$$R = 10^{15}/10^{14} = 10\ \text{cm}\quad\text{at } T = 10^{14}\ \text{GeV}.$$
Before inflation, the size was $R/e^{H\Delta t} = 10^{-28}$ cm or $10^{-14}$ GeV$^{-1}$. The horizon size is
$$h \propto t = \left(\frac{10^{19}\ \text{GeV}}{10^{14}\ \text{GeV}}\right)^2\times10^{-44}\ \text{s} = 10^{-34}\ \text{sec} = 10^{-24}\ \text{cm}.$$
As the size is $\sim 10^{-28}$ cm, the horizon is $10^4$ times greater. Hence, there is no longer a horizon problem.
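The chain of estimates in this argument is easy to retrace numerically. The sketch below uses only the round numbers quoted in the text, plus an assumed speed of light of $3\times10^{10}$ cm/s to convert the horizon time into a length; all variable names are illustrative.

```python
import math

R_now_cm = 1e28                       # present size of the Universe
T_now_GeV = 1e-4 * 1e-9               # ~3 K taken as 10^-4 eV, expressed in GeV
T_gut_GeV = 1e14                      # temperature after the phase transition

RT = R_now_cm * T_now_GeV             # conserved product R*T, in GeV cm
R_gut_cm = RT / T_gut_GeV             # size of the present Universe at T = 10^14 GeV

expansion_factor = 1e29               # e^{H dt} required by S ~ R^3 = 10^87
R_before_cm = R_gut_cm / expansion_factor
efolds = math.log(expansion_factor)   # number of e-folds H*dt

planck_mass_GeV = 1e19
planck_time_s = 1e-44
speed_of_light_cm_per_s = 3e10        # assumed value for the unit conversion
t_s = (planck_mass_GeV / T_gut_GeV) ** 2 * planck_time_s
horizon_cm = speed_of_light_cm_per_s * t_s

print(R_gut_cm, R_before_cm, efolds, t_s, horizon_cm, horizon_cm / R_before_cm)
```

With these inputs the pre-inflation patch is about $10^{-28}$ cm while the horizon is a few times $10^{-24}$ cm, so the patch sits inside the horizon by a factor of order $10^4$, and the required expansion corresponds to roughly 67 e-folds.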

4. Free Lunch

In the U.S., they say 'there is no free lunch'. The exponential expansion seems to violate this saying. Let us see how this happens.
As the false vacuum is Lorentz invariant, it must have the form
$$T_{\mu\nu} = \rho_0\,g_{\mu\nu},$$
which matches the perfect-fluid form
$$T_{\mu\nu} = -p\,g_{\mu\nu} + (p+\rho)\,U_\mu U_\nu$$
if $p = -\rho_0$ and $\rho = \rho_0$. A conservation equation of the type
$$\frac{dU_{\rm univ}}{dt} = -p\,\frac{dV}{dt}$$
gives
$$\frac{dU_{\rm univ}}{dt} = \rho_0\,\frac{dV}{dt}$$
and, hence, is positive.
So energy increases because of negative pressure. As there is no asymptotically Minkowskian space, no conserved total energy can be defined.


Problems with the Guth model. Phase transitions take place through a bubble of the new phase which is formed on a nucleus and expands and combines with other bubbles. The fast expansion of the Universe does not permit this. Bubbles do not collide and the Universe will be full of bubbles. We have a Swiss cheese or 'punched paper' Universe.
The Guth model of inflation, now called the 'old' inflationary model, presents a difficulty in that nucleation does not make the bubbles fill all space. So the phase transition is not instantaneous, but takes a very long time and leaves the Universe very uneven or anisotropic, like a punched paper.
This problem of a graceful exit forces us to consider the modification given by Linde [11] and Albrecht and Steinhardt [1].

5. The 'New' model [6]

This model increases the expansion time so that one bubble now occupies more space than the whole observed Universe. Inside the bubble there is complete isotropy.

Fig. 4. (Potential $V$, with the false vacuum marked.)

To enable this, a modified potential of the type shown in Figure 4 is necessary so that the Universe is hung up in the false vacuum for a long time. Such a potential is provided by the Coleman and Weinberg [5] (CW) potential, which was constructed to have a potential where symmetry is broken by radiative corrections. We consider the usual potential for a scalar field
$$V = \frac{\mu^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4, \qquad \left.\frac{\partial^2 V}{\partial\phi^2}\right|_{\phi=0} = \mu^2, \qquad \frac{1}{6}\left.\frac{\partial^4 V}{\partial\phi^4}\right|_{\phi=\phi_0} = \lambda.$$
If $\mu^2 > 0$, only one minimum occurs, at $\phi = 0$. For $\mu^2 < 0$, the minimum is at
$$\phi = \phi_0 = |\mu|/\sqrt{\lambda}.$$
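As a quick check of the statement about the minima, the sketch below evaluates the quartic potential and its derivative directly. It assumes the form V = (mu^2/2) phi^2 + (lambda/4) phi^4; the function names and the sample parameter values are hypothetical.

```python
import math

def potential(phi, mu2, lam):
    """V = (mu^2 / 2) phi^2 + (lambda / 4) phi^4."""
    return 0.5 * mu2 * phi ** 2 + 0.25 * lam * phi ** 4

def d_potential(phi, mu2, lam):
    """dV/dphi = mu^2 phi + lambda phi^3."""
    return mu2 * phi + lam * phi ** 3

def minima(mu2, lam):
    """Locations of the minima: phi = 0 if mu^2 >= 0, else +/- |mu| / sqrt(lambda)."""
    if mu2 >= 0:
        return [0.0]
    phi0 = math.sqrt(-mu2 / lam)
    return [-phi0, phi0]

# Sample values (assumption): mu^2 = -4, lambda = 1 gives phi0 = 2.
print(minima(1.0, 0.1))
print(minima(-4.0, 1.0))
print(d_potential(2.0, -4.0, 1.0), potential(2.0, -4.0, 1.0))
```

For negative mu^2 the derivative vanishes at the nonzero field value and the potential there lies below its value at the origin, so the symmetric point is no longer the minimum.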


If $\mu^2 = 0$, then the radiative correction at the one-loop level can provide the symmetry-breaking term. If the coupling constant $e$ (coupling to vector particles) is large compared with $\lambda$ (the $\phi^4$ coupling), then the one-loop correction is dominated by vector particles. If $\mu^2 = 0$, then the second derivative is zero and we have a flat potential.
In SU(5),
$$V(\phi) = \frac{25}{16}\alpha^2\left[\phi^4\ln\frac{\phi^2}{\sigma^2} + \frac{1}{2}(\sigma^4 - \phi^4)\right],$$
where $\alpha \equiv g^2/4\pi = 1/45$ is the gauge coupling.

The minimum of V is at $\phi = \sigma$. Thus, potential parameters have to be finely tuned; instead of $V'' \sim 10^{29}\ {\rm GeV}^2$ we have $V'' = (10^{9}\ {\rm GeV})^2$.
The phase transition (increase in $\langle\phi\rangle$) takes place in an expanding universe which is a De Sitter Universe if k = 0. (In any local region, k = 0.) Thus, calculations have to be done in a background gravitational field and we see the need for QFT in CST.
When the temperature is high, we have an advantage in that SU(5) couplings are small (due to asymptotic freedom), but as the Universe cools, this is no longer true and calculations of nucleation become difficult. Here again, Hawking radiation helps. A De Sitter universe will have radiation at temperature $T = H/2\pi = 10^{9}$ GeV, and its temperature cannot go below this value. Thus, couplings stay small but the price one has to pay is to work in De Sitter space.
Briefly, we expect to have some thermal and/or quantum fluctuations. When they grow and become large enough, we should be able to describe them by classical equations of motion. At the time this happens, the field value is, say, $\phi_i$; then the field rolls down the potential, but on a timescale which is slow compared to the exponential expansion rate. This single fluctuation region becomes larger (or much larger) than the whole universe which is inside this region.
We assume that the initial region was homogeneous on a scale of $H^{-1} = 10^{-24}$ cm. This has to expand to 10 cm. So the expansion must continue until the expansion factor is $> 10^{25}$ or $e^{58}$.
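The number of e-folds quoted above follows from a one-line logarithm. A minimal sketch, assuming only the two length scales stated in the text (the function name is hypothetical):

```python
import math

def efolds_required(initial_size_cm, final_size_cm):
    """Number of e-folds N such that final = initial * exp(N)."""
    return math.log(final_size_cm / initial_size_cm)

# Scales taken from the text: 1e-24 cm expanding to 10 cm.
n_efolds = efolds_required(1e-24, 10.0)
print(n_efolds)
```

The result lies just below 58, so an expansion factor of e^58 is indeed slightly more than the required 10^25.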

6. Evolution of the Scalar Field

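The equation of motion is
$$-\Box\phi = V'(\phi); \qquad \ddot\phi + 3H\dot\phi + V'(\phi) = \frac{\nabla^2\phi}{R^2}.$$
The last term on the right can be neglected when R is large, and we have
$$\ddot\phi + 3H\dot\phi = -\frac{\partial V}{\partial\phi} = -\frac{25}{4}\alpha^2\phi^3\ln\frac{\phi^2}{\sigma^2}.$$
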
The running coupling constant is

a(Q2)

4n/4f In (Q2/A 2).

Therefore

8V = -b4>3
8cp
We neglect the
2

4> (t)

= -

with b = 0(1) =

;p term if Ht

t.

1 then we have

3H
2).({3 _ t)

$H^{-1}$, called the 'effective particle horizon', is an important length scale during evolution. It sets the scale of microphysics. $H^{-1} \approx$ constant during the De Sitter phase, except towards the end when reheating takes place. $H^{-1} = R/\dot R \propto t$ in the FRW stage.
The density variations and perturbations can be characterized by comoving wavenumber k and comoving wavelength $\lambda$, and the physical values are then given by
$$\lambda_{\rm phys} = R(t)\lambda.$$
$R \propto e^{Ht}$ during the De Sitter phase and $R \propto \sqrt{t}$ during the FRW phase. So $\lambda_{\rm phys}$ will grow very fast and become much larger than the horizon $H^{-1}$ during the De Sitter stage.
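The horizon crossing described above can be sketched with the two scale-factor laws just quoted. This is only an illustration: the matching of the two phases (FRW clock chosen so that H is continuous), the number of e-folds, and all function names are assumptions.

```python
import math

def ratio_de_sitter(lam, hubble, t):
    """lambda_phys / H^{-1} with R = exp(H t) and constant H."""
    return lam * math.exp(hubble * t) * hubble

def ratio_frw(lam, r_star, t_star, t):
    """lambda_phys / H^{-1} with R = r_star * sqrt(t / t_star), so H^{-1} = 2 t."""
    scale = r_star * math.sqrt(t / t_star)
    return lam * scale / (2.0 * t)

def reentry_time(lam, r_star, t_star):
    """FRW time at which lambda_phys equals the horizon again."""
    return (lam * r_star) ** 2 / (4.0 * t_star)

# Hypothetical numbers: comoving wavelength 1e-3 (units of 1/H), 60 e-folds.
LAM, HUBBLE, N_EFOLDS = 1e-3, 1.0, 60.0
R_STAR = math.exp(N_EFOLDS)
T_STAR = 0.5 / HUBBLE          # FRW clock chosen so that H is continuous
t_h = reentry_time(LAM, R_STAR, T_STAR)
print(ratio_de_sitter(LAM, HUBBLE, 0.0), ratio_de_sitter(LAM, HUBBLE, N_EFOLDS / HUBBLE))
print(ratio_frw(LAM, R_STAR, T_STAR, T_STAR), ratio_frw(LAM, R_STAR, T_STAR, t_h))
```

The ratio starts below one, grows exponentially past one during the De Sitter phase, and then decays as the inverse square root of t in the FRW phase until it returns to one at the re-entry time.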

Fig. 5


This trend is reversed in the FRW stage and $\lambda_{\rm phys}$ re-enters the horizon $H^{-1}$ ($\propto t$). The time $t_0$ when the perturbations leave the horizon, the time $t_*$ when the change to the FRW stage takes place, and the time $t_H$ when they re-enter the horizon are shown in Figure 5. Let
Po = e and Z = <HR)0)2 e. At

(jpI

(independent of scale ). set by microphysics).


During to < t < t*, Z = constant,

During t* < t < tH' again Z = constant,


1
e = (HR)2 oc t

at tH' HRA = H)'Phys = 1. So again, Zll = EH. Only at reheating t*, Zo ~ yZo. ZH
is independent of )0 (the scale).
The fact that H is constant during the time the perturbations are created, is
crucial to this result. Now
Po = V(cP)

a2

+ 2 + p"

Po = - V(cP) +

a2

p,

+ 3'

where a is group theoretic factor

(jp = aA

and

Co ==

~ 0(1).

((jp)
P

As V(cP) is flat

= ao A = ZOo
to

Po

The evolution of Z obeys a differential equation similar to that of cP. If is


assumed to be nonzero only near t* and effective for time MG';t, and takes cP from
o to M Gut , then Z obeys the equation
Z

),,2

'I'

Zo 6 == yZo,

M4

Gut ~

Po'

therefore

= ZoPo

15 '

acpo Acp Po aAcp


Z = - - --:z = - . - ;
Po
fPo
fPo

(CPo == cp(t o

Now Acp in the comoving frame is related to the Hubble frame by Acp

= H AfP

and


$\Delta\phi \approx H/2\pi$ on length-scale $H^{-1}$. Therefore
$$Z \approx \frac{H^2}{2\pi\dot\phi_0}.$$
According to Hawking,
$$\dot\phi = -\frac{V'}{3H} = \frac{b\phi^3}{3H},$$

where
$\phi_0 \sim 10^{9}$ GeV ($\sim T_{\rm Hawk}$), $H \sim 10^{9}$ GeV,
as compared to
$$\left(\frac{\delta\rho}{\rho}\right)_{\rm MBR} = 10^{-4} \quad (= Z_{\rm MBR}).$$

A potential which gives H and so has a $\dot\phi$ which is larger, will give a better result.
7. Linde's Chaotic Inflation [11]

A sufficiently flat potential can cause an expansion. Consider a theory where


V( ) = A4 with A 1. A clear description is possible only after the Planck time
1

where p <

tp ~ - ,
mp

If A is sufficiently small at

mp.

t ~ tp '

there is no reason why

0 everywhere.

m:

(V() ~
due to the uncertainty principle.) Quantum gravity effects are not
important.
In an open infinite Universe, there will be domains of size I
1. Consider
the evolution of this field. This domain expands with scale factor

m;

R(t) = R

(G
H =

e,

with H

Ht

~;'P =

= (8n

V(cp),H 2 =

(~nA)1/2 2 .
mp

V(cp)\)1/2
2

mp J

83n GP).

'


The equation of motion of the field is

which implies that at 2 m~/6n,

Therefore, at A 1, the typical time

during which decreases considerably.


So the Universe expands and

65 if rh >.: 3m
So eHtlt >.:
C/" e
'1"0 C/"
p.
This is possible if V<p~ ~
For

m: is also maintained, i.e. if A ~ 10-

8. Hawking's Limits on Inflationary Models [9]

In any model, there must be an order parameter $\phi$ which changes its value from the period of exponential expansion to the period of normal expansion. If $\phi$ is a vector or tensor, it would have a directional dependence and one would not expect an isotropic exponential expansion. So it is reasonable to take $\phi$ as a scalar (maybe with many components in some internal space). (In models with higher-derivative quantum corrections, one can take $\phi$ to be a function of scalar curvature.)
One can normalize $\phi$ so that its effective Lagrangian has the standard form
$$\mathcal{L} = -\tfrac{1}{2}g^{\mu\nu}\phi_{;\mu}\phi_{;\nu} - V(\phi).$$
(A trace over internal indices is understood.) V may contain three or more derivatives but their effect will be negligible. ($V \geq 0$ and $V = 0$ at $\phi = \phi_0$, its value at present.) All models assume an early stage that was locally homogeneous and isotropic with $T_{\mu\nu}$ dominated by $\phi$. The field equations are

~ = 8nap.
(1!)2
R + R2
3'

R
1
H=-,a=R
mp2

(1)

Gravitation:

(2)

Hamiltonian
Matter:
$$\ddot\phi + 3H\dot\phi - \frac{\nabla^2\phi}{R^2} + V'(\phi) = 0. \qquad (3)$$

In order to get exponential expansion, there has to be a $\phi_1$ such that $V(\phi_1)$ dominates on the right-hand side of Equation (2). Then
$$H = \left(\frac{8\pi V}{3m_p^2}\right)^{1/2}, \qquad (4)$$
i.e. $\dot\phi^2 \ll V$.
We have exponential expansion, $R \propto e^{Ht}$. After a short time the $\nabla^2\phi/R^2$ term in Equation (3) can be neglected. If $Ht \gg 1$, the $\ddot\phi$ term can be neglected and we have
$$3H\dot\phi = -V'$$
or
$$\phi = -\frac{V'}{3H}\,t + \text{constant}.$$

Let
$$\phi = \tilde\phi + \phi_1 - \frac{V'}{3H}(t - t_1).$$
Substituting in the equation
$$\ddot\phi + 3H\dot\phi = -V'(\phi), \qquad (5)$$

or
(6)

neglecting the last term.


If $V'' < -3H^2$ (Figure 6), the $\dot{\tilde\phi}$ term can be neglected and $\tilde\phi$ grows exponentially on a short timescale compared with $H^{-1}$, and the time is not long enough to have a significant expansion.

[Fig. 6. Potential with $V'' < -3H^2$.]

[Fig. 7. Potential with $V'' > 3H^2$.]

If (Figure 7) $V'' > 3H^2$,
$$\tilde\phi = \exp\left(-|V''|^{1/2}t\right) \to 0,$$
it reaches the local minimum and then it tunnels into $\phi = \phi_0$. This leads to an inhomogeneous Universe. So for a reasonable model we require
$$|V''(\phi_1)| < \sqrt{3}\,H^2 = \frac{8\pi V(\phi_1)}{\sqrt{3}\,m_p^2}.$$
In that case, we can neglect

lP + 3H$ =

mp

lP in

Equation (5) and

or

~$
lP

= - 3H dt;

$=

e - 3Ht,

at t = tllP = 0; <P = 4?1' therefore c = 0, then

(7)

and
$$\dot\phi = -\frac{m_p V'}{(24\pi V)^{1/2}}. \qquad (8)$$

The exponential expansion ends in a time $\Delta t$ when $|\Delta V|$, the change in V, is of the order of V. Then
or

Then

I.e.

Llt> 64/H,
or

or

or

$$|V'| < \left(\frac{\pi}{8}\right)^{1/2}\frac{V}{m_p}. \qquad (10)$$

If (10) is true, then $\dot\phi^2$ is small compared to V in Equation (2). A lower bound on $|V'|$ is obtained from the requirement that the density fluctuation $\delta\rho/\rho < 10^{-4}$ (from MBR data).
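The link between the requirement $\Delta t > 64/H$ and the slope bound (10) can be checked numerically. The sketch below is only illustrative: it assumes that the expansion ends when $|V'\dot\phi|\,\Delta t \approx V$, uses the slow-roll velocity (8) and the Hubble rate (4), and works in units with $m_p = 1$; the function names are hypothetical.

```python
import math

def hubble_rate(v, m_p):
    """H = (8 pi V / (3 m_p^2))^(1/2), as in Equation (4)."""
    return math.sqrt(8.0 * math.pi * v / 3.0) / m_p

def field_velocity(v, v_prime, m_p):
    """Slow-roll velocity, Equation (8): phidot = -m_p V' / (24 pi V)^(1/2)."""
    return -m_p * v_prime / math.sqrt(24.0 * math.pi * v)

def efolds_before_exit(v, v_prime, m_p):
    """H * dt, assuming inflation ends when |V' * phidot| * dt is of order V."""
    dt = v / (abs(v_prime) * abs(field_velocity(v, v_prime, m_p)))
    return hubble_rate(v, m_p) * dt

def slope_bound(v, m_p):
    """Right-hand side of Equation (10): (pi / 8)^(1/2) V / m_p."""
    return math.sqrt(math.pi / 8.0) * v / m_p

V_SAMPLE, M_P = 1e-12, 1.0   # hypothetical values in Planck units
at_bound = efolds_before_exit(V_SAMPLE, slope_bound(V_SAMPLE, M_P), M_P)
flatter = efolds_before_exit(V_SAMPLE, 0.5 * slope_bound(V_SAMPLE, M_P), M_P)
print(at_bound, flatter)
```

Under these assumptions a slope exactly at the bound gives 64 e-folds, and any flatter potential gives more.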
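Quantum fluctuations of the $\phi$ field give rise to a scale-free spectrum
$$\frac{\delta\rho}{\rho} = 2\sqrt{2}\,\frac{H\,\delta\phi}{|\dot\phi|}, \qquad (11)$$
where $\delta\phi$ is the r.m.s. quantum fluctuation of $\phi$ on the scale of the horizon $H^{-1}$ during the period of exponential expansion. If Equation (7) is satisfied and $|V''| < 8\pi V/(\sqrt{3}\,m_p^2)$, then m is negligible and $\delta\phi = H/(4\pi^{3/2})$ (Ref. [2]). Then
$$\frac{\delta\rho}{\rho} = \frac{16V^{3/2}}{\sqrt{3}\,m_p^3|V'|}. \qquad (12)$$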


From Equations (11) and (12)

6p

2J2(24n)li2 Vl/28n V

(4n)3/2m~V'

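As
$$|V'| < \left(\frac{\pi}{8}\right)^{1/2}\frac{V}{m_p},$$
we have
$$\frac{16}{\sqrt{3}}\left(\frac{8}{\pi}\right)^{1/2}\frac{V^{1/2}}{m_p^2} < \frac{16V^{3/2}}{\sqrt{3}\,m_p^3|V'|} < 10^{-4}$$
or
$$V < 5\times10^{-11}\,m_p^4,$$
i.e. $H < 6\times10^{-5}\,m_p$. So H cannot be of the order of $m_p$.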
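The bound on V can be reproduced by combining Equation (12) with the slope bound (10). This is a rough sketch in units with $m_p = 1$; the function names are hypothetical and the numerical prefactors are those of the two equations as written in this section.

```python
import math

def max_potential(fluct_limit=1e-4, m_p=1.0):
    """Largest V allowed when |V'| sits at the bound (10) and (12) must stay below fluct_limit."""
    coeff = 16.0 / math.sqrt(3.0) * math.sqrt(8.0 / math.pi)
    return (fluct_limit / coeff) ** 2 * m_p ** 4

def hubble_rate(v, m_p=1.0):
    """H = (8 pi V / (3 m_p^2))^(1/2), as in Equation (4)."""
    return math.sqrt(8.0 * math.pi * v / 3.0) / m_p

v_max = max_potential()
h_max = hubble_rate(v_max)
print(v_max, h_max)
```

With these prefactors the sketch gives a bound on V close to $5\times10^{-11}\,m_p^4$ and a Hubble rate of order $10^{-5}\,m_p$, the same order of magnitude as the value quoted above and far below $m_p$.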


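In chaotic inflation, $V = \tfrac{1}{2}m^2\phi^2$ or $V = \tfrac{\lambda}{4}\phi^4$.
For $V = \tfrac{1}{2}m^2\phi^2$, as $V' = m^2\phi$, Equation (10),
$$|V'| < \left(\frac{\pi}{8}\right)^{1/2}\frac{V}{m_p},$$
gives
$$\phi > 3m_p.$$
Equation (12) gives
$$\frac{16V^{3/2}}{\sqrt{3}\,|V'|\,m_p^3} < 10^{-4}$$
or
$$m\phi^2 < m_p^3\cdot10^{-4},$$
with $\phi \approx 3m_p$ from above. Therefore,
$$m < 10^{-5}\,m_p \approx 10^{14}\ {\rm GeV}.$$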
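The two limits for the quadratic potential can be worked out explicitly from Equations (10) and (12). The sketch below is only illustrative: it keeps the numerical prefactors, works in units with $m_p = 1$, and converts to GeV with an assumed Planck mass of about $1.22\times10^{19}$ GeV; the function names are hypothetical.

```python
import math

M_PLANCK_GEV = 1.22e19  # assumed value of the Planck mass in GeV

def min_field_quadratic(m_p=1.0):
    """Equation (10) with V = m^2 phi^2 / 2: phi > 2 (8 / pi)^(1/2) m_p."""
    return 2.0 * math.sqrt(8.0 / math.pi) * m_p

def max_mass_quadratic(phi, fluct_limit=1e-4, m_p=1.0):
    """Equation (12) with V = m^2 phi^2 / 2: coeff * m phi^2 / m_p^3 < fluct_limit."""
    coeff = 16.0 * 0.5 ** 1.5 / math.sqrt(3.0)
    return fluct_limit * m_p ** 3 / (coeff * phi ** 2)

phi_min = min_field_quadratic()
m_max = max_mass_quadratic(phi_min)
print(phi_min, m_max, m_max * M_PLANCK_GEV)
```

The field bound comes out close to $3m_p$, and the mass bound is a few times $10^{-6}\,m_p$, consistent in order of magnitude with the limit stated above.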


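Is there any connection to GUTs? When $V = \lambda\phi^4/4$, Equation (10) gives
$$\lambda\phi^3 < \left(\frac{\pi}{8}\right)^{1/2}\frac{\lambda\phi^4}{4m_p}$$
or
$$\phi > 6m_p,$$
and Equation (12) gives
$$\frac{16}{\sqrt{3}}\,\frac{(\lambda/4)^{3/2}\phi^6}{m_p^3\,\lambda\phi^3} < 10^{-4}.$$
Therefore, as $\phi \geq 6m_p$,
$$\lambda^{1/2} < 10^{-6} \quad\text{or}\quad \lambda < 10^{-12}\ \text{to}\ 10^{-11}.$$

9. Quantum Effects in the Early Universe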

Here 'Early' refers to the pre-inflationary era (assuming that there was an
inflation). These studies had begun even before the inflationary scenario was
discussed.
It had been suggested that the initial anisotropy in the Universe leads to the creation of particles and their back reaction smooths out the Universe and makes it isotropic. One should be able to calculate the back reaction of the Universe (the metric) due to particle production.
The methods used for this study also enable us to write down the potentials in
curved spacetimes which are needed for the study of the 'inflationary scenarios'.
The question of back reaction is one that leads to divergences in field theory
and the main problem is to be able to handle the divergences in a consistent way
and get finite expressions for quantities of interest.

10. The Fundamental Problem

The main difficulty is that $\langle 0|T_{\mu\nu}|0\rangle$ diverges. This is the general divergence associated with the expectation values of operator products (here $\langle 0|\phi^2|0\rangle$) or quadratic operators.

11. De Witt-Schwinger Expansion of Green's Function

Consider Green's function for a scalar field defined in the usual way by
$$[\Box_x + m^2 + \xi R(x)]\,G_F(x,x') = -[-g(x)]^{-1/2}\,\delta^n(x - x'). \qquad (13)$$
Recall
$$iG_F(x,x') = \langle 0|T\phi(x)\phi(x')|0\rangle.$$

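When $x \to x'$, we find divergences in the expression for $G_F(x,x')$. Using Riemann normal coordinates y at the point x', with $x' - x = y$,
$$g_{\mu\nu}(x) = \eta_{\mu\nu} + \tfrac{1}{3}R_{\mu\alpha\nu\beta}\,y^\alpha y^\beta - \tfrac{1}{6}R_{\mu\alpha\nu\beta;\gamma}\,y^\alpha y^\beta y^\gamma + \left[\tfrac{1}{20}R_{\mu\alpha\nu\beta;\gamma\delta} + \tfrac{2}{45}R_{\alpha\mu\beta\lambda}R^{\lambda}{}_{\gamma\nu\delta}\right]y^\alpha y^\beta y^\gamma y^\delta + \cdots,$$
where $\eta_{\mu\nu}$ is the Minkowski metric tensor and the coefficients are evaluated at y = 0.
Define
$$\mathcal{G}_F(x,x') = (-g(x))^{1/2}\,G_F(x,x')$$
and its Fourier transform by
$$\mathcal{G}_F(x,x') = (2\pi)^{-n}\int d^nk\,e^{-iky}\,\mathcal{G}_F(k),$$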

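where $ky = \eta^{\alpha\beta}k_\alpha y_\beta$. One works in a localized momentum space. Solving Green's function in Equation (13) in this localized k-space, after expanding in normal coordinates and taking the Fourier transform of Green's function, we get
$$\mathcal{G}_F(k) = (k^2 - m^2)^{-1} - (\tfrac{1}{6} - \xi)R\,(k^2 - m^2)^{-2} + \tfrac{i}{2}(\tfrac{1}{6} - \xi)R_{;\alpha}\,\partial^\alpha(k^2 - m^2)^{-2} - \tfrac{1}{3}a_{\alpha\beta}\,\partial^\alpha\partial^\beta(k^2 - m^2)^{-2} + \left[(\tfrac{1}{6} - \xi)^2R^2 + \tfrac{2}{3}a^{\lambda}{}_{\lambda}\right](k^2 - m^2)^{-3},$$
with $\partial^\alpha \equiv \partial/\partial k_\alpha$, and
$$\mathcal{G}_F(x,x') = \int\frac{d^nk}{(2\pi)^n}\,e^{-iky}\left[a_0(x,x') + a_1(x,x')\left(-\frac{\partial}{\partial m^2}\right) + a_2(x,x')\left(\frac{\partial}{\partial m^2}\right)^2\right](k^2 - m^2)^{-1},$$
where, to adiabatic order four (four derivatives of $g_{\mu\nu}$),
$$a_0(x,x') = 1,$$
$$a_1(x,x') = (\tfrac{1}{6} - \xi)R - \tfrac{1}{2}(\tfrac{1}{6} - \xi)R_{;\alpha}\,y^\alpha - \tfrac{1}{3}a_{\alpha\beta}\,y^\alpha y^\beta,$$
$$a_2(x,x') = \tfrac{1}{2}(\tfrac{1}{6} - \xi)^2R^2 + \tfrac{1}{3}a^{\lambda}{}_{\lambda},$$
$$a_{\alpha\beta} = \tfrac{1}{2}(\xi - \tfrac{1}{6})R_{;\alpha\beta} + \tfrac{1}{120}R_{;\alpha\beta} - \tfrac{1}{40}R_{\alpha\beta;\lambda}{}^{\lambda} - \tfrac{1}{30}R_{\alpha}{}^{\lambda}R_{\lambda\beta} + \tfrac{1}{60}R^{\kappa}{}_{\alpha}{}^{\lambda}{}_{\beta}R_{\kappa\lambda} + \tfrac{1}{60}R^{\lambda\mu\kappa}{}_{\alpha}R_{\lambda\mu\kappa\beta}.$$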
We now use the integral representation
$$(k^2 - m^2 + i\epsilon)^{-1} = -i\int_0^\infty ds\,e^{is(k^2 - m^2 + i\epsilon)}$$
and interchange the order of the $d^nk$ and $ds$ integrals. Then the k integration can be done and we get
$$\mathcal{G}_F(x,x') = -i(4\pi)^{-n/2}\int_0^\infty i\,ds\,(is)^{-n/2}\exp\left[-im^2s + \frac{\sigma}{2is}\right]F(x,x';is),$$
where
$$\sigma(x,x') = \tfrac{1}{2}y_\alpha y^\alpha$$
(half the square of the proper distance) and
$$F(x,x';is) \approx a_0(x,x') + a_1(x,x')\,is + a_2(x,x')\,(is)^2.$$
As $G_F$ is related to $\mathcal{G}_F$, we have the De Witt-Schwinger (DS) representation
$$G_F^{DS}(x,x') = -i\Delta^{1/2}(x,x')(4\pi)^{-n/2}\int_0^\infty i\,ds\,(is)^{-n/2}\exp\left[-im^2s + \frac{\sigma}{2is}\right]F(x,x';is),$$
where
$$\Delta(x,x') = -\det[\partial_\mu\partial_{\nu'}\sigma(x,x')]\,[g(x)g(x')]^{-1/2}.$$
In the normal coordinates, we use $\Delta \to (-g(x))^{-1/2}$. To all orders, we can write F as
$$F(x,x';is) = \sum_{j=0}^{\infty}a_j(x,x')\,(is)^j$$
with $a_0(x,x') \equiv 1$.
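The proper-time integral representation used above can be checked numerically for a finite damping parameter. The following is a minimal sketch, assuming a simple trapezoidal rule with a finite upper cutoff; the function name, the cutoff, and the sample values of $k^2 - m^2$ and $\epsilon$ are hypothetical.

```python
import cmath

def proper_time_integral(x, eps, s_max=80.0, steps=200_000):
    """Trapezoidal estimate of -i * integral_0^s_max exp(i s (x + i eps)) ds."""
    h = s_max / steps
    z = complex(x, eps)
    # End-point contributions of the trapezoidal rule.
    total = 0.5 * (1.0 + cmath.exp(1j * s_max * z))
    for k in range(1, steps):
        total += cmath.exp(1j * k * h * z)
    return -1j * h * total

# Sample point (assumption): k^2 - m^2 = 2, eps = 0.5.
X, EPS = 2.0, 0.5
numeric = proper_time_integral(X, EPS)
exact = 1.0 / complex(X, EPS)
print(numeric, exact)
```

The damping factor exp(-eps s) makes the integrand negligible at the cutoff, and the numerical value agrees with 1/(k^2 - m^2 + i eps) to the accuracy of the quadrature.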


The DS representation is exact for the Feynman propagator. But the expansions of F are asymptotic approximations in the limit of large adiabatic parameter T.
Using the expansion, we can integrate over s to give

where $\sigma \to \sigma - i\epsilon$, and $H^{(2)}$ is the Hankel function of the second kind.
As global boundary conditions have not been used, the vacuum state in the defining equation is not determined. But that is not important for high-frequency behaviour. Recall
$$[\det(-G_F)]^{1/2} = \exp[\tfrac{1}{2}\,{\rm tr}\ln(-G_F)].$$
Now the field equation is
$$G_{\mu\nu} + \Lambda g_{\mu\nu} = -8\pi G\,T_{\mu\nu}.$$
In a semiclassical theory, we substitute $\langle T_{\mu\nu}\rangle$ so that $T_{\mu\nu}$ (matter part) is treated quantum mechanically but gravitation is treated classically.
The action is $S = S_g + S_m$.
$$\frac{2}{\sqrt{-g}}\frac{\delta S}{\delta g^{\mu\nu}} = 0$$
gives the field equation given above, while a variation of $S_g$ gives the left-hand side and that of $S_m$ gives the right-hand side.
We seek a quantity W, called the 'effective action' for the quantum matter fields, which, when functionally differentiated, gives $\langle T_{\mu\nu}\rangle$:
$$\frac{2}{\sqrt{-g}}\frac{\delta W}{\delta g^{\mu\nu}} = \langle T_{\mu\nu}\rangle.$$
The generating functional Z is given by
$$Z[J] = \int\mathcal{D}[\phi]\exp\left\{iS_m[\phi] + i\int J(x)\phi(x)\,d^4x\right\} = \langle{\rm out},0|0,{\rm in}\rangle.$$
J is the external field and can produce particles. In flat space when J = 0,
$$Z[0] = \langle 0|0\rangle = 1,$$
but this is not true in CST:
$$\delta Z[0] = i\int\mathcal{D}[\phi]\,\delta S_m\,e^{iS_m[\phi]} = i\langle{\rm out},0|\delta S_m|0,{\rm in}\rangle,$$
i.e.
$$\frac{2}{\sqrt{-g}}\frac{\delta Z[0]}{\delta g^{\mu\nu}} = i\langle{\rm out},0|T_{\mu\nu}|0,{\rm in}\rangle.$$
So if $Z[0] = e^{iW}$ then
$$W = -i\ln\langle{\rm out},0|0,{\rm in}\rangle$$
and
$$\frac{2}{\sqrt{-g}}\frac{\delta W}{\delta g^{\mu\nu}} = \frac{\langle{\rm out},0|T_{\mu\nu}|0,{\rm in}\rangle}{\langle{\rm out},0|0,{\rm in}\rangle}.$$

We now change to

d"x

->

~ d"x; b"(x - y) -> b"(x - y)J -g(x),

d" xJ - g(x) b"(x - y) J - g(x) == 1,

$$Z[0] \propto [\det(-G_F)]^{1/2},$$
$$W = -i\ln Z[0] = -\tfrac{i}{2}\,{\rm Tr}[\ln(-G_F)].$$

If
KXY = (Ox

GF -- - K -

+ mZ 1 -

+ ~R) b"(x -

ie

-'
I

en

e-

iks

y)J - g(x),

d s,

If GF is an operator which acts in the space of vectors

I x> ,

<xix'> = b"(x - x')[ - g(x)] - 1/2,

IX! exp( - iks)(is)-li ds =


Ei(x)

= ')' + In( -

x)

Ei( - iAk)

(exponental integral function)

+ O(x).

Thus

IX! exp( - iks)(is)-l ids = In( - G

F)

-In(k)

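and
$$\langle x|\ln(-G_F^{DS})|x'\rangle = -\int_{m^2}^{\infty}G_F^{DS}(x,x')\,dm^2;$$
the integral over $dm^2$ brings down an extra power of $(is)^{-1}$ in $(-\ln K)$:
$$W = \frac{i}{2}\int d^nx\,\sqrt{-g}\,\lim_{x'\to x}\int_{m^2}^{\infty}dm^2\,G_F^{DS}(x,x'),$$
interchanging the order of integrations.
W is called the effective action (also denoted by $\Gamma$). If
$$W \equiv \int\mathcal{L}_{\rm eff}\,d^nx \equiv \int\sqrt{-g(x)}\,L_{\rm eff}\,d^nx,$$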

then

The divergence as $x \to x'$ makes $L_{\rm eff}$ diverge at lower values of s.
In four dimensions, the divergent parts of $L_{\rm eff}$ are
$$L_{\rm div} = -\lim_{x'\to x}\frac{\Delta^{1/2}(x,x')}{32\pi^2}\int_0^\infty\frac{ds}{s^3}\exp\left[-i\left(m^2s - \frac{\sigma}{2s}\right)\right]\left[a_0(x,x') + is\,a_1(x,x') + (is)^2a_2(x,x')\right].$$
The remaining terms ($a_3$, etc.) are finite as $x' \to x$ ($s \to 0$) as they cancel $s^3$ in the denominator. The expressions for $a_0$, $a_1$, $a_2$ given earlier are entirely geometrical, depending on $R_{\mu\nu\rho\sigma}$ and its derivatives and contractions. They probe the local geometry of the neighbourhood as they arise from the ultraviolet behaviour of modes.
We can consider $L_{\rm div}$ as a contribution to the gravitational rather than the quantum matter Lagrangian. This is not true for the remaining finite portions of $L_{\rm eff}$, which depend on the large-scale structure of the manifold as well as the quantum state.
12. Renormalization

$$W = \frac{i}{2}\int_{m^2}^{\infty}dm^2\int d^nx\,\sqrt{-g}\,G_F^{DS}(x,x'),$$
$$G_F^{DS}(x,x') = -i\Delta^{1/2}(x,x')(4\pi)^{-n/2}\int_0^\infty i\,ds\,(is)^{-n/2}\exp\left[-im^2s + \frac{\sigma}{2is}\right]F(x,x';is),$$
$$L_{\rm eff} = \lim_{x'\to x}\frac{\Delta^{1/2}(x,x')}{2(4\pi)^{n/2}}\sum_{j=0}^{\infty}a_j(x,x')\int_0^\infty(is)^{j-1-n/2}\,e^{-i(m^2s - \sigma/2s)}\,i\,ds.$$
The integral diverges at the lower limit $s \to 0$, as the damping factor $\sigma/2s$ in the exponent vanishes as $x' \to x$. So the first $(\tfrac{1}{2}n + 1)$ terms are divergent.
If the number of dimensions is treated as a variable and analytically continued throughout the complex plane, we have (for $\sigma = 0$)
$$L_{\rm eff} = \tfrac{1}{2}(4\pi)^{-n/2}\sum_{j=0}^{\infty}a_j(x)\,(m^2)^{n/2-j}\,\Gamma(j - \tfrac{n}{2}),$$
where $a_j(x,x) \equiv a_j(x)$.


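We wish to retain $L_{\rm eff}$ as (length)$^{-4}$, even when $n \neq 4$. So we have to introduce an arbitrary mass scale $\mu$ and write
$$L_{\rm eff} = \tfrac{1}{2}(4\pi)^{-n/2}\left(\frac{m}{\mu}\right)^{n-4}\sum_{j=0}^{\infty}a_j(x)\,m^{4-2j}\,\Gamma(j - \tfrac{n}{2}).$$
As $n \to 4$, the first three terms diverge because of poles in the $\Gamma$-function:
$$\Gamma(-\tfrac{n}{2}) = \frac{4}{n(n-2)}\left(\frac{2}{4-n} - \gamma\right) + O(n-4),$$
$$\Gamma(1 - \tfrac{n}{2}) = \frac{2}{2-n}\left(\frac{2}{4-n} - \gamma\right) + O(n-4),$$
$$\Gamma(2 - \tfrac{n}{2}) = \frac{2}{4-n} - \gamma + O(n-4),$$
$$L_{\rm div} = -(4\pi)^{-n/2}\left\{\frac{1}{n-4} + \frac{1}{2}\left[\gamma + \ln\frac{m^2}{\mu^2}\right]\right\}\left(\frac{4m^4a_0}{n(n-2)} - \frac{2m^2a_1}{n-2} + a_2\right),$$
where
$$\left(\frac{m}{\mu}\right)^{n-4} = e^{(n-4)\ln(m/\mu)} = 1 + \tfrac{1}{2}(n-4)\ln\frac{m^2}{\mu^2} + O((n-4)^2)$$
and we have dropped the terms that vanish as $n \to 4$, and
$$a_2(x) = \tfrac{1}{180}R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} - \tfrac{1}{180}R^{\alpha\beta}R_{\alpha\beta} - \tfrac{1}{6}(\tfrac{1}{5} - \xi)\Box R + \tfrac{1}{2}(\tfrac{1}{6} - \xi)^2R^2.$$
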
[These are the Meenakshisundaram-De Witt-Hadamard coefficients, sometimes abbreviated to 'Hamidew coefficients'.]
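The pole expansions of the $\Gamma$-function quoted above can be compared with exact values slightly away from n = 4. This is a minimal sketch using only the standard library; the function names and the sample dimension are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def pole_term(n):
    """Common factor 2 / (4 - n) - gamma appearing in all three expansions."""
    return 2.0 / (4.0 - n) - EULER_GAMMA

def gamma_minus_half_n(n):
    """Approximation to Gamma(-n/2) near n = 4."""
    return 4.0 / (n * (n - 2.0)) * pole_term(n)

def gamma_one_minus_half_n(n):
    """Approximation to Gamma(1 - n/2) near n = 4."""
    return 2.0 / (2.0 - n) * pole_term(n)

def gamma_two_minus_half_n(n):
    """Approximation to Gamma(2 - n/2) near n = 4."""
    return pole_term(n)

N_DIM = 3.999  # sample dimension close to 4 (assumption)
errors = [
    abs(math.gamma(-N_DIM / 2.0) - gamma_minus_half_n(N_DIM)),
    abs(math.gamma(1.0 - N_DIM / 2.0) - gamma_one_minus_half_n(N_DIM)),
    abs(math.gamma(2.0 - N_DIM / 2.0) - gamma_two_minus_half_n(N_DIM)),
]
print(errors)
```

Each exact value is of order a thousand at this dimension, while the differences from the pole expansions are of order n - 4, as the O(n - 4) remainders require.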
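So $L_{\rm div}$ is a purely geometric expression and we can absorb it into the gravitational part of the Lagrangian
$$L_g = \frac{R - 2\Lambda_B}{16\pi G_B} = \frac{R}{16\pi G_B} - \frac{\Lambda_B}{8\pi G_B},$$
$$L = -\left(A + \frac{\Lambda_B}{8\pi G_B}\right) + \left(B + \frac{1}{16\pi G_B}\right)R,$$
with
$$A = \frac{4m^4}{(4\pi)^{n/2}\,n(n-2)}\left\{\frac{1}{n-4} + \frac{1}{2}\left[\gamma + \ln\frac{m^2}{\mu^2}\right]\right\}, \qquad B = \frac{2m^2(\tfrac{1}{6} - \xi)}{(4\pi)^{n/2}(n-2)}\left\{\frac{1}{n-4} + \frac{1}{2}\left[\gamma + \ln\frac{m^2}{\mu^2}\right]\right\}.$$
If $\Lambda_B$, $G_B$ refer to 'bare' values, then A and B renormalize $\Lambda_B$ and $G_B$:
$$\Lambda = \Lambda_B + 8\pi G_BA,$$
$$G = G_B/(1 + 16\pi G_BB).$$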
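The $a_2(x)$ term is of adiabatic order 4 and is fourth-order in the derivatives of the metric. In general relativity, we have only second derivatives. When this term is included, we have, on the left-hand side of Einstein's equation,
$$R_{\mu\nu} - \tfrac{1}{2}Rg_{\mu\nu} + \Lambda g_{\mu\nu} + \alpha\,{}^{(1)}H_{\mu\nu} + \beta\,{}^{(2)}H_{\mu\nu} + \gamma\,{}^{(3)}H_{\mu\nu},$$
where
$${}^{(1)}H_{\mu\nu} \equiv \frac{1}{\sqrt{-g}}\frac{\delta}{\delta g^{\mu\nu}}\int\sqrt{-g}\,R^2\,d^nx = 2R_{;\mu\nu} - 2g_{\mu\nu}\Box R - \tfrac{1}{2}g_{\mu\nu}R^2 + 2RR_{\mu\nu},$$
$${}^{(2)}H_{\mu\nu} \equiv \frac{1}{\sqrt{-g}}\frac{\delta}{\delta g^{\mu\nu}}\int\sqrt{-g}\,R_{\alpha\beta}R^{\alpha\beta}\,d^nx = R_{;\mu\nu} - \tfrac{1}{2}g_{\mu\nu}\Box R - \Box R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}R_{\alpha\beta}R^{\alpha\beta} + 2R^{\alpha\beta}R_{\alpha\mu\beta\nu}$$
$$= 2R_{\mu}{}^{\alpha}{}_{;\nu\alpha} - \Box R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}\Box R + 2R_{\mu}{}^{\alpha}R_{\alpha\nu} - \tfrac{1}{2}g_{\mu\nu}R_{\alpha\beta}R^{\alpha\beta},$$
$${}^{(3)}H_{\mu\nu} \equiv \frac{1}{\sqrt{-g}}\frac{\delta}{\delta g^{\mu\nu}}\int\sqrt{-g}\,R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta}\,d^nx = -\tfrac{1}{2}g_{\mu\nu}R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} + 2R_{\mu\alpha\beta\gamma}R_{\nu}{}^{\alpha\beta\gamma} - 4\Box R_{\mu\nu} + 2R_{;\mu\nu} - 4R_{\mu\alpha}R^{\alpha}{}_{\nu} + 4R^{\alpha\beta}R_{\alpha\mu\beta\nu}.$$
For n = 4, the generalized Gauss-Bonnet theorem states that
$$\int d^4x\,[-g]^{1/2}\left[R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} + R^2 - 4R_{\alpha\beta}R^{\alpha\beta}\right]$$
is a topological invariant (called the Euler number), so its metric variation will vanish identically:
$${}^{(3)}H_{\mu\nu} = -{}^{(1)}H_{\mu\nu} + 4\,{}^{(2)}H_{\mu\nu}.$$
So we can consider the original Lagrangian to have these terms with coefficients $a_B$, $b_B$, $c_B$ which, with the addition of the $\alpha$, $\beta$, $\gamma$ terms, become a, b, c. As only two are independent (for n = 4), we can take c = 0. These renormalized a and b are determined experimentally (like G and $\Lambda$) and seem to be vanishing:
$$a_{\rm exp} = 0 = b_{\rm exp}.$$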

What we have used is called dimensional regularization.

13. Other Methods

A second method is the zeta-function technique (described by Lohiya in Chapter 16 of this volume), which has the merit that analytic continuation automatically removes all the infinite terms. A third method is point splitting, where we take
$$\sigma = 2\epsilon^2\,t^\mu t_\mu, \qquad t^\mu t_\mu = +1\ (-1)\ \text{time-like (space-like)}, \qquad L_{\rm div} \propto \frac{1}{\epsilon^4}.$$
We then average over all directions $t^\mu$.

14. Example of Back Reaction [3]

The back reaction on the gravitational field due to interacting fields is seen in the following example. Normally $\pi^0 \to 2\gamma$ in $10^{-16}$ sec. In the presence of a gravitational field, $\pi^0$ and $2\gamma$ are produced, which later become $4\gamma$. We can take the $\pi^0$ mass to be zero when the curvature is $> 10^{26}\ {\rm cm}^{-2}$, or $|R(t)|^{-1/2} \sim 10^{-13}$ cm, which corresponds to an energy scale greater than $m_\pi$.
5e

= f3* FllvP v<p =

f36"py~k" Apk Y A~.

$\beta$ has the dimension of length, $\beta = 1.26\times10^{-16}$ cm.

For k = 0, the RW metric is conformally flat and given by
$$ds^2 = a^2(\eta)(d\eta^2 - d\mathbf{x}^2).$$
With a conformally coupled pseudo-scalar field, only the coupling breaks conformal symmetry and this we handle in perturbation theory.

E = i"
L.

k.A

W)1/2 e (a e,kx
. ( -2V
k
U

at e-'kx)
H

'



B

W)1/1

( -2V
= i"
~
k.)'

k /dk (a

k).


.
.
e'kx - atk)' e-'kx),

(k1 A1; k1' A2 ; k3 IS(1)IO>

= 4ifJ d 4 xyCga- 4 (k 1A1, k1 A1; k 3 1(E' B)rf>IO>.


Substituting E, ii, rf> and integrating over a space-like surface orthogonal to the
conformal Killing vector 0/0"
(k1 k1 k3 1S 1 10>

W' W

= - 2ifJ ( 2~w:
x

)1/1

(51<,+1<2+/(3.0 X

[e k ,(A 1)'(I<1 /\ ek ,(A 1)) + ek2 (A 1)K1 /\ ek ,]

I(Sl>I Z is the number of particle triplets per unit proper volume. Using

L [e

k ,'

K1 /\

ek , + ek2 (K1

/\

ek2 )]Z = 2(1

- k1 k1)1.

).,).,

The number of particles/unit proper volume is

4fJz

" = ~

a V

"

1<,1<21<3

w 1 w1

--(1 W3

,..,..

K1 KZ)

where

Passing to continuum and integrating over k3'

where
W3

= Iwi + w~ + 2K1 Kz

1/ 1 .

By inserting a factor in this integral (W1 + w1 + w3)/a, we get p (total energy


density created. There are additional oscillatory terms in (Too> which vanish in
this case.

15. Applications

Consider a model $a(\eta) = A^2 + (\eta/\eta_0)^2$, with A and $\eta_0$ constants, which represents a universe contracting to a minimum value $A^2$ of the scale factor and expanding again. In the asymptotic region, it behaves like a matter-dominated Friedman universe. Then

P = 16

'102/3 2
4 2 4
n A a

f d3k1 d 3k2 [

W 1W 2
- - ( W l +W 2
W3

x (1- kl(2)2exp - 2n'1oA(w1

+ W 3) x

+ w 2 + W 3 )]

= 105/3 2 /[4(2n)12 '18 A 10 a4 ].


Introduce $R_B$ (scalar curvature at bounce), where
$$a(0) = A^2 = a_B.$$
For the metric $a(\eta) = A^2 + (\eta/\eta_0)^2$, the scalar curvature is
$$R = \frac{12}{\eta_0^2\,(A^2 + \eta^2/\eta_0^2)^3}; \qquad R_B = \frac{12}{\eta_0^2A^6}.$$
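The curvature formula for the bounce model can be cross-checked numerically. The sketch below assumes that, for a spatially flat metric $ds^2 = a^2(\eta)(d\eta^2 - d\mathbf{x}^2)$, the magnitude of the scalar curvature is $6a''/a^3$ (the overall sign depends on convention); the function names and the sample constants are hypothetical.

```python
def scale_factor(eta, a_min, eta0):
    """a(eta) = A^2 + (eta / eta0)^2, with a_min standing for A^2."""
    return a_min + (eta / eta0) ** 2

def curvature_closed_form(eta, a_min, eta0):
    """R = 12 / (eta0^2 (A^2 + eta^2 / eta0^2)^3)."""
    return 12.0 / (eta0 ** 2 * (a_min + eta ** 2 / eta0 ** 2) ** 3)

def curvature_numeric(eta, a_min, eta0, h=1e-3):
    """|R| = 6 a'' / a^3, with a'' from a central second difference."""
    a_plus = scale_factor(eta + h, a_min, eta0)
    a_mid = scale_factor(eta, a_min, eta0)
    a_minus = scale_factor(eta - h, a_min, eta0)
    second = (a_plus - 2.0 * a_mid + a_minus) / h ** 2
    return 6.0 * second / a_mid ** 3

# Sample constants (assumption): A^2 = 1.5, eta0 = 2.
A_MIN, ETA0 = 1.5, 2.0
r_bounce = curvature_closed_form(0.0, A_MIN, ETA0)
print(r_bounce, curvature_numeric(0.0, A_MIN, ETA0))
print(curvature_closed_form(1.0, A_MIN, ETA0), curvature_numeric(1.0, A_MIN, ETA0))
```

At $\eta = 0$ the closed form reduces to $12/(\eta_0^2A^6)$, the bounce curvature $R_B$, and the finite-difference estimate agrees with it.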

Using
1
'1~
p

RBA6

12

and

A2

= aB' /3 = 12

10- 16 cm,

= 105/32 R~ a~ = 2 x 10- 47 R 3(a B)4 gm ,

Pnow
and
a

a n: w

(2n)12 6912 a 4

= Nk~MBR = 10-34 gm
c

(t
=

tn:w

cc

)2/3

2
; R B =-3
tB

Hence,
(pt 8 / 3 )now ~ 1O- 48 /t10/ 3 ,

with
tB

= 10- 26 cm.

cc


So back reaction, due to particle production, avoids the singularity and gives a bounce.
References
1. A. Albrecht and P. Steinhardt, Phys. Rev. Lett. 48, 1220 (1982).
2. N. D. Birrell and P. C. W. Davies, Proc. Roy. Soc. A360, 117 (1978).
3. N. D. Birrell, P. C. W. Davies, and L. Ford, J. Phys. A13, 961 (1980).
4. N. D. Birrell and P. C. W. Davies, Quantum Fields in Curved Space, Cambridge University Press, Cambridge (1982).
5. S. Coleman and E. Weinberg, Phys. Rev. D7, 1888 (1973).
6. G. W. Gibbons, S. W. Hawking, and S. T. Siklos, The Very Early Universe, Cambridge University Press, Cambridge (1983).
7. A. Guth, Phys. Rev. D23, 347 (1981).
8. J. J. Halliwell and S. W. Hawking, Phys. Rev. D31, 1777 (1985).
9. S. W. Hawking, Phys. Lett. 150B, 339 (1985).
10. B. L. Hu and T. Shen, Phys. Rev. D31, 2000 (1985).
11. A. Linde, Phys. Lett. 129B, 177 (1983).
12. L. Parker, Phys. Rev. D183, 1057 (1969).

18. Quantum Cosmology


The Story So Far
T. PADMANABHAN
Theoretical Astrophysics Group, Tata Institute of Fundamental Research,
Homi Bhabha Road, Bombay 400005, India

What is that, Lord, which being known,


all these become known? - Mundako Upanishad.

1. Introduction

1.1. Quantum Gravity - a Distant Dream

While commenting about the state of quantum gravity, Lee Smolin said in 1979:¹ "... while there has been a lot of interesting and imaginative work ... nothing which could be definitely called progress has been accomplished in this time - say, two decades ...". In the course of this chapter, I will try to convince you that the situation remains just as bad today.
Those who have doubts about the state of the art of quantum gravity need only ask themselves any of the following questions²: Did the universe have an origin? What determines the matter content of the universe? Why was there local thermodynamic equilibrium in the early universe? What happens to the matter that falls into the black hole? Can quantum physics determine the topology of the spacetime? [If so, how and, if not, why not?] Can quantum physics change causal relationships between spacetime events? Do the zero-point fluctuations contribute to gravity? How can one compute high-energy gravitational scattering cross-sections? Why is the cosmological constant zero? All these questions are linked with quantum gravity, either directly or indirectly. Today, after nearly three decades of research in quantum gravity, we do not have clearcut answers to any of the above questions.
The troubles with quantum gravity are partly mathematical and partly physical. It is not known how to compute observable quantities in quantum field theory based on an arbitrary Lagrangian. The only systematic method, based on perturbation theory, will work only if the Lagrangian belongs to a special set [called 'perturbatively renormalizable theories']. Unfortunately, the gravitational Lagrangian does not belong to this subset.³ The simplest route to quantum gravity, therefore, is blocked.

373
B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 373-404.

1989 by Kluwer Academic Publishers.


What are the alternatives we are left with at this stage?

(i) One may assume that gravity is special and need not be quantized. It is, however, very unlikely that such is the case. Strong arguments (bordering on a proof!) can be given⁴ to show that there are compelling reasons to quantize gravity.
(ii) It is possible that Einstein's theory of gravity is wrong or, at least, incomplete. One can look for viable extensions of the theory which will be either perturbatively renormalizable or perturbatively finite. Investigations in supergravity, string models, higher derivative gravity, etc., fall under this class. None of these attempts has yet produced a viable, meaningful alternative which is free of problems.⁵
(iii) One can attack the conventional gravitational Lagrangian by nonperturbative techniques. This approach probably has not received the attention it deserves; whatever work has been done so far falls short of providing a clear picture of the high-energy behaviour of gravity.⁶
(iv) One can attempt to give a quantum mechanical meaning to some classical solutions in gravity. By restricting oneself to situations that possess a high degree of symmetry, the mathematical complexities can be handled, leading to a 'makeshift' theory.
The subject of quantum cosmology, which we shall now describe, is based on the philosophy of (iv) above.

1.2. What is Quantum Cosmology?

The role of 'quantum cosmology' in the framework of quantum gravity is indicated in Figure 1. The correct, justifiable route to quantum cosmology should be via the solid lines in the figure: combining quantum field theory and gravity, one should obtain quantum gravity; thereafter, by specializing to cosmological models one obtains the structure of quantum cosmology. Lacking the ability to achieve the former, one adopts a 'back-door entry' into the subject, indicated by dotted lines in Figure 1.
The procedure involves restricting oneself to a particular form of the metric, described by a finite number of functions $q_A(t)$ [A = 1, 2, ..., N, say] of some time coordinate t. [All homogeneous cosmological models can be thus described.⁷] Instead of handling the infinite number of degrees of freedom in $g_{ik}(x, t)$, we restrict ourselves to those classes of metrics which are describable by a finite number of functions of time. This gigantic swindle [which usually appears in published literature under the glib phrase 'restricting to the minisuperspace'!⁸] allows one to replace quantum field theory by quantum mechanics with a finite number of degrees of freedom. The quantum theory of $q_A(t)$ constitutes a 'quantum cosmological model'.
Thus, all approaches to quantum cosmology provide a quantum mechanical
meaning to a classical cosmological solution. There is no guarantee that the

Fig. 1. The routes to quantum cosmology. The solid lines indicate the proper route to quantum cosmology-I, and require a knowledge of quantum gravity. All known models take the route indicated by the dotted lines and reach quantum cosmology-II. It is a pious hope of the quantum cosmologist that versions I and II have at least something in common!

'quantum cosmology-II' thus obtained will agree with the correct 'quantum
cosmology-I'. The equality or otherwise of I and II is difficult to prove, since we
have no handy version I! In other words, the validity or otherwise of any
quantum cosmological model can be verified only after a complete theory of
quantum gravity is available. How can one, therefore, choose between different
quantum cosmological models?
In this chapter, I shall emphasize one single criterion of selection above all
others: The model should provide a working formalism which will give clearcut
'yes/no' answers to questions regarding cosmology. Further, as much as possible,
one would like to keep the methodology free of conceptual problems.
It should be clear from the above discussion that each quantum cosmological
model is characterized by (i), the choice of variables qA(t), and (ii) the method used
in quantizing these variables. Both of these steps are nontrivial and are burdened
with conceptual problems. Instead of discussing these problems in a general
context, we shall concentrate on one specific model 9 . Comparisons and
digressions will be made wherever appropriate.

2. Minisuperspace of Conformal Degree of Freedom

2.1. Quantizing the Conformal Part

To motivate the particular approach to quantum cosmology we are interested in, it is necessary to recognize one major conceptual problem in quantum gravity: that of fluctuating light cones.
In standard, nongravitational field theory, quantum fields propagate from one space-like hypersurface to another. The concepts of space-like, null, and time-like separation of events are properties of spacetime and are independent of the fields. In quantum gravity, this situation is drastically altered. The light cone structure of spacetime is determined by the metric, which itself has now become a dynamical variable. Light cones and causal relationships between events are no longer fixed a priori, but arise out of dynamics; on the other hand, dynamics cannot be formulated until light cones are given!
How can one make headway in such a situation? In a full theory of quantum gravity, this problem has to be faced and solved, but is it possible to avoid this problem by a judicious choice of variables $q_A(t)$ in the 'minisuperspace' models?
The answer is 'yes'. Consider the class of all metrics $g_{ik}(x, t)$ of the form
$$g_{ik}(x, t) = \Omega^2(x, t)\,\bar g_{ik}, \qquad (1)$$
in which $\bar g_{ik}$ is some specified fiducial metric and $\Omega^2(x, t)$ is some $C^2$ function. It can be easily shown that all the metrics $g_{ik}(x, t)$ [for various choices of $\Omega$] have the same light cone structure as $\bar g_{ik}$.¹⁰ Conversely, a (given) light cone structure for the spacetime will determine the equivalence class of metrics in (1) which differ by the choice of $\Omega$. Thus, any metric can be separated into a 'conformal part' $\Omega^2(x, t)$ and a 'light cone part' $\bar g_{ik}$. In the spirit of the 'minisuperspace' approach we may now consider the quantization of the conformal degree of freedom of the metric, viz. the quantization of $\Omega(x, t)$. As long as quantum fluctuations leave $\bar g_{ik}$ fixed, the light cone structure is not altered.
In addition to the advantage of bypassing light cone fluctuations, the conformal quantization also possesses the following attractive features: (i) The separation in (1) is generally covariant. Under a coordinate transformation, $\Omega$ transforms as a scalar and $\bar g_{ik}$ transforms as a second-rank symmetric tensor. (ii) All homogeneous, isotropic spacetimes considered in cosmology are conformally flat.¹¹ In other words, the conformal degree of freedom is the only true degree of freedom in Robertson-Walker cosmology. (iii) The Lagrangian describing the dynamics of the conformal factor $\Omega(x, t)$ is quadratic in $\Omega$, facilitating nonperturbative calculations. We shall say more about these aspects later.
The mechanics of quantizing $\Omega(x, t)$ works easiest if we approach quantum
gravity using Feynman's path integral approach. In this approach, the probability amplitude for transition from a particular three-geometry $G_1$ to another
three-geometry $G_2$ is given by

$$K(G_2; G_1) = \int \mathcal{D}g_{ik}\, \exp iA[g_{ik}], \tag{2}$$

where $A[g_{ik}]$ is the action for the gravitational field and matter,

$$A[g_{ik}] = \frac{1}{16\pi G}\int R\sqrt{-g}\; d^4x + A(\text{matter}). \tag{3}$$

The 'sum over paths' in (2) is taken over all metrics which are consistent with the
three-geometries $G_2$ and $G_1$ at the boundaries of the spacetime domain under
consideration [12]. Let us now suppose that we are only interested in the metrics of
the form in (1). For these metrics [with a fixed $\bar g_{ik}$], the action $A$ becomes
a functional of $\Omega(x^i)$. Using the facts that
(4)

(here $R$ is computed from $g_{ik}$ and $\bar R$ is computed from $\bar g_{ik} = \Omega^{-2} g_{ik}$, etc.) we get [13]

$$A[\Omega, \bar g_{ik}] = -\frac{3}{8\pi G}\int \left(\Omega_i\Omega^i - \tfrac{1}{6}\bar R\,\Omega^2\right)\sqrt{-\bar g}\; d^4x + A(\Omega; \text{matter}). \tag{5}$$
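For readers who want to see the intermediate step between (3) and (5), the following is a hedged sketch rather than a reproduction of the text's own equation (4). It assumes the standard four-dimensional conformal identities, written in the curvature sign convention that makes (3) lead to (5); in other conventions the sign of the $\bar\Box\,\Omega$ term is reversed. Here $\bar\Box$ denotes the d'Alembertian of $\bar g_{ik}$ and boundary terms are dropped.

```latex
% Assumed identities for g_{ik} = \Omega^2 \bar g_{ik} in four dimensions
% (sign convention chosen so that (3) reproduces (5)):
\sqrt{-g} = \Omega^{4}\sqrt{-\bar g}, \qquad
R = \Omega^{-2}\bar R + 6\,\Omega^{-3}\,\bar\Box\,\Omega .

% Hence the gravitational Lagrangian density becomes
R\sqrt{-g} = \left(\Omega^{2}\bar R + 6\,\Omega\,\bar\Box\,\Omega\right)\sqrt{-\bar g}.

% Integrating the second term by parts (boundary term dropped):
\int \Omega\,\bar\Box\,\Omega\,\sqrt{-\bar g}\;d^4x
  = -\int \Omega_i\Omega^i\,\sqrt{-\bar g}\;d^4x .

% Substituting into (3):
\frac{1}{16\pi G}\int R\sqrt{-g}\;d^4x
  = -\frac{3}{8\pi G}\int\left(\Omega_i\Omega^i - \tfrac{1}{6}\bar R\,\Omega^{2}\right)
    \sqrt{-\bar g}\;d^4x .
```

The overall minus sign in front of $\Omega_i\Omega^i$ is the 'wrong sign' kinetic term discussed later in this section.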

Treating $\Omega(x^i)$ as a quantum field described by the above action, in a given
background spacetime $\bar g_{ik}$, we are led to the following transition amplitude

$$K(\Omega_2(x), t_2; \Omega_1(x), t_1) = \int \mathcal{D}\Omega(x^i)\, \exp iA[\Omega, \bar g_{ik}]. \tag{6}$$

Here $A[\Omega, \bar g_{ik}]$ is given by (5) and the 'sum over paths' is over all $\Omega(x, t)$ which
satisfy the boundary conditions

$$\Omega(x, t_1) = \Omega_1(x), \qquad \Omega(x, t_2) = \Omega_2(x). \tag{7}$$

The kernel $K$ in (6) is interpreted as the probability amplitude for the metric to
make the transition from $\Omega_1^2(x)\,\bar g_{ik}$ at $t_1$ to $\Omega_2^2(x)\,\bar g_{ik}$ at $t_2$.
One major difference between the kernels in (2) and (6) is the appearance of the
'time labels' $t_1$ and $t_2$ in (6). Why are they absent in (2) but present in (6)?
Let us look at Equation (2) first. Formally speaking, the time variable does not
appear in (2) because of the dogma 'three-geometry carries the information about
the time'. This statement is perfectly correct in classical physics. Using the ADM
separation of Einstein's action, it is straightforward to understand how this
notion arises [see, e.g., ref. 14]. This concept is less well-defined in quantum
gravity, and it is not easy to interpret the 'timeless transition' of (2) in
unambiguous terms. One way of attacking this problem would be to include
a 'clock field' $C(x, t)$ in the matter action. Then one may ask the question: 'Given
that the three-geometry was $G_1$ when the clock field was $C_1$, what is the
probability amplitude for the three-geometry to be $G_2$ when the clock field is $C_2$?'
If a field model for the clock, which is (i) sufficiently semiclassical as to be not
affected by quantum fluctuations, (ii) regular and monotonic, and (iii) sturdy enough to
withstand arbitrarily large tidal gravitational forces, can be constructed, then the
above procedure will agree with our intuitive concept of time. However, it is not
at all clear whether one can, indeed, construct such a model within conventional
physics. If it cannot be, then the operational notion of time has to be drastically
modified in quantum gravity [15]. On the other hand, if such a model can be
constructed, then it is completely equivalent to the original time label! Instead of
$K(G_2, C_2; G_1, C_1)$, we may as well talk about $K(G_2, t_2; G_1, t_1)$, assuming an
infinitely rigid structure for the clock field $C(x^i)$.


In Equation (6), the situation is different. The variable $\Omega(x^i)$ cannot specify the
three-geometry completely and, hence, there is no question of 'time' being already
specified. As long as $\Omega(x^i)$ is treated as a quantum field in a given background $\bar g_{ik}$,
the time label in (6) is necessary and, of course, consistent. For operational
purposes, one may or may not introduce a 'clock' in (5); this exercise has no deep
significance within the present minisuperspace formalism.
Coming back to the formalism in (5) and (6), we still have to obtain a wavefunction(al) from the kernel. For a moment, assume that $\Omega(x, t)$ is independent of
$x$. Then, the kernel in (6) propagates the wavefunction $\psi(\Omega, t)$ by the usual
relation,

$$\psi(\Omega_2, t_2) = \int K(\Omega_2, t_2; \Omega_1, t_1)\,\psi(\Omega_1, t_1)\, d\Omega_1. \tag{8}$$

In other words, given the initial wavefunction $\psi[\Omega_1, t_1]$, the kernel determines
the wavefunction $\psi[\Omega_2, t_2]$ for any other time $t_2$. The wavefunction $\psi[\Omega, t]$ will
satisfy the Schrodinger-like equation

$$i\frac{\partial\psi}{\partial t} = H\psi, \tag{9}$$

where $H$ is the Hamiltonian corresponding to the action in (5). If $\Omega$ depends on
$x$ as well as time $t$, then the Schrodinger equation will become a functional
differential equation and the ordinary integration in (8) has to be replaced by
functional integration. These aspects are discussed in Appendix 1.
There is one aspect of conformal quantization which has led to some confusion
in the literature [16]; it is probably worthwhile commenting on this aspect.
In the metric signature (+ - - -) used in our discussion, an ordinary scalar
field $\varphi$ should have the 'kinetic energy term',

$$+\tfrac{1}{2}\varphi_i\varphi^i, \tag{10}$$

while the conformal factor appears in (5) with the 'wrong sign' $(-\tfrac{1}{2}\Omega_i\Omega^i)$. What
do we make of this?
There are situations in field theory in which an unphysical (or gauge) degree of
freedom will appear with the 'wrong sign' in the kinetic energy. One may be
tempted to draw an analogy and declare $\Omega(x, t)$ as an unphysical/gauge or 'ghost'
mode. Such a conclusion would be wrong. A gauge degree of freedom is
something which can be transformed away without changing physical variables.
Physical variables in gravity (e.g. scalars made out of the curvature tensor) will
change under a conformal transformation. As we said before, FRW spacetimes are
all conformally flat. If the conformal mode were unphysical, then FRW spacetimes
would be physically the same as flat space; nothing could be farther from the
truth. It is simply incorrect to consider $\Omega$ as unphysical.
What is wrong with a negative kinetic energy term, anyway? Let us take
a closer look at a mode $q(t)$ with the following Lagrangian,

$$L = -\tfrac{1}{2}\dot q^2 - V(q). \tag{11}$$

To begin with, note that nothing can be said about (11) until the form of $V(q)$ is
specified. For example, if $V(q)$ is $(-\tfrac{1}{2}\omega^2 q^2)$, then (11) is identical to the
Lagrangian of a simple harmonic oscillator except for an unimportant overall
minus sign. On the other hand, suppose $V(q) = +\tfrac{1}{2}\omega^2 q^2$. Then (11) corresponds
to the Hamiltonian

$$H = -\tfrac{1}{2}p^2 + \tfrac{1}{2}\omega^2 q^2, \tag{12}$$

which is unbounded from above and below. In other words, the Hamiltonian in
(12) will not have a stable ground state. All solutions will be of 'run-away' type.
Such a situation is taboo, because normal physical systems - crystals, solids,
earth, stars, galaxies, etc. - are seen to be stable, and do not exhibit 'run-away'
behaviour. They must possess stable ground states.
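The two cases can be compared numerically. The sketch below is an illustration under stated assumptions: for the Lagrangian (11) the Euler-Lagrange equation is $\ddot q = V'(q)$, so $V = \mp\tfrac{1}{2}\omega^2 q^2$ gives $\ddot q = \mp\omega^2 q$. The integrator (velocity Verlet), the initial data and the step size are arbitrary choices made for this example.

```python
import math


def evolve(sign, omega=1.0, q0=1.0, v0=0.0, t_max=5.0, steps=5000):
    """Integrate q'' = sign * omega**2 * q with velocity Verlet.

    sign = -1 corresponds to V = -omega**2 q**2 / 2 in (11) (oscillator),
    sign = +1 corresponds to V = +omega**2 q**2 / 2 in (11) (run-away).
    Returns the final q and the largest |q| reached.
    """
    dt = t_max / steps
    q, v = q0, v0
    acc = sign * omega**2 * q
    max_abs_q = abs(q)
    for _ in range(steps):
        v_half = v + 0.5 * dt * acc
        q = q + dt * v_half
        acc = sign * omega**2 * q
        v = v_half + 0.5 * dt * acc
        max_abs_q = max(max_abs_q, abs(q))
    return q, max_abs_q


q_osc, max_osc = evolve(sign=-1)
q_run, max_run = evolve(sign=+1)

print("oscillator: final q =", q_osc, " max |q| =", max_osc)
print("run-away:   final q =", q_run, " max |q| =", max_run)
print("reference values: cos(5) =", math.cos(5.0), " cosh(5) =", math.cosh(5.0))
```

With these initial data the first case stays bounded and follows $\cos t$, while the second grows like $\cosh t$.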
On the other hand, there is no assurance whatsoever that the physical universe
has a stable ground state! All evidence, in fact, shows that the universe is probably
an unstable mode of a quantum fluctuation. Thus, one will be predeciding the
issue if the minus sign in (24) is not taken into account in quantum cosmology.
Even though a stable ground state may not exist for the Hamiltonian in (12),
there is no difficulty in obtaining a path-integral kernel. Once the kernel is
obtained, any initial wavefunction can be propagated in time by the kernel. Thus,
quantum theory can be formulated using path integrals, even for Hamiltonians
like (12) [17].
The only discernible problem with (11) is that one cannot define a 'Euclidean
version' of the theory with positive definite action. That is too bad for any true
believer in 'Euclideanization', but it does not pose any insurmountable problems
in real time. Notice that an ad-hoc prescription [18], changing $q$ to $(iq)$ and $t$ to $(-i\tau)$,
will give a positive definite Euclidean action. However, there is no assurance
whatsoever that such a model describes the same physics. For example, if

$$V(q) = q^2 + q^4, \tag{13}$$

then the Euclidean version will have (with $q \to iq$)

$$V(q) = -q^2 + q^4. \tag{14}$$

The relative change of sign between the $q^2$ and $q^4$ terms alters the physics drastically.
Such prescriptions must be treated with caution.
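The effect of the substitution $q \to iq$ on the example potential can be verified directly with complex arithmetic. This is a minimal sketch; the function names are hypothetical and chosen only for this illustration.

```python
def potential(q):
    """The example potential of (13): V(q) = q^2 + q^4."""
    return q**2 + q**4


def rotated_potential(q):
    """Evaluate the same polynomial at i*q, as in the ad-hoc prescription.

    The result is real for real q, because only even powers appear.
    """
    return potential(1j * q).real


for q in (0.5, 1.0, 2.0):
    print(q, potential(q), rotated_potential(q))
```

The original potential has a single minimum at $q = 0$, whereas the rotated one is negative for small $|q|$ and has two degenerate minima away from the origin, which is the drastic change referred to above.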
2.2. Wheeler-De Witt Equation and the Conformal Factor

The basic formalism dealing with conformal minispace has been described in the
previous section. As we shall see, this formalism differs from other conventional
models in some crucial ways. We shall illustrate these points of departure by
using a simple example: that of a massless scalar field $\varphi(t)$ in a $k = +1$ FRW
universe. Such a system is described by the metric,

$$ds^2 = \Omega^2(t)\left[dt^2 - \frac{dr^2}{1 - r^2} - r^2(d\theta^2 + \sin^2\theta\, d\phi^2)\right] \tag{15}$$

(16)

and the action

which can be written as


(18)

(19)

Here, the over-dot denotes differentiation w.r.t. the coordinate $t$ and the prime
denotes differentiation w.r.t. $\tau$. The variable $\eta(t)$ is related to $\varphi(t)$ by

$$\eta = (4\pi G/3)^{1/2}\,\varphi. \tag{20}$$

The action in (18) or (19) describes a dynamical system with two degrees of
freedom $\eta(t)$ and $\Omega(t)$. The crucial feature to note is that $A$ in (18) contains the full
classical dynamics of the $(\Omega, \eta)$ system. Varying $\Omega(\tau)$ and $\eta(\tau)$ in (19), say, we
obtain the following equations

$$\frac{2\Omega''}{\Omega} + \frac{(\Omega')^2 + 1}{\Omega^2} = -3(\eta')^2, \tag{21}$$

$$\frac{1}{\Omega^3}\frac{d}{d\tau}\left(\Omega^3\eta'\right) = 0. \tag{22}$$

These are identical to the 'space-space' part of Einstein's equation and the field
equation for the scalar field. Defining a function

$$h(\tau) = \Omega\,\Omega'^2 - \Omega^3\eta'^2 + \Omega = \Omega^3\left[\frac{\Omega'^2 + 1}{\Omega^2} - \eta'^2\right] \tag{23}$$

and using (21) and (22), we can get

$$h' = \frac{dh}{d\tau} = 0 \;\Rightarrow\; h = \text{constant}. \tag{24}$$
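The conservation law (24) can be checked numerically. The sketch below is an illustration, not part of the original treatment: it rewrites (21) and (22) as a first-order system in $(\Omega, \Omega', \eta')$, integrates it with a hand-written fourth-order Runge-Kutta step, and evaluates $h$ from (23) before and after. The initial data and step size are arbitrary assumptions.

```python
def derivatives(state):
    """Right-hand side of (21) and (22) as a first-order system.

    state = (omega, omega_prime, eta_prime); primes denote d/dtau.
    (21) gives omega'' = -(omega'^2 + 1 + 3 omega^2 eta'^2) / (2 omega),
    (22) gives eta''   = -3 omega' eta' / omega.
    """
    omega, d_omega, d_eta = state
    dd_omega = -(d_omega**2 + 1.0 + 3.0 * omega**2 * d_eta**2) / (2.0 * omega)
    dd_eta = -3.0 * d_omega * d_eta / omega
    return (d_omega, dd_omega, dd_eta)


def rk4_step(state, dtau):
    """Advance the state by one classical Runge-Kutta step of size dtau."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, 0.5 * dtau))
    k3 = derivatives(shifted(state, k2, 0.5 * dtau))
    k4 = derivatives(shifted(state, k3, dtau))
    return tuple(
        s + dtau / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def h_function(state):
    """The function h of (23)."""
    omega, d_omega, d_eta = state
    return omega * d_omega**2 - omega**3 * d_eta**2 + omega


# Arbitrary illustrative initial data: omega = 1, omega' = 0.3, eta' = 0.2.
state = (1.0, 0.3, 0.2)
h_start = h_function(state)
for _ in range(500):
    state = rk4_step(state, 1.0e-3)
h_end = h_function(state)

print("h at start:", h_start)
print("h at end:  ", h_end)
```

The initial data chosen here have $h = 1.05$, and the evolution keeps that value; nothing in (21) and (22) drives $h$ towards any particular number, which is the point made in the text that follows.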

There is no way the dynamics can determine the value of $h(0)$. Dynamics merely
states that $h(\tau) = h(0)$.
In this crucial aspect, the system described by (18) or (19) is more general than
the full gravity-scalar system of (17). Had we worked with the full set of Einstein's
equations, we would have obtained, in addition to (21) and (22), the constraint
equation

$$\frac{\Omega'^2 + 1}{\Omega^2} - \eta'^2 = 0. \tag{25}$$

Equation (25) and either one of the pair (21) or (22) will imply the other equation.
However, as we have said before, (21) and (22) lead only to (24), of which (25) is
a special case.
One can obtain (25) by including one more degree of freedom. Consider, for
example, the metric
(26)

for which $A$ becomes

$$A = -\frac{m}{2}\int d\tau\left(\Omega\,\Omega'^2 N^{-1} - \Omega N - N^{-1}\Omega^3\eta'^2\right). \tag{27}$$

Varying $\Omega$, $\eta$ and $N$ will now give (21), (22), and (25). The constraint $(h = 0)$ is
merely a statement about the 'reparametrization invariance' of the theory under
the coordinate changes,

$$\tau \to \tau' = \tau'(\tau). \tag{28}$$

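Let us now pass on to the quantum version of the model, starting with (18). The
'Schrodinger equation' corresponding to the dynamical system in (18) is

$$i\frac{\partial\psi}{\partial t}[\Omega, \eta, t] = \frac{1}{2m}\left(\frac{\partial^2}{\partial\Omega^2} - \frac{1}{\Omega^2}\frac{\partial^2}{\partial\eta^2}\right)\psi - \frac{m}{2}\Omega^2\psi, \tag{29}$$

where $m = 3\pi/2G$.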
The version of the Schrodinger equation (29) is by no means unique. An action
of the form (19) will, in general, pose factor-ordering problems. By switching back
to (18) and assuming $\Omega$ to be in the range $[-\infty, +\infty]$, we have pushed the
factor-ordering problem under the rug. [A different version would be obtained,
for example, if we take $\Omega$ to be in the range $(0, \infty)$.] This problem, however, is not
very serious, especially because it is not specifically related to quantum gravity.
We separate out the $t$ and $\eta$ dependence by choosing the stationary state
ansatz

$$\psi_{Ep}(t, \eta, \Omega) = e^{+iEt}\, e^{ip\eta}\, F_{Ep}(\Omega) \tag{30}$$

with $F$ satisfying the equation

$$-\frac{1}{2m}\frac{d^2F}{d\Omega^2} - \frac{p^2}{2m\Omega^2}F + \frac{m\Omega^2}{2}F = EF. \tag{31}$$

We can write the most general solution to (29) as

$$\psi(t, \eta, \Omega) = \sum_{E,p} C(E, p)\, F_{Ep}(\Omega)\, e^{iEt}\, e^{ip\eta}. \tag{32}$$

The coefficients $C(E, p)$ have to be determined by boundary conditions at, say,
$t = 0$.
All these can be equivalently described by the path integral kernel

$$K(\Omega_2, \eta_2, t_2; \Omega_1, \eta_1, t_1) = \int \mathcal{D}\Omega \int \mathcal{D}\eta\; \exp iA[\Omega, \eta], \tag{33}$$

with the propagation equation

$$\psi[\Omega_2, \eta_2, t_2] = \int d\Omega_1\, d\eta_1\; K(2, 1)\,\psi[\Omega_1, \eta_1, t_1]. \tag{34}$$

The inter-relationship between $K(2, 1)$ and the stationary states is given by the
Feynman-Kac formula
(35)
Equation (29) or Equations (33) and (34) provide the quantum description of the
dynamical system in (18).
The quantum theory described so far is based on the action in (18) and, hence, is
more general than the gravity-scalar system based on the original action (17). Can
one introduce (25) into the quantum theory?
Noticing that $h$ is proportional to the classical Hamiltonian, the extra
constraint (25) has the operator equivalent in quantum theory

$$H\psi = \frac{1}{2m}\left(\frac{\partial^2\psi}{\partial\Omega^2} - \frac{1}{\Omega^2}\frac{\partial^2\psi}{\partial\eta^2}\right) - \frac{m}{2}\Omega^2\psi = 0. \tag{36}$$

Or, more simply, we can put $E = 0$ in (31), (32) etc. Thus, the constraint equation
chooses the 'zero energy' wavefunction among the stationary states of (29). Once
$E$ is set to zero, $\psi$ is independent of time and is trivially invariant under (28).
The zero-energy wavefunction can also be interpreted in a straightforward
manner in terms of the solutions of (29). We know that,

$$\int_{-\infty}^{+\infty} dt\; \psi[\Omega, \eta, t] = \int_{-\infty}^{+\infty} dt \int dE\; e^{iEt}\,\psi_E(\Omega, \eta) \propto \psi_{(E=0)}(\Omega, \eta). \tag{37}$$

Thus, a particular solution of (36) can be obtained from the solution to (29) by
integrating over the time coordinate.
Equation (36) is known as the Wheeler-De Witt [19] equation (WD) [see
Appendix 2]. We shall call (29) the 'Generalized Wheeler-De Witt' equation
(GWD). All conventional approaches to quantum cosmology demand that the
wavefunction satisfies the WD equation. We shall now compare GWD and WD
and argue that there are situations in which GWD is more appropriate than WD.

To begin with, notice that GWD is the result of quantizing the two degrees of
freedom $\Omega$ and $\eta$. If we knew nothing about Einstein's theory and were given the
action (18) or (19) as the starting point, we would have only reached GWD. The
WD is obtained when we work with three degrees of freedom $\Omega$, $\eta$ and $N$. Thus, in
a 'strict minisuperspace' of $(\Omega, \eta)$, the GWD is the appropriate equation.
As an immediate consequence of GWD, one loses the invariance of the theory
under time relabelling. This seems to make some physicists panic, even though
there are many reasons for sacrificing manifest time invariance in quantum
gravity:
(i) The complete theory of quantum gravity, probably, should be invariant
under the relabelling of coordinates. But one is certainly entitled to violate this
symmetry when only selected degrees of freedom are quantized. In the standard
minisuperspace models, a special choice of spatial coordinates is already made.
The description is specifically tuned to this coordinate system. In proceeding
from WD to GWD, we are accepting a special choice of time coordinate as well.
(ii) At some stage, one has to interpret the wavefunction $\psi$ in terms of
measurements and observables. A time label is both necessary and natural in such
a description. Measurement of any physical variable by a 'participatory
observer' [20] introduces a time label if it is not already present; once again,
$\psi[\Omega, \eta, t]$ is better suited here than $\psi[\Omega, \eta]$.
(iii) In spite of many attempts, the physics of the WD solution $\psi[\Omega, \eta]$ still
remains obscure [21], with many peculiarities: (a) instead of 'evolution', one is forced
to talk about 'correlations', (b) no concept of time emerges naturally from WD,
and (c) no probability interpretation is possible for $\psi[\Omega, \eta]$. By retaining
$\psi[\Omega, \eta, t]$, we can bypass these problems.
(iv) It is well known that even ordinary field quantization has an intrinsic
coordinate dependence, especially on the choice of time coordinate [22]. Since the
observer's role is more crucial in quantum cosmology than in field theory, an
intrinsic dependence on a particular time coordinate should not a priori be ruled
out.
Lastly, it must be remembered that GWD is more general than WD. One can
always recover the WD wavefunction by integrating over time, as suggested in
(37).
Does all this mean that one completely ignores the constraint equation (25)?
Certainly not; that would be disastrous, because one would not obtain the correct
classical limit. We shall incorporate (25) into the classical limit, by choosing the
wavefunction to be peaked at the classical evolution. That is, (25) will be satisfied
in terms of the expectation values
(38)
Notice that (38) is in terms of $\langle\Omega\rangle'$, etc., and not in terms of $\langle\Omega'\rangle$. We shall say
more about it in later sections.

3. Quantized FRW Universes

3.1. Conformal Factor and the Singularity

In this and the next two sections, we shall study the effect of quantizing the
conformal mode in a standard Friedmann-Robertson-Walker universe filled
with dust or radiation. Since most of the details are available in published
literature [23], we shall only stress the highlights.
We begin by asking the question: What do we expect from quantum
cosmology? In other words, when should one consider a quantum cosmological
model as successful?
To merit attention as a working model, a theory should satisfy the following
two criteria:
(i) The model should be free of singularities [and, if possible, should be free of
horizons]. Singularities are simply unacceptable in physics; the relativist is living
with the singularities of classical gravity only because of the hope that quantum
gravity will remove them one day. If quantum cosmology does not remove the
singularity, then we will probably be forced to modify the classical gravity itself.
(ii) The model should describe the classical limit of the theory correctly, in
terms of the expectation values of physical observables. In a sense, this criterion
operates as a boundary condition on the choice of the wavefunction of the
universe, and incorporates the constraint equation.
These criteria suggest the comparison between the physical universe and the
hydrogen atom [24], shown in Table I. The physical reason for the existence of
a ground state in a hydrogen atom has to do with the quantum fluctuations. If the
electron is confined to within a distance $r$ from the nucleus, then quantum
fluctuations provide an energy

$$E_{\rm fluc} \approx \frac{1}{2m}\left(\frac{\hbar}{r}\right)^2 = \frac{\hbar^2}{2m}\,\frac{1}{r^2} \tag{39}$$

in addition to the coulombic energy

$$E_{\rm coul} = -q^2/r. \tag{40}$$

The minimum of $(E_{\rm fluc} + E_{\rm coul})$ is at a nonzero radius $r \approx r_{\rm Bohr}$; if the fluctuations
were not present, then the minimum of $E_{\rm coul}$ would be at $r = 0$.

Table I. The analogy between the hydrogen atom and the universe [details are developed in
Ref. 23]

                              Hydrogen atom                      Universe
(i)   Constituents            Electron + proton; Coulomb force   Matter; gravity
(ii)  Classical description   q(t); 'trajectory'                 Q(t); 'expansion factor'
(iii) Source of trouble       q(t) -> 0 in finite time;          Q(t) -> 0 in finite time;
                              singular                           singular
(iv)  Quantum description     psi[q, t]; wavefunction            psi[Q, t]; wavefunction
(v)   Main new feature        Quantum fluctuations lead to       Quantum fluctuations lead to
                              nonsingular behaviour              nonsingular behaviour
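The balance between the two energies can be made concrete with a short numerical minimization. This is a sketch under stated assumptions: units with $\hbar = m = q = 1$ are used, so the analytic minimum of $\hbar^2/(2mr^2) - q^2/r$ sits at $r = \hbar^2/(mq^2) = 1$ with energy $-1/2$; the golden-section search and its bracketing interval are illustrative choices.

```python
import math


def total_energy(r, hbar=1.0, m=1.0, q=1.0):
    """Fluctuation energy (39) plus coulombic energy (40) at radius r."""
    return hbar**2 / (2.0 * m * r**2) - q**2 / r


def golden_section_minimize(f, a, b, tol=1e-10):
    """Locate the minimum of a unimodal function f on the interval [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


r_min = golden_section_minimize(total_energy, 0.1, 10.0)
e_min = total_energy(r_min)

print("radius at minimum:", r_min)
print("energy at minimum:", e_min)
```

Dropping the first term of `total_energy` leaves a function that decreases without bound as $r \to 0$, which is the classical collapse that the fluctuation term prevents.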
Similarly, we may expect the quantum gravitational fluctuations to play a role
in stopping the collapse of a classical universe. To quantify this idea, consider the
FRW model

$$ds^2 = \bar g_{ik}\, dx^i dx^k = dt^2 - Q^2(t)\left\{\frac{dr^2}{1 - kr^2} + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)\right\}, \tag{41}$$

where $Q(t)$ denotes a given classical solution to Einstein's equation (with, say,
radiation as source). How do quantum conformal fluctuations (QCF) evolve in
this spacetime? To study this, consider the class of all metrics of the form
(42)

and treat $\Omega$ (or $\phi$) as a quantum variable. In the present epoch the universe is
classical, and $\Omega \approx 1$ to a large accuracy; therefore we may take the present
wavefunction $\psi[\Omega, t_{\rm now}]$ to be a sharply peaked Gaussian around $\Omega = 1$
(43)

where $\sigma_0$ denotes the present uncertainty, $\sigma_0 \ll 1$. The wavefunction at any other
time $t$ can now be found by using (8) and (5). Omitting the details [25], we merely
state the result: The wavefunction has the same form as (43),

$$\psi[\Omega, t] = [2\pi\sigma^2(t)]^{-1/4} \exp\left\{-\frac{(\Omega - 1)^2}{4\sigma^2(t)}\right\}, \tag{44}$$

where $\sigma(t)$ - the spread at time $t$ - has the asymptotic behaviour

$$\sigma(t) \approx [A/Q(t)], \quad \text{as } Q(t) \to 0. \tag{45}$$

In other words, quantum fluctuations $\sigma(t)$ around the classical mean value
diverge as the singularity $(Q = 0)$ is approached. This allows one to draw three
important conclusions:
(i) The average value $\langle\Omega\rangle$ $(= 1)$ has no meaning when the mean-square
deviation $\langle(\Omega - 1)^2\rangle$ is divergent. Thus, the classical solution is drowned in a sea
of quantum fluctuations as the singularity is approached.
(ii) The analogy with the hydrogen atom is strengthened. Quantum fluctuations
dominate the behaviour near the singularity.
(iii) The expectation value of $g_{ik}$ in the quantum state (44) is nonsingular! We
see that, if we define,

0 -

(46)

then

$$\lim_{t \to 0} Q^2_{\rm average} = \lim_{t \to 0}\,(1 + \sigma^2(t))\,Q^2(t) = \lim_{t \to 0} \frac{A^2}{Q^2(t)}\, Q^2(t) = A^2 \neq 0. \tag{47}$$

Thus, the universe does avoid the singularity in the sense of the expectation value
of $g_{ik}$.
The above conclusions are actually of more general validity than indicated by
the present discussion. Indeed, it can be shown that QCF diverge at a singularity
in any spacetime [see ref. 26].

3.2. Stationary States of the Quantum Universe

Encouraged by the success of the previous section we shall proceed further to
develop a model of quantized FRW universes. Ultimately, the dynamics of both
$\Omega(x, t)$ and $\bar g_{ik}$ must be describable in terms of quantum (or at least, semiclassical)
cosmology. This question will be addressed in the next section. As a preliminary,
we shall take a closer look at the $k = +1$ FRW model, filled with radiation. We
first rewrite (41) as

$$ds^2 = Q^2(t)\left[dt^2 - \frac{dr^2}{1 - r^2} - r^2(d\theta^2 + \sin^2\theta\, d\phi^2)\right]. \tag{48}$$

[The time coordinates $t$ in (41) and (48) are, of course, different.] In considering the
class of spacetimes conformal to (48), we have to take into account all functions of
the form $\Omega^2(t)\,Q^2(t)$. Clearly, we can treat $\Omega Q$ as a single quantum variable, denoted
below simply by $\Omega$, and consider the class of all spacetimes of the form

$$ds^2 = \Omega^2(t)\left[dt^2 - \frac{dr^2}{1 - r^2} - r^2(d\theta^2 + \sin^2\theta\, d\phi^2)\right]. \tag{49}$$

The action governing the dynamics of $\Omega(t)$ has the form [see (18) with $\eta = 0$]

$$A = -\frac{1}{2}m\int dt\,(\dot\Omega^2 - \Omega^2), \qquad m = \frac{3\pi}{2G}. \tag{50}$$

[We have assumed that the matter part of the action is conformally invariant and is
independent of $\Omega$.] The classical solution to the equations, $\delta A = 0$, is

$$\Omega(t) = \Omega_0 \sin t, \tag{51}$$

which is singular at $t = 0$ and $t = \pi$. If the radiation density is given by

$$\rho = \frac{B}{\Omega^4}, \tag{52}$$

then $\Omega_0$ is related to $B$ via the constraint equation

$$\dot\Omega^2 + \Omega^2 = \Omega_0^2 = \frac{8\pi G}{3}B. \tag{53}$$

As explained at length in Section 2.2, the constraint equation is a separate input if
the minisuperspace is described by (50).
The GWD equation, describing the quantum evolution of $\Omega$, can be obtained
by quantizing (50):

$$i\frac{\partial\psi}{\partial t} = \frac{1}{2m}\frac{\partial^2\psi}{\partial\Omega^2} - \frac{1}{2}m\,\Omega^2\psi. \tag{54}$$

The GWD equation has the stationary state solutions
(55)
where $\phi_n(\Omega)$ are the standard harmonic oscillator wavefunctions. A general
solution to (54) can be obtained by the superposition

$$\psi(\Omega, t) = \sum_E C(E)\,\psi_E(\Omega, t). \tag{56}$$

How does one choose $C(E)$? It will be remembered from Section 2.2 that the GWD
equation does not incorporate the constraint equation, while the expectation
value $\langle\Omega\rangle$ must satisfy the constraint equation. In other words, $\langle\Omega\rangle$ is
constrained to be the classical solution

$$\langle\Omega\rangle = \Omega_0 \sin t = \left(\frac{8\pi G B}{3}\right)^{1/2} \sin t. \tag{57}$$

This is clearly a constraint on the choice of $C(E)$. The simplest choice for $\psi(\Omega, t)$,
which incorporates the semiclassical behaviour and (57), is the coherent state
with

$$|\psi[\Omega, t]|^2 = (\pi L_P^2)^{-1/2} \exp\left\{-\frac{1}{L_P^2}(\Omega - \Omega_0 \sin t)^2\right\}, \tag{58}$$

where $L_P^2 = (2G/3\pi)$. With this choice, the expectation value of the line element is

$$ds^2 = \langle\Omega^2\rangle\left[dt^2 - \frac{dr^2}{1 - r^2} - r^2(d\theta^2 + \sin^2\theta\, d\phi^2)\right]. \tag{59}$$

Clearly, quantum fluctuations have led to a nonsingular universe which
bounces at $L_P$ ($\approx 10^{-33}$ cm). A simple calculation will show that the universe is
also free of particle horizons. Besides, (59) also has the correct classical limit.
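The nonsingular behaviour can be made explicit by evaluating $\langle\Omega^2\rangle$ for the Gaussian density (58). The sketch below is an illustration under stated assumptions: the closed form $\Omega_0^2\sin^2 t + L_P^2/2$ is simply the squared mean plus the variance of that Gaussian (it is not quoted from the text), and the numerical values of $\Omega_0$ and $L_P$ are arbitrary, chosen so that $\Omega_0 \gg L_P$.

```python
import math


def mean_omega_squared(t, omega0, lp, n=20001, width=12.0):
    """Integrate Omega^2 against the density (58) with the trapezoidal rule.

    The integration window spans `width` multiples of lp on either side of
    the peak at omega0 * sin(t), which captures essentially all of the weight.
    """
    center = omega0 * math.sin(t)
    lo = center - width * lp
    hi = center + width * lp
    step = (hi - lo) / (n - 1)
    norm = 1.0 / math.sqrt(math.pi * lp**2)
    total = 0.0
    for k in range(n):
        x = lo + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        density = norm * math.exp(-((x - center) / lp) ** 2)
        total += weight * x * x * density * step
    return total


def closed_form(t, omega0, lp):
    """Squared mean plus variance of the Gaussian density (58)."""
    return (omega0 * math.sin(t)) ** 2 + 0.5 * lp**2


omega0, lp = 5.0, 0.1  # illustrative values only
for t in (0.0, 0.5 * math.pi, math.pi):
    print(t, mean_omega_squared(t, omega0, lp), closed_form(t, omega0, lp))
```

At the classical singularities $t = 0$ and $t = \pi$ the conformal factor in (59) does not vanish but takes the value $L_P^2/2$, while for $\Omega_0 \sin t \gg L_P$ it reduces to the classical $\Omega_0^2 \sin^2 t$.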
In this quantization scheme based on GWD, we have thus satisfied both the
criteria suggested earlier. Notice that the constraint equation (53) is satisfied by
$\langle\Omega\rangle$. It is also satisfied by $\langle\Omega^2\rangle^{1/2}$ in the classical limit, when $\langle\Omega^2\rangle^{1/2} \approx \langle\Omega\rangle$. In
other words, we treat the invariance under time relabelling to be a classical
concept, valid only in the classical limit. The full quantum dynamics depends on
the observer (as probably it should) and on his choice of time. Conventional
quantum cosmologists claim [27] that 'time has no meaning in quantum gravity'.
We replace this by a weaker dogma, 'invariance under time relabelling is
a classical concept'.
The coherent state chosen in (58) is, of course, a special choice motivated by the
desire to minimize quantum uncertainty. In general, one should work with the
stationary states given in (55). The geometries in the quantum stationary states
(denoted by 'QSG') are analogous to the stationary energy levels of the hydrogen
atom. The QSGs are independent of the boundary conditions chosen for the
wavefunction $\psi[\Omega, t]$. The line element is nonsingular in all QSGs and is given
by

(60)

A more detailed discussion of QSGs and their applications can be found in ref.
[28].
3.3. Semiclassical Cosmology

So far, we have concentrated on the conformal part of the spacetime metric $\Omega^2\bar g_{ik}$,
and assumed $\bar g_{ik}$ to be 'God-given'. A complete theory of quantum gravity should
include both $\Omega$ and $\bar g_{ik}$ in its fold. Lacking such a description, we ask the 'next best'
question: Is it possible to treat $\Omega$ as a quantum variable and to treat $\bar g_{ik}$ as a
semiclassical object? In other words, can one incorporate the 'back reaction' of
$\Omega$ on $\bar g_{ik}$ in some systematic manner?
Such a procedure can indeed be devised, as we shall indicate below. To
motivate the approach, let us first consider a simple dynamical system of two
degrees of freedom [$q_1(t)$ and $q_2(t)$], described by the Lagrangian

$$L = \tfrac{1}{2}m_1\dot q_1^2 + \tfrac{1}{2}m_2\dot q_2^2 - g\,q_1 q_2. \tag{61}$$

The classical theory is based on the equations of motion

$$m_1\ddot q_1 = -g\,q_2; \qquad m_2\ddot q_2 = -g\,q_1, \tag{62}$$

while a full quantum theory will be based on the Schrodinger equation

$$i\frac{\partial\psi}{\partial t}(q_1, q_2, t) = -\frac{1}{2m_1}\frac{\partial^2\psi}{\partial q_1^2} - \frac{1}{2m_2}\frac{\partial^2\psi}{\partial q_2^2} + g\,q_1 q_2\,\psi. \tag{63}$$

As an intermediate step, we may be interested in the approximation of $q_1$ being
classical and $q_2$ being quantum mechanical. Given a particular trajectory $q_1(t)$, $q_2$
can be quantized in this 'given background' via the Schrodinger equation

$$i\frac{\partial\psi}{\partial t}(q_2, t; q_1(t)) = -\frac{1}{2m_2}\frac{\partial^2\psi}{\partial q_2^2} + g\,q_1(t)\,q_2\,\psi. \tag{64}$$

Now we have to determine $q_1(t)$ self-consistently. One possible approximation is to
replace (62) by

$$m_1\ddot q_1 = -g\langle q_2\rangle = -g\int_{-\infty}^{+\infty} |\psi|^2\, q_2\, dq_2. \tag{65}$$

Equations (64) and (65) are simultaneous integro-differential equations for
$\psi(q_2, t)$ and $q_1(t)$. The expectation value 'drives' the semiclassical $q_1(t)$ and
$\psi(q_2, t)$ describes the quantum dynamics of $q_2$.
Similar considerations apply to the action in (5) describing the geometry in
terms of $\Omega$ and $\bar g_{ik}$. One can write down semiclassical equations for $\bar g_{ik}$ in which
expectation values of functions of $\Omega$ serve as the source [similar to (64)]. Since
$\Omega(x, t)$, in general, depends on space as well as time, these equations are
mathematically more complicated. The Schrodinger equation is a functional
differential equation
(66)
with

~ _

:If - -

[gOO
J2
~

- (-aP
b 2 - goo
on 0)
b0
g Va V p
2

.1.

-02

6R

(67)

The 'semiclassical' equation for $\bar g_{ik}$ is

$$\langle\Omega^2\rangle\left[\bar R_{ik} - \tfrac{1}{2}\bar g_{ik}\bar R\right] = -8\pi G\, T_{ik} + t_{ik}, \tag{68}$$

where
(69)

Obtaining solutions to these equations is, clearly, a nontrivial task. It is fortunate,
however, that they admit exact, self-consistent, FRW type solutions. For
example, there exists a $k = 0$ FRW model with the line element
<ds 1 ) = L;(2n

1)(1 + ~~){dt2 -

drl - r2(d{:/2

+ sin 1 8dq>2)}.

(70)

The quantum state of $\Omega$ corresponding to (70) is described by the wavefunction
(71)
in which $\phi_n(\Omega)$ are the standard harmonic oscillator wavefunctions. Thus,
Equations (70) and (71) self-consistently describe the background metric $\bar g_{ik}$ and
the quantum state $\psi[\Omega, t]$. Models for $k \neq 0$, and a more detailed discussion of the
above features, can be found in ref. [29]. All these solutions are (i) nonsingular,
(ii) free of horizons, and (iii) have the correct classical limit. These results suggest
that quantization of the conformal mode is a viable approach to quantum
cosmology - at least, as viable as any other approach!

4. Applications of Quantum Gravity

4.1. Cosmogenesis and Inflation

A successful quantum cosmological model, as we said before, must possess
nonsingular solutions with correct classical limits. However, a good formalism
for quantum gravity may help us to understand some other - deeper - questions
of physics. The present section and the next explore these possibilities.
One central question in cosmology is the following: Why is the universe there
at all? One possible mathematical translation of this abstract question is the
following: Einstein's equations admit (i) flat empty spacetime and (ii) a FRW
universe with initial singularity. Why does nature prefer (ii) over (i)?
Within the formulation of quantum gravity presented here, the above question
has a simple answer: The equations of semiclassical cosmology do not admit
flat, empty spacetime as a solution. [This can be seen from (68) and (69) and is
discussed in detail in ref. [30].] Stated in another way, flat spacetime is unstable
to quantum gravitational fluctuations. This instability leads to the 'creation of the
universe'.
We shall briefly describe two such models for cosmogenesis. [The details can be
found in refs. [31,32].] Consider a system described by the action

$$A = \frac{1}{16\pi G}\int R\sqrt{-g}\; d^4x + \frac{1}{2}\int\left(\Phi_i\Phi^i - \tfrac{1}{6}R\,\Phi^2 - m^2\Phi^2\right)\sqrt{-g}\; d^4x. \tag{72}$$

The first term is the gravitational action and the second represents the action for
a scalar field $\Phi(r, t)$ of mass $m$ with an additional coupling $\tfrac{1}{6}R\,\Phi^2$. It is easier to
proceed by writing the Robertson-Walker spacetimes in a conformally flat form
(73)

Thus, we are left with two field degrees of freedom $\Omega(r, t)$ and $\Phi(r, t)$. Now we
make a conformal transformation to

$$\phi(r, t) = \Omega(r, t)\,\Phi(r, t) \tag{74}$$

and write

$$\psi(r, t) = \left(\frac{3}{4\pi G}\right)^{1/2}\Omega(r, t), \tag{75}$$

so as to reduce (72) to the form

$$A = \frac{1}{2}\int\left(\phi_i\phi^i - \psi_i\psi^i - \alpha_G^2\,\phi^2\psi^2\right) d^4x, \tag{76}$$

where $\alpha_G^2 = (4\pi G m^2/3)$ is the analogue of the 'fine structure constant' in gravity. The
Euler-Lagrange equations $\delta A = 0$ for (76) are

$$\Box\phi + \alpha_G^2\,\psi^2\phi = 0, \tag{77}$$

(78)

We shall assume that $\phi$ is a quantum field and that (77) is the operator equation
for $\phi$ in the Heisenberg picture. On the other hand, $\psi$, which determines the metric
via (75) and (73), will be treated as a c-number. We will interpret (78) as
(79)
The flat (vacuum) spacetime corresponds to the choice

$$\psi = \text{constant} = (m/\alpha_G), \tag{80}$$

$$\langle 0|\phi^2|0\rangle = 0. \tag{81}$$

Clearly, the equations are satisfied by this choice as long as we can ensure (81). We
know that, formally, the left-hand side of (81) diverges because of zero point
energy. However, we shall assume that this divergence is subtracted out in the
quantity $\langle 0|\phi^2|0\rangle$, ensuring (81).
Is this flat, vacuum spacetime stable? To study this, we perturb the spacetime
slightly at $t \geq 0$ by putting

$$\psi(t) = \frac{m}{\alpha_G} + \epsilon(t), \quad \text{for } t > 0. \tag{82}$$

When the field is quantized in this perturbed spacetime (82) via Equation (77), we
will get a nonzero value for $\langle 0|\phi^2|0\rangle$ which will, in general, depend on $\epsilon(t)$.
Substituting back into Equation (79), we will get an equation for the perturbation $\epsilon(t)$.
If this equation has exponentially growing solutions, then we can conclude that
flat vacuum spacetime is unstable to perturbations of the kind in (82). Such an
analysis can be performed by Fourier transforming and making a mode-by-mode analysis. It can be shown that $\epsilon(t)$ satisfies the following equation [see ref.
[31] for details].
(83)
with
(84)

This equation can be solved by Laplace transform techniques. Denoting the
Laplace transform of $\epsilon(t)$ by $\tilde\epsilon(s)$, we get the transformed equation

m2ct~.

..
{
8(S) = 8n 2 8(0)

L(s)

(m2ct~)

..
1- - - L(sl
2
8n

(85)

where
(86)
Though (86) can be evaluated in closed form, we do not need this result. It is clear
from (86) that $L(s)$ is a monotonically decreasing function of $|s|$ and has a
maximum at $s = 0$. The maximum value of $L$ is
(87)
From the theory of Laplace transforms, we know that $\epsilon(t)$ is determined by the
poles of $\tilde\epsilon(s)$ and will grow without bound if $\tilde\epsilon(s)$ has poles for real values of $s$. Real
poles can exist and produce instability if [using (87) and (85)]
$$(m^2\alpha_G^2/8\pi^2)\, L(0) = (\alpha_G^2/48\pi^2) > 1. \tag{88}$$

In other words, flat vacuum is unstable to perturbations of the kind in (82), as long
as $m^2 \gtrsim 36\pi(\hbar c/G)$. This instability will generate a transition from flat to a nonflat
spacetime with the creation of massive scalar particles. Though extremely
simplified, this model indicates the essential idea behind cosmogenesis.
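The threshold quoted above follows from combining $\alpha_G^2 = 4\pi G m^2/3$ with the condition (88). As a rough numerical illustration, the sketch below evaluates the critical mass in SI units. The function names and the rounded values of the constants are choices made here for illustration and are not part of the original analysis.

```python
import math

# Rounded SI values of the constants (assumed here for illustration only).
HBAR = 1.054571817e-34  # J s
C = 2.99792458e8        # m / s
G = 6.67430e-11         # m^3 kg^-1 s^-2


def planck_mass():
    """Planck mass sqrt(hbar c / G) in kilograms."""
    return math.sqrt(HBAR * C / G)


def alpha_g_squared(m):
    """alpha_G^2 = 4 pi G m^2 / 3 (with hbar = c = 1), written as (4 pi / 3)(m / m_P)^2."""
    return (4.0 * math.pi / 3.0) * (m / planck_mass()) ** 2


def is_unstable(m):
    """Condition (88): alpha_G^2 / (48 pi^2) > 1."""
    return alpha_g_squared(m) / (48.0 * math.pi ** 2) > 1.0


def critical_mass():
    """Mass at which (88) becomes an equality: m^2 = 36 pi (hbar c / G)."""
    return 6.0 * math.sqrt(math.pi) * planck_mass()


if __name__ == "__main__":
    print("Planck mass   [kg]:", planck_mass())
    print("critical mass [kg]:", critical_mass())
```

In these units the critical mass is about ten Planck masses, so the instability in this toy model concerns only fields with masses at or above the Planck scale.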
As a second example, we consider a spacetime with a cosmological constant
$\Lambda$ and show how quantum gravity can lead to primordial inflation. We begin with
a classical action
$$A = \frac{1}{12 L_p^2} \int (R - 2\Lambda) \sqrt{-g}\, d^4x; \qquad L_p^2 = (4\pi G/3). \tag{89}$$

This action describes a system with a cosmological constant. At this stage, we
shall not worry about the origin and magnitude of $\Lambda$, except to assume that it is
positive.
The classical solution, corresponding to the principle of stationary action, may
be taken to be the De Sitter Universe, represented by ($H^2 = \tfrac{1}{3}\Lambda$)
$$ds^2 = dt^2 - e^{2Ht} (dx^2 + dy^2 + dz^2) \tag{90}$$
$$= \left(1 - \frac{\Lambda}{3} R^2\right) dT^2 - \frac{dR^2}{1 - (\Lambda/3) R^2} - R^2 (d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{91}$$

For nonzero $\Lambda$, this De Sitter Universe plays the same role as the Minkowski
spacetime for zero $\Lambda$. Thus, we may take the De Sitter spacetime to be the 'ground
state'. Equation (91) shows the static nature of this spacetime, while cosmological
observers use the comoving line element in (90). For this solution, the scalar
curvature is given by
$$R = 4\Lambda. \tag{92}$$

Let us consider the quantum conformal fluctuations about this ground state. We
know that any local minimum will be a classical ground state, stable to small
perturbations. Quantum fluctuations, however, can induce a tunnelling through
a potential barrier and render the local minimum unstable. This is exactly what
happens to the ground state in (90) and (91). The QCFs make the De Sitter
spacetime unstable and give it a finite lifetime $\tau$. The universe tunnels out of the
De Sitter phase after an inflation by a factor $\exp(H\tau)$, which can be large.
The QCFs are governed by the action
$$A = -\frac{1}{2 L_p^2} \int \left( \Omega^i \Omega_i - \frac{1}{6} R\, \Omega^2 + \frac{\Lambda}{3} \Omega^4 \right) \sqrt{-g}\, d^4x. \tag{93}$$

Defining
$$\phi = L_p^{-1}\, \Omega, \tag{94}$$

we get
(95)

with
(96)

This potential is shown in Figure 2. The classical ground state corresponds to the
local minimum at $\phi = 0$. Near $\phi = 0$, the potential may be treated as a harmonic
oscillator potential with $\omega^2 \approx V''(0)$ and $\langle\phi^2\rangle \approx \omega^2$. This ground state is
separated from the regions of $|\phi| \gtrsim \omega$ by a potential barrier. Clearly, quantum
tunnelling through this barrier renders the ground state unstable.
[Figure 2: the potential $V(\phi)$ plotted against $\phi$, with the unstable ground state at the local minimum and the quantum tunnelling of the universe through the barrier marked.]

Fig. 2. Quantum gravitational inflation. The unstable ground state with characteristic dimensions
$L_p$ inflates to the observed universe (see ref. [32]).


The tunnelling probability can be calculated by the usual instanton techniques.
We shall assume that the tunnelling takes place homogeneously over the whole
space of volume $(4\pi/3) H^{-3}$. Then, the tunnelling probability per unit time is
given by [32]
$$P = \frac{4\pi}{3}\, \frac{1}{L_p \lambda^{3/2}} \exp\left( -\frac{32\sqrt{2}\,\pi}{3 \lambda^{3/2}} \right). \tag{97}$$
The reciprocal of $P$ gives the lifetime $\tau$ of the metastable ground state
$$\tau = \frac{3}{4\pi}\, L_p \lambda^{3/2} \exp\left( \frac{32\sqrt{2}\,\pi}{3 \lambda^{3/2}} \right). \tag{98}$$
Therefore, the inflation factor for the Universe is
$$Z = \exp(H\tau) = \exp\left[ \frac{3\lambda^2}{4\pi} \exp\left( \frac{32\sqrt{2}\,\pi}{3 \lambda^{3/2}} \right) \right]. \tag{99}$$
Being an exponential, $Z$ is huge for a wide range of $\lambda$. In fact, the minimum value
of $Z$ is about
$$Z_{\min} \approx 10^{46}, \tag{100}$$
which occurs for $\lambda \approx 11$. This inflation is more than adequate to blow up a
Planck-size bubble ($\sim 10^{-33}$ cm) to about $\sim 10^{13}$ cm. The size of our observed
universe at Planck time is $\sim 10^{-2}$ cm, much smaller than this. In other words,
there is sufficient primordial inflation in this model for all values of $\lambda$.
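The quoted minimum of the inflation factor can be checked numerically from the exponent in (99). The following is a minimal sketch, assuming the exponent has the form $(3\lambda^2/4\pi)\exp(32\sqrt{2}\pi/3\lambda^{3/2})$ with $\lambda$ treated as a dimensionless parameter; the function names and the simple ternary search are illustrative choices and not the author's method.

```python
import math


def h_tau(lam):
    """Exponent H*tau of the inflation factor Z = exp(H*tau), as read from (99)."""
    return (3.0 * lam ** 2 / (4.0 * math.pi)) * math.exp(
        32.0 * math.sqrt(2.0) * math.pi / (3.0 * lam ** 1.5)
    )


def argmin_unimodal(f, lo, hi, steps=200):
    """Ternary search for the minimiser of a function with a single minimum on [lo, hi]."""
    for _ in range(steps):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


LAMBDA_MIN = argmin_unimodal(h_tau, 1.0, 100.0)
LOG10_Z_MIN = h_tau(LAMBDA_MIN) / math.log(10.0)

if __name__ == "__main__":
    print("lambda at minimum:", LAMBDA_MIN)
    print("log10(Z_min):     ", LOG10_Z_MIN)
```

Setting the derivative of the exponent to zero gives $\lambda^{3/2} = 8\sqrt{2}\pi$, that is $\lambda \approx 10.8$, and the corresponding value of $\log_{10} Z$ is close to 46, in agreement with (100).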

4.2. Planck Length as 'Zero-Point Length'

In the absence of gravitational fields, spacetime would be considered flat in the
classical limit. Such a flat spacetime should be more properly treated as the quantum
gravitational vacuum. The omnipresent vacuum fluctuations will now induce
fluctuations in the classical value of the metric tensor,
$$g_{ik} = \eta_{ik} = \mathrm{diag}(1, -1, -1, -1). \tag{101}$$

What are the physical effects of such metric fluctuations?


To begin with, consider the proper length between two events $(t, x)$ and $(t, y)$. In
the absence of metric fluctuations, the proper length is just
$$l_0(x, y) = |x - y|. \tag{102}$$
When the quantum conformal fluctuations of the metric are taken into account,
we have to deal with all the metrics of the form
$$g_{ik} = [1 + \phi(x)]^2\, \eta_{ik}. \tag{103}$$

Since the proper distance between $(t, x)$ and $(t, y)$ depends on the value of the
quantum variable $\phi(x)$, we can no longer assign a unique proper length between
$(t, x)$ and $(t, y)$. Instead, we should ask for the probability $P(l)$ for the proper
length to have a particular value $l$. This probability, in turn, depends on the
probability for a fluctuation of size $\phi(x)$ to occur, and can be computed by
vacuum functional techniques. One obtains the distribution [33]

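$$P[\phi(x)] = N \exp\left\{ -\frac{1}{4\pi^2 L_p^2} \int\!\!\int d^3x\, d^3y\, \frac{\nabla\phi(x) \cdot \nabla\phi(y)}{|x - y|^2} \right\}, \tag{104}$$
using which one can compute $P(l)$. The final answer is (for details see ref.
[33])
$$P(l) = (2\pi\sigma^2)^{-1/2} \exp\left[ -\frac{(l - l_0)^2}{2\sigma^2} \right] \tag{105}$$
with
$$l_0 = |x - y|; \qquad \sigma^2 = \frac{l_0^2}{4\pi^2} \left( \frac{L_p}{L} \right)^2. \tag{106}$$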

In (106), $L$ denotes the resolution limit of the apparatus used to measure the
proper length between $x$ and $y$.
We note that $P(l)$ is a Gaussian peaked at the classical value $l_0$, as expected. As
long as we consider measurements which are 'coarse-grained' over many Planck
lengths (i.e. resolution limit $L \gg L_p$), the quantum spread in the Gaussian is small:
$$(\sigma^2/l_0^2) = \frac{1}{4\pi^2} (L_p/L)^2 \ll 1. \tag{107}$$
In this limit (which is valid even at the highest-energy man-made accelerators), $P(l)$ is
adequately approximated as a delta function $\delta(l - l_0)$ and one may neglect
quantum gravitational effects. However, as the resolution improves (i.e. $L \to L_p$),
$\sigma$ approaches $l_0$, and we lose the concept of a definite length between the events.
The mean value of $l^2$ for the Gaussian distribution is given by
$$\langle l^2 \rangle = l_0^2 + \sigma^2 = l_0^2 \left[ 1 + \frac{1}{4\pi^2} \left( \frac{L_p}{L} \right)^2 \right]. \tag{108}$$
From the definition of $L$ as resolution limit, it follows that $l_0 > L$. Taking the
limits in proper order (maintaining $l_0 > L$), we get
$$\lim_{L \to 0} \left\{ \lim_{l_0 \to L} \langle l^2 \rangle \right\} = (L_p/2\pi)^2. \tag{109}$$

In other words, the mean-square proper length is bounded from below at
$(L_p/2\pi)^2$. The above discussion once again illustrates how fluctuations can lead to
a stable lower bound in length scales.
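The ordered limit can be illustrated numerically. The sketch below assumes that $P(l)$ is a Gaussian with mean $l_0$ and variance $\sigma^2 = (l_0^2/4\pi^2)(L_p/L)^2$, as implied by (107), so that $\langle l^2 \rangle = l_0^2 + \sigma^2$. The function name and the choice of units ($L_p = 1$) are illustrative assumptions.

```python
import math


def mean_square_length(l0, resolution, planck_length=1.0):
    """<l^2> = l0^2 + sigma^2 for a Gaussian P(l), with sigma^2 taken from (107)."""
    sigma_squared = (l0 ** 2 / (4.0 * math.pi ** 2)) * (planck_length / resolution) ** 2
    return l0 ** 2 + sigma_squared


# Ordered limit: first let l0 approach the resolution L, then let L shrink to zero.
LOWER_BOUND = 1.0 / (4.0 * math.pi ** 2)  # (L_p / 2 pi)^2 in units with L_p = 1

if __name__ == "__main__":
    for resolution in (10.0, 1.0, 0.1, 1e-3, 1e-6):
        print(resolution, mean_square_length(resolution, resolution))
    print("expected bound:", LOWER_BOUND)
```

With $l_0 = L$ the result is $L^2 + (L_p/2\pi)^2$, which tends to $(L_p/2\pi)^2 \approx 0.0253\, L_p^2$ as $L$ shrinks, independently of the resolution.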


Even though the intermediate steps involved the resolution length L, the final
bound did not. This makes one suspect that it may be possible to obtain the lower
bound without bothering about the measurement process. We shall now see how
this can be done.
Naively, one can try to define the mean value of the spacetime interval at the
coincidence limit by the following relation:
$$\lim \langle ds^2 \rangle = \lim \langle [1 + \phi(x)]^2 \rangle\, \eta_{ik}\, dx^i dx^k. \tag{110}$$

Such a definition, however, runs into two difficulties immediately: (i) mathematically, expressions like $\langle \phi^2(x) \rangle$ are divergent and need to be 'regularized'; (ii)
physically, the line interval depends on two events, $x^i$ and $y^i = x^i + dx^i$, and it is not
meaningful to consider it at a single event $x^i$.
Fortunately, both these difficulties can be surmounted by a small modification
of our definition. We shall consider $\langle ds^2 \rangle$ to be defined as the limit
$$\lim_{x \to y} \langle ds^2 \rangle = \lim_{x \to y} \langle [1 + \phi(x)][1 + \phi(y)] \rangle\, \eta_{ik}\, dx^i dx^k. \tag{111}$$

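Here and elsewhere the 'mean values' $\langle\ \rangle$ are defined via the usual functional
integration. For example,
$$\langle \phi(x)\phi(y) \rangle = \frac{\int \mathcal{D}\phi\; \phi(x)\phi(y) \exp iA}{\int \mathcal{D}\phi\, \exp iA}, \tag{112}$$
where $A$, given by (5), becomes a functional of $\phi(x)$:
$$A = -\frac{1}{2L_p^2} \int \phi^i \phi_i\, d^4x. \tag{113}$$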

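Using (113), (112) and (111), we get (with $l_0^2(x, y) = \eta_{ik}\, dx^i dx^k$)
$$\lim_{x \to y} \langle ds^2 \rangle = \lim_{x \to y} \langle \phi(x)\phi(y) \rangle\, l_0^2(x, y) = \lim_{x \to y} (L_p/2\pi)^2\, \frac{1}{l_0^2}\, l_0^2 = (L_p/2\pi)^2. \tag{114}$$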

In other words, the mean-square value of the line interval is bounded from below
at $(L_p/2\pi)^2$. The discussion in the previous section illustrates how measurements
respect this fact. In analogy with the zero-point energy of the harmonic oscillator,
we may attribute a 'zero-point length' between any two events in the spacetime.
This result can also be looked upon as an 'uncertainty relation' between proper
length and conformal factor. We can interpret $\sigma$ in (105) as the uncertainty $\Delta l$ in
the measurement of proper length. If $\Delta\phi$ denotes the uncertainty in the
conformal factor, then (106) can be stated equivalently as
(115)
All previous results in quantum cosmology are consistent with this principle.
The result in (114) is valid in any spacetime. Though the expectation value
$\langle \phi(x)\phi(y) \rangle$ is a complicated function of $(x, y)$ in a general spacetime, the
coincidence limit of $\langle \phi(x)\phi(y) \rangle$ is dominated by the flat space behaviour and
diverges as $l_0^{-2}$. Clearly, this feature is enough to reproduce (114).
We shall now examine the physical consequences of this result.


It is well known that quantum field theory is bedevilled by divergences when


straightforward perturbative techniques are used. Because of this situation, it is
impossible to construct a quantum theory from an arbitrary classical theory.
Such a helplessness has forced the physicist to religiously adhere to a small subset
of all possible theories, in which the divergences can be systematically removed.
It has been repeatedly suggested in the literature (see ref. [34]) that gravity might
provide a universal cut-off required to remove the ultraviolet divergences. Since
a lower bound on the length scale is equivalent to an ultraviolet cut-off, our result
in the previous sections has an important bearing on field theory.
Consider, for example, the usual definition for the two-point Green's function
[for a scalar field $\eta(x)$] via the path integral
$$G_0(x, y) = \frac{\int \mathcal{D}\eta\; \eta(x)\eta(y) \exp iA}{\int \mathcal{D}\eta\, \exp iA} \tag{116}$$
with
$$A[\eta] = \frac{1}{2} \int \eta^i \eta_i\, d^4x. \tag{117}$$

The above equations presuppose that the spacetime is flat (or, at least, $g_{ik}$ is fixed
at some value). Since such an assumption is incorrect at high energies, we should
modify $G_0(x, y)$ in the following way: let $G_0(x, y; g_{ik})$ denote the Green's function
when the spacetime metric is $g_{ik}$. Then the correct Green's function is obtained by
averaging over various metrics with the proper weightage $\exp iA[g_{ik}]$:
$$G_{\rm true}(x, y) = \int \mathcal{D}g_{ik}\; G_0(x, y; g_{ik}) \exp iA[g_{ik}]. \tag{118}$$
Performing this calculation (see ref. [35] for details), we get the final answer as
$$G_{\rm true}(x, y) = \frac{1}{4\pi^2}\, \frac{1}{(x - y)^2 + (L_p/2\pi)^2 - i\epsilon}. \tag{119}$$

Note that $G_{\rm true}(x, y)$ has a finite coincidence limit as $x \to y$, in sharp contrast
with $G_0(x, y)$. Field theory calculations using $G_{\rm true}$ will be free from coincidence
limit divergences. For example, it has been shown (in ref. [35]) that the one-loop
effective action is finite when gravitational effects are taken into account.
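A small numerical sketch makes the contrast concrete. It assumes the form $(1/4\pi^2)\,[(x-y)^2 + (L_p/2\pi)^2]^{-1}$ for the modified Green's function, treats the squared separation as a non-negative number $s^2$, and drops the $i\epsilon$ prescription; these simplifications, the function names and the units ($L_p = 1$) are assumptions made here for illustration.

```python
import math


def green_flat(s_squared):
    """Flat-space form 1 / (4 pi^2 s^2); diverges as the separation goes to zero."""
    return 1.0 / (4.0 * math.pi ** 2 * s_squared)


def green_regulated(s_squared, planck_length=1.0):
    """Assumed regulated form 1 / (4 pi^2 [s^2 + (L_p / 2 pi)^2])."""
    zero_point_sq = (planck_length / (2.0 * math.pi)) ** 2
    return 1.0 / (4.0 * math.pi ** 2 * (s_squared + zero_point_sq))


if __name__ == "__main__":
    for s_squared in (1e4, 1.0, 1e-4, 1e-8):
        print(s_squared, green_flat(s_squared), green_regulated(s_squared))
    print("coincidence limit:", green_regulated(0.0))
```

Under these assumptions the coincidence limit equals $1/L_p^2$, while at separations much larger than $L_p$ the two expressions agree to high accuracy.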
It was always hoped that Planck's length would play a crucial role in quantum
gravity. Analyzing the conformal degree of freedom, one is led to envisage a far
more fundamental role for Planck's length. It provides a universal 'lattice
spacing' for the spacetime.

5. Critique, Comparison and Open Questions

Every approach to quantum cosmology has its own merits and drawbacks, and
the approach based on conformal quantization is no exception. The major
advantages of this method are the following:


(i) It provides a general, nonperturbative framework to quantize at least one
particular degree of freedom of gravity. There is no a priori limitation on the
domain of applicability.
(ii) The physical interpretation of the wavefunction (for the conformal factor) is
straightforward. Probability interpretation and a choice of time label exist,
thanks to the generalized Wheeler-De Witt equation.
(iii) Systematic analysis of FRW Universes is possible within the model and the
results are theoretically satisfactory. The quantum universes are nonsingular,
horizon free, and have the correct classical limit.
(iv) The conformal fluctuations and the Planck length play a crucial role in
small-distance physics, as shown in Section 4.2. Thus, the approach adopted here
has an important bearing on quantum gravity and is by no means limited to
quantum cosmology.
The major drawbacks of the approach are:
(i) The approach is incomplete and deals with only one degree of freedom.
There is no assurance that the results will have any relevance in a full quantum
gravity model.
(ii) The constraint equation is treated differently from the dynamical equations,
thereby sacrificing the invariance under the relabelling of time coordinate.
(iii) There is no way the model can uniquely specify the quantum state of the
Universe. This has to come as an additional input (like the demanding of
minimum uncertainty).
Of these three drawbacks, (i) is common to all approaches to quantum
cosmology. The transition from the WD equation to the GWD equation (which
leads to (ii) above), is somewhat ad hoc; however, this prescription leads to
tremendous simplification of the conceptual issues. As discussed earlier in the
text, we feel that (ii) is a small price to pay. It should be noted that every other
approach to quantum cosmology has failed to provide simple answers to the
questions we have raised and answered.
The difficulty (iii) is also common to most approaches to quantum cosmology.
The only formalism of quantum cosmology that attempts to tackle (iii) is a recent
approach due to Hartle and Hawking [36]. We have expressed the path integral
kernel in Section 2 as
$$K[\Omega_2, t; \Omega_1, 0] = \int \mathcal{D}\Omega\, \exp iA[\Omega] = \sum_E e^{iEt}\, \psi_E(\Omega_2)\, \psi_E^*(\Omega_1), \tag{120}$$
where $\psi_E(\Omega)$ are the QSGs corresponding to the GWD equation. Analytically
continuing $K(t)$ to imaginary times (by putting $t = -iT$) and taking the limit
$T \to \infty$, we get
(121)

Taking $\Omega_1 = 0$ and calling $\Omega_2 = \Omega$, we have a general prescription for obtaining
the $E = 0$ wavefunction of the GWD equation which, of course, satisfies the WD equation.
Thus, the WD wavefunction has the path-integral representation
$$\psi_{E=0}(\Omega) = N K(\Omega, \infty; 0, 0) = N \int \mathcal{D}\Omega\, \exp(-A_E). \tag{122}$$
[We have renamed $\psi_0^*(0)$ as $N$, which only affects the normalization; $A_E$ is the
Euclidean action.] Hartle and Hawking suggest that the path integral in (122) be
evaluated over all Euclidean spacetimes with compact support satisfying the
boundary condition. This procedure singles out a particular solution to the WD
equation, thereby avoiding (iii).
There are, however, certain problems in this approach which one should keep
in mind:
(a) The approach is based on the Euclidean version of gravity. As we said
before, the Euclidean action is not positive definite, because of the 'wrong sign' for
$\Omega$. An extra ad hoc prescription which can change the nature of physics [see
discussion on p. 379] is needed.
(b) There are many Lorentzian spacetimes which do not have a positive definite
Euclidean extension. Clearly, the Euclidean approach to quantum gravity
ignores these spacetimes.
(c) As in any approach based on the WD equation, the wavefunction is 'timeless'.
The accompanying problems of interpretation are nontrivial.
(d) The form of a metric in the Euclidean domain depends crucially on the
choice of the original time coordinate. For example, de Sitter spacetime will not
yield a real, positive definite Euclidean metric if the analytic continuation of the
time coordinates in (90) or (91) is performed. [There is a different coordinate
choice which will allow analytic continuation.] Thus, the Euclidean prescription
does partially restrict the freedom of time relabelling.
Lastly, we mention for comparison that the Hartle-Hawking solution can
always be obtained from the approach described in the previous sections, by
imposing extra conditions [37].
We conclude with a mention of three open questions which remain to be
tackled and probably can be tackled with currently available machinery:
(i) Why is the cosmological constant small? Repeated attempts to answer this
question in the classical language have failed. Some indications to how quantum
cosmology may tackle this question can be found in ref. [38]. A clear solution to
this problem will be a feather in the cap of any approach to quantum cosmology.
(ii) How much matter is there in the Universe? There are many models for
cosmogenesis which 'create' a k = + 1, FRW model. In such a closed model, one
can define the 'total amount of matter'. It would be interesting to compute this
quantity from first principles by using a realistic - or at least, semi-realistic description for matter.


(iii) What happens in the gravitational collapse of a star? Two kinds of


singularities have plagued classical gravity - the big bang and the black hole. If
quantum gravity prevents one, presumably it prevents the other as well.
[Discussion in Section 4, as well as the work in ref. [9] confirms this hope.] But, so
far, no one has carried out a systematic analysis of quantum gravitational effects
on black-hole collapse.
The story of quantum cosmology has hardly begun. Nevertheless, the cast
appears to be interesting, and the plot appears to be sufficiently unpredictable!
Appendix 1: Schrödinger Approach to Field Theory

Consider a classical system based on the action functional
$$A = \int \left[ \tfrac{1}{2} \dot{q}^2 - V(q, t) \right] dt. \tag{1}$$
In the path integral quantization of this system, the central quantity is the kernel
$$K(q_2 t_2; q_1 t_1) = \int \mathcal{D}q\, \exp iA[q(t)], \tag{2}$$
which represents the probability amplitude for the transition from $(q_1, t_1)$ to $(q_2, t_2)$.
Equations (1) and (2) describe classical and quantum dynamics; they are
independent of boundary conditions. Given a boundary condition in the form of
an initial wavefunction $\psi(q_1, 0)$, (2) provides the wavefunction at later times:
$$\psi(q, t) = \int K(q, t; q_1, 0)\, \psi(q_1, 0)\, dq_1. \tag{3}$$
We can show by simple algebra that $\psi(q, t)$ satisfies the Schrödinger equation
$$i \frac{\partial \psi}{\partial t} = -\frac{1}{2} \frac{\partial^2 \psi}{\partial q^2} + V(q)\, \psi. \tag{4}$$

Equations (1)-(4) can be generalized to field theory in a straightforward
manner. Consider the action
$$A = \int\!\!\int dt\, d^3x \left[ \tfrac{1}{2} q_i q^i - V(q) \right] \tag{5}$$
in which $q(x, t)$ is a scalar field. The probability amplitude for the field to proceed
from $q_1(x)$ (at $t_1$) to $q_2(x)$ (at $t_2$) is given by the kernel
$$K[q_2(x), t_2; q_1(x), t_1] = \int \mathcal{D}q\, \exp iA[q(x, t)]. \tag{6}$$
Given the initial wave-'functional' $\Psi[q(x), 0]$, the kernel in (6) will provide the
wavefunctional at later times:
$$\Psi[q(x), t] = \int \mathcal{D}q_1\; K[q(x), t; q_1(x), 0]\, \Psi[q_1(x), 0]. \tag{7}$$
This wavefunctional satisfies the functional Schrödinger equation
$$i \frac{\partial \Psi}{\partial t} = H \Psi, \tag{8}$$
where
$$H = \int d^3x \left[ -\frac{1}{2} \frac{\delta^2}{\delta q^2(x)} + \frac{1}{2} |\nabla q|^2 + V(q) \right]. \tag{9}$$

As in point-quantum mechanics, $|\Psi(q(x), t)|^2$ represents the probability for the
field to have the value $q(x)$ at time $t$. [If a manifestly Lorentz invariant formalism
is desired, one may consider $\Psi(q(x), \sigma)$ as a functional of the space-like
hypersurface $\sigma$ and replace the time derivative in (8) by a functional derivative w.r.t.
$\sigma$.]

The usual approach to field quantization (in Heisenberg picture) is specifically


tuned to the particle physics needs [e.g. the Fock basis with 0, 1,2, ... 'particle'
states arises naturally in this approach]. On the other hand, the Schrodinger
approach is convenient in describing coherent excitations and semi-classical
limits.
One more crucial difference between the two approaches comes to light when
we consider an unbounded Hamiltonian. If V(q) in (1) or (9) is unbounded from
below, then a stable ground state (or 'vacuum') will not exist. The usual
field-theoretic description based on particle number basis becomes useless. On
the other hand, (7), (8), (9) continue to be valid! Given an initial wavefunction, we
can predict the future evolution of the system irrespective of the nonexistence of
a stable ground state. This feature makes the functional approach very attractive in
quantum cosmology.
Appendix 2: The Wheeler-De Witt Equation

Consider a line element expressed in a gauge with $g_{00} = 1$ and $g_{0\alpha} = 0$
[$\alpha$, $\beta$ etc. = 1, 2, 3; $i$, $k$ etc. = 0, 1, 2, 3]:
$$ds^2 = dt^2 + g_{\alpha\beta}\, dx^\alpha dx^\beta. \tag{1}$$

By straightforward computation, one can express the Einstein action
$$A = \frac{1}{16\pi G} \int R \sqrt{-g}\; d^3x\, dt \tag{2}$$
in the form
$$A = \int dt \int d^3x\, \sqrt{-{}^3g}\, \left\{ {}^3R - (\mathrm{Tr}\, K)^2 + \mathrm{Tr}\, K^2 \right\}, \tag{3}$$
where quantities with the prefix 3 are evaluated for the three-geometry $g_{\alpha\beta}$. The
quantity $K_{\alpha\beta}$ ('extrinsic curvature') is given by
(4)


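Some algebraic manipulation with (3) will lead to an equivalent expression
$$A = \int dt \int d^3x \left[ \tfrac{1}{4} G^{\mu\nu\alpha\beta}\, \dot{g}_{\mu\nu}\, \dot{g}_{\alpha\beta} + \sqrt{-{}^3g}\; {}^3R \right], \tag{5}$$
where
$$G^{\mu\nu\alpha\beta} = \tfrac{1}{2} (-{}^3g)^{1/2} \left\{ g^{\mu\alpha} g^{\nu\beta} + g^{\mu\beta} g^{\nu\alpha} - 2 g^{\mu\nu} g^{\alpha\beta} \right\}. \tag{6}$$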

The action in (5) describes the dynamics of the three-metric $g_{\alpha\beta}$. By varying $g_{\alpha\beta}$ in
(3), one can reproduce the space-space part of Einstein's equations. On the
other hand, we have lost our handle on the 'time-time' and 'space-time' parts of
Einstein's equations. How can we recover these equations?
We know that the $(0, 0)$ and $(0, \alpha)$ components of Einstein's equations are
constraint equations expressing invariance under coordinate transformations. We may, therefore, demand as an additional input the invariance of (5)
under coordinate transformations. This will lead to the following constraint
equations:
$$\sqrt{-{}^3g}\, \left\{ {}^3R + (\mathrm{Tr}\, K)^2 - \mathrm{Tr}\, K^2 \right\} = 0, \tag{7}$$
$$\left\{ \sqrt{-{}^3g}\, (K^{\mu\alpha} - g^{\mu\alpha} K) \right\}_{;\alpha} = 0. \tag{8}$$

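One may now construct a quantum theory for $g_{\alpha\beta}$ based on (5). This will lead to
a (functional) Schrödinger equation
$$G^{\mu\nu\alpha\beta}\, \frac{\delta^2 \Psi}{\delta g_{\mu\nu}\, \delta g_{\alpha\beta}} + \sqrt{-{}^3g}\; {}^3R\, \Psi = E \Psi. \tag{9}$$
This equation (which corresponds to our GWD equation) does not incorporate
the constraints (7) and (8). We impose these constraints separately:
$$G^{\mu\nu\alpha\beta}\, \frac{\delta^2 \Psi}{\delta g_{\mu\nu}\, \delta g_{\alpha\beta}} + \sqrt{-{}^3g}\; {}^3R\, \Psi = 0, \tag{10}$$
$$\left\{ \frac{\delta \Psi}{\delta g_{\alpha\beta}} \right\}_{;\beta} = 0. \tag{11}$$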

Comparing (10) and (9), we see that any solution of (10) will also satisfy (9)
trivially, with $E = 0$. Equation (10) is called the 'Einstein-Schrödinger equation'
by Wheeler and De Witt and the 'Wheeler-De Witt equation' by everybody else.
It contains all the dynamics of $g_{\alpha\beta}$ [since (9) follows from (10)], subject to the
constraint of coordinate invariance. In arriving at (9) from (5), one has pushed the
factor-ordering problems under the rug.
When $g_{\alpha\beta}$ represents the spatial geometry of a $k = 1$ FRW Universe, (7) reduces
to the WD equation discussed before.
A solution to the WD equation will be a functional of the 'three geometry' G and
will not depend explicitly on $g_{\alpha\beta}$. [This is a direct consequence of the constraints.]
In principle, the three geometry contains information about the metric on the
three-surface as well as about the intrinsic time. In practice, however, disentangling
these two pieces can be a nontrivial task!

Notes and References

1. L. Smolin (1979), What is the problem of quantum gravity?, preprint based on part 1 of the thesis
(Harvard University).
2. These questions, of course, are discussed in the literature. Some good reviews of quantum gravity are
by C. J. Isham, in Quantum Gravity I & II (eds. C. J. Isham, R. Penrose and D. W. Sciama;
C.U.P., 1975, 1980); J. A. Wheeler in Relativity, Groups and Topology (eds. B. S. De Witt and C. De
Witt; Gordon and Breach, 1964); J. Hartle in The Very Early Universe (eds. G. W. Gibbons, S. W.
Hawking, S. T. C. Siklos; Cambridge, 1983) and the various articles in Quantum Theory of Gravity
(ed. S. Christensen; Adam Hilger, 1984).
3. E.g. G. 't Hooft (1973), Nucl. Phys. B62, 444; S. Deser and P. Van Nieuwenhuizen (1974), Phys.
Rev. D10, 401, 411.
4. T. Padmanabhan, T. R. Seshadri and T. P. Singh (1985), J. Mod. Phys. (to appear); K. Eppley and
E. Hannah (1977), Found. Phys. 7, 51; D. N. Page and C. D. Geilker (1981), Phys. Rev. Letts. 47,
979.
5. There are many good reviews on these subjects; for example, see P. Van Nieuwenhuizen (1983),
Relativity, Groups and Topology II (eds. B. S. De Witt and R. Stora; North-Holland); J. Scherk
(1975), Rev. Mod. Phys. 47, 123; J. H. Schwarz (1982), Phys. Rep. 89, 223.
6. Some limited application of nonperturbative techniques in quantum gravity can be found in T. E.
Tomboulis, Phys. Letts. 70B (1977), 361; 97B (1980), 77; L. Smolin, Nucl. Phys. B208 (1982), 439; S.
Weinberg (1979) in General Relativity - An Einstein Centenary Survey (eds. S. W. Hawking and W.
Israel; Cambridge).
7. For classical and quantum description of homogeneous cosmologies, see M. Ryan, Hamiltonian
Cosmology (Springer, 1972); M. A. H. MacCallum in General Relativity - An Einstein Centenary Survey
(Cambridge, 1979); C. W. Misner in Magic without Magic (ed. J. Klauder; Freeman, 1972).
8. For detailed discussion of 'minisuperspace', see C. W. Misner, op. cit. (ref. [7]) and B. S. De Witt
(1967), Phys. Rev. 162, 1195.
9. This approach was initiated by J. V. Narlikar in (1979), M. N. Roy. Astron. Soc. 183, 159; Gen. Rel.
Grav. (1979), 10, 883 and was developed further by the author; T. Padmanabhan (1982),
PhD thesis.
10. It is easy to show that null geodesics remain null geodesics under conformal transformations,
making the light cones conformally invariant. The converse is somewhat more difficult to prove
but can be done.
11. Coordinate transformations expressing k = 1 FRW Universes in the conformally flat form are
given in, e.g., F. Hoyle and J. V. Narlikar (1974), Action at a Distance in Physics and Cosmology
(Freeman).
12. A down-to-earth discussion of path integrals can be found in R. P. Feynman and A. R. Hibbs,
Quantum Mechanics and Path Integrals (McGraw-Hill, 1965). A flavour of more advanced topics
can be found in L. S. Schulman, Techniques and Applications of Path Integration (Wiley, 1981).
13. For many of the details in this section and the next two, see J. V. Narlikar and T. Padmanabhan
(1983), Phys. Repts. 100, 152 and T. Padmanabhan (1983), Phys. Rev. D28, 745.
14. An excellent discussion of this and related issues can be found in K. Kuchar, Canonical
quantisation of gravity, in Relativity, Astrophysics and Cosmology (ed. W. Israel; Reidel, 1973);
also see J. Hartle and K. Kuchar in Quantum Theory of Gravity, op. cit. (ref. [2]). The problem of
separating temporal information from dynamical information is stressed in Kuchar's article. For
a more recent discussion, see T. Banks (1985), Nucl. Phys. B249, 332.
15. The 'usual' conclusion drawn is that "time is a semi-classical concept which cannot be extended
... into the domain of quantum gravity". See, e.g., T. Banks, op. cit. (ref. [14]), p. 336.


16. For previous discussion of this topic, see Hawking's contribution in General Relativity - An
Einstein Centenary Survey, op. cit. (ref. [6]).
17. This should be clear from the Schrödinger approach to field quantization described in Appendix
1. Also see J. Greensite and M. B. Halpern, Nucl. Phys. B242 (1984), 167.
18. This prescription is discussed, e.g., in ref. [16].
19. The Wheeler-De Witt equation is discussed in many places. Particularly useful articles are the
ones by Wheeler (1964; cited in ref. [2]), Kuchar (cited in ref. [14]) and D. R. Brill and R. H.
Gowdy, Rep. Prog. Phys. (1970), 33, 413.
20. The role of the observer in quantum cosmology is not clear. See, e.g., W. Patton and J. A. Wheeler in
Quantum Gravity I, op. cit. (ref. [2]); T. Banks, op. cit. (ref. [14]).
21. D. R. Brill and R. H. Gowdy, Rep. Prog. Phys. (1970), 33, 413; C. W. Misner, op. cit. (ref. [7]);
K. Kuchar, op. cit. (ref. [14]).
22. S. Fulling (1973), Phys. Rev. D7, 2850; P. C. W. Davies (1975), J. Phys. 8, 365; W. G. Unruh (1976),
Phys. Rev. D14, 870; T. Padmanabhan (1982), Ap. Sp. Sci. 83, 247; Class. Q. Grav. (1984), 2, 117.
23. See papers cited in ref. [13], and Phys. Letts. (1982), 87A, 226; (1983) 15, 435.
24. The comparison between the singular behaviour of the hydrogen atom and the universe was first
suggested by Wheeler. It is explored systematically in the papers cited above (ref. [23]).
25. See, e.g., J. V. Narlikar (1979), M. N. Roy. Astron. Soc. 183, 159.
26. J. V. Narlikar (1984), Found. Phys. 14, 443; J. V. Narlikar (1981), Found. Phys. 11, 473; T.
Padmanabhan and J. V. Narlikar (1982), Nature 295, 677.
27. See, e.g., the papers in ref. [14].
28. The application of QSG is not limited to FRW models; for the application of QSG to other cases,
see T. Padmanabhan (1982), Gen. Rel. Grav. 14, 549; Int. J. Theor. Phys. (1983), 22, 1023 and Class.
Q. Grav. (1984), 1, 149.
29. This semiclassical approximation is developed in T. Padmanabhan (1983), Phys. Rev. D28, 745;
detailed application to cosmology can be found in T. Padmanabhan (1983), Phys. Rev. D28, 756.
30. T. Padmanabhan (1983), Gen. Rel. Grav. 15, 435; Int. J. Theor. Phys. (1983), 22, 1023.
31. The analysis here is based on T. Padmanabhan (1983), Phys. Letts. 93A, 116. Creating the universe
is a very popular pastime of quantum cosmologists; see, e.g., R. Brout et al., Nucl. Phys. (1980),
170, 228; L. Lindley, Nature (1981), 291, 391; A. Vilenkin (1983), Phys. Rev. D27, 2848.
32. T. Padmanabhan (1984), Phys. Letts. 104A, 196.
33. This is similar to the vacuum functional in electromagnetism and is derived in T. Padmanabhan
(1983), Phys. Letts. 96A, 110. For a detailed discussion of Equation (105), see Gen. Rel. Grav. (1985),
17, 215.
34. E.g., see B. De Witt, Phys. Rev. Letts. (1964), 13, 114; C. J. Isham, A. Salam and J. Strathdee (1971),
Phys. Rev. D3, 1805.
35. T. Padmanabhan (1985), Ann. Phys. 165, 38; Current Science (1985), 54, 912.
36. J. Hartle and S. W. Hawking (1983), Phys. Rev. D28, 2960; S. W. Hawking (1984), Nucl. Phys.
B239, 257.
37. For example, the 'wave function of the inflationary Universe' can be obtained as an E = 0 solution
of the GWD equation for the action in Equation (95). T. Padmanabhan (1985), TIFR preprint.
38. E. Baum, Phys. Letts. 133B (1983), 185.

'A dream-child moving through a land
of wonders wild and new,
In friendly chat with a bird or beast -
And half believe it true.'
- C. L. Dodgson (Lewis Carroll).

19. The Photon, the Graviton and the Gravitino

A. MAHESHWARI
Physics Department, Regional College of Education, Mysore 570000, India

The purpose of this chapter is to introduce the gauge fields and spinors in four
dimensions and their generalization to spacetime dimensions $d$ ($d > 4$), as they are
basic to setting up the Kaluza-Klein and supergravity-type theories.

1. The Photon

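The electromagnetic field $A_\mu$ is typical of all the gauge fields, and a systematic
study of its properties in four dimensions makes it easier to understand the
properties of the other two gauge fields, the graviton $g_{\mu\nu}$ and the gravitino $\psi_\mu$.
The Maxwell equations follow from the Lagrangian
$$\mathcal{L}_{EM} = -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu}, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{1}$$
This Lagrangian is invariant under the gauge transformation of the electromagnetic potential $A_\mu$,
$$A_\mu \to A_\mu + \partial_\mu \Lambda(x). \tag{2}$$
The field equations for $A_\mu$,
$$\partial_\nu \frac{\partial \mathcal{L}_{EM}}{\partial A_{\mu,\nu}} - \frac{\partial \mathcal{L}_{EM}}{\partial A_\mu} = 0,$$
are
$$\partial_\lambda (\partial^\lambda A^\mu - \partial^\mu A^\lambda) = 0. \tag{3}$$
It is easily checked that the field equations for the field $A_\mu$,
$$\Box A_\mu - \partial_\mu \partial^\lambda A_\lambda = 0, \qquad \Box = \partial_\lambda \partial^\lambda, \tag{4}$$
are singular (the differential operator acting on $A_\mu$ has no inverse). The field equations can be solved by using the freedom of the gauge
transformation of $A_\mu$ in fixing a gauge. A convenient gauge is the covariant
Lorentz gauge
$$\partial^\mu A_\mu = 0. \tag{5}$$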
(5)
B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 405-413.
© 1989 by Kluwer Academic Press.


A gauge function $\Lambda$ can be chosen to fix $A_\mu$ so that it satisfies the Lorentz gauge
condition. In this gauge, the field equation (4) reduces to the nonsingular form
$$\Box A_\mu = 0. \tag{6}$$
An important observation is that the imposition of the gauge condition has
reduced the number of independent degrees of freedom of the field $A_\mu$ from 4 to 3.
However, a unique solution of the Maxwell equations cannot yet be found,
because it is still possible to perform gauge transformations on the field $A_\mu$ so that
it continues to be a solution of Equation (6) and obeys the Lorentz gauge
condition. It is easily checked that if $A_\mu$ is a solution of Equations (5) and (6), then
so is $A_\mu + \delta A_\mu$, where $\delta A_\mu = \partial_\mu \Lambda(x)$, provided
$$\Box \Lambda(x) = 0. \tag{7}$$

This freedom of applying the gauge transformation twice to find an unambiguous
solution of Maxwell's equations, as we shall see, is typical of all gauge fields. The
double gauge transformation imposes one more restriction on the field $A_\mu$ and
further reduces the number of independent degrees of freedom by one, from three
to two. We therefore expect that in four-dimensional spacetime the
electromagnetic field $A_\mu$ will describe zero mass quanta with two degrees of
freedom. This can be seen explicitly by writing plane wave solutions of Equations
(5) and (6) with momentum $p^\mu$. Without loss of generality, we choose $p^\mu = (p^0, 0, 0, p^3)$.
The general solution of Equations (5) and (6) can be written in terms of two
independent polarization vectors $\epsilon^1_\mu$ and $\epsilon^2_\mu$,

$$\epsilon^1_\mu = (0, 1, 0, 0), \qquad \epsilon^2_\mu = (0, 0, 1, 0), \tag{8}$$

$$A_\mu = \epsilon^1_\mu\, e^{ip\cdot x} + \epsilon^2_\mu\, e^{ip\cdot x} + \text{c.c.}, \tag{9}$$

with

$$p_\mu p^\mu = 0 \tag{10}$$

and

$$\epsilon^i_\mu p^\mu = 0, \qquad i = 1, 2. \tag{11}$$
If we choose the following independent linear combinations of the polarization
vectors,

$$\epsilon^{\pm}_\mu = \tfrac{1}{\sqrt{2}}\left(\epsilon^1_\mu \mp i\,\epsilon^2_\mu\right), \tag{12}$$

then, under a rotation by angle $\theta$ about the spatial z-axis,

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta, \qquad z' = z, \tag{13}$$

the $\epsilon^{\pm}_\mu$ transform as

$$\epsilon'^{\pm}_\mu = \epsilon^{\pm}_\mu\, e^{\mp i\theta}. \tag{14}$$

These can be identified as the +1 and -1 helicity modes, or the left and right circular
polarizations, of the photon. The photon is a zero mass field, as can be noted from
Equation (10). We therefore conclude that the electromagnetic field $A_\mu$, which
appears in Maxwell's Lagrangian, is a massless vector field with helicities $\pm 1$.
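The phase acquired by the circular combinations can be checked numerically. The following is a minimal sketch, assuming the unnormalized combinations $\epsilon^1_\mu \pm i\,\epsilon^2_\mu$ and the rotation of Equation (13) acting on the 1 and 2 components; the function names are illustrative and not part of the text.

```python
import cmath
import math

EPS1 = (0, 1, 0, 0)  # polarization vector along x
EPS2 = (0, 0, 1, 0)  # polarization vector along y


def rotate_about_z(vec, theta):
    """Rotate the 1 and 2 components as in Equation (13); 0 and 3 are untouched."""
    v0, v1, v2, v3 = vec
    c, s = math.cos(theta), math.sin(theta)
    return (v0, c * v1 + s * v2, -s * v1 + c * v2, v3)


def circular(sign):
    """Return EPS1 + sign * i * EPS2 for sign = +1 or -1 (normalization omitted)."""
    return tuple(a + sign * 1j * b for a, b in zip(EPS1, EPS2))


def rotation_phase(sign, theta):
    """Return the common factor relating the rotated vector to the original one."""
    original = circular(sign)
    rotated = rotate_about_z(original, theta)
    factor = rotated[1] / original[1]
    # A definite-helicity state must scale uniformly in every component.
    assert all(abs(r - factor * o) < 1e-12 for r, o in zip(rotated, original))
    return factor


def expected_phase(sign, theta):
    """The pure phase exp(sign * i * theta) that the combination should acquire."""
    return cmath.exp(sign * 1j * theta)
```

In this sketch `rotation_phase(+1, theta)` agrees with $e^{+i\theta}$ and `rotation_phase(-1, theta)` with $e^{-i\theta}$, so the two combinations carry opposite helicities of magnitude one.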
We can easily generalize the discussion to d-dimensional spacetime with one
time and d - 1 space coordinates. We shall distinguish world indices in higher
dimensions by capital Latin indices M, N, P, etc. The Lagrangian for the
Maxwell field in d dimensions will be

$$\mathcal{L} = -\tfrac{1}{4} F_{MN} F^{MN}, \qquad F_{MN} = \partial_M A_N - \partial_N A_M. \tag{15}$$

The gauge transformations of the field $A_M$ are

$$A_M \to A_M + \partial_M \Lambda. \tag{16}$$

The freedom of applying the gauge transformation twice restricts the number of
independent massless degrees of freedom of the potential $A_M$ to d - 2.
2. The Graviton

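The existence of the graviton field can be seen by carrying out a straightforward
flat space expansion of the Einstein-Hilbert Lagrangian density

$$S = \int \mathcal{L}_{E\text{-}H}\, d^4x, \qquad \mathcal{L}_{E\text{-}H} = \sqrt{-g}\, R, \tag{17}$$

where g is the determinant of the metric tensor $g_{\mu\nu}$ and R is the scalar curvature.
We use the following definitions of the scalar curvature

(18)

The Christoffel symbols $\Gamma^\lambda_{\mu\nu}$ are given in terms of the derivatives of the metric
tensor,

$$\Gamma^\lambda_{\mu\nu} = \tfrac{1}{2}\, g^{\lambda\rho}\left(\partial_\mu g_{\rho\nu} + \partial_\nu g_{\rho\mu} - \partial_\rho g_{\mu\nu}\right). \tag{19}$$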

The flat space expansion of (17) can be carried out using the Gupta procedure.
A symmetric tensor field $h_{\mu\nu}$ is related to the metric tensor $g_{\mu\nu}$ by subtracting from
it the flat space Minkowski metric tensor $\eta_{\mu\nu}$,

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}. \tag{20}$$

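In the flat space expansion, the tensor indices are Minkowskian. The curved
space geometrical objects are expressed as a series expansion in $h_{\mu\nu}$,

$$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} + h^{\mu\lambda} h_\lambda{}^\nu + O(h^3),$$

$$g \equiv \det g_{\mu\nu} = -1 - h^\lambda{}_\lambda + \tfrac{1}{2}\left(h^\alpha{}_\beta h^\beta{}_\alpha - h^\alpha{}_\alpha h^\beta{}_\beta\right) + O(h^3),$$

$$\Gamma^\lambda_{\mu\nu} = \tfrac{1}{2}\left(\partial_\mu h^\lambda{}_\nu + \partial_\nu h^\lambda{}_\mu - \partial^\lambda h_{\mu\nu}\right) + O(h^2). \tag{21}$$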

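The flat space expansion of the Einstein-Hilbert Lagrangian density (17) can be
carried out by using Equations (18)-(21). The bilinear terms in this expansion give the
free field Lagrangian of the gravitational field,

$$\mathcal{L}_{gravity} = \tfrac{1}{4}\left(h^{\sigma\nu,\lambda} h_{\sigma\nu,\lambda} - 2\, h^{\sigma\nu}{}_{,\nu}\, h_{\sigma\lambda}{}^{,\lambda} + 2\, h^{\sigma\nu}{}_{,\nu}\, h^\lambda{}_{\lambda,\sigma} - h^\sigma{}_{\sigma,\tau}\, h^\lambda{}_\lambda{}^{,\tau}\right). \tag{22}$$

The Lagrangian $\mathcal{L}_{gravity}$ is invariant under the Abelian gauge transformation

$$h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu \Lambda_\nu + \partial_\nu \Lambda_\mu. \tag{23}$$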
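The field equation for the field $h_{\mu\nu}$ can be obtained from the Euler-Lagrange
equation

$$\partial_\lambda \frac{\partial \mathcal{L}_{gravity}}{\partial h_{\mu\nu,\lambda}} - \frac{\partial \mathcal{L}_{gravity}}{\partial h_{\mu\nu}} = 0. \tag{24}$$

The field equation satisfied by $h_{\mu\nu}$ is

$$\Box h_{\mu\nu} - \left(\partial_\nu \partial^\alpha h_{\mu\alpha} + \partial_\mu \partial^\alpha h_{\nu\alpha}\right) + \partial_\mu \partial_\nu h^\lambda{}_\lambda + \eta_{\mu\nu}\, \partial^\alpha \partial^\beta h_{\alpha\beta} - \eta_{\mu\nu}\, \Box h^\lambda{}_\lambda = 0$$

or

$$\Box h_{\mu\nu} - \partial_\nu \partial^\alpha h_{\alpha\mu} - \partial_\mu \partial^\alpha h_{\alpha\nu} + \partial_\mu \partial_\nu h^\lambda{}_\lambda = 0. \tag{25}$$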

This equation is singular. It can be solved by using the freedom
of the gauge transformation of $h_{\mu\nu}$ to fix a gauge. A convenient gauge is the
Fock-De Donder gauge

$$\partial^\mu h_{\mu\nu} - \tfrac{1}{2}\, \partial_\nu h^\lambda{}_\lambda = 0. \tag{26}$$

In this gauge, Equation (25) reduces to the form

$$\Box h_{\mu\nu} = 0. \tag{27}$$

These equations are similar in form to the corresponding equations, (5) and (6), of
the electromagnetic case. The gauge conditions (26) reduce the independent
degrees of freedom of a symmetric tensor field from 10 to 6. However, as in the
electromagnetic case, there still remains the freedom of applying the gauge
transformation, Equation (23), once more, provided the gauge functions $\Lambda_\mu(x)$ satisfy

$$\Box \Lambda_\mu(x) = 0. \tag{28}$$
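The condition on the residual gauge functions can be obtained in one step. The following worked step assumes that the Abelian gauge transformation has the form $\delta h_{\mu\nu} = \partial_\mu \Lambda_\nu + \partial_\nu \Lambda_\mu$ (the overall sign does not affect the conclusion) and applies it to the Fock-De Donder condition $\partial^\mu h_{\mu\nu} - \tfrac{1}{2}\partial_\nu h^\lambda{}_\lambda = 0$.

```latex
\begin{align*}
\delta h_{\mu\nu} &= \partial_\mu \Lambda_\nu + \partial_\nu \Lambda_\mu ,
\qquad
\delta h^\lambda{}_\lambda = 2\,\partial^\lambda \Lambda_\lambda , \\
\delta\!\left(\partial^\mu h_{\mu\nu} - \tfrac{1}{2}\,\partial_\nu h^\lambda{}_\lambda\right)
 &= \Box \Lambda_\nu + \partial_\nu \partial^\mu \Lambda_\mu - \partial_\nu \partial^\lambda \Lambda_\lambda
 = \Box \Lambda_\nu .
\end{align*}
```

The gauge condition is therefore preserved exactly when each of the four functions $\Lambda_\nu$ is harmonic, which removes four further components and leaves 10 - 4 - 4 = 2.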
The double gauge transformations impose four more restrictions on the fields $h_{\mu\nu}$
and further reduce the number of independent degrees of freedom from six to two.
We can once again conclude that in four-dimensional spacetime the
gravitational field $h_{\mu\nu}$ will describe zero mass quanta with two degrees of freedom.
The spin of the field can be determined, as in the electromagnetic case, by
studying the plane wave solutions of Equations (26) and (27). We choose the wave
to be moving along the z-axis and fix $p^\mu = (p^0, 0, 0, p^3)$. The general solution of
Equations (26) and (27) can be written in terms of a symmetric tensor $\epsilon_{\mu\nu}$,

$$h_{\mu\nu} = \epsilon_{\mu\nu}\, e^{ip\cdot x} + \text{c.c.}, \tag{29}$$

with the condition

$$p^\mu \epsilon_{\mu\nu} - \tfrac{1}{2}\, p_\nu\, \epsilon^\lambda{}_\lambda = 0 \tag{30}$$

and

$$p_\mu p^\mu = 0. \tag{31}$$
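We have to find the independent tensors $\epsilon_{\mu\nu}$ which satisfy the restrictions (30). It is
equivalent to finding the independent components of $\epsilon_{\mu\nu}$. Equation (30) can be
written as four separate equations,

$$\epsilon_{31} + \epsilon_{01} = 0, \qquad \epsilon_{32} + \epsilon_{02} = 0,$$

$$\epsilon_{03} + \epsilon_{33} = \tfrac{1}{2}\left(\epsilon_{11} + \epsilon_{22} + \epsilon_{33} - \epsilon_{00}\right), \qquad \epsilon_{00} + \epsilon_{30} = -\tfrac{1}{2}\left(\epsilon_{11} + \epsilon_{22} + \epsilon_{33} - \epsilon_{00}\right). \tag{32}$$

These equations give

$$\epsilon_{01} = -\epsilon_{31}, \qquad \epsilon_{02} = -\epsilon_{32}, \qquad \epsilon_{03} = -\tfrac{1}{2}\left(\epsilon_{33} + \epsilon_{00}\right)$$

and

$$\epsilon_{22} = -\epsilon_{11}. \tag{33}$$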
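We still have to impose the restriction of applying the gauge transformation the
second time. If we choose the gauge function

$$\Lambda_\mu(x) = -i\,\epsilon_\mu\, e^{ip\cdot x} + \text{c.c.}, \tag{34}$$

then

$$\Box \Lambda_\mu = 0. \tag{28b}$$

Under this transformation, the polarization tensor $\epsilon_{\mu\nu}$ is transformed as

$$\epsilon'_{\mu\nu} = \epsilon_{\mu\nu} + \epsilon_\mu p_\nu + \epsilon_\nu p_\mu. \tag{35}$$

The six independent components of $\epsilon_{\mu\nu}$ change as

$$\epsilon'_{11} = \epsilon_{11}, \qquad \epsilon'_{12} = \epsilon_{12}, \qquad \epsilon'_{13} = \epsilon_{13} + \epsilon_1 p^0,$$

$$\epsilon'_{23} = \epsilon_{23} + \epsilon_2 p^0, \qquad \epsilon'_{33} = \epsilon_{33} + 2\epsilon_3 p^0, \qquad \epsilon'_{00} = \epsilon_{00} - 2\epsilon_0 p^0. \tag{36}$$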

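We note that only $\epsilon_{11}$ and $\epsilon_{12}$ have an absolute physical significance. $\epsilon_{13}$, $\epsilon_{23}$, $\epsilon_{33}$
and $\epsilon_{00}$ can be arranged to vanish by choosing the gauge transformation (34),
such that

$$\epsilon_1 = -\frac{\epsilon_{13}}{p^0}, \qquad \epsilon_2 = -\frac{\epsilon_{23}}{p^0}, \qquad \epsilon_3 = -\frac{\epsilon_{33}}{2p^0}, \qquad \epsilon_0 = \frac{\epsilon_{00}}{2p^0}. \tag{37}$$

We now identify the linear combinations of $\epsilon_{11}$, $\epsilon_{12}$ which, under spatial rotation
by angle $\theta$ about the z-axis, Equation (13), transform as definite helicity states. It is
easily seen that the combinations

$$\epsilon_{\pm} = \epsilon_{11} \mp i\,\epsilon_{12} \tag{38}$$

transform as

$$\epsilon'_{\pm} = e^{\pm 2i\theta}\, \epsilon_{\pm}, \tag{39}$$

that is, as $\pm 2$ helicity states, respectively.
We therefore conclude that the field $h_{\mu\nu}$ describes massless spin 2 quanta, called
the graviton.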
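The helicity assignment can be checked numerically. The sketch below assumes a transverse, traceless polarization block with $\epsilon_{22} = -\epsilon_{11}$, rotates it as a rank-2 tensor with the rotation matrix of Equation (13), and compares the combinations $\epsilon_{11} \mp i\,\epsilon_{12}$ before and after; the function names are illustrative only.

```python
import cmath
import math


def rotate_transverse(e11, e12, theta):
    """Rotate the 2x2 transverse block [[e11, e12], [e12, -e11]] about the z-axis.

    Returns the rotated (e11, e12) pair, using e'_ij = R_ia R_jb e_ab.
    """
    c, s = math.cos(theta), math.sin(theta)
    rot = [[c, s], [-s, c]]
    e = [[e11, e12], [e12, -e11]]
    rotated = [
        [sum(rot[i][a] * rot[j][b] * e[a][b] for a in range(2) for b in range(2))
         for j in range(2)]
        for i in range(2)
    ]
    return rotated[0][0], rotated[0][1]


def helicity_combo(e11, e12, sign):
    """sign=+1 gives e11 - i*e12, sign=-1 gives e11 + i*e12."""
    return e11 - sign * 1j * e12


def phase_factor(sign, theta):
    """The phase exp(sign * 2i * theta) expected for a helicity +-2 state."""
    return cmath.exp(sign * 2j * theta)
```

Under these assumptions the rotated combination equals `phase_factor(sign, theta)` times the original one, i.e. the phase winds twice as fast as for the photon, which is the signature of spin 2.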
In d-dimensional spacetime the symmetric field $h_{MN}$ will have $\tfrac{1}{2}d(d + 1)$
components restricted by 2d gauge conditions and, hence, will describe a zero mass
field with $\tfrac{1}{2}d(d - 3)$ independent degrees of freedom.
It may be noted that

$$\tfrac{1}{2}d(d - 3) \equiv \tfrac{1}{2}(d - 2)(d - 1) - 1$$

and this number can be interpreted as the number of independent components of
a symmetric traceless tensor restricted to d - 2 transverse directions.
3. The Gravitino

The superpartner of the graviton is called the gravitino. It is a massless field of spin
$\tfrac{3}{2}$. In order to write the couplings of spin $\tfrac{1}{2}$ and spin $\tfrac{3}{2}$ fields to gravity, it is
necessary to distinguish between Lorentz tensors and world tensors.
Therefore, a new notation is introduced at this stage. To each world tensor field,
a corresponding Lorentz tensor field can be related using the vierbeins or the
vielbeins, depending on whether the spacetime dimension is four or not. For
example, $A_\mu$ denotes a world four-vector and $A_m$ the corresponding Lorentz
four-vector.
For discussing spinors, it is convenient to use the Hermitian $\gamma$-matrices and the
Pauli-Källén metric with $x_4 = i x_0$. Before writing the Rarita-Schwinger
Lagrangian for the gravitino field, we list the conventions for $\gamma$-matrices and some
useful identities:

$$\{\gamma_m, \gamma_n\} = 2\delta_{mn}, \quad m, n = 1, 2, 3, 4, \qquad \epsilon^{1234} = \epsilon_{1234} = +1,$$

$$\gamma_5 = \gamma_1 \gamma_2 \gamma_3 \gamma_4, \qquad \gamma_5^2 = 1, \qquad \gamma_m^\dagger = \gamma_m. \tag{40}$$

In the Hermitian Dirac representation

$$\boldsymbol{\gamma} = \begin{pmatrix} 0 & -i\boldsymbol{\sigma} \\ i\boldsymbol{\sigma} & 0 \end{pmatrix}. \tag{41}$$

Useful identities:
(42)
(43)
(44)

(45)
(46)
(47)
(48)

4. Rarita-Schwinger Lagrangian

A Rarita-Schwinger spinor $\psi_a$ is a four-component Dirac spinor and carries
a Lorentz index. The Rarita-Schwinger Lagrangian for $\psi_a$ is

$$\mathcal{L}_{RS} = \tfrac{1}{2}\, \epsilon^{abcd}\, \bar\psi_a \gamma_5 \gamma_b \partial_c \psi_d. \tag{49}$$

The field equation obeyed by the Rarita-Schwinger field can be read off from (49)
to be

$$\epsilon^{abcd}\, \gamma_5 \gamma_b \partial_c \psi_d = 0. \tag{50}$$

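Let us define

$$R^a \equiv \epsilon^{abcd}\, \gamma_5 \gamma_b \partial_c \psi_d. \tag{51}$$

Using the identities listed above, the following identity can be easily obtained:

$$R^a - \tfrac{1}{2}\, \gamma^a \gamma^b R_b \equiv \not\partial\, \psi^a - \partial^a (\gamma\cdot\psi), \tag{52}$$

where $\not\partial \equiv \gamma^a \partial_a$.
Therefore, on-shell the field equation (50) can be rewritten in the form

$$\not\partial\, \psi^a - \partial^a (\gamma\cdot\psi) = 0. \tag{53}$$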

The Rarita-Schwinger Lagrangian density, up to a total four-divergence, and the
field equation (50) are invariant under the gauge transformation

$$\psi_a \to \psi_a + \partial_a \eta(x). \tag{54}$$

The gauge function $\eta(x)$ in this case is an arbitrary spacetime-dependent spinor
field. Using the gauge freedom (54), as in the case of the photon and graviton fields,
a gauge condition can be imposed on the field $\psi_a$. We choose the covariant gauge

$$\gamma\cdot\psi = 0 \tag{55}$$

by fixing $\eta$ such that

$$\gamma\cdot\psi + \gamma^a \partial_a \eta = 0.$$

In this gauge, the Rarita-Schwinger field equation, as written in the form given in
Equation (53), simplifies to

$$\not\partial\, \psi_a = 0. \tag{56}$$

Operating on Equation (56) with $\not\partial$ gives

$$\Box \psi_a = 0. \tag{57}$$

From this equation, it is clear that $\psi_a$ is a zero mass field. On multiplying
Equation (53) by $\gamma^a$, the following constraint is obtained:

$$\partial\cdot\psi = \not\partial\, \gamma\cdot\psi, \tag{58}$$

and, since $\gamma\cdot\psi = 0$,

$$\partial\cdot\psi = 0. \tag{59}$$
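The constraint follows from the anticommutator of the $\gamma$-matrices alone. The following worked step assumes the convention $\{\gamma_a, \gamma_b\} = 2\delta_{ab}$ and the on-shell equation in the form $\gamma_b \partial_b \psi_a - \partial_a(\gamma\cdot\psi) = 0$, with repeated indices summed.

```latex
\begin{align*}
0 &= \gamma_a\left[\gamma_b\,\partial_b \psi_a - \partial_a(\gamma\cdot\psi)\right]
   = \gamma_a \gamma_b\,\partial_b \psi_a - \gamma_a \partial_a(\gamma\cdot\psi) \\
  &= \left(2\delta_{ab} - \gamma_b \gamma_a\right)\partial_b \psi_a - \gamma_a \partial_a(\gamma\cdot\psi)
   = 2\,\partial\cdot\psi - 2\,\gamma_a \partial_a(\gamma\cdot\psi) .
\end{align*}
```

Hence $\partial\cdot\psi = \gamma_a \partial_a (\gamma\cdot\psi)$, and the divergence of the field vanishes as soon as the gauge condition $\gamma\cdot\psi = 0$ is imposed.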

The Rarita-Schwinger field in the gauge (55) obeys the constraint (59) and the
field equation (57). In order to find the spin of the field $\psi_a$, the procedure by now
familiar from the study of the photon and the graviton fields can be employed.
The general plane wave solution with momentum $p_a = (ip^0, 0, 0, p^0)$ can be
written as

$$\psi_a(p) = \epsilon_a^{+}\, \lambda(p) + \epsilon_a^{-}\, \chi(p) + p_a\, \varphi(p). \tag{60}$$

The $\epsilon_a^{\pm}$ are the transverse polarization vectors, which satisfy

$$p\cdot\epsilon^{\pm} = 0,$$

and the spinors $\lambda(p)$, $\chi(p)$ and $\varphi(p)$ are the solutions of the plane wave Dirac
equations

$$\not p\, \lambda(p) = \not p\, \chi(p) = \not p\, \varphi(p) = 0. \tag{61}$$

As usual, the gauge 'shoots twice'. We can gauge transform away the spinor $\varphi$. If
$u^{+}$ and $u^{-}$ are the solutions of the zero-mass Dirac equation with helicities $\pm\tfrac{1}{2}$,
respectively, then the general solution of Equations (56), (55) and (59) can be
written as

$$\psi_a(p) = \epsilon_a^{+}\, u^{+}(p) + \epsilon_a^{-}\, u^{-}(p). \tag{62}$$

The two independent terms in Equation (62) describe two physical modes of $\psi_a$
having helicities $\pm\tfrac{3}{2}$, respectively. It is easily verified that $\epsilon_a^{+} u^{+}$ and $\epsilon_a^{-} u^{-}$ obey the
gauge conditions

$$\gamma\cdot(\epsilon^{\pm} u^{\pm}) = 0; \tag{63}$$

however, $\epsilon_a^{+} u^{-}$ and $\epsilon_a^{-} u^{+}$ do not, as

$$\gamma\cdot(\epsilon^{-} u^{+}) = \sqrt{2}\, i\, u^{-}. \tag{64}$$

In d-dimensional spacetime the gravitino field $\psi_M$ will obey the obvious
generalizations of the four-dimensional equations (55), (56), (59):

$$\gamma^M \psi_M = 0, \tag{65}$$

$$\not\partial\, \psi_M = 0, \tag{66}$$

$$\partial\cdot\psi = 0. \tag{67}$$

The gauge condition shooting twice can, in a naive way, be seen as the imposition
of restrictions of the type given in (65) operating twice. It will be shown in the next
chapter that the Dirac spinor in d dimensions will have $2^{[d/2]}$ independent
components, where [d/2] denotes the integer part of d/2. Therefore, the number of
independent physical degrees of freedom contained in a gravitino field in
d dimensions will be

$$d\, 2^{[d/2]} - 3 \times 2^{[d/2]} = 2^{[d/2]}(d - 3). \tag{68}$$

If the spinor obeys other restrictions, such as the Majorana condition and/or the
Weyl condition, the number given in (68) will have to be further reduced by
a factor of 2 for each condition.
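The three counting rules of this chapter are easy to tabulate. The following short sketch simply encodes d - 2 for the photon, d(d - 3)/2 for the graviton and $2^{[d/2]}(d - 3)$ for a Dirac gravitino; the function names are chosen here for illustration, and no Majorana or Weyl reduction is applied.

```python
def dirac_components(d):
    """Number of components of a Dirac spinor in d dimensions: 2**[d/2]."""
    return 2 ** (d // 2)


def photon_dof(d):
    """Massless vector field: d components minus two gauge restrictions."""
    return d - 2


def graviton_dof(d):
    """Symmetric tensor: d(d+1)/2 components minus 2d gauge conditions."""
    return d * (d - 3) // 2


def gravitino_dof(d):
    """Dirac gravitino, Equation (68); each Majorana or Weyl condition halves it."""
    return dirac_components(d) * (d - 3)


if __name__ == "__main__":
    for d in (4, 5, 11):
        print(d, photon_dof(d), graviton_dof(d), gravitino_dof(d))
```

For d = 4 these functions return 2, 2 and 4, and for d = 11 they return 9, 44 and 256.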

20. The Vierbein, Vielbeins and Spinors in Higher Dimensions
A. MAHESHWARI
Physics Department, Regional College of Education, Mysore 570006, India

1. The Vierbein

The tensor representations of the group GL(4, R) of general linear 4 x 4
matrices behave like tensors under the subgroup of Lorentz transformations, but
there are no representations of GL(4, R) which behave like spinors under the
Lorentz subgroup. The tetrad formalism, or the vierbeins, provide an approach for
dealing with spinors in general relativity.
Let us set up at each point X a set of coordinates $\xi^a_X(x)$ that are locally inertial at
X. In the notation used, x are the coordinates labelling points of a general
Riemannian manifold. At the point X in the locally inertial frame, the metric $\eta_{ab}$ is
Minkowskian with nonzero components (-1, 1, 1, 1). The metric $g_{\mu\nu}(X)$ in the
general coordinate frame can be obtained from the Minkowskian metric and four
covariant vectors $E_\mu{}^a(X)$ by the usual general coordinate transformation rules, if

$$E_\mu{}^a(X) = \left.\frac{\partial \xi^a_X(x)}{\partial x^\mu}\right|_{x = X}, \tag{1}$$

$$g_{\mu\nu}(X) = E_\mu{}^a(X)\, E_\nu{}^b(X)\, \eta_{ab}. \tag{2}$$
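The relation between vierbein and metric can be illustrated numerically. The sketch below assumes the convention `E[mu][a]` for $E_\mu{}^a$, builds $g_{\mu\nu} = E_\mu{}^a E_\nu{}^b \eta_{ab}$ for a simple diagonal vierbein, and checks that a Lorentz boost acting on the frame index leaves the metric unchanged; the diagonal example and all function names are assumptions made for illustration.

```python
import math

# Minkowski metric in the locally inertial frame, signature (-1, 1, 1, 1).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def metric_from_vierbein(E):
    """Return g[mu][nu] = E[mu][a] * E[nu][b] * ETA[a][b]."""
    return [[sum(E[mu][a] * ETA[a][b] * E[nu][b]
                 for a in range(4) for b in range(4))
             for nu in range(4)] for mu in range(4)]


def diagonal_vierbein(scale):
    """Vierbein diag(1, scale, scale, scale), e.g. a spatially rescaled flat frame."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, scale, 0.0, 0.0],
            [0.0, 0.0, scale, 0.0],
            [0.0, 0.0, 0.0, scale]]


def boost_x(rapidity):
    """Lorentz boost along the first spatial axis, stored as Lam[b][a]."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0.0, 0.0],
            [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def lorentz_rotate(E, Lam):
    """Rotate the frame index: E'[mu][a] = E[mu][b] * Lam[b][a]."""
    return [[sum(E[mu][b] * Lam[b][a] for b in range(4)) for a in range(4)]
            for mu in range(4)]
```

The diagonal vierbein with scale s gives the metric diag(-1, s², s², s²), and the boosted vierbein gives the same metric. This is the sense in which the vierbein is a 'square root' of the metric that is fixed only up to a local Lorentz transformation.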

The four covariant vector fields $E_\mu{}^a(x)$ are called the vierbeins or the tetrad fields.
From Equation (2), one may observe that in some sense the vierbeins $E_\mu{}^a$ are the
'square root' of the metric tensor $g_{\mu\nu}$. It is assumed that the vierbeins are
nonsingular matrices and the inverse vierbein fields $E_a{}^\mu$ can be introduced by the
relations

$$E_\mu{}^a(x)\, E_a{}^\nu(x) = \delta_\mu^\nu, \tag{3}$$

$$E_\mu{}^a(x)\, E_b{}^\mu(x) = \delta^a_b. \tag{4}$$

The inverse vierbeins $E_a{}^\mu$ can be given a geometric interpretation, namely, as the
four contravariant vectors that specify the basis vectors of the linear tangent
space at each point of a curved spacetime manifold.
Note that in this formalism the locally inertial coordinates $\xi^a_X(x)$ are fixed once
and for all at each point X. So when we change our general noninertial coordinate
system from $x^\mu \to x'^\mu = x^\mu + \xi^\mu(x)$, the partial derivatives $E_\mu{}^a(x)$ change
according to the rule
(5)

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 415-421.
© 1989 by Kluwer Academic Publishers.

and under local Lorentz transformations the $E_\mu{}^a$ transform as

$$E'_\mu{}^a(x) = E_\mu{}^b(x)\, \Lambda_b{}^a(x) \tag{6}$$

with

$$\Lambda_a{}^c(x)\, \Lambda_b{}^d(x)\, \eta_{cd} = \eta_{ab}. \tag{7}$$

The vierbeins $E_\mu{}^a$ and the inverse vierbeins $E_a{}^\mu$ can be used to convert any tensor
field into a set of scalars. For example, given a contravariant vector field $A^\mu(x)$, we
can use the vierbeins to refer its components at x to the coordinate
system $\xi^a_X$ locally inertial at x:

$$A^a(x) = A^\mu(x)\, E_\mu{}^a(x) \tag{8}$$

and obtain a set of four scalars $A^a(x)$. We shall see that, using this formalism,
spinor fields like the Dirac electron field can be introduced in Riemannian
manifolds.
There are now two invariance principles which must be met in constructing
a suitable matter action:
(1) the action must be generally covariant, with all fields treated as scalars
except the tetrad;
(2) the action must be invariant with respect to the local Lorentz transformations $\Lambda_a{}^b(x)$.
In general, an arbitrary field $\psi_\alpha(x) \to \psi'_\alpha(x)$ will change according to the rule

$$\psi'_\alpha(x) = D(\Lambda(x))_\alpha{}^\beta\, \psi_\beta(x), \tag{9}$$

where $D(\Lambda)$ is a matrix representation of the Lorentz group. The Dirac field of
the electron is a coordinate scalar and Lorentz spinor, and the vierbein $E_\mu{}^a$ is
a coordinate vector and a Lorentz vector. An ordinary derivative is, of course,
a coordinate vector:

$$\frac{\partial}{\partial x^\mu} \to \frac{\partial}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\, \frac{\partial}{\partial x^\nu}. \tag{10}$$

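Note that although $E_a{}^\mu (\partial/\partial x^\mu)$ is a coordinate scalar, it does not have simple
transformation properties under position-dependent Lorentz transformations.
For example,

$$E_a{}^\mu \frac{\partial \psi}{\partial x^\mu} \to \Lambda_a{}^b(x)\, E_b{}^\mu(x)\, \frac{\partial}{\partial x^\mu}\big(D(\Lambda(x))\, \psi\big) = \Lambda_a{}^b E_b{}^\mu \left(D(\Lambda)\, \frac{\partial \psi}{\partial x^\mu} + \frac{\partial D(\Lambda)}{\partial x^\mu}\, \psi\right). \tag{11}$$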

What we need to do is to redefine a derivative $\mathcal{D}_a$ in such a way that it is not only
a coordinate scalar, but also a Lorentz vector under position-dependent Lorentz
transformations:

$$\mathcal{D}_a \equiv E_a{}^\mu D_\mu. \tag{12}$$

$D_\mu$ is the Lorentz covariant derivative, which is defined in terms of the spin
connection field $\omega_\mu{}^{ab}$, as in the standard Yang-Mills programme, so that

(13)

The $\Sigma_{ab}$ are the generators of the Lorentz group in d dimensions and satisfy the
commutator algebra
(14)
(15)

The transformation law of the spin connection fields $\omega_\mu{}^{ab}$ follows from the
definition of the Lorentz covariant derivative
(16)

It is
(17)

Under general coordinate transformations, it is natural to assume that the $\omega_\mu{}^{ab}$
transform as a covariant vector,

$$\delta\omega_\mu{}^{ab} = -\partial_\mu \xi^\nu\, \omega_\nu{}^{ab} - \xi^\nu \partial_\nu\, \omega_\mu{}^{ab}. \tag{18}$$

Combining the information that the vierbein $E_\mu{}^a$ transforms as a vector under
Lorentz transformations with the transformation law (16) of the Lorentz
covariant derivative, it can be easily checked that the derivative $\mathcal{D}_a \psi \equiv E_a{}^\mu D_\mu \psi$
transforms under local Lorentz transformations as
(19)

where the first two terms correspond to the Lorentz group generators in the
vector representation. Under general coordinate transformations, we have
(20)

so that the general coordinate transformations of $\mathcal{D}_a \psi$ are those of a generic
scalar matter field
(21)
The notion of covariant derivative can be extended to fields which, in addition
to spinor indices (belonging to specific representations of the Lorentz group), also
carry world indices, e.g. $E_\mu{}^a$, $E_a{}^\mu$. To construct such derivatives, the procedure is
first to convert world indices into Lorentz indices using the vierbeins, then to
apply the Lorentz covariant derivative $D_\mu$, and finally to convert the
Lorentz indices back into world indices. We shall denote this covariant
derivative by $\nabla_\mu$ and state the results, which are well known in Riemannian
geometry. For quantities like $\psi^\nu$, $\psi_\nu$ the covariant derivatives are given by the
relations

$$\nabla_\mu \psi^\nu \equiv D_\mu \psi^\nu + \Gamma^\nu{}_{\lambda\mu}\, \psi^\lambda, \tag{22}$$

$$\nabla_\mu \psi_\nu \equiv D_\mu \psi_\nu - \Gamma^\lambda{}_{\nu\mu}\, \psi_\lambda, \tag{23}$$

and their standard generalizations to world tensors. The affine connections
$\Gamma^\lambda{}_{\mu\nu}$ are defined by the property that

$$\nabla_\mu E_\nu{}^a = 0. \tag{24}$$

This gives

$$\Gamma^\lambda{}_{\nu\mu} = E_a{}^\lambda D_\mu E_\nu{}^a = -E_\nu{}^a D_\mu E_a{}^\lambda. \tag{25}$$

From this, it follows that the affine and spin connection fields are not
independent. Furthermore, it is easy to show that $\nabla_\mu g_{\nu\rho}$ is zero as well, from
which one deduces that the connection must satisfy
(26)

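The curvature and torsion tensors are found by using the Ricci identity, which
relates these tensors to commutators of covariant derivatives,

$$[\mathcal{D}_a, \mathcal{D}_b] \equiv -\tfrac{1}{2}\, R_{ab}{}^{cd}\, \Sigma_{cd} - R_{ab}{}^{c}\, \mathcal{D}_c, \tag{27}$$

and its generalization

$$[\nabla_\mu, \nabla_\nu] \equiv -\tfrac{1}{2}\, R_{\mu\nu}{}^{cd}\, \Sigma_{cd} - R^{\rho}{}_{\lambda\mu\nu}\, X_\rho{}^\lambda - R_{\mu\nu}{}^{\lambda}\, \nabla_\lambda, \tag{28}$$

where the $X_\rho{}^\lambda$ are the generators of the tensor representation of GL(4, R)
appropriate to the tensor nature of $\psi$.
For example, for a contravariant vector

$$(X_\rho{}^\lambda)^\alpha{}_\beta = \delta^\alpha_\rho\, \delta^\lambda_\beta,$$

and for a covariant vector

$$(X_\rho{}^\lambda)_\beta{}^\alpha = -\delta^\alpha_\rho\, \delta^\lambda_\beta. \tag{29}$$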

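It is easily checked that
(30)

$$R_{ab}{}^{c} = E_a{}^\mu E_b{}^\nu \left(D_\mu E_\nu{}^c - D_\nu E_\mu{}^c\right), \tag{31}$$

$$R^\kappa{}_{\lambda\mu\nu} = \partial_\nu \Gamma^\kappa{}_{\lambda\mu} - \partial_\mu \Gamma^\kappa{}_{\lambda\nu} + \Gamma^\tau{}_{\lambda\mu}\, \Gamma^\kappa{}_{\tau\nu} - \Gamma^\tau{}_{\lambda\nu}\, \Gamma^\kappa{}_{\tau\mu} \tag{32}$$

and
(33)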

By evaluating the identity

$$[\nabla_\mu, \nabla_\nu]\, E_\lambda{}^a = 0,$$

and using the representation of the Lorentz generators for the contravariant
vector index a,

it can be easily checked that


(34)
Equations (26) and (33) fully determine the connection in terms of the metric and
torsion tensors,
fP IlV

{~v}

+ i(R v/ + Rp/ gap gvA +

- R vp A g;'llg UP) .
We define a tensor

K/ d

K/ d

called the contorsion tensor,

iC1l Edp(R v/ Ebp

(35)

+ Rp/ Ebv -

(36)

Rv/ Eb,,).

The spin connection field can now be obtained from the definition of the affine
connection

It gives,

w ab = W ab(E)
Jl

Jl

w/b(E)

+ K Jl ab '

= i(OV(avE/ - allEvb) + Oi. Ebv Eella ;Eve (b -

a)).

(37)

The Einstein-Hilbert Lagrangian density of the gravitational field is given by
the curvature scalar

$$R(E, \omega) = E_a{}^\mu E_b{}^\nu R_{\mu\nu}{}^{ab} \tag{38}$$

and the determinant of the vierbein,

$$E \equiv \det |E_\mu{}^a|, \tag{39}$$

$$\mathcal{L}_{EH} = E\, R(E, \omega). \tag{40}$$

The coupling of gravity to the Dirac spinor field $\psi$ can be written as
(41)

In flat space, the spin $\tfrac{3}{2}$ field is the Rarita-Schwinger field and its Lagrangian is

$$\mathcal{L}_{RS} = \tfrac{1}{2}\, \epsilon^{abcd}\, \bar\psi_a \gamma_5 \gamma_b \partial_c \psi_d. \tag{42}$$

In curved space, its coupling to gravity can be easily written using the vierbein
formalism
(43)
where
(44)

and
(45)
In the absence of torsion, i.e. if the affine connection is symmetric, the term $\Gamma^\lambda{}_{\mu\nu} \psi_\lambda$
can be dropped from Equation (45).

2. Vielbeins

The vielbeins are the natural generalizations in d-dimensional spacetime of the
four-dimensional vierbeins. We shall denote by capital Latin letters taken from
the middle of the alphabet (M, N, ...) the world indices and by capital Latin
letters from the first half of the alphabet (A, B, ...) the frame indices. The vielbeins
$E_M{}^A$ are a set of d covariant vectors connected to the metric of the manifold
$g_{MN}$ and the local Minkowski metric $\eta_{AB}$ by the definition

$$g_{MN} = E_M{}^A E_N{}^B\, \eta_{AB}. \tag{46}$$

A spin connection field $\omega_M{}^{AB}$ is introduced to define local Lorentz covariant
derivatives in d dimensions as before,
(47)

The generators of the Lorentz transformations in the spinor representations are
given in terms of the $\Gamma$-matrices of d-dimensional spacetime
LCD

Hre , r D ],

(48)

where
(49)
The problem now is essentially to find the spinorial representations of the group
SO(d). This is discussed in the next section.

3. Spinors in d Dimensions

We shall state the results only and for details refer to the analysis of F. Wilczek
and A. Zee (Phys. Rev. D25 (1982), 553). Instead of writing the Clifford algebra for
SO(1, d - 1), we study the representations of the Clifford algebra

$$\{\Gamma_i, \Gamma_j\} = 2\delta_{ij} \tag{50}$$

for the group SO(d). However, the representations crucially depend upon whether
d is even or odd.
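d-even (d = 2n):
We begin by proving that there exist 2n Hermitian irreducible matrices $\Gamma_i$,
i = 1, 2, ..., 2n, which are $2^n \times 2^n$ and which satisfy

$$\{\Gamma_i, \Gamma_j\} = 2\delta_{ij}. \tag{51}$$

The proof is by explicit iterative construction. For n = 1, the matrices desired can
be chosen to be two of the Pauli matrices,

$$n = 1: \qquad \Gamma^{(1)}_1 = \tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \Gamma^{(1)}_2 = \tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{52}$$

To iterate from n to n + 1, one constructs the $\Gamma$-matrices for n + 1, denoted by
$\Gamma^{(n+1)}$, in terms of $\Gamma^{(n)}$ by

$$\Gamma^{(n+1)}_i = \begin{pmatrix} \Gamma^{(n)}_i & 0 \\ 0 & -\Gamma^{(n)}_i \end{pmatrix}, \quad i = 1, \ldots, 2n, \qquad \Gamma^{(n+1)}_{2n+1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad\text{and}\quad \Gamma^{(n+1)}_{2n+2} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},$$

where each entry of the last two matrices is a $2^n \times 2^n$ block.
This proves the theorem.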
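The iterative construction lends itself to a direct numerical check. The sketch below assumes the doubling step $\Gamma^{(n+1)}_i = \mathrm{diag}(\Gamma^{(n)}_i, -\Gamma^{(n)}_i)$ together with two additional off-diagonal block matrices built from the unit matrix, starting from the Pauli matrices $\tau_1$ and $\tau_2$; matrices are plain nested lists and all function names are illustrative.

```python
def identity(dim):
    """dim x dim unit matrix as nested lists."""
    return [[1 if r == c else 0 for c in range(dim)] for r in range(dim)]


def scale(m, factor):
    """Multiply every entry of m by factor."""
    return [[factor * x for x in row] for row in m]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def block(upper_left, upper_right, lower_left, lower_right):
    """Assemble a 2x2 block matrix from four equal-size square blocks."""
    top = [ul + ur for ul, ur in zip(upper_left, upper_right)]
    bottom = [ll + lr for ll, lr in zip(lower_left, lower_right)]
    return top + bottom


def gamma_matrices(n):
    """Return 2n matrices of size 2**n obtained by iterating the doubling step."""
    gammas = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]]]  # tau_1, tau_2 for n = 1
    for _ in range(n - 1):
        dim = len(gammas[0])
        one = identity(dim)
        zero = scale(one, 0)
        stepped = [block(g, zero, zero, scale(g, -1)) for g in gammas]
        stepped.append(block(zero, one, one, zero))
        stepped.append(block(zero, scale(one, -1j), scale(one, 1j), zero))
        gammas = stepped
    return gammas


def is_clifford(gammas, tol=1e-12):
    """Check {G_i, G_j} = 2 delta_ij times the unit matrix for all pairs."""
    dim = len(gammas[0])
    for i, gi in enumerate(gammas):
        for j, gj in enumerate(gammas):
            ab, ba = matmul(gi, gj), matmul(gj, gi)
            for r in range(dim):
                for c in range(dim):
                    expected = 2 if (i == j and r == c) else 0
                    if abs(ab[r][c] + ba[r][c] - expected) > tol:
                        return False
    return True
```

Calling `gamma_matrices(3)` under these assumptions yields six 8 x 8 matrices for SO(6), and `is_clifford` confirms the anticommutation relations at each stage of the iteration.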


d-odd (d = 2n - 1):
The dimension of the irreducible matrix representation of the $\Gamma_i$ is $2^{n-1} \times 2^{n-1}$. The
first 2n - 2 of the $\Gamma_i$ coincide with the $\Gamma^{(n-1)}_i$ of SO(2n - 2), and the last one is
their product,

$$\Gamma_{2n-1} = \Gamma^{(n-1)}_1\, \Gamma^{(n-1)}_2 \cdots \Gamma^{(n-1)}_{2n-2},$$

up to a phase factor chosen so that it is Hermitian and squares to unity.

We conclude by stating that the Dirac spinor representation will be of $2^{[d/2]}$
dimensions, where [d/2] is the integer part of d/2. The dimension of the spinor
representation can be reduced by a factor of 2 for d even if the spinors satisfy the Weyl
condition and, in some dimensions, the spinors can also be chosen to satisfy the Weyl
and/or the Majorana conditions.

21. Kaluza-Klein Theories
A. MAHESHWARI
Physics Department, Regional College of Education, Mysore 570006, India

1. Kaluza-Klein Theories

Einstein's idea, which gave rise to general relativity, was the proposal that
spacetime is nontrivially curved and that the curvature is responsible for
gravity. The Kaluza-Klein idea takes this one step further, proposing that there
are more dimensions than one time and three space dimensions, and that the curvature
of the higher-dimensional spacetime is perceived, in the low energy approximation, in
the effective ordinary four dimensions as a unified theory of gravity and gauge
fields. In 1921, Kaluza suggested that gravitation and electromagnetism
could be unified in a five-dimensional theory of gravity. In the classic
Kaluza-Klein theory, the fifth coordinate was made invisible through a process
of dimensional reduction. This idea has been revived several times, but recently it
has gained popularity because of the concept of spontaneous compactification of
extra dimensions. The starting point of all Kaluza-Klein theories is a theory of
gravitation coupled to some matter fields in d = 4 + K dimensions. The classical
field equations of the metric and the matter fields induce compactification of the
extra space dimensions such that a (4 + K)-dimensional background manifold,
$M_1 \times M_2$, emerges as a product of a compact space of K dimensions, $M_2$,
and an ordinary four-dimensional spacetime, $M_1$. This is called spontaneous
compactification of spacetime. The next step in the Kaluza-Klein programme is
to expand the metric and the matter fields around this particular solution using
'harmonic functions' for the invariance group G of the compact space of
K dimensions. In general, a finite number of massless modes and an infinite
number of massive modes are obtained using this programme.
The three sections on Kaluza-Klein theories have been divided as follows:
(1) In the rest of Section 1, the original five-dimensional Kaluza-Klein theory
and its treatment from a modern perspective will be presented in detail;
(2) in Section 2, models for spontaneous compactification in d > 5 and an
introduction to non-Abelian Kaluza-Klein theories will be covered;
(3) and in Section 3, the harmonic analysis on symmetric manifolds, the
Kaluza-Klein models, (i) d = 11 supergravity and (ii) Witten's proposal for
SU(3) x SU(2) x U(1) isometry groups, and the problems in the Kaluza-Klein
programme will be treated.

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 423-447.
© 1989 by Kluwer Academic Publishers.

1.1. Five-Dimensional Kaluza-Klein Theory (Old)

The original idea of Kaluza's theory is that spacetime is really five-dimensional.
The line element for this spacetime is postulated to be

$$ds^2 = \hat g_{MN}\, dz^M dz^N = g_{\mu\nu}(x)\, dx^\mu dx^\nu + \left(dy + \kappa A_\mu(x)\, dx^\mu\right)^2, \tag{1}$$

where $z^M = (x^\mu, y)$, $\mu$ = 0, 1, 2, 3, the signature of the metric has been chosen to be
-, +, +, +, +, and y labels the coordinate of the additional space dimension. In
this ansatz, it has been assumed that the 10 components of the symmetric tensor
$g_{\mu\nu}$ and the four components of the vector $A_\mu$ do not depend on the fifth
coordinate y. One obvious incompleteness of the original Kaluza-Klein theory is
that a full parametrization of the five-dimensional metric requires an additional scalar
field $\phi$, since a symmetric 5 x 5 tensor has 15 independent components while $g_{\mu\nu}$ and
$A_\mu$ supply only 14. The
inclusion of a scalar field $\phi$ in the five-dimensional metric and the retention of the
y-dependence will be covered in Section 1.2, where the Kaluza-Klein theory is
treated from a modern perspective. The fifth dimension is assumed to be
compact, so that the manifold is cylindrical at each spacetime point, with the fifth
coordinate running over a circle of length L. The metric $\hat g_{MN}$ can be written as a matrix,

$$\hat g_{MN} = \begin{pmatrix} g_{\mu\nu}(x) + \kappa^2 A_\mu(x) A_\nu(x) & \kappa A_\mu(x) \\ \kappa A_\nu(x) & 1 \end{pmatrix}. \tag{2}$$

$\kappa$ is a constant with units of (mass)$^{-1}$ or length, and has been introduced in order
that the combination $\kappa A_\mu$ is dimensionless. This is to facilitate the subsequent
interpretation of $A_\mu$ as the four-dimensional electromagnetic gauge field, which
has units of mass. The components $\hat g^{MN}$ of the inverse metric are easily computed
from (2):

$$\hat g^{MN} = \begin{pmatrix} g^{\mu\nu} & -\kappa A^\mu \\ -\kappa A^\nu & 1 + \kappa^2 A_\lambda A^\lambda \end{pmatrix}. \tag{3}$$

Here $A^\mu = g^{\mu\nu} A_\nu$. The Greek indices $\mu, \nu, \ldots$ will be raised and lowered using $g^{\mu\nu}$
and $g_{\mu\nu}$, respectively.
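That the block form of the inverse metric really inverts the Kaluza-Klein metric can be confirmed numerically. The sketch below assumes, purely for simplicity, a diagonal four-dimensional metric (so that its inverse is trivial) and arbitrary sample values for $A_\mu$ and $\kappa$; the function names and sample numbers are illustrative.

```python
def kk_metric(g_diag, A, kappa):
    """5x5 matrix [[g + kappa^2 A A, kappa A], [kappa A, 1]] for a diagonal 4-metric."""
    n = len(g_diag)
    rows = [[(g_diag[m] if m == v else 0.0) + kappa ** 2 * A[m] * A[v]
             for v in range(n)] + [kappa * A[m]] for m in range(n)]
    rows.append([kappa * A[v] for v in range(n)] + [1.0])
    return rows


def kk_inverse(g_diag, A, kappa):
    """5x5 matrix [[g^-1, -kappa A^mu], [-kappa A^nu, 1 + kappa^2 A.A]]."""
    n = len(g_diag)
    A_up = [A[m] / g_diag[m] for m in range(n)]       # index raised with g^{mu nu}
    A_sq = sum(A[m] * A_up[m] for m in range(n))      # A_lambda A^lambda
    rows = [[(1.0 / g_diag[m] if m == v else 0.0) for v in range(n)]
            + [-kappa * A_up[m]] for m in range(n)]
    rows.append([-kappa * A_up[v] for v in range(n)] + [1.0 + kappa ** 2 * A_sq])
    return rows


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def max_deviation_from_identity(m):
    """Largest absolute difference between m and the unit matrix."""
    n = len(m)
    return max(abs(m[r][c] - (1.0 if r == c else 0.0))
               for r in range(n) for c in range(n))
```

For any nonsingular diagonal four-metric, the product of `kk_metric` and `kk_inverse` reproduces the 5 x 5 unit matrix up to rounding, which is the content of the inverse-metric formula.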
The metric components in (2) and (3) have been taken relative to a particular
choice of the basis, namely the coordinate basis in which $\{dx^\mu, dy\}$ are the basis
one-forms. The duals are

$$\left\{\frac{\partial}{\partial x^\mu},\ \frac{\partial}{\partial y}\right\}$$

and form a basis for the tangent space. However, there is another choice of basis
which is convenient for calculations, as it diagonalizes the Kaluza-Klein metric.
This basis is called the horizontal lift basis (HLB) and is obtained by choosing the
one-forms

$$\theta^\mu = dx^\mu, \tag{4}$$

$$\theta^5 = dy + \kappa A_\mu\, dx^\mu. \tag{5}$$

The metric tensor in this basis is given by

$$\hat g_{MN} = \begin{pmatrix} g_{\mu\nu} & 0 \\ 0 & 1 \end{pmatrix} \tag{6}$$

and the inverse metric by

$$\hat g^{MN} = \begin{pmatrix} g^{\mu\nu} & 0 \\ 0 & 1 \end{pmatrix}. \tag{7}$$

The matrices (6) and (7) may be compared with the corresponding matrices given
in (2) and (3), respectively. The basis vectors $e_\mu$, $e_5$, which are dual to $\theta^\mu$ and $\theta^5$, are

$$e_\mu = \frac{\partial}{\partial x^\mu} - \kappa A_\mu(x)\, \frac{\partial}{\partial y}, \tag{8}$$

$$e_5 = \frac{\partial}{\partial y}. \tag{9}$$

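The HLB is anholonomic. This means that some of the commutators of the basis
vectors defined in (8) and (9) are nonzero. The commutators of the basis vectors
are

$$[e_\mu, e_\nu] = -\kappa F_{\mu\nu}\, e_5, \tag{10}$$

$$[e_\mu, e_5] = 0, \tag{11}$$

where

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{12}$$

The nonzero commutation coefficients $C_{MN}{}^P$ defined by the relation

$$[e_M, e_N] = C_{MN}{}^P\, e_P \tag{13}$$

are easily seen to be

$$C_{\mu\nu}{}^5 = -\kappa F_{\mu\nu}. \tag{14}$$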
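The only nonvanishing commutator can be worked out directly. The following step assumes the horizontal lift basis vectors $e_\mu = \partial_\mu - \kappa A_\mu\, \partial_y$ and $e_5 = \partial_y$, with $A_\mu$ independent of y as in the original ansatz.

```latex
\begin{align*}
[e_\mu, e_\nu]
 &= \left[\partial_\mu - \kappa A_\mu\,\partial_y ,\; \partial_\nu - \kappa A_\nu\,\partial_y\right] \\
 &= -\kappa\,(\partial_\mu A_\nu)\,\partial_y + \kappa\,(\partial_\nu A_\mu)\,\partial_y
    + \kappa^2\left(A_\mu\,\partial_y A_\nu - A_\nu\,\partial_y A_\mu\right)\partial_y \\
 &= -\kappa\left(\partial_\mu A_\nu - \partial_\nu A_\mu\right) e_5 .
\end{align*}
```

The term quadratic in $\kappa$ drops out because $\partial_y A_\mu = 0$, so the field strength of $A_\mu$ appears as the sole structure coefficient of the basis, and $[e_\mu, e_5]$ vanishes for the same reason.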

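In an anholonomic basis, the connection coefficients $\hat\Gamma_{MNP}$ are given by the
following expression:

$$\hat\Gamma_{MNP} = \tfrac{1}{2}\left[e_P(\hat g_{MN}) + e_N(\hat g_{MP}) - e_M(\hat g_{NP}) + C_{MNP} + C_{MPN} - C_{NPM}\right] \tag{15}$$

and

$$\hat\Gamma^P{}_{MN} = \hat g^{PQ}\, \hat\Gamma_{QMN}. \tag{16}$$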

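Using Equations (6), (7), (14) and (15), the nonzero affine connections are found to
be

$$\hat\Gamma_{\mu\nu\lambda} = \tfrac{1}{2}\left[\partial_\lambda g_{\mu\nu} + \partial_\nu g_{\mu\lambda} - \partial_\mu g_{\nu\lambda}\right] = \Gamma_{\mu\nu\lambda}, \tag{17}$$

$$\hat\Gamma_{\mu\nu 5} = -\frac{\kappa}{2}\, F_{\mu\nu}, \tag{18}$$

$$\hat\Gamma_{\mu 5 \nu} = -\frac{\kappa}{2}\, F_{\mu\nu} \tag{19}$$

and

$$\hat\Gamma_{5\mu\nu} = \frac{\kappa}{2}\, F_{\mu\nu}. \tag{20}$$

Here, $\Gamma_{\mu\nu\lambda}$ (without a caret) denotes the components of the connections formed
from the four-dimensional metric $g_{\mu\nu}$ in a coordinate basis. The curvature tensor
in an anholonomic basis is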

~R

~M

PQ f

+f

~M

RN

PR CII/Q

(21)

Since we are interested in the scalar curvature, it can be calculated in any
convenient basis. In the HLB, because of the simplicity of the form of the metric
tensor (6), the scalar curvature can be obtained from the following two terms of
the curvature tensor:
(22)

A straightforward substitution of the affine connections (17) to (20) in (21) gives
(23)

and
(24)

Therefore

$$\hat R = R + \frac{\kappa^2}{4}\, F^{\mu\nu} F_{\mu\nu}. \tag{25}$$

In the standard version of the Kaluza-Klein theory, the basic action is the
five-dimensional Einstein-Hilbert action

$$S = \frac{1}{16\pi G L} \int d^5z\, \sqrt{-\hat g}\, \hat R. \tag{26}$$

Here G is the Newtonian gravitational constant. On substituting the Kaluza-Klein
ansatz and performing the trivial y integration, we are left with the action
for the Einstein-Maxwell theory

$$S = \frac{1}{16\pi G} \int d^4x\, \sqrt{-g}\left(R + \frac{\kappa^2}{4}\, F_{\mu\nu} F^{\mu\nu}\right), \tag{27}$$

provided we identify

$$\kappa^2 = 16\pi G. \tag{28}$$

$\kappa$ is then the Planck length ($\sim 10^{-33}$ cm).


This example illustrates the simplest instance of the Kaluza-Klein 'miracle':
the appearance of the standard Maxwell Lagrangian $\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu}$. It is true that in
order to obtain the Einstein-Maxwell action from the action of five-dimensional
gravity, one had to be clever in one's choice of parametrization, but it is not true
that the gauge field was put in by hand. It is easy to see that the gauge
transformations of the vector potential $A_\mu$ come out as a special case of general
coordinate transformations in five dimensions.
Under the infinitesimal general coordinate transformations

$$z^M \to z'^M = z^M - \xi^M, \tag{29}$$

the five-dimensional metric $\hat g^{MN}$ transforms in the standard fashion as

$$\hat g'^{MN}(z) = \hat g^{MN}(z) - \xi^M{}_{,P}\, \hat g^{PN} - \xi^N{}_{,P}\, \hat g^{MP} + \hat g^{MN}{}_{,P}\, \xi^P. \tag{30}$$

In Equation (29) let us choose the following form for the transformation function
$\xi^M(z)$:

$$\xi^\mu = 0, \qquad \xi^5 = \epsilon(x), \tag{31}$$

and apply it in Equation (30) to find the transformation of $\hat g^{\mu 5}$. It is easily seen that
this gives

$$A'_\mu(x) = A_\mu(x) + \frac{1}{\kappa}\, \partial_\mu \epsilon(x), \tag{32}$$

which can be identified as the gauge transformation of the electromagnetic field.
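The step from the coordinate transformation to the gauge transformation can be spelled out. The following sketch rests on stated assumptions: the sign convention $z'^M = z^M - \xi^M$, the contravariant transformation law $\delta \hat g^{MN} = -\xi^M{}_{,P}\, \hat g^{PN} - \xi^N{}_{,P}\, \hat g^{MP} + \hat g^{MN}{}_{,P}\, \xi^P$, the inverse-metric entries $\hat g^{\mu\nu} = g^{\mu\nu}$ and $\hat g^{\mu 5} = -\kappa A^\mu$, and the label $\epsilon(x)$ for the fifth component of $\xi^M$, which is a name chosen here.

```latex
\begin{align*}
\xi^\mu &= 0, \qquad \xi^5 = \epsilon(x), \\
\delta \hat g^{\mu 5}
 &= -\xi^\mu{}_{,P}\,\hat g^{P5} - \xi^5{}_{,\nu}\,\hat g^{\mu\nu} + \hat g^{\mu 5}{}_{,5}\,\xi^5
  = -g^{\mu\nu}\,\partial_\nu \epsilon , \\
-\kappa\,\delta A^\mu &= -\partial^\mu \epsilon
 \quad\Longrightarrow\quad
 \delta A_\mu = \frac{1}{\kappa}\,\partial_\mu \epsilon .
\end{align*}
```

The first term vanishes because $\xi^\mu = 0$, and the last because neither $A_\mu$ nor $g_{\mu\nu}$ depends on y; since $\delta \hat g^{\mu\nu} = 0$ under this transformation, the index on $\delta A^\mu$ can be lowered with an unchanged $g_{\mu\nu}$. A shift of the fifth coordinate by an x-dependent amount is thus seen in four dimensions as an Abelian gauge transformation.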


1.2. Five-Dimensional Kaluza-Klein Theory in a Modern Perspective

It has already been mentioned that the original Kaluza-Klein ansatz does not
give all the zero modes. From a modern perspective, the five-dimensional
Kaluza-Klein theory is distinguished by the property that its ground state is not
$M^5$, the Poincaré manifold in five dimensions, but a compactified manifold
$M^4 \times S^1$ with the reduced symmetry $P^4 \times U(1)$. The ground state metric $\langle \hat g_{MN} \rangle$
is

$$\langle \hat g_{MN} \rangle = \begin{pmatrix} \eta_{\mu\nu} & 0 \\ 0 & 1 \end{pmatrix}, \tag{33}$$

i.e. the ground state is assumed to have been spontaneously compactified to
a direct product of four-dimensional Minkowski space and a circle of length L.


The symmetry breaking occurs because the ground state does not share all the
symmetries of the action. The length L of the circle in the fifth dimension cannot
be fixed in the five-dimensional theory. It can be connected to the electric charge
and the Planck length by the coupling of the U(1) gauge field. In order to fix the
size of the compactified manifold, there must be some additional matter fields in
the d-dimensional spacetime, which can give rise to spontaneous compactification.
To accommodate all the 15 independent components of a symmetric tensor in
five dimensions, we parametrize the five-dimensional metric in the following form:

$$\hat g_{MN} = \phi^{-1/3} \begin{pmatrix} g_{\mu\nu} + \phi A_\mu A_\nu & \phi A_\mu \\ \phi A_\nu & \phi \end{pmatrix}. \tag{34}$$

In order to keep the expressions simple, we choose units so that the Planck
length $\kappa$ = 1. All three fields, $g_{\mu\nu}$ (10 components), $A_\mu$ (4 components) and $\phi$ (1
component), are assumed to be functions of x and y. Gravity in five
dimensions is described by the action

~f d4xdyJ -

9MN R(9MN) ==

~flt'5 d4xdy.

(35)

We once again introduce an anholonomic horizontal lift basis (HLB),


Oil

= </J -1/6 dx ll ,

and
(36)

It is easily checked that this basis diagonalizes the metric tensor. In this basis, the matrix of the metric tensor once again has the same form as given in (6),

$$\hat g_{MN} = \begin{pmatrix} g_{\mu\nu} & 0 \\ 0 & 1 \end{pmatrix}. \qquad (37)$$
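As a short consistency check, the following computation sketches why the parametrization (34) becomes diagonal in the horizontal lift basis. It assumes that the second basis one-form is $\theta^5 = \phi^{1/3}(dy + A_\mu dx^\mu)$, a form adopted here because it is compatible with (34) and (37); the intermediate steps are an illustration and are not taken from the original text.

```latex
\begin{align*}
\hat g_{MN}\,dz^M dz^N
  &= \phi^{-1/3}\Big[(g_{\mu\nu} + \phi A_\mu A_\nu)\,dx^\mu dx^\nu
     + 2\phi A_\mu\,dx^\mu dy + \phi\,dy^2\Big] \\
  &= \phi^{-1/3}\Big[g_{\mu\nu}\,dx^\mu dx^\nu
     + \phi\,(dy + A_\mu dx^\mu)^2\Big] \\
  &= g_{\mu\nu}\,\big(\phi^{-1/6}dx^\mu\big)\big(\phi^{-1/6}dx^\nu\big)
     + \Big(\phi^{1/3}(dy + A_\mu dx^\mu)\Big)^2 \\
  &= g_{\mu\nu}\,\theta^\mu\theta^\nu + \theta^5\theta^5 .
\end{align*}
```

The cross terms $\phi A_\mu\,dx^\mu dy$ are absorbed into the complete square, which is why no off-diagonal block survives in this basis.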

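The basis vectors $e_\mu$ and $e_5$, which are dual to $\theta^\mu$ and $\theta^5$, are

$$e_\mu = \phi^{1/6}\left(\frac{\partial}{\partial x^\mu} - A_\mu\frac{\partial}{\partial y}\right), \qquad e_5 = \phi^{-1/3}\,\frac{\partial}{\partial y}. \qquad (38)$$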

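It is easy to check that

$$\langle\theta^\mu, e_\nu\rangle = \delta^\mu_\nu, \qquad \langle\theta^5, e_\mu\rangle = 0, \qquad\text{and}\qquad \langle\theta^5, e_5\rangle = 1. \qquad (39)$$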

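The commutator functions $C_{MN}{}^{P}$ can be found by performing a straightforward calculation. Their expressions are

$$C_{\mu\nu}{}^{5} = \phi^{2/3}\left\{-(\partial_\mu A_\nu - \partial_\nu A_\mu) + \left(A_\mu\frac{\partial A_\nu}{\partial y} - A_\nu\frac{\partial A_\mu}{\partial y}\right)\right\},$$

$$C_{\mu 5}{}^{\lambda} = -\frac{1}{6}\,\phi^{-4/3}\,\frac{\partial\phi}{\partial y}\,\delta^\lambda_\mu,$$

$$C_{\mu 5}{}^{5} = \frac{1}{3}\,\phi^{-5/6}\left(A_\mu\frac{\partial\phi}{\partial y} - \frac{\partial\phi}{\partial x^\mu}\right) + \phi^{1/6}\,\frac{\partial A_\mu}{\partial y},$$

$$C_{MN}{}^{P} = -C_{NM}{}^{P} \qquad\text{and}\qquad C_{MNP} = C_{MN}{}^{Q}\,\hat g_{QP}. \qquad (40)$$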

The five-dimensional scalar curvature in the HLB is given by Equation (22), but can be calculated only by performing an extremely tedious calculation. Care has to be taken to discard terms that are either a total four-divergence with respect to $x^\mu$ or a total derivative with respect to $y$, so that an expression for the Lagrangian density is obtained which consists of terms that are explicitly scalars, both with respect to the four-dimensional general coordinate transformations and the general coordinate transformations on the fifth coordinate. Note that the determinant of the five-dimensional metric in the parametrization given in Equation (34) is

$$\sqrt{-\hat g} = \phi^{-1/3}\,\sqrt{-g}. \qquad (41)$$
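The determinant relation (41) can be checked numerically. The sketch below builds the matrix of Equation (34) for a flat four-dimensional metric and compares both sides of (41); the function names `kk_metric` and `det`, and the sample values of $A_\mu$ and $\phi$, are illustrative choices and not part of the original text.

```python
import math

def kk_metric(g4, A, phi):
    """Return phi**(-1/3) * [[g + phi*A*A, phi*A], [phi*A, phi]] as a 5x5 list."""
    scale = phi ** (-1.0 / 3.0)
    g5 = [[0.0] * 5 for _ in range(5)]
    for m in range(4):
        for n in range(4):
            g5[m][n] = scale * (g4[m][n] + phi * A[m] * A[n])
        g5[m][4] = g5[4][m] = scale * phi * A[m]
    g5[4][4] = scale * phi
    return g5

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

# Sample data (assumed for illustration): flat metric, arbitrary A_mu and phi.
eta = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
A = [0.3, -0.2, 0.5, 0.1]
phi = 2.0

lhs = math.sqrt(-det(kk_metric(eta, A, phi)))       # sqrt(-det g5)
rhs = phi ** (-1.0 / 3.0) * math.sqrt(-det(eta))    # phi^(-1/3) sqrt(-det g4)
```

The two numbers agree for any choice of $A_\mu$, because the vector field drops out of the determinant of the block matrix.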

The detailed explicit expression of the action of the five-dimensional Kaluza-Klein theory is given in the Appendix, Section 1.3. We list below the contribution of the zero modes, which is obtained from those terms of the Kaluza-Klein Lagrangian which do not contain derivatives with respect to $y$,

$$S^{(0)} = \int d^4x\,\sqrt{-g}\left[R + \tfrac{1}{4}\,\phi\,F_{\mu\nu}F^{\mu\nu} + \tfrac{1}{6}\,\phi^{-2}\,\phi_{,\nu}\phi^{,\nu}\right]. \qquad (42)$$

Using the ground-state ansatz given in Equation (37), we can associate with $\phi$ a scalar field $\Phi$,

$$\phi = 1 + \Phi. \qquad (43)$$

Substituting (43) in (42), we identify the zero modes of the five-dimensional Kaluza-Klein theory as a massless spin-2 field, $g_{\mu\nu}$, a massless vector field, $A_\mu$, and a massless scalar field, $\Phi$. We have therefore discovered that the five-dimensional Kaluza-Klein theory contains, in addition to a graviton and a photon, a Brans-Dicke scalar field.
The Kaluza-Klein programme requires that after the spontaneous compactification of the higher dimensions, all fields, including those contained in the higher-dimensional metric, should be expanded in harmonic functions over the compactified space, and an effective four-dimensional theory be obtained by integrating the harmonic functions. The five-dimensional Kaluza-Klein theory is sufficiently simple that it admits a complete analysis of its properties, at least at the classical level. In the following, we carry out the harmonic analysis of the fields $g_{\mu\nu}(x, y)$, $A_\mu(x, y)$, $\phi(x, y)$ on the circle of the compactified dimension. We define the fields $h_{\mu\nu}(x, y)$ and $\Phi(x, y)$,

$$h_{\mu\nu}(x,y) = g_{\mu\nu}(x,y) - \langle g_{\mu\nu}\rangle, \qquad \Phi(x, y) = \phi(x,y) - \langle\phi\rangle, \qquad (44)$$

where the ground state manifold $M^4 \times S^1$ is determined by the vacuum expectation values

$$\langle g_{\mu\nu}\rangle = \eta_{\mu\nu}, \qquad \langle\phi\rangle = 1. \qquad (45)$$

In terms of the length $L$ of the circle $S^1$, the Fourier expansions can be written as

(46)
with the reality conditions
(47)

The fields $h^{(n)}_{\mu\nu}(x)$, $A^{(n)}_\mu(x)$ and $\Phi^{(n)}(x)$ are charged fields and couple to the U(1) gauge field $A^{(0)}_\mu(x)$, the electromagnetic field, with charge

$$e_n = \frac{2\pi\kappa n}{L}. \qquad (48)$$

The next step is to substitute the Fourier expansions of the tensor, vector and scalar fields in the Kaluza-Klein action and integrate over the fifth coordinate. The main result of this analysis is that there exists a unitary gauge in which, in each of the charge sectors, the vector and the scalar modes disappear by absorption into the tensor field, and the massive excitations of the five-dimensional Kaluza-Klein theory are pure massive charged spin-2 fields.
Since
(49)

in the bilinear approximation to the four-dimensional action different charge sectors are orthogonal to each other. Therefore, we write the expression for the four-dimensional Lagrangian $\mathcal{L}_4$ in terms of the fields $h^{(n)}_{\mu\nu}(x)$, $A^{(n)}_\mu(x)$ and $\Phi^{(n)}(x)$ of a nonzero charge sector, $n \neq 0$, and drop the superscript $(n)$ from the four-dimensional fields. By substituting the Fourier expansions (46) in the Kaluza-Klein Lagrangian, Equation (A1), one finds that the following terms contribute to the charge-$n$ sector:

~~) = (f>~
=

dy

Y)

[{l.h*"V
h ,;. - h*"V ,V).(1
h ,;.
2
,A.
O'V

+ [A*V,/l A

V,1l

+ .1.2 h*"V ,v h;'

).,(1

- A*/l,V A V,Il - ink(AV , V


h*;'
).

_1v

,\I

h;')
).

ink (h*;'t (A;.,t + At,;.) - h;'t (A;',t


* + At,;.
*) ) +
+T
. k

+ ~(<1>*
A;'.
2

,A

2k 2
- <1>1;. ) + _n_{(h*Vth
2
Vt -

,;.

h*~ h~) + (h*~ II> + h~ II>*)} -1<I>* <1>

(50)

where $k = 2\pi/L$.
In the bilinear approximation, the spacetime tensor indices are Minkowskian and the Lagrangian is a scalar under Poincaré transformations. A unitary gauge can be fixed by defining a tensor field $\psi_{\mu\nu}$, which absorbs the fields $A_\mu$ and $\Phi$ in $h_{\mu\nu}$, such that when the Lagrangian (50) is expressed in terms of $\psi_{\mu\nu}$, the fields $A_\mu$ and $\Phi$ disappear completely, like Goldstone fields, leaving behind the Pauli-Fierz Lagrangian of the charged massive spin-2 field $\psi_{\mu\nu}$. In the linear approximation, the ansatz for $\psi_{\mu\nu}$ is
I/I/lV

= h/l v

h~Yf/lv + Yf/lv( <1> - ~~Aa,") + :k(A

+ A v),

(51)

or
(52)
It is a satisfying calculation to verify that on substituting the expression for the field $h_{\mu\nu}$ given in Equation (52) into Equation (50), all the terms containing $A_\mu$ disappear, and all terms containing $\Phi$, including its kinetic and mass terms, disappear except for one term in which $\Phi$ plays the role of an auxiliary field. The new expression of the Lagrangian is
~~)

= 1[1/1*<1 v,;. 1/1,,/ - 2 1/I*<1V,v 1/1;./ +

+ .1.(.I,*"v
.1,"
+ .1,<1V
.1,*" )
3 'I'
,vo/ a,a
V'
,v'i' a.,a

_ 1.,1,*"

+ nZP(I/I*/lVI/I/lV - tl/l*"al/l~)] +
+ HI/I*<1V,v<1>,<1 + I/I<1V,v<1>*,<1].

3'1'

.I,P ...\ +

1l,).'Y

(53)

It is the Pauli-Fierz Lagrangian for a massive charged spin-2 field. If we add the
source term
((r - t(J~rl'lV)t/J:v

to the Lagrangian

+ ((J*I'V -

t(J*~IJI'V)t/JI'V

(53), the field equations for $\psi_{\mu\nu}$ are easily seen to be

(0 - rn2)t/JI'V

rn 2 81' ,I,
'Y JlV = 0,

= (J~v

3~2(IJI'VO - 81'8v)J~~,

$m^2 = 4\pi^2 n^2/L^2$,

and
(54)

We conclude by stating the results of the harmonic analysis of the five-dimensional Kaluza-Klein theory. It has been seen that the zero charge sector consists of a graviton, a photon and a Brans-Dicke scalar, and the nonzero charge sector consists of an infinite tower of charged massive spin-2 fields.
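The mass formula quoted with Equation (54), $m^2 = 4\pi^2 n^2/L^2$, fixes the spacing of the tower once the circumference $L$ is given. The following minimal sketch tabulates the first few levels; the function names and the use of units in which $L$ is a pure number are assumptions made for illustration.

```python
import math

def kk_mass(n, L):
    """Mass of the charge-n spin-2 excitation, m = 2*pi*|n|/L."""
    if L <= 0:
        raise ValueError("the circumference L must be positive")
    return 2.0 * math.pi * abs(n) / L

def kk_tower(L, n_max):
    """Return (n, mass) pairs for n = 0..n_max; n = 0 is the massless sector."""
    return [(n, kk_mass(n, L)) for n in range(n_max + 1)]

# Example: a circle of unit circumference.
levels = kk_tower(1.0, 3)
```

The levels are equally spaced in mass, and shrinking $L$ pushes every massive level up, which is why a very small circle leaves only the zero charge sector at low energies.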

1.3. Appendix: The Kaluza-Klein Lagrangian


2(5)

F{J[R + t4>FI'Jl'v +~4>-24>,v4>'v] +


+

cg [1AV gAr 8g8yAr + Aic;, 8g8yAr _ t AV;ag

V - Y

,v

V<1

gAr 8gAr _
8y

_ 4>-1 8A A4>,A _ t4>-2 84> A"4> - t4> F"A(A 8A A - A 8A,,)] +


8y

8y'"

" 8y

+ F{J [4>-1 {! 84> ~(AAA") +! 84> A" Ar 8g "r +


2 8y 8y
2 8y
8y

+ 1.

Ar}] +

"V Ar 8g"A 8g vr 1. "" Ar 8g v " 8g


4g 9 ay8y -4g 9 ay8y

8y

(A1)

2. Spontaneous Compactification and Isometry Groups

The starting point of all Kaluza-Klein theories is a theory of gravitation coupled to some matter fields in $d = 4 + K$ dimensions. The classical field equations of the metric and the matter fields induce compactification of the extra space dimensions, such that a $(4 + K)$-dimensional background manifold, $M_1 \times M_2$, emerges as a product of an ordinary four-dimensional spacetime, $M_1$, and a compact space of $K$ dimensions, $M_2$. The next step in the Kaluza-Klein programme is to expand the metric and matter fields around this particular solution using 'harmonic' functions for the invariance group $G$ of the compact space of $K$ dimensions. In general, a finite number of massless and an infinite number of massive modes are obtained using this programme.
We shall construct models for spontaneous compactification in dimensions $d \geq 6$ and show how the presence of the matter fields provides a length scale for the compactified dimensions. We start with gravity $g_{MN}$ ($M, N = 1, 2, \ldots, d$) plus matter fields (denoted collectively by $\Phi$) in dimensions $d > 4$, signature $(-+++\ldots+)$, described by the Einstein-Hilbert action

$$S = \frac{1}{16\pi G_d}\int d^dz\,\sqrt{-g_{MN}}\;R + \ldots \qquad (1)$$

where $\ldots$ denotes the coupling to matter fields. $R$ is the scalar curvature in $d$ dimensions, with dimension (length)$^{-2}$, and $G_d$, with dimension (length)$^{d-2}$, is the $d$-dimensional version of Newton's constant. We fix the sign convention for the curvature tensor and the scalar curvature as follows

and

(2)
We now look for stable 'ground-state' solutions of the field equations, $\langle g_{MN}\rangle$ and $\langle\Phi\rangle$, which exhibit spontaneous compactification. The metric $\langle g_{MN}\rangle$ will describe a background manifold $M_1 \times M_2$ provided it is block-diagonal,

$$\langle g_{MN}(x,y)\rangle = \begin{pmatrix} g^0_{\mu\nu}(x) & 0 \\ 0 & g^0_{mn}(y) \end{pmatrix}. \qquad (3)$$

$g^0_{\mu\nu}(x)$ is the metric of a four-dimensional spacetime manifold $M_1$ with the signature $(-+++)$ and coordinates $x^\mu$, and $g^0_{mn}(y)$ is the metric of a compact internal space, $M_2$, with Euclidean signature $(++\ldots+)$ and coordinates $y^m$. We place an additional restriction that the manifolds $M_1$ and $M_2$ be maximally symmetric. The requirement of maximal symmetry for $d = 4$ spacetime restricts us to Einstein spaces

$$R^0_{\mu\nu} = \gamma_4\,g^0_{\mu\nu}; \qquad (4)$$

if $\gamma_4 < 0$, spacetime is de Sitter and the isometry group of the manifold is SO(1, 4); if $\gamma_4 = 0$, the spacetime is Minkowskian and the symmetry group is Poincaré; and if $\gamma_4 > 0$, the spacetime is anti-de Sitter (ADS) and the symmetry group is SO(2, 3). Sensible field theories can be constructed on Minkowskian or ADS manifolds only. Therefore, we look for solutions with $\gamma_4 \geq 0$. As far as the extra dimensions are concerned, we want $M_2$ to be compact and to yield physically interesting isometry groups. If $M_2$ is an Einstein space,

$$R^0_{mn} = \gamma_K\,g^0_{mn}, \qquad (5)$$

Yano's theorem restricts $\gamma_K < 0$. It states that Einstein spaces of Euclidean signature with $\gamma_K < 0$ are always compact, and those with $\gamma_K > 0$ have no symmetries if compact. Therefore, we look for theories which would give background manifolds of the form $M_1 \times M_2$ such that

$$R^0_{\mu\nu} = \gamma_4\,g^0_{\mu\nu}, \quad \gamma_4 \geq 0; \qquad R^0_{mn} = \gamma_K\,g^0_{mn}, \quad \gamma_K < 0. \qquad (6)$$

These requirements place severe restrictions on the Kaluza-Klein theories. We shall present two models which satisfy the conditions (6) for spontaneous compactification.

2.1. Freund-Rubin Compactification

In the Freund-Rubin model, the matter field is introduced as a completely antisymmetric tensor $F_{M_1 M_2 \ldots M_S}$ of rank $S$. This field is the curl of a rank-$(S-1)$ antisymmetric tensor potential $A_{M_1 \ldots M_{S-1}}$,
(7)
This field is the natural generalization of the Maxwell field $F_{M_1 M_2}$.

The action for the coupling of the matter field $F_{M_1 M_2 \ldots M_S}$ to the $d$-dimensional gravity is taken to be of the form

$$\int d^dz\,\sqrt{-g}\left\{\frac{1}{C_1}\,R + \frac{1}{C_2}\,F_{M_1\ldots M_S}F^{M_1\ldots M_S}\right\}, \qquad (8)$$

where $C_1$ and $C_2$ are two dimensional constants and

$$g = \det g_{MN}.$$

The field equations for the metric $g_{MN}$ and the potential $A_{M_1\ldots M_{S-1}}$ are easily found to be

$$R_{MN} - \tfrac{1}{2}\,g_{MN}R = -\frac{C_1 S}{C_2}\left(F_{M M_1\ldots M_{S-1}}F_N{}^{M_1\ldots M_{S-1}} - \frac{1}{2S}\,g_{MN}\,F_{M_1\ldots M_S}F^{M_1\ldots M_S}\right), \qquad (9)$$

and
(10)
$C_1 S/C_2$ can be made equal to $8\pi G_N$. A trivial solution of Equations (9) and (10) is

$$R_{MN} = 0 \quad\text{and}\quad F_{M_1\ldots M_S} = 0. \qquad (11)$$

It is rejected because it is not of the form of Equation (6) and cannot give physically interesting isometry groups. A nontrivial solution is

$$F^{\mu_1\ldots\mu_S} = \frac{1}{\sqrt{S!\,|g_S|}}\,f\,\epsilon^{\mu_1\ldots\mu_S}, \qquad 0 \text{ otherwise}. \qquad (12)$$

$\epsilon^{\mu_1\ldots\mu_S}$ is the $S$-dimensional Levi-Civita tensor and $f$ is a constant. To evaluate the right-hand side of Equation (9), we use the identity

(13)
(jP(1 ... = (jP (j(1 ..
Il

}lV...

(14)

The following relations can be easily obtained.

$$F_{M_1\ldots M_S}F^{M_1\ldots M_S} = f^2\,\eta_S, \qquad (15)$$
(16)

$$\eta_S = \frac{g_S}{|g_S|}. \qquad (17)$$

The Einstein equation, Equation (9), breaks up into the following two equations

$$R^0_{\mu\nu} = \frac{\Lambda(d - S - 1)}{(d - 2)}\,g^0_{\mu\nu}, \qquad (18)$$

$$R^0_{mn} = -\frac{\Lambda(S - 1)}{(d - 2)}\,g^0_{mn}, \qquad (19)$$

$$\Lambda = 8\pi G_N f^2\,\mathrm{sign}(g_S). \qquad (20)$$

We note that the signs of the curvature scalars of the two product spaces are opposite (for $d > S + 1$) and are determined by the sign of $\Lambda$. Therefore, for $d > S + 1$, if the determinant $g_S$ is negative, i.e. the time is in the $S$-dimensional space, then the $(d-S)$-dimensional space compactifies. If $S = 4$, $d > 5$, the four-dimensional manifold can be chosen to be the maximally symmetric ADS spacetime. In $d = 11$ supergravity the supersymmetry forces the introduction of an antisymmetric tensor field with $S = 4$. The Freund-Rubin analysis goes through with minor changes. For $d = 11$ supergravity one can choose the four-dimensional Einstein space to be ADS and the seven-dimensional compactified manifold to be $S^7$ or the squashed $S^7$. These cases have been studied in detail by Duff and Pope.
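The statement that the two factor spaces have curvatures of opposite sign can be made concrete by evaluating the numerical factors multiplying $\Lambda$ in Equations (18) and (19). The sketch below assumes those factors are $(d-S-1)/(d-2)$ for the $S$-dimensional space and $-(S-1)/(d-2)$ for the remaining $(d-S)$-dimensional space; the function name is illustrative.

```python
from fractions import Fraction

def freund_rubin_coefficients(d, S):
    """Return the factors multiplying Lambda in the two Ricci tensors.

    First entry: S-dimensional space (where the field strength lives).
    Second entry: the remaining (d - S)-dimensional space.
    """
    if d <= 2 or not (1 <= S < d):
        raise ValueError("need d > 2 and 1 <= S < d")
    coeff_S = Fraction(d - S - 1, d - 2)
    coeff_rest = Fraction(-(S - 1), d - 2)
    return coeff_S, coeff_rest

# d = 11 with a rank-4 field strength: 4 + 7 split.
four_dim, seven_dim = freund_rubin_coefficients(11, 4)
```

For $d = 11$, $S = 4$ the factors are $2/3$ and $-1/3$, so the two Ricci tensors necessarily have opposite signs whenever $d > S + 1$ and $S > 1$.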

2.2. Monopole Compactification (Horvath, Palla, Cremmer and Scherk)

This model is based on Einstein-Maxwell theory in six dimensions with a cosmological constant. It is characterized by the action

$$\int d^6z\,\sqrt{-g}\left(\frac{1}{C_1}\,(R + \lambda) + \frac{1}{C_2}\,F_{MN}F^{MN}\right). \qquad (21)$$

(22)

In this expression, $C_1$, $C_2$ and $\lambda$ are dimensional constants with dimensions $L^4$, $L^2$ and $L^{-2}$, respectively. The field equations for $g_{MN}$ and $A_M$ are

$$R_{MN} - \tfrac{1}{2}\,g_{MN}R - \frac{\lambda}{2}\,g_{MN} = -8\pi G\,T_{MN}, \qquad (23)$$
(24)
(25)

$$\frac{C_1}{C_2} = 4\pi G, \qquad (26)$$

where $G$ is Newton's constant of gravitation.


The most symmetric solution to these field equations would, of course, be the six-dimensional Minkowski space with the vanishing Maxwell field. However, such a solution can hardly be relevant to the description of the four-dimensional world. We shall therefore restrict ourselves to the maximally symmetric solutions with the structure of $M_1 \times M_2$, with $M_1$ and $M_2$ being the maximally symmetric four- and two-dimensional spaces of constant curvature, respectively.
We make the Kaluza-Klein ansatz for the six-dimensional metric $g_{MN}$,

$$g_{MN}\,dz^M dz^N = g_{\mu\nu}(x)\,dx^\mu dx^\nu + g_{mn}(y)\,dy^m dy^n, \qquad (27)$$

$$A_M\,dz^M = A_m(y)\,dy^m, \qquad \mu,\nu = 0,1,2,3; \quad m,n = 5,6. \qquad (28)$$

$g_{\mu\nu}(x)$ is the metric of the maximally symmetric four-dimensional spacetime, which can be de Sitter, Minkowski or anti-de Sitter; and $g_{mn}(y)$ is the metric of the two-sphere. It is convenient to use spherical-polar coordinates on the two-sphere,

$$g_{mn}(y)\,dy^m dy^n = a^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right), \qquad (29)$$

where $a$ is the radius of the two-sphere and will be determined in terms of the dimensional constants which have been introduced as parameters in the action. It is easily seen that a solution to

is the rotation-invariant Maxwell field given by the potential,
(30)

$$A_\phi = \frac{n}{2e}\,(\cos\theta - 1), \qquad 0 \leq \theta < \pi, \quad 0 \leq \phi < 2\pi,$$

$$A_\phi = \frac{n}{2e}\,(\cos\theta + 1), \qquad 0 < \theta \leq \pi, \quad 0 \leq \phi < 2\pi. \qquad (31)$$

It is the monopole solution of Wu and Yang. The North and South pole solutions are connected by single-valued gauge transformations $e^{in\phi}$, provided $n$ is an integer. The Maxwell field is

$$F = dA = -\frac{n}{2e}\,\sin\theta\,d\theta \wedge d\phi = -\frac{n}{2ea^2}\;a\,d\theta \wedge a\sin\theta\,d\phi. \qquad (32)$$
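The patching of the two potentials in (31) can be illustrated numerically. The sketch below checks that the two expressions for $A_\phi$ differ by the constant $-n/e$ and that the transition function $e^{in\phi}$ returns to its starting value after one turn only for integer $n$. The function names, and the convention that a gauge transformation $e^{i\alpha}$ shifts $A$ by a multiple of $d\alpha/e$, are assumptions made for this illustration.

```python
import cmath
import math

def potential_north(theta, n, e):
    """A_phi on the patch containing theta = 0: (n / 2e) (cos theta - 1)."""
    return n / (2.0 * e) * (math.cos(theta) - 1.0)

def potential_south(theta, n, e):
    """A_phi on the patch containing theta = pi: (n / 2e) (cos theta + 1)."""
    return n / (2.0 * e) * (math.cos(theta) + 1.0)

def transition_function(phi, n):
    """Gauge transformation exp(i n phi) relating the two patches."""
    return cmath.exp(1j * n * phi)

def is_single_valued(n, tol=1e-9):
    """True if exp(i n phi) has the same value at phi = 0 and phi = 2 pi."""
    return abs(transition_function(2.0 * math.pi, n) - transition_function(0.0, n)) < tol

# The difference of the two potentials is independent of theta.
difference = potential_north(0.4, 2, 1.5) - potential_south(0.4, 2, 1.5)
```

Because the difference is a constant, it is a pure gauge term on the overlap of the two patches, and the integrality of $n$ is exactly the condition for the transition function to be single-valued.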

If we choose an orthonormal basis on the two-sphere,

$$dy^5 = a\,d\theta, \qquad dy^6 = a\sin\theta\,d\phi, \qquad (33)$$

the nonzero components of the field $F_{MN}$ are

$$F = F_{56}\,dy^5 \wedge dy^6, \qquad F_{56} = -\frac{n}{2ea^2}. \qquad (34)$$

The energy-momentum tensor $T_{MN}$ in this basis is


n2

T". = -

t9".F~6 = - -e28
a49".,

~m=O.

~~

We are looking for maximally symmetric solutions for the manifolds $M_1$ and $M_2$, therefore

(36)
(37)

where $\Lambda$ is a dimensionless parameter. The six-dimensional scalar curvature is easily seen to be

=-

2(~ + :2).

(38)

If $\Lambda > 0$, $M_1$ is de Sitter; if $\Lambda < 0$, $M_1$ is ADS; and if $\Lambda = 0$, $M_1$ is Minkowski. The Einstein field equations in this basis give two algebraic equations for $\Lambda$ and $a^2$:

A
2G

A. nGn 2

+ a2 - 2 e2a4 = 0,
A.

G- 2 +

(39)

nGn 2

(40)

e2a4 = O.

These equations can be solved for $1/a^2$ and $\Lambda$:

(41)
and

A= ~(A.G
3

~{1
+ )1- 3nGn A.}).
3nn
2e
2

(42)

If $\lambda < 0$, then for each value of the monopole charge $n$, there is one positive solution for $a^2$. This solution corresponds to $\Lambda < 0$ (anti-de Sitter world). On the other hand, if $\lambda > 0$, then there are two positive solutions for $a^2$, provided $n^2$ is not too large,

At the upper end of this range, however, one finds $\Lambda > 0$ (de Sitter world), which should probably be excluded. Of particular interest is the case of flat four-space, $\Lambda = 0$. This occurs for

2nGn 2
e

(43)

=--2-'

and
(44)

2.3. Isometry Group and the Yang-Mills Fields

We want to find the zero modes (gauge bosons) in a Kaluza-Klein theory which, due to suitable coupling to matter fields, has a spontaneously compactified background manifold $M_1 \times M_2$. The ground state metric therefore has a block-diagonal structure

$$\langle g_{MN}(x, y)\rangle = \begin{pmatrix} g^0_{\mu\nu}(x) & 0 \\ 0 & g^0_{mn}(y) \end{pmatrix}.$$

Let $G$ be the isometry group of the compactified manifold $M_2$. It is determined by finding the Killing vectors on $M_2$. Killing fields $K^{(\alpha)}_m(y)$ are determined by the Killing equation

$$K^{(\alpha)}_{m;n} + K^{(\alpha)}_{n;m} = 0. \qquad (45)$$

If we associate Killing vectors $K^{(\alpha)}$ to the Killing fields $K^{(\alpha)}_m$,

$$K^{(\alpha)} = K^{(\alpha)n}(y)\,\frac{\partial}{\partial y^n}, \qquad (46)$$

the isometry group of the compactified manifold $M_2$ can be determined from the Lie brackets
(47)
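As an illustration of how Lie brackets of Killing vectors determine the isometry group, the sketch below evaluates the brackets of three rotation vector fields on the two-sphere by finite differences. The explicit component expressions for `k1`, `k2`, `k3` in the coordinates $(\theta, \phi)$ are the standard rotation generators and are supplied here as an assumption; they are not written out in the text. The helper `lie_bracket` and the step size `h` are likewise illustrative.

```python
import math

# Components (theta-component, phi-component) of three vector fields on S^2.
def k1(theta, phi):
    return (-math.sin(phi), -math.cos(phi) / math.tan(theta))

def k2(theta, phi):
    return (math.cos(phi), -math.sin(phi) / math.tan(theta))

def k3(theta, phi):
    return (0.0, 1.0)

def lie_bracket(X, Y, theta, phi, h=1e-5):
    """Components of [X, Y]^i = X^j d_j Y^i - Y^j d_j X^i via central differences."""
    def partials(V):
        d_theta = [(a - b) / (2 * h)
                   for a, b in zip(V(theta + h, phi), V(theta - h, phi))]
        d_phi = [(a - b) / (2 * h)
                 for a, b in zip(V(theta, phi + h), V(theta, phi - h))]
        return d_theta, d_phi

    x, y = X(theta, phi), Y(theta, phi)
    dX, dY = partials(X), partials(Y)
    return tuple(
        x[0] * dY[0][i] + x[1] * dY[1][i] - y[0] * dX[0][i] - y[1] * dX[1][i]
        for i in range(2)
    )

# With these sign conventions the brackets close as [k1, k2] = -k3, and cyclic.
bracket_12 = lie_bracket(k1, k2, 1.0, 0.7)
```

The brackets close on the same three fields with structure constants proportional to $\epsilon_{abc}$, which identifies the isometry algebra of $S^2$ as that of SO(3); the overall sign depends on the orientation conventions chosen for the vector fields.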

The structure constants $f^{\alpha\beta\gamma}$ fix the isometry group $G$. The Kaluza-Klein ansatz, which gives the zero modes, has the form

$$g_{MN} = \begin{pmatrix} g_{\mu\nu}(x) + A^{(\alpha)}_\mu(x)\,K^{(\alpha)n}(y)\,A^{(\beta)}_\nu(x)\,K^{(\beta)m}(y)\,g^0_{mn}(y) & A^{(\alpha)}_\mu(x)\,K^{(\alpha)}_n(y) \\ A^{(\alpha)}_\nu(x)\,K^{(\alpha)}_m(y) & g^0_{mn}(y) \end{pmatrix}. \qquad (48)$$

Note that in this ansatz we have suppressed the massless scalar modes which, like the Brans-Dicke scalar of the five-dimensional theory, exist in the general case.

Now consider a general coordinate transformation

$$z^M \to z^M + \xi^M(z), \qquad (49)$$

under which

$$\delta g_{MN}(z) = \nabla_M\xi_N + \nabla_N\xi_M, \qquad (50)$$

and focus attention on the special form of the transformation

$$\xi^M(x, y) = \left(0,\ \epsilon^{\alpha}(x)\,K^{(\alpha)n}(y)\right), \qquad (51)$$

with $\epsilon^\alpha(x)$ as arbitrary infinitesimal functions of the four-dimensional spacetime coordinates $x^\mu$. On working out the transformations of $g_{\mu\nu}(x, y)$ and, hence, that of $A^{(\alpha)}_\mu(x)$, it can be shown that under the general coordinate transformations given in Equation (51),
(52)
This is precisely the transformation law for Yang-Mills fields with gauge group $G$. Hence, $G$ is obtained as a subgroup of the $d$-dimensional general coordinate transformations. We may conclude from this analysis that the key problem of the Kaluza-Klein programme is to start with suitable matter fields coupled to $d$-dimensional gravity, such that the background manifold is spontaneously compactified and the compact manifold possesses a physically interesting isometry group.

3. Harmonic Expansions, Chiral Fermions and All That

3.1. Coset Spaces G/H

Let us assume that on account of the presence of the matter fields, the ground state geometry factorizes into the product of four-dimensional spacetime (Minkowski or ADS) with a compact manifold $M_K$ of $K$ dimensions. Let the compact manifold be invariant under the action of some group $G$ which will be interpreted as an 'internal' symmetry. If $H$ is a subgroup of the compact continuous group $G$, then the space of the left cosets, $G/H$, is invariant under the action of $G$. We shall assume that the internal space $M_2$, obtained by solving the $(4 + K)$-dimensional equations, is such a quotient space and that its dimension is $K = \dim G - \dim H$.
Let some region of $M_K$ be parametrized by the coordinates $y^m$, $m = 1, \ldots, K$, and let there be given a $K$-bein, i.e. a set of linearly independent one-forms

$$e^\alpha(y) = dy^m\,e^\alpha_m(y), \qquad \alpha = 1, 2, \ldots, K.$$

If the $e^\alpha$ are orthonormal, then the metric tensor components can be expressed by

$$g_{mn}(y) = e^\alpha_m(y)\,e^\alpha_n(y).$$

If the compact space admits a group of motions, $G$, then $g_{mn}$ will satisfy Killing's equations. Equivalently, to each element $g \in G$ there corresponds a mapping $y \to y'$ such that $e^\alpha(y') = e^\beta(y)\,D_\beta{}^\alpha$, where $D$ is an orthogonal matrix, i.e. an element of the tangent space group, $O(K)$. Such transformations leave the metric invariant. In fact, the existence of a group of motions, $G$, of $M_K$ implies a mapping at each point $y$

$$g \to h(y, g) \to D(h(y, g)),$$

where $h$ is an element of the subgroup $H$, which can be embedded in $O(K)$.
The coset spaces also provide the most economical way, in terms of internal dimensions, of obtaining a compactified manifold of a desired symmetry. If $H = 1$, $M_K = G$.
Many group theoretical techniques are available to implement the parametrization of $G/H$. Suppose the coordinates $y^m$ label the cosets of $G$ with respect to the subgroup $H$. That is, from each coset let there be chosen a representative element $L_y$. To define a covariant basis consider the one-form

$$e(y) = L_y^{-1}\,dL_y.$$

This object belongs to the infinitesimal algebra of $G$ and, therefore, can be expressed as a linear combination of generators, $Q_{\hat\alpha}$,

$$e(y) = e^{\hat\alpha}(y)\,Q_{\hat\alpha} = dy^m\,e^{\hat\alpha}_m\,Q_{\hat\alpha}.$$

The generators $Q_{\hat\alpha}$ fall into two categories: the set which generates the subgroup $H$, and the remainder $Q_\alpha$, $\alpha = 1, \ldots, K$, associated with the cosets $G/H$. Correspondingly, one writes

The coefficients $e^\alpha_m(y)$, which constitute a nonsingular $K \times K$ matrix, provide the vielbeins on $M_K = G/H$ invariant under the group of motion, $G$. In terms of the adjoint representation of $G$ defined by

$$g^{-1}Q_{\hat\alpha}\,g = D_{\hat\alpha}{}^{\hat\beta}(g)\,Q_{\hat\beta},$$

the Killing fields can be expressed as

$$K_{\hat\alpha}{}^{m}(y) = D_{\hat\alpha}{}^{\beta}(L(y))\,e_\beta{}^{m}(y).$$

3.2. Harmonic Expansions

(1) $M_K = G$. Consider a scalar field $\Phi(x, y)$. Under coordinate transformations $y \to y'$, $\Phi(x, y) \to \Phi'(x, y') = \Phi(x, y)$. Consider now the ansatz

$$\Phi(x, y) = D(L^{-1}(y))\,\Phi(x).$$

Here $L(y)$ is a group-valued function which is generically of the form $\exp(i\omega^\alpha(y)Q_\alpha)$ and $D$ is some unitary irreducible representation of $G$. The fundamental property of $L(y)$ is that under left translations

$$g:\ L(y) \to L(y') = g\,L(y), \qquad g \in G.$$

Therefore under $y \to y'$, we have

$$D(L^{-1}(y))\,\Phi(x) \to D(L^{-1}(y'))\,\Phi'(x) = D(L^{-1}(y))\,D^{-1}(g)\,\Phi'(x),$$

so that for scalar fields we have

$$\Phi'(x) = D(g)\,\Phi(x).$$

We conclude from this that the four-dimensional fields carry a representation of $G$. The formal expression of the harmonic expansion over $G$ can be written in the form

n P.q

Here $d_n$ is the dimension of the unitary irreducible representation $D^{(n)}$; $p$ and $q$ specify labels within this representation. If $G = SU(2)$, the $D^{(n)}_{pq}$ are the Wigner rotation functions.

(2) Coset Spaces $G/H$. On coset spaces, we have to proceed in a slightly different way. The fundamental transformation property for $L(y)$ is now

$$g:\ L(y) \to L(y') = g\,L(y)\,h^{-1}(y, g).$$

Therefore, under the coordinate transformation $y \to y'$,

$$D(L^{-1}(y))\,\Phi(x) \to D(L^{-1}(y'))\,\Phi'(x) = D(h)\,D(L^{-1}(y))\,D^{-1}(g)\,\Phi'(x), \qquad h \in H.$$

We have a factor $D(h)$ on the right-hand side which cannot be compensated by requiring a suitable transformation behaviour for $\Phi(x)$. We therefore require that under a coordinate transformation, the world scalar $\Phi(x, y)$ transforms according to

$$\Phi(x, y) \to \Phi'(x, y') = \mathbb{D}(h)\,\Phi(x, y),$$

where $\mathbb{D}$ denotes some particular representation of $H$. Writing out some indices, this becomes

$$\Phi'_i(x, y') = \mathbb{D}_{ij}(h)\,\Phi_j(x, y).$$

Only if $\mathbb{D}$ happens to be the trivial representation do we have the old definition of a scalar field. The indices $i, j$ are, by definition, indices associated with $H$. Since $\Phi(x)$ carries a representation of $G$, the index $i$ must be carried by $D(L^{-1}(y))$ in the symbolic formula for the harmonic expansion. Therefore, on $G/H$, the correct formula of the harmonic expansion is

$$\Phi_i(x, y) = \sum_n \sum_{\zeta, p} \sqrt{\frac{d_n}{d_{\mathbb{D}}}}\;D^{(n)}_{i\zeta,\,p}(L^{-1}(y))\,\Phi^{(n)}_{p\zeta}(x).$$

Here $d_{\mathbb{D}}$ is the dimension of the representation $\mathbb{D}$ of $H$. The sum over $n$ is over all irreducible representations that contain $\mathbb{D}$ of $H$. If, in a given representation, $\mathbb{D}$ appears several times, then $\zeta$ labels these different occurrences.
As an example of the harmonic expansion, let us consider the vielbein itself and, in particular, the component $E_\mu^\alpha(x, y)$. Since these transform like the components of a $K$-vector under $SO(K)$, even the lowest term in the harmonic expansion must display some $y$-dependence. Recall from Section 1 that the line-element of the five-dimensional K-K theory was postulated to be

$$ds^2 = g_{MN}\,dz^M dz^N = g_{\mu\nu}(x)\,dx^\mu dx^\nu + \left(dy + A_\mu(x)\,dx^\mu\right)^2.$$

From this, we read off suitable vielbein components. In fact, let

$$E^A = E^A_M\,dz^M.$$

The vielbein components are determined by the requirement that the line element takes the form

$$ds^2 = \eta_{AB}\,E^A E^B, \qquad \text{with } \eta_{AB} = (\eta_{ab}, +1).$$

It gives

$$E^A_M = \begin{pmatrix} E_\mu^a & A_\mu \\ 0 & 1 \end{pmatrix}.$$

Note that the appearance of the complete square $(dy + A_\mu\,dx^\mu)^2$ in the line element is reflected in the appearance of 0 in the matrix for the vielbein. For the $d$-dimensional case, we make the ansatz

$$E^A_M(z) = \begin{pmatrix} E_\mu^a & E_\mu^\alpha \\ 0 & E_m^\alpha \end{pmatrix}.$$

If we write down a harmonic expansion for this vielbein, then the first term for $E_\mu^a$ will simply be $E_\mu^a(x)$, since these behave like components of a frame scalar under internal rotations. The first term in the harmonic expansion of $E_\mu^\alpha$ is

$$E_\mu^\alpha(x, y) = -D^\alpha{}_{\hat\beta}(L^{-1}(y))\,A_\mu^{\hat\beta}(x) = -A_\mu^{\hat\beta}(x)\,D_{\hat\beta}{}^\alpha(L(y)) = -A_\mu^{\hat\beta}(x)\,K_{\hat\beta}{}^\alpha(y).$$

The ansatz for the vielbein is completed by setting for $E_m^\alpha(x, y)$ simply the vielbein for $M_K = G/H$, that is $E_m^\alpha(x, y) = e_m^\alpha(y)$. Thus, we get

The metric equivalent of this Kaluza-Klein ansatz for the vielbein is

$$g^{(0)}_{MN}(x, y) = \begin{pmatrix} g^0_{\mu\nu}(x) + (A_\mu, A_\nu) & A^{\hat\alpha}_\mu(x)\,K_{\hat\alpha n}(y) \\ A^{\hat\alpha}_\nu(x)\,K_{\hat\alpha m}(y) & g^0_{mn}(y) \end{pmatrix}.$$

3.3. Witten's Model for SU(3) × SU(2) × U(1) Kaluza-Klein Theories

Known particle interactions can be described by the gauge group $SU(3) \times SU(2) \times U(1)$. So the symmetry group $G$ of the compact space $M_K$ must at least contain this as a subgroup,

$$SU(3) \times SU(2) \times U(1) \subset G.$$

So $M_K$ must at least have $SU(3) \times SU(2) \times U(1)$ as a symmetry group. To be as economical as possible, we may wish to choose $M_K$ to be a manifold of minimum dimension with an $SU(3) \times SU(2) \times U(1)$ symmetry. What is the minimum dimension of a manifold which can have an $SU(3) \times SU(2) \times U(1)$ symmetry?
$U(1)$ is the symmetry group of the circle $S^1$, which has dimension one. The lowest-dimensional space with $SU(2)$ symmetry is the ordinary two-dimensional sphere $S^2$. The space of lowest dimension with symmetry group $SU(3)$ is the complex projective space $CP^2$, which has four real dimensions. Therefore, the space $CP^2 \times S^2 \times S^1$ has $SU(3) \times SU(2) \times U(1)$ symmetry, and it has $4 + 2 + 1 = 7$ dimensions. The minimum dimensionality of a manifold with $SU(3) \times SU(2) \times U(1)$ symmetry is seven, although $CP^2 \times S^2 \times S^1$ is not the only manifold which has this symmetry. If, therefore, we wish to construct a theory in which $SU(3) \times SU(2) \times U(1)$ gauge fields arise as components of the gravitational field in more than four dimensions, we must have at least seven extra dimensions. With four 'noncompact' spacetime dimensions, the total dimensionality of our world must be at least $4 + 7 = 11$. This last number is most remarkable because eleven dimensions is probably the maximum for supergravity. A supergravity theory in $d > 11$ would have to contain massless particles of spin greater than two.
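The dimension count above follows from the coset rule $K = \dim G - \dim H$ of Section 3.1. The sketch below repeats the arithmetic; the identification of $CP^2$ with $SU(3)/(SU(2) \times U(1))$ and of $S^2$ with $SU(2)/U(1)$ is the standard one and is stated here as an assumption, since the text does not spell out the stability subgroups.

```python
# Dimensions of the groups involved (number of generators).
GROUP_DIM = {"SU(3)": 8, "SU(2)": 3, "U(1)": 1, "trivial": 0}

def coset_dimension(group, subgroup_factors):
    """Dimension of G/H, with H given as a list of its direct factors."""
    dim_h = sum(GROUP_DIM[name] for name in subgroup_factors)
    return GROUP_DIM[group] - dim_h

dim_cp2 = coset_dimension("SU(3)", ["SU(2)", "U(1)"])   # CP^2
dim_s2 = coset_dimension("SU(2)", ["U(1)"])             # S^2
dim_s1 = coset_dimension("U(1)", ["trivial"])           # S^1

internal_dimension = dim_cp2 + dim_s2 + dim_s1
total_dimension = 4 + internal_dimension
```

The result reproduces the seven internal dimensions and the total of eleven quoted in the text.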
If one is willing to suppose that the ground state of eleven-dimensional gravity with appropriate matter fields is a product of four-dimensional Minkowski (or ADS) spacetime and a compact manifold with $SU(3) \times SU(2) \times U(1)$ symmetry, the gauge fields will arise as components of the gravitational field. Of course, to describe nature, it is not sufficient to have the gauge group. It is also necessary to have quarks and leptons of zero mass which should be in the appropriate representations of the gauge group.
How can one obtain massless quarks and leptons in the Kaluza-Klein framework? To understand the basic idea, suppose that in a $(4 + K)$-dimensional theory we have a massless spin one-half fermion. It satisfies the $(4 + K)$-dimensional Dirac equation, $\not{D}\psi = 0$, or explicitly

$$\sum_{i=1}^{4+K} \gamma^i D_i\,\psi = 0.$$

This Dirac operator can be written in the form

$$\not{D}^{(4)}\psi + \not{D}^{(\mathrm{int})}\psi = 0.$$

This shows that the eigenvalues $\lambda$ of $\not{D}^{(\mathrm{int})}\psi = \lambda\psi$ will be observed by four-dimensional observers, who are unaware of the existence of the extra microscopic dimensions, as the mass $|\lambda|$ of the fermion. The operator acts on a compact space, so its spectrum is discrete. However, there is an identity

$$(\gamma^i D_i)^2\,\psi(x, y) = \left(-D_m D^m - \tfrac{1}{4}R\right)\psi(x, y) \qquad \text{(Lichnerowicz theorem)}.$$

On compact spaces $-D_m D^m$ and $-\tfrac{1}{4}R$ have the same sign and, moreover, $R$ is constant; therefore the right-hand side is never zero. Thus $(\gamma^i D_i)^2$ and, consequently, $\gamma^i D_i$ have no zero modes.
Of course, to reproduce what is observed in nature, we would need quite a few zero modes of the internal space Dirac operator. In nature, there are at least 45 fermion degrees of freedom of a given helicity, counting all colours and flavours of quarks and leptons. Also note that the fermions of a given helicity form a complex representation of $SU(3) \times SU(2) \times U(1)$. The fermions of one generation transform under $SU(3) \times SU(2) \times U(1)$ as $(3, 2)_{1/3} + (\bar{3}, 1)_{-4/3} + (\bar{3}, 1)_{2/3} + (1, 1)_{2} + (1, 2)_{-1}$, which is a complex representation.
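The count of 45 degrees of freedom can be reproduced from the multiplet content just listed. The sketch below sums the dimensions of the five multiplets of one generation and multiplies by three generations (the number of generations is an assumption added here; the text only quotes the total). It also evaluates the sum of the U(1) charges over one generation as a consistency check on the quoted assignments.

```python
from fractions import Fraction

# (SU(3) dimension, SU(2) dimension, U(1) charge) for one generation of
# fermions of a given helicity; antitriplets also have dimension 3.
GENERATION = [
    (3, 2, Fraction(1, 3)),
    (3, 1, Fraction(-4, 3)),
    (3, 1, Fraction(2, 3)),
    (1, 1, Fraction(2)),
    (1, 2, Fraction(-1)),
]

def states_per_generation(multiplets=GENERATION):
    """Number of fermion states of a given helicity in one generation."""
    return sum(colour * weak for colour, weak, _ in multiplets)

def charge_sum(multiplets=GENERATION):
    """Sum of the U(1) charges over all states of one generation."""
    return sum(colour * weak * charge for colour, weak, charge in multiplets)

N_GENERATIONS = 3  # assumed
total_states = N_GENERATIONS * states_per_generation()
```

One generation contributes 15 states, so three generations give the 45 quoted above, and the U(1) charges of a generation add up to zero.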
The left-handed fermions transform differently from the right-handed fermions. Can the Rarita-Schwinger fields help?
The Rarita-Schwinger equation in the gauge $\gamma^M\psi_M = 0$ is
or

$$\not{D}^{(4)}\psi_M + \not{D}^{(\mathrm{int})}\psi_M = 0 \qquad (M = \{\mu, m\}).$$

Again, zero modes of $\not{D}^{(\mathrm{int})}$ are observed as massless particles in four dimensions. The general zero mode is a sum of modes of two special kinds. For $M = \{\mu\}$, $\not{D}^{(\mathrm{int})}$ is the ordinary spin-$\tfrac{1}{2}$ Dirac operator, and the zero modes are spin-$\tfrac{3}{2}$ fermions in four dimensions. For $M = \{m\}$ the zero modes have spin $\tfrac{1}{2}$ as seen in four dimensions. So the zero mode solutions of

$$\not{D}^{(\mathrm{int})}\psi_m = 0$$

will be observed as massless spin-$\tfrac{1}{2}$ fermions.

3.4. Problems of Complex Representation of Symmetry Group

Odd dimensions. The crudest problem arises for odd $K$. For odd $K$, the group $O(K)$ has only one spinor representation. Likewise, the group $O(1, 3 + K)$ has only one spinor representation, which transforms under $O(1, 3) \times O(K)$ as the product of the four-component spinor of $O(1, 3)$ and that of $O(K)$. This being so, fermions that are left- or right-handed in four dimensions transform the same way under transformations of the internal space. They obey the same Dirac equations in the internal space, so they have the same quantum numbers and furnish a real representation of any relevant symmetry group.

Even dimensions. For even $n$, the operator $\bar\gamma = \gamma^1\gamma^2 \cdots \gamma^n$ anticommutes with all $\gamma^i$, so it is not a c-number in any representation of the Clifford algebra. Since $\bar\gamma^2 = \pm 1$ (depending on $n$), the representation space of the Clifford algebra decomposes into two eigenspaces of $\bar\gamma$, the eigenvalues being $\pm 1$ or $\pm i$. Since $\bar\gamma$ commutes with the $O(n)$ group generators $\tfrac{1}{4}[\gamma^i, \gamma^j]$, the group has two inequivalent spinor representations, labelled by the eigenvalue of $\bar\gamma$. In a world of $4 + K$ dimensions, we define

$$\bar\gamma = \gamma^1\gamma^2 \cdots \gamma^{4+K}, \qquad \gamma^{(4)} = \gamma^1\gamma^2 \cdots \gamma^4, \qquad \gamma^{(\mathrm{int})} = \gamma^5\gamma^6 \cdots \gamma^{4+K}.$$

$\bar\gamma$ labels the spinor representation of $O(1, 3 + K)$. $\gamma^{(4)}$ measures the helicity of four-dimensional fermions, and $\gamma^{(\mathrm{int})}$ labels the spinor representation of $O(K)$; it measures what might be called internal helicity.

$$\bar\gamma = \gamma^{(4)}\gamma^{(\mathrm{int})}.$$

This equation says that for a fixed $\bar\gamma$, the four-dimensional and internal chiralities are correlated. If we start with a fermion field with $\bar\gamma = 1$ in $4 + K$ dimensions, it breaks down under $O(1, 3) \times O(K)$ into components with
m(4)

= I) -1')1(int) =

1)')1(4)

I) -ly(int)

+ 1,

= - 1.

Case of $4n$ dimensions. $\bar\gamma^2 = -1$. The eigenvalues of $\bar\gamma$ are $\pm i$. Being complex conjugates, the eigenvalues of $\bar\gamma$ are related to each other by CPT, and CPT requires that there be an equal number of fields with $\bar\gamma = i$ and $\bar\gamma = -i$. Hence, there is no net correlation between four-dimensional chirality and internal chirality. Fields of $\bar\gamma = +i$ give one correlation. Those of $\bar\gamma = -i$ give the opposite correlation. ($\bar\gamma$ is odd under CPT and CPT acts on spinors just by complex conjugation.)
Case of $4n + 2$ dimensions. In this case $\bar\gamma^2 = 1$, so $\bar\gamma$ has eigenvalues $\pm 1$. CPT leaves $\bar\gamma$ unchanged, and we can consider a theory in which fermions are of the same helicity, say $\bar\gamma = 1$, only. This corresponds roughly to a theory with V-A gravitational interaction that forbids fermion bare masses. The question is whether V-A gravity can reduce to V-A interactions in four dimensions.
Fixation of phase factor $\eta$. We note that $\gamma^{(4)}$ (in Majorana basis) is a real matrix whose square is $-1$; so the eigenvalues of $\gamma^{(4)}$ are $\pm i$. In $4n + 2$ dimensions $\gamma^{(\mathrm{int})}$ likewise has square $-1$ and eigenvalues $\pm i$. A Fermi field that obeys

$$\bar\gamma = \gamma^{(4)}\gamma^{(\mathrm{int})} = +1$$

therefore has

$$\text{(A)}\quad \gamma^{(4)} = -\gamma^{(\mathrm{int})} = i \qquad\text{or}\qquad \text{(B)}\quad \gamma^{(4)} = -\gamma^{(\mathrm{int})} = -i.$$

A CPT transformation will complex conjugate the eigenvalues, so eigenvalues of


type (A) and (B) are exchanged by CPT. This is as it should be. A zero mode of the


internal Dirac or Rarita-Schwinger operator with $\gamma^{(int)} = -i$ corresponds to
left-handed fermions in four dimensions. The complex conjugate will have
$\gamma^{(int)} = +i$ and corresponds to a right-handed massless fermion in four dimensions.
Massless fermions in four dimensions will transform in a complex representation
of the symmetry group G if the zero modes of the internal Dirac operator with
$\gamma^{(int)} = -i$ form a complex representation of G.
Having understood that in 4n + 2 dimensions it is possible to have complex
representations of chiral fermions, the next questions are: why would
a wave operator have zero eigenvalues? And why would these zero modes
form complex representations?

References

Introduction to Kaluza-Klein Theory
Th. Kaluza, Sitzungsber. Preuss. Akad. Wiss. Berlin, Math. Phys. Kl. (1921), 966; O. Klein, Z. Phys. 37 (1926). English translations of these two papers by T. Muta are available in An Introduction to Kaluza-Klein Theories (H. C. Lee (ed.)), (World Scientific, Singapore, 1984).
P. Bergmann, Introduction to the Theory of Relativity (Dover, New York, 1976).
W. Pauli, Theory of Relativity (Pergamon, London, 1958).
A. Salam and J. Strathdee, Ann. Phys. (NY) 141 (1982), 316.
E. Witten, Nucl. Phys. B186 (1981), 412.
W. Mecklenburg, The Kaluza-Klein idea: status and prospects, ICTP report IC/83/32 (Trieste, 1983).
A. Maheshwari, A study of Higgs effect in 5-dimensional Kaluza-Klein theory, Pramana - J. Phys. 27 (1986), 383.
T. Appelquist and A. Chodos, Phys. Rev. D28 (1983), 772.

Spontaneous Compactification
J. Scherk and J. H. Schwarz, Phys. Lett. B57 (1975), 463; E. Cremmer and J. Scherk, Nucl. Phys. B103 (1976), 393; B108 (1976), 409.
Z. Horvath, L. Palla, E. Cremmer, and J. Scherk, Nucl. Phys. B127 (1977), 57.
J. F. Luciani, Nucl. Phys. B135 (1978), 111.
P. G. O. Freund and M. A. Rubin, Phys. Lett. B97 (1980), 233.
P. Candelas and S. Weinberg, Nucl. Phys. B237 (1983), 397.
S. Randjbar-Daemi, A. Salam, and J. Strathdee, Nucl. Phys. B214 (1983), 491.
M. J. Duff, B. E. W. Nilsson, and C. N. Pope, Kaluza-Klein supergravity, Phys. Rep. 130, 1 (1986).

Chiral Fermions in Kaluza-Klein Theories
E. Witten, Fermion quantum numbers in Kaluza-Klein theory, in Shelter Island II, eds. R. Jackiw, N. Khuri, S. Weinberg, and E. Witten (MIT Press, Cambridge, 1985).

22. Kaluza-Klein Cosmology


J. SAMUEL
Raman Research Institute, Bangalore 560080, India

1. Introduction

The movement started by Kaluza [1] and Klein [2] is a promising approach
towards a unified description of the forces of nature. This approach postulates
that space has more dimensions than have been hitherto explored. While there is
no experimental basis for this postulate, there are theoretical reasons for pursuing
it. It results in a unification of physical concepts which now appear to be
unrelated. For instance, in Kaluza-Klein (KK) theories, 'internal' quantum
numbers, like 'charge', appear on the same footing as spacetime-related
quantities, like energy and momentum. Charge is related to the momentum along
the extra dimensions. Similarly, gauge transformations appear as a particular
class of coordinate transformations in the higher-dimensional space. This puts
together two powerful symmetry principles of physics - gauge invariance and
general covariance. The discrete symmetries of quantum field theory - parity P,
charge conjugation C and time reversal T - do not appear in the same way in
four-dimensional theories: P and T are related to spacetime transformations, i.e.
reflections in space and time, but C is an internal operation. In KK theories, C is
a reflection in the extra dimensions and so is on an equal footing with P and T. In
the higher-dimensional space, there is but one force - gravity. The fact that we see
different forces in four dimensions is because we only explore the low energy
sector of the theory.
Particle physicists tend to ignore gravity in their thinking. There are excellent
reasons for doing so, since in practice, gravity is negligibly weak at the atomic and
nuclear scale, and at the theoretical level, general relativity is not renormalizable.
Indeed, even when particle physicists do study gravity, they tend to play down its
geometric character (so dear to relativists) and treat it as just one more field
theory (Feynman [3]). Weinberg [4], for instance, feels that "the geometric
approach has driven a wedge between general relativity and the theory of
elementary particles". However, the recent history of particle physics (in which
Weinberg himself has played an important role) seems to counter this view. For,
particle physicists have been led to use gauge theories in their description of all
the forces of Nature, and gauge invariance as a guiding principle in writing their
Lagrangians. At a classical level, gauge theories have strong geometric characters.
The Yang-Mills (YM) fields $A_m$ are 'connections' just like the affine connections
$\Gamma^l_{mn}$ of general relativity. Parallel transport is common to both theories. The field
strength (curvature) is related to the nonintegrability (path dependence) of
B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 449-465.
© 1989 by Kluwer Academic Publishers.


parallel transport. While there are differences (the Lagrangian for YM fields is
quadratic in the field strengths, unlike in gravity; there is no analogue of the
metric in YM theories), the similarities are more striking than the differences. So it
would seem that the geometric approach, far from driving a wedge between
elementary particle physics and gravitation, actually brings them closer together.
The idea of Kaluza and Klein is to understand both gravity and gauge fields in
a geometric way by resorting to higher dimensions. The theory was first proposed
as a unification of gravity and electromagnetism. This theory is briefly sketched
below.
2. Five-Dimensional Kaluza-Klein Theory

Let $z^M$, $M = 0, 1, 2, 3, 5$, be coordinates for five-dimensional spacetime. The
geometry of the spacetime is described by a metric tensor $\mathcal{G}_{MN}$ of Lorentzian
signature:

    ds^2 = \mathcal{G}_{MN}\, dz^M dz^N.    (1)

The first four coordinates $z^0, z^1, z^2, z^3$ describe ordinary spacetime, while $z^5$
describes a hypothetical fifth dimension, which is like a circle ($S^1$):

    z^M = (x^m, y),  m = 0, 1, 2, 3,    (2)

    0 \le y \le L;    (3)

$y = 0$ and $y = L$ are identified. We write the line element in the form

    ds^2 = g_{mn}\, dx^m dx^n - a^2 (dy + \kappa A_m dx^m)^2    (4)

and identify $g_{mn}$ as the metric of spacetime, describing gravity, and $A_m$ as the
potential describing electromagnetism. So five-dimensional gravity splits up into
four-dimensional gravity and electromagnetism. But not quite! There is also
a scalar field $a$ (the square root of $|\mathcal{G}_{55}|$) which is essentially the size ($aL$) of the fifth
dimension. All the $\mathcal{G}_{MN}(x)$ are supposed to be independent of $y$, so $\partial/\partial y$ is a Killing
vector of the five-dimensional space. The constant $\kappa$ introduced above is the
Planck length:

    \kappa^2 = \frac{16\pi\hbar G}{c^3},    (5)

which is necessary to make sure that the $A_m$ have dimensions of mass.


The original form for the line element (1) is invariant under general coordinate
transformations in five-dimensional space

    z^M \to z'^M(z).

Only a subset of these respect the $y$ independence of the metric coefficients. These
are
(1) $x^m \to x'^m(x)$: four-dimensional general coordinate transformations.
(2) $y \to y' = cy$: rescaling the fifth dimension. (This is not interesting.)
(3) $y \to y' = y + \alpha(x)$: local transformations in the origin of the $y$-coordinate.

Under the last transformation, the combination $dy + \kappa A_m dx^m$ in (4) is unchanged provided

    A_m \to A'_m = A_m - \frac{1}{\kappa}\,\partial_m \alpha.    (6)

So the transformations of class 3 can be interpreted as gauge transformations.
Under transformations of class 1, $A_m$ transforms like a covariant vector field.
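Let us now invert the metric tensor. Introduce 'tetrads' (actually pentads)

    e^0 = dx^0,  e^i = dx^i,  e^5 = dy + \kappa A_m dx^m.    (7)

Then,

    ds^2 = g_{\hat m \hat n}\, e^{\hat m} e^{\hat n} - a^2 e^5 e^5.    (8)

The inverse quintads are

    e_{\hat m} = \frac{\partial}{\partial x^m} - \kappa A_m \frac{\partial}{\partial y},  e_5 = \frac{\partial}{\partial y}.    (9)

These satisfy

    e^{\hat m}(e_{\hat n}) = \delta^{\hat m}_{\hat n}.    (10)

The inverse metric tensor is given by

    \mathcal{G}^{MN} \frac{\partial}{\partial z^M} \frac{\partial}{\partial z^N} = g^{\hat m \hat n} e_{\hat m} e_{\hat n} - a^{-2} e_5 e_5.    (11)

Or, more explicitly,

    \mathcal{G}_{MN} = \begin{pmatrix} g_{mn} - \kappa^2 a^2 A_m A_n & -\kappa a^2 A_m \\ -\kappa a^2 A_n & -a^2 \end{pmatrix}    (12)

inverts to

    \mathcal{G}^{MN} = \begin{pmatrix} g^{mn} & -\kappa A^m \\ -\kappa A^n & -a^{-2} + \kappa^2 A_m A^m \end{pmatrix},    (13)

where $g^{mn}$ is the matrix inverse of $g_{mn}$ and

    A^m = g^{mn} A_n.    (14)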
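The inversion in (12) and (13) can be checked mechanically. The following is a minimal sketch, not part of the original lectures: it assumes a flat four-dimensional metric $g_{mn}$ = diag(1, -1, -1, -1) and arbitrary rational sample values for $A_m$, $a$ and $\kappa$ (all illustrative choices), builds both block matrices, and multiplies them in exact arithmetic. The function names are hypothetical.

```python
from fractions import Fraction as F

def kk_metric(g, A, a, kappa):
    """5x5 metric with the block form of Eq. (12)."""
    n = len(g)
    G = [[g[m][k] - kappa**2 * a**2 * A[m] * A[k] for k in range(n)]
         + [-kappa * a**2 * A[m]] for m in range(n)]
    G.append([-kappa * a**2 * A[k] for k in range(n)] + [-a**2])
    return G

def kk_inverse_metric(g_inv, A, a, kappa):
    """5x5 inverse metric with the block form of Eq. (13)."""
    n = len(g_inv)
    # A^m = g^{mn} A_n, Eq. (14)
    A_up = [sum(g_inv[m][k] * A[k] for k in range(n)) for m in range(n)]
    A_sq = sum(A_up[m] * A[m] for m in range(n))
    Ginv = [[g_inv[m][k] for k in range(n)] + [-kappa * A_up[m]]
            for m in range(n)]
    Ginv.append([-kappa * A_up[k] for k in range(n)]
                + [kappa**2 * A_sq - 1 / a**2])
    return Ginv

def matmul(X, Y):
    """Plain matrix product of two square lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Illustrative sample values (assumptions, chosen only for the check).
eta = [[F(1 if i == j else 0) * (1 if i == 0 else -1) for j in range(4)]
       for i in range(4)]
A = [F(1, 2), F(-1, 3), F(2, 5), F(3, 7)]
a, kappa = F(3, 2), F(1, 4)

product = matmul(kk_metric(eta, A, a, kappa),
                 kk_inverse_metric(eta, A, a, kappa))
identity = [[F(1 if i == j else 0) for j in range(5)] for i in range(5)]
print(product == identity)
```

Because exact fractions are used, the comparison with the identity matrix is free of rounding error.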

Consider a particle moving geodesically in the five-dimensional space. Its
Hamiltonian is

    H = \tfrac{1}{2}\left(\mathcal{G}^{AB} p_A p_B - m^2\right),    (15)

where $p_A$ are the canonical momenta. Since $\partial/\partial y$ is a Killing vector, $p_5$ is
a constant of the motion. H assumes the form

    H = \tfrac{1}{2}\left[g^{mn}(p_m - \kappa p_5 A_m)(p_n - \kappa p_5 A_n) - a^{-2} p_5^2 - m^2\right],    (16)

which is the Hamiltonian

    H = \tfrac{1}{2}\left[g^{mn}(p_m - eA_m)(p_n - eA_n) - m'^2\right]    (17)

for a charged particle in four-dimensional spacetime if we set

    e = \kappa p_5,  m'^2 = m^2 + a^{-2} p_5^2.    (18)

In the quantum theory $p_5$ is quantized in units (we set $a = 1$ for the moment) of
$2\pi\hbar/L$:

    p_5 = \frac{2\pi\hbar}{L}\, n,  n an integer,    (19)

which shows that charge is quantized in units of (with $\hbar = 1$)

    e = \frac{2\pi\kappa}{L}.    (20)

Since the coupling constant $e = \sqrt{\alpha} \sim 10^{-1}$ is of order 1, the size of the extra
dimension must be of the order of the Planck length. (Exercise: Do the same
dimensional reduction for a scalar field.)
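To get a feeling for the numbers, the charge quantization condition can be turned around to estimate the circumference L. The sketch below is illustrative only: it assumes the reading $\kappa^2 = 16\pi\hbar G/c^3$ of the constant in (5), takes $e = \sqrt{\alpha}$ as in the text, and uses rounded SI values for the constants. The function names are hypothetical.

```python
import math

# Rounded SI values (assumed inputs for this estimate).
HBAR = 1.054571817e-34   # J s
G_N = 6.67430e-11        # m^3 kg^-1 s^-2
C = 2.99792458e8         # m s^-1
ALPHA = 7.2973525693e-3  # fine structure constant

def planck_length():
    """Conventional Planck length sqrt(hbar G / c^3)."""
    return math.sqrt(HBAR * G_N / C**3)

def kappa():
    """Length kappa, assuming kappa^2 = 16 pi hbar G / c^3."""
    return math.sqrt(16 * math.pi * HBAR * G_N / C**3)

def circumference(e):
    """Circumference L obtained by solving e = 2 pi kappa / L for L."""
    return 2 * math.pi * kappa() / e

L = circumference(math.sqrt(ALPHA))
ratio = L / planck_length()
print(f"L is about {L:.2e} m, or {ratio:.0f} Planck lengths")
```

Under these assumptions L comes out a few hundred Planck lengths, which is what "of the order of the Planck length" means here once factors of $2\pi$ and $\sqrt{16\pi}$ are kept.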
So the five-dimensional theory of pure gravity ($\mathcal{G}_{MN}$) does reproduce both
gravity ($g_{mn}$) and electromagnetism ($A_m$) for test particles. Coming now to the
field equations, the action is taken to be the Einstein-Hilbert action in five
dimensions,

    I = \frac{1}{16\pi G_5} \int d^5z\, \sqrt{|\mathcal{G}|}\; {}^5\!R,    (21)

where $G_5$ is Newton's constant and ${}^5\!R$ is the scalar curvature in five
dimensions. By computation (which is simplified by use of the quintads $e^{\hat m}$, $e^5$) it
follows that

    {}^5\!R = {}^4\!R - \tfrac{1}{4}\kappa^2 F_{mn}F^{mn}    (22)

and

    \sqrt{|\mathcal{G}|} = \sqrt{-g}.    (23)

So the reduced action in four dimensions is

    {}^4\!I = \frac{L}{16\pi G_5} \int d^4x\, \sqrt{-g}\left({}^4\!R - \tfrac{1}{4}\kappa^2 F_{mn}F^{mn}\right),    (24)

the usual one for gravity and electromagnetism.


In the above, gauge transformations act on the space $S^1$ describing the compact
$y$ dimension. In order to realize non-Abelian gauge groups, we must have more
extra dimensions. The isometry group of the compact space B is then the gauge
group. If $y^\mu$ are coordinates on B, the generators of the action of the group G are
Killing vector fields on B,

    X^\mu_\alpha(y),    (25)

where $\alpha$ labels the generators and $\mu$ is a contravariant index. The form of the line
element is now given by

    ds^2 = g_{mn}\, dx^m dx^n - \gamma_{\mu\nu}\left(dy^\mu + \kappa A^\alpha_m(x)\, dx^m X^\mu_\alpha(y)\right)\left(dy^\nu + \kappa A^\beta_n(x)\, dx^n X^\nu_\beta(y)\right).    (26)

$A^\alpha_m$ are the gauge fields. Essentially, all that was said above for the five-dimensional theory carries over to the general case.

3. Remarks

(1) One might ask what space B should be if the currently used gauge group is to
act on it. It turns out that at least seven dimensions are necessary. For, given
SU(3)c x SU(2) x U(l), the smallest spaces which have these isometry groups are
given in Table I.
Table I

Group    Space    Real dimension
U(1)     S^1      1
SU(2)    S^2      2
SU(3)    CP^2     4

This calls for at least 11 spacetime dimensions. Curiously, this is the largest
dimension [5] in which supergravity theories can be formulated. This has led to
an interest in using $S^7$ (on which the gauge group SU(3) x SU(2) x U(1) can be
realized) as the compact space.
(2) The assumption that $\partial/\partial y$ is a Killing vector seems rather arbitrary. But
notice that the smallness of the fifth dimension makes it energetically expensive to
have variations of the fields in y. The energies needed to excite higher modes are of
the order of Planck energy. So in a low energy analysis, one can suppose that all
fields are y independent.
(3) The modern attitude towards the smallness of the extra dimensions is that
it must be a consequence of the dynamics. It is unaesthetic to legislate its size and
leave it at that. The idea of 'spontaneous compactification' is that the dynamics
determines that some dimensions should become small. The ground state of
Einstein's theory in N dimensions should, according to naive reasoning, be $M^N$
(Minkowski space in N dimensions). For, the ground state usually has a large
symmetry and $\mathcal{P}_N$ (the Poincaré group in N dimensions) is the maximal
symmetry (N(N + 1)/2 generators) allowed. One assumes instead that the ground
state has lower symmetry than this, i.e. the ground state is $M^4 \times B$ whose
symmetry group is $\mathcal{P}_4 \times G$. This is similar in spirit to spontaneous symmetry
breaking (SSB) in particle physics.
(4) The mechanism for compactification seems to require additional fields (just
like Higgs fields are needed in SSB). This detracts from the simplicity of a purely
geometric theory.
(5) The Kaluza-Klein approach generates only gauge fields. Matter fields (in
particular fermions) do not arise naturally in the theory. These fields can be
naturally brought in if you invoke supersymmetry.
(6) The standard model uses chiral fermions. The left- and right-handed parts
of the fields couple differently in the Lagrangian. There does not seem to be
a natural way of achieving this in Kaluza-Klein theory. Topologically nontrivial
field configurations in the internal space may help.

4. Dimensional Reduction

This idea is really not very strange or esoteric, but is already present in many
down-to-earth physical situations. Consider a particle on the surface of a crystal.
It is free to move easily in two dimensions, but the motion normal to the surface is
constrained by a large work function. We can model the situation by a particle in
a box of sides (L, L, L'):

    0 \le x \le L,  0 \le y \le L,  0 \le z \le L'.    (27)

The Schrödinger equation admits solutions

    \psi = A \sin\frac{n_x \pi x}{L}\, \sin\frac{n_y \pi y}{L}\, \sin\frac{n_z \pi z}{L'}    (28)

with energy

    E = \frac{\pi^2\hbar^2}{2m}\left(\frac{n_x^2 + n_y^2}{L^2} + \frac{n_z^2}{L'^2}\right).    (29)

Now let L' become very small. Then, at low energies the higher $n_z$ modes are hard
to excite and the quantum number $n_z$ is forced to be 1. The expression for energy is
(apart from a constant $\pi^2\hbar^2/2mL'^2$) just the same as for a two-dimensional
system.
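The freezing of the $n_z$ quantum number is easy to see numerically. The following sketch is illustrative only: it works in units with $\hbar = m = 1$, picks the sample values L = 1 and L' = 0.01, and lists the lowest levels of the box spectrum. The function names are hypothetical.

```python
import math

def box_energy(nx, ny, nz, L, Lp, hbar=1.0, m=1.0):
    """Energy of a particle in an L x L x Lp box for quantum numbers (nx, ny, nz)."""
    prefactor = math.pi**2 * hbar**2 / (2.0 * m)
    return prefactor * ((nx**2 + ny**2) / L**2 + nz**2 / Lp**2)

def lowest_states(L, Lp, count, nmax=30):
    """Return the `count` lowest states as sorted (energy, nx, ny, nz) tuples."""
    states = [(box_energy(nx, ny, nz, L, Lp), nx, ny, nz)
              for nx in range(1, nmax + 1)
              for ny in range(1, nmax + 1)
              for nz in range(1, 4)]
    return sorted(states)[:count]

L, Lp = 1.0, 0.01  # sample sizes: the third side is 100 times smaller
levels = lowest_states(L, Lp, 20)

# Every low-lying state has nz = 1: the z motion is frozen out.
print(all(nz == 1 for _, _, _, nz in levels))

# Cost of exciting nz = 2, compared with the in-plane level spacing.
gap_z = box_energy(1, 1, 2, L, Lp) - box_energy(1, 1, 1, L, Lp)
gap_xy = box_energy(2, 1, 1, L, Lp) - box_energy(1, 1, 1, L, Lp)
print(gap_z / gap_xy)
```

The ratio of the two gaps is $(L/L')^2$, so shrinking L' pushes the first $n_z = 2$ state arbitrarily far above the two-dimensional spectrum.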
Exercise: Do the same analysis for a wave guide of dimensions L, L, L' with L' $\ll$ L. Show
that in the two-dimensional sense, the photon appears to have a mass.
A particle in a strong magnetic field. Suppose the field along the z direction is so
strong that the energy between Landau levels is $\sim mc^2$. Then the spectrum looks
as shown in Figure 1, where the excitations close together correspond to
motion in the z direction. At low energies ($\ll mc^2$), the particle is locked in the
n = 0 sector and behaves essentially like a one-dimensional particle, whose only
degree of freedom is motion along the z direction.

[Fig. 1. Landau levels n = 0, 1, 2.]

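Having considered the dimensional reduction of test particles in these examples,
we now look at the dimensional reduction of the Einstein-Hilbert action.
Starting with the five-dimensional expression

    I = \frac{1}{16\pi G_5} \int d^5z\, \sqrt{|\mathcal{G}|}\left({}^5\!R - 2\,{}^5\!\Lambda\right)    (30)

and using

    {}^5\!R = {}^4\!R - \tfrac{1}{4}\kappa^2 F_{mn}F^{mn},    (31)

    \sqrt{|\mathcal{G}|} = \sqrt{-g},    (32)

since

    \det e^{\hat M}{}_M = 1,    (33)

we arrive at

    {}^4\!I = \frac{L}{16\pi G_5} \int d^4x\, \sqrt{-g}\left({}^4\!R - \tfrac{1}{4}\kappa^2 F_{mn}F^{mn} - 2\,{}^5\!\Lambda\right).    (34)

So, in the four-dimensional world, we have

    G = \frac{G_5}{L},    (35)

G is Newton's constant in four dimensions,

    {}^4\!\Lambda = {}^5\!\Lambda,    (36)

${}^4\!\Lambda$ is the cosmological constant in four dimensions, and

    e = \frac{2\pi\kappa}{L},    (37)

e is the dimensionless coupling constant of electromagnetism.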


In a theory with D compact dimensions,

    {}^{4+D}\!R = {}^4\!R + {}^D\!R - \tfrac{1}{4} F_{mn}F^{mn}.    (38)

The cosmological constant in four dimensions is therefore

    {}^4\!\Lambda = {}^{4+D}\!\Lambda - \tfrac{1}{2}\,{}^D\!R.    (39)

5. Cosmology

5.1. Motivation

(1) While the Kaluza-Klein idea may contain a germ of truth, there is no clear
way to test its predictions. Compact dimensions with sizes of the order of the
Planck length are beyond the reach of terrestrial experiments. One possible test of
the theory is the early universe, the highest energy laboratory available. If extra
dimensions do exist, they would certainly have been important in the early
universe. One can study KK cosmology as a way to test the theory. The scale for
KK cosmology is between the Planck and the GUT eras, $10^{-43}$ to $10^{-35}$ s.
(2) Particle physicists would suppose that the present size of the extra
dimensions is preordained. However, it would be more appealing if the smallness
of the extra dimensions is a consequence of the dynamical evolution of the
universe. Just as we see the three spatial dimensions expanding, maybe the extra
dimensions contracted to their present size.
(3) The dimensionality of the world is surely one of the deepest questions in
physics. This number has always been regarded as fixed and eternal, just as,


before Einstein, the geometry of spacetime was regarded as fixed and eternal.
Einstein showed that the geometrical properties of spacetime depend on the
physical processes going on in it and, hence, are dynamical. A natural extension
of this idea is to suppose that the dimension of the world we live in is also affected
by the physical processes in it. As mentioned before, the dimension of the world is
not likely to differ from three plus one, except in the energetic conditions
prevalent in the early stages of the universe.
(4) Apart from the particle physicists' desire to test their theory, there is also
the hope that a higher-dimensional cosmology may help resolve some outstanding problems in our understanding of the universe. And there are several:
(a) Horizon problem,
(b) Flatness problem,
(c) Monopole problem,
(d) Entropy problem,
(e) Cosmological constant,
(f) Smoothness (size of fluctuations),
(g) Baryon number problem,
(h) Initial singularity.

This hope has not yet been realized, but it gives us a reason for going ahead.
Needless to say, all these considerations are very speculative. Not all those who
write papers on KK cosmology agree. A clear picture has yet to emerge and, at the
present time, we can at best talk about 'models' which capture one or another
feature that we are interested in. There is not enough known to build a 'scenario'
which can quantitatively come to grips with the real situation.

5.2. The Kaluza-Klein Cosmological Line Element

What kind of metric would we use to describe the early universe? Observations
are consistent with the assumption that the universe is homogeneous and
isotropic (maximally symmetric) in the three spatial dimensions. We will make
this simplifying assumption and take the spatial metric as
    R^2(t)\, \tilde g_{ij}\, dx^i dx^j,    (40)

where $\tilde g_{ij}$ is the metric for a maximally symmetric three-space,

    \tilde g_{ij}\, dx^i dx^j = \frac{dr^2}{1 - k_1 r^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\varphi^2\right).    (41)

We cannot suppose that KK cosmologies are isotropic, since it is clear that the
extra dimensions are much smaller than the spatial ones today. We will suppose
that the compact dimensions are, within themselves, homogeneous and isotropic.
This is done for reasons of simplicity. Even if the internal space was not
'maximally symmetric', there is no reason to attack this complicated situation


before we deal with the simplest possibility. We can also set cross-terms in the
metric to zero, since their presence would violate isotropy. Thus, the metric tensor
has the form (rows and columns labelled by $t$, $x^i$, $y^\mu$)

    \mathcal{G}_{MN} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -R^2(t)\,\tilde g_{ij} & 0 \\ 0 & 0 & -a^2(t)\,\tilde\gamma_{\mu\nu} \end{pmatrix}    (42)

and the general line element is

    ds^2 = dt^2 - R^2(t)\,\tilde g_{ij}\, dx^i dx^j - a^2(t)\,\tilde\gamma_{\mu\nu}\, dy^\mu dy^\nu,    (43)

where both $\tilde g$ and $\tilde\gamma$ describe symmetric spaces. R and a are the scale factors for the
spatial and compact dimensions. It is their evolution we wish to determine. Thus
KK cosmology is nothing more than anisotropic, homogeneous higher-dimensional
cosmology. The field equations that will determine R and a as functions of time are
the higher-dimensional Einstein's equations
(44)

that follow from the 4 + D-dimensional Einstein action. We first look at some
simple solutions of these equations.

5.3. Kasner Solutions

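These are vacuum solutions (we set $T_{MN} = 0$ for simplicity) of Einstein's
equations that describe homogeneous, anisotropic universes. These are important even in ordinary cosmology, so this class of solutions is worth studying even
in the three-dimensional context.
Let us start with

    ds^2 = dt^2 - \gamma_{MN}\, dz^M dz^N,    (45)

where M, N take values 1, 2, 3, 5, ..., 5 + D - 1. This is a synchronous reference
system. We separate the time and space derivatives in the expression for the Ricci
tensor. We define

    \chi_{MN} = \frac{\partial \gamma_{MN}}{\partial t}.    (46)

Then (using $\gamma$ to raise and lower indices)

    \chi^M{}_M = \gamma^{MN}\,\partial_t \gamma_{MN} = \mathrm{tr}(\gamma^{-1}\dot\gamma) = \mathrm{tr}(\ln\gamma)^{\cdot} = \frac{\partial}{\partial t} \ln\det\gamma.    (47)

The usual formula for the $\Gamma$'s shows

    \Gamma^0_{00} = \Gamma^M_{00} = \Gamma^0_{0N} = 0,
    \Gamma^0_{MN} = \tfrac{1}{2}\chi_{MN},  \Gamma^M_{0N} = \tfrac{1}{2}\chi^M{}_N,  \Gamma^M_{NP} = \lambda^M_{NP}    (48)

($\lambda^M_{NP}$ are the usual 3 + D-dimensional $\Gamma$'s made from $\gamma_{MN}$). The Ricci tensor is
given by

    R_{00} = -\tfrac{1}{2}\,\partial_t \chi^M{}_M - \tfrac{1}{4}\,\chi^N{}_M \chi^M{}_N,
    R_{0M} = \tfrac{1}{2}\left(\chi^N{}_{M;N} - \chi^N{}_{N;M}\right),    (49)
    R_{MN} = \tfrac{1}{2}\,\partial_t \chi_{MN} + \tfrac{1}{4}\left(\chi_{MN}\chi^P{}_P - 2\chi_M{}^P \chi_{NP}\right) + P_{MN},

where $P_{MN}$ is the 3 + D dimensional Ricci tensor. These will be used later to write
the general equations. For the present application, consider the simplified form

    \gamma_{MN} = \gamma_{MN}(t).    (50)

These are Bianchi type I.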


Digression on Bianchi types: Bianchi classified homogeneous spaces in three
dimensions. A space is homogeneous if its isometry group acts transitively (the orbit
of the group is the whole space). Bianchi assumed that the action was simply
transitive, i.e. there are no fixed points.
The infinitesimal generators of the group action $X_a$ are Killing vectors of the
manifold and are three in number. They must be closed under commutation:

    [X_a, X_b] = C^c{}_{ab} X_c.    (51)

Bianchi's idea was to classify the spaces according to the structure constants of
the isometry group. Even if the structure constants look different to start with, the
$X_a$'s might be linearly transformed (with real matrices) so that the C's are made
the same. Bianchi recognized nine distinct cases; these are the Bianchi types.
Bianchi type I has all C's zero. The $X_a$ in our case are clearly $\partial/\partial z^M$.
Since $\gamma_{MN}$ has no spatial dependence,

    R_{0M} = 0,  P_{MN} = 0.    (52)

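The field equations are

    \partial_t \chi^M{}_M + \tfrac{1}{2}\,\chi^N{}_M \chi^M{}_N = 0,    (53)

    \frac{1}{\sqrt{\gamma}}\,\partial_t\left(\sqrt{\gamma}\,\chi^M{}_N\right) = 0,    (54)

where $\gamma = \det\gamma_{MN}$. From (54), we get

    \sqrt{\gamma}\,\chi^M{}_N = 2\lambda^M{}_N,    (55)

    \chi^M{}_M = \frac{\dot\gamma}{\gamma} = \frac{2}{\sqrt{\gamma}}\,\lambda^M{}_M,    (56)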


where $\lambda^M{}_N$ are constants. Setting $\lambda^M{}_M = 1$ by a change of scale, we have

    \gamma = t^2.    (57)

Putting Equation (55) back in (53) yields

    \lambda^M{}_N \lambda^N{}_M = 1.    (58)

Next use (55, 57) and arrive at

    \dot\gamma_{MN} = \frac{2}{t}\,\lambda^P{}_M \gamma_{PN}.    (59)

By means of a suitable linear transformation, we can arrange for $\lambda^M{}_N$ to be
diagonal with principal directions $V^{(i)}_N$ and eigenvalues $p_i$:

    \lambda^M{}_N V^{(i)}_M = p_i V^{(i)}_N.    (60)

The $V^{(i)}$ are space-independent unit vectors. Thus, the solution is

    \gamma_{MN} = \sum_i t^{2p_i}\, V^{(i)}_M V^{(i)}_N.    (61)

Choosing new coordinates along the V's gives

    ds^2 = dt^2 - t^{2p_1}(dz^1)^2 - t^{2p_2}(dz^2)^2 - \cdots - t^{2p_{3+D}}(dz^{3+D})^2.    (62)

The p's are constrained by (58, 57):

    \sum_i p_i = \sum_i p_i^2 = 1.    (63)

Some p's clearly must be negative (if all $p_i > 0$, then $p_i < 1 \Rightarrow p_i^2 < p_i \Rightarrow 1 = \sum_i p_i^2 < \sum_i p_i = 1$,
a contradiction). So, Kasner solutions describe universes which are
expanding in some directions and contracting in others. This is a poor description
of our universe, since we see red shifts in all directions. But for KK cosmology, the
Kasner solutions are just what we are looking for.
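5.4. 5-d Solution [10]

Consistent with the isotropy of space, set

    p_1 = p_2 = p_3 = \tfrac{1}{2},  p_5 = -\tfrac{1}{2}.

Then

    \sum_i p_i = \sum_i p_i^2 = 1,

so

    ds^2 = dt^2 - t\left[(dz^1)^2 + (dz^2)^2 + (dz^3)^2\right] - t^{-1}(dz^5)^2    (64)

does solve Einstein's vacuum equations. This describes the fifth dimension
contracting and the others isotropically expanding.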

Recall that the coupling constant of electromagnetism depends on the size of
the fifth dimension as

    e = \frac{2\pi\kappa}{L}.    (65)

Since $L \propto t^{-1/2}$ and

    e^2 \sim \frac{\kappa^2}{L^2},  or  \frac{e^2}{4\pi\hbar c}\,\frac{c^3}{16\pi G\hbar} \propto t,    (66)

this model predicts a cosmological variation of the coupling constants of nature.


We have here a surprising and unexpected connection with Dirac's large
number hypothesis. Dirac [6] believed that large numbers in physics are
unnatural. He noticed that the ratio of the electromagnetic force to the
gravitational force is $\sim 10^{40}$ and that the age of the universe in atomic units is
also of this order. It seemed to him remarkable that two quite different numbers
should agree so closely. He went on to suggest that maybe this is true for all times,
leading to a relation of the fine structure constant with the age of the universe.
(Dirac also argues from his hypothesis that $\Lambda = 0$ and that the universe has $k = 0$, $\Omega = 1$.)
However, the model above is too simple to be taken seriously. It has been
pointed out that the anisotropic expansion would be isotropized by particle
creation effects. This was mentioned in the chapter by Panchapakesan (Chapter
17).
5.5. More Realistic Models

We now go on to more realistic models, which have D extra dimensions and also
allow for the presence of matter. For definiteness, we will suppose that space has
topology $S^3$ and the internal space $S^D$, so the gauge group is O(D + 1). Matter is
usually described by quantum fields. We will therefore model the matter as
a quantum mechanical ideal gas of particles. In fact, at the high temperatures
prevalent in the early universe, we can neglect the rest masses of the particles and
suppose that we are dealing with a gas of massless particles, a photon gas in
3 + D dimensions. The form of the energy momentum tensor for such a gas is

    T_{MN} = \begin{pmatrix} \rho & 0 & 0 \\ 0 & p_1 R^2\,\tilde g_{ij} & 0 \\ 0 & 0 & p_2 a^2\,\tilde\gamma_{\mu\nu} \end{pmatrix},    (67)

where $\rho$ is the energy density and $p_1$ and $p_2$ the pressures in the real and internal
spaces. Equation (67) describes a fluid which is spatially isotropic in its rest frame.
But we allow for an anisotropic pressure in the extra dimensions. Note that the
energy momentum tensor shares the symmetries of the metric tensor.
Starting from Einstein's equations and remembering that $\rho$, $p_1$ and $p_2$ depend
only on time and not on the spatial coordinates, we arrive at the differential
equations that describe the dynamical evolution of the two scale factors R(t) and
a(t) of the universe. These are

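    \frac{3\ddot R}{R} + \frac{D\ddot a}{a} = -8\pi G_{3+D}\,\rho,

    \frac{\ddot R}{R} + 2\left(\frac{\dot R^2}{R^2} + \frac{k_1}{R^2}\right) + \frac{D\dot R\dot a}{Ra} = 8\pi G_{3+D}\, p_1,    (68)

    \frac{\ddot a}{a} + (D-1)\left(\frac{\dot a^2}{a^2} + \frac{k_2}{a^2}\right) + \frac{3\dot R\dot a}{Ra} = 8\pi G_{3+D}\, p_2.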

These are the basic equations of Kaluza-Klein cosmology.


Notice at once that these equations admit the vacuum ($p_1 = p_2 = \rho = 0$)
Kasner solutions, as they must. These are

    R \propto t^{\alpha},  a \propto t^{\beta},    (69)

    3\alpha + D\beta = 1,  3\alpha^2 + D\beta^2 = 1,  \alpha > 0,  \beta < 0.    (70)

These were considered before in the case of D = 1.
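The two algebraic conditions on the Kasner exponents can be solved in closed form for any D. The sketch below is an illustration, not part of the original lectures: eliminating $\beta$ from the linear condition gives a quadratic in $\alpha$, and the root with $\alpha > 0$, $\beta < 0$ is selected. The function name is hypothetical.

```python
import math

def kasner_exponents(D):
    """Solve 3*alpha + D*beta = 1 and 3*alpha**2 + D*beta**2 = 1.

    Substituting beta = (1 - 3*alpha)/D gives
    (3D + 9) alpha^2 - 6 alpha + (1 - D) = 0; the larger root has
    alpha > 0 and beta < 0 (space expands, internal space contracts).
    """
    alpha = (3.0 + math.sqrt(3.0 * D * (D + 2))) / (3.0 * (D + 3))
    beta = (1.0 - 3.0 * alpha) / D
    return alpha, beta

for D in range(1, 8):
    alpha, beta = kasner_exponents(D)
    print(D, round(alpha, 4), round(beta, 4))
```

For D = 1 this gives $\alpha = 1/2$, $\beta = -1/2$, the five-dimensional solution of Section 5.4.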


For massless particles, the energy momentum tensor is traceless (M, N =
0, 1, 2, 3, 5, ..., 5 + D - 1),

    \mathcal{G}^{MN} T_{MN} = 0,    (71)

which implies

    \rho = 3p_1 + Dp_2.    (72)

However, this does not determine the equation of state (unlike in the 3 + 1
dimensional case), because we do not know the relation between $p_1$ and $p_2$. Most
commonly, people suppose that Pascal's law holds, i.e.

    p_1 = p_2 = p.    (73)

This is a good approximation when the extra dimensions are not too small. In the
opposite regime when the extra dimensions are much smaller than the spatial
ones, we can suppose that

    p_2 = 0,    (74)

since the energy needed for matter to propagate along these dimensions is large.
We make the first choice. So our energy momentum tensor has the form

    T_{MN} = (\rho + p)\, U_M U_N - p\, \mathcal{G}_{MN},    (75)

where $U_M$ is the (D + 4)-velocity, having only a time component.


The conservation of energy

    T^{MN}{}_{;N} = 0    (76)

is really contained in Einstein's equations. But we can nevertheless invoke it
separately to easily deduce the behaviour of $\rho$ as the universe expands. Equation
(76) along with the geodesic equation (matter follows geodesics since the pressure
gradients vanish due to the homogeneity of the universe)

    U^M{}_{;N} U^N = 0    (77)

implies that

    (\rho + p)_{;N} U^N U^M + (\rho + p)\, U^N{}_{;N} U^M - p_{;N}\, \mathcal{G}^{MN} = 0.    (78)

For $U^N{}_{;N}$ we use

    U^N{}_{;N} = \frac{1}{\sqrt{\mathcal{G}}}\left(\sqrt{\mathcal{G}}\, U^N\right)_{,N} = \frac{(R^3 a^D)^{\cdot}}{R^3 a^D} = \frac{3\dot R}{R} + \frac{D\dot a}{a}.    (79)

This leads to

    \dot\rho + \left(\frac{3\dot R}{R} + \frac{D\dot a}{a}\right)(\rho + p) = 0.    (80)

From (72, 73),

    \rho = (3 + D)\, p,    (81)

so that (80) integrates to

    \rho \propto (V_1 V_2)^{-\nu} = V^{-\nu},    (82)

where $V_1 = R^3$ and $V_2 = a^D$ are volumes of the two spaces, $V = V_1 V_2$, and

    \nu = \frac{4 + D}{3 + D}.    (83)
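For readers who want the intermediate step, the scaling of the energy density with volume follows from the conservation law (80) and the equation of state (81) in three lines. This is a sketch of the integration, using only the symbols already defined above and writing $V = V_1 V_2 = R^3 a^D$:

```latex
\begin{align*}
\rho + p &= \Bigl(1 + \tfrac{1}{3+D}\Bigr)\rho = \nu\,\rho,
  \qquad \nu = \frac{4+D}{3+D}, \\
\frac{\dot V}{V} &= \frac{3\dot R}{R} + \frac{D\dot a}{a},
  \qquad V = R^3 a^D, \\
\frac{\dot\rho}{\rho} &= -\nu\,\frac{\dot V}{V}
  \quad\Longrightarrow\quad \rho \propto V^{-\nu}.
\end{align*}
```

The exponent is negative: the energy density falls as the total volume grows, and for D = 0 it reduces to the familiar $\rho \propto R^{-4}$ of a radiation gas.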

For a massless gas in 1 + 3 + D dimensions, one can easily see from scaling
arguments that the energy density goes as

    \rho \propto T^{4+D}.    (84)

Putting (82) and (84) together, we deduce that

    bT = constant,    (85)

where

    b = (V_1 V_2)^{1/(3+D)}    (86)

is the geometric mean of the scale factors of the universe. This is relevant to the
thermal history of the early universe. If all the dimensions were expanding, the
universe could only cool. But if some are contracting and some expanding, it is
the behaviour of b which determines whether the universe cools or heats. Sahdev
[7] has given some numerical integrations of the field equations in which the
universe starts out infinitely hot, cools to a certain temperature, reheats again,
and then cools in the usual Robertson-Walker fashion, since by then the universe

is essentially 3 + 1-dimensional. Thus, the extra dimensions do affect the thermal
history.
One of the cosmological problems listed earlier was the entropy problem: How
did the universe get so much entropy? The problem is quantitatively explained in
[8] and is related to the horizon and flatness problems. It is suggested [7,9] that
Kaluza-Klein cosmologies may help in producing more entropy. Naively, this
seems plausible since there is more phase space available and this should lead to
more entropy. This is expected to manifest itself in the four-dimensional world
when the excitations in the extra dimensions annihilate each other. (The higher
mode excitations have been called 'pyrgons' by people educated in the classical
tradition. The modes with no fifth component of momentum are, of course,
photons.) So the entropy is expected to be produced by the decay of pyrgons into
photons.
More quantitatively, let us define the effective four-dimensional temperature
$T_4$ as the fourth root of the energy density in ordinary space:

    T_4^4 \propto \rho\, a^D.    (87)

Then the four-dimensional entropy density is given by

    s_4 \propto T_4^3 \propto \left[a^D V^{-\nu}\right]^{3/4}.    (88)

If we suppose that the universe expands adiabatically (isentropically) in the 4 + D
dimensional sense, we still find that four-dimensional entropy is produced. The
total entropy

    S_4 \propto R^3 s_4 \propto R^3\left[a^D V^{-\nu}\right]^{3/4} \propto \left(\frac{R}{a}\right)^{3D/(4(3+D))}    (89)

increases as R increases and a decreases. However, it has been pointed out that
quantitatively the entropy produced is insufficient to solve the entropy problem.
There appears to be a mistake in the powers given in [9] for the rate of entropy
production. The correct dependence of $S_4$ on R and a (Equation (89)) is rather
weak.
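The power in Equation (89) can be checked by simple exponent bookkeeping. The sketch below is illustrative only: it assumes $V = R^3 a^D$ and $\nu = (4 + D)/(3 + D)$ as above, collects the powers of R and a in $R^3 [a^D V^{-\nu}]^{3/4}$ using exact fractions, and compares with $3D/(4(3 + D))$. The function name is hypothetical.

```python
from fractions import Fraction as F

def entropy_exponents(D):
    """Powers of R and a in S_4 ~ R^3 [a^D (R^3 a^D)^(-nu)]^(3/4)."""
    nu = F(4 + D, 3 + D)
    # R appears as R^3 outside the bracket and as R^(-3 nu) inside it.
    r_power = 3 + F(3, 4) * (-3 * nu)
    # a appears as a^D and as a^(-D nu), both inside the bracket.
    a_power = F(3, 4) * (D - D * nu)
    return r_power, a_power

for D in (1, 2, 6, 7):
    r_power, a_power = entropy_exponents(D)
    print(D, r_power, a_power)
```

The two powers are equal and opposite, so $S_4$ depends only on the ratio R/a. For D = 1 the exponent is 3/16 and it never exceeds 3/4 however large D is, which is the sense in which the dependence is weak.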
5.6. Conclusion

In summary, the Kaluza-Klein idea leads to the following possibilities in


cosmology.
(1) The variation of coupling constants on cosmological time scales. This
could be quite important because reaction rates would depend significantly on
the values of these coupling constants.
(2) The thermal history of the early universe could be affected by the
contraction of the extra dimensions. This too could affect reaction rates.
(3) The extra dimensions may contribute to the total entropy of the universe.

References

1. Th. Kaluza, Sitzungsber. Preuss. Akad. Wiss. Berlin, Math. Phys. Kl., 966 (1921).
2. O. Klein, Z. Phys. 37, 895 (1926).
3. R. P. Feynman, Lectures on Gravitation, California Institute of Technology (1971).
4. S. Weinberg, Gravitation and Cosmology, John Wiley (1972).
5. E. Witten, Nucl. Phys. B186, 412 (1981).
6. P. A. M. Dirac, Proc. Roy. Soc. London A165, 199 (1938); A365, 19 (1979).
7. D. Sahdev, Phys. Lett. 137B, 155 (1984).
8. A. H. Guth, Phys. Rev. D23, 347 (1981).
9. E. Alvarez and M. B. Gavela, Phys. Rev. Lett. 51, 931 (1983).
10. A. Chodos and S. Detweiler, Phys. Rev. D21, 2167 (1980).
11. M. Gleiser and J. G. Taylor, Phys. Rev. D31, 1904 (1985); D32, 3337 (1985); D33, 570 (1986).

23. An Elementary Introduction to the Gauge Theory Approach to Gravity

N. MUKUNDA
Centre for Theoretical Studies, Indian Institute of Science,
Bangalore 560012, India

1. Introduction
The physical basis for the passage from special to general relativity has been
explained in Chapter 1 by P. C. Vaidya. One ends up with a generally covariant
theory of gravitation. Spacetime is viewed as a pseudo-Riemannian manifold
carrying a distinguished second rank symmetric covariant tensor, the metric field.
This brings along with it the notions of the Christoffel connection, and covariant
derivatives of tensor fields of various ranks; and based on the principle of
equivalence, one has a minimal way in which any special relativistic (field) theory
(not involving spinors) can be extended to include coupling to gravitation.
Finally, one has an action for the gravitational field itself, namely the
Hilbert-Einstein expression.
Very early on, Weyl attempted to enlarge this framework and unify gravitation
with electromagnetism in a geometrical manner. Just as in the metric theory of
gravitation parallel transport of a vector round a (small) closed loop generally
alters its direction, though not its magnitude, Weyl tried to interpret electromagnetism in terms of a change in magnitude as well. The names 'gauge
transformation' and 'gauge theory' go back to this attempt of Weyl. But while the
known gauge transformations and invariance of Maxwell's equations were
mathematically reproduced properly, the Weyl theory was untenable on physical
grounds.
After the advent of quantum mechanics, it was realized, by Weyl again, and by
London, that the gauge changes of electromagnetism should be associated with
local phase changes of complex-valued quantum mechanical wave functions for
charged particles, and not with anholonomic length changes of real vectors and
tensors. In the course of the development of elementary particle theory, this kind
of gauge invariance with respect to an internal symmetry group led to the
fundamental ideas of non-Abelian gauge theories of the Yang-Mills type. In this
context, the internal symmetry groups that are permitted are the compact
semi-simple Lie groups briefly described by Mukunda in Chapter 14. The basic
structures of Yang-Mills theories have already been covered in Part II.
One can now go back to the original problem of the passage from special to
general relativity, and ask if some modified form of the Yang-Mills argument for

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 467-479.
© 1989 by Kluwer Academic Publishers.


an internal symmetry can be used for this purpose as well. Here, in place of
a compact semi-simple Lie group, one would have to deal with the homogeneous
and inhomogeneous Lorentz groups of special relativity which act on spacetime
as well as on dynamical fields, and ask what it might mean to gauge them. This
problem was first analyzed by Utiyama, Kibble and Sciama and it is this work
that will be briefly described in an entirely elementary manner in this section.
What it finally leads to is what is known as the Einstein-Cartan theory which has
been described in detail by A. R. Prasanna in Chapter 8. However, as a contrast to
the compressed and efficient differential geometric methods used in Prasanna's
treatment, a more naive and simple minded approach will be taken here, relying
throughout on local fields, their transformations, and their Lagrangians. This
section is meant merely to serve as an introduction to this point of view towards
gravity, emphasizing the similarities to and differences from an internal gauge
symmetry; the considerable body of recent work in this field is indicated
via representative references. One will see the important concepts of vierbeins,
torsion, and coupling of spinors to gravitation, all emerge in a fairly simple
way-each of these important concepts is used in other parts of this book, for
example, in the contributions of A. N. Maheshwari (Chapter 20) and R. Kaul
(Chapter 25).
2. The Yang-Mills Construction

To set the notation, and for later comparison, quickly recall the essential steps of
the Yang-Mills procedure. A set of 'matter' fields $\phi(x)$, on some spacetime
background $\mathcal{M}$, is given; here $\mathcal{M}$ may be Minkowskian as in special relativity, or
could be more general. The fields $\phi(x)$ belong to some unitary (possibly reducible)
representation $D(g)$ of some simple compact Lie group $G$. If the anti-Hermitian
generators of this representation are $T_a$, they obey the commutation relations

$$[T_a, T_b] = f_{ab}{}^{c}\, T_c, \tag{1}$$

where the $f_{ab}{}^{c}$ are the (real) structure constants of $G$. The action of $G$ on $\phi$, both in
infinitesimal form and in finite form, is given by

$$\phi'(x) - \phi(x) = \delta\phi(x) = \varepsilon^a T_a \phi(x), \quad |\varepsilon^a| \ll 1; \qquad \phi'(x) = D(g)\phi(x). \tag{2}$$

Here we have constant spacetime-independent parameters $\varepsilon^a$, or group element $g$,
corresponding to a global transformation. Therefore, the derivatives $\partial_\mu\phi$ transform in
the same way as $\phi$:

$$\delta\partial_\mu\phi(x) = \varepsilon^a T_a\, \partial_\mu\phi(x); \qquad \partial_\mu\phi'(x) = D(g)\,\partial_\mu\phi(x). \tag{3}$$

A matter Lagrange density $\mathcal{L}_M(\phi; \partial_\mu\phi)$ invariant under this global action by
$G$ must identically satisfy

$$\frac{\partial\mathcal{L}_M(\phi;\phi_\mu)}{\partial\phi}\, T_a\phi + \frac{\partial\mathcal{L}_M(\phi;\phi_\mu)}{\partial\phi_\nu}\, T_a\phi_\nu \equiv 0, \qquad a = 1, 2, \ldots \tag{4}$$


where the arguments $\phi_\mu$ in $\mathcal{L}_M$ need not necessarily be $\partial_\mu\phi$. It is assumed here that
all other arguments of $\mathcal{L}_M(\phi; \partial\phi)$ (not explicitly indicated) are unaffected by $G$.
Noether's theorem then yields the result that when the field equations for $\phi$ are
obeyed, the following currents associated with $G$ are conserved

(5)

When the group $G$ is 'gauged', the $\varepsilon^a$ and $g$ in Equation (2) become functions of
spacetime

$$\delta\phi(x) = \varepsilon^a(x) T_a \phi(x); \qquad \phi'(x) = D(g(x))\phi(x). \tag{6}$$

As a result, we lose Equations (3) for $\partial_\mu\phi$, since extra terms linear in $\phi$ and
proportional to $\partial_\mu\varepsilon^a$ appear and we cannot expect $\mathcal{L}_M(\phi; \partial\phi)$ to be any longer
invariant; in fact we find

(7)
Here, no use has been made of any equations of motion.
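As an illustration of Equations (5) and (7), one conventional way of writing the Noether currents and the change of the matter Lagrangian under a local transformation is sketched below. The overall sign and normalization of $J^\mu_a$ are an assumption made here for definiteness and may differ from the convention of the original equations.

```latex
% Assumed convention for the current (sign and normalization are not taken from the source):
J^\mu_a \;=\; \frac{\partial\mathcal{L}_M}{\partial(\partial_\mu\phi)}\, T_a\,\phi ,
\qquad \partial_\mu J^\mu_a = 0 \quad \text{when the field equations for } \phi \text{ hold.}

% Under the local transformation (6) the derivative picks up an extra term:
\delta(\partial_\mu\phi) \;=\; \varepsilon^a(x)\, T_a\, \partial_\mu\phi \;+\; (\partial_\mu\varepsilon^a)\, T_a\,\phi .

% The first term cancels against \delta\phi by the identity (4); only the extra term survives:
\delta\mathcal{L}_M(\phi;\partial\phi)
  \;=\; \frac{\partial\mathcal{L}_M}{\partial(\partial_\mu\phi)}\, T_a\,\phi\;\partial_\mu\varepsilon^a
  \;=\; J^\mu_a(x)\,\partial_\mu\varepsilon^a(x) .
```

With this convention the failure of local invariance is governed entirely by the same current that is conserved in the global case, which is why no equations of motion are needed at this step.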
However, the comment following Equation (4), to the effect that the argument
$\phi_\mu$ in $\mathcal{L}_M$ in Equation (4) need not be $\partial_\mu\phi$, gives us the clue to a natural
construction of a modified Lagrangian which will be invariant under the local
transformations (6), starting with the globally symmetric $\mathcal{L}_M(\phi; \partial\phi)$. Namely, we
generalize the ordinary derivatives $\partial_\mu\phi$ to new derivatives $D_\mu\phi$ which are
covariant with respect to $G$:

$$D_\mu(A)\phi(x) = (\partial_\mu + A_\mu(x))\phi(x), \qquad A_\mu(x) = A^a_\mu(x)\, T_a. \tag{8}$$

The behaviour of the new fields $A^a_\mu$ (which are covariant vectors on $\mathcal{M}$) is
determined by the condition that $D_\mu\phi$ now must transform in the same simple
way in which $\partial_\mu\phi$ did before gauging. This requirement, and its consequences, are

$$\delta D_\mu(A)\phi(x) = \varepsilon^a(x) T_a D_\mu(A)\phi(x), \qquad D_\mu(A')\phi'(x) = D(g(x))\, D_\mu(A)\phi(x); \tag{9a}$$

$$\delta A_\mu(x) = [\varepsilon^a(x) T_a, A_\mu(x)] - \partial_\mu\varepsilon^a(x)\, T_a,$$
$$\delta A^a_\mu(x) = f_{bc}{}^{a}\,\varepsilon^b(x) A^c_\mu(x) - \partial_\mu\varepsilon^a(x),$$
$$A'_\mu(x) = D(g(x))\,(A_\mu(x) + \partial_\mu)\, D(g(x))^{-1}. \tag{9b}$$

Here, both the infinitesimal and finite transformation laws have been given, and
for $A_\mu$ both in matrix and in component form in the infinitesimal case. With this
introduction of the gauge potentials $A^a_\mu$, and the transformation laws secured for
$D_\mu\phi$, we immediately see that the modified matter Lagrangian density $\mathcal{L}_M(\phi; D\phi)$
is invariant under local gauge transformations:

$$\delta\mathcal{L}_M(\phi; \partial\phi) = 0 \ \text{under (2)} \;\Rightarrow\; \delta\mathcal{L}_M(\phi; D(A)\phi) = 0 \ \text{under (6), (9)}. \tag{10}$$
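The way the inhomogeneous term in the transformation of the potential cancels the unwanted derivative of the parameters can be checked in a few lines. The sketch below uses the shorthand $\varepsilon \equiv \varepsilon^a(x) T_a$, which is introduced here only for brevity.

```latex
% Infinitesimal variation of D_\mu\phi = (\partial_\mu + A_\mu)\phi, with \varepsilon \equiv \varepsilon^a(x) T_a:
\delta(D_\mu\phi) \;=\; \partial_\mu(\delta\phi) + (\delta A_\mu)\,\phi + A_\mu\,\delta\phi

% Insert \delta\phi = \varepsilon\phi and \delta A_\mu = [\varepsilon, A_\mu] - \partial_\mu\varepsilon:
\;=\; (\partial_\mu\varepsilon)\phi + \varepsilon\,\partial_\mu\phi
      + \varepsilon A_\mu\phi - A_\mu\varepsilon\phi - (\partial_\mu\varepsilon)\phi
      + A_\mu\varepsilon\phi

% The \partial_\mu\varepsilon terms and the A_\mu\varepsilon terms cancel pairwise:
\;=\; \varepsilon\,(\partial_\mu + A_\mu)\phi \;=\; \varepsilon^a(x)\, T_a\, D_\mu\phi .
```

The result has the same form as the global law for $\partial_\mu\phi$, so the identity (4) with $\phi_\mu = D_\mu\phi$ gives the local invariance of $\mathcal{L}_M(\phi; D\phi)$ directly.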


This new matter Lagrange density $\mathcal{L}_M(\phi; D(A)\phi)$ will lead to new equations of
motion for $\phi$ which are gauge-covariant.
Turning to the construction of a Lagrange density for the gauge fields
$A^a_\mu$ introduced above in order to make the matter action gauge invariant, we must
look for some function $\mathcal{L}_0(A; \partial A)$ which is also gauge invariant. That is, $\mathcal{L}_0(A; \partial A)$
must be unchanged when $A^a_\mu$ undergoes the inhomogeneous (infinitesimal)
transformation given in Equation (9b). The imposition of this requirement is
conveniently done in two stages. One finds at first that on account of the
inhomogeneous term in $\delta A^a_\mu$ in Equation (9b), $\mathcal{L}_0$ must depend on $A^a_\mu$ and its
spacetime derivatives in a specific combination

$$F_{\mu\nu} = \partial_\nu A_\mu - \partial_\mu A_\nu - [A_\mu, A_\nu] = F^a_{\mu\nu}\, T_a,$$
$$F^a_{\mu\nu} = \partial_\nu A^a_\mu - \partial_\mu A^a_\nu - f_{bc}{}^{a} A^b_\mu A^c_\nu. \tag{11}$$

This second rank covariant antisymmetric tensor $F^a_{\mu\nu}$ is also recognized to be the
commutator of the gauge-covariant derivatives $D_\mu(A)$ in the sense that

$$[D_\mu(A), D_\nu(A)] = -F^a_{\mu\nu}\, T_a. \tag{12}$$
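The commutator relation can be verified directly from the definition of the covariant derivative in Equation (8); the short derivation below is a sketch that uses only Equations (1), (8) and (11).

```latex
% Act with two covariant derivatives on \phi and antisymmetrize:
[D_\mu, D_\nu]\phi
  \;=\; (\partial_\mu + A_\mu)(\partial_\nu + A_\nu)\phi \;-\; (\mu \leftrightarrow \nu)
  \;=\; \bigl(\partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu]\bigr)\phi
  \;=\; -F_{\mu\nu}\,\phi .

% Component form, using A_\mu = A^a_\mu T_a and [T_b, T_c] = f_{bc}{}^{a} T_a:
[A_\mu, A_\nu] \;=\; A^b_\mu A^c_\nu\, f_{bc}{}^{a}\, T_a
\quad\Longrightarrow\quad
F^a_{\mu\nu} \;=\; \partial_\nu A^a_\mu - \partial_\mu A^a_\nu - f_{bc}{}^{a} A^b_\mu A^c_\nu .
```

The terms with two derivatives of $\phi$, and those with one derivative of $\phi$, cancel under antisymmetrization, which is why the commutator acts on $\phi$ as a multiplication operator.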

As a result, under a gauge transformation $F_{\mu\nu}$ transforms in a linear homogeneous
manner, that is, this 'field strength tensor' has a covariant gauge transformation
character given in infinitesimal and finite forms by

$$\delta F_{\mu\nu}(x) = [\varepsilon^a(x) T_a, F_{\mu\nu}(x)], \qquad \delta F^a_{\mu\nu}(x) = f_{bc}{}^{a}\,\varepsilon^b(x) F^c_{\mu\nu}(x);$$
$$F'_{\mu\nu}(x) = D(g(x))\, F_{\mu\nu}(x)\, D(g(x))^{-1}. \tag{13}$$

The remaining condition is that $\mathcal{L}_0(F)$ must be invariant under the linear
homogeneous transformation of $F$ appearing in the last line of Equation (13). This
can be expressed in the following interesting way: if we define the derivative of
$\mathcal{L}_0(F)$ with respect to $F$ to be $G^{\mu\nu}_a$ by the convention

$$\delta\mathcal{L}_0(F) \equiv \tfrac{1}{2}\, G^{\mu\nu}_a\, \delta F^a_{\mu\nu}, \qquad G^{\mu\nu}_a = -G^{\nu\mu}_a, \tag{14}$$

then the condition on $\mathcal{L}_0(F)$ takes the form

$$\delta\mathcal{L}_0(F) = 0 \iff f_{ab}{}^{c}\, G^{\mu\nu}_c F^b_{\mu\nu} \equiv 0. \tag{15}$$

The complete Lagrange density governing both 'matter' $\phi$ and gauge fields $A$ is
then

$$\mathcal{L} = \mathcal{L}_M(\phi; D(A)\phi) + \mathcal{L}_0(F), \tag{16}$$

with $\mathcal{L}_M$ subject to the conditions (4) and $\mathcal{L}_0$ to (15). We see how in a systematic
way, starting from the global symmetry of $\mathcal{L}_M(\phi; \partial\phi)$ under $G$, we have been led to
the locally gauge invariant augmented Lagrange function (16).
The condition (15) restricting the form of the 'free' term $\mathcal{L}_0(F)$ can be interpreted
in another way. Just as the matter fields $\phi$ were supposed to belong to some unitary
representation $D(g)$ of $G$, both $F^a_{\mu\nu}$ and $G^{\mu\nu}_a$ (in the latter case, only if $\mathcal{L}_0(F)$ is
invariant under $G$!) belong to the adjoint representation of $G$. By working out the
generator matrices for this representation appropriate for $F^a_{\mu\nu}$ and for $G^{\mu\nu}_a$, and
employing Equation (12) which holds in any representation, we can show that

(17)

so

(18)

These are the Bianchi identities in the present context: they will be obeyed once one
has the gauge invariance and they do not need the equations of motion at all.
Rather, one must convince oneself that the equations of motion do not conflict with
the Bianchi identities in any way! Indeed, there is no conflict, as the following
argument shows. The gauge field equations of motion are

(19)

where the expressions $J^\mu_a$ are the ones encountered in Equation (5) before the
gauging of $G$. The question is whether $D_\mu J^\mu(\phi; D\phi)$ vanishes; for consistency it
must, since $D_\mu D_\nu G^{\mu\nu}$ vanishes identically. It now turns out that while $D_\mu J^\mu(\phi; D\phi)$
do not vanish identically, they do vanish by virtue of the field equations for $\phi$! This
happens because the gauge invariance of $\mathcal{L}_M(\phi; D\phi)$ can be exploited to yield the
identity

$$D_\mu J^\mu_a(\phi; D\phi) \equiv \frac{\delta\mathcal{L}_M(\phi; D\phi)}{\delta\phi}\, T_a\phi, \tag{20}$$

hence, the stated conclusion.


All the above development is independent of the nature of the spacetime
background $\mathcal{M}$. If we now assume that $\mathcal{M}$ is the spacetime of special relativity,
a natural choice for $\mathcal{L}_0(F)$ similar to the Maxwell Lagrangian emerges

$$\mathcal{L}_0(F) = -\frac{1}{4e^2}\,\eta^{\mu\rho}\eta^{\nu\sigma}\, F^a_{\mu\nu} F_{\rho\sigma a}. \tag{21}$$

Here $e$ is a coupling constant and the raising and lowering of the group index is
done with the help of the Killing metric of $G$. In this situation, $G^{\mu\nu}_a$ and $F^a_{\mu\nu}$ are
essentially the same, and this is what is encountered in the particle physics
contexts described in Part II.
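That a Maxwell-type quadratic Lagrangian automatically meets the invariance condition (15) can be seen in one step. The sketch below assumes a factor $\tfrac{1}{2}$ in the convention defining $G^{\mu\nu}_a$ in Equation (14), and writes $\kappa_{ab}$ for the Killing metric; both the symbol $\kappa_{ab}$ and that normalization are assumptions made for this illustration.

```latex
% Varying the quadratic Lagrangian (21), with F^{\mu\nu}_a \equiv \eta^{\mu\rho}\eta^{\nu\sigma}\kappa_{ab}F^b_{\rho\sigma}:
\delta\mathcal{L}_0 \;=\; -\frac{1}{2e^2}\, F^{\mu\nu}_a\, \delta F^a_{\mu\nu}
\quad\Longrightarrow\quad
G^{\mu\nu}_a \;=\; -\frac{1}{e^2}\, F^{\mu\nu}_a .

% Condition (15) then reads
f_{ab}{}^{c}\, G^{\mu\nu}_c F^b_{\mu\nu}
  \;=\; -\frac{1}{e^2}\, f_{abd}\, F^{d\,\mu\nu} F^b_{\mu\nu} \;=\; 0 ,
\qquad f_{abd} \equiv f_{ab}{}^{c}\kappa_{cd} .

% It vanishes because f_{abd} is antisymmetric in (b,d) for a compact semi-simple group,
% while F^{d\,\mu\nu}F^b_{\mu\nu} is symmetric in (b,d).
```

This is the sense in which $G^{\mu\nu}_a$ and $F^a_{\mu\nu}$ are "essentially the same" for this choice: they differ only by the constant factor and by index raising with the Minkowski and Killing metrics.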
The essential step in the above analysis is thus the introduction of the gauge
potentials $A^a_\mu$ and the gauge-covariant derivative operator $D_\mu(A)$: it is designed to
produce $D_\mu(A)\phi$ which has the same local gauge transformation law as $\phi$ itself.
That $D_\mu(A)\phi$ is simply $\partial_\mu\phi$ plus something linear in $\phi$, is a reflection of the fact
that under an infinitesimal gauge transformation, the 'unwanted term' in $\delta\partial_\mu\phi$ is
also linear in $\phi$.
Now we see in outline how the above technique can be modified in the case of
the Lorentz group of a special relativistic field theory.


3. Gauging a Special Relativistic Matter Lagrangian

We deal with fields $\phi$ on Minkowski spacetime $\mathcal{M}$ belonging to some representation of SL(2, C) rather than SO(3, 1), so that spinor fields can be accommodated.
To keep the notation simple, however, we indicate Lorentz transformations
$\Lambda, \Lambda', \ldots$, in various equations, rather than SL(2, C) elements. It is also more
transparent to deal with finite transformations as far as possible.
Under the inhomogeneous Lorentz transformation

$$x'^\mu = \Lambda^\mu{}_\nu\, x^\nu + a^\mu, \tag{22}$$

let the multiplet of the matter fields $\phi(x)$ transform via

$$\phi'(x') = D(\Lambda)\phi(x). \tag{23}$$

This is accompanied by

$$\partial'_\mu\phi'(x') = \Lambda_\mu{}^\nu\, D(\Lambda)\,\partial_\nu\phi(x). \tag{24}$$

Manifest relativistic invariance of a matter Lagrange density $\mathcal{L}_M(\phi; \partial\phi)$ means

$$\mathcal{L}_M(\phi'(x'); \partial'_\mu\phi'(x')) = \mathcal{L}_M(\phi(x); \partial_\mu\phi(x)), \tag{25}$$

or on using Equations (23), (24) and suppressing indices

$$\mathcal{L}_M(D(\Lambda)\phi(x); \Lambda D(\Lambda)\,\partial\phi(x)) = \mathcal{L}_M(\phi(x); \partial\phi(x)). \tag{26}$$

Here, once again $\partial_\mu\phi(x)$ could be replaced by any indexed object $\phi_\mu(x)$, and this
equation would continue to hold.
To gauge the group of transformations (22) will be interpreted to mean that we
make the six parameters in $\Lambda$ and the four translation parameters $a^\mu$ all into ten
independent arbitrary functions of $x$. When this happens, we still have $\Lambda(x)$ an
element of SO(3, 1) (more correctly of SL(2, C)) for each $x$, and the basic
transformation law for $\phi$ reads

$$\phi'(x') = D(\Lambda(x))\phi(x). \tag{27}$$

However, since $a^\mu(x)$ are arbitrary, there is no way in which the matrix $\Lambda(x)$ can be
'read off' from any given expressions for $x'$ in terms of $x$: this was possible in
special relativity, since $\Lambda$ and $a^\mu$ were both constant, but no longer so. In Equation
(27), therefore, we are led to view $x'$ as arising from $x$ by a general coordinate
transformation (GCT) not referring to $\Lambda(x)$ at all; and in addition to this GCT, the
transformation law of $\phi$ involves a local 'shuffling of components' for which alone
the 'local Lorentz rotation' $\Lambda(x)$ is relevant. With this disengaging of $\Lambda(x)$ and $x'$
from each other, we are led to modify our index conventions: indices $\mu, \nu, \ldots$ will
appear on spacetime coordinates as $x^\mu, x'^\nu, \ldots$; while for $\Lambda(x)$ we use distinct
indices $a, b, \ldots$. Thus, $\Lambda^a{}_b(x)$, $\Lambda_b{}^a(x)$, .... This means that any spinor, vector or tensor
character that $\phi$ has in special relativity now appears as its local Lorentz
transformation behaviour. Under a GCT, each component of $\phi$ behaves as
a scalar field!


The law for derivatives of $\phi$ that we get from Equation (27) is

$$\partial'_\mu\phi'(x') = \frac{\partial x^\nu}{\partial x'^\mu}\left[D(\Lambda(x))\,\partial_\nu\phi(x) + \partial_\nu D(\Lambda(x))\,\phi(x)\right]. \tag{28}$$

This has both familiar and unfamiliar features. In place of the matrix $\Lambda_\mu{}^\nu$
appearing explicitly in Equation (24) in special relativity, we have now the
Jacobian associated with the GCT: this kind of thing is new, not present in
ordinary gauge theory. The derivative of $D(\Lambda(x))$ in the second term inside the
bracket is, however, familiar: it is just like the unwanted terms linear in $\phi$
occurring in $\delta\partial_\mu\phi$ in gauge theory. If we wish to exploit the relativistic invariance
of $\mathcal{L}_M(\phi; \partial\phi)$, as expressed in Equation (26), to achieve invariance under the
transformations (27) for a modified matter Lagrange density, our strategy must
involve two stages: (i) just like $A^a_\mu$ previously, we must introduce an SL(2, C) gauge
potential to get rid of the unwanted term $\partial_\nu D(\Lambda(x))$; (ii) after this, we must
introduce additional new fields whose role will be to absorb the Jacobian factor in
Equation (28) and put in its place the matrix elements $\Lambda_b{}^a(x)$. With such a doubly
generalized derivative of $\phi$, we can exploit Equation (26).
If the generators of the (SL(2, C)) representation $D(\Lambda)$ are $S_{ab} = -S_{ba}$ obeying
the usual commutation rules (with $\eta$ the Minkowski metric)

$$[S_{ab}, S_{cd}] = \eta_{ac} S_{bd} - \eta_{bc} S_{ad} + \eta_{ad} S_{cb} - \eta_{bd} S_{ca} \tag{29}$$

so that for infinitesimal parameters $\varepsilon^{ab} = -\varepsilon^{ba}$, we have

$$D(\delta^a{}_b + \varepsilon^a{}_b) \simeq 1 + \tfrac{1}{2}\,\varepsilon^{ab} S_{ab}, \tag{30}$$

in the first step, we define an SL(2, C)-covariant derivative by

$$D_\mu(A)\phi(x) = (\partial_\mu + A_\mu)\phi(x) = \partial_\mu\phi(x) + \tfrac{1}{2}\, A^{ab}_\mu S_{ab}\,\phi(x). \tag{31}$$
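The commutation rules (29) can be checked numerically in the simplest case, the four-vector representation. The sketch below is only an illustration: the explicit matrices $(S_{ab})^e{}_f = \delta^e_b\,\eta_{af} - \delta^e_a\,\eta_{bf}$ and the signature $(+,-,-,-)$ are assumptions chosen here so that the relations come out with the signs written in Equation (29); the function names are hypothetical.

```python
from itertools import product

# Minkowski metric eta_ab; the signature (+,-,-,-) is an assumption of this sketch.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
N = 4

def generator(a, b):
    """Assumed vector-representation matrix (S_ab)^e_f = delta^e_b eta_af - delta^e_a eta_bf."""
    return [[(e == b) * ETA[a][f] - (e == a) * ETA[b][f] for f in range(N)]
            for e in range(N)]

def matmul(x, y):
    """Plain matrix product of two N x N integer matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(N)) for c in range(N)]
            for r in range(N)]

def commutator(x, y):
    """Matrix commutator xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[r][c] - yx[r][c] for c in range(N)] for r in range(N)]

def right_hand_side(a, b, c, d):
    """eta_ac S_bd - eta_bc S_ad + eta_ad S_cb - eta_bd S_ca, as in Equation (29)."""
    terms = [(ETA[a][c], generator(b, d)), (-ETA[b][c], generator(a, d)),
             (ETA[a][d], generator(c, b)), (-ETA[b][d], generator(c, a))]
    return [[sum(k * m[e][f] for k, m in terms) for f in range(N)]
            for e in range(N)]

def satisfies_equation_29():
    """Check the commutation rule for every choice of the four indices."""
    return all(commutator(generator(a, b), generator(c, d)) == right_hand_side(a, b, c, d)
               for a, b, c, d in product(range(N), repeat=4))

def is_antisymmetric():
    """Check S_ab = -S_ba."""
    return all(generator(a, b) == [[-v for v in row] for row in generator(b, a)]
               for a, b in product(range(N), repeat=2))
```

Calling `satisfies_equation_29()` runs over all 256 index combinations. Integer arithmetic is used throughout, so the comparison is exact rather than subject to rounding.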

Under the combined GCT, plus local Lorentz transformation (27), we will have

$$D'_\mu(A')\phi'(x') = \frac{\partial x^\nu}{\partial x'^\mu}\, D(\Lambda(x))\, D_\nu(A)\phi(x) \tag{32}$$

provided

$$A'_\mu(x') = \frac{\partial x^\nu}{\partial x'^\mu}\, D(\Lambda(x))\,(A_\nu(x) + \partial_\nu)\, D(\Lambda(x))^{-1}. \tag{33}$$

The similarities and differences in comparison to Equation (9b) must be noted:
the SL(2, C) gauge potential $A_\mu(x)$, also called the spin connection, is a covariant
vector under GCT, and has the familiar linear inhomogeneous transformation
law under local SL(2, C).
At the next stage, we define yet another generalised derivative $\mathcal{D}_a$ linearly
related to $D_\mu\phi$, by using new fields $h_a{}^\mu(x)$:

$$\mathcal{D}_a\phi(x) = h_a{}^\mu(x)\, D_\mu\phi(x) = h_a{}^\mu(x)\,(\partial_\mu + A_\mu(x))\,\phi(x). \tag{34}$$

This will transform as $\partial\phi$ did in special relativity (cf. Equation (24)), namely we
will have

$$\mathcal{D}'_a\phi'(x') = \Lambda_a{}^b(x)\, D(\Lambda(x))\, \mathcal{D}_b\phi(x), \tag{35}$$

provided

$$h'_a{}^\mu(x') = \frac{\partial x'^\mu}{\partial x^\nu}\, \Lambda_a{}^b(x)\, h_b{}^\nu(x). \tag{36}$$

Thus, $h_a{}^\mu$ has a linear homogeneous transformation law; it is a contravariant GCT
vector, and a local Lorentz four-vector.
With these ingredients and on the basis of the property (26) for $\mathcal{L}_M$, we see that
when $\phi$ goes to $\phi'$ according to Equation (27), we will have

$$\mathcal{L}_M(\phi'(x'); \mathcal{D}'\phi'(x')) = \mathcal{L}_M(\phi(x); \mathcal{D}\phi(x)), \tag{37}$$

i.e. the new Lagrange function $\mathcal{L}_M(\phi; \mathcal{D}\phi)$ built from the original special
relativistic one, is a GCT and local Lorentz scalar. However, again in contrast to
the Yang-Mills case, this is not the end of the argument. Since $x \to x'$ is a GCT, we
need a scalar density, not just a scalar, as the integrand in the matter action. Thus,
to $\mathcal{L}_M(\phi; \mathcal{D}\phi)$ we must append a multiplicative factor which transforms by the
appropriate Jacobian determinant of the GCT. Taking advantage of the
postulated behaviour (36) for the $h$-field, our final gauged matter action density is

$$\Delta(h)\,\mathcal{L}_M(\phi; \mathcal{D}\phi). \tag{38}$$
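The role of the multiplicative factor can be made explicit with a short computation. The sketch below assumes that $\Delta(h)$ is the inverse determinant of the matrix $(h_a{}^\mu)$, which is consistent with the later statement $\Delta(h) = \sqrt{-g}$ but is an identification made here rather than one quoted from the text.

```latex
% Volume element under a GCT:
d^4x' \;=\; \det\!\left(\frac{\partial x'}{\partial x}\right) d^4x .

% From the transformation law (36), with \det\Lambda = 1 for a proper Lorentz transformation:
\det\bigl(h'_a{}^{\mu}\bigr) \;=\; \det\!\left(\frac{\partial x'}{\partial x}\right)\det\bigl(h_a{}^{\mu}\bigr) .

% Hence, with the assumed identification \Delta(h) \equiv [\det(h_a{}^{\mu})]^{-1}:
\Delta(h')\, d^4x' \;=\; \Delta(h)\, d^4x ,
\qquad\text{so that}\qquad
\int d^4x\; \Delta(h)\, \mathcal{L}_M(\phi;\mathcal{D}\phi)
\ \text{ is invariant.}
```

The Jacobian picked up by the volume element is cancelled by the one carried by the vierbein determinant, which is the sense in which the $h$-field supplies the required scalar density.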
4. Kinematics of the Gravitational Variables

Gauging the inhomogeneous Lorentz group, as far as matter is concerned, has so
far brought in 40 new variables: the 24 $A^{ab}_\mu$ and the 16 $h_a{}^\mu$. While the former are
Yang-Mills potentials for SL(2, C), the latter are naturally interpreted as
a vierbein. It is evident that the homogeneous and the translatory parts of the
Poincare group have not behaved in the same way in this process: this is
ultimately because in special relativity we do not normally consider a 'shuffling of
components' under spacetime translations (but more on this later). The metric
tensor can be defined in terms of $h_a{}^\mu$ in the usual way. We assume that the matrix
$(h_a{}^\mu)$ is nonsingular, with inverse $(h^a{}_\mu)$, and then

$$g_{\mu\nu} = \eta_{ab}\, h^a{}_\mu h^b{}_\nu, \qquad g^{\mu\nu} = \eta^{ab}\, h_a{}^\mu h_b{}^\nu. \tag{39}$$
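As a small numerical illustration of Equation (39), the sketch below builds a metric from a vierbein at a single spacetime point and checks two properties: the metric is unchanged by a local Lorentz rotation of the vierbein, and the determinant of the inverse vierbein $h^a{}_\mu$ squares to $-\det g$. The sample numbers, the signature $(+,-,-,-)$ and all function names are assumptions made for this example only.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of eta_ab; the signature is an assumption

def metric_from_vierbein(h):
    """Return g[mu][nu] = eta_ab h^a_mu h^b_nu, where h[a][mu] stores h^a_mu."""
    return [[sum(ETA[a] * h[a][mu] * h[a][nu] for a in range(4))
             for nu in range(4)] for mu in range(4)]

def boost_x(rapidity):
    """Lorentz boost along the first spatial axis, as a matrix Lambda^a_b."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def local_lorentz_rotate(lam, h):
    """Return h'^a_mu = Lambda^a_b h^b_mu (a local Lorentz rotation at one point)."""
    return [[sum(lam[a][b] * h[b][mu] for b in range(4)) for mu in range(4)]
            for a in range(4)]

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def max_abs_diff(x, y):
    """Largest entry-wise difference between two 4 x 4 matrices."""
    return max(abs(x[i][j] - y[i][j]) for i in range(4) for j in range(4))

# An arbitrary nonsingular h^a_mu at one point (illustrative numbers only).
H = [[1.2, 0.1, 0.0, 0.3],
     [0.0, 0.9, 0.2, 0.0],
     [0.1, 0.0, 1.1, 0.0],
     [0.0, 0.4, 0.0, 0.8]]

G = metric_from_vierbein(H)
G_ROTATED = metric_from_vierbein(local_lorentz_rotate(boost_x(0.7), H))
```

The comparison `max_abs_diff(G, G_ROTATED)` is at the level of rounding error, reflecting the fact that the ten metric components are blind to the six parameters of a local Lorentz rotation.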

The generalized derivatives $D_\mu$, $\mathcal{D}_a$ are applicable to any multiplet $\phi$ belonging to
any representation of SL(2, C) but at the same time a GCT scalar, and their
properties are

$$\begin{aligned}
\phi &\sim D(\Lambda)\ \text{of SL(2, C)} \times (\text{GCT scalar}) \;\Rightarrow \\
D\phi &\sim D(\Lambda)\ \text{of SL(2, C)} \times (\text{GCT covariant vector}), \\
\mathcal{D}\phi &\sim (\Lambda \times D(\Lambda))\ \text{of SL(2, C)} \times (\text{GCT scalar}).
\end{aligned} \tag{40}$$

But we have already introduced the vierbein with simultaneous nontrivial
SL(2, C) and GCT properties, so we are obliged to set up a more general covariant
derivative than $D_\mu$, $\nabla_\mu$ say, capable of being applied to fields which have definite
tensor behaviour under GCT, in addition to SL(2, C) behaviour. So we define $\nabla_\mu$
(and also a $\bar{\partial}_\mu$) as

$$\nabla_\mu = D_\mu + \Gamma_\mu = \bar{\partial}_\mu + A_\mu = \partial_\mu + A_\mu + \Gamma_\mu = \partial_\mu + \tfrac{1}{2}\, A^{ab}_\mu S_{ab} + \Gamma_\mu{}^\rho{}_\lambda\, X_\rho{}^\lambda, \tag{41}$$

where $\Gamma_\mu{}^\rho{}_\lambda$ is an affine connection, and $X_\rho{}^\lambda$ are the GL(4, R) generator matrices
appropriate to the GCT tensor character of the field to which $\nabla_\mu$ is to be applied.
We require of $\nabla_\mu$ that

$$\begin{aligned}
\phi &\sim D(\Lambda)\ \text{of SL(2, C)} \times (\text{some GCT tensor}) \;\Rightarrow \\
\nabla\phi &\sim D(\Lambda)\ \text{of SL(2, C)} \times (\text{same GCT tensor} \times \text{covariant vector}).
\end{aligned} \tag{42}$$

(The 64 $\Gamma$'s along with 40 $A$'s and $h$'s make up 104 gravitational variables in all,
but the number of independent ones will soon be reduced to just 40.) The operator
$\nabla$ will behave as desired if $A_\mu$ transforms as before (Equation (33)), while

(43)

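The commutator of two $\nabla$'s gives useful tensorial quantities

$$[\nabla_\mu, \nabla_\nu] = \tfrac{1}{2}\, F^{ab}{}_{\mu\nu} S_{ab} + R^\rho{}_{\lambda\mu\nu}\, X_\rho{}^\lambda + T^\lambda{}_{\mu\nu}\, \nabla_\lambda,$$

$$\begin{aligned}
F^{ab}{}_{\mu\nu} &= \text{SL(2, C) gauge field strength} \\
&= \partial_\nu A^{ab}_\mu - \partial_\mu A^{ab}_\nu + A^{ac}{}_\mu A_c{}^b{}_\nu - A^{ac}{}_\nu A_c{}^b{}_\mu, \\
R^\rho{}_{\lambda\mu\nu} &= \text{curvature tensor} \\
&= \partial_\mu \Gamma_\nu{}^\rho{}_\lambda - \partial_\nu \Gamma_\mu{}^\rho{}_\lambda + \Gamma_\mu{}^\rho{}_\sigma \Gamma_\nu{}^\sigma{}_\lambda - \Gamma_\nu{}^\rho{}_\sigma \Gamma_\mu{}^\sigma{}_\lambda, \\
T^\lambda{}_{\mu\nu} &= \text{torsion tensor} = \Gamma_\mu{}^\lambda{}_\nu - \Gamma_\nu{}^\lambda{}_\mu.
\end{aligned} \tag{44}$$

The appearance of the torsion tensor is something new, not encountered in the
original Yang-Mills argument; it is due to the fact that operation on a GCT
tensor by $\nabla_\mu$ increases the tensor rank by one covariant vector index, rather than
leave it alone. The transformation laws of these derived objects are given in Table I.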

Table I.

Object                        | Local SL(2, C)                  | GCT
$F^{ab}{}_{\mu\nu}$           | 2nd rank antisymmetric tensor   | 2nd rank antisymmetric covariant tensor
$R^\rho{}_{\lambda\mu\nu}$    | Scalar                          | 4th rank tensor, as indicated; antisymmetric in $\mu$ and $\nu$
$T^\lambda{}_{\mu\nu}$        | Scalar                          | 3rd rank tensor, as indicated; antisymmetric in $\mu$ and $\nu$

The vierbein $h_a{}^\mu$, $h^a{}_\mu$ is able to convert GCT tensor behaviour into corresponding local SL(2, C) behaviour and vice versa (except that SL(2, C) spinors cannot
be converted into any GCT tensorial objects). This is similar to the use, in the
metric theory, of $g_{\mu\nu}$ and $g^{\mu\nu}$ to pass between contravariant and covariant tensors.
In the same spirit and to reduce the number of independent variables, we may
postulate that the vierbein must be covariantly constant:

$$\nabla_\mu h^a{}_\nu = 0. \tag{45}$$

This will ensure that the application of $\nabla_\mu$ commutes with passage back and forth
between GCT and SL(2, C) tensor behaviours.
Two immediate consequences of the postulate (45) are:

$$\nabla_\mu g_{\nu\lambda} \equiv \bar{\partial}_\mu g_{\nu\lambda} = 0, \qquad F^{ab}{}_{\mu\nu} = h^a{}_\rho\, h^{b\lambda}\, R^\rho{}_{\lambda\mu\nu}. \tag{46}$$

Thus, the metric is covariantly constant (this does not mean that $\Gamma_\mu{}^\rho{}_\lambda$ is the
Christoffel connection!); and the SL(2, C) gauge field strength is essentially the
same as the curvature tensor of the background spacetime. In addition, one finds
that the spin connection $A_\mu$ and the torsion $T$ are two aspects of the same thing in
the sense that the vierbein plus the spin connection determine torsion, and
equally well the vierbein plus the torsion determine the spin connection. The
relevant equations, which are linear inhomogeneous in each direction, are:

$$T^\lambda{}_{\mu\nu} = h_a{}^\lambda\left(\partial_\mu h^a{}_\nu + h^b{}_\nu A^a{}_{b\mu} - (\mu \leftrightarrow \nu)\right), \tag{47a}$$

$$A_{ab\mu} = \tfrac{1}{2}\, h_a{}^\lambda h_b{}^\nu \left[\left\{h^c{}_\lambda(\partial_\nu h_{c\mu} - \partial_\mu h_{c\nu}) + h^c{}_\mu\, \partial_\nu h_{c\lambda} - (\lambda \leftrightarrow \nu)\right\} + T_{\lambda\mu\nu} - T_{\nu\mu\lambda} - T_{\mu\lambda\nu}\right]. \tag{47b}$$

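The independent gravitational field variables can thus be taken to be the $h_a{}^\mu$ and
$A^{ab}_\mu$, or the $h_a{}^\mu$ and $T^\lambda{}_{\mu\nu}$: in either case, there are 40 variables in all. With $h$ and
$A$ independent, we have

$$\Gamma_\mu{}^\lambda{}_\nu = h_a{}^\lambda\, D_\mu(A)\, h^a{}_\nu, \qquad T^\lambda{}_{\mu\nu} = \Gamma_\mu{}^\lambda{}_\nu - \Gamma_\nu{}^\lambda{}_\mu \quad (\text{same as (47a)}). \tag{48}$$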
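The way the affine connection follows from the covariant constancy of the vierbein can be sketched as follows. The sign with which $\Gamma$ acts on a covariant GCT index, and the action of $A_\mu$ on a local Lorentz vector index as $A^a{}_{b\mu}$, are conventions assumed for this illustration.

```latex
% Covariant constancy of the vierbein, with the assumed index conventions:
\nabla_\mu h^a{}_\nu
  \;=\; \partial_\mu h^a{}_\nu + A^a{}_{b\mu}\, h^b{}_\nu - \Gamma_\mu{}^{\lambda}{}_{\nu}\, h^a{}_\lambda \;=\; 0 .

% Contract with h_a{}^{\sigma} and use h_a{}^{\sigma} h^a{}_{\lambda} = \delta^{\sigma}_{\lambda}:
\Gamma_\mu{}^{\sigma}{}_{\nu}
  \;=\; h_a{}^{\sigma}\bigl(\partial_\mu h^a{}_\nu + A^a{}_{b\mu}\, h^b{}_\nu\bigr)
  \;=\; h_a{}^{\sigma}\, D_\mu(A)\, h^a{}_\nu .

% Antisymmetrizing in (\mu,\nu) gives the torsion:
T^{\sigma}{}_{\mu\nu}
  \;=\; \Gamma_\mu{}^{\sigma}{}_{\nu} - \Gamma_\nu{}^{\sigma}{}_{\mu}
  \;=\; h_a{}^{\sigma}\bigl(\partial_\mu h^a{}_\nu + h^b{}_\nu A^a{}_{b\mu} - (\mu\leftrightarrow\nu)\bigr) .
```

Counting components gives the same bookkeeping as in the text: the 64 components of $\Gamma$ are fixed by the 16 vierbein components and the 24 components of the spin connection, leaving 40 independent variables.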

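With $h$ and $T$ independent, we have

$$\begin{aligned}
\Gamma_\mu{}^\lambda{}_\nu &= \Gamma^{(0)}{}_\mu{}^\lambda{}_\nu + \tfrac{1}{2}\left(T^\lambda{}_{\mu\nu} - T_{\mu\nu}{}^\lambda - T_{\nu\mu}{}^\lambda\right), \\
\Gamma^{(0)} &= \text{symmetric Christoffel connection from } g_{\mu\nu}, \\
A^{ab}_\mu &= h^{a\nu}\, \bar{\partial}_\mu(\Gamma)\, h^b{}_\nu \quad (\text{same as (47b)}).
\end{aligned} \tag{49}$$

5. The Gravitational Action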

We will be quite brief in discussing possible choices for the action for the
gravitational variables. If we wish to be as close as possible to the usual metric
theory, enlarging it to the minimum essential extent so as to encompass spinor
matter, we set the torsion $T^\lambda{}_{\mu\nu} = 0$ from the very beginning. Thus, the
independent gravitational variables are the 16 vierbein components $h_a{}^\mu$. These
may be viewed as made up of the 10 metric components $g_{\mu\nu}$, and the six
parameters of local SL(2, C) transformations. Then $\Gamma^{(0)}$, $R^\rho{}_{\lambda\mu\nu}$ and the multiplier
$\Delta(h)$ are all expressible in terms of the metric $g_{\mu\nu}$, but for the spin connection
$A^{ab}_\mu$ the use of $h_a{}^\mu$ is essential.
The simplest gravitational Lagrangian is the usual one, namely the Hilbert-Einstein
expression, which involves only $g_{\mu\nu}$ and its derivatives. (In fact, if $\phi$ has
no spinorial fields at all, one can dispense with $h_a{}^\mu$ and $A^{ab}_\mu$ altogether and recover
the familiar theory.) The total Lagrangian density is then

$$\mathcal{L}_T = \tfrac{1}{2}\sqrt{-g}\, R + \Delta(h)\, \mathcal{L}_M(\phi; \mathcal{D}\phi), \qquad \Delta(h) = \sqrt{-g}, \tag{50}$$

where $R$ is the scalar curvature. The only comment we must make is that when the
Einstein field equations are obtained, the matter energy momentum tensor is, in
general, at first not symmetric, nor is it covariantly conserved; but both these
properties needed for consistency are ensured once the field equations for $\phi$ are
taken into account.
Another possibility is to allow the torsion to be nonzero. It is then more
convenient to work with the 40 $h$'s and $A$'s as the independent gravitational
variables. For the gravitational action density, one has available several basic
tensorial building blocks: $h_a{}^\mu$, $T^\lambda{}_{\mu\nu}$ and $R^\rho{}_{\lambda\mu\nu}$. Whatever choice one makes, one
expects the field equations corresponding to varying $h_a{}^\mu$ to involve ultimately the
energy momentum tensor of matter. But there will be other field equations too,
when we vary $A^{ab}_\mu$. The matter contribution to these is via the matter spin density
defined as

$$\frac{\delta}{\delta A^{ab}_\mu}\, \Delta(h)\, \mathcal{L}_M(\phi; \mathcal{D}\phi) \equiv \frac{\partial}{\partial A^{ab}_\mu}\, \Delta(h)\, \mathcal{L}_M(\phi; \mathcal{D}\phi) = -S^\mu_{ab}. \tag{51}$$

The simplest action density for the gravitational field is linear in the curvature
tensor or equally well in the SL(2, C) field strength. This is a possibility that is not
available in ordinary Yang-Mills theory. One can choose, following Kibble,

$$\mathcal{L}_G = \tfrac{1}{2}\, \Delta(h)\, R, \qquad R = h_a{}^\mu h_b{}^\nu F^{ab}{}_{\mu\nu} = g^{\lambda\nu} R^\rho{}_{\lambda\rho\nu}. \tag{52}$$

The field equations obtained by varying $A^{ab}_\mu$ then turn out to be basically
algebraic rather than differential equations. It turns out that the torsion does not
propagate; and torsion can be nonzero only in those regions of spacetime where
the spin density is also nonzero.
Of course, other choices for $\mathcal{L}_G$ are possible, for example something quadratic
in $F$ or $R$, more like the choice made in ordinary Yang-Mills theory. These have
been studied in the literature but we do not pursue them here.
If both an internal symmetry and the Lorentz group are being gauged, then one
must combine the methods developed to handle each kind of gauging. Three
categories of field quantities then emerge: matter, internal symmetry gauge fields,
and gravitational fields, with characteristic behaviours under the total gauge
group.

6. Translational Gauge Potentials

In conclusion, we briefly remark on the following point: we noted that the
homogeneous and the translational parts of the Poincare group seemed to
behave rather differently from one another under the gauging process. Is it
possible, to some extent, to treat them in a more uniform manner? An attempt can
be made along the following lines.
Let us begin by treating the Poincare group $\mathcal{P}$ as though it were just an internal
symmetry group, not acting on spacetime coordinates at all. In place of the
generators $T_a$ for a true internal symmetry group $G$, we now have generators
$L_{ab}$ (really the same as $S_{ab}$) and $\pi_a$ obeying the commutation relations

$$[\pi_a, \pi_b] = 0. \tag{53}$$

In place of the previous gauge potential $A_\mu = A^a_\mu T_a$, we now have

$$\mathbf{A}_\mu = \tfrac{1}{2}\, A^{ab}_\mu L_{ab} - h^a{}_\mu\, \pi_a, \tag{54}$$

with both spin-connection and vierbein introduced at one stroke. The Poincare
covariant derivative $\mathbf{D}_\mu$ is defined by

$$\mathbf{D}_\mu = \partial_\mu + \mathbf{A}_\mu \tag{55}$$

and the SL(2, C)-covariant $D_\mu$ in Equation (31) is only a part of $\mathbf{D}_\mu$. The
commutator of $\mathbf{D}_\mu$ with $\mathbf{D}_\nu$ gives

(56)

with $F^{ab}{}_{\mu\nu}$ given as before by Equation (44), while

(57)

This is, in fact, the same expression as one obtains on combining the two lines in
Equation (48), and then converting the contravariant vector index to a local
SL(2, C) vector index. In fact, using the SL(2, C)-covariant part $D_\mu$ of $\mathbf{D}_\mu$,
Equation (57) becomes

(58)

At this stage, if one wishes, one can restrict the local Poincare transformations
to local SL(2, C) rotations alone. Then, of course, $A^{ab}_\mu$ will transform in linear
inhomogeneous fashion, as a gauge potential should, via Equation (33); while
$h^a{}_\mu$ will undergo the homogeneous transformations (36). (Here the GCT
behaviours have been added on.) The possible advantages and difference in
emphasis in this approach are then that:

(i) $A^{ab}_\mu$ and $h^a{}_\mu$ are both initially gauge potentials, though later one may give up
the freedom to perform 'local translations';
(ii) $F^{ab}{}_{\mu\nu}$ is the same as before, but now the torsion is part of the Poincare gauge
field strength, and its expression (58) is obtained directly, not as a result of
exploiting $\nabla h = 0$ as done previously;
(iii) the affine connection $\Gamma$ can now be taken as being defined by Equation
(48); then $\nabla h = 0$ will follow, rather than the other way around.
Some Representative References
H. Weyl, Space-Time-Matter, Dover (1952).
W. Pauli, Theory of Relativity, Pergamon (1958).
F. London, Z. Phys. 42, 375 (1927).
C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
R. Utiyama, Phys. Rev. 101, 1597 (1956).
T. W. B. Kibble, J. Math. Phys. 2, 212 (1961).
D. W. Sciama, in Recent Developments in General Relativity, Pergamon (1962).
S. Weinberg, Gravitation and Cosmology, Wiley (1972).
F. Hehl, P. von der Heyde, G. Kerlick, and J. Nester, Rev. Mod. Phys. 48, 393 (1976).
D. Ivanenko and G. Sardanashvily, Phys. Rep. 94, 1 (1983).
Tulsi Dass, Pramana, J. Phys. 23, 433 (1984).

24. Graded Lie Algebras

B. R. SITARAM
Physical Research Laboratory, Navrangpura, Ahmedabad 380009, India

1. Introduction

Graded Lie Algebras (GLAs) were first introduced in Mathematics in 1961 by


Nijenhuis and Richardson as a generalization of the notion of a Lie algebra which
proved to be of use in the theory of deformation of structures. However, it was not
until the advent of the applications in physics, in the guise of supersymmetry and
supergravity, that research into their properties really got a firm start. What we
intend to do in this chapter is to summarize briefly some of the properties of
GLAs which are of importance from both the physicist's and the mathematician's
point of view. Basically, as we know, a Lie algebra consists of objects which have
a well defined 'Lie algebra bracket' defined on them. The basic properties of such
brackets are well known: they are antisymmetric, bilinear, and satisfy Jacobi
identities. Such Lie algebras are suitable for describing quantum field theoretic
operators which are bosonic in nature, i.e. which satisfy commutation relations
among themselves. However, fermionic field theoretic operators satisfy anticommutation rules among themselves. Hence, we have a need for a structure
which admits both kinds of operators, bosonic and fermionic, and both kinds of
'brackets', commutators and anticommutators. Such a structure is precisely the
one provided by GLAs. A GLA, therefore, should consist of two kinds of objects:
bosonic or even generators and fermionic or odd generators which have the
following relations between them.
1st operator | 2nd operator | Bracket        | Result
boson        | boson        | commutator     | boson
boson        | fermion      | commutator     | fermion
fermion      | fermion      | anticommutator | boson

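Technically, we have a vector space $g = g_0 + g_1$ = bosonic part + fermionic
part and two brackets $[\ ,\ ]$ and $\{\ ,\ \}$. If $X_i$ are a basis for $g_0$ and $Y_\alpha$ are a basis for
$g_1$, then we demand that

$$[X_i, X_j] = c^k_{ij} X_k, \qquad c^k_{ij} = -c^k_{ji},$$
$$[X_i, Y_\alpha] = d^\beta_{i\alpha} Y_\beta,$$
$$\{Y_\alpha, Y_\beta\} = e^i_{\alpha\beta} X_i, \qquad e^i_{\alpha\beta} = e^i_{\beta\alpha}.$$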

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 481-485.
© 1989 by Kluwer Academic Publishers.

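In addition to the above, we must have the following Jacobi identities

$$[X_i, [X_j, X_k]] + \text{cyclic} = 0,$$
$$[X_i, [X_j, Y_\alpha]] + \text{cyclic} = 0,$$
$$[Y_\alpha, \{Y_\beta, Y_\gamma\}] + \text{cyclic} = 0,$$
$$[X_i, \{Y_\alpha, Y_\beta\}] + \{Y_\alpha, [Y_\beta, X_i]\} - \{Y_\beta, [X_i, Y_\alpha]\} = 0.$$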

These identities are natural generalizations of the usual Lie algebra identities and
are automatically satisfied whenever [,] and { , } are the usual commutator and
anticommutator brackets.
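2. Examples of Graded Lie Algebras

(1) The Dirac GLA: $g_0 = \{J_0\}$, $g_1 = \{J_1, \ldots, J_n\}$,

$$\{J_i, J_j\} = 2\delta_{ij} J_0, \qquad [J_0, J_i] = 0, \qquad i, j = 1, \ldots, n.$$

(2) The Grassmann GLA: $g = g_1 = \{J_1, \ldots, J_n\}$,

$$\{J_i, J_j\} = 0, \qquad i, j = 1, \ldots, n.$$

(3) The Supersymmetry GLA: $g_0 = \{M_{\mu\nu}, P_\mu\}$ = Poincare algebra,
$g_1 = \{Q_1, Q_2, Q_3, Q_4\}$,

$$[Q_\alpha, M_{\mu\nu}] = i(\sigma_{\mu\nu} Q)_\alpha; \qquad [Q_\alpha, P_\mu] = 0,$$
$$\{Q_\alpha, \bar{Q}_\beta\} = -2(\gamma^\mu)_{\alpha\beta} P_\mu,$$
$$\sigma_{\mu\nu} = \tfrac{1}{4}[\gamma_\mu, \gamma_\nu], \qquad \bar{Q} = Q^\dagger \gamma_0.$$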
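A concrete realization of the Dirac GLA for $n = 3$ is provided by the Pauli matrices, with the $2 \times 2$ identity playing the role of $J_0$. The sketch below checks the defining relations of example (1) in this realization; the choice of the Pauli matrices and the function names are illustrative assumptions, not something taken from the text.

```python
# J_0 is represented by the 2 x 2 identity, J_1..J_3 by the Pauli matrices
# (an illustrative realization for n = 3).
IDENTITY = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]],
         [[0, -1j], [1j, 0]],
         [[1, 0], [0, -1]]]

def matmul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def add(x, y, sign=1):
    """Entry-wise x + sign * y."""
    return [[x[r][c] + sign * y[r][c] for c in range(2)] for r in range(2)]

def anticommutator(x, y):
    return add(matmul(x, y), matmul(y, x))

def commutator(x, y):
    return add(matmul(x, y), matmul(y, x), -1)

def scaled_identity(k):
    return [[k, 0], [0, k]]

def is_dirac_gla():
    """Check {J_i, J_j} = 2 delta_ij J_0 and [J_0, J_i] = 0 for i, j = 1, 2, 3."""
    for i, j_i in enumerate(PAULI):
        if commutator(IDENTITY, j_i) != scaled_identity(0):
            return False
        for j, j_j in enumerate(PAULI):
            expected = scaled_identity(2 if i == j else 0)
            if anticommutator(j_i, j_j) != expected:
                return False
    return True
```

Here the even generator commutes with everything and the odd generators close under the anticommutator onto the even one, which is exactly the bracket pattern (fermion, fermion) → boson of the table in the Introduction.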

To simplify future discussions, we now introduce a mathematical notation which
will allow us to treat all the operators on the same footing. Given a GLA, call an
element of the GLA homogeneous if it is either even or odd. (Note that since
a GLA is a vector space, it makes sense to consider elements which are sums of
even and odd elements: for example, $J_0 + J_1$ is an element of the Dirac GLA;
such an element is not homogeneous.) For homogeneous elements, define an
integer called the degree of that element, denoted by $|X|$, as follows: $|X| = 0$ if
$X$ is even and $|X| = 1$ if $X$ is odd. We then define for all homogeneous elements of
the GLA a bracket called the GLA bracket by

$$[Z_I, Z_J] = Z_I Z_J - (-1)^{|Z_I||Z_J|}\, Z_J Z_I,$$

and extend the bracket by linearity to all pairs of elements of the GLA. We then
easily see that the GLA bracket has the following properties:
(i) $[\ ,\ ]$ is bilinear (by construction),
(ii) for homogeneous elements,

$$[Z_I, Z_J] = -(-1)^{|Z_I||Z_J|}\, [Z_J, Z_I],$$

(iii) for homogeneous elements,

$$|[Z_I, Z_J]| = |Z_I| + |Z_J| \pmod 2,$$

(iv) for homogeneous elements, we have the Jacobi identities

$$(-1)^{|Z_I||Z_K|}\, [Z_I, [Z_J, Z_K]] + \text{cyclic} = 0.$$

We can in fact use these properties to define an abstract GLA: we must have
a vector space, a degree defined for homogeneous elements and a bracket [ , ]
which satisfy the above conditions.
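
Properties (ii) to (iv) can be checked numerically on a concrete realization. The sketch below is only an illustration under stated assumptions: it takes block matrices on a graded space $V = V_0 + V_1$ as the homogeneous elements (diagonal blocks even, off-diagonal blocks odd) and uses the bracket $XY - (-1)^{|X||Y|}YX$. The helper names (`random_homogeneous`, `bracket`, `jacobi_residual`, `check_all`) are hypothetical and chosen for this example; integer entries keep the arithmetic exact.

```python
import random


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, b, coeff):
    """Return a + coeff * b, entry by entry."""
    return [[x + coeff * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def is_zero(m):
    return all(v == 0 for row in m for v in row)


def random_homogeneous(p, q, degree, rng):
    """Random integer matrix on V = V0 + V1 (dimensions p and q).

    Degree 0 fills the diagonal blocks, degree 1 the off-diagonal blocks.
    """
    n = p + q
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            same_block = (i < p) == (j < p)
            if same_block == (degree == 0):
                m[i][j] = rng.randint(-5, 5)
    return m


def bracket(x, dx, y, dy):
    """GLA bracket XY - (-1)^{|X||Y|} YX of two homogeneous elements."""
    sign = (-1) ** (dx * dy)
    return combine(matmul(x, y), matmul(y, x), -sign)


def jacobi_residual(x, dx, y, dy, z, dz):
    """Left-hand side of (iv): (-1)^{|X||Z|}[X,[Y,Z]] + cyclic."""
    def term(a, da, b, db, c, dc):
        inner = bracket(b, db, c, dc)
        outer = bracket(a, da, inner, (db + dc) % 2)
        sign = (-1) ** (da * dc)
        return [[sign * v for v in row] for row in outer]

    t1 = term(x, dx, y, dy, z, dz)
    t2 = term(y, dy, z, dz, x, dx)
    t3 = term(z, dz, x, dx, y, dy)
    return combine(combine(t1, t2, 1), t3, 1)


def check_all(p=2, q=2, seed=0):
    """Check graded antisymmetry and the Jacobi identity for all degrees."""
    rng = random.Random(seed)
    for dx in (0, 1):
        for dy in (0, 1):
            x = random_homogeneous(p, q, dx, rng)
            y = random_homogeneous(p, q, dy, rng)
            # (ii): [X,Y] + (-1)^{|X||Y|} [Y,X] = 0
            sym = combine(bracket(x, dx, y, dy), bracket(y, dy, x, dx),
                          (-1) ** (dx * dy))
            if not is_zero(sym):
                return False
            for dz in (0, 1):
                z = random_homogeneous(p, q, dz, rng)
                if not is_zero(jacobi_residual(x, dx, y, dy, z, dz)):
                    return False
    return True
```

Calling `check_all()` runs through all eight combinations of degrees; a return value of `True` only says that the identities hold for the sampled matrices, not that they have been proved.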

3. Maps of GLAs

If $a$, $b$ are GLAs, then a function $f: a \to b$ is said to be a GLA homomorphism if the following is true:

(i) $f([Z_I, Z_J]) = [f(Z_I), f(Z_J)]$,

(ii) $f$ is linear and $|f(Z)| = |Z|$.
The first condition is just the statement that the GLA structure is preserved, while the second is necessary to ensure that the image of $a$ in $b$ is actually a GLA. If, in addition to being a homomorphism, $f$ as a map of sets is also invertible, then $f$ is said to be an isomorphism of GLAs. Note that the inverse map of an isomorphism is automatically a homomorphism of GLAs. To define the notion of representations, we need to consider a special GLA, the GLA of endomorphisms of a graded vector space. Consider a vector space which has been written as a direct sum of two vector spaces: $V = V_0 + V_1$, with each element of $V$ being expressible uniquely as a sum of an element of $V_0$ and an element of $V_1$. Consider now linear transformations of $V$, i.e. matrices which can act on elements of $V$. Call the set of all such matrices (linear transformations) End($V$). Using the splitting of $V$ into $V_0$ and $V_1$, we can split End($V$) into two parts: (i) $(\text{End}(V))_0$, linear transformations which preserve the splitting, i.e. take elements of $V_0$ into $V_0$ and elements of $V_1$ into $V_1$; these transformations form the even part of our GLA; (ii) $(\text{End}(V))_1$, linear transformations which exchange elements of $V_0$ and $V_1$; these transformations form the odd part of our GLA. Writing

$$A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix} \in \text{End}(V),$$

where the first block row and column refer to $V_0$ and the second to $V_1$, we have

$$A_2 = A_3 = 0 \ \Rightarrow\ A \in (\text{End}\,V)_0, \qquad A_1 = A_4 = 0 \ \Rightarrow\ A \in (\text{End}\,V)_1.$$

It is trivial to construct a GLA out of this set of transformations by using as a GLA bracket the commutator or the anticommutator as the case may be. A representation of a GLA is then just a homomorphism $f: g \to \text{End}(V)$. If a subspace $V' = V'_0 + V'_1$ of $V$ exists such that under the representation, the elements of $V'$ are transformed among themselves, i.e. $V'$ is invariant under the representation, then the representation $f$ is said to be reducible. For such representations, the matrices in image($f$) look as follows:

$$\begin{pmatrix} A_{11} & A_{12} & A_{13} & A_{14} \\ 0 & A_{22} & 0 & A_{24} \\ A_{31} & A_{32} & A_{33} & A_{34} \\ 0 & A_{42} & 0 & A_{44} \end{pmatrix},$$

where the rows and columns are ordered as $V'_0$, the rest of $V_0$, $V'_1$ and the rest of $V_1$.

If, in addition, there is a complementary subspace $V''$ in $V$ which is also invariant under the representation, i.e. if $V = V' + V''$ and both $V'$ and $V''$ are invariant under the representation $f$, then the representation $f$ is said to be completely reducible. For such representations, the matrices in image($f$) look like this:

$$\begin{pmatrix} A_{11} & 0 & A_{13} & 0 \\ 0 & A_{22} & 0 & A_{24} \\ A_{31} & 0 & A_{33} & 0 \\ 0 & A_{42} & 0 & A_{44} \end{pmatrix}.$$

An interesting example of a representation of a GLA is the adjoint representation of the GLA on itself. As we have seen, a GLA is also a vector space and, hence, can take the place of the $V$ used in this section. For each element $Z$ in a GLA $g$, we define a linear transformation on $g$ by

$$\text{ad}(Z)\, Z_I = [Z, Z_I],$$

where $Z_I$ is some basis for $g$. If this has to be a representation, we must have

$$\left[\text{ad}(Z)\,\text{ad}(Z') - (-1)^{|Z||Z'|}\,\text{ad}(Z')\,\text{ad}(Z)\right] Z_I = \text{ad}([Z, Z'])\, Z_I,$$

i.e.

$$[Z, [Z', Z_I]] - (-1)^{|Z||Z'|}[Z', [Z, Z_I]] = [[Z, Z'], Z_I],$$

which is just the Jacobi identity. For example, for the $n = 2$ case of the Dirac GLA, the defining relations give, in the basis $(J_0, J_1, J_2)$ with the columns holding the images of the basis elements,

$$\text{ad}\,J_0 = 0, \qquad \text{ad}\,J_1 = \begin{pmatrix} 0 & 2 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \text{ad}\,J_2 = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

The reducibility of the adjoint representation is a consequence of the fact that the Dirac algebra has nontrivial ideals (cf. infra).
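
As an illustration of the adjoint representation, the following sketch builds the matrices ad($J_a$) for the $n = 2$ Dirac GLA directly from the defining relations $\{J_i, J_j\} = 2\delta_{ij}J_0$, $[J_0, J_i] = 0$, and checks the representation condition stated above. The basis ordering $(J_0, J_1, J_2)$, the column convention and the function names are assumptions made for this example only.

```python
# Basis order: index 0 -> J0 (even), indices 1, 2 -> J1, J2 (odd).
DEGREE = [0, 1, 1]


def bracket_coeffs(a, b):
    """Coefficients of the GLA bracket [Z_a, Z_b] in the basis (J0, J1, J2).

    Only {J_i, J_i} = 2 J0 is nonzero; every bracket involving J0 vanishes.
    """
    coeffs = [0, 0, 0]
    if a >= 1 and a == b:
        coeffs[0] = 2
    return coeffs


def ad(a):
    """Matrix of ad(Z_a); column b holds the coefficients of [Z_a, Z_b]."""
    m = [[0] * 3 for _ in range(3)]
    for b in range(3):
        col = bracket_coeffs(a, b)
        for c in range(3):
            m[c][b] = col[c]
    return m


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def is_representation():
    """Check ad(Z)ad(Z') - (-1)^{|Z||Z'|} ad(Z')ad(Z) = ad([Z, Z'])."""
    for a in range(3):
        for b in range(3):
            sign = (-1) ** (DEGREE[a] * DEGREE[b])
            ab, ba = matmul(ad(a), ad(b)), matmul(ad(b), ad(a))
            lhs = [[ab[i][j] - sign * ba[i][j] for j in range(3)]
                   for i in range(3)]
            f = bracket_coeffs(a, b)
            rhs = [[sum(f[c] * ad(c)[i][j] for c in range(3))
                    for j in range(3)] for i in range(3)]
            if lhs != rhs:
                return False
    return True


def j0_spans_invariant_subspace():
    """The line spanned by J0 is mapped into itself (here: to zero)."""
    return all(ad(a)[c][0] == 0 for a in range(3) for c in (1, 2))
```

In this realization every ad($J_a$) sends $J_0$ to zero, so the line spanned by $J_0$ is an invariant subspace, which is the reducibility referred to in the text.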

4. Classification of GLAs

As in Lie algebra theory, GLAs can be classified into various types. In fact, the
various definitions given below are straightforward generalizations of the
conventional definitions of Lie algebra theory.
Note: Given a GLA $g$ and two subspaces $h_1$ and $h_2$, we use the notation $[h_1, h_2]$ to mean the set of all elements of $g$ obtained by taking an arbitrary element of $h_1$, an arbitrary element of $h_2$ and computing their GLA bracket.
(1) $g$ is said to be Abelian if $[g, g] = 0$, i.e. if the GLA bracket of all elements vanishes. An example of an Abelian GLA is the Grassmann GLA.
(2) A subspace $h$ of $g$ is said to be a subalgebra of $g$ if $h$ is closed under the operation of taking the GLA bracket, i.e. if $[h, h] \subseteq h$. For example, the Dirac GLA with $n$ generators can be considered as a subalgebra of the Dirac GLA with $m$ generators, when $m > n$.
(3) A subalgebra $h$ of $g$ is said to be an ideal of $g$ if $[g, h] \subseteq h$. Note that this is a more stringent condition than the previous case. For an Abelian GLA, all subalgebras are, in fact, ideals. The example discussed in (2) is, in fact, an example of an ideal.
(4) Given the GLA $g$, form the series of subspaces $g^{(n)}$ defined by
$$g^{(0)} = g, \qquad g^{(n)} = [g, g^{(n-1)}].$$
It is straightforward to show that for any $g$, these subspaces are, in fact, ideals. A GLA is said to be nilpotent if for some $n$ (and, therefore, for all $m > n$), $g^{(n)}$ is the trivial GLA. It is trivial to see that all Abelian GLAs are nilpotent.
(5) Given the GLA $g$, form the series of subspaces $g^{(n)}$ defined by
$$g^{(0)} = g, \qquad g^{(n)} = [g^{(n-1)}, g^{(n-1)}].$$
Once again it is easy to show that these subspaces are ideals. A GLA $g$ is said to be solvable if for some $n$, $g^{(n)}$ is trivial. It can be shown that all nilpotent GLAs are solvable, though the converse is not true.
(6) Finally, a GLA is said to be simple if it does not possess any nontrivial ideals and semisimple if it does not possess any Abelian ideals. Note that no semisimple GLA can be solvable; in fact, a characteristic feature of these GLAs is that $[g, g] = g$. A major point of difference between Lie algebra theory and GLA theory lies in the existence of nondegenerate, invariant bilinear forms on semisimple Lie/graded Lie algebras. While the Cartan-Killing form provides us with such a form in Lie algebra theory, no such form exists in the graded case. One consequence of this is that, in contrast to the ungraded case, reducible representations of semisimple GLAs are not necessarily completely reducible.

References
The mathematical theory of GLAs is covered in detail by
M. Scheunert, The Theory of Lie Superalgebras, Springer Lecture Notes in Maths 716 (1979).
Most review articles on supersymmetry also contain brief accounts of GLA theory and examples,
e.g. P. Fayet and S. Ferrara, Phys. Rep. C32, 249 (1977).

25. Supersymmetry and Supergravity


R. K. KAUL
Centre for Theoretical Studies, Indian Institute of Science,
Bangalore 560 012, India

1. Introduction

Despite their different statistics, supersymmetry [1,2] puts bosons and fermions
in the same multiplet. This has opened up the possibilities of some remarkably
elegant ideas in theoretical physics. Perhaps that in itself is good enough
a motivation to explore the possible structure and implications of such
a fundamental symmetry.
Is this symmetry new compared to the symmetries we have been familiar with earlier? In the past there have been attempts to place particles of different spins in one multiplet. Schemes placing pseudoscalar and vector mesons in one multiplet (35-dimensional representation of SU(6)) or spin 1/2 and spin 3/2 baryons in one multiplet (56-dimensional representation of SU(6)) have been known for a long time. This SU(6) scheme, successful in many ways, was essentially nonrelativistic. To have a relativistic scheme, one possible way was to enlarge the symmetry to SU(6, C). But unfortunately SU(6, C), being a noncompact group, has only infinite-dimensional unitary representations. This is not acceptable. Such would be the fate of all spin-containing bosonic symmetries. It is a definite consequence of a no-go theorem of Coleman and Mandula. This theorem can, however, be circumvented if we include anticommuting generators in addition to the commuting generators in our scheme. We shall have occasion to discuss this in some detail later. So, it appears that the natural state of affairs, if we wish to have a relativistic description of spin-containing symmetries, is one where we have symmetries with both bosonic and fermionic generators. This is what supersymmetry is.
Besides elegance, is there anything else that makes supersymmetry so interesting? There is one major property of supersymmetric field theories called 'naturalness'. Ordinary field theories with elementary scalar fields are not natural, in the sense that radiative corrections to the masses of these scalar fields tend to draw these masses towards the highest mass of the theory. Thus, smallness of a scalar mass becomes impossible to maintain. This is precisely the origin of the so-called gauge hierarchy problem in grand unified theories. In particular, in the SU(5) scheme of unification, there are two energy scales: (i) $M_X \sim 10^{15}$ GeV, at which SU(5) breaks down spontaneously to SU(3) x SU(2) x U(1) due to a nonzero vacuum expectation value of an appropriate elementary scalar field, and (ii) $M_W \sim 100$ GeV, where the Salam-Weinberg gauge group SU(2) x U(1)

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 487-522.
1989 by Kluwer Academic Publishers.

breaks down to the electromagnetic U(1) group, again due to the nonzero vacuum value of another elementary scalar field. Choosing the parameters of the Higgs potential appropriately at the tree level, it is always possible to arrange the desired size for these mass scales, the so-called gauge hierarchy, $M_X^2/M_W^2 \sim 10^{26}$. But, unfortunately, quantum corrections upset this arrangement, giving radiative corrections proportional to $M_X^2$ to $M_W^2$. Technically speaking, this is due to the presence of quadratically divergent scalar self-energy graphs with heavy fields going around in the loops, and those logarithmically divergent self-energy graphs which involve dimensionful three-point couplings proportional to the heavy mass. However, supersymmetry does solve this problem [3]. In order to appreciate how supersymmetry could possibly solve this, we recall an elementary observation, namely that a radiative diagram with a closed fermionic loop has a relative minus sign with respect to a diagram with a closed bosonic loop. Hence, troublesome radiative diagrams could be arranged to cancel between the fermionic and bosonic effects. This cancellation would have to be operative at every order of perturbation theory. Hence, we must have a symmetry which relates bosonic and fermionic effects to ensure this. Indeed, it is true that in supersymmetric field theories, both quadratically divergent and large logarithmically divergent (due to large three-point couplings) corrections to smaller masses are absent, except when there is a U(1) factor group involved. In the presence of a U(1) factor group, the harmful effects are absent only if the trace of the U(1) charge matrix is zero.
Naturalness or the resolution of the gauge hierarchy problem is perhaps the
strongest motivation for introducing supersymmetry. It was only after this
realization about six years ago that supersymmetry became such a serious
candidate for a possible new fundamental symmetry of nature.
It may be appropriate here to mention that there does exist an alternative solution to the problem of naturalness. This is the technicolour option, where elementary scalar fields are replaced by fermion-antifermion condensates. There are many phenomenological problems associated with these ideas, which we need not go into here [4].
There is yet another important reason for supersymmetry to be of interest. If we were to extend the global supersymmetry to a local one, we would necessarily get general coordinate invariance. This thus includes gravity automatically. There may be a deep-rooted connection between ordinary interactions and gravity in the form of supergravity, the theory of local supersymmetry. In fact, the recently popular superstring theories are hoped to be the ultimate unified theory of all the interactions.
Having presented a set of motivations for studying supersymmetry, we shall adopt a rather pedagogical presentation of some of the issues involved in this chapter. We shall start by discussing the Coleman-Mandula theorem and its implications. We shall see how fermionic symmetries with spin 1/2 charges can be accommodated with bosonic symmetries and write down the supersymmetry algebra in four dimensions. We shall next construct representations of the supersymmetry algebra on one-particle states and also representations on


fields. Supersymmetrically invariant Lagrangians will also be discussed. Then we shall take up a discussion of spontaneous breakdown of supersymmetry. Two mechanisms for this, Fayet-Iliopoulos and Fayet-O'Raifeartaigh, will be studied. After this, we shall localize the supersymmetry and discuss supergravity in 3 + 1 dimensions. We shall also discuss supergravity theory in higher dimensions, namely in 11 and 10. Ten-dimensional supergravity is relevant for the string theories that have become popular of late.
An incomplete list of various reviews and books that have appeared in this
area has been collected in [5].

2. Coleman-Mandula Theorem and Supersymmetry Algebra [6,7]

The possible symmetries of the S-matrix are very strictly restricted by a theorem due to Coleman and Mandula [6]. Using analyticity of the elastic scattering amplitude as a function of the center-of-mass energy and invariant momentum transfer for a system with a nontrivial S-matrix, these authors have proved that the only possible conserved quantities which transform as tensors under the Lorentz group are: (1) the usual energy-momentum operator $P_\mu$ and the Lorentz generators $M_{\mu\nu}$, (2) arbitrary translationally invariant Lorentz scalar conserved quantum numbers like electric charge, baryon number, etc., and further, (3) for systems with all particles massless, conformal invariance is also allowed [7]. Thus, there is no way of combining the spacetime symmetries with internal symmetries except in the trivial way using the direct product of the Poincaré group (conformal group, if all the particles are massless) and the internal symmetry group. The latter must also be compact and itself a direct product of semi-simple groups with U(1) factors. This means that to the algebra of the Poincaré group

$$[P_\mu, M_{\lambda\rho}] = i(\eta_{\mu\lambda} P_\rho - \eta_{\mu\rho} P_\lambda),$$
$$[M_{\mu\nu}, M_{\lambda\rho}] = i(\eta_{\nu\lambda} M_{\mu\rho} - \eta_{\nu\rho} M_{\mu\lambda} - \eta_{\mu\lambda} M_{\nu\rho} + \eta_{\mu\rho} M_{\nu\lambda}) \qquad (2.1)$$

and the internal symmetry group (we call the generators $I_a$)

$$[I_a, I_b] = i f_{ab}{}^{c} I_c, \qquad (2.2)$$

we add the following commutators,

$$[I_a, P_\mu] = 0, \qquad [I_a, M_{\mu\nu}] = 0, \qquad (2.3)$$

which reflect the direct product structure. In particular, the internal symmetry generators $I_a$ will commute with the two Casimir operators $P^2$ and $W^2$ of the Poincaré group,

$$P^2 = P_\mu P^\mu, \qquad W^2 = W_\mu W^\mu, \qquad W_\mu = \tfrac{1}{2}\epsilon_{\mu\nu\lambda\rho} P^\nu M^{\lambda\rho}. \qquad (2.4)$$


These imply that the masses and spins of all particles in an irreducible multiplet should be the same.
It may be pointed out that the Coleman-Mandula theorem is not valid in 1 + 1 dimensions. The reason is simple enough: there are only two possible values for the scattering angle, 0 and $\pi$, and, hence, there is no analyticity of the amplitude in the scattering angle. This is precisely the reason why exotic bosonic conserved charges are allowed, for example, in sine-Gordon theory in 1 + 1 dimensions.
Now going back to 3 + 1 dimensions, we have realized, as a consequence of the Coleman-Mandula theorem, that no bosonic charges that would change spin are allowed. This leads us to the possible conserved spinorial charges. We may think of a system with a set of conserved charges with spins $\tfrac12$, $\tfrac32$, ... etc. In particular, let
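$N$ charges $Q_{\alpha i}$ ($i = 1, 2, \ldots, N$) transform under the Lorentz group as the $(\tfrac12, 0)$ representation and their adjoints $\bar Q^i_{\dot\alpha}$ transform as $(0, \tfrac12)$.* Since $(\tfrac12, 0) \times (0, \tfrac12) = (\tfrac12, \tfrac12)$, the anticommutator $\{Q_{\alpha i}, \bar Q^j_{\dot\beta}\}$ transforms as $(\tfrac12, \tfrac12)$. There is only one spin-1 conserved quantity for an interacting theory in accordance with the Coleman-Mandula theorem, and that is the energy-momentum vector $P_\mu$. Taking a convenient definition for the structure constants, we may, therefore, write

$$\{Q_{\alpha i}, \bar Q^j_{\dot\beta}\} = 2\delta_i^j (\sigma^\mu)_{\alpha\dot\beta} P_\mu. \qquad (2.5)$$

On the other hand, for spin-$\tfrac32$ charges, which transform as $(\tfrac32, 0)$ under the Lorentz group, the anticommutator of the charges with their adjoints transforms as $(\tfrac32, \tfrac32)$. But the Coleman-Mandula theorem does not allow for any bosonic conserved quantity in an interacting theory with this property. The same would be true with all other tensor-spinorial charges, leaving us with the only possibility of the spin-$\tfrac12$ conserved charges discussed above.

* Our notation will be: Spinors transforming as $(\tfrac12, 0)$ and $(0, \tfrac12)$ under SL(2, C) will be denoted by undotted and dotted indices respectively, $Q_\alpha$ and $\bar Q_{\dot\alpha}$. The two-dimensional Levi-Civita tensors, $\epsilon^{\alpha\beta}$ ($\epsilon^{12} = -\epsilon^{21} = 1$), $\epsilon^{\dot\alpha\dot\beta}$ ($\epsilon^{\dot1\dot2} = -\epsilon^{\dot2\dot1} = -1$) and their inverses $\epsilon_{\alpha\beta}$ ($\epsilon_{12} = -\epsilon_{21} = 1$), $\epsilon_{\dot\alpha\dot\beta}$ ($\epsilon_{\dot1\dot2} = -\epsilon_{\dot2\dot1} = -1$) can be used to raise or lower the indices, $Q^\alpha = \epsilon^{\alpha\beta} Q_\beta$, $\bar Q^{\dot\alpha} = \bar Q_{\dot\beta}\,\epsilon^{\dot\beta\dot\alpha}$. The Lorentz scalar products of two spinors are $\psi\chi = \psi^\alpha\chi_\alpha = -\chi_\alpha\psi^\alpha = \chi\psi$ and $\bar\psi\bar\chi = \bar\psi_{\dot\alpha}\bar\chi^{\dot\alpha} = -\bar\chi^{\dot\alpha}\bar\psi_{\dot\alpha} = \bar\chi\bar\psi$, and complex conjugation is $(\psi^\alpha)^* = \bar\psi^{\dot\alpha}$. The $(\tfrac12, \tfrac12)$ representation of SL(2, C) is related to the vector representation of SO(3, 1) by the Clebsch-Gordan coefficients $(\sigma^\mu)_{\alpha\dot\beta} = (1, \sigma^i)_{\alpha\dot\beta}$ and their conjugates $(\bar\sigma^\mu)^{\dot\alpha\beta} = (1, -\sigma^i)^{\dot\alpha\beta}$, with the property

$$\sigma^\mu \bar\sigma_\nu + \sigma_\nu \bar\sigma^\mu = 2\eta^\mu{}_\nu \quad\text{and}\quad \bar\sigma^\mu \sigma_\nu + \bar\sigma_\nu \sigma^\mu = 2\eta^\mu{}_\nu, \qquad \eta_{\mu\nu} = (1, -1, -1, -1).$$

The Lorentz generators for the spinors $(\tfrac12, 0)$ and $(0, \tfrac12)$ are, respectively, $\sigma_{\mu\nu}/2$ and $-\bar\sigma_{\mu\nu}/2$ with

$$\sigma_{\mu\nu} = \tfrac{i}{2}(\sigma_\mu\bar\sigma_\nu - \sigma_\nu\bar\sigma_\mu) \quad\text{and}\quad \bar\sigma_{\mu\nu} = \tfrac{i}{2}(\bar\sigma_\mu\sigma_\nu - \bar\sigma_\nu\sigma_\mu).$$

The Dirac gamma matrices can be written as

$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix},$$

and a four-component Dirac fermion as $\psi_D = \begin{pmatrix} \chi_\alpha \\ \bar\psi^{\dot\alpha} \end{pmatrix}$ and a Majorana fermion as $\psi_M = \begin{pmatrix} \chi_\alpha \\ \bar\chi^{\dot\alpha} \end{pmatrix}$ in this Weyl basis.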
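In order to see what can be the possible structure of the other anticommutators of the $Q_{\alpha i}$ and $\bar Q^j_{\dot\beta}$, notice that $\{Q_{\alpha i}, Q_{\beta j}\}$ is $(\tfrac12, 0) \times (\tfrac12, 0) = (1, 0) + (0, 0)$ under Lorentz transformations. This anticommutator can thus be $(\sigma^{\mu\nu})_{\alpha\beta} M_{\mu\nu}$ and a scalar. The former is ruled out when we insist upon the validity of the Jacobi identity. Similarly, $\{\bar Q^i_{\dot\alpha}, \bar Q^j_{\dot\beta}\} \sim (0, 1) + (0, 0)$. Again $(\bar\sigma^{\mu\nu})_{\dot\alpha\dot\beta} M_{\mu\nu}$, which transforms as $(0, 1)$, is not allowed by the Jacobi identity, so we are left only with possible scalar charges. Thus, the most general algebra is

$$\{Q_{\alpha i}, Q_{\beta j}\} = 2\epsilon_{\alpha\beta} Z_{ij}, \qquad \{\bar Q^i_{\dot\alpha}, \bar Q^j_{\dot\beta}\} = -2\epsilon_{\dot\alpha\dot\beta} Z^{ij}, \qquad (2.6)$$

where $Z_{ij} = -Z_{ji}$, $Z^{ij} = -Z^{ji}$, called central charges, are some internal symmetry generators. Obviously, $Z_{ij}$, $Z^{ij}$ can exist only for $N \ge 2$.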
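Next, in order to fix the commutator $[P_\mu, Q_{\alpha i}]$, we again notice that, since $(\tfrac12, \tfrac12) \times (\tfrac12, 0) = (1, \tfrac12) + (0, \tfrac12)$, and also since there is no conserved charge with $(1, \tfrac12)$ transformation properties, $[P_\mu, Q_{\alpha i}] = (C_\mu)_{\alpha\dot\beta}\,\bar Q^{\dot\beta}_i$, where the $C_\mu$ are some combinations of the $\sigma_\mu$-matrices. Again, the Jacobi identity can be satisfied only if $C_\mu = 0$. Hence

$$[P_\mu, Q_{\alpha i}] = 0. \qquad (2.7)$$

Also, to find out $[Q_{\alpha i}, M_{\mu\nu}]$, notice that $(\tfrac12, 0) \times (1, 0) = (\tfrac32, 0) + (\tfrac12, 0)$ and again there are no conserved charges with transformation properties $(\tfrac32, 0)$. Thus we are left with $[Q_{\alpha i}, M_{\mu\nu}] = (C_{\mu\nu})_\alpha{}^\beta Q_{\beta i}$. The Jacobi identities can be satisfied if $C_{\mu\nu} = \sigma_{\mu\nu}/2$. Hence

$$[Q_{\alpha i}, M_{\mu\nu}] = \tfrac12 (\sigma_{\mu\nu})_\alpha{}^\beta Q_{\beta i} \qquad (2.8)$$

and similarly

$$[\bar Q^i_{\dot\alpha}, M_{\mu\nu}] = -\tfrac12\, \bar Q^i_{\dot\beta}\,(\bar\sigma_{\mu\nu})^{\dot\beta}{}_{\dot\alpha}.$$

This is what should be expected; after all, $Q_{\alpha i}$ and $\bar Q^i_{\dot\alpha}$ transform as $(\tfrac12, 0)$ and $(0, \tfrac12)$ respectively under the Lorentz group; $\tfrac12\sigma_{\mu\nu}$ and $-\tfrac12\bar\sigma_{\mu\nu}$, respectively, are the generators for these representations.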
With respect to the internal symmetries, generated by the $I_a$, the $Q_{\alpha i}$ form some representation

$$[Q_{\alpha i}, I_a] = (t_a)_i{}^j Q_{\alpha j}. \qquad (2.9)$$

The internal symmetry group has to be compact and, hence, a suitable linear combination of the $Q$'s for which the representation matrices are Hermitian, $t_a^\dagger = t_a$, is always possible. Then, for the adjoint $\bar Q^i_{\dot\alpha}$, we have $[\bar Q^i_{\dot\alpha}, I_a] = -\bar Q^j_{\dot\alpha}(t_a)_j{}^i$, so that the largest possible internal symmetry group which acts nontrivially on $Q_{\alpha i}$ is U($N$). For example, for the simplest supersymmetry, $N = 1$, the internal symmetry group is U(1). The charge that generates this symmetry has been called the R-charge in the literature. The R-symmetry properties of the supercharges are

$$[Q_\alpha, R] = Q_\alpha, \qquad [\bar Q_{\dot\alpha}, R] = -\bar Q_{\dot\alpha}. \qquad (2.10)$$

This U(1) symmetry group is chiral: under parity, $Q \leftrightarrow \bar Q$ and, hence, $R \to -R$.
Thus we have seen that the Coleman-Mandula theorem allows for only spin-$\tfrac12$ spinorial conserved charges, which thereby mix the spins of states by $\tfrac12$ units, and $[Q, W^2] \neq 0$. That is how we can have states with different spins in the same irreducible multiplet. But since $[Q, P^2] = 0$, the O'Raifeartaigh theorem [6], namely that all members of the multiplet have the same mass, still holds.
It should be noted that in the above discussion only Poincaré invariant theories in flat spacetime have been discussed and not those involving gravity. In supergravity theories with highly extended $N$, it is possible to have noncompact internal symmetries.
One immediate and very important consequence of the supersymmetry algebra (2.5) is the fact that the energy operator is always nonnegative. This follows readily if we take the trace in the $\alpha$, $\dot\alpha$ indices for any $i = j$. We see that the energy is a simple sum of squares,

$$\tfrac12 \sum_\alpha \{Q_{\alpha i}, \bar Q^i_{\dot\alpha}\} = \text{tr}(\sigma^\mu P_\mu) = 2P_0 \qquad (2.11)$$

for each $i$. If the vacuum respects the supersymmetry,

$$Q_{\alpha i}|0\rangle = 0, \qquad (2.12)$$

we have $\langle 0|P_0|0\rangle = 0$. The vacuum energy of supersymmetric theories is always zero. On the other hand, if there is a spontaneous violation of the supersymmetry, the supercharge does not annihilate the vacuum,

$$Q_{\alpha i}|0\rangle \neq 0.$$

Then, for spontaneously broken supersymmetric theories, the vacuum energy is always positive. We depict this situation in Figure 1 by drawing the potential profile. In case (a), supersymmetry is unbroken, whereas in case (b) it is spontaneously broken.
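
The positivity argument can be illustrated numerically. The sketch below is an illustration only: it assumes $\sigma^\mu = (1, \sigma^i)$ as in the footnote on notation and takes the four numbers passed in as the lower-index components $P_\mu$. It builds the 2 x 2 matrix $\sigma^\mu P_\mu$ and returns its trace and eigenvalues; the function names are hypothetical.

```python
# sigma^mu = (1, sigma^1, sigma^2, sigma^3), stored as 2x2 nested lists.
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def sigma_dot_p(p_lower):
    """Return the 2x2 matrix sigma^mu P_mu for P_mu = (P0, P1, P2, P3)."""
    return [[sum(SIGMA[mu][a][b] * p_lower[mu] for mu in range(4))
             for b in range(2)] for a in range(2)]


def trace(m):
    return m[0][0] + m[1][1]


def eigenvalues(m):
    """Eigenvalues of a Hermitian 2x2 matrix, in increasing order."""
    t = complex(trace(m)).real
    d = complex(m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = max(t * t / 4.0 - d, 0.0) ** 0.5
    return (t / 2.0 - disc, t / 2.0 + disc)
```

For a particle at rest, `sigma_dot_p((M, 0, 0, 0))` is $M$ times the unit matrix, and for a light-like $P_\mu = E(1, 0, 0, 1)$ it is diag($2E$, 0). In both cases the trace is $2P_0$ and no eigenvalue is negative, which is what the sum-of-squares argument requires.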
3. Representation of the Supersymmetry Algebra on One-Particle States [8]

3.1. Massive Case

It is always possible to choose a frame of reference so that the one-particle state is at rest, $P_\mu = (M, 0, 0, 0)$. The little group of a time-like vector being O(3), the little algebra of the graded algebra is simply the one generated by the angular momentum $J$ and the supercharges $Q_{\alpha i}$. In the rest frame the algebra of the $Q$'s, (2.5) and (2.6), is given by (no central charges assumed)

$$\{Q_{\alpha i}, Q_{\beta j}\} = \{\bar Q^i_{\dot\alpha}, \bar Q^j_{\dot\beta}\} = 0, \qquad \{Q_{\alpha i}, \bar Q^j_{\dot\beta}\} = 2M\delta_i^j\delta_{\alpha\dot\beta}. \qquad (3.1)$$

[Fig. 1. Potential profile for (a) unbroken and (b) spontaneously broken supersymmetry.]

Notice that $\bar Q^i_{\dot\alpha}/\sqrt{2M}$ and $Q_{\alpha i}/\sqrt{2M}$ satisfy the algebra of $2N$ creation and annihilation operators and, hence, can be used to build up a Fock space with a positive metric. A massive one-particle state $|M, J, J_z\rangle$, characterized by the mass $M$, total spin $J$ and spin projection along the z-axis, $J_z$, may be taken as the Clifford vacuum upon which the Fock space states are to be built. This Clifford vacuum is annihilated by the $Q_{\alpha i}$. Successive application of the $\bar Q^i_{\dot\alpha}$ generates the various states:

$$Q_{\alpha i}|M, J, J_z\rangle = 0,$$

$$|M, J, J_z; \{n_1^i, n_2^i\}\rangle = \prod_i \left(\frac{\bar Q^i_{\dot1}}{\sqrt{2M}}\right)^{n_1^i} \left(\frac{\bar Q^i_{\dot2}}{\sqrt{2M}}\right)^{n_2^i} |M, J, J_z\rangle, \qquad (3.2)$$

where, since $(\bar Q^i_{\dot1})^2 = (\bar Q^i_{\dot2})^2 = 0$, we have $n_1^i = 0, 1$ and $n_2^i = 0, 1$ only. One of the two $\bar Q_{\dot\alpha}$'s raises the spin by $\tfrac12$ unit and the other lowers it by $\tfrac12$ unit,

$$\bar Q^i_{\dot1}|\ldots j \ldots\rangle = |\ldots j + \tfrac12 \ldots\rangle, \qquad \bar Q^i_{\dot2}|\ldots j \ldots\rangle = |\ldots j - \tfrac12 \ldots\rangle,$$

so that the $(2J_{\max} + 1)$ states with maximum spin, $J_{\max} = J + N/2$, are given by

(3.3)

The minimum spin is zero if the Clifford ground state spin $J \le N/2$, otherwise it is $J - N/2$; the corresponding state is obtained by applying all the $\bar Q^i_{\dot\alpha}/\sqrt{2M}$ creation operators on the ground state.
The number of ground states is $2J + 1$, corresponding to $J_z = -J, -J + 1, \ldots, J - 1, J$. The representations are irreducible only for $J = 0$. Also, renormalizability allows massive matter with spin not exceeding $\tfrac12$. This, using $J_{\max} = J + N/2 = N/2$ for $J = 0$, yields $N = 1$ as the only supersymmetry for a renormalizable description of massive matter. If we allow for central charges in the algebra, it is possible to relax this restriction to include the possibility of $N = 2$ besides $N = 1$, but no more than that.
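
The counting in this construction can be sketched as follows. The code is an illustration under stated assumptions: there are no central charges, each of the $N$ operators of the first kind raises $J_z$ by $\tfrac12$ and each of the second kind lowers it by $\tfrac12$, and the spin decomposition is done only for a spin-0 Clifford vacuum, where the $J_z$ shift is the full $J_z$. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def jz_shift_counts(n_susy):
    """Count Fock states by the shift in J_z relative to the Clifford vacuum.

    Each of the 2N creation operators is either applied once or not at all.
    The first N entries raise J_z by 1/2, the last N lower it by 1/2.
    """
    half = Fraction(1, 2)
    counts = Counter()
    for occ in product((0, 1), repeat=2 * n_susy):
        raising = sum(occ[:n_susy])
        lowering = sum(occ[n_susy:])
        counts[(raising - lowering) * half] += 1
    return counts


def spin_content_for_scalar_vacuum(n_susy):
    """Spins and multiplicities when the Clifford vacuum has J = 0.

    A spin-s multiplet contributes one state at each J_z = s, s-1, ..., so
    the number of spin-s multiplets is count(J_z = s) - count(J_z = s + 1).
    """
    counts = jz_shift_counts(n_susy)
    content = {}
    for m in sorted(k for k in counts if k >= 0):
        mult = counts[m] - counts.get(m + 1, 0)
        if mult:
            content[m] = mult
    return content
```

For $N = 1$ this gives two spin-0 states and one spin-$\tfrac12$ doublet, the content listed in example (i) below, and the largest spin reached from a spin-0 vacuum is $N/2$, which is the origin of the bound on $N$ quoted above.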
Now let us give a few specific examples of massive one-particle representations of the supersymmetry algebra for $N = 1$. Here, a representation of the supersymmetry algebra is the direct sum of four irreducible representations of the Poincaré group,

$$|M, J, J_z; n_1, n_2\rangle = \left(\frac{\bar Q_{\dot1}}{\sqrt{2M}}\right)^{n_1} \left(\frac{\bar Q_{\dot2}}{\sqrt{2M}}\right)^{n_2} |M, J, J_z\rangle,$$

with $(n_1, n_2) = (0, 0), (1, 0), (0, 1), (1, 1)$. The corresponding spin and parity assignments for these irreducible representations of the Poincaré group are

$(n_1, n_2) = (0, 0)$: $(M, J)^{i\eta}$, \quad $(1, 0)$: $(M, J - \tfrac12)^{\eta}$, \quad $(0, 1)$: $(M, J + \tfrac12)^{-\eta}$, \quad $(1, 1)$: $(M, J)^{-i\eta}$,

where the chirality assignment is $\eta = \pm i$ for integer $J$ and $\eta = \pm 1$ for half-integer $J$. As an example, let us consider the following particular values of $J$ and $\eta$:

(i) (a) $J = 0$, $\eta = -i$. We have only three representations of the Poincaré group in this supermultiplet:
$(M, 0)^{+1}$ scalar, \quad $(M, 0)^{-1}$ pseudoscalar, \quad $(M, \tfrac12)^{i}$ left-handed spinor.

(b) $J = 0$, $\eta = i$. Here again we have only three states:
$(M, 0)^{-1}$ pseudoscalar, \quad $(M, 0)^{+1}$ scalar, \quad $(M, \tfrac12)^{-i}$ right-handed spinor.

(ii) $J = \tfrac12$, $\eta = \pm 1$:
$(M, 0)^{\pm 1}$ (scalar/pseudoscalar), \quad $(M, \tfrac12)^{\pm i}$ (L/R spinor), \quad $(M, \tfrac12)^{\mp i}$ (R/L spinor), \quad $(M, 1)^{\mp 1}$ (pseudovector/vector).

(iii) $J = 1$, $\eta = \pm i$:
$(M, \tfrac12)^{\pm i}$ (L/R spinor), \quad $(M, 1)^{\mp 1}$ (pseudovector/vector), \quad $(M, 1)^{\pm 1}$ (vector/pseudovector), \quad $(M, \tfrac32)^{\mp i}$ (R/L spinor).

(iv) $J = \tfrac32$, $\eta = \pm 1$:
$(M, 1)^{\pm 1}$ (vector/pseudovector), \quad $(M, \tfrac32)^{\pm i}$ (L/R spinor), \quad $(M, \tfrac32)^{\mp i}$ (R/L spinor), \quad $(M, 2)^{\mp 1}$ (pseudotensor/tensor).

3.2. Massless Case

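A massless one-particle state can always be cast into the light-cone frame, so that it moves along the z-axis,

$$P_\mu = E(1, 0, 0, 1), \qquad P^\mu P_\mu = 0.$$

Besides the energy $E$, we will also need to specify the helicity $\lambda = W_0/E = (\mathbf{M}\cdot\mathbf{P})/E$ (where $M_i = \tfrac12\epsilon_{ijk}M_{jk}$), which gives the projection of the spin along the direction of motion. Since

$$[\mathbf{M}\cdot\mathbf{P}, Q_{\alpha i}] = -\tfrac12(\boldsymbol{\sigma}\cdot\mathbf{P})_\alpha{}^\beta Q_{\beta i},$$

we have

$$W_0\, Q_{\alpha i}|E, \lambda\rangle = -\tfrac12(\sigma_z P_z)_\alpha{}^\beta Q_{\beta i}|E, \lambda\rangle + Q_{\alpha i} W_0|E, \lambda\rangle = E\left(-\tfrac12\sigma_z + \lambda\right)_\alpha{}^\beta Q_{\beta i}|E, \lambda\rangle, \qquad (3.4)$$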

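which implies that $Q_{1i}$ and $Q_{2i}$ lower and raise the helicity by $\tfrac12$ unit, respectively. Similarly, $\bar Q^i_{\dot1}$ and $\bar Q^i_{\dot2}$ raise and lower the helicity by $\tfrac12$ unit. But from the supersymmetry algebra in the light-cone frame,

$$\{Q_{\alpha i}, Q_{\beta j}\} = 0, \qquad \{\bar Q^i_{\dot\alpha}, \bar Q^j_{\dot\beta}\} = 0,$$
$$\{Q_{2i}, \bar Q^j_{\dot2}\} = 0, \qquad \{Q_{1i}, \bar Q^j_{\dot1}\} = 4\delta_i^j E, \qquad (3.5)$$

we have $Q_{2i} = 0$ and, hence, we are left with only the $N$-dimensional Clifford algebra of $Q_{1i}$ and $\bar Q^i_{\dot1}$. From the Clifford ground state $|E, \lambda\rangle$, $Q_{1i}|E, \lambda\rangle = 0$, we can generate the various states by successive application of the adjoint generators $\bar Q^i_{\dot1}/\sqrt{4E}$:

$$\frac{\bar Q^i_{\dot1}}{\sqrt{4E}}|E, \lambda\rangle = |E, \lambda + \tfrac12; i\rangle, \qquad \frac{\bar Q^i_{\dot1}}{\sqrt{4E}}\frac{\bar Q^j_{\dot1}}{\sqrt{4E}}|E, \lambda\rangle = |E, \lambda + 1; i, j\rangle, \ \ldots \qquad (3.6)$$

Notice that these states have to be totally antisymmetric in the labels $i, j, k, \ldots$. There are in all $2^N$ states, with multiplicities

$$\binom{N}{0}, \binom{N}{1}, \ldots, \binom{N}{k}, \ldots, \binom{N}{N}$$

corresponding to helicities

$$\lambda, \ \lambda + \tfrac12, \ \ldots, \ \lambda + \tfrac{k}{2}, \ \ldots, \ \lambda + \tfrac{N}{2},$$

respectively. In particular, for $N = 1$ we have two states,

$$|E, \lambda\rangle \quad\text{and}\quad \frac{\bar Q_{\dot1}}{\sqrt{4E}}|E, \lambda\rangle = |E, \lambda + \tfrac12\rangle. \qquad (3.7)$$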

Since a Lorentz invariant theory must have CPT symmetry, to these we have to add the CPT conjugate states with helicity $-\lambda$ and $-(\lambda + \tfrac12)$. For example, for $\lambda = 0$, we have the states $|E, 0\rangle$ and $|E, \tfrac12\rangle$, and for $\lambda = -\tfrac12$ we have the CPT conjugate states $|E, -\tfrac12\rangle$ and $|E, 0\rangle$. Hence, for $N = 1$, the CPT respecting multiplet contains two helicity zero states and one each of the helicity $\pm\tfrac12$ states, i.e. a scalar, a pseudoscalar and a Majorana spin-$\tfrac12$ fermion.
Another example for $N = 1$ supersymmetry has, for $\lambda = \tfrac12$, two states $|E, \tfrac12\rangle$ and $|E, 1\rangle$ and the CPT conjugate states given by $\lambda = -1$, $|E, -1\rangle$ and $|E, -\tfrac12\rangle$. This is the photon multiplet containing two helicity states of a photon, $\lambda = \pm1$, and two helicity states of a spin-$\tfrac12$ Majorana fermion, $\lambda = \pm\tfrac12$. For $\lambda = \tfrac32$, we have the two states $|E, \tfrac32\rangle$ and $|E, 2\rangle$ and their CPT conjugate states given by $\lambda = -2$, $|E, -2\rangle$ and $|E, -\tfrac32\rangle$. These together make the interacting graviton multiplet.

We may wonder what is the possible number of supersymmetries that we can have. If we are interested in massless renormalizable field theories only, so that we should not have states with spin $\ge \tfrac32$, we are restricted to $N \le 4$. Also, since spin $\tfrac52$ cannot be coupled to gravity in a consistent manner, we cannot allow $N > 8$ for consistent supergravity theories.
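
The helicity counting and the resulting bounds on $N$ can be sketched as follows. The code is an illustration under stated assumptions: multiplicities are taken to be the binomial coefficients implied by total antisymmetry in the labels, CPT conjugates are added only when the multiplet is not already self-conjugate, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import comb


def helicity_multiplet(n_susy, lam, add_cpt=True):
    """Helicities and multiplicities built on a Clifford vacuum |E, lam>.

    Applying k of the N raising operators gives helicity lam + k/2 with
    multiplicity C(N, k). CPT conjugates are added if the set of states
    is not already symmetric under helicity -> -helicity.
    """
    half = Fraction(1, 2)
    lam = Fraction(lam)
    states = Counter()
    for k in range(n_susy + 1):
        states[lam + k * half] += comb(n_susy, k)
    original = dict(states)
    self_conjugate = all(original.get(-h, 0) == m for h, m in original.items())
    if add_cpt and not self_conjugate:
        for h, mult in original.items():
            states[-h] += mult
    return dict(sorted(states.items()))


def smallest_top_helicity(n_susy):
    """Smallest possible value of the largest |helicity| in a multiplet."""
    half = Fraction(1, 2)
    candidates = [-k * half for k in range(n_susy + 1)]
    return min(max(abs(lam), abs(lam + n_susy * half)) for lam in candidates)
```

With these assumptions, `helicity_multiplet(1, 0)` reproduces the scalar multiplet and `helicity_multiplet(1, Fraction(1, 2))` the photon multiplet described above. `smallest_top_helicity(N)` stays at or below 1 only for $N \le 4$ and at or below 2 only for $N \le 8$, in line with the bounds quoted in the text.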

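4. Representations of the Supersymmetry Algebra on Fields and Invariant Lagrangians [9]

Here we shall restrict ourselves to the $N = 1$ case only. An infinitesimal supersymmetry variation of a field $\Phi$ may be written as

(4.1)

where we have introduced the anticommuting parameter $\epsilon^\alpha$ and its conjugate $\bar\epsilon^{\dot\alpha} \equiv (\epsilon^\alpha)^*$. These anticommute with every fermionic object and commute with every bosonic object.
A commutator of two variations, $\delta(\epsilon_1)$ and $\delta(\epsilon_2)$, as dictated by the supersymmetry algebra,

$$[\epsilon_1^\alpha Q_\alpha, \bar Q_{\dot\beta}\bar\epsilon_2^{\dot\beta}] = 2\epsilon_1^\alpha(\sigma^\mu)_{\alpha\dot\beta}\bar\epsilon_2^{\dot\beta} P_\mu, \qquad (4.2)$$

can be written as

$$[\delta(\epsilon_1), \delta(\epsilon_2)] = -2i(\epsilon_1\sigma^\mu\bar\epsilon_2 - \epsilon_2\sigma^\mu\bar\epsilon_1)\partial_\mu. \qquad (4.3)$$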

4.1. Chiral Supermultiplet

Let us now generate a representation of this algebra, starting with a complex scalar field $\varphi$. Notice from the algebra above that the dimension of the supersymmetry generator $Q_\alpha$ or $\bar Q_{\dot\alpha}$ is $\tfrac12$. Therefore, $Q_\alpha$ transforms fields of dimension $j$ into those of dimension $j + \tfrac12$ or derivatives of fields of dimension $j - \tfrac12$. In view of this, let us define a spinor $\psi$ (of dimension $\tfrac32$) by the following variation of the scalar field $\varphi(x)$:

$$\delta(\epsilon)\varphi(x) = \sqrt2\,\epsilon\psi.$$

Notice that we have assumed $[\varphi, \bar Q_{\dot\alpha}] = 0$ here. It is this constraint that defines a chiral supermultiplet. The name chiral follows from the fact that only one type of index, $\dot\alpha$, is involved in the constraint. The factor $\sqrt2$ in the equation above is only for convenience.
In order to fix the variation of the spinor field, we realize that $\delta(\epsilon)\psi$ should be of dimension $\tfrac32 + \tfrac12 = 2$, so $\psi$ can transform into a derivative of $\varphi(x)$ and a scalar field $F$ of dimension 2:

$$\delta(\epsilon)\psi = \sqrt2\, i\sigma^\mu\bar\epsilon\,\partial_\mu\varphi + \sqrt2\,\epsilon F.$$

The coefficient of the $\partial_\mu\varphi$ term is fixed so as to ensure that the commutator of two variations $\delta(\epsilon_1)$ and $\delta(\epsilon_2)$ of $\varphi$ closes in the manner dictated by the supersymmetry algebra above.
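Next, we have to find the transformation properties of $F$. To do so we evaluate the two variations on the spinor field,

$$\delta(\epsilon_1)\delta(\epsilon_2)\psi_\alpha = \sqrt2\, i\left[(\sigma^\mu\bar\epsilon_2)_\alpha\,\partial_\mu(\sqrt2\,\epsilon_1^\beta\psi_\beta)\right] + \sqrt2\,\epsilon_{2\alpha}\,\delta(\epsilon_1)F.$$

By Fierz rearrangement,

$$(\sigma^\mu\bar\epsilon_2)_\alpha\,\partial_\mu(\epsilon_1\psi) = -\tfrac12(\sigma^\mu\bar\sigma^\nu\partial_\mu\psi)_\alpha(\epsilon_1\sigma_\nu\bar\epsilon_2) = -(\partial^\mu\psi)_\alpha(\epsilon_1\sigma_\mu\bar\epsilon_2) + \tfrac12(\sigma^\nu\bar\sigma^\mu\partial_\mu\psi)_\alpha(\epsilon_1\sigma_\nu\bar\epsilon_2),$$

so that

$$[\delta(\epsilon_1), \delta(\epsilon_2)]\psi_\alpha = -2i(\epsilon_1\sigma^\mu\bar\epsilon_2 - \epsilon_2\sigma^\mu\bar\epsilon_1)\partial_\mu\psi_\alpha + i(\sigma^\nu\bar\sigma^\mu\partial_\mu\psi)_\alpha(\epsilon_1\sigma_\nu\bar\epsilon_2 - \epsilon_2\sigma_\nu\bar\epsilon_1) + \sqrt2\left(\epsilon_{2\alpha}\,\delta(\epsilon_1)F - \epsilon_{1\alpha}\,\delta(\epsilon_2)F\right).$$

In order that this exhibits the closure, the last two terms should cancel. Thus

$$\delta(\epsilon)F = i\sqrt2\,\bar\epsilon\bar\sigma^\mu\partial_\mu\psi = -i\sqrt2\,\partial_\mu\psi\,\sigma^\mu\bar\epsilon.$$

With this transformation property of $F$, it can be easily demonstrated that the commutator of two supersymmetry transformations on $F$ also closes:

$$[\delta(\epsilon_1), \delta(\epsilon_2)]F = -2i(\epsilon_1\sigma^\mu\bar\epsilon_2 - \epsilon_2\sigma^\mu\bar\epsilon_1)\partial_\mu F.$$

Thus, we have obtained a 'chiral' representation of the supersymmetry algebra in terms of the fields $(\varphi, \psi, F)$:

$$\delta(\epsilon)\varphi = \sqrt2\,\epsilon\psi, \qquad \delta(\epsilon)\psi = i\sqrt2\,\sigma^\mu\bar\epsilon\,\partial_\mu\varphi + \sqrt2\,\epsilon F, \qquad \delta(\epsilon)F = i\sqrt2\,\bar\epsilon\bar\sigma^\mu\partial_\mu\psi. \qquad (4.4)$$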

This supermultiplet is also called the Wess-Zumino multiplet.

Instead of imposing the chirality condition $[\varphi, \bar\epsilon\bar Q] = 0$, if we had imposed the 'anti-chirality' condition $[\varphi^*, \epsilon Q] = 0$, we would have obtained an anti-chiral multiplet $(\varphi^*, \bar\psi, F^*)$, with transformation properties

$$\delta(\epsilon)\varphi^* = \sqrt2\,\bar\epsilon\bar\psi, \qquad \delta(\epsilon)\bar\psi = -i\sqrt2\,\epsilon\sigma^\mu\partial_\mu\varphi^* + \sqrt2\,\bar\epsilon F^*, \qquad \delta(\epsilon)F^* = i\sqrt2\,\epsilon\sigma^\mu\partial_\mu\bar\psi. \qquad (4.5)$$

In both these cases, we have four real Bose fields and four real Fermi fields. These are the two smallest multiplets. Further, notice also that the dimensions of the fields $\varphi(\varphi^*)$, $\psi(\bar\psi)$, $F(F^*)$ are $1, \tfrac32, 2$, respectively. Hence, $F$ and $F^*$ cannot be physical

fields, but are what are called auxiliary fields. This also follows from the fact that the equation of motion for the massive Majorana field is

$$i\bar\sigma^\mu\partial_\mu\psi + m\bar\psi = 0, \qquad (4.6)$$

so that the variation of the $F$ field can be written as

$$\delta(\epsilon)F = i\sqrt2\,\bar\epsilon\bar\sigma^\mu\partial_\mu\psi = -\sqrt2\, m\,\bar\epsilon\bar\psi = -m\,\delta(\epsilon)\varphi^*, \qquad (4.7)$$

which would be satisfied if $F = -m\varphi^*$. This is the constraint equation expressing the auxiliary field $F$ in terms of the scalar field $\varphi^*$. If we use this constraint, the supersymmetry variations of the physical fields become

$$\delta(\epsilon)\varphi = \sqrt2\,\epsilon\psi, \qquad \delta(\epsilon)\psi = i\sqrt2\,\sigma^\mu\bar\epsilon\,\partial_\mu\varphi - \sqrt2\, m\,\epsilon\varphi^*. \qquad (4.8)$$

But now the algebra would close only on the mass shell, i.e. only when the equations of motion are satisfied. One way of appreciating why the on-shell and off-shell representations are different is as follows. A complex scalar field describes two real scalar particles and a Majorana fermion describes two particle states of spin up and spin down (the antiparticle of a Majorana fermion is itself), so that there is matching of the number of bosonic and fermionic states in the physical (or on-shell) version. On the other hand, in the off-shell version, we have four bosonic degrees of freedom (2 for $\varphi$ and 2 for $F$) and four independent fermionic components of $\psi$ (two complex components). Again matching is respected. Thus, in going from the fields to the states, some dimensions have been lost. However, since supersymmetry must keep the Bose-Fermi equality of the degrees of freedom intact, it does so by means of auxiliary degrees of freedom in the off-shell version, which disappear completely when we go over to the on-shell version.

4.2. Lagrangian

Let us now turn to writing down the Lagrangian for the chiral supermultiplet. We have already noticed that the $F$ field transforms into a spacetime derivative under the supersymmetry transformation. This is always the case for the highest-dimension component of a given multiplet. Since the spacetime integral of such a total derivative is zero for fields falling off sufficiently fast at infinity, such terms are invariant under supersymmetry. This property can be exploited in writing down supersymmetrically invariant Lagrangians.
Consider the following two combinations of the fields and their derivatives:

$$L_0 = i\partial_\mu\bar\psi\bar\sigma^\mu\psi + \varphi^*\Box\varphi + F^*F, \qquad L_m = \varphi F + \varphi^*F^* - \tfrac12\psi\psi - \tfrac12\bar\psi\bar\psi. \tag{4.9}$$

It can easily be verified that these transform into total derivatives under
supersymmetry transformations (4.4) and (4.5). In fact, they form the highest
dimension components of some suitable supermultiplets. Thus, the complete

Supersymmetry and Supergravity


Lagrangian density can be written as

$$L = L_0 + mL_m. \tag{4.10}$$

The field equations implied by this Lagrangian are

$$i\bar\sigma^\mu\partial_\mu\psi + m\bar\psi = 0, \qquad F + m\varphi^* = 0, \qquad \Box\varphi + mF^* = 0. \tag{4.11}$$

Using the constraint equation, $F = -m\varphi^*$, in the last equation, we have the field equation for a scalar field with mass $m$, $\Box\varphi - m^2\varphi = 0$. Thus, this Lagrangian describes a noninteracting system consisting of a Majorana spinor and a complex scalar field, both of mass $m$.
In order to introduce interaction, we may add the following combination of the fields, which again transforms as a total derivative under supersymmetric transformations:

$$L_g = g[(\varphi^2F - \varphi\psi\psi) + \text{h.c.}] + \lambda F + \lambda^*F^*. \tag{4.12}$$

Notice that since $F$ itself transforms like a total derivative, we have added separate terms linear in $F$ and $F^*$. There are other possible combinations of the fields, but of higher dimensions and, hence, for the sake of renormalizability, we may restrict ourselves to the above form only.
The field equations for $F$ for the interacting system $L = L_0 + mL_m + L_g$ are again purely algebraic, with no dynamical content:

$$\frac{\delta L}{\delta F} = F^* + \lambda + m\varphi + g\varphi^2 = 0, \qquad \frac{\delta L}{\delta F^*} = F + \lambda^* + m\varphi^* + g\varphi^{*2} = 0. \tag{4.13}$$

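Using these constraints, the total Lagrangian density takes the form

$$L = i\partial_\mu\bar\psi\bar\sigma^\mu\psi + \varphi^*\Box\varphi - \tfrac12m(\psi\psi + \bar\psi\bar\psi) - g(\varphi\psi\psi + \varphi^*\bar\psi\bar\psi) - V(\varphi, \varphi^*), \tag{4.14}$$

where the potential function takes a simple form,

$$V(\varphi, \varphi^*) = FF^* = |\lambda + m\varphi + g\varphi^2|^2. \tag{4.15}$$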

This is a very general feature of supersymmetric field theories: the potential can always be written as a sum of absolute squares of the auxiliary fields of the system and, therefore, the potential can never be negative, $V \geq 0$. Also, for unbroken supersymmetry the ground state energy is zero and the corresponding value of the auxiliary field is $F = 0$.
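The non-negativity of the potential (4.15), and the fact that it vanishes exactly where the auxiliary field does, can be checked numerically. The sketch below is only an illustration: the function names and the numerical values of $\lambda$, $m$, $g$ are assumptions chosen for the example and do not come from the text.

```python
import cmath

def potential(phi, lam, m, g):
    """V = |lam + m*phi + g*phi**2|**2, the form of Eq. (4.15)."""
    return abs(lam + m * phi + g * phi ** 2) ** 2

def susy_vacua(lam, m, g):
    """Roots of lam + m*phi + g*phi**2 = 0 (needs g != 0); there F = 0 and V = 0."""
    disc = cmath.sqrt(m * m - 4 * g * lam)
    return ((-m + disc) / (2 * g), (-m - disc) / (2 * g))

# Illustrative couplings only (hypothetical values, not taken from the text).
lam, m, g = 0.5, 1.0, 0.3
vacua = susy_vacua(lam, m, g)
samples = [complex(x, y) for x in (-2.0, 0.0, 1.5) for y in (-1.0, 0.0, 2.0)]

# V vanishes (up to rounding) at the two vacua and is non-negative elsewhere.
print([potential(v, lam, m, g) for v in vacua])
print(min(potential(p, lam, m, g) for p in samples))
```

Both vacua have zero energy, in line with the statement that unbroken supersymmetry corresponds to $F = 0$.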
The particular relationship of the various couplings in the Lagrangian above (for example, boson mass = fermion mass, the Yukawa coupling $\lambda_Y = g$ fixing the $\varphi^4$-coupling, etc.) is a reflection of the supersymmetry. These relationships are respected even after renormalization [9]. It so happens that all the linearly and quadratically divergent contributions to the various couplings cancel out exactly between the loops with bosonic and fermionic internal lines. Only a single logarithmically divergent infinity survives. This is absorbed as a common wavefunction renormalization of all the fields here, $\varphi$ and $\psi$.

4.3. Lagrangian with Several Interacting Multiplets

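Now let us write down a general supersymmetric Lagrangian density for a coupled system of $n$ scalar supermultiplets ($\varphi_a$, $\psi_a$, $F_a$), $a = 1, \ldots, n$,

$$L = \sum_a i\partial_\mu\bar\psi_a\bar\sigma^\mu\psi_a + \sum_a\varphi_a^*\Box\varphi_a + \sum_aF_aF_a^* + \Big[\sum_aF_a\frac{\partial W}{\partial\varphi_a} - \frac12\sum_{a,b}\frac{\partial^2W}{\partial\varphi_a\partial\varphi_b}\psi_a\psi_b + \text{h.c.}\Big], \tag{4.16}$$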

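where $W(\varphi_a)$, called the superpotential, is a function of $\varphi_a$ only and not of $\varphi_a^*$ and, for renormalizability, is restricted to be at most cubic in $\varphi_a$:

$$W(\varphi_a) = \sum_{a\le b\le c}g_{abc}\varphi_a\varphi_b\varphi_c + \sum_{a\le b}\frac{m_{ab}}{2}\varphi_a\varphi_b + \sum_a\lambda_a\varphi_a.$$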

The auxiliary fields $F_a$ can be eliminated by their equation of motion

$$F_a^* + \partial W(\varphi)/\partial\varphi_a = 0, \tag{4.17}$$

so that the Lagrangian density simply becomes

(4.18)

For the example of the single field studied above, $W = g\varphi^3/3 + m\varphi^2/2 + \lambda\varphi$.

The supercurrent for this system may be written as
(4.19)

which can be verified to be conserved, $\partial_\mu S^\mu = 0$, using the equations of motion implied by the above Lagrangian.

4.4. Supersymmetric Gauge Theory with Gauge Group U(1) [10]

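It is straightforward to verify that a 'photon' potential $A_\mu$, a Majorana spinor $\lambda$, and a dimension 2 (auxiliary) field $D$ form a supermultiplet under the transformation laws:

$$\delta(\epsilon)A_\mu = i(\bar\epsilon\bar\sigma_\mu\lambda + \epsilon\sigma_\mu\bar\lambda), \qquad \delta(\epsilon)\lambda = i\epsilon D + \frac i2\sigma^{\mu\nu}\epsilon F_{\mu\nu}, \tag{4.20}$$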

where the field strength is $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Care must be taken in verifying that the algebra closes, because, although the algebra on $\lambda$ and $D$ does close, for $A_\mu$ we have
(4.21)

The extra term $+2i(\epsilon_1\sigma^\nu\bar\epsilon_2 - \epsilon_2\sigma^\nu\bar\epsilon_1)\partial_\mu A_\nu$ is nothing but a field-dependent gauge transformation

(4.22)

Thus, the algebra closes only up to a field-dependent gauge transformation. On the other hand, the algebra closes exactly (without any gauge transformations) on the supermultiplet ($F_{\mu\nu}$, $\lambda$, $D$).
It is possible to write down a supermultiplet containing the $A_\mu$ field which closes on its own without having to invoke a field-dependent gauge transformation. Such a multiplet contains, besides ($A_\mu$, $\lambda$, $D$), some additional auxiliary fields: three real scalar fields $C$, $N$, $M$ with dimensions 0, 1, 1, respectively, and one Majorana spinor $\chi$ with dimension 1/2. A Lagrangian for these fields would be invariant under both supersymmetry and gauge symmetry. But it is possible to choose a gauge (the Wess-Zumino gauge) so that all these additional auxiliary fields $C$, $M$, $N$, $\chi$ are rotated away and we are left with only ($A_\mu$, $\lambda$, $D$) discussed above. This gauge can be maintained under supersymmetry transformations only if the additional field-dependent gauge transformations mentioned above are made. Thus, in this gauge, supersymmetry is valid only modulo field-dependent gauge transformations. The Lagrangian density in this gauge can be written as
(4.23)

This changes only by a total derivative under supersymmetry transformations.


This would be the supersymmetric generalization of the pure electromagnetic
field Lagrangian.

4.5. Supersymmetric Yang-Mills-Shaw Theories [11]

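The extension to non-Abelian theories is straightforward. Here we have a supermultiplet ($A^a_\mu$, $\lambda^a$, $D^a$) in the adjoint representation of the group $G$ with transformation properties

$$\delta(\epsilon)A^a_\mu = i(\bar\epsilon\bar\sigma_\mu\lambda^a + \epsilon\sigma_\mu\bar\lambda^a), \qquad \delta(\epsilon)\lambda^a = i\epsilon D^a + \frac i2\sigma^{\mu\nu}\epsilon F^a_{\mu\nu}. \tag{4.24}$$

Here

$$(\mathcal{D}_\mu\lambda)^a = \partial_\mu\lambda^a - gf^{abc}A^b_\mu\lambda^c$$

is the gauge covariant derivative of $\lambda^a$ and the field strength is

$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - gf^{abc}A^b_\mu A^c_\nu,$$

where $f^{abc}$ are the structure constants for the Lie algebra of the group. The supersymmetric (i.e. changing only by total derivatives) Lagrangian density is given by

$$L = \tfrac12D^aD^a - \tfrac14F^a_{\mu\nu}F^{a\mu\nu} - i\bar\lambda^a\bar\sigma^\mu(\mathcal{D}_\mu\lambda)^a. \tag{4.25}$$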

This is the supersymmetric generalization of the pure Yang-Mills-Shaw Lagrangian.

5. Spontaneous Breakdown of Supersymmetry
Since Nature does not appear to exhibit supersymmetry, at least not at the energy
scales of present-day physics, supersymmetry must be broken - but, of course,
such a breaking should not spoil some of the desirable features of supersymmetry,
in particular the naturalness of supersymmetric theories [3]. Spontaneous
breakdown is one such mechanism.
By spontaneous breakdown of the supersymmetry, we mean that the ground state is not annihilated by the supercharges,

$$Q_\alpha|0\rangle \neq 0, \tag{5.1}$$

which, in turn, implies that the vacuum energy of the system is nonzero, in fact positive definite. Since $Q_\alpha|0\rangle$ should be a fermionic state, we may call it $|\psi_\alpha\rangle$ and, hence, write

(5.2)
where $S_{\mu\alpha}$ is the conserved supercurrent with $Q_\alpha$ as the charge, $Q_\alpha = \int d^3x\,S_{0\alpha}$. Thus, the supercurrent creates a fermion out of the vacuum with coupling $f$ in the same fashion as the current corresponding to a spontaneously broken bosonic symmetry creates a (Nambu-Goldstone) boson from the vacuum. Like the Nambu-Goldstone boson, this fermion, sometimes called the goldstino, can also be proved to be massless.
Since a nonzero vacuum expectation value of $i[\epsilon Q, \psi] = \delta(\epsilon)\psi$ signals a spontaneous breakdown of supersymmetry, and since the transformation properties of the fermion in a chiral multiplet ($\varphi$, $\psi$, $F$) are as given in Equation (4.4) and of the fermion in an Abelian vector multiplet ($A_\mu$, $\lambda$, $D$) are as given in Equation (4.20), the spontaneous breakdown of supersymmetry can equivalently be characterized by nonzero expectation values of the auxiliary fields $F$ and $D$:

$$\langle0|\delta(\epsilon)\psi|0\rangle = \sqrt2\,\epsilon\langle0|F|0\rangle \neq 0, \qquad \langle0|\delta(\epsilon)\lambda|0\rangle = i\epsilon\langle0|D|0\rangle \neq 0. \tag{5.3}$$

The other terms in $\delta(\epsilon)\psi$ and $\delta(\epsilon)\lambda$ are prevented from having nonzero vacuum expectation values by Lorentz symmetry.
The same conclusion can be arrived at alternatively by recognizing that the potential of a supersymmetric field theory can be written down as the sum of the squares of all the auxiliary fields,

$$V = \frac12\sum D^2 + \sum|F|^2, \tag{5.4}$$

so that for spontaneous breakdown some of the $D$'s or $F$'s have to get a nonzero expectation value, so that $\langle0|V|0\rangle > 0$.
Thus, we have established that supersymmetry is broken when some of the auxiliary fields obtain nonzero vacuum expectation values. The fermionic partner of the auxiliary field that obtains the nonzero expectation value is the massless goldstino of the broken supersymmetry. In a case where more than one auxiliary field gets a nonzero expectation value, it is a certain linear combination of the corresponding fermionic partners that emerges as the massless goldstino. Whether $D$ or $F$ obtains the vacuum expectation value corresponds to the two possible mechanisms of the breakdown. The first one relies on the presence of an Abelian U(1) factor in the gauge group* and is called the Fayet-Iliopoulos mechanism [12]. The second one, which gives the auxiliary field $F$ a nonzero expectation value, involves a set of at least three chiral multiplets with very specific interactions and is called the Fayet-O'Raifeartaigh mechanism [13]. We shall now proceed to give one example each of these two mechanisms at the classical level.

* $D^a$ corresponding to the vector multiplet ($A^a_\mu$, $\lambda^a$, $D^a$) in the adjoint representation of a non-Abelian gauge group cannot be given a vacuum expectation value. This would otherwise break the gauge symmetry.

5.1. Fayet-Iliopoulos Supersymmetry Breaking

Let us consider a supersymmetric model consisting of an Abelian vector multiplet ($A_\mu$, $\lambda$, $D$) and two chiral multiplets ($\varphi_+$, $\psi_+$, $F_+$) and ($\varphi_-$, $\psi_-$, $F_-$). The most general Lagrangian density can be written as the sum of five pieces, each invariant under supersymmetric transformations,

$$L = L_v + L_+ + L_- + L_{+-} + L_\xi,$$

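where

$$\begin{aligned}
L_v &= \tfrac12D^2 - \tfrac14F_{\mu\nu}F^{\mu\nu} - i\lambda\sigma^\mu\partial_\mu\bar\lambda,\\
L_\pm &= iD_\mu\bar\psi_\pm\bar\sigma^\mu\psi_\pm - (D_\mu\varphi_\pm)^*(D^\mu\varphi_\pm) - \frac{igq_\pm}{\sqrt2}(\varphi_\pm\bar\psi_\pm\bar\lambda - \text{h.c.}) + F_\pm^*F_\pm + \tfrac12gq_\pm D\varphi_\pm^*\varphi_\pm,\\
L_{+-} &= m(\varphi_+F_- + \varphi_-F_+ - \psi_+\psi_- + \text{h.c.}),\\
L_\xi &= \xi D,
\end{aligned} \tag{5.5}$$

where

$$D_\mu\varphi_\pm = (\partial_\mu + \tfrac i2q_\pm gA_\mu)\varphi_\pm, \qquad D_\mu\psi_\pm = (\partial_\mu + \tfrac i2gq_\pm A_\mu)\psi_\pm,$$

and $q_\pm/2$ are the U(1) charges of the chiral multiplets ($\varphi_\pm$, $\psi_\pm$, $F_\pm$). The superpotential will be chosen to be $W(\varphi_+, \varphi_-) = m\varphi_+\varphi_-$, so that the potential function reads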
$$-V = \tfrac12D^2 + F_+^*F_+ + F_-^*F_- + \frac{gD}{2}(q_+\varphi_+^*\varphi_+ + q_-\varphi_-^*\varphi_-) + m(\varphi_+F_- + \varphi_-F_+ + \text{h.c.}) + \xi D. \tag{5.6}$$

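Notice that we have added the term $\xi D$, which is both supersymmetric and gauge symmetric on its own. Such a term would not be gauge symmetric for the non-Abelian case.
The auxiliary field equations of motion are

$$D + \xi + \frac g2(q_+\varphi_+^*\varphi_+ + q_-\varphi_-^*\varphi_-) = 0, \qquad F_+ + m\varphi_-^* = 0, \qquad F_- + m\varphi_+^* = 0,$$

so that, as promised earlier, the potential can be written as the sum of the squares of the auxiliary fields,

$$V = \tfrac12D^2 + F_+F_+^* + F_-F_-^*, \tag{5.7}$$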

which, in terms of the fields, may be rewritten as

$$V = \tfrac12\xi^2 + \sum_\pm(m^2 + \tfrac12q_\pm g\xi)\varphi_\pm^*\varphi_\pm + \tfrac18g^2(q_+\varphi_+^*\varphi_+ + q_-\varphi_-^*\varphi_-)^2. \tag{5.8}$$
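The passage from (5.7) to (5.8) is a short substitution. The worked step below assumes the auxiliary field equations implied by (5.6), i.e. $D = -\xi - \tfrac g2S$ and $F_\pm = -m\varphi_\mp^*$, where the shorthand $S$ is introduced here only for brevity and is not a symbol of the original text:

```latex
\begin{aligned}
S &\equiv q_+\varphi_+^*\varphi_+ + q_-\varphi_-^*\varphi_-,\\
V &= \tfrac12 D^2 + |F_+|^2 + |F_-|^2
   = \tfrac12\big(\xi + \tfrac{g}{2}S\big)^2 + m^2\big(|\varphi_+|^2 + |\varphi_-|^2\big)\\
  &= \tfrac12\xi^2 + \tfrac12 g\xi S + \tfrac18 g^2 S^2 + m^2\big(|\varphi_+|^2 + |\varphi_-|^2\big)\\
  &= \tfrac12\xi^2 + \sum_\pm\big(m^2 + \tfrac12 q_\pm g\xi\big)\varphi_\pm^*\varphi_\pm + \tfrac18 g^2 S^2 .
\end{aligned}
```

The cross term $\tfrac12g\xi S$ is what shifts the two scalar masses in opposite directions when $q_+$ and $q_-$ have opposite signs.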

In order to see the ground state of this model, we minimize the potential function with respect to the fields $\varphi_\pm$:

$$\frac{\partial V}{\partial\varphi_\pm^*} = \Big[(m^2 + \tfrac12gq_\pm\xi) + \tfrac14g^2q_\pm(q_+\varphi_+^*\varphi_+ + q_-\varphi_-^*\varphi_-)\Big]\varphi_\pm = 0.$$


There are various cases involved here, depending upon $m^2 \gtrless \tfrac12|q_+|g\xi$ and/or $m^2 \gtrless \tfrac12|q_-|g\xi$ and $q_\pm > 0$ or $< 0$, which yield spontaneous breakdown of the U(1) gauge symmetry or otherwise. But in all cases, the supersymmetry is broken because the auxiliary field $D$ obtains a nonzero vacuum expectation value. For example, let us take $q_+ > 0$, $q_- < 0$ and consider two cases: (i) $m^2 > \tfrac12|q_-|g\xi$ and (ii) $m^2 < \tfrac12|q_-|g\xi$.
Case (i). Since $m^2 > \tfrac12|q_-|g\xi$, here we have, from the minimization condition on $V(\varphi_+, \varphi_-)$, $\langle\varphi_\pm\rangle = 0$ and, hence, $\langle D\rangle = -\xi$, $\langle F_\pm\rangle = 0$ and $\langle V\rangle \neq 0$. Here, only $D$ gets a nonzero expectation value; the other auxiliary fields $F_\pm$ do not. The spectrum consists of two complex scalars with masses $m^2_{\varphi_\pm} = m^2 + \tfrac12q_\pm g\xi$ and two chiral fermions $\psi_+$, $\psi_-$, which make a Dirac spinor of mass $m$. The gauge field $A_\mu$ and its partner $\lambda$ stay massless, so that the U(1) gauge symmetry is unbroken. The Majorana fermion $\lambda$ is the massless goldstino of this model, because its supersymmetric variation $\delta(\epsilon)\lambda = i[\epsilon Q, \lambda] = i\epsilon D + \frac i2\sigma^{\mu\nu}\epsilon F_{\mu\nu}$ takes a nonzero expectation value:

$$\langle i[\epsilon Q, \lambda]\rangle = -i\epsilon\xi = -i\epsilon[\langle2V\rangle]^{1/2}. \tag{5.9}$$

It is interesting to notice that the boson and fermion masses of this model are not equal but are related as

$$2m^2_{\varphi_+} + 2m^2_{\varphi_-} - 4m^2_\psi = [4m^2 + g\xi(q_+ + q_-)] - 4m^2 = (q_+ + q_-)g\xi, \tag{5.10}$$

where we have weighted each of the complex boson's mass squared by the number of degrees of freedom (2) and the Dirac fermion mass by its number of degrees of freedom (4).
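The weighted mass sum of case (i) is easy to check numerically. The following is a minimal sketch, assuming the case (i) spectrum described above (scalar masses $m^2 + \tfrac12q_\pm g\xi$, a Dirac fermion of mass $m$, massless gauge field and goldstino); the function names and parameter values are illustrative and not from the text.

```python
def scalar_mass_sq(m, q, g, xi):
    """Scalar mass squared m^2 + q*g*xi/2 in the vacuum <phi_+-> = 0."""
    return m * m + 0.5 * q * g * xi

def supertrace_case_i(m, q_plus, q_minus, g, xi):
    """Bosonic minus fermionic mass squares, weighted by degrees of freedom."""
    if m * m <= 0.5 * abs(q_minus) * g * xi:
        raise ValueError("case (i) requires m^2 > |q_-| g xi / 2")
    # Two complex scalars, two real degrees of freedom each.
    bosons = 2 * scalar_mass_sq(m, q_plus, g, xi) + 2 * scalar_mass_sq(m, q_minus, g, xi)
    # One Dirac fermion of mass m with four degrees of freedom; the gauge
    # field and the goldstino are massless and do not contribute.
    fermions = 4 * m * m
    return bosons - fermions

# Hypothetical parameter values chosen to satisfy the case (i) condition.
m, q_plus, q_minus, g, xi = 2.0, 1.0, -0.5, 0.8, 3.0
print(supertrace_case_i(m, q_plus, q_minus, g, xi), (q_plus + q_minus) * g * xi)
# With opposite charges the weighted sum cancels.
print(supertrace_case_i(m, 1.0, -1.0, g, xi))
```

The first line prints the two sides of the mass relation for generic charges; the second shows the cancellation when $q_+ + q_- = 0$.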
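Case (ii). Now let us consider the other case, where we have $m^2 < \tfrac12|q_-|g\xi$. The minimum of the potential corresponds not to $\langle\varphi_\pm\rangle = 0$ but to $\langle\varphi_+\rangle = 0$, $\langle\varphi_-\rangle = \vartheta$, where $\vartheta$ is given by

$$\vartheta^2 = \frac{4}{q_-^2g^2}(\tfrac12|q_-|g\xi - m^2) > 0. \tag{5.11}$$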

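This implies

$$\langle D\rangle = -\xi + \tfrac12|q_-|g\vartheta^2 = -\frac{2m^2}{|q_-|g}, \qquad \langle F_+\rangle = -m\vartheta, \qquad \langle F_-\rangle = 0. \tag{5.12}$$

If we rename $\varphi_+$ by $\varphi$, and $\varphi' \equiv \varphi_- - \vartheta$, the potential takes the form

$$V = \frac{2m^2}{q_-^2g^2}(|q_-|g\xi - m^2) + m^2\Big(1 - \frac{q_+}{q_-}\Big)\varphi^*\varphi + \frac12\Big(\frac{q_-^2g^2\vartheta^2}{2}\Big)\Big[\frac{\varphi' + \varphi'^*}{\sqrt2}\Big]^2 + \text{higher order terms}. \tag{5.13}$$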

Here, both gauge symmetry and the supersymmetry are broken and

$$\langle V\rangle = \frac{2m^2}{q_-^2g^2}(|q_-|g\xi - m^2) = \frac{2m^2M^2}{q_-^2g^2} > 0, \qquad M^2 = m^2 + \tfrac12q_-^2g^2\vartheta^2. \tag{5.14}$$

The spectrum of this model consists of the following particles: (a) A massive complex scalar field $\varphi \equiv \varphi_+$ with mass $m_\varphi^2 = m^2(1 - q_+/q_-)$; (b) A massive real scalar field $(\varphi' + \varphi'^*)/\sqrt2$ with mass $m'^2 = g^2q_-^2\vartheta^2/2$; (c) The Goldstone boson $(\varphi' - \varphi'^*)/\sqrt2$, which becomes the longitudinal component of $A_\mu$, so that the gauge field gets a mass $m_A^2 = g^2q_-^2\vartheta^2/2$; (d) The fermion mass terms have the form $-m(\psi_+\psi_- + \bar\psi_+\bar\psi_-) + i(q_-g\vartheta/\sqrt2)(\psi_-\lambda - \bar\psi_-\bar\lambda)$, which can be diagonalized as $-M(\psi\chi + \bar\psi\bar\chi)$, where we have renamed $\psi_-$ as $\psi$ and $\chi \equiv [m\psi_+ - i(gq_-\vartheta/\sqrt2)\lambda]/M$, so that $\psi$ and $\chi$ form a Dirac fermion of mass $M$, with $M^2 = m^2 + g^2q_-^2\vartheta^2/2$; (e) The other combination of $\psi_+$ and $\lambda$, namely, $\tilde\chi = [m\lambda - i(gq_-\vartheta/\sqrt2)\psi_+]/M$, is the celebrated massless goldstino, whose supersymmetric variation has a nonzero expectation value:

$$\langle i[\epsilon Q, \tilde\chi]\rangle = \frac{i\epsilon m}{M}\langle D\rangle - \frac{igq_-\vartheta}{M}\epsilon\langle F_+\rangle = \frac{2i\epsilon mM}{q_-g} = -i\epsilon[\langle2V\rangle]^{1/2}. \tag{5.15}$$

Here again, the sum of the bosonic mass squares weighted with their number of degrees of freedom is not equal to the fermionic mass squares weighted with their number of degrees of freedom. In fact, the difference of these two is given by

$$\Big[2m^2(1 - q_+/q_-) + 4\cdot\tfrac12g^2q_-^2\vartheta^2\Big] - 4\Big[m^2 + \tfrac12g^2q_-^2\vartheta^2\Big] = 2m^2[(q_- + q_+)/|q_-|]. \tag{5.16}$$
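The case (ii) vacuum values and the weighted mass difference can be cross-checked with a few lines of arithmetic. This is a sketch under the assumptions $q_+ > 0 > q_-$ and $m^2 < \tfrac12|q_-|g\xi$ stated above; the function name, dictionary keys and the sample numbers are hypothetical.

```python
def case_ii_vacuum(m, q_plus, q_minus, g, xi):
    """Vacuum data and weighted mass difference for case (ii)."""
    if not (q_plus > 0 > q_minus and m * m < 0.5 * abs(q_minus) * g * xi):
        raise ValueError("case (ii) requires q_+ > 0 > q_- and m^2 < |q_-| g xi / 2")
    # <phi_-> = theta, with theta^2 fixed by the minimization condition.
    theta_sq = 4.0 * (0.5 * abs(q_minus) * g * xi - m * m) / (q_minus ** 2 * g ** 2)
    gauge_mass_sq = 0.5 * g ** 2 * q_minus ** 2 * theta_sq   # also the real scalar mass^2
    dirac_mass_sq = m * m + gauge_mass_sq                     # M^2
    d_vev = -xi + 0.5 * abs(q_minus) * g * theta_sq           # <D>
    v_vev = 0.5 * d_vev ** 2 + m * m * theta_sq               # <V> = D^2/2 + |F_+|^2
    # Complex scalar (2), real scalar (1), massive vector (3) versus Dirac fermion (4);
    # the goldstino is massless and does not contribute.
    bosons = 2 * m * m * (1 - q_plus / q_minus) + gauge_mass_sq + 3 * gauge_mass_sq
    fermions = 4 * dirac_mass_sq
    return {"theta_sq": theta_sq, "D": d_vev, "V": v_vev,
            "M_sq": dirac_mass_sq, "supertrace": bosons - fermions}

# Hypothetical parameter values satisfying the case (ii) condition.
m, q_plus, q_minus, g, xi = 1.0, 1.0, -2.0, 1.0, 3.0
vac = case_ii_vacuum(m, q_plus, q_minus, g, xi)
print(vac)
print(2 * m * m * (q_minus + q_plus) / abs(q_minus))  # closed form for the difference
```

For these numbers the computed difference and the closed form agree, and $\langle V\rangle$ matches $2m^2M^2/(q_-^2g^2)$.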

These two cases, (5.10) and (5.16), are in fact particular instances of a general tree-level mass formula for a supersymmetric theory with a gauge group U(1) and any number of chiral multiplets [14]:

$$\sum_J(-1)^{2J}(2J+1)m_J^2 = -2\,\mathrm{tr}\,Q\,\langle D\rangle, \tag{5.17}$$

where $J$ is the spin of the particle and tr $Q$ is the trace of the charge matrix of the chiral supermultiplets. From this formula, it is clear that when supersymmetry is unbroken, $\langle D\rangle = 0$, there is an exact boson-fermion cancellation in the trace of the mass-square matrix. But when supersymmetry is spontaneously broken by $\langle D\rangle \neq 0$, this cancellation is obtained only when tr $Q = 0$. In particular, this is the case in the examples studied above when $(q_+ + q_-) = 0$. This property (tr $Q = 0$) is essential to maintain the naturalness of these theories at the quantum level, because the naturalness-violating effects turn out to be proportional to S tr $m^2$ (the supertrace of $m^2$ is the left-hand side of Equations (5.10), (5.16) and (5.17)).


Having discussed the Fayet-Iliopoulos symmetry breaking, let us now turn to the second mechanism, due to Fayet and O'Raifeartaigh.

5.2. Fayet-O'Raifeartaigh Pure F-type Supersymmetry Breaking

In this mechanism, only the auxiliary fields of the chiral multiplets ($F$'s) obtain nonzero vacuum expectation values. We shall also see that we need at least three scalar supermultiplets to implement this type of breaking.
We have already written down a supersymmetric Lagrangian with a number of interacting chiral supermultiplets ($\varphi_a$, $\psi_a$, $F_a$) in Section 4.3. The potential energy for such a model can be written as

$$V = \sum_aF_aF_a^*, \tag{5.18}$$

where $F_a^* = -(\partial W/\partial\varphi_a)$. From this, it is clear that if supersymmetry has to stay unbroken, i.e. $\langle V\rangle = 0$, all $\langle F_a\rangle = 0$. But it is possible that a consistent set of solutions of the equations $\langle\partial W/\partial\varphi_a\rangle = 0$ may not exist. If that happens, then $\langle V\rangle$ would definitely be nonzero, in fact positive, and supersymmetry would break spontaneously. As an example of this situation, consider three chiral supermultiplets containing $\varphi_1 \equiv X$, $\varphi_2 \equiv Y$, $\varphi_3 \equiv Z$ as their scalar components, and let the superpotential be given by

$$W = \lambda X(Z^2 - M^2) + gYZ. \tag{5.19}$$

From this, the constraint equations for the auxiliary fields $F_X$, $F_Y$, $F_Z$ are

$$F_X^* = -\frac{\partial W}{\partial X} = -\lambda(Z^2 - M^2), \qquad F_Y^* = -\frac{\partial W}{\partial Y} = -gZ, \qquad F_Z^* = -\frac{\partial W}{\partial Z} = -gY - 2\lambda XZ, \tag{5.20}$$

and the potential function is given by

$$V(X, Y, Z) = |F_X|^2 + |F_Y|^2 + |F_Z|^2 = \lambda^2|Z^2 - M^2|^2 + g^2|Z|^2 + |gY + 2\lambda XZ|^2. \tag{5.21}$$

Now, if we try to solve the three equations $F_X = 0$, $F_Y = 0$, $F_Z = 0$, we see that no consistent solution exists. For example, $F_Y = 0$ implies $Z^* = 0$ and $F_X = 0$ implies $Z^* = M$. These two ground-state values are mutually incompatible. Hence, some of the $F$'s have to be nonzero in the ground state and supersymmetry has to be spontaneously violated. To see what the ground state would be, we minimize the potential function given above. This potential is a sum of three non-negative terms. We minimize the first two terms and the last term separately without loss of generality. Taking the derivatives ($\partial/\partial Z$, $\partial/\partial Z^*$) of the first two terms, their minimum is given by

$$[2\lambda^2ZZ^* + (g^2 - 2\lambda^2M^2)](Z + Z^*) = 0, \tag{5.22}$$

and the last term in (5.21) is minimum if

$$gY + 2\lambda XZ = 0. \tag{5.23}$$
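The stationarity condition (5.22) follows from a short computation, sketched here with $f$ (a name introduced only for this step) denoting the first two terms of (5.21):

```latex
\begin{aligned}
f &= \lambda^2 (Z^2 - M^2)(Z^{*2} - M^2) + g^2 Z Z^*,\\
\partial f/\partial Z &= 2\lambda^2 Z (Z^{*2} - M^2) + g^2 Z^* = 0,\\
\partial f/\partial Z^* &= 2\lambda^2 Z^* (Z^2 - M^2) + g^2 Z = 0,\\
\text{sum:}\quad & [2\lambda^2 ZZ^* + (g^2 - 2\lambda^2M^2)](Z + Z^*) = 0,\\
\text{difference:}\quad & [2\lambda^2 ZZ^* + 2\lambda^2M^2 + g^2](Z - Z^*) = 0 .
\end{aligned}
```

The bracket in the difference is strictly positive, so $Z$ is real at a stationary point, and the sum then gives either $Z = 0$ or $Z^2 = M^2 - g^2/(2\lambda^2)$.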

Now we have two different cases for various values of $g^2$, $\lambda$ and $M$.
Case (i). If $2\lambda^2M^2 < g^2$, the minimum of the potential is given by $Z = 0$, which further implies $Y = 0$, and the ground-state value of $X$ is undetermined. Hence, we have $\langle F_X\rangle = \lambda M^2$, $\langle F_Y\rangle = 0$, $\langle F_Z\rangle = 0$ and $\langle V\rangle = \lambda^2M^4$. The fermionic partner $\psi_X$ of the auxiliary field $F_X$, which picks up a nonzero vacuum value, can be seen to be massless. This fermion is the goldstino.
Case (ii). If $2\lambda^2M^2 > g^2$, the minimum of the potential is given by

$$Z = [M^2 - g^2/(2\lambda^2)]^{1/2} \quad\text{and}\quad Y = -(2\lambda X/g)[M^2 - g^2/(2\lambda^2)]^{1/2}.$$

These imply the following ground-state values for the auxiliary fields,

$$\langle F_X\rangle = \frac{g^2}{2\lambda}, \qquad \langle F_Y\rangle = -g[M^2 - g^2/(2\lambda^2)]^{1/2}, \qquad \langle F_Z\rangle = 0$$

and

$$\langle V\rangle = g^2[M^2 - g^2/(4\lambda^2)].$$

The goldstino here is a linear combination of $\psi_X$ and $\psi_Y$, the superpartners of $F_X$, $F_Y$, which get nonzero vacuum expectation values.
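The two vacua can be compared against a brute-force scan of the potential. The sketch below assumes real $Z$ and sets the last term of (5.21) to zero through (5.23), so only $\lambda^2(Z^2 - M^2)^2 + g^2Z^2$ is minimized; the function names, grid and parameter values are illustrative choices, not part of the original discussion.

```python
import math

def potential(z, lam, g, M):
    """First two terms of (5.21) for real Z, with gY + 2*lam*X*Z = 0 imposed."""
    return lam ** 2 * (z * z - M * M) ** 2 + g * g * z * z

def vacuum(lam, g, M):
    """Analytic minimum (Z, V) in the two cases discussed in the text."""
    if 2 * lam ** 2 * M ** 2 <= g ** 2:
        return 0.0, lam ** 2 * M ** 4                      # case (i)
    z = math.sqrt(M ** 2 - g ** 2 / (2 * lam ** 2))        # case (ii)
    return z, g ** 2 * (M ** 2 - g ** 2 / (4 * lam ** 2))

def scan_minimum(lam, g, M, z_max=3.0, steps=6001):
    """Smallest value of the potential on a uniform grid of real Z."""
    zs = [-z_max + 2 * z_max * i / (steps - 1) for i in range(steps)]
    return min(potential(z, lam, g, M) for z in zs)

# Hypothetical couplings: the first set falls in case (i), the second in case (ii).
for lam, g, M in [(0.5, 1.0, 1.0), (1.0, 1.0, 1.0)]:
    z, v = vacuum(lam, g, M)
    print(z, v, scan_minimum(lam, g, M))
```

In both cases the vacuum energy is strictly positive, which is the signal of spontaneously broken supersymmetry.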
In both cases, the boson and fermion masses do not match exactly, but the mass formula S tr $m^2 = 0$ is still valid. As mentioned earlier, it is this nice property that is responsible for the absence of quadratically divergent mass corrections at the one-loop level [3] and, hence, as in unbroken theories, spontaneously broken supersymmetric theories still exhibit the naturalness property. In fact, one can do something more than this. There are some special types of explicit symmetry breaking interactions which may also be allowed from this point of view. These are called 'soft-breaking' terms and have been listed in [15] as

$$m^2\varphi^*\varphi, \qquad m^2(\varphi^2 + \varphi^{*2}), \qquad \mu(\varphi^3 + \varphi^{*3}), \qquad \mu\lambda\lambda$$

for a complex scalar field $\varphi$ and a gauge fermion $\lambda$. Notice that terms like $m\psi\psi$ ($\psi$ a chiral fermion) are 'hard'.

6. Pure N = 1 Supergravity in Four Dimensions

So far, we have studied only the global $N = 1$ supersymmetry. If the supersymmetry transformation parameter is made spacetime dependent, we obtain local supersymmetry. Local supersymmetry necessarily implies supergravity. To see this, recall that the commutator of two global supersymmetry transformations is a translation. For localized supersymmetry, we would instead obtain a general coordinate transformation. Thus, gravity has to be included with local supersymmetry invariance. In fact, the gravitino, a spin 3/2 Majorana superpartner of the graviton, is the gauge field of local supersymmetry.

In this section, we shall use four-component notation for the spinors instead of the two-component notation that we have used so far. We shall also take all the Dirac $\Gamma$-matrices to be Hermitian and the time coordinate to be imaginary, $x_4 = ix_0$, so that the spacetime metric has the same signature in all directions. The $\Gamma$-matrices satisfy the Clifford algebra

$$\{\Gamma^a, \Gamma^b\} = 2\delta^{ab}. \tag{6.1}$$

In four dimensions the $\Gamma$'s are $4\times4$ and, more generally, in $D$ dimensions, they are $2^{[D/2]}\times2^{[D/2]}$ matrices. Further, we shall also use the completely antisymmetric matrices

$$\Gamma^{a_1\ldots a_n} = \Gamma^{[a_1}\Gamma^{a_2}\cdots\Gamma^{a_n]},$$

where antisymmetrization is done with weight one. A Majorana spinor is one which satisfies $\bar\psi = \psi^TC$, where $C$ is the charge conjugation matrix, $C\Gamma^aC^{-1} = -\Gamma^{aT}$. The charge conjugation matrix $C$ is antisymmetric in 2, 3, 4 (mod 8) dimensions, so that $\bar\chi\Gamma^{a_1\ldots a_n}\psi$ is symmetric under the interchange of the Majorana fermions $\chi$ and $\psi$ for $n = 0, 3, 4, 7, 8, \ldots$ and antisymmetric for $n = 1, 2, 5, 6, 9, 10, \ldots$.
We shall reserve the Latin indices $a, b, \ldots$ for the tangent space, whereas the world indices will be denoted by Greek letters $\mu, \nu, \ldots$.
Ordinarily, coupling of a spin 3/2 field $\psi_\mu$ with other matter is plagued by pathologies because the free-field gauge invariance $\psi_\mu \to \psi_\mu + \partial_\mu\epsilon$ of the Rarita-Schwinger action is not preserved by the interactions. This leads to acausal propagation [16]. Further, the canonical commutation relations are inconsistent with the constraint structure of the system [17]. These problems can be avoided if the coupling of the spin 3/2 field with other fields is so arranged as to have the gauge symmetry of the free-field theory. This would, after gauge fixing, allow for the inversion of the wave operator, and the propagator would be $1/p^2$, so that the propagation is causal. In supergravity, this is precisely what is achieved.
The supersymmetric action for pure $N = 1$ supergravity is simply the sum of the Hilbert action for the vierbein field $e^a_\mu$ and the Rarita-Schwinger action for a Majorana spin 3/2 spinor in a curved spacetime background [18]:

$$S[e, \omega, \psi] = \int d^4x\Big[-\frac{1}{2k^2}eR(e, \omega) - \frac12\bar\psi_\mu\Gamma^{\mu\nu\rho}D_\nu(\omega)\psi_\rho\Big]. \tag{6.2}$$

A Majorana spin 3/2 field and a graviton in four dimensions each have two physical degrees of freedom, so that there is Bose-Fermi matching of the physical degrees of freedom. In the action above $k^2 = 4\pi G$, where $G$ is the Newtonian constant. The Lorentz covariant derivative is defined by

$$D_\mu(\omega)\psi_\nu = (\partial_\mu + \tfrac14\omega_{\mu ab}\Gamma^{ab})\psi_\nu, \tag{6.3}$$

where $\Gamma^{ab}/2$ are the representation matrices of the proper Lorentz group for the spinors. Further, the $\Gamma^\mu$ have the vierbein buried in them, $\Gamma^\mu = e_a{}^\mu\Gamma^a$. Also $e \equiv \det(e_{a\mu})$. The curvature is given in terms of the connections $\omega_\mu{}^{ab}$ as

$$R_{\mu\nu}{}^{ab}(\omega) = \partial_\mu\omega_\nu{}^{ab} - \partial_\nu\omega_\mu{}^{ab} + \omega_\mu{}^{ac}\omega_{\nu c}{}^b - \omega_\nu{}^{ac}\omega_{\mu c}{}^b. \tag{6.4}$$

In the variation of the action, the connection $\omega_\mu{}^{ab}$, the vierbein $e_{a\mu}$ and the fermion $\psi_\mu$ are to be taken to be independent fields. The variation with respect to $\omega_\mu{}^{ab}$ yields the constraint that defines $\omega_\mu{}^{ab}$ as a function of $\psi_\mu$ and $e_{a\mu}$ (this is the so-called first-order formalism). This $\omega$-variation can be seen to be
ab
(iW,/b
J d4X~(iW
11

Jd

= -

A
ll
l1 eA e\')S d(W)_
x[!...-(e
k2 debea + le
2 a b d JlA
V

~:r:
pvp r
8'1' Jl
ab'!' P
.f,

bw ab
Jl

after a partial integration and using the identity


DJl(m)(eetaebJ) = -e(edelbe;J

+ tet,e~Jed)SI1).d(m),

where $S_{\mu\lambda}{}^d(\omega)$ is the so-called torsion tensor,

$$S_{\mu\lambda}{}^d(\omega) = D_\mu(\omega)e^d_\lambda - D_\lambda(\omega)e^d_\mu.$$

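Using the fact that $\psi_\mu$ is a Majorana fermion, $\bar\psi_\mu\Gamma^{\mu\nu\rho}\Gamma_{ab}\psi_\rho = -6\bar\psi_\mu e^{[\mu}_ae^\nu_b\Gamma^{\rho]}\psi_\rho$ in four dimensions, we have the $\omega_\mu{}^{ab}$ Euler-Lagrange equation given by

$$S_{\mu\nu}{}^d(\omega) = \frac{k^2}{2}\bar\psi_\mu\Gamma^d\psi_\nu. \tag{6.5}$$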

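In the absence of the 'matter field' $\psi_\mu$, this torsion would have been zero. This equation can be solved for the connection,

$$\omega_\mu{}^{ab}(e, \psi) = \omega_\mu{}^{ab}(e) + K_\mu{}^{ab}, \tag{6.6}$$

where $\omega_\mu{}^{ab}(e)$ is the torsion-free connection, i.e. the solution of $D_\mu(\omega)e^a_\nu - D_\nu(\omega)e^a_\mu = 0$, and $K_\mu{}^{ab}$ is the contorsion tensor,

$$K_{\mu\nu\rho} = \tfrac12(S_{\nu\rho\mu} - S_{\mu\nu\rho} - S_{\rho\mu\nu}) = \frac{k^2}{4}(\bar\psi_\nu\Gamma_\mu\psi_\rho - \bar\psi_\mu\Gamma_\rho\psi_\nu - \bar\psi_\rho\Gamma_\nu\psi_\mu). \tag{6.7}$$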

Before studying the supersymmetric variation of the above action, let us list the
supersymmetry transformations of our fields:

(6.8)
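where $\omega(e, \psi) = \omega(e) + K$. No independent transformation law has to be prescribed for $\omega_\mu{}^{ab}$, since it is a function of $e^a_\mu$ and $\psi_\mu$. In order to write down the supersymmetry algebra, notice

$$[\delta_Q(\epsilon_1), \delta_Q(\epsilon_2)]e^a_\mu = D_\mu(\omega)\xi^a,$$

where $\xi^a = (\tfrac12)\bar\epsilon_2\Gamma^a\epsilon_1$. Further,

$$D_\mu(\omega)\xi^a = D_\mu(\omega)(e^a_\nu\xi^\nu) = \partial_\mu\xi^\nu e^a_\nu + \xi^\nu\partial_\nu e^a_\mu + \xi^\nu\omega_\nu{}^{ab}e_{b\mu} - \xi^\nu S_{\nu\mu}{}^a$$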

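and using the fact

$$S_{\mu\nu}{}^a = \frac{k^2}{2}\bar\psi_\mu\Gamma^a\psi_\nu = -\frac{k^2}{2}\bar\psi_\nu\Gamma^a\psi_\mu,$$

we have

$$[\delta_Q(\epsilon_1), \delta_Q(\epsilon_2)]e^a_\mu = [\delta_{g.c.}(\xi^\nu) + \delta_L(\xi^\nu\omega_\nu{}^{ab}) + \delta_Q(-k\xi^\nu\psi_\nu)]e^a_\mu,$$

where $\delta_{g.c.}(\xi^\mu)$ is a general coordinate transformation by an amount $\xi^\mu$, $\delta_L(\xi^\nu\omega_\nu{}^{ab})$ is a local Lorentz transformation by an amount $\xi^\nu\omega_\nu{}^{ab}$ and $\delta_Q(-k\xi^\nu\psi_\nu)$ is a supersymmetry transformation by an amount $-k\xi^\nu\psi_\nu$. From this we may abstract the supersymmetry algebra as

$$[\delta_Q(\epsilon_1), \delta_Q(\epsilon_2)] = \delta_{g.c.}(\xi^\mu) + \delta_L(\xi^\nu\omega_\nu{}^{ab}) + \delta_Q(-k\xi^\nu\psi_\nu), \tag{6.9}$$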

which can also be verified to hold on the gravitino field $\psi_\mu$ up to equations of motion. In contrast to the case of global supersymmetry, we notice that the commutator of two local supersymmetry transformations gives rise to local Lorentz and supersymmetry transformations in addition to a general coordinate transformation. Further, all these transformations have field-dependent parameters. The fact that the supersymmetry algebra closes on the fermion only up to equations of motion is very much what happens for global supersymmetry with the auxiliary fields eliminated. If we had included an appropriate set of auxiliary fields, we would have had an off-shell closure of the algebra.
Now let us study the supersymmetric variation [19] of the supergravity action (6.2). We shall subject the variation to the constraint $\omega_\mu{}^{ab}(e, \psi) = \omega_\mu{}^{ab}(e) + K_\mu{}^{ab}$. This trick of using the connection equation of motion in the variation is the so-called '1.5' formalism:

bS

f 4x[ :2
f x[ -~blfl/l
(R'"

-1Re''') -

d4

2ee* 1fI/l pvp] DJw(e, I/I))I/Ip }e." +

pvp Dv(w(e, I/I))I/Ip -

~1fI1l pvp D)w(e, I/I))bl/lpJ.

Feed into these the supersymmetry variations (6.8). After a partial integration and use of the identity $[D_\mu(\omega), D_\nu(\omega)] = (\Gamma^{ab}/4)R_{\mu\nu ab}(\omega)$, we obtain

bS =

fd4x[~(Raa
2k

lReUa)ef
./. - ekef[a./.
:r. fllv p ] Dv ./.
2
u'l'a
'II" 'I'll
'1'1'

(6.10)
where we have dropped the total derivative term and made use of an identity
D/w)(ee[~ebe~]) = - e(e~e[~ebe~]

+ te[~e~e~]ed + te[~ebe~]e~)S/l/(w).

The last two terms in $\delta S$ can be easily evaluated by using some $\Gamma$-matrix identities, to be

_("fIlVPfab.l, _.T. fIlVPfab)R


=
16k
'I'p
'I'p
Ilv.ab

--("f .1, )(R aa


2k a'l'a

_l.Re aa )
2

which cancels the first term in the variation (6.10). Then the leftover $\delta S$ has only two terms:
(6.11)
These can now be seen to be zero by a Fierz rearrangement

Xxt/f</>

= -

4I

n=O

cnrpt .. 11" </>t/ffllt ... .IlJ,

(6.12)
Using this formula and the fact that $\psi_\mu$ is Majorana, only terms with $c_1$ and $c_2$ contribute in the Fierz rearrangement of the first term of the leftover $\delta S$ above:
"f[a .1, .T. fl'vp] = [_ C 1 .T. f" .1, "f[a f _ C 2 .T. fap .1, "f[a f ]fIlVP] (6.13)
'I'a'l'l'
4 'I'll 'I'a
a
4 'I'll
'I'a
.p
.
It is easy to see that
f[a faP fllvP] = 0

and

f[a fa fl'vp] = -

2J~a

f IlVP ],

in four dimensions, so that the $c_2$ term is also zero, and the $c_1$ term cancels the second term of the leftover $\delta S$ (6.11) exactly.

7. N = 1, D = 11 Supergravity

The highest number of dimensions in which supersymmetry representations with spin $\le 1$ can exist is 10, and supergravity (with spin $\le 2$) can exist up to $D = 11$ [20]. In the following we shall discuss supergravity theory in 11 dimensions [21], and then discuss how this can be reduced to $N = 1$, 10-dimensional supergravity [22].
Unlike in four dimensions, where the graviton and gravitino (Majorana spin 3/2 fermion) physical degrees of freedom match, in 11 dimensions this matching does not take place and additional degrees of freedom have to be added. In $D$ dimensions, a graviton has $D(D-3)/2$ physical degrees of freedom, which for $D = 11$ is 44. On the other hand, a Majorana spin 3/2 fermion has $(D-3)2^{[D/2]-1}$ degrees of freedom, which for $D = 11$ is 128. The missing 84 bosonic degrees of freedom are supplied by the transverse components of an antisymmetric tensor gauge field of third rank, $A_{\hat\mu\hat\nu\hat\rho}$. The transversality of this tensor gauge field is ensured by a gauge invariance
(7.1 )

where $\Lambda_{\hat\nu\hat\rho}$ is antisymmetric. We have put hats on the indices to distinguish them from 10-dimensional indices, which we shall use later when we obtain 10-dimensional supergravity from 11-dimensional supergravity. This gauge invariance and the requirement that there should be no more than second derivative terms in the Lagrangian lead to the action for this field defined as the square of the four-index field tensor

(7.2)
The $N = 1$ supersymmetry transformation laws for the vielbein $E^{\hat a}_{\hat\mu}$, the Majorana spin 3/2 fermion $\Psi_{\hat\mu}$ and the antisymmetric tensor field $A_{\hat\mu\hat\nu\hat\rho}$ are given by

bE".Il

= -ira'l'.
2
Il'
(7.3)

where npd6 is the supercovariantized connection containing the pure torsion free
connection n(E) and the 'I' p-torsion
~ Il
~ il k 2 d
Dp(n)E, - D,(n)Ep = 2'1'pr '1'"
~

np,p(E, '1')

p-

= np,p(E) + 4('I'pr,'I'p + 'I',rp'l'fJ + 'I',rfJ'I'p)

(7.4)

and ft~/JyS is the supercovariantized field strength,


~

F~/11S

.3.-

= Fd{J1S + 2 'I'~ r /11'1'5'

(7.5)

The supercovariantized fields are those which transform under supersymmetry transformations without derivatives on $\epsilon$.
The invariant Lagrangian is as follows

(n n)

_
_
1
1 - pp8
+
I ~
~fJ"8
L(D-ll)- -2k2ER(E,n)-"2'1'p r Dp - 2 - '1'8- 48 EFp'P8 F P
- J2k E('Ji

384

fJ

rp~P1S,'I' ~ + 12'Ji~rP1'1's)(F + ft) ~P1ti _


(7.6)

where E == det(E:) and the connection n{ld6(E, '1') to be governed by its equation
of motion should be related to supercovariant connection as
A

n pIl6 (E, '1') = ujUJ/i(E, '1')

k2

+ g'l',rp

,p

il6'1'p.

(7.7)

It should be noted that all the four-fermion terms are those contained in the terms
involving supercovariantized fields, and ft.


In order to study the closure of the supersymmetry algebra, we may evaluate


the commutator of two supersymmetry transformations on the various fields just
like in the four-dimensional case. It can be easily verified that
[bQ(ed, bQ(ez)]E~ = [b g . c . (~Ii)
where ~Ii

8a6 =

+ bQ( -

k~Ii'P)1)

+ bL(B" 6)]E~

(!)8zrliel and

CQ va6 +

fik
288 8 (r
2

'fj'
a6' Y

fJ

'8

+ 24E~E1iP

)eJdPJ8.

The same commutator when applied to $A_{\hat\mu\hat\nu\hat\rho}$ gives the same general coordinate and supergravity transformations. ($A_{\hat\mu\hat\nu\hat\rho}$ is unchanged under local Lorentz transformations.) But in addition, there is also a gauge transformation on the right-hand side

b~Z)(A)Appp = -o[pA pp1 , App = {;8 r


2

Vp 1 -

3~8App.t.

On the gravitino $\Psi_\mu$ also the algebra closes, but once again we need to use the
gravitino equations of motion. Thus, finally, we abstract the on-shell N = 1
supersymmetry algebra in 11 dimensions as

$$[\delta_Q(\epsilon_1), \delta_Q(\epsilon_2)] = \delta_{g.c.}(\xi^\nu) + \delta_Q(-k\xi^\nu\Psi_\nu) + \delta_L(\theta_{ab}) + \delta_G^{(2)}(\Lambda_{\nu\rho}).$$

(7.8)

8.  N = 1, D = 10 Supergravity

In the following, we shall indicate how N = 1, D = 10 supergravity can be
obtained by appropriate dimensional reduction from the N = 1, D = 11 supergravity
described above [22].
As in 11 dimensions, the physical degrees of freedom of a graviton and
a gravitino do not match in 10 dimensions; the supermultiplet has additional
degrees of freedom. A graviton has D(D - 3)/2 = 35 physical degrees of freedom
for D = 10. A gravitino, which is both Weyl and Majorana here, has $(D - 3)\,2^{D/2-2}$
= 56 physical degrees of freedom. Besides these, the N = 1 supergravity multiplet
contains 29 bosonic degrees of freedom: 28 from the transverse components of an
antisymmetric tensor field $A_{\mu\nu}$ and one from a scalar field $\varphi$. There is also a spin-1/2
Majorana-Weyl fermion $\lambda$, so that there are 64 bosonic and 64 fermionic degrees
of freedom.
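
The counting above can be checked with a few lines of Python. This is only an illustrative sketch: the graviton and gravitino formulas are the ones quoted in the text, while the counts used for the antisymmetric tensor, (D - 2)(D - 3)/2, and for the spin-1/2 Majorana-Weyl fermion, $2^{D/2-2}$, are standard on-shell counts assumed here rather than taken from the text. All function names are hypothetical.

```python
def graviton_dof(d):
    # Physical degrees of freedom of a graviton: D(D - 3)/2 (formula from the text).
    return d * (d - 3) // 2


def majorana_weyl_gravitino_dof(d):
    # Majorana-Weyl gravitino: (D - 3) * 2^(D/2 - 2) (formula from the text, even D).
    return (d - 3) * 2 ** (d // 2 - 2)


def two_form_dof(d):
    # Assumption: transverse components of an antisymmetric tensor A_{mu nu}.
    return (d - 2) * (d - 3) // 2


def majorana_weyl_spinor_dof(d):
    # Assumption: on-shell count for a spin-1/2 Majorana-Weyl fermion.
    return 2 ** (d // 2 - 2)


def multiplet_count(d=10):
    # Bosons: graviton + antisymmetric tensor + one scalar.
    bosons = graviton_dof(d) + two_form_dof(d) + 1
    # Fermions: gravitino + spin-1/2 fermion.
    fermions = majorana_weyl_gravitino_dof(d) + majorana_weyl_spinor_dof(d)
    return bosons, fermions


if __name__ == "__main__":
    print(multiplet_count(10))
```

For D = 10 the two totals agree, 35 + 28 + 1 bosonic and 56 + 8 fermionic states.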
The N = 1 supergravity in 10 dimensions describing the dynamics of the
multiplet $(e_\mu{}^a, A_{\mu\nu}, \varphi, \psi_\mu, \lambda)$ can be obtained by the truncated dimensional
reduction of D = 11, N = 1 supergravity in the following way [22]

E~ (~ E~J,
=

(8.1 )


where the hatless indices are 10 dimensional and a dot on top of an index
indicates that it is a world index, r 1"1 = E IIII II. Since the supertransform of
\f1: would contain the right-handed transformation parameter 8 R , the above
truncation would imply 8 R = O. Then t5E~1 = 1fra \f1 1"1 = 0 and t5E~ I =
tsr 11 \fIJl = 0 (we shall take k = 1 here). Further, it can easily be verified that
FJlvap

= FJlvap = 0,

In order to have the canonical form for the kinetic energy terms for the truncated
theory, a Weyl rescaling of the zehnbein $E_\mu{}^a$ is required, $E_\mu{}^a = \varphi^{-1/16}e_\mu{}^a$, where
$E_{\dot{11}}{}^{11} \equiv \varphi$. The 11-dimensional torsion-free connection gives the kinetic energy of
the scalar field,
(8.2)

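where $\omega_{\mu ab}(e)$ is the torsion-free 10-dimensional connection. This formula has to
be used in the
(8.3)

where $\omega(e) + \tau = \Omega(E)$ and $e \equiv \det(e_\mu{}^a)$. Further, use the Palatini identity

$$-\tfrac{1}{2}eR(e, \omega(e) + \tau) = -\tfrac{1}{2}eR(e, \omega(e)) + \tfrac{1}{2}e\left(\tau_{acb}\tau^{bca} - \tau_a{}^{ac}\tau^b{}_{bc}\right) + \text{total derivative},$$

so that

$$-\tfrac{1}{2}ER(E, \Omega(E)) = -\tfrac{1}{2}eR(e, \omega(e)) - \tfrac{9}{16}e\,(\partial_\mu\varphi/\varphi)^2 + \text{total derivative}.$$

(8.4)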

In order to reduce the 11-dimensional Rarita-Schwinger Lagrangian, the
10-dimensional fermions $\psi_\mu$ and $\lambda$ have to be defined as

\f1L =
Jl

m~I/16(."'I' Jl + J2
})
12 r ,,-,

't'

\fiR 11 -

2j~m17/16}
3
't'

"

(8.5)

to obtain a diagonalized form


-.1E\f1 rftvPD
2

ft

(n + n)\fI.

= -tetll"PvPDv(w(e))t/J,,-

-teIfPDp(w(e)),J. - 3 f

+ four fermion

terms

etllJl(~<p/<p)rJlA
(8.6)


Collecting all these terms, we can now write the Lagrangian for the N = 1, D = 10
supergravity as

L(D = 10, N = 1)
= -

2P eR(e, w(e)) -

2et]/llpvuD.(w(e))t/lu - 4/qJ-3/2F;vu

9 e

3j2e

-1eXPDIl(w(e))), - 16 k 2(OJi.qJ/qJ)2 - -8-t]/Il(~qJ/qJ)P),

+ ~k eqJ-3/4F llvu(t]/.pIlVUPt/lp + 6t]/IlPt/lu

+ four

_ j2t]/.pVUP),)

fermion terms

+
(8.8)

where we have restored the dimensional coupling constant k, which has mass
dimension $-4$ in D = 10. Unlike in the D = 11 case, where all the four-fermion
terms could be absorbed into the covariantized fields, in 10 dimensions not all
four-fermion terms can be so absorbed.
The supersymmetry transformations which change the above Lagrangian by
a total derivative can again be obtained from the 11-dimensional supersymmetry
transformations. For example, from the 11-dimensional supersymmetry transformation laws, we can obtain
bQ(; D = II)e~ = (k/2)ijPt/l1l

+ ()2k/24)ijpbAebJi ,

where 1] = qJl/16. This can be put in the canonical form, if we rotate away the
second term by a Lorentz transformation, so that the prescription for 10-dimensional supersymmetry transformations is
(8.9)

With this recipe, the complete set of transformation laws for D = 10, N = 1
supergravity can be written as


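where $\hat\omega(e, \psi)$ is the supercovariantized connection; it contains only the gravitino
contorsion,

$$\hat\omega_{\mu\nu\rho}(e, \psi) = \omega_{\mu\nu\rho}(e) + \frac{k^2}{4}\left(\bar\psi_\mu\Gamma_\nu\psi_\rho + \bar\psi_\nu\Gamma_\rho\psi_\mu + \bar\psi_\nu\Gamma_\mu\psi_\rho\right),$$

(8.11)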

and the supercovariant derivative fj a is given by

Da = aa

J2

+ -3- k

z;-:

(8.12)

J.l/l.,

whereas the supercovariant field strength $\hat F_{\alpha\beta\gamma}$ is as defined in (8.7).


The on-shell supersymmetry algebra can be obtained exactly as earlier and
turns out to be

(8.13)
where the parameters of the general coordinate, supersymmetry, Lorentz and
gauge transformations on the right-hand side are given by
~J1

= tih [1'1'11,

- k"v.l,
7J2k):T'
J2k ~
S
~
J + 96.160

I] -

eab -

'l'v -

_):V

'" W vab

I] 2

'"

(.1,)

e,'I'

f"PY~P I] 1 r .py~p A,

-3/4 F
_ _16.32
k_xr.pyA.) +
+ k~1]2 r ab .py 1]1 (J2
32 <p
.py

-3/4 FabJ1
~ - 32.4
5k J:.rabJ1 A) ,
+ k~ J1(9J2
-8-<P
AJ1

{~k

<p3/4

~J1 -

C AJ1j

(8.14)

The 11-dimensional gauge invariance,

$$\delta_G^{(2)}A_{\mu\nu\rho} = -\partial_{[\mu}\Lambda_{\nu\rho]},$$

reduces to the gauge invariance $\delta_G^{(2)}A_{\mu\nu} = -\partial_{[\mu}\Lambda_{\nu]}$ in the D = 10 theory.


Whereas no matter can be coupled to D = 11 supergravity, for the D = 10 case,
N = 1 supersymmetric Yang-Mills matter does exist [20]. The coupling of a U(1)
matter multiplet to D = 10, N = 1 supergravity was written down by Bergshoeff
et al. [22], and its non-Abelian generalization was developed by Chapline and
Manton [23]. An important feature that emerges here is that the supersymmetrically
consistent way of coupling matter requires a redefinition of the
three-index field tensor involving the antisymmetric Yang-Mills Chern-Simons three-tensor
$X^{YM}_{\mu\nu\rho}$,
F llvp ~ F~vp = F llvp

Xr~

+ kXr~,

= tr[A[llavAp] -

igA[IlAvAp]],

a[llx;a~] = ttr F[IlJaP]'

(8.15)

where the gauge fields of the group are written as matrices, $A_\mu = A_\mu^a T^a$. The
anti-Hermitian matrices $T^a$ are in the fundamental representation of the algebra
of the group, $\mathrm{tr}\,T^aT^b = -\delta^{ab}/2$. With this redefinition of the three-index field
tensor, non-Abelian gauge invariance is ensured only if the tensor field $A_{\mu\nu}$ is
given a nontrivial gauge transformation

$$\delta_G^{(1)}A_{\mu\nu} = -k\,\mathrm{tr}(\partial_{[\mu}\Lambda\,A_{\nu]}),$$

(8.16)

where

$$\delta_G^{(1)}A_\mu = -D_\mu\Lambda = -(\partial_\mu\Lambda - g[A_\mu, \Lambda]).$$

The supersymmetric Lagrangian for Yang-Mills matter coupled to supergravity is given by


L = L(N = I,D = lOSG with F aP1 ~ F~py)

t{ <p~3/4

eFllvP v + efp(w,A)x

+ 1.ke<p-3/4XfIlPP(tP
2
Il + ~f
12 Il'))F vp -

fke<p-3/4XPPYXF~pYJ +

+ four fermion terms,

(8.17)

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - g[A_\mu, A_\nu]$ is the Yang-Mills field strength and the
superpartner of $A_\mu$, $\chi = \chi^aT^a$, is the Weyl-Majorana fermion in the adjoint
representation of the gauge group. The covariant derivative is defined both with
respect to the Lorentz connection and the Yang-Mills gauge potential
DIl(w,A)x == (all

+ lW/brab)x -

g[AIl,X].

(8.18)

The supersymmetry transformations for the coupled system are changed. In
addition to replacing $F_{\alpha\beta\gamma}$ by $F'_{\alpha\beta\gamma}$ containing the Yang-Mills Chern-Simons
three-tensor, $A_{\mu\nu}$, $\lambda$ and $\psi_\mu$ get the following additional pieces to their pure supergravity transformation laws (8.10)
LlQA llv =

-1 k<p3/8 tr(l/r[llxAv]),
k

LlQA = - 12.32 tr(XPPY X)faP y1J,



L1 t{! =
Q

11

16.16fi

cp - 3/4(['

11

pPY -

8b a pY )1) trC( ['


11

apy

X).

(8.19)

The transformation laws for $e_\mu{}^a$ and $\varphi$ stay the same. These transformation laws
have to be supplemented with those for the matter multiplet
15 Q A 11 =

3/8 i1['
lrn
2.,..
./ 11

X,

fik [3'
,1 'rap r
~ 1rapyi5 X1r apyi5 ] 1).
+64
AX-2 A1 Xl,p-24 A1

(8.20)

Further, the on-shell supersymmetry algebra closes as in the pure supergravity
case,

$$[\delta_Q(\eta_1), \delta_Q(\eta_2)] = \delta_{g.c.}(\xi^\mu) + \delta_Q(\eta) + \delta_L(\theta'_{ab}) + \delta_G^{(2)}(\Lambda'_\mu) + \delta_G^{(1)}(\Lambda),$$

(8.21)

except for an additional field-dependent non-Abelian gauge transformation,
$\delta_G^{(1)}(\Lambda)$, with the transformation parameter $\Lambda = \xi^\nu A_\nu$. The parameters of the
general coordinate and supersymmetry transformations, $\xi^\mu$ and $\eta$, are the same
as for pure supergravity, but those of the local Lorentz and Abelian gauge
transformations have extra pieces

(8.22)
where in $\theta_{ab}$ the replacement $F_{\alpha\beta\gamma} \to F'_{\alpha\beta\gamma}$ is understood.
While classical supersymmetric invariance requires that the tensor field $A_{\mu\nu}$
have a nontrivial non-Abelian transformation, its local Lorentz transformation
has been assumed to be trivial. But at the quantum level, in order that the various
anomalies be absent, this condition has to be relaxed. Green and Schwarz [24]
have recently demonstrated that in the non-Abelian case, the gravitational and
the mixed anomalies for the Yang-Mills groups SO(32) and E8 x E8 can be
cancelled through suitable local counterterms, if the gauge fields $A_{\mu\nu}$ are given
nontrivial local Lorentz transformations
bL(B)A llv =

~a[lleabwvJab'

(8.23)

where $\delta_L(\theta)\omega_{\mu ab} = -D_\mu(\omega)\theta_{ab}$. With this prescription, the three-index tensor $F'_{\alpha\beta\gamma}$
is no longer invariant under local Lorentz transformations. This leads to yet
another redefinition,
F"
F 'apy -+,py

1 XLapy'
= F apy + kX YM
apy - jk

(8.24)


where we have added the Lorentz Chern-Simons term


' ab:1 '
2.( , , , )a
X L/lVP = w[/lUvWp]ab
- 3 W[/lWVWp] 0 '
O[aX~/lV] = tR[apob(W)R/lv]ab(W).

(8.25)

Now $F''_{\alpha\beta\gamma}$ is both Yang-Mills and local Lorentz gauge invariant. However, this
extra modification of the transformation properties requires that the earlier
supersymmetric transformation laws and also the Lagrangian be
changed. In particular, since supersymmetry charges are spinorial, the commutator of a supersymmetry transformation and a local Lorentz transformation
should obey

[c5 Q('7),(\(m

= c5Q(t~abrab'7).

(8.26)

With the prescribed nontrivial Lorentz transformation property of $A_{\mu\nu}$, this
requires the supersymmetry transformation of $A_{\mu\nu}$ to be changed by an additional
piece

(8.27)
where $X_{\mu ab}$ is given by $\delta_Q(\eta)\omega_\mu{}^{ab} = (k/2)\bar\eta X_\mu{}^{ab}$ and $b_{\mu\nu}$ is an undetermined tensor
(higher-order in k) which transforms as a spinor under Lorentz transformations.
With this additional modification, the closure of the supersymmetry algebra is
ensured only up to the lowest order, $k^{-1/2}$. To obtain the closure at the next order,
additional modifications in the supersymmetry transformations of the various
fields are required. This process appears to continue order-by-order. For closure
up to second order, the Lorentz connection field used in the Lorentz transformation property of $A_{\mu\nu}$, in the Chern-Simons term $X^L_{\mu\nu\rho}$ and also in the additional
supersymmetry transformation piece $\Delta(\eta)A_{\mu\nu}$, has to be replaced by a new
supercovariant connection,
A ob = W ob
/l
/l

+ 3fi
km- 3 / 4 Fooh + ...
2
/l
't'

with appropriate changes in other supersymmetry laws.


An important feature of introducing a nontrivial Lorentz transformation for $A_{\mu\nu}$,
and of the other consequent modifications, is that the invariant action has to contain
higher-derivative terms. These terms, however, are such that the causal
propagation is not affected.

9.  Concluding Remarks

We have covered only some aspects of supersymmetry and supergravity in this
chapter. Except for mentioning a few results here and there, quantum aspects of
supersymmetric field theories have not been discussed. In particular, the
remarkable property of the finiteness of the N = 4 supersymmetric Yang-Mills
theory in four dimensions has not been described. Since the realization that the
gauge hierarchy problem in ordinary grand unified theories can be resolved
without fine-tuning of the parameters by requiring supersymmetry, enormous
effort has gone towards building supersymmetric grand unified models. The
supersymmetry breaking in the most popular version of these models is driven by
the gravity sector. The goldstino of the spontaneously broken supersymmetry is
absorbed by the gravitino, which thereby acquires mass [5].
In recent years, supersymmetric string theories have attracted much attention.
It is believed that the low energy behaviour of superstring theories is governed by
10-dimensional supergravity [25]. Indeed, the low energy (zero mass) spectrum of
a closed superstring is the same as that of N = 2, D = 10 supergravity, whereas
the theory of open and closed superstrings contains N = 1, D = 10 supergravity
and super-Yang-Mills multiplets. The same is true of the heterotic string
construction of the superstring theory, which consists of 26-dimensional bosonic
closed strings with 16 dimensions compactified, coupled to a 10-dimensional
closed superstring. Further, the low energy limit of the interacting superstring
theories does reproduce the interactions of the 10-dimensional supergravity of
the Chapline-Manton type described earlier, with gauge groups SO(32) (Green-Schwarz
superstring) or E8 x E8 (heterotic string) and appropriate Green-Schwarz
modifications for the absence of anomalies. However, the complete low
energy limit of the superstring theories beyond the lowest order in the gravitational
coupling has not been obtained so far. Nor have the complete modifications
implied by the Green-Schwarz anomaly cancellation for the supersymmetry
transformation laws and the action of the Chapline-Manton supergravity been
worked out beyond the lowest orders. Hoping that the superstring theory does
indeed represent all the basic forces of nature, the reduction from 10 to four
dimensions in a phenomenologically acceptable manner has to be done [27].
Further, the expected finiteness property of the 10-dimensional superstring theory
may as well provide the so-far elusive quantum description of gravity.
References
1. P. Ramond, Phys. Rev. D3, 2415 (1971); A. Neveu and J. H. Schwarz, Nucl. Phys. B13, 1109
(1971); J. L. Gervais and B. Sakita, Nucl. Phys. B34, 632 (1971).
2. Yu. A. Gelfand and E. P. Likhtman, JETP Lett. 13, 323 (1971); D. V. Volkov and V. P. Akulov,
Phys. Lett. 46B, 109 (1973); J. Wess and B. Zumino, Nucl. Phys. B70, 39 (1974); A. Salam and J.
Strathdee, Nucl. Phys. B76, 477 (1974).
3. R. K. Kaul, Phys. Lett. 109B, 19 (1982); E. Witten, Nucl. Phys. B188, 513 (1981); N. Sakai, Z. Phys.
C11, 153 (1981); S. Dimopoulos and H. Georgi, Nucl. Phys. B193, 150 (1981).
4. For a review, see E. Farhi and L. Susskind, Phys. Rep. 74C, 277 (1981); R. K. Kaul, Rev. Mod.
Phys. 55, 449 (1983).
5. P. Fayet and S. Ferrara, Phys. Rep. 32C, 477 (1974); A. Salam and J. Strathdee, Fortschr. Physik
57 (1978); P. van Nieuwenhuizen, Phys. Rep. 68, 189 (1981); J. Wess and J. Bagger, Supersymmetry and
Supergravity, Princeton Univ. Press (1983); S. J. Gates, M. T. Grisaru, M. Rocek, and W. Siegel,
Superspace or One Thousand and One Lessons in Supersymmetry, Benjamin Cummings (1983); H.
P. Nilles, Phys. Rep. 110, 1 (1984); M. Sohnius, Phys. Rep. 128, 39 (1985).
6. S. Coleman and J. Mandula, Phys. Rev. 159, 1251 (1967); L. O'Raifeartaigh, Phys. Rev. Lett. 14,
575 (1965).
7. R. Haag, J. T. Lopuszanski, and M. Sohnius, Nucl. Phys. B88, 257 (1975).
8. A. Salam and J. Strathdee, Nucl. Phys. B80, 499 (1974).
9. J. Wess and B. Zumino, Nucl. Phys. B70, 39 (1974); Phys. Lett. 49B, 52 (1974).
10. J. Wess and B. Zumino, Nucl. Phys. B79, 1 (1974).
11. S. Ferrara and B. Zumino, Nucl. Phys. B79, 413 (1974); A. Salam and J. Strathdee, Phys. Lett.
51B, 535 (1974).
12. P. Fayet and J. Iliopoulos, Phys. Lett. 51B, 461 (1974).
13. L. O'Raifeartaigh, Nucl. Phys. B96, 331 (1975); P. Fayet, Phys. Lett. 58B, 67 (1975).
14. S. Ferrara, L. Girardello, and F. Palumbo, Phys. Rev. D20, 403 (1979).
15. L. Girardello and M. T. Grisaru, Nucl. Phys. B194, 65 (1982).
16. G. Velo and D. Zwanziger, Phys. Rev. 186, 1337 (1969).
17. K. Johnson and E. C. G. Sudarshan, Ann. Phys. (NY) 13, 126 (1961).
18. S. Deser and B. Zumino, Phys. Lett. 62B, 335 (1976); D. Z. Freedman, P. van Nieuwenhuizen,
and S. Ferrara, Phys. Rev. D13, 3214 (1976);
D. Z. Freedman and P. van Nieuwenhuizen, Phys. Rev. D14, 912 (1976).
19. B. de Wit, Lectures at Trieste Spring School, 1984.
20. W. Nahm, Nucl. Phys. B135, 149 (1978).
21. E. Cremmer, B. Julia, and J. Scherk, Phys. Lett. 76B, 409 (1978).
22. A. H. Chamseddine, Nucl. Phys. B185, 403 (1981); E. Bergshoeff, M. de Roo, B. de Wit, and P.
van Nieuwenhuizen, Nucl. Phys. B192, 97 (1982).
23. G. F. Chapline and N. S. Manton, Phys. Lett. 120B, 105 (1983).
24. M. B. Green and J. H. Schwarz, Phys. Lett. 149B, 117 (1984).
25. J. H. Schwarz, Phys. Rep. 89, 223 (1982); M. B. Green, Surveys in High Energy Physics 3, 127
(1983).
26. D. Gross, J. Harvey, E. Martinec, and R. Rohm, Phys. Rev. Lett. 54, 502 (1985) and Princeton
preprints.
27. P. Candelas, G. Horowitz, A. Strominger, and E. Witten, Nucl. Phys. B258, 46 (1985).

26. An Overview of Superstring Theory


H. S. SHARATCHANDRA
Institute of Mathematical Sciences, Madras 600 113, India

1.  Introduction

Our aim here is to present a quick introduction to the basic ideas of the
superstring theory which have led to the hope that they might provide
a consistent theory of quantum gravity.
There are many attractive features of superstring theory which have led many
workers to regard it as a likely candidate for the Theory of Everything (TOE). By
this is meant an ultimate theory encompassing and unifying all matter and their
mutual interactions. Some of the features are as follows:
(a) In superstring theory, we might have a finite yet causal quantum field
theory, thereby realizing the dreams of Dirac and many other physicists who
regarded divergences in local quantum theory as a serious problem.
(b) Consistency of the theory (with respect to unitarity, Lorentz invariance, etc.)
necessarily results in massless spin-one and spin-two particles in the spectrum.
Moreover, such particles interact with each other and with other particles as if
there is an underlying local gauge invariance and general coordinate invariance,
respectively. Thus, for example, we recover the Einstein theory as the low energy
(<< Planck mass) limit.
(c) The basic interactions are essentially unique and just cubic (Figure 1a). The
higher-order vertices present in the Einstein theory are generated by an exchange
of massive particles (Figure 1b). In this we may see an analogy with the history of
the weak interaction theory. The current-current theory, which is unrenormalizable, has been replaced by the theory with massive W gauge bosons. The
phenomenologically good current-current interactions arise as the low energy
effects of the massive W exchange.
(d) Internal consistency also puts strong restrictions on spacetime dimension
(10), gauge group (E8 x E8 or $SO(32)/Z_2$) and requires an underlying
supersymmetry. Therefore, there is a possibility that a unique candidate is picked out
as a theory of all fundamental interactions.

Fig. 1(a). Basic interactions among the massless spin two (wavy line) and other particles (double line)
of the string spectrum.

Fig. 1(b). The four graviton interaction of the Einstein Theory is reproduced in the low energy limit
by the exchange of an infinite number of massive particles.

B. R. Iyer et al. (eds.), Gravitation, Gauge Theories and the Early Universe, 523-538.
1989 by Kluwer Academic Publishers.

2.  Duality

We will briefly sketch the idea of duality which motivated the study of string
theories, even though it does not appear to be a strong motivation for the present
usage. Experimental studies of the scattering of hadrons (e.g. $\pi p \to \pi p$) indicate
mediation by a large number of resonances whose (mass)$^2$ and spin J are, to
a good approximation, linearly related:

$$J = \alpha' M^2 + \alpha(0).$$

(1)

$\alpha'$ is the slope and $\alpha(0)$ is the intercept. Close to a resonance, the amplitude is very
well approximated by (Figure 2)

$$\frac{g^2\,(\text{spin factors, } t\text{-dependent})}{s - m^2 + i\Gamma},$$

where s and t are the usual invariant kinematic variables, m is the mass of the
resonance, $\Gamma$ measures its width, and g is the coupling constant.

Fig. 2. $\pi p \to \pi p$ mediated by a resonance.

We now envisage a scheme in which the full strong interaction amplitude is due
only to an exchange of resonances in the s, t and u channels (Figure 3) with no
background contribution at all. We will also assume the narrow resonance
approximation. We have to sum over $s \leftrightarrow t \leftrightarrow u$ interchanges to respect crossing
symmetry. Thus,

$$A = A(s, t) + A(t, u) + A(u, s), \qquad A(s, t) = \sum_n \frac{R_n(t)}{s - m_n^2}.$$

Fig. 3. $\pi_1\pi_2 \to \pi_3\pi_4$ amplitude, $A(s,t) + A(t,u) + A(u,s)$, as envisaged in the dual model.

For consistency, we require not only that the process $\pi_1\pi_2 \to \pi_3\pi_4$ be given by this
principle, but also $\pi_1\pi_3 \to \pi_2\pi_4$, which is related to the previous process by $s \leftrightarrow t$,
$u \leftrightarrow u$. This requires (Figure 4)

$$A(s, t) = A(t, s),$$

or

$$\sum_n \frac{R_n(t)}{s - m_n^2} = \sum_n \frac{R_n(s)}{t - m_n^2},$$

which means that the t-channel process also has a contribution from the
resonances only, and has no background. This is the duality property. Only an
infinite sum of Feynman diagrams could possess this property; individual
Feynman diagrams of quantum field theory do not.

Fig. 4. The duality property.

3.  The Veneziano Formula

Veneziano discovered a simple formula which satisfies duality and has almost all
the properties required of an amplitude. The formula is

$$A(s, t) = \frac{\Gamma(-\alpha(s))\,\Gamma(-\alpha(t))}{\Gamma(-\alpha(s) - \alpha(t))},$$

where $\alpha(s) = \alpha_0 + \alpha' s$ and $\Gamma$ is the Euler gamma function. Obviously $A(s, t) = A(t, s)$ (duality), and

$$A(s, t) = \sum_{n=0}^{\infty} \frac{(\alpha(t) + 1)\cdots(\alpha(t) + n)}{n!}\,\frac{1}{n - \alpha(s)},$$

so that the entire contribution is from resonances. This could be generalized to
arbitrary n-point amplitudes. It was realized that this has an intimate relationship
with the relativistic strings and led to the study of the latter.
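
The two properties claimed for the Veneziano formula, symmetry under $s \leftrightarrow t$ and the expansion as a sum over s-channel poles, can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and sample values are hypothetical, and the pole sum is evaluated only where it converges, which is assumed here to require $\alpha(t) < 0$.

```python
import math


def veneziano(alpha_s, alpha_t):
    # A(s, t) = Gamma(-a(s)) Gamma(-a(t)) / Gamma(-a(s) - a(t)).
    return (math.gamma(-alpha_s) * math.gamma(-alpha_t)
            / math.gamma(-alpha_s - alpha_t))


def pole_sum(alpha_s, alpha_t, n_terms=20000):
    # Truncated sum over s-channel poles:
    #   sum_n (a(t)+1)...(a(t)+n) / n!  *  1 / (n - a(s)).
    # Assumption: alpha_t < 0, so that the series converges.
    total = 0.0
    coeff = 1.0  # value of (a(t)+1)...(a(t)+n)/n! for n = 0
    for n in range(n_terms):
        total += coeff / (n - alpha_s)
        coeff *= (alpha_t + n + 1) / (n + 1)
    return total


if __name__ == "__main__":
    a_s, a_t = -0.3, -1.5  # hypothetical sample values of alpha(s), alpha(t)
    print(veneziano(a_s, a_t), veneziano(a_t, a_s), pole_sum(a_s, a_t))
```

The residue at the n-th pole is a polynomial of degree n in $\alpha(t)$, which is what allows a resonance of spin up to n at that mass level.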

4.  Free Relativistic String

We want to describe the dynamics of a free string consistent with relativity. The
choice of the action is suggested by that for a free relativistic particle,

$$S = -m\int_1^2 d\tau\,\sqrt{\left(\frac{dx^\mu}{d\tau}\right)^2},$$

i.e. the action is proportional to the invariant distance swept out in spacetime. By
analogy, the action S for a string which sweeps out a two-dimensional
hypersurface in a d-dimensional spacetime is taken to be proportional to the area
swept out in spacetime. We shall choose an arbitrary parametrization $(\sigma, \tau)$ of the
surface so that the two-dimensional surface is labelled by $x^\mu(\sigma, \tau)$, $\mu = 0, 1, 2, \ldots, d - 1$.

Fig. 5. The two-dimensional surface swept out by the string with a choice of coordinates $(\sigma, \tau)$ on it.

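The only restriction we impose on the parametrization is that $\dot x^\mu \equiv \partial x^\mu/\partial\tau$ is
time-like and $x'^\mu \equiv \partial x^\mu/\partial\sigma$ is space-like. The area covered by two vectors a and
b is

$$|a|\,\left|\,b - \frac{a\cdot b}{|a|^2}\,a\,\right|,$$

so that

$$S = -\frac{1}{2\pi\alpha'}\int d\tau\,d\sigma\,\sqrt{(\dot x\cdot x')^2 - \dot x^2 x'^2},$$

(2)

where we have taken into account $\dot x^2 > 0$ and $x'^2 < 0$. This is the Nambu-Goto
action for a relativistic string. $\alpha'$ is a constant of dimension (mass)$^{-2}$ and is related to
the string tension, $T = 1/(2\pi\alpha')$.
This action is independent of the parametrization $(\sigma, \tau)$ chosen because it is
related to the area of the surface.
The variational principle gives the equation of motion

$$\partial_\tau P_\tau^\mu + \partial_\sigma P_\sigma^\mu = 0,$$

(3)

where

$$P_\tau^\mu \equiv -\frac{\delta S}{\delta\dot x_\mu} = \frac{1}{2\pi\alpha'}\,\frac{(\dot x\cdot x')\,x'^\mu - x'^2\,\dot x^\mu}{\sqrt{(\dot x\cdot x')^2 - \dot x^2 x'^2}},$$

$$P_\sigma^\mu \equiv -\frac{\delta S}{\delta x'_\mu} = \frac{1}{2\pi\alpha'}\,\frac{(\dot x\cdot x')\,\dot x^\mu - \dot x^2\,x'^\mu}{\sqrt{(\dot x\cdot x')^2 - \dot x^2 x'^2}}.$$

The variational principle also gives the boundary conditions. If we choose the
parametrization such that $\sigma$ ranges from 0 to $\pi$ at all $\tau$, for an open string we get

$$P_\sigma^\mu(\sigma = 0, \tau) = P_\sigma^\mu(\sigma = \pi, \tau) = 0.$$

(4)

5.  Orthonormal Gauge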

The equations of motion look very complicated and are in no way indicative of
a free propagation. However, as in the case of the point particle, we may use the
reparametrization invariance (i.e. the freedom in choosing the $(\sigma, \tau)$ coordinates) to get
very simple dynamics. We may redraw the coordinates,

$$\tilde\sigma = \tilde\sigma(\sigma, \tau), \qquad \tilde\tau = \tilde\tau(\sigma, \tau),$$

(5)

such that they satisfy (a) orthogonality: the coordinate axes $\sigma$ = constant and
$\tau$ = constant are orthogonal at points of intersection, (b) normality: the spacing of
coordinate axes is changed such that $\dot x^\mu$ and $x'^\mu$ have equal magnitude, $|\dot x|^2 = |x'|^2$.
This gives the orthonormal coordinates,

$$\dot x\cdot x' = 0, \qquad \dot x^2 = -x'^2,$$

(6)

or, equivalently,

$$(\dot x \pm x')^2 = 0.$$

(7)

In this orthonormal gauge, the equations of motion are

$$\ddot x^\mu - x''^\mu = 0.$$

(8)

6.  Quantization

For quantization, we need the canonical variables and, therefore, the Hamiltonian
formalism. However, as a consequence of the reparametrization invariance, the
passage to the Hamiltonian formalism is not straightforward. This is indicated by
the constraints among the canonical coordinates $x^\mu(\sigma, \tau)$ and momenta $P_\tau^\mu(\sigma, \tau)$,

$$P_\tau\cdot x' = 0, \qquad P_\tau^2 + \frac{x'^2}{4\pi^2\alpha'^2} = 0$$

(9)

(independently of the choice of parametrization).
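
That the constraints hold for any parametrization can be illustrated numerically: for arbitrary time-like $\dot x$ and space-like $x'$, the momentum density obtained from the Nambu-Goto action satisfies both relations identically. The sketch below is illustrative only. It assumes a mostly-minus metric, the standard Nambu-Goto expression for $P_\tau^\mu$, and hypothetical sample vectors and function names.

```python
import math


def dot(a, b):
    # Minkowski product with signature (+, -, -, ...), an assumed convention.
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def tau_momentum(xdot, xprime, alpha_prime=1.0):
    # P_tau = [(xdot.x') x' - x'^2 xdot] / (2 pi alpha' sqrt((xdot.x')^2 - xdot^2 x'^2)),
    # assuming the Nambu-Goto action; the overall sign does not affect the constraints.
    mixed = dot(xdot, xprime)
    root = math.sqrt(mixed ** 2 - dot(xdot, xdot) * dot(xprime, xprime))
    pref = 1.0 / (2.0 * math.pi * alpha_prime * root)
    xp2 = dot(xprime, xprime)
    return [pref * (mixed * xp - xp2 * xd) for xd, xp in zip(xdot, xprime)]


def constraints(xdot, xprime, alpha_prime=1.0):
    # Returns (P_tau . x', P_tau^2 + x'^2 / (4 pi^2 alpha'^2)); both should vanish.
    p = tau_momentum(xdot, xprime, alpha_prime)
    first = dot(p, xprime)
    second = dot(p, p) + dot(xprime, xprime) / (4 * math.pi ** 2 * alpha_prime ** 2)
    return first, second


if __name__ == "__main__":
    xdot = [2.0, 0.3, -0.5, 0.1]    # hypothetical time-like tangent
    xprime = [0.2, 1.0, 0.4, -0.7]  # hypothetical space-like tangent
    print(constraints(xdot, xprime))
```

Note that no gauge condition is imposed on the sample vectors, so the vanishing of both expressions reflects the parametrization independence stated above.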


There are different ways of approaching this problem, all of which give
essentially the same results. We shall follow the light cone quantization, as this makes
the physics involved very transparent.

7.  Light Cone Quantization

Our aim is to extract an independent set of degrees of freedom which, moreover,
have simple dynamics. The orthonormal conditions (6) do not exhaust the
freedom in choosing coordinates on the two-dimensional 'world sheet'. If $(\sigma, \tau)$
satisfy conditions (6), so does any other set $(\tilde\sigma, \tilde\tau)$, provided

$$(\partial_\sigma^2 - \partial_\tau^2)(\tilde\tau \text{ or } \tilde\sigma) = 0.$$

Since $x^\mu(\sigma, \tau)$ also satisfies this wave equation (8), we may choose $\tilde\tau$ to coincide with
a linear combination of $x^\mu$, $\mu = 0, 1, \ldots, (d - 1)$. We shall choose the linear
combination $x^+$, where

$$x^\pm = \frac{1}{\sqrt{2}}\left(x^0 \pm x^{d-1}\right).$$

(10)

Thus, our choice of coordinates is such that, in addition to (6) and (7),

$$\lambda\tau + X^+ = x^+(\sigma, \tau),$$

(11)

i.e. we are choosing sections of the world sheet by the $x^+$ = constant planes to
describe the evolution of the string. $X^+$ has the interpretation of the centre of
mass coordinate at $\tau = 0$.

This choice of parametrization is called the light cone gauge.
With the choice of orthonormal coordinates (6) and (7), the momentum variables
take the form

$$P_\tau^\mu = \frac{1}{2\pi\alpha'}\,\dot x^\mu.$$

(12)

Therefore, the variable conjugate to $x^+$ is

$$P_\tau^+ = \frac{1}{2\pi\alpha'}\,\dot x^+ = \frac{\lambda}{2\pi\alpha'}.$$

This means that the momentum density $P_\tau^+$ is constant along the string in this parametrization. Because $\sigma$ varies from 0 to $\pi$, the total momentum in the $x^+$ direction is
$p^+ = \lambda/2\alpha'$, and so $\lambda$ is determined in terms of the total momentum of the string,

$$\lambda = 2\alpha' p^+.$$

(13)

The $\sigma$-parametrization is uniquely fixed by the condition $P_\tau^+$ = constant.
We may now use the constraints (9) to express $x^-$ and $P_\tau^-$ in terms of other
variables. Using

$$a^\mu b_\mu = a^+b^- + a^-b^+ - a^Ib^I,$$

where $I = 1, 2, \ldots, d - 2$ label the transverse directions, we get
$P_\tau^+x'^- + P_\tau^-x'^+ = P_\tau^Ix'^I$. Using (11), (12), and (13), we may solve for $x'^-$:

$$x'^- = \frac{\pi}{p^+}\,P_\tau^Ix'^I.$$

(14)

The other constraint in (9) gives

$$2P_\tau^+P_\tau^- = (P_\tau^I)^2 + \frac{1}{4\pi^2\alpha'^2}(x'^I)^2,$$

so that

$$P_\tau^- = \frac{1}{2\pi p^+}\left[\pi^2(P_\tau^I)^2 + \frac{1}{4\alpha'^2}(x'^I)^2\right].$$

(15)

Thus, the dynamics of $x^+$ and $p^+$ is completely known, while that of $x^-$ and $p^-$ is known in terms of $x^+$, $p^+$, $x^I$, $P^I$.
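
The light cone decomposition of the scalar product used above, $a^\mu b_\mu = a^+b^- + a^-b^+ - a^Ib^I$ with $a^\pm = (a^0 \pm a^{d-1})/\sqrt{2}$, can be verified directly. The sketch below is illustrative; the function names are hypothetical and a mostly-minus metric is assumed.

```python
import math


def to_light_cone(v):
    # Split v = (v^0, v^1, ..., v^{d-1}) into (v^+, v^-, transverse part),
    # with v^± = (v^0 ± v^{d-1}) / sqrt(2).
    plus = (v[0] + v[-1]) / math.sqrt(2)
    minus = (v[0] - v[-1]) / math.sqrt(2)
    return plus, minus, list(v[1:-1])


def minkowski_dot(a, b):
    # Covariant form of the product, signature (+, -, ..., -), an assumed convention.
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def light_cone_dot(a, b):
    # Same product written as a^+ b^- + a^- b^+ - a^I b^I.
    ap, am, at = to_light_cone(a)
    bp, bm, bt = to_light_cone(b)
    return ap * bm + am * bp - sum(x * y for x, y in zip(at, bt))


if __name__ == "__main__":
    a = [1.5, 0.2, -0.7, 0.4, 0.9]   # hypothetical vectors in d = 5
    b = [0.3, -1.1, 0.6, 0.8, -0.2]
    print(minkowski_dot(a, b), light_cone_dot(a, b))
```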
It is useful to work with the Fourier coefficients as dynamical variables. For an
open string which satisfies the boundary condition (4) and equations of motion
(8), since $\sigma$ has the range $(0, \pi)$, we have
(16)

$$x^-(\sigma, \tau) = X^- + \alpha_0^-\tau - i\sum_{n=1}^{\infty}\frac{1}{n}\left(\alpha_{-n}^-e^{in\tau} - \alpha_n^-e^{-in\tau}\right)\cos n\sigma.$$

(17)

Moreover, $x^+(\sigma, \tau)$ has the simple behaviour as in (11), and the conjugate variables
are simply obtained as $P_\tau^\mu(\sigma, \tau) = \dot x^\mu(\sigma, \tau)/2\pi\alpha'$. $X^\mu$ can be interpreted as the centre of
mass coordinate of the string at $\tau = 0$. By integrating $P_\tau^\mu(\sigma, \tau)$ over $\sigma$, we see

$$\alpha_0^\mu = 2\alpha' p^\mu,$$

(18)
where $p^\mu$ is the total momentum of the string.
Reality of $x^\mu$ requires that $\alpha_{-n}^\mu = (\alpha_n^\mu)^*$. We have finally isolated a convenient
set of independent variables for the open bosonic string,

$$(p^+, p^I, X^I, X^-, \alpha_n^I), \qquad n = 1, 2, \ldots$$

We note that $X^-$ is an independent variable, because the constraint equation (14)
only fixes $x'^-$. We may express the constraints (14) and (15) in terms of the new
variables,

$$\alpha_n^- = \frac{1}{4\alpha' p^+}\sum_{m=-\infty}^{+\infty}\alpha_{n-m}^I\alpha_m^I,$$

(19)
(20)

From (20), we may obtain the invariant mass of the string in an arbitrary state of
excitation
(21)

8.  Hamiltonian Formalism

As we have explicitly solved for the constraints, we may now proceed to the
Hamiltonian formalism. We postulate the standard Poisson brackets (PB) for the
independent variables,

$$\{x^I(\sigma, \tau), P^J(\sigma', \tau)\} = \delta^{IJ}\delta(\sigma - \sigma'), \qquad \{X^-, p^+\} = -1.$$

(22)

All other PB's are zero. The wrong sign of the PB in the last equation is because of
the Lorentzian metric. $X^+$ has a zero PB with all variables because its canonical
conjugate $p^-$ is not an independent variable. Because $\tau = X^+/(2\alpha' p^+)$ is the
evolution parameter, the Hamiltonian is $(2\alpha' p^+)$ times the conjugate of $X^+$, i.e.

$$H = 2\alpha' p^+p^-.$$

(23)

The PB's (22) with the Hamiltonian (23) reproduce the correct $\tau$-evolution
equations.

9.  Quantization

As we have a canonical formalism, we may now proceed to the quantization by
the standard rule $\{\;,\;\}_{PB} \to -i[\;,\;]$. We get

$$[\alpha_n^I, \alpha_m^J] = 2\alpha' n\,\delta^{IJ}\delta_{n,-m}, \qquad [X^I, p^J] = i\delta^{IJ}, \qquad [X^-, p^+] = -i,$$

(23)

with all other commutators zero.
It is useful to define new variables

n = 1, 2, 3, ...,

(24)

which have the standard commutation relation
(25)
We may now define the state $|0, p^I, p^+\rangle$ as the eigenvector of $p^I$ and $p^+$ with
the labelled eigenvalues, further satisfying
(26)

The entire Fock space is then generated by applying various $a_n^{I\dagger}$ on $|0, p^I, p^+\rangle$.
However, this is not enough for quantization. There is an ordering ambiguity
in the Hamiltonian (23). When written in terms of the $\alpha_n^I$ variables (see Equation
(21)), the position of $\alpha_{-n}^I$ relative to $\alpha_n^I$ is ambiguous. As a consequence of (23), the
various orderings give Hamiltonians which differ from each other by a c-number.
Such a c-number does not affect the equations of motion, but shifts the invariant
mass $M^2$ (Equation (21)). For the present, we keep this c-number $\alpha(0)$ arbitrary
and choose the Hamiltonian

$$H = p^+\alpha_0^-, \qquad p^+\alpha_0^- = \alpha'(p^I)^2 + \sum_{n=1}^{\infty} n\,a_n^{I\dagger}a_n^I - \alpha(0),$$

(27)

$$\alpha' M^2 = \sum_{n=1}^{\infty} n\,a_n^{I\dagger}a_n^I - \alpha(0).$$

(28)

We now get a mass spectrum which has integer spacing in $\alpha' M^2$. The ground
state of the string is $|0, p^I, p^+\rangle$ with a total momentum $(p^I, p^+)$ and an invariant
mass $\alpha' M^2 = -\alpha(0)$.

10.  Lorentz Covariance

Though we started with an action (2) that is manifestly covariant, we chose the
light cone gauge where manifest covariance is lost, and covariance of the final
theory must be explicitly verified. The generators of the Poincare group, ignoring
the constraints, are

$$P^\mu = \int_0^\pi d\sigma\,P_\tau^\mu(\sigma, \tau), \qquad M^{\mu\nu} = \int_0^\pi d\sigma\,\left(x^\mu P_\tau^\nu - x^\nu P_\tau^\mu\right).$$

If we replace the dependent variables, using the constraints and the assumed
PB's for the independent variables, we may verify that these generators are
conserved in $\tau$ and, moreover, satisfy the PB algebra of the Poincare group,

$$[P^\mu, P^\nu] = 0, \qquad [P^\mu, M^{\rho\sigma}] = i\left(g^{\mu\rho}P^\sigma - g^{\mu\sigma}P^\rho\right),$$

$$[M^{\mu\nu}, M^{\rho\sigma}] = i\left(g^{\nu\rho}M^{\mu\sigma} - g^{\mu\rho}M^{\nu\sigma} + g^{\mu\sigma}M^{\nu\rho} - g^{\nu\sigma}M^{\mu\rho}\right).$$

However, some of these generators suffer from ordering ambiguities in the
quantum mechanical case. We will symmetrize all terms and, moreover, use (24)
with the c-number $\alpha(0)$ whenever $\alpha_0^-$ is involved. We may note that $\alpha_n^-$, $n \neq 0$, does
not suffer from ordering ambiguities, since the product $\alpha_{n_1}^I\alpha_{n_2}^I$ involved will have
$n_1 \neq -n_2$. Thus,

MI -

= !(XIp-

+ pXI) -

00

X-pI - i

~L~IX; -Lnrt~).

(29)

n~l

The Poincare algebra is satisfied except for

(30)

which must be zero.
Thus, we have Lorentz invariance only if

$$d = 26, \qquad \alpha(0) = 1.$$

We thus have the novel phenomenon that the quantized theory is relativistically
covariant only if the spacetime dimension is 26 and the intercept $\alpha(0) = 1$.

11.  Spectrum

From (28), we see that the invariant mass of the string in the ground state is given
by
\[ \alpha' M^2 = -1, \tag{32} \]
which is a tachyon. This is a serious defect of this theory. The first excited state is
given by
\[ \alpha_1^{I\dagger}\, |0, p^I, p^+\rangle, \]
which has $\alpha' M^2 = 1 - 1 = 0$ and has a polarization state for each transverse
degree of freedom. It is, therefore, a massless vector boson. The second excited
states are
\[ \alpha_2^{I\dagger}\, |0\rangle, \qquad \alpha_1^{I\dagger} \alpha_1^{J\dagger}\, |0\rangle. \]
Together they form a massive spin-2 state.


The appearance of a massless vector state is an extremely appealing feature of
the quantized string. It is easy to understand its appearance. The first excited state
$\alpha_1^{I\dagger}|0\rangle$ has only transverse polarizations and, therefore, the theory can
be relativistically covariant only if it is massless. The ground state is a tachyon in
order that the first excited state be massless.
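The counting behind this argument can be made explicit. The sketch below assumes the standard little-group facts (not spelled out in the text) that a massless particle in d dimensions fills representations of SO(d - 2) while a massive one fills representations of SO(d - 1); the function names are illustrative.

```python
def vector_dim(n):
    """Dimension of the vector representation of SO(n)."""
    return n


def symmetric_traceless_dim(n):
    """Dimension of the symmetric traceless rank-2 tensor of SO(n)."""
    return n * (n + 1) // 2 - 1


d = 26
transverse = d - 2  # number of transverse oscillator directions I

# First excited level: one oscillator index I.
level_1_states = transverse
massless_vector = vector_dim(d - 2)  # what a massless vector needs
massive_vector = vector_dim(d - 1)   # what a massive vector would need

# Second excited level: one index (mode 2) plus a symmetric pair (mode 1 twice).
level_2_states = transverse + transverse * (transverse + 1) // 2
massive_spin_2 = symmetric_traceless_dim(d - 1)

if __name__ == "__main__":
    print(level_1_states, massless_vector, massive_vector)  # 24 24 25
    print(level_2_states, massive_spin_2)                   # 324 324
```

The 24 states at the first level are one short of a massive vector, so they can only be a massless one, while the 324 states at the second level fill exactly one massive symmetric traceless tensor.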

12. Closed Strings

Until now, we have only considered the string with free ends. In the case of the
closed string, the boundary conditions are
\[ x^\mu(\sigma = 2\pi, \tau) = x^\mu(\sigma = 0, \tau). \]
We may quantize the theory, as for the open string. For Lorentz covariance, we
get
\[ d = 26, \qquad \alpha(0) = 2. \]
Thus, the intercept is 2. The ground state is a tachyon of mass $\alpha' M^2 = -2$. The
first excited state is massless and consists of a symmetric traceless tensor, an
antisymmetric tensor, and a scalar. The first of these excitations may be identified
with the graviton.
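The decomposition of the massless closed-string level can be checked by counting. As a sketch, assume (this is not stated in the text) that the level is spanned by states carrying two transverse vector indices I, J = 1, ..., 24, one from each of the two sets of oscillators. The 24 × 24 components then split as follows:

```latex
\begin{align*}
24 \times 24 &= 576 \\
 &= \underbrace{\tfrac{1}{2}\cdot 24 \cdot 25 - 1}_{\text{symmetric traceless}}
  + \underbrace{\tfrac{1}{2}\cdot 24 \cdot 23}_{\text{antisymmetric}}
  + \underbrace{1}_{\text{trace (scalar)}} \\
 &= 299 + 276 + 1 .
\end{align*}
```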

13. Interacting Strings

Until now, we have only considered free strings. We saw that each string can exist
in one of an infinite number of excited states which may be identified with
relativistic particles of extension characterized by the size $(\alpha')^{1/2}$. We want
these particles to interact. A + B → C must correspond to two strings in states of
excitation A and B joining to form a single string in the state C. We will postulate
that the open strings interact by two strings joining at the ends only, or by one
string splitting into two at some point on it. Thereby, even though the string itself
is an extended object, the interactions take place at just one point in spacetime
and, hence, are local. Thus, the spacetime diagram for A + B → C + D is as in
Figure 6. Two strings first join and later split. The intermediate states correspond
to states of one string and, therefore, there are just poles in the external invariant
momenta. Moreover, duality is evident from Figure 6. In fact, this procedure
reproduces the Veneziano formula discussed earlier. In the case of closed strings,
the interaction would correspond to two strings which, on contact at some point
in spacetime, join to form one closed string or vice versa.

Fig. 6. World sheet diagram for A + B → C + D.

14. Field Theory Limit

In the limit $\alpha' \to 0$, masses of all but the massless particles diverge. We may
consider the effective low energy (i.e. $\ll \alpha'^{-1/2}$) interactions of such massless
particles. Remarkably, the interactions of the massless symmetric traceless
tensors with each other, and with other particles, are precisely as if there is an
underlying general coordinate invariance. In fact, the interactions are, as in the
Einstein theory, to the leading order in $\alpha'$. Though the basic interactions involve
just three particles, the higher interactions present in the Einstein theory are
reproduced as a consequence of the exchange of massive particles. In the case of
the massless vector excitation of the open string, the interactions are as if there is
an underlying local gauge invariance.
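The divergence of the massive levels follows directly from the integer spacing of the spectrum. A minimal worked form for the open string, using $\alpha(0) = 1$ and a level label n introduced here only for illustration:

```latex
\begin{align*}
\alpha' M_n^2 &= n - \alpha(0) = n - 1, \qquad n = 0, 1, 2, \ldots \\
M_n^2 &= \frac{n - 1}{\alpha'} \;\longrightarrow\;
\begin{cases}
-\infty, & n = 0 \ \text{(tachyon)}, \\
0, & n = 1 \ \text{(massless vector)}, \\
+\infty, & n \ge 2,
\end{cases}
\qquad \text{as } \alpha' \to 0 .
\end{align*}
```

Only the n = 1 level stays at finite mass, which is why the low energy theory contains just the massless excitations.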
We may also consider the loop corrections required by unitarity. In Figure 7,
for example, two strings join, split, rejoin and split again to give a one-loop
contribution. It appears the theory is renormalizable in the sense that the
divergences may be absorbed in a redefinition of the string tension $\alpha'$. This is in
stark contrast to the Einstein theory which, when quantized, has a degree of
divergence which increases with the number of loops.
All these features have led many workers to hope that we may finally have
a consistent theory of quantum gravity in the form of quantized strings.
Moreover, the natural appearance of massless vector bosons provides a tantalizing
possibility that strong, electromagnetic and weak interactions could also be
found in the strings, perhaps providing a unification of all interactions. There is
a fundamental length scale in the string tension and this could provide the Planck
scale needed for both gravity and grand unified theories.
However, there are also serious problems to be overcome in order to realize
this programme. In the Nambu-Goto strings, there are no fermions. There is just
one massless vector boson, which is not enough. The spacetime dimension is 26,
whereas we live in four dimensions. Moreover, there is a tachyon in the spectrum.
We will now turn to the superstring where many of the above problems are
overcome.

Fig. 7. Example of a one-loop contribution.

15. Superstrings

We now give a brief introduction to the superstring theory. Most of the basic
ideas and consequences have already been encountered in the Nambu-Goto
strings. Therefore we will emphasize only the differences and their consequences.
Instead of starting with a covariant action and obtaining the constraints
a posteriori, we will start with the independent degrees of freedom and an action
which describes their dynamics. Such an action for the superstring is

(33)
Here $X^I(\sigma, \tau)$ are the transverse coordinates of the string. However,
$I = 1, 2, \ldots, 8$ only, instead of $1, \ldots, 24$ as in the Nambu-Goto string. Here we
have anticipated that the critical dimension is now d = 10 instead of d = 26
mentioned earlier. This difference comes about because of the new fermion degrees
of freedom $S^{Aa}(\sigma, \tau)$ living along the string. $S^{Aa}$ for each A = 1, 2 transforms
like a spinor of 10-dimensional spacetime. Normally such a spinor has 32 complex
components, but we require $S^{Aa}$ for each A = 1, 2 to be both Majorana and
Weyl, which reduces the dimension to 16 real components. The Majorana
condition makes the spinor real (in a suitable basis for the Dirac matrices) and the
Weyl condition projects out one of the two chiralities using the projection
operator $h = \tfrac{1}{2}(1 + \gamma_{11})$ (where $\gamma_{11} \equiv \gamma^0 \gamma^1 \cdots \gamma^9$ is the product of the
10-dimensional Dirac matrices). Moreover, there is another constraint on $S^{Aa}$
which is an analogue of the choice of light cone gauge. This is
$(\gamma^+)^{ab} S^{Ab} = 0$, where $\gamma^+ = \tfrac{1}{2}(\gamma^0 + \gamma^9)$.
This further halves the number of components so that altogether there are
eight real components for each A = 1, 2.
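The successive halvings described above can be tallied in a few lines. This is a bookkeeping sketch only; the variable names are illustrative, and the one input not stated in the text is the general rule that a Dirac spinor in even dimension d has 2^(d/2) complex components.

```python
def dirac_complex_components(d):
    """Complex components of a Dirac spinor in even spacetime dimension d."""
    return 2 ** (d // 2)


d = 10
complex_components = dirac_complex_components(d)  # 32 complex
real_components = 2 * complex_components          # 64 real numbers

after_majorana = real_components // 2     # reality condition: 32 real
after_weyl = after_majorana // 2          # one chirality kept: 16 real
after_light_cone = after_weyl // 2        # gamma^+ S = 0: 8 real

transverse_bosons = d - 2                 # X^I with I = 1, ..., 8

if __name__ == "__main__":
    print(complex_components, after_majorana, after_weyl, after_light_cone)
    print(after_light_cone == transverse_bosons)
```

The final count of eight real fermionic components for each A equals the number of transverse coordinates X^I in d = 10.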
$S^{Aa}$ also describes eight 2-component spinors of the world sheet, with A = 1, 2 as
the corresponding spinor index. The Pauli matrices $\rho^\alpha$, $\alpha = 1, 2, 3$ are the
corresponding Dirac matrices. In (33), $\bar S^{Aa} = S^{Bb\dagger} (\gamma^0)^{ba} (\rho^0)^{BA}$. $\gamma^-$ is needed in
the fermionic action, for otherwise the action vanishes because S is a Majorana
spinor.
The $\tau$-evolution of S is that of a free Dirac spinor in two dimensions.

Therefore $S^{1a}$ has only left modes and $S^{2a}$ only the right modes on the world
sheet. In contrast, $X^I$ has both left and right modes. As a consequence, the total
numbers of bosonic and fermionic degrees of freedom match. In fact, the theory has
invariance under the following supersymmetry transformations
\[ \delta X^I = (p^+)^{-1/2}\, \bar\varepsilon\, \gamma^I S, \]
\[ \delta S = i\, (p^+)^{-1/2}\, \gamma_- \gamma_\mu\, (\rho \cdot \partial) X^\mu\, \varepsilon, \]
where $\varepsilon^{Aa}$ is a Majorana-Weyl spinor like $S^{Aa}$. This invariance, connecting
bosonic and fermionic degrees of freedom, may be interpreted either as 16
supersymmetries in two-dimensional parameter space or as two supersymmetries
in 10-dimensional physical spacetime. This follows from the commutator of two
supersymmetry transformations

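\[ [\delta_1, \delta_2]\, X^I = \xi^\alpha \partial_\alpha X^I + a^I, \]
\[ [\delta_1, \delta_2]\, S = \xi^\alpha \partial_\alpha S, \]
where
\[ \xi^\alpha = -2i\, (p^+)^{-1}\, \bar\varepsilon^{(1)} \gamma^- \rho^\alpha \varepsilon^{(2)}, \qquad
   a^I = -2i\, \bar\varepsilon^{(1)} \rho^0 \gamma^I \varepsilon^{(2)}. \]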
I

Thus, successive supersymmetry transformations generate both a translation in
physical spacetime and a translation in the parameter space.
In the case of an open string, the boundary conditions for the fermions are
\[ S^{1a}(0, \tau) = S^{2a}(0, \tau), \qquad S^{1a}(\pi, \tau) = S^{2a}(\pi, \tau), \]
which require the same handedness for both $S^1$ and $S^2$. We have now the Fourier
expansions
\[ S^{1a} = \sum_{n=-\infty}^{+\infty} S_n^a\, e^{-in(\tau - \sigma)}, \qquad
   S^{2a} = \sum_{n=-\infty}^{+\infty} S_n^a\, e^{-in(\tau + \sigma)}, \tag{34} \]
with the same Fourier coefficients for both $S^1$ and $S^2$.
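The compatibility of the mode expansions with the boundary conditions can be checked mode by mode. The sketch below assumes that $S^1$ depends on $\tau - \sigma$ and $S^2$ on $\tau + \sigma$, with a common coefficient for each mode (set to 1 here); the function names are illustrative only.

```python
import cmath
import math


def s1_mode(n, sigma, tau):
    """Single Fourier mode of S^1, a function of tau - sigma."""
    return cmath.exp(-1j * n * (tau - sigma))


def s2_mode(n, sigma, tau):
    """Single Fourier mode of S^2, a function of tau + sigma."""
    return cmath.exp(-1j * n * (tau + sigma))


def boundary_mismatch(n, tau):
    """Largest violation of S^1 = S^2 at the two string ends, sigma = 0 and pi."""
    at_zero = abs(s1_mode(n, 0.0, tau) - s2_mode(n, 0.0, tau))
    at_pi = abs(s1_mode(n, math.pi, tau) - s2_mode(n, math.pi, tau))
    return max(at_zero, at_pi)


if __name__ == "__main__":
    tau = 0.37  # arbitrary sample time
    for n in (-2, -1, 0, 1, 2):
        print(n, boundary_mismatch(n, tau) < 1e-9)
    # A half-integer mode fails at sigma = pi, so only integer n is allowed.
    print(0.5, boundary_mismatch(0.5, tau))
```

The condition at σ = 0 holds for any n, while the condition at σ = π holds only for integer n, which is why the sums run over integers.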


For quantization, we have the (anti-)commutation relations
\[ \{S_m^a, \bar S_n^b\} = (\gamma^+ h)^{ab}\, \delta_{m+n,0}, \qquad [S_m^a, \alpha_n^I] = 0. \tag{35} \]
The Hamiltonian H again has an ordering ambiguity in the fermionic part. As
before, Lorentz invariance provides a unique choice for the ordering. The
intercept $\alpha(0)$ is fixed to be zero. The mass spectrum is given by
\[ \alpha' M^2 = \sum_{n=1}^{\infty} \Bigl( \alpha_{-n}^I \alpha_n^I + \frac{n}{2}\, \bar S_{-n} \gamma^- S_n \Bigr). \tag{36} \]

We see that $M^2 \geq 0$, so that there is no tachyon! This may be traced to
supersymmetry which forces $H \geq 0$.
We may note that $S_0$ does not contribute to the mass spectrum. Moreover,
from the anticommutation relation
\[ \{S_0^a, \bar S_0^b\} = (\gamma^+ h)^{ab}, \tag{37} \]
we see that $S_0$ is its own canonical conjugate (because S is Majorana). As
a consequence, the eigenvalue of $S_0$ can never be zero and we cannot build states
starting from one annihilated by $S_0$. Instead, we must choose a representation
space of the Clifford algebra (37). In our case, this representation is labelled by
$|I\rangle$ and $|a\rangle$ describing the polarization states of a massless vector and
a Majorana-Weyl spinor. $S_0^a$ transforms these states into each other.
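The size of this representation space can be illustrated with explicit matrices. The sketch below assumes that the eight independent real components of the zero mode have been rescaled so that they obey $\{\Gamma^a, \Gamma^b\} = 2\delta^{ab}$ (a normalization chosen here for convenience, not taken from the text), and builds them with a Jordan-Wigner style tensor product of Pauli matrices. All function names are illustrative.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def kron_all(factors):
    out = [[1]]
    for f in factors:
        out = kron(out, f)
    return out


def clifford_generators(pairs=4):
    """2 * pairs anticommuting matrices of size 2**pairs (Jordan-Wigner form)."""
    gens = []
    for k in range(pairs):
        for p in (X, Y):
            gens.append(kron_all([Z] * k + [p] + [I2] * (pairs - k - 1)))
    return gens


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][c] * b[c][j] for c in range(n)) for j in range(n)]
            for i in range(n)]


def satisfies_clifford(gens):
    """Check {G_a, G_b} = 2 delta_ab for every pair of generators."""
    n = len(gens[0])
    for a, ga in enumerate(gens):
        for b, gb in enumerate(gens):
            ab, ba = matmul(ga, gb), matmul(gb, ga)
            for i in range(n):
                for j in range(n):
                    expected = 2 if (a == b and i == j) else 0
                    if abs(ab[i][j] + ba[i][j] - expected) > 1e-12:
                        return False
    return True


if __name__ == "__main__":
    gens = clifford_generators()
    print(len(gens), len(gens[0]), satisfies_clifford(gens))
```

Eight anticommuting generators need a 16-dimensional space, which matches the eight vector states $|I\rangle$ together with the eight spinor states $|a\rangle$.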
We are now in a position to construct the states of the open superstring. We
start with
\[ |0, I, p^I, p^+\rangle \quad \text{or} \quad |0, a, p^I, p^+\rangle, \]
which are labelled by total momenta $p^I$, $p^+$ and by the representation space of
$S_0$ and which are, moreover, annihilated by $\alpha_n^I$ and $S_n^a$ with n > 0. Then other
states of the string are constructed by applying any number of various $\alpha_{-n}^I$ and
$S_{-n}^a$ with n > 0 onto the above states.
All states $|0, I, p^I, p^+\rangle$ and $|0, a, p^I, p^+\rangle$ describe various possible ground
states of the string and have mass zero. We therefore have a massless vector and
Majorana-Weyl spinor, albeit in d = 10, forming a vector representation of the
supersymmetry algebra.
We may also consider closed superstrings as we did for the Nambu-Goto
strings. In this case, we get a supergravity multiplet consisting of graviton and
gravitino.

16. Problems and Prospects

Superstrings are free of some of the defects of the Nambu-Goto strings. There are
fermions in the spectrum and, moreover, the theory is free of tachyons. However,
we still require many gauge bosons if we want to account for strong, electromagnetic
and weak interactions. We must also be able to relate to the four-dimensional
spacetime.


It has been found that by appropriately combining the left sector of the closed
Nambu-Goto string with the right sector of the closed superstring, one may
construct a new string theory called the heterotic string which has, in d = 10, both
massless gravitons and gauge bosons (with a gauge group $E_8 \times E_8$ or $SO(32)/Z_2$)
and chiral fermions. Moreover, this theory may be finite to every loop order.
There are also many string theories with d = 4 obtained by compactification of
the d = 10 theory on a six-torus.
Thus, string theory has a very good prospect of unifying all fundamental
interactions, including gravity. However, this programme is yet to be realized.

Index

action, 119
algebra of charges, 203
angular type fields, 193
angle type variable, 196
anomaly, 212
anomaly constraint, 248
anomalous dimension, 219, 222, 225
anomalous magnetic moment of W bosons, 199
anti De Sitter, 434
asymptotic freedom, 217, 223
and fixed points, 223
in QCD, 225
atlas, 119
axial vector anomaly, 212

back reaction, 368
baryon
asymmetry, 243, 274
conservation violation, 240
number, 169, 197
beauty, 210
beta decay, 208
Bianchi identity, 24, 91, 92, 125, 459, 471
Birkhoff's theorem, 32
Bjorken scaling, 180, 216
black holes
area theorem, 44
collapse, 400
entropy, 45
ergosphere, 40
event horizon, 39
angular velocity, 44
electric potential, 44
surface gravity, 44
Hawking effect, 47, 48, 323
infinite redshift surface, 34, 39
irreducible mass, 44, 45
Kerr, 28
Kerr-Newman, 41, 43
no hair theorem, 43
null surface, 35, 39
one way membrane, 35, 39
Penrose process, 40, 45
Reissner-Nordström, 41
Schwarzschild, 31
stability, 41
static limit, 33
stationary limit, 38
superradiance, 46
temperature, 47
thermodynamics, 45, 46
uniqueness, 41
Zeldovich-Starobinsky-Unruh effect, 47
bottom, 216
Brans-Dicke scalar, 429

Cabibbo
angle, 200
Kobayashi-Maskawa matrix, 233
rotated quark, 201, 209
rotation in two dimensions, 209
universality, 200
Callan-Symanzik equation, 219
canonical
dimension, 221, 224
momentum, 136
number of representation, 196
quantization, 138, 298
Cartan
differential algebra, 122
-Killing form, 485
Casimir effect, 318, 319
causal boundary, 113
causality, 108
charged currents, 207
charge conjugation, 143
C-violation, 176
CP-violation, 177, 299
CPT theorem, 178, 200
charge independence of nuclear forces, 166
chart, 119
chemical potential, 62
Chern
class, 286
form, 286
-Simons, 518
chiral supermultiplet, 496
Christoffel symbols, 18
classical field theory, 135
classification of singularities, 278, 280
Clifford algebra, 420
clock field, 377
closed timelike lines, 102
closure density, 58
Coleman-Mandula theorem, 489
Coleman-Weinberg, 343
colour, 180, 215
confinement, 228, 229
compact space, 423
compactification, 91
conformal, 113
Freund-Rubin, 434
monopole, 436
complex doublet of scalars, 191
complex Klein-Gordon field, 137, 186
conformal
compactification, 113
degree of freedom, 375
transformation, 305
connection
affine, 475
spin, 473
conserved current, 212
constraint equation, 382, 402
Cooper pairs, 214
coordinate systems
advanced null, 115
Boyer-Lindquist, 37
comoving, 52
locally inertial, 21
natural, 21
retarded null, 115
Copernican principle, 51
coset spaces, 442
cosmological constant, 53, 399
cosmological models
anisotropic
Bianchi type V, 104
Bianchi type IX, 104
Big Bang, 59
closed elliptic, 54
De Sitter, 55, 348, 351
dust, 53
Einstein-De Sitter, 54
Einstein static, 55
Friedmann-Robertson-Walker, 52, 53, 343, 376
Gödel, 101
inflationary, 343, 348, 390
Kasner, 458
open hyperbolic, 54
radiation, 54
steady state, 55
cosmological principle, 53
cosmological redshift, 56
cosmogenesis, 390
creation of the universe, 390
critical temperature, 213
cubic vertices, 192
curl, 123
current algebra, 201
current-current form, 200
deceleration parameter, 57
deep inelastic scattering, 216
density perturbations, 352
derivative
coupling, 211
covariant, 15, 16
covariant exterior, 124
gauge covariant, 186, 201, 257, 471
Lorentz covariant, 417
operator, 111
Poincare covariant, 478
SL (2, C) covariant, 473
diffeomorphism, 108, 109
dimensional reduction, 423, 454, 455, 514
dimensional regularization, 157, 178
Dirac field, 137
Dirac's large number hypothesis, 461
distribution functions, 62
divergence, 123
divergences
degree of, 211
in QED, 153
duality, 524
dynamical dimension, 222
eightfold way, 180
Einstein
-Cartan equation, 131
theory, 126, 131, 468
field equation, 26
-Hilbert Lagrangian, 407, 455
spaces, 434
summation convention, 8
thought experiment, 3
electromagnetic field, 138, 187
electroweak model, 198, 203
see also Glashow-Salam-Weinberg
consequences, 206
effective coupling constant, 228

endomorphism, 483
entropy problem, 464
expansion, 94
explicit mass term, 212
extrinsic curvature, 401
Faddeev-Popov, 225
Fermi
contact interaction, 207
coupling constant, 207
theory, 211, 212
fermion masses, 266, 205
fermionic mass term, 204
Feynman graphs, 150, 152, 211
Feynman-Gell-Mann, 200
flatness problem, 83, 346
flavours, 210
FCCC weak interactions, 233
Fock space, 140
forms
connection, 123
curvature, 123
differential, 119
Freund-Rubin compactification, 434
functional Schrödinger equation, 401
galaxy formation, 84
Gamow, 59
gauge
Fock-De Donder, 408
Landau, 225
light cone, 529
Lorentz, 405
gauge field, 187
non-Abelian, 192
gauge hierarchy problem, 235, 487
gauge potential, 471
translational, 478
gauge theory, 467
of gravity, 126
gauge transformation, 187, 407, 467
global, 186, 187
local, 187
rigid, 187
gauge invariance, 142
local, 182, 449
Gell-Mann-Nishijima formula, 169
generations, 213
general coordinate transformations, 473
general invariance, 193, 212
general mass matrix, 196
general SU(2) transformations, 191, 193
general U(1) transformations, 187
Geometry
Euclidean, 4
fundamental tensor, 12
geodesics, 5, 20
gravitation, 5
line element, 6
Lobachevsky, 5
parallel postulate, 4
Riemannian, 5
generalized Wheeler-De Witt equation, 382
generation puzzle, 234
ghosts, 225
Glashow-Iliopoulos-Maiani (GIM), 209, 234
Glashow-Salam-Weinberg, 184, 198, 203
see also electroweak
gluon, 215, 227
radiation, 230
goldstino, 503
Goldstone
fields, 431
-Higgs Lagrangian, 213
model, 187
theorem, 189
graded Lie algebra (GLA)
Abelian, 485
nilpotent, 485
semisimple, 485
simple, 485
solvable, 485
gradient, 122
grand unification, 235, 237, 247
Grassmann algebra, 121
gravitino, 410
graviton, 407
groups
Abelian, 191
adjoint representation, 289
Casimir operators, 291
commutation relations, 288
colour, 214
complex UIR, 291
compact Lie, 288
exceptional, 289
fundamental homotopy, 283
fundamental UIR, 290
isometry, 439
Lie bracket, 288
local Lie, 288
orbits, 93
potentially real representations, 291
pseudo real representations, 291
rank, 289
representation
adjoint, 484
completely reducible, 484
reducible, 484
semisimple Lie, 288
simple Lie, 288
structure constants, 92, 288, 468, 215
topological, 288
universal covering, 288
Gupta, 407
harmonic analysis, 429, 441
harmonic functions, 423
Hamiltonian, 137
Hamidew coefficients, 366
helium synthesis, 72
Higgs
field, 204
mechanism, 191
model, 190
potential, 263
high-temperature approximation, 210, 212
higher-order loop diagrams, 210, 212
homogeneous, 52, 84
horizontal lift basis, 424
horizon problem, 83, 90, 345
Hubble constant, 56
hydrogen atom, 384
ideal, 485
ideal gas, 61
ideal point boundary, 114
indecomposable past set, 114
induced gravity, 235
inflation
chaotic, 354
limits on models, 355
primordial, 392
interactions, 164
interacting fields, 145
interaction picture, 146
internal symmetry, 440
invariance under general transformation, 210
infrared
divergences, 227
fixed point, 223
problem, 228, 229
jets, 231
Kaluza-Klein theory, 423, 450, 235
chiral fermions, 444
cosmology, 456, 457, 462
ground state, 427, 433

old, 424
modern, 427
zero modes, 429
Killing vector, 33, 34, 91, 110, 111, 439
Lagrangian, 135
Landau-Ginzburg, 213
left-handed fields, 202
length-type fields, 193
lepton
conservation violation, 240
deep inelastic scattering, 216
doublet, 210
electromagnetic, weak interaction of, 200
gauge boson interaction, 206
masses, 206
number, 171
tau, 210
Lichnerowicz theorem, 445
Lie algebra, 288
local translations, 479
London penetration length, 214
Lorentz invariance, 134
low-temperature approximation, 63
lowering, 13
Mach principle, 102
Majorana conditions, 421
manifest relativistic invariance, 473
manifold, 107, 119
mass matrix, 233
massive vector boson, 191, 194,211
charged, 198, 199,212
neutral, 198, 212
propagator, 211
massless
gauge boson, 196
particle, 189
vector boson, 191
matter spin density, 477
measurement process, 395
Meissner effect, 214
microwave background, 60, 75, 343
see also relic radiation
anisotropies, 78, 344
minimal electromagnetic coupling, 199
minimal SU(5), 247, 250, 258, 267
minisuperspace, 374
mixing angle, 208
momentum dependent coupling constant, 217, 222
monopoles, 279, 280
monopole compactification, 436
monopole problem, 84, 347

moving coordinates, 221, 222
moving coupling constant, 222
naked singularity, 40
Nambu-Goldstone boson, 189
Nambu-Goto action, 527
naturalness, 487
natural units, 164
neutrinos
massive, 89
oscillations, 234
primordial, 65
scattering
coherent, 208
elastic, 208
temperature, 69
Noether current, 181
Noether theorem, 469
nonrenormalizable theories, 210
nonrelativistic potential scattering, 217
normal ordering, 141
null infinity, 114
order parameter, 214
paracompact, 119
parity, 143, 172
violation, 174
participatory observer, 383
parton, 216, 230
Pauli-Fierz Lagrangian, 431
Peccei-Quinn symmetry, 279
perfect cosmological principle, 55
perfect fluid, 26, 53
phase transition, 277, 337, 213
photon, 405, 407
photon to baryon ratio, 77
photon temperature, 69
Planck dimensions, 297, 394
points at infinity, 113
point splitting, 368
polarization
vectors, 406
tensors, 409
preons, 235
primordial
inflation, 354, 394
magnetic fields, 90
nucleosynthesis, 61, 73
proton decay, 272
proton antiproton collider, 208
QCD, 214
QED, 183, 211
QFD, 231
quantum cosmology, 374
quantum gravity, 373
quantum stationary states, 388
QFT in CST
adiabatic vacuum, 312
canonical quantization, 298
conformal
anomalies, 333
coupling, 301
vacuum, 306
De Witt-Schwinger expansion, 361
divergences, 299, 397
effective action, 363, 397
Green function, 397
history, 344
minimal coupling, 301
path integral formulation, 321
renormalization, 365
scalar field, 300
vacuum energy, 320
Quark, 214
antiquark bound states, 209
charmed, 209
EM, weak interaction, 200
gauge boson interaction, 206
lepton universality, 235
masses, 206
strange, 205, 209
quartic vertices, 192
raising, 13
Rarita-Schwinger field, 411, 509
Recombination, 77
real triplet scalar field, 194
relativity
general, 4
special, 3
relic radiation, 60
regular point, 114
renormalization, 153
renormalizability
criterion, 211
of non-Abelian theory, 212
of weak interaction, 212
renormalized coupling constant, 219, 220
renormalization group, 217, 218, 220
renormalizable theories, 220
reparametrization invariance, 381
Riemann's zeta function, 315
right-handed fields, 202
right-handed neutrino, 202

rotation, 94
Rutherford scattering, 216
scalar curvature, 25
scaling, 230
S-matrix, 151
self coupling, 211
self energy, 153
self interacting, 192
semiclassical cosmology, 388
shear, 94
spacetimes
anisotropic, 99-105
asymptotically flat, 114
axially symmetric, 37
causal boundary of, 113
conformal compactification of, 113
Kerr, 28
Lense-Thirring, 38
Robertson-Walker, 28
Schwarzschild, 31
static, 32
stationary, 37
spacetime dependent phase transformations, 187
special U(1) transformation, 187
special SU(2) transformation, 191, 193
SU(2)
gauge theory, 191
triplet vector representation, 194
SU(2) × U(1)
charges of fermions, 201
general invariance, 210
model, 196
SL(2, C) gauge field strength, 475
spinors, 420
spontaneous symmetry breaking, 187, 189, 260
of discrete symmetry, 189
SU(2), 193
nonabelian, 195
spontaneous compactification, 454
stable causality, 108
standard model before gauge theories, 199
standard gauge model of particle physics, 185, 231, 232, 247
strangeness, 169
changing decay, 200
changing neutral current, 209
strings
closed, 533
cosmic, 279
field theory limit, 534
free relativistic, 526

heterotic, 538
interacting, 533
light cone quantization, 528
Lorentz covariance, 531
spectrum, 532
superstrings, 235
subalgebra, 485
Sudarshan-Marshak, 200
superconducting state, 214
supergravity, 508
supernova, 208
supersymmetry, 487
algebra, 489
gauge theory, 501
mass formula, 506
representation, 492
spontaneous breakdown of, 502
tachyons, 188
technicolour, 488
tensor, 11
contorsion, 419
curvature, 21, 23, 24, 113, 475
Einstein, 25
energy momentum, 26, 53
field strength, 470
Lie derivative of, 111
metric, 12
quotient law for, 11
Riemann, 113
see also curvature
Ricci, 25
torsion, 468, 475
tetrad, 415
see also vierbein
thermodynamic equilibrium, 61
Thomson model, 216
time reversal, 144
T-invariance, 144, 176
topological currents, 282
toponium, 230
truth, 210
ultraviolet
cutoff, 220, 397
divergence, 210
fixed point, 223
unification scale, 235
U(1) gauge theory, 185
vacuum expectation value, 190
vacuum polarization, 153
vector
addition, 14
contravariant, 9
covariant, 10
covariant derivative, 15, 16
field, 110
parallel transport, 16
potential for EM, 187
Veneziano, 526
vertex, 153
vorticity, 96
vierbein, vielbein, 410, 415, 420, 468, 474
see also tetrad
V-A, 200
very early universe, 82, 214
quantum effects, 360
W-boson, 208
weak interaction, 200
FCCC, 233
interaction strength, 208
hypercharge, 202
isospin, 202
mixing angle, 198
Weinberg angle, 267
Wess-Zumino supermultiplet, 497
Weyl condition, 421
Wheeler-De Witt equation, 379
Witten's model, 444

Yang-Mills, 193, 439, 440
potentials for SL(2, C), 474
Yukawa couplings, 206, 211, 233
Yukawa terms, 204

zero point length, 394, 396


zeta function regularization, 315, 368
Z-boson mass, 208
