
Bibliographic & Ordering Information

ISSN: 0168-0072
Imprint: Elsevier
Commenced publication in 1969
Subscriptions for the year 2008,
Volumes 149-154, 18 issues

Register to receive a quarterly email alert on the Top 25 Most Downloaded Articles on
ScienceDirect for Annals of Pure and Applied Logic:
http://top25.sciencedirect.com

Submitting Articles
Contributions, which should be in English, may be
submitted to any editor, preferably in electronic form.
Final decision for publication will be taken by a Managing Editor.
www.elsevier.com/locate/apal/authorinstructions

www.elsevier.com/locate/apal

Editor selects

Articles from Annals of Pure and Applied Logic
It is a pleasure to present to you this
collection of selected papers which have
appeared in the Annals of Pure and Applied
Logic. These papers illustrate the vitality of
the field of mathematical logic, as well as
its many interactions with other parts of
mathematics and computer science.
Guiraud's paper forms a good illustration of the new
categorical, geometric and topological aspects of
logical proofs, absent in the more traditional
approaches to proof theory. The contribution by Frick
and Grohe, about lower bounds for the complexity of
model checking, is a fine example on the borderline of
logic and computer science. Hjorth's paper relates
basic concepts of descriptive set theory to ergodic
theory, while Lewis and Barmpalias explore relations
between computability and various notions of
randomness. Gitik's paper deals with the arithmetic of
singular cardinals, and brings together many
important concepts and techniques of modern set
theory. Finally, Hrushovski's seminal paper on the
Manin-Mumford conjecture is a striking illustration of
the impact of modern model theory on problems of
algebraic geometry and number theory.
I am convinced that many of you will enjoy this
present from the publisher, which shows how lively and
exciting mathematical logic is nowadays.
I. Moerdijk

Aims and Scope


The Annals of Pure and
Applied Logic publishes
papers and short
monographs on topics of
current interest in pure and
applied logic, the
foundations of mathematics
and those areas of
theoretical computer science
and other disciplines which
are of direct interest to
mathematical logic.

Contents

Termination orders for three-dimensional rewriting
Yves Guiraud
Journal of Pure and Applied Algebra, 207 (2006), Pages 341-371

The complexity of first-order and monadic second-order logic revisited
Markus Frick and Martin Grohe
Annals of Pure and Applied Logic, 130 (2004), Pages 3-31

A lemma for cost attained
Greg Hjorth
Annals of Pure and Applied Logic, 143 (2006), Pages 87-102

Randomness and the linear degrees of computability
Andrew E.M. Lewis and George Barmpalias
Annals of Pure and Applied Logic, 145 (2007), Pages 252-257

On gaps under GCH type assumptions
Moti Gitik
Annals of Pure and Applied Logic, 119 (2003), Pages 1-18

The Manin-Mumford conjecture and the model theory of difference fields
Ehud Hrushovski
Annals of Pure and Applied Logic, 112 (2001), Pages 43-115

Coordinating Editor
Annals of Pure and Applied Logic


Journal of Pure and Applied Algebra 207 (2006) 341-371


www.elsevier.com/locate/jpaa

Termination orders for three-dimensional rewriting


Yves Guiraud
Institut de mathématiques de Luminy, Marseille, France
Received 17 November 2004; received in revised form 17 May 2005
Available online 28 November 2005
Communicated by I. Moerdijk

Abstract

This paper studies 3-polygraphs as a framework for rewriting on two-dimensional words. A
translation of term rewriting systems into 3-polygraphs with explicit resource management is given,
and the respective computational properties of each system are studied. Finally, a convergent
3-polygraph for the (commutative) theory of Z/2Z-vector spaces is given. In order to prove these
results, it is explained how to craft a class of termination orders for 3-polygraphs.
© 2005 Elsevier B.V. All rights reserved.

MSC: 08A50; 08A70; 16S15; 18C10; 18D05; 68Q70

0. Outline
This paper starts with the introductory Section 1 on equational theories and term
rewriting systems. It fixes the notations and graphical representations that are used in the
rest of the paper. Then, it focuses on one major restriction of term rewriting, namely the fact that
it cannot provide convergent presentations for commutative equational theories, that is, equational
theories that contain a commutative binary operator.
Section 2 studies the resource management operations of permutation, erasure and
duplication: they are implicit and global in term rewriting and it is sketched there how
to make them explicit. However, the framework for rewriting in algebraic structures needs
to be extended to include this change; Section 3 proposes 3-polygraphs to fulfil this role.
E-mail address: guiraud@iml.univ-mrs.fr.
0022-4049/$ - see front matter
doi:10.1016/j.jpaa.2005.10.011


Here, these objects, introduced in [3], are used as equational presentations of a special case
of 2-categories: Mac Lane's product categories, called PROs for short in [11].
These first three sections do not introduce new material, but focus on the notations,
representations, terminology and philosophy of this paper. Then Section 4 gives some
relations between term rewriting systems and 3-polygraphs: a translation from the former
to the latter is built and some properties are given. The main result of the section is the
proof of a conjecture from [10]: any left-linear convergent term rewriting system can be
translated into a convergent 3-polygraph.
To prove some of these results, one needs new tools, suited to the more
complicated structure of polygraphs. In particular, Section 5 introduces a recipe to build
termination orders for them. Section 6 applies this technique to prove
some termination results of Section 4. Finally, Section 7 applies the same technique to
prove the termination of the 3-polygraph $L(\mathbb{Z}_2)$, which was introduced in [10] and has been
known since then to be a confluent presentation of the equational theory of Z/2Z-vector
spaces. It is therefore the first known convergent presentation of a commutative
equational theory.


1. Equational theories and term rewriting systems
Universal algebra provides different types of objects in order to model algebraic
structures. Among them are equational theories: these are presentations by generators (or
operators) and relations (or equations, equalities). As an example, the equational theory
of monoids is a pair $(\Sigma, E_0)$ consisting of the signature $\Sigma$ (a set of operators) and the
family $E_0$ of equations given by:
$\Sigma = \{\mu : 2 \to 1,\ \eta : 0 \to 1\}$,
$E_0 = \{\mu(\mu(x, y), z) = \mu(x, \mu(y, z)),\ \mu(\eta, x) = x,\ \mu(x, \eta) = x\}$.
Each operator has a finite number of inputs and of outputs. When each one has exactly one
output, which is the case here, the signature is said to be algebraic. The given equational
theory $(\Sigma, E_0)$ is said to be the theory of monoids since monoids are exactly sets endowed
with a binary operation and a constant, such that the operation is associative and admits
the constant as a left and right unit.
The formal operations one can form on any set with a binary operation and a constant
are called the terms built from the signature $\Sigma$. There exist numerous ways to build the
set T of such terms, and each one gives a different representation for them. Two are used
here, a syntactic one and a diagrammatic one. For each one, a fixed countable set V is
needed; its elements are called variables.
The classical representation of terms defines them inductively with the following construction
rules: each variable is a term; the constant $\eta$ is a term; and, for any two terms u and v,
the formal expression $\mu(u, v)$ is a term.
The diagrammatic representation starts with the assignment, for each operator with n
inputs, of an arbitrarily chosen tree of height one with n leaves. For example, one can fix
the following trees:

Then, the terms are all the trees one can build from these two generating trees and whose
leaves are labelled with variables. As an example, the following figure pictures terms built
from the signature $\Sigma$, with the two representations for each one:

The equations from the theory of monoids generate equalities between terms that represent
the same operation, through a rewriting process. Let us sketch how this works. For
example, the following term contains the tree-part of the left member of the associativity rule,
which has been greyed out:

Hence, the associativity equation generates an equality between the chosen term and
another. To determine which one, let us apply the following method, which consists of
three steps: at first, the remaining (black) part of the term is copied; then, in the space left
empty, the other member of the rule is placed; finally, the two parts obtained are joined (by
dotted lines), according to the respective position of the variables in each member of the
equation. Concerning our example, this process is pictured as follows:

Note that each variable appears once and in the same position in each member of the
associativity rule, so that the links are direct. When the second term is compacted, the
following equality holds and is said to be generated by the associativity equation:

In order to study the computational properties of these rewriting processes, term rewriting
systems are useful; they can be defined as oriented equational theories. Indeed, such a
rewriting system is defined from an equational theory by keeping the same operators


and replacing each equation by a rewrite rule: it is an oriented version of the equation,
which can only be used in one direction. As an example, starting from the equational theory
of monoids, one can form the term rewriting system $(\Sigma, R_0)$, where $\Sigma$ is still the same
algebraic signature made of a product $\mu$ and a unit $\eta$ and $R_0$ is the following set of three
rules:
$\mu(\mu(x, y), z) \to \mu(x, \mu(y, z))$, $\mu(\eta, x) \to x$, $\mu(x, \eta) \to x$.

Rewrite rules generate reductions instead of equalities, and a graph containing terms
as vertices and reductions as edges is called a reduction graph. Some geometrical
properties of reduction graphs are of particular interest since they have consequences on
computational properties of the rewriting process. Among these geometrical properties,
three are particularly studied: termination, confluence and convergence.
A rewriting system terminates if it contains no infinite reduction path such as:
$u_0 \to u_1 \to u_2 \to \cdots \to u_n \to u_{n+1} \to \cdots$
Intuitively, this means that the rewriting calculus must end after a finite time, whatever
the input is. This is formalized by the following consequence of termination: every term u
has at least one normal form $\hat{u}$; this means that $\hat{u}$ is a term such that there exists a finite
reduction path from u to $\hat{u}$ (denoted by $u \to^* \hat{u}$) and $\hat{u}$ is irreducible (no rule can
apply to it).
A rewriting system is confluent if, whenever there exist three terms u, v and w such
that $u \to^* v$ and $u \to^* w$, then there exists a fourth term t such that $v \to^* t$ and $w \to^* t$.
Intuitively, this means that choices made between two rules that can transform the same
term do not have any consequence on a potential final result; in particular, this implies that
any term has at most one normal form.
Thus, one defines the last property: a rewriting system is convergent when it is both
terminating and confluent. One immediate consequence is that any term has exactly one
normal form. This property is very useful for several purposes.
One of the best known is the following usage: let us assume that $(\Sigma, E)$ is
an equational theory and that $(\Sigma, R)$ is a rewriting system that is a finite convergent
presentation of $(\Sigma, E)$, which means that it is a convergent rewriting system with a finite
number of rules and such that two terms are equal in the equational theory if and only if
there exists a non-oriented reduction path between these two terms in the rewriting system.
Then there exists a decision procedure to check if two terms u and v are equal or not.
Indeed, one computes their unique normal forms $\hat{u}$ and $\hat{v}$. Note that this is where the
finiteness condition is useful: it allows one to check if a term is a normal form. Then the
two normal forms $\hat{u}$ and $\hat{v}$ are compared: u and v are equal in the equational theory if and
only if $\hat{u}$ and $\hat{v}$ are (syntactically) equal.
However, term rewriting systems have a major restriction in this field: there is a large
class of equational theories for which they cannot provide a convergent presentation. These
are the commutative theories, fairly frequent in algebra, which are equational theories with
a commutative binary operator. As an example, let us take a look at one of the simplest,
namely the equational theory of commutative monoids. Its signature is still $\Sigma$; its set $E_1$
of equations is made of the same three as the ones for monoids (associativity and left and


right units) plus the following one expressing the commutativity of the product:
$\mu(x, y) = \mu(y, x)$.
From this theory, one can form a number of term rewriting systems, such as the one with
$\Sigma$ as signature and with the following choice $R_1$ of orientations for equations:
$\mu(\mu(x, y), z) \to \mu(x, \mu(y, z))$, $\mu(\eta, x) \to x$,
$\mu(x, \eta) \to x$, $\mu(x, y) \to \mu(y, x)$.
Note that the last rule could have been chosen in the reverse direction, but it would not
change the following fact: this rule generates infinite reduction paths. Indeed, for any two
terms u and v, the commutativity rule generates:
$\mu(u, v) \to \mu(v, u) \to \mu(u, v) \to \mu(v, u) \to \cdots$
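As a small illustration of this loop, the following Python sketch applies the commutativity rule at the root of a term and reports after how many steps the starting term reappears. The tuple encoding and the names `commute` and `steps_until_cycle` are assumptions made for this example only.

```python
def mu(u, v):
    # Hypothetical encoding of the product as a tagged tuple.
    return ("mu", u, v)


def commute(t):
    """Apply the rule mu(x, y) -> mu(y, x) at the root of t."""
    tag, u, v = t
    assert tag == "mu"
    return mu(v, u)


def steps_until_cycle(t, max_steps=10):
    """Return the number of rewriting steps after which t is reached again."""
    current = t
    for n in range(1, max_steps + 1):
        current = commute(current)
        if current == t:
            return n
    return None
```

Any positive answer exhibits a cycle in the reduction graph, hence an infinite reduction path: the rewriting system cannot terminate.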
The purpose of this paper is to provide a framework where some commutative equational
theories admit convergent presentations: 3-polygraphs. Links between term rewriting
systems and 3-polygraphs are studied and a new tool to prove termination is given and
applied on some examples.
The equational theory that provides the main example here is the one of Z/2Z-vector
spaces: it has the same operators as the previous ones (the binary product embodies the
sum and the unit is the zero) and a set $E_2$ of five equations made of the four from $E_1$
(associativity, left and right units and commutativity) plus the following fifth equation:
$\mu(x, x) = \eta$.
It expresses the fact that, in a Z/2Z-vector space, any element is its own opposite. This
theory is preferred to the theory of commutative monoids for two reasons. The first one
is theoretical: any boolean algebra has an underlying Z/2Z-vector space, so that any
convergent presentation for Z/2Z-vector spaces is a first step towards one for boolean
circuits. The second one concerns the application range of the tools developed here: this
fifth equation has some nasty computational effects and is thus important to encompass in
the new framework, so that it can be used for other applications.
From the theory of Z/2Z-vector spaces, the term rewriting system $(\Sigma, R_2)$ is built,
where $R_2$ is the following choice of orientations:
$\mu(\mu(x, y), z) \to \mu(x, \mu(y, z))$, $\mu(\eta, x) \to x$, $\mu(x, \eta) \to x$,
$\mu(x, y) \to \mu(y, x)$, $\mu(x, x) \to \eta$.

Note that this rewriting system is neither terminating nor confluent but will serve as a
starting point to build a convergent presentation. This transformation will start with the
study of the so-called resource management operations. For further information on (term)
rewriting systems, one can refer to [2].
2. Resource management operations
Let us recall the last step of the term rewriting process: one has to draw links between
two parts of a term, according to the variables occurring in the corresponding rule. As


mentioned earlier, the rewriting example in Section 1 is the simplest case: indeed, the
variables occur once each and in the same order in each member of the associativity rule.
However, if this is not the case, one has to use additional operations before links are drawn:
these operations are called the resource management operations and there are three
kinds of them: permutation, erasure and duplication.
Permutation is used, for example, when the commutativity rule is applied. Indeed,
in this case, one has to use a permutation operation that will exchange the two grey
subterms in any term such as the following generic one:

The second operation, erasure, is used in the following case, for example: let us consider a
theory containing a binary operator and a constant which is a right absorbing element. The
following figure displays a rule which expresses this property (on the right) together with
a generic application of this rule (on the left); this requires an intermediate operation that
erases the grey subterm:

Finally, the last operation, called duplication, can occur in the following case: let us
consider a theory containing two binary operators, one of which is left-distributive with
respect to the other. Then, when applied, a rule that expresses this property (such as the
one pictured on the right) requires the use of an operation that can duplicate the greymost
subterm (and exchange one of its copies with another subterm, but this is the
already-encountered permutation):

Thus, in term rewriting, these three operations are both implicit (they are not specified
by rules) and global (they act immediately on subterms of any size). We are now going
to sketch how one can make them explicit and local: only the idea is given here, the full
translation is postponed to Section 4.
Let us start with the following observation: the use of the three resource management
operations is specified both by the number of occurrences and the order of appearance of
each variable in each member of a rewrite rule. Thus, in order to make these operations
explicit, variables will be replaced by some additional operators that will represent
local permutations, erasers and duplicators; furthermore, rules will guarantee the global
behaviour of these local operators.


In order to give an idea of how the translation works, let us start with the study of this
term, which represents the operation $(x, y, z) \mapsto \mu(\mu(x, z), x)$:

Seen as an operation, it is the composite of $(x, y, z) \mapsto (x, z, x)$ followed by
$(x, y, z) \mapsto \mu(\mu(x, y), z)$. The first operation can be pictured as the following diagram
(a shunter), since its action is to tell where each of the three arguments goes in the term:

This diagram will be formalized as a composite of new operators and the term will be
translated this way (with some explanations below):

Variables in the term have been replaced by ordinals; indeed, we have seen that variables
are just labels corresponding to the first, second, third, etc. arguments taken by the
corresponding operation. Hence, they will be replaced by ordinals whenever it makes the
translation clearer. The second remark is also about variables, but in the translated diagram:
they will always appear, after translation, in order: 1, 2, 3, etc. Thus, they have no purpose
anymore; they will therefore vanish, as in the diagram.
Finally, let us see what operators will be added to the signature and sketch how to
translate terms and rules. One operator is added for each resource management operation:
indeed, in order to formalize our previous diagram, one must be able to exchange two
arguments, erase one or duplicate another one. Thus, we fix a (non-algebraic) signature
made of the following three resource management operators:

Each one has a representation that makes explicit the operation one wishes it to embody.
Some rules will be added to ensure their global behaviour, but they will be given in
Section 4. For the moment, the only thing we need to know is that these rules give the
following interpretations to these three operators:
$\tau(x, y) = (y, x)$, $\varepsilon(x) = $ (nothing), $\delta(x) = (x, x)$.
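To make these interpretations concrete, here is a minimal Python sketch in which a bundle of wires is a tuple and each resource management operator acts locally at a given position. The function names `tau`, `eps`, `delta` and `shunter`, and the particular decomposition used for the shunter, are assumptions chosen for illustration; the decomposition drawn in the paper may differ.

```python
def tau(wires, i):
    """Local permutation: exchange the wires at positions i and i + 1."""
    w = list(wires)
    w[i], w[i + 1] = w[i + 1], w[i]
    return tuple(w)


def eps(wires, i):
    """Local eraser: drop the wire at position i."""
    w = list(wires)
    del w[i]
    return tuple(w)


def delta(wires, i):
    """Local duplicator: copy the wire at position i next to itself."""
    w = list(wires)
    w.insert(i, w[i])
    return tuple(w)


def shunter(x, y, z):
    """One possible way to realise (x, y, z) |-> (x, z, x) with local operators."""
    w = delta((x, y, z), 0)   # (x, x, y, z)
    w = eps(w, 2)             # (x, x, z)
    w = tau(w, 1)             # (x, z, x)
    return w
```

The point of the sketch is that the global rearrangement of arguments is obtained as a composite of operators that each touch at most two adjacent wires.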

Now, let us sketch how terms are translated: first, the tree-part is copied; then and
progressively, resource management operators are added on the top of the copy, according
to the variables that appear in the term.


The following figure gives four sample translations (the translating map is denoted by
$\Phi$ thereafter):

Then, let us see how to translate the five rules of our term rewriting system derived from the
theory of Z/2Z-vector spaces. Each rule is pictured in order (associativity, left and right
units, commutativity and self-inverse), has been given a name (A, L, R, C and S) and has
its translation written just below:

Note that several cases may occur. For the first three rules, no resource management
operator is added during translation: these three rules are linear (or left- and right-linear).
When translated, the commutativity rule has one operator added on its right side and none
on its left side: it is a left-linear but not right-linear rule. Finally, the self-inverse rule has
one operator added on each of its members during translation: it is neither left- nor right-linear.
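The case distinction above can be mimicked by a small Python check. It uses a simplified criterion, stated here as an assumption and not taken from the paper: a member of a rule needs resource management operators exactly when its left-to-right list of variable occurrences differs from the list of variables of the left member, taken in order of first appearance. The tuple encoding and the names `variables` and `needs_resource_operators` are hypothetical.

```python
ETA = ("eta",)


def mu(u, v):
    return ("mu", u, v)


def variables(t):
    """Left-to-right list of variable occurrences in a term (variables are str)."""
    if isinstance(t, str):
        return [t]
    occurrences = []
    for argument in t[1:]:
        occurrences.extend(variables(argument))
    return occurrences


def needs_resource_operators(lhs, rhs):
    """Return (left needs operators, right needs operators) for the rule lhs -> rhs."""
    left = variables(lhs)
    order = list(dict.fromkeys(left))   # variables by first occurrence in lhs
    return left != order, variables(rhs) != order


# The five rules of R2, named as in the text.
RULES = {
    "A": (mu(mu("x", "y"), "z"), mu("x", mu("y", "z"))),
    "L": (mu(ETA, "x"), "x"),
    "R": (mu("x", ETA), "x"),
    "C": (mu("x", "y"), mu("y", "x")),
    "S": (mu("x", "x"), ETA),
}
```

Under this criterion, A, L and R need no operator on either side, C needs one on the right only, and S needs one on each side, which agrees with the description above.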
Some issues have now arisen. The first one concerns the rules to be added in order both
to describe the behaviour of our local permutation, eraser and duplicator and to ensure the
global coherence of these local rules.
The next issue is about the respective computational properties of the starting term
rewriting system and of the rewriting system one gets as a result of making the resource
management operations explicit. These first two issues are addressed in Section 4.
For the moment, we are concerned with a third issue: where does rewriting take place
now? Indeed, starting from a term rewriting system, we have crafted another rewriting
system which is not a term one, and for two reasons. The first one is that its signature
contains non-algebraic operators, that is, operators that do not have exactly one output
(the resource management operators have zero or two outputs). The second reason is that
variables have been dropped to be replaced by these new operators: this is also a step
outside term rewriting. Hence, our new object is not a term rewriting system and Section 3
recalls a notion from [3] used to describe it.


3. Three-dimensional polygraphs
Like equational theories, 3-polygraphs are useful objects in universal algebra, in
the sense that they allow one to present algebraic structures by generators and
relations. However, they are far more general than equational theories, and this has two
consequences: on the one hand, they can handle more general objects, like the rewriting
system sketched in Section 2, or the structure of quantum groups; but, on the other hand,
their generality comes with an increase in the structural complexity: the development of
new tools is mandatory to prove termination, for example.
Polygraphs are genuine categorical objects but we prefer a diagrammatic definition here.
For this paper, a 3-polygraph is made of a signature, that is a set of operators with a finite
number of inputs and a finite number of outputs, together with a family of rules: in fact,
this is just a special case of 3-polygraph, one with only one 0-cell and one 1-cell. For the
complete theory of n-polygraphs, the interested reader should check [3].
The operators are once again represented by fixed diagrams of size one, with as many
free edges at the top as the operator inputs and as many free edges at the bottom as the
operator outputs. For example, some usual diagram shapes are pictured here:

Some of them have already been encountered, some of the others are less algebraic: one
has zero input and zero output (it is useful to describe Petri nets, see [5]), while one has two
inputs and zero output (it is used together with its dual, with zero input and two outputs,
to represent knots and tangles).
Here, the terms one considers are all the circuits one can build with all these
elementary diagrams: these are the Penrose diagrams (or circuits) one can build with the
size one diagrams representing the operators, such as:

Each of these circuits has a finite number of inputs (on the top) and of outputs (on the
bottom) but has no variable. Furthermore, they need not be connected, as shown by the
wire-only circuit with three inputs and three outputs.
These circuits, which are also called diagrams or arrows, have an algebraic structure.
To explain it, let us use the notation $f : m \to n$ to express that f is a circuit with m
inputs and n outputs. For any circuit f, s(f) is its number of inputs and t(f) its number
of outputs. The following constructions and properties are valid for circuits:
Let $f : m \to n$ and $g : n \to p$. Then, one can connect each output of f with the
corresponding input of g, in the same order, to form a new circuit with m inputs and p
outputs denoted by $g \circ f$.
This composition operation admits local units: a circuit $f : m \to n$ satisfies $f \circ m = f$
and $n \circ f = f$, where p denotes the wire-only circuit with p inputs and p outputs.
Let $f : m \to n$ and $g : p \to q$. Then, one can put f and g side by side to form a new
circuit with m + p inputs and n + q outputs, denoted by $f \otimes g$.


This product operation admits a bilateral neutral element: the empty circuit 0 with neither
input nor output, represented by an empty diagram.
Finally, the composition and product are related by the exchange relations. They are
given by the following equality, which is required to hold for any two circuits $f : m \to n$
and $g : p \to q$:
$(t(f) \otimes g) \circ (f \otimes s(g)) = f \otimes g = (f \otimes t(g)) \circ (s(f) \otimes g)$.
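The composition, product and exchange relation can be tried out on a simple functional model, sketched below in Python. This is an assumption-laden illustration: circuits are modelled as functions from tuples to tuples, which is only one possible interpretation (it cannot represent, for instance, the circuits used for knots and tangles), and the names `Circuit`, `wires`, `compose` and `product` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Circuit:
    s: int                              # number of inputs
    t: int                              # number of outputs
    run: Callable[[Tuple], Tuple]       # action on a tuple of s values


def wires(p):
    """Wire-only circuit with p inputs and p outputs (local unit)."""
    return Circuit(p, p, lambda xs: xs)


def compose(g, f):
    """g after f: plug each output of f into the corresponding input of g."""
    assert f.t == g.s
    return Circuit(f.s, g.t, lambda xs: g.run(f.run(xs)))


def product(f, g):
    """f beside g: f acts on the first f.s wires, g on the remaining ones."""
    return Circuit(f.s + g.s, f.t + g.t,
                   lambda xs: f.run(xs[:f.s]) + g.run(xs[f.s:]))


# Two sample circuits: f : 2 -> 1 joins two strings, g : 1 -> 2 copies a value.
f = Circuit(2, 1, lambda xs: (xs[0] + xs[1],))
g = Circuit(1, 2, lambda xs: (xs[0], xs[0]))

left = compose(product(wires(f.t), g), product(f, wires(g.s)))
middle = product(f, g)
right = compose(product(f, wires(g.t)), product(wires(f.s), g))
```

Evaluating `left.run`, `middle.run` and `right.run` on the same input, such as `("a", "b", "c")`, gives the same tuple, which is what the exchange relation asserts in this model. The check is made on sample inputs only and is not a proof.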
Definition 3.1. A family C of circuits endowed with this structure $\circ$ and $\otimes$, satisfying the
aforementioned unit and exchange relations, is called a product category; the subset of
circuits with m inputs and n outputs is denoted by C(m, n). When the circuits of C are
freely built from a signature $\Sigma$, this object is the free product category generated by $\Sigma$,
denoted by $\langle \Sigma \rangle$.
Remark 3.2. Product categories, or PROs, were defined in [11]. An alternative definition
is: a product category is a strict monoidal category whose underlying monoid of objects is
(N, +, 0), the one of natural numbers with addition and zero. In [5], such a category was
called a (monochromatic) operad, for this structure is a common generalization of many
universal algebra objects: May's operads, Lawvere's algebraic theories and Mac Lane's
PROs and PROPs.
Product categories are also a special case of 2-monoids or 2-categories with only one
0-cell. A generalization of this paper's results should be possible, since circuit-like diagrams
extend to general 2-cells. For this paper, we stick to Mac Lane's product categories, but all
this terminology will be made clear in subsequent work.
A rewrite rule on a product category C is a pair $f \to g$ of parallel arrows (they have the
same number of inputs and the same number of outputs). Such a rule generates reductions
on circuits: whenever an arrow h contains f, the rule generates a reduction from h to k,
where k is the same as h, except that f has been replaced by g. The fact that f and g have
the same number of inputs and the same number of outputs ensures that one can connect
the unchanged part of the circuit with the changed part, without using implicit operations
beforehand.
Definition 3.3. A 3-polygraph is a pair $(\Sigma, R)$ where $\Sigma$ is a signature and R is a family of
rewrite rules on $\langle \Sigma \rangle$.
One way to formalize the reduction relation generated by rules on a free product
category $\langle \Sigma \rangle$ is to define contexts. We just explain here what they are, without digging
further into the technical aspects, developed in [5]. Let $\Sigma$ be a signature. Then, a context
on $\langle \Sigma \rangle$ is a circuit c with a hole inside: this hole has a finite number of inputs and
outputs where one can paste a circuit f with corresponding numbers of inputs and outputs;
this pasting operation results in a circuit denoted by c[f]. Then, a rule $f \to g$ generates a
reduction from each circuit c[f], with c any context, to the circuit c[g].
Finally, given two product categories C and D, a product category functor from C to D
is a map which sends each circuit of C onto a circuit of D with the same number of inputs
and outputs, and which preserves identities, products and compositions. When C is the
free product category $\langle \Sigma \rangle$, then a classical categorical argument tells us that any product
category functor $F : \langle \Sigma \rangle \to D$ is entirely and uniquely given by the circuits $F(\varphi)$ in D,
for every operator $\varphi$ in $\Sigma$.
4. From term rewriting to 3-polygraphs
This section uses results from [3], presented in a slightly different way, in order to prove
a conjecture from [10]: this is Theorem 4.6. This is the result that allows the Definition 4.8
of a translation from any term rewriting system into a 3-polygraph. Proposition 4.11 and
Theorem 4.12 give the respective computational properties of the term rewriting system
and the 3-polygraph.
In Section 2, a 3-polygraph has been built from the term rewriting system $(\Sigma, R_2)$,
which presents the equational theory of Z/2Z-vector spaces. Its signature, denoted by $\Sigma^c$,
is the one built from $\Sigma$ by addition of the three resource management operators $\tau$, $\delta$ and
$\varepsilon$. Its family of rules, denoted by $\Phi(R_2)$, consists of the translations $\Phi(A)$, $\Phi(L)$,
$\Phi(R)$, $\Phi(C)$ and $\Phi(S)$ of the five rules from the original term rewriting system. This
construction can be generalized to any term rewriting system but is still incomplete for
the moment. It lacks two families of rules and this section starts with their description.
Let us fix an algebraic signature $\Sigma$. The set of terms built on the signature $\Sigma$ and on
some fixed countable set V of variables is denoted by T. Let us assume that the set V is
endowed with a total order (given by a bijection with N), so that the variables can be written
$x_1$, $x_2$, $x_3$, etc. For any term u, the notation $\sharp u$ is used for the greatest natural number i such
that $x_i$ appears in u. Then, we define T(m, n) to be the set of families $(u_1, \dots, u_n)$ of n
terms such that $\sharp u_i \leq m$ for every i. Note that the set T(m, 0) has only one element,
denoted by (m). The following operations provide the set T with a product category
structure:
If $u = (u_1, \dots, u_n)$ is in T(m, n) and $v = (v_1, \dots, v_p)$ is in T(n, p), then their
composite $v \circ u$ is the family $(w_1, \dots, w_p)$ where each $w_i$ is built from $v_i$ by replacing
each $x_j$ with $u_j$.
The identity of n, for any natural number n, is the family $(x_1, \dots, x_n)$.
The product $u \otimes v$ of $u = (u_1, \dots, u_n)$ in T(m, n) and of $v = (v_1, \dots, v_q)$ in
T(p, q) is the family $(w_1, \dots, w_{n+q})$ built as follows: if i lies between 1 and n, then
$w_i$ is $u_i$; if i lies between 1 and q, then $w_{n+i}$ is $v_i$ where each $x_j$ has been replaced by $x_{j+m}$.
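These three operations can be written down directly. The Python sketch below is an illustration under assumed conventions: the variable $x_j$ is encoded as the integer j, an operator application as a tuple whose first entry is the operator name, and the names `subst`, `compose`, `identity` and `product` are hypothetical.

```python
def subst(t, f):
    """Replace each variable j occurring in the term t by f(j)."""
    if isinstance(t, int):
        return f(t)
    return (t[0],) + tuple(subst(argument, f) for argument in t[1:])


def compose(v, u):
    """Composite of v after u: replace each x_j in the terms of v by u_j."""
    return tuple(subst(vi, lambda j: u[j - 1]) for vi in v)


def identity(n):
    """Identity of n: the family (x_1, ..., x_n)."""
    return tuple(range(1, n + 1))


def product(u, m, v):
    """Product of u in T(m, n) with v: keep u, shift the variables of v by m."""
    return tuple(u) + tuple(subst(vi, lambda j: j + m) for vi in v)


# u = (mu(x1, x2), x3) is in T(3, 2) and v = (mu(x2, x1),) is in T(2, 1).
u = (("mu", 1, 2), 3)
v = (("mu", 2, 1),)
```

With these samples, `compose(v, u)` is the one-term family containing `("mu", 3, ("mu", 1, 2))`, and `product(u, 3, v)` renames the variables of `v` to `x_5` and `x_4`. The argument `m` has to be passed explicitly because a family of terms does not record its number of inputs in this encoding.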
Furthermore, this product category satisfies some additional properties. The first one is
that T is a cartesian category: seen as a strict monoidal category, the monoidal product $\otimes$
is the functorial part of a cartesian product. In our case and informally, this means that every
circuit $f : m \to n$ is entirely and uniquely determined by n circuits $m \to 1$, in the same
way that any function $f : X^m \to X^n$, where X is a set, is entirely and uniquely determined
by n functions $X^m \to X$: its components. To check that T is indeed cartesian, one uses
a result from [3], restricted to our setting:
Theorem 4.1 (Burroni). A product category C is cartesian if and only if it contains three
arrows:

such that the two following families of equations hold:


Termination orders for three-dimensional rewriting Y. Guiraud


Y. Guiraud / Journal of Pure and Applied Algebra 207 (2006) 341371

1. The family E , made of the following seven equations:


The signature is contained in T : one defines an inclusion i which sends each


: n 1 from onto the term (x1 , . . . , xn ). Hence, Corollary 4.3 extends i into a
cartesian functor F from h c i/E to T : this functor sends each from onto i()
and , and respectively onto (x2 , x1 ), (x1 , x1 ) and (1).
Conversely, let us consider an arrow f = (u 1 , . . . , u n ) in T (m, n). Each term u i can
be written u i = f i (y1i , . . . , yki i ), with ki an integer, f i an arrow in h i(ki , 1) and each y ij a
variable from {x1 , . . . , xm }. Furthermore, this decomposition of terms is unique. Thus, the
arrow f uniquely decomposes into:
f = ( f 1 . . . f n ) (y11 , . . . , yknn ).

2. The family E , made of three equations for each integer n and each arrow f : n 1
in C:

The following recursively defined arrow families (n )nN and (n,1 )nN have been used:

There remains to prove that every family (y1 , . . . , yk ) of variables in {x1 , . . . , xm } can be
uniquely written (modulo E ) with the three arrows i( ), i() and i(). This can be done
in two steps.
Let us define the sub-product category V of T by restricting ourselves to families of
variables: this is T, where denotes the signature with no operator. One also defines the
cartesian category Fo of finite sets: the arrows of Fo (m, n) are in bijective correspondence
with the functions from the finite set [n] = {1, . . . , n} to [m]. Then:
Lemma 4.4. The cartesian categories V and Fo are isomorphic.

with the initial values 0 = 0 and 0,1 = 1.


Note that the following convention is now used in diagrams: generating operators
are drawn with black diagrams, while composite arrows are grey. The union of the two
families E and E is denoted by E . Theorem 4.1 is not mandatory to get the
following proposition but yields an easy proof of it:
Proposition 4.2. The product category T is cartesian.
Proof. Let us start with the definition of the three arrows from Theorem 4.1: the arrow
is the pair (x2 , x1 ) of terms; the arrow is (x1 , x1 ); finally, the arrow is the empty
family (1). Computations to check the equations of Theorem 4.1 are straightforward. 
The next step consists in the proof that T is the free cartesian category generated
by the algebraic signature . In order to prove this fact, one starts with another use of
Theorem 4.1:
Corollary 4.3 (of Theorem 4.1). For every algebraic signature , the category h c i/E
is the free cartesian category generated by .
Hence, in order to prove that T is another version of the free cartesian category
generated by , it is sufficient to prove that there exists an isomorphism : T
h c i/E .


Proof. Let (y1 , . . . , yn ) be a family of variables taken in {x1 , . . . , xm }. Then, there exists
a unique function $f$ from $[n]$ to $[m]$ such that $y_i = x_{f(i)}$ for each $i$. Let us fix
(y1 , . . . , yn ) as the arrow f in Fo that corresponds to f . Conversely, if f is an arrow in
Fo (m, n): let us denote by f the corresponding function from [n] to [m]. Then one defines
( f ) = (x f (1) , . . . , x f (n) ). There remains to check that and are cartesian functors
which are inverse one another, which is straightforward. 
The second step uses another result from [3]:
Theorem 4.5 (Burroni). The cartesian categories Fo and hi/E are isomorphic.
Hence, the cartesian categories V and hi/E are isomorphic. Consequently, each
family (y1 , . . . , yk ) of variables taken in {x1 , . . . , xn } corresponds to a unique arrow
in hi/E . Furthermore, each arrow f in T (m, n) admits a unique decomposition
f = f f with f in h i and f in V.
Finally, one gets that the cartesian functor F from h c i/E to T is an isomorphism.
However, we want a map from T to h c i: let us find a convergent 3-polygraph
( c , R ) such that h c i/R is isomorphic to h c i/E and use the unique normal
form property.
A conjecture from [10] is proved:
Theorem 4.6. For any algebraic signature , the 3-polygraph ( c , R ) is convergent
and h c i/R is isomorphic to the free cartesian category h c i/E generated by ,
where the family of rules R is made of the following two subfamilies:


1. The family R :


Definition 4.8. For every term u in T and for every integer n ]u, the term u can
be seen as an arrow u n in T (n, 1). One denotes by n (u) the arrow (u n ) of h c i and
by (u) the particular case ]u (u). If = (u, v) is a rewrite rule on T , the notation ()
is used for the rewrite rule ((u), ]u (v)) on h c i.
As an immediate consequence of the definition, one gets:
Lemma 4.9. For any algebraic signature , any term u in T and any integer n ]u,
the arrow n (u) is a normal form for the resource management rules R .
The rest of this section is devoted to the comparison of a term rewriting system ( , R)
with the 3-polygraph ( c , R c ), where R c is the union of the family R of resource
management rules and of the family (R) made of the translations by of the rules R.

2. The family R given, for each integer n and each operator in (n, 1), by:

Remark 4.7. Three families of verifications need to be done. The first one consists in
checking that the new rules are derivable from E , which is straightforward.
The second one is much more complicated: one needs to check that the 3-polygraph
terminates. However, the structural complexity of polygraphs requires new techniques
since the usual ones used in rewriting do not work. One way to craft reduction orders
for 3-polygraphs is made explicit in Section 5 and used in Section 6 in order to prove the
termination of ( c , R ).
Finally, one needs to check that this 3-polygraph is confluent. Here, this is equivalent
to computing all of its critical pairs and checking that each one is confluent. Once again,
the structural complexity of polygraphs generates problems unknown with other kinds
of rewriting theories. For example, a finite 3-polygraph can produce an infinite number
of critical pairs; this is the case here. However, among these critical pairs, some have
properties that allow us to finally have only a finite number of computations to do. Critical
pairs of 3-polygraphs need to be further studied and classified according to properties of
this kind; this will be addressed in subsequent work.
The present case is discussed in Section 6 and fully studied in [5].
From Theorem 4.6, one concludes the existence of a map from T to h c i. Indeed, if f
is an arrow in the cartesian category T , then ( f ) will be the R -normal form of any
representant in h c i of the arrow F( f ) in the product category h c i/E . This map ,
which could not be proved to exist until Theorem 4.6, allows the formal definition of the
translation of terms into circuits.


Remark 4.10. Before stating the result, let us call uniformized a rule $(u, v)$ on $T\Sigma$
such that $u = f(y_1, \dots, y_k)$ with $f$ an arrow in $\langle\Sigma\rangle$ and $(y_1, \dots, y_k)$ a family of variables
with the following property: $y_1$ is $x_1$; then, for each $i$ in $\{1, \dots, k-1\}$, the variable $y_{i+1}$
is either in $\{y_1, \dots, y_i\}$, or $y_{i+1}$ is $x_{p+1}$ if $\{y_1, \dots, y_i\} = \{x_1, \dots, x_p\}$.
Note that any rule on $T\Sigma$ can be replaced by a uniquely defined uniformized rule that
generates the same reduction relation. Furthermore, if a left-linear rule is replaced by its
uniformized rule, this one is also left-linear.
Hence, for what follows, (left-linear) term rewriting systems can always be considered
uniformized: if they are not, they are replaced by their uniformized equivalent version, with
no consequence on the results.
This choice simplifies the translations: a rule $(u, v)$ that is both left-linear and
uniformized satisfies $u = f(x_1, \dots, x_{\sharp u})$, with $f$ an arrow in $\langle\Sigma\rangle$, uniquely defined; hence,
the translation of such a $u$ is $f$ and thus is an arrow of $\langle\Sigma\rangle$.
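The uniformization described in Remark 4.10 amounts to renaming variables in order of first occurrence in the left-hand side. A minimal Python sketch follows, under two assumptions that are not in the text: terms are encoded as nested tuples (a variable $x_i$ is the integer `i`), and every variable of the right-hand side also occurs in the left-hand side. The function name `uniformize` is hypothetical.

```python
# Illustrative encoding (an assumption): variable x_i -> int i,
# operator applied to arguments -> (name, subterm_1, ..., subterm_k).

def uniformize(lhs, rhs):
    """Rename variables so that first occurrences in lhs read x_1, x_2, ...

    Assumes every variable of rhs also occurs in lhs.
    """
    renaming = {}

    def collect(term):
        # Visit variables left to right, numbering each new one in turn.
        if isinstance(term, int):
            renaming.setdefault(term, len(renaming) + 1)
        else:
            for arg in term[1:]:
                collect(arg)

    def rename(term):
        if isinstance(term, int):
            return renaming[term]
        return (term[0], *(rename(arg) for arg in term[1:]))

    collect(lhs)
    return rename(lhs), rename(rhs)


# Hypothetical rule f(x3, g(x1, x3)) -> h(x1, x3).
new_lhs, new_rhs = uniformize(("f", 3, ("g", 1, 3)), ("h", 1, 3))
```

The renaming is a bijection on the variables involved, so the rule keeps the same instances and linearity of the left-hand side is unaffected.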
Proposition 4.11. If ( , R) is a term rewriting system, then:
1. If the term rewriting system ( , R) terminates, so does the 3-polygraph ( c , R c ).
2. The translation preserves the reduction steps generated by any left-linear rule , that
is: for any pair (u, v) of terms such that u v and any integer n ]u, there exists an
arrow f in h c i such that
n (u) () f  R n (v).
Proof. Point 1 uses the technique to be introduced in Section 5. Its proof is thus postponed
until Section 6. Point 2 requires lengthy and cumbersome though intuitively simple
computations that can be found in [5]. 
Theorem 4.12. A left-linear term rewriting system ( , R) terminates (resp. is confluent)
if and only if its associated 3-polygraph ( c , R c ) terminates (resp. is confluent).
Proof. Let us assume that the 3-polygraph $(\Sigma^c, R^c)$ terminates while the term rewriting
system $(\Sigma, R)$ does not. Consequently, there exists some sequence $(u_n)_{n \in \mathbb{N}}$ of terms in $T\Sigma$
such that $u_n \to_R u_{n+1}$ for every $n$. From Proposition 4.11, since every rule in $R$ is left-linear, one
concludes that, for every $k \geq \sharp u_0$:

$$\Phi_k(u_0) \to^+_{R^c} \Phi_k(u_1) \to^+_{R^c} \cdots \to^+_{R^c} \Phi_k(u_n) \to^+_{R^c} \Phi_k(u_{n+1}) \to^+_{R^c} \cdots$$

where the notation $\to^+_{R^c}$ stands for a non-empty $R^c$-reduction path. The existence of such an infinite
reduction path is denied by the termination of the 3-polygraph $(\Sigma^c, R^c)$, so that
$(\Sigma, R)$ terminates as well. The converse, which is true even if the term rewriting
system is not left-linear, is postponed to Section 6.
Now, let us assume that the term rewriting system ( , R) is confluent. Let us consider
a branching ( f, g, h) of ( c , R c ): the arrows f , g and h have the same finite number
of inputs, say m, and the same finite number of outputs, say n, and satisfy f  R c g
and f  R c h. Let us denote by the canonical projection of h c i onto T . Then,
sends each of f , g and h on families ( f 1 , . . . , f n ), (g1 , . . . , gn ) and (h 1 , . . . , h n ) of terms
such that each one has variables in {x1 , . . . , xm }. Moreover, for each i, one gets that the
triple ( f i , gi , h i ) is a branching of ( , R). From confluence of this rewriting system, one
concludes the existence of an arrow $k_i$ that closes this branching. Let us define $k$ as the
translation, by m , in h c i(m, n), of the family (k1 , . . . , kn ) of terms. Since ( , R) is
left-linear, Proposition 4.11 ensures that this arrow k closes the branching ( f, g, h).
Conversely, let us assume that the 3-polygraph ( c , R c ) is confluent. Let us consider
a branching (u, v, w) in ( , R); since this rewriting system is left-linear, this branching
translates to a branching ( n (u), n (v), n (w)) in ( c , R c ) for any n ]u. Since the
3-polygraph is confluent, there exists some arrow f in h c i(n, 1) closing this branching.
The projection ( f ) is an arrow in T (n, 1) and thus corresponds to a term that closes the
initial branching (u, v, w). 
Before considering what this result allows (or rather does not allow) us to conclude
about our term rewriting system ( , R2 ) presenting the theory of Z/2Z-vector spaces,
there remain some termination results to prove in the next section. However, the intrinsic
complexity of the polygraph structure prevents the use of classical techniques; rather, the
next section presents an adaptation, to the particular case of the 3-polygraphs we consider,
of classical interpretation techniques used to craft termination orders for terms.
5. Termination orders for 3-polygraphs
In rewriting, one of the most used techniques to prove termination is the following:
build a reduction order, which is a terminating strict order that is compatible with the term
structure; then prove that this order contains the rules. Hence any reduction path in the
corresponding rewriting system yields a strictly decreasing family for the reduction order:
the fact that such families cannot be infinite ensures that there cannot exist any infinite
reduction path or, equivalently, that the rewriting system terminates.
In term rewriting, one easy way to build reduction orders is by means of an
interpretation. The simplest ones work as follows: each term $u$ such that $\sharp u = n$ is sent to a function $u_*$
from $\mathbb{N}^n$ to $\mathbb{N}$ (or any set equipped with a terminating strict order). Then, one says that
$u > v$ if each $n$-tuple of integers is sent to a strictly greater integer by $u_*$ than by $v_*$. One
easy way to compute $u_*$ for each term $u$ is to fix $\varphi_*$ for each operator $\varphi$ in the considered
signature and to extend these values functorially. If one can prove that each $\varphi_*$ is a strictly
monotone map and that $f_* > g_*$ for each rule $f \to g$, then $u_* > v_*$ whenever there is a
reduction from $u$ to $v$. Since the order on $\mathbb{N}$ is terminating, so is the order on functions:
hence, the considered term rewriting system terminates.
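The classical interpretation technique can be illustrated on a small example. The term rewriting system below (Peano addition) and the chosen interpretations are assumptions introduced for this sketch only; they do not come from the paper. The check runs over a finite grid of values, so it is a sanity test rather than a proof: the actual argument needs the inequalities for all natural numbers, together with strict monotonicity of each interpretation.

```python
from itertools import product

# Hypothetical interpretations: each operator is sent to a strictly
# monotone function on natural numbers.
INTERP = {
    "zero": lambda: 1,
    "succ": lambda x: x + 1,
    "add": lambda x, y: 2 * x + y,
}

# Hypothetical rules; a variable x_i is the int i.
RULES = [
    (("add", ("zero",), 1), 1),                          # add(0, x1) -> x1
    (("add", ("succ", 1), 2), ("succ", ("add", 1, 2))),  # add(s(x1), x2) -> s(add(x1, x2))
]


def evaluate(term, values):
    """Interpretation of `term` applied to the tuple `values`."""
    if isinstance(term, int):
        return values[term - 1]
    head, *args = term
    return INTERP[head](*(evaluate(a, values) for a in args))


def greatest_variable(term):
    """Greatest index of a variable in `term` (0 for a closed term)."""
    if isinstance(term, int):
        return term
    return max((greatest_variable(a) for a in term[1:]), default=0)


def rule_decreases(lhs, rhs, bound=6):
    """Check lhs > rhs pointwise on all tuples with entries below `bound`."""
    n = max(greatest_variable(lhs), greatest_variable(rhs))
    return all(
        evaluate(lhs, vs) > evaluate(rhs, vs)
        for vs in product(range(bound), repeat=n)
    )


all_rules_decrease = all(rule_decreases(l, r) for l, r in RULES)
```

For these two rules the inequalities also hold symbolically: $2 + y > y$ and $2x + y + 2 > 2x + y + 1$.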


However, in the case of 3-polygraphs, this classical interpretation technique does not
yield reduction orders in general. Indeed, it is not always possible to send each operator
of the signature onto a strictly monotone map: for example, the erasure operator $\varepsilon$ will be
sent to a function from $\mathbb{N}$ to $\mathbb{N}^0$, that is to a single-element set: this function is unique
and monotone, but not strictly. Consequently, even if a rule $f \to g$ satisfies $f_* > g_*$, one gets
$(\varepsilon_n \circ f)_* = (\varepsilon_n \circ g)_*$, with $n$ the number of outputs of both $f$ and $g$.
One could also consider contravariant interpretations: hence, would be sent to a
constant natural number . But, in the most interesting 3-polygraphs, such as the ones
we are concerned with, there is a constant operator which cannot be contravariantly sent
to a strictly monotone map. The interpretation technique must be adapted to the polygraph
structure in order to yield termination orders.
Here we face a choice between two possible directions: the first one consists
in interpreting arrows into functions between objects equipped with a monoidal product,
rather than a cartesian one, such as vector spaces. But, when examined, this has led to
horrendous computations that did not produce any reduction order. Nonetheless, this path
should not be abandoned and should be reexamined once a computational tool adapted
to polygraphs is available.
The other path consists in using classical interpretations, both covariant and
contravariant, as tools to build a third interpretation: this one will give the desired reduction
orders. Let us present images that describe the intuition beneath the formalism. Each
arrow in the considered product category is seen as an electrical circuit whose elementary
components are the operators it is built from, such as suggested by the diagrammatic
representation used. Then, a heat production value is associated to each circuit: each of
its inputs and outputs receives a current with a fixed intensity; hence there are two types of
currents: some are descending (they come from the inputs and propagate downwards to the
outputs) and some are ascending (they propagate upwards, from the outputs to the inputs).
The heat produced by a fixed circuit is calculated this way: an operator is arbitrarily
chosen. Then, currents are propagated through the other operators to the chosen one. This
requires that choices have been made for each operator: for each one, one must be able to
compute the intensities of descending currents transmitted when one knows the intensities
of incoming descending current, and similarly with ascending currents. When one knows
the intensities of each current coming into the chosen operator, one computes the heat it
produces, according to values fixed in advance. Then, one repeats the same procedure for
each operator, and sums the results to get the heat produced by the considered circuit, for
the chosen current intensities.
Two circuits with the same number of inputs and the same number of outputs are
compared this way: if, for each family of (ascending and descending) current intensities,
one produces more heat than the other one, then the first one is said to be greater. The
goal of this section is twofold: firstly, to formalize the objects required to compute such an
order; secondly, to obtain sufficient conditions for this order to be a reduction order.
Let us describe the required materials. The first one is the object where the
interpretations take their values: this will be a product category equipped with a strict
order. In order to build it, one considers (non-empty) ordered sets X and Y to express the
current intensities, one for descending currents, one for ascending currents (for one of the
applications to be described, two different sets of values are needed). Then, a commutative


monoid M will contain the possible values of heats; moreover, it is supposed to be equipped
with an order such that the addition is strictly monotone in both variables.
From the data $X$, $Y$ and $M$, one builds a somewhat weird product category $O(X, Y, M)$
this way: an arrow from $m$ to $n$ in $O(X, Y, M)$ is a triple $f = (f_*, f^*, [f])$ consisting of
three monotone functions

$$f_* : X^m \to X^n, \qquad f^* : Y^n \to Y^m, \qquad [f] : X^m \times Y^n \to M.$$

The identity of $n$ is the triple $(\mathrm{Id}_{X^n}, \mathrm{Id}_{Y^n}, 0)$ made from the identities of $X^n$ and $Y^n$ and
the constant zero-function from $X^n \times Y^n$ to $M$. Two arrows $f : m \to n$ and $g : n \to p$
compose this way: $(g \circ f)_*$ and $(g \circ f)^*$ are respectively the composites $g_* \circ f_*$ and $f^* \circ g^*$;
for elements $\vec{x}$ in $X^m$ and $\vec{y}$ in $Y^p$, the function $[g \circ f]$ is given by:

$$[g \circ f](\vec{x}, \vec{y}) = [f](\vec{x}, g^*(\vec{y})) + [g](f_*(\vec{x}), \vec{y}).$$

If $f : m \to n$ and $g : p \to q$ are two arrows in $O(X, Y, M)$, then their product is given
by: $(f \otimes g)_*$ and $(f \otimes g)^*$ are respectively $f_* \times g_*$ and $f^* \times g^*$; if $\vec{x}$, $\vec{y}$, $\vec{x}'$ and $\vec{y}'$ are
respectively elements of $X^m$, $Y^n$, $X^p$ and $Y^q$, then $[f \otimes g]$ is given by:

$$[f \otimes g](\vec{x}, \vec{x}', \vec{y}, \vec{y}') = [f](\vec{x}, \vec{y}) + [g](\vec{x}', \vec{y}').$$
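The composition and product formulas of $O(X, Y, M)$ can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's construction in full: $X$, $Y$ and $M$ are all taken to be Python integers, elements of $X^m$ and $Y^n$ are tuples, and the class and function names are hypothetical. The sample operator `dup` uses the descending and ascending values stated in Section 6 for duplication in the case $n = 1$ (one current copied to two outputs, and $(i, j) \mapsto i + j + 1$ upwards); its heat function is made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Arrow:
    m: int              # number of inputs
    n: int              # number of outputs
    down: Callable      # covariant part:      X^m -> X^n
    up: Callable        # contravariant part:  Y^n -> Y^m
    heat: Callable      # heat function:       X^m x Y^n -> M


def identity(n):
    return Arrow(n, n, lambda x: x, lambda y: y, lambda x, y: 0)


def compose(g, f):
    """g after f: heat is [f](x, g^up(y)) + [g](f_down(x), y)."""
    assert f.n == g.m
    return Arrow(
        f.m, g.n,
        lambda x: g.down(f.down(x)),
        lambda y: f.up(g.up(y)),
        lambda x, y: f.heat(x, g.up(y)) + g.heat(f.down(x), y),
    )


def tensor(f, g):
    """f next to g: currents are split, heats are added."""
    return Arrow(
        f.m + g.m, f.n + g.n,
        lambda x: f.down(x[:f.m]) + g.down(x[f.m:]),
        lambda y: f.up(y[:f.n]) + g.up(y[f.n:]),
        lambda x, y: f.heat(x[:f.m], y[:f.n]) + g.heat(x[f.m:], y[f.n:]),
    )


# Sample operator with one input and two outputs.
# The heat value x[0] + y[0] + y[1] is an invented placeholder.
dup = Arrow(
    1, 2,
    lambda x: (x[0], x[0]),
    lambda y: (y[0] + y[1] + 1,),
    lambda x, y: x[0] + y[0] + y[1],
)

left = compose(tensor(identity(1), dup), dup)
right = compose(tensor(dup, identity(1)), dup)
```

With these values, both composites send a descending current $i$ to $(i, i, i)$ and ascending currents $(i, j, k)$ to $i + j + k + 2$, while their heats differ.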
Then one checks that these operations return monotone functions and that they satisfy the
required equations, in order to get:
Lemma 5.1. The aforementioned object O(X, Y, M) is a product category.
On top of this product category structure, a strict order relation $\succ$ is defined on parallel
arrows of $O(X, Y, M)$. If $f$ and $g$ are two arrows from $m$ to $n$, then $f \succ g$ if, for any $\vec{x}$
in $X^m$ and $\vec{y}$ in $Y^n$, the following three inequalities hold:

$$f_*(\vec{x}) \geq g_*(\vec{x}), \qquad f^*(\vec{y}) \geq g^*(\vec{y}), \qquad [f](\vec{x}, \vec{y}) > [g](\vec{x}, \vec{y}).$$

Now, let us consider a signature $\Sigma$. Let us assume that each operator $\varphi : m \to n$ in $\Sigma$ is
associated with an arrow $(\varphi_*, \varphi^*, [\varphi]) : m \to n$ in $O(X, Y, M)$: this is the interpretation.
For any $\varphi$, the monotone functions $\varphi_*$, $\varphi^*$ and $[\varphi]$ respectively express how the operator $\varphi$
transmits descending and ascending currents and how much heat it produces, according to
the current intensities it receives.
Since $\langle\Sigma\rangle$ is the free product category generated by the signature $\Sigma$, the map sending
each $\varphi$ in $\Sigma$ to the triple $(\varphi_*, \varphi^*, [\varphi])$ uniquely extends to a product category functor $F$
from $\langle\Sigma\rangle$ to $O(X, Y, M)$. This means that one can compute $f_*$, $f^*$ and $[f]$ for any circuit $f$
in $\langle\Sigma\rangle$, from the values $\varphi_*$, $\varphi^*$ and $[\varphi]$ given for each operator $\varphi$ in $\Sigma$ and using the
formulas for composition and product in $O(X, Y, M)$.
The last step consists in using $F$ to pull the order $\succ$ back from $O(X, Y, M)$ to $\langle\Sigma\rangle$: for
any two parallel arrows $f$ and $g$ in $\langle\Sigma\rangle$, $f \succ g$ holds if $F(f) \succ F(g)$.
Theorem 5.2. With the aforementioned notations and if the strict part of the order on $M$
is terminating, then the strict order $\succ$ constructed on $\langle\Sigma\rangle$ is a reduction order.
Proof. One must check that the binary relation $\succ$ built on $\langle\Sigma\rangle$ is irreflexive, transitive,
terminating and compatible with the product category structure. Let us assume that $f$ is an
arrow in $\langle\Sigma\rangle(m, n)$ such that $f \succ f$; let us fix any elements $\vec{x}$ and $\vec{y}$ respectively in the
non-empty sets $X^m$ and $Y^n$; then, by definition of $\succ$, one gets the following strict inequality
in $M$:

$$[F(f)](\vec{x}, \vec{y}) > [F(f)](\vec{x}, \vec{y}).$$

However, this inequality cannot hold in $M$ since $>$ is the strict part of an order relation.
Termination is proved by a similar argument: any infinite and strictly decreasing sequence
in $\langle\Sigma\rangle$ yields, through the non-emptiness of $X$ and $Y$, at least one infinite strictly decreasing
sequence in $M$, whose existence is denied by the assumed termination of the strict part of
its order. Transitivity follows from that of the orders on $X$, $Y$ and $M$. Finally,
compatibility with the product category structure is checked through computations which
use the monotonicity of each $f_*$, $f^*$ and $[f]$ in $O(X, Y, M)$, together with the facts
that $M$ is a commutative monoid and $F$ is a product category functor.
For concrete applications, presented in the next two sections, the following corollary
will be used instead of Theorem 5.2:
Corollary 5.3. Let us consider a 3-polygraph $(\Sigma, R)$. Let us assume that there exist:
1. Two non-empty ordered sets $X$ and $Y$.
2. A commutative monoid $M$ equipped with an order such that its strict part
is terminating and such that the sum is strictly monotone in both variables.
3. For each operator $\varphi$ in $\Sigma(m, n)$, three monotone functions:

$$\varphi_* : X^m \to X^n, \qquad \varphi^* : Y^n \to Y^m, \qquad [\varphi] : X^m \times Y^n \to M.$$

If the strict order $\succ$ on arrows of $\langle\Sigma\rangle$ built from these data, in the aforementioned
manner, satisfies $f \succ g$ for every rule $f \to g$ in $R$, then the 3-polygraph $(\Sigma, R)$
terminates.

6. Application 1: Explicit resource management polygraphs


This section is devoted to the remaining unproved results from Section 4. Let us fix a
term rewriting system ( , R) for the whole section.
6.1. Convergence of the 3-polygraph of explicit resource management
The first result to prove is Theorem 4.6: the 3-polygraph ( c , R ) is convergent,
where we recall from Section 4 that c is the signature made of the algebraic signature
and the resource management signature , while R is the family of resource
management rules.
The proof is divided into three steps: the first one consists in proving its termination; then,
we recall from [5] that this 3-polygraph is locally confluent; finally, Newman's lemma
is applied to get its convergence. Let us start with termination: we use the technique
developed in Section 5. However, the considered polygraph is rather complex and needs
two applications of the technique. For the rest of this paragraph, let us fix some notations.


Let us denote by the following rule:

We denote by $\mathbb{N}^*$ the set of non-zero natural numbers with its natural order relation.
The commutative monoid freely generated by $\mathbb{N}^*$ is denoted by $[\mathbb{N}^*]$ and is
equipped with the multiset order generated by the usual order relation on natural numbers.
The elements of $[\mathbb{N}^*]$ are all the finite formal sums of non-zero natural numbers; a natural
number $n$, seen as a generator of $[\mathbb{N}^*]$, is denoted by $\underline{n}$.
The multiset order is defined in two steps: first, one says that any sum
$a = \sum_i k_i.\underline{n_i}$ satisfies the inequality $\underline{n} > a$ if $n > n_i$ for each $i$; then, the multiset order is
taken as the reflexive and structure-compatible closure of this relation.
This implies that the addition is strictly monotone in both variables; furthermore, since
the strict order $>$ on $\mathbb{N}^*$ terminates, so does the strict part of the multiset order. Here is an
example of some strict inequalities that hold in $[\mathbb{N}^*]$:

$$0 < 127.\underline{1} < \underline{2} < 4.\underline{1} + 2.\underline{3} < \underline{4}.$$
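The chain of inequalities above can be checked mechanically. The sketch below assumes the standard Dershowitz-Manna characterisation of the strict multiset order (after cancelling common elements, every remaining element of the smaller multiset is dominated by some remaining element of the larger one); the paper defines the order as a closure instead, so treating the two as the same order is an assumption of this example. Formal sums are encoded as `collections.Counter` objects, and the function name is hypothetical.

```python
from collections import Counter


def multiset_greater(a, b):
    """Strict multiset order on finite multisets of positive integers."""
    a, b = Counter(a), Counter(b)
    # Cancel the common part; Counter subtraction keeps positive counts only.
    a_only, b_only = a - b, b - a
    if not a_only and not b_only:
        return False  # equal multisets
    # Every leftover element of b must sit below some leftover element of a.
    return all(any(y > x for y in a_only) for x in b_only)


# The chain 0 < 127.1 < 2 < 4.1 + 2.3 < 4, written as multisets.
chain = [
    Counter(),              # 0
    Counter({1: 127}),      # 127 copies of the generator 1
    Counter({2: 1}),        # the generator 2
    Counter({1: 4, 3: 2}),  # 4 copies of 1 plus 2 copies of 3
    Counter({4: 1}),        # the generator 4
]
chain_is_increasing = all(
    multiset_greater(chain[i + 1], chain[i]) for i in range(len(chain) - 1)
)
```

The second step shows the feature that matters for the local duplication rule: a single generator dominates any number of copies of strictly smaller generators.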
Lemma 6.1.1. The 3-polygraph ( c , R ) terminates if and only if the 3-polygraph
( c , {}) terminates.
Proof. Let us consider the product category O(N , N , [N ]) together with the termination
order  as defined in Section 5. Let us denote by F the product category functor from h c i
into O(N , N , [N ]) defined by the following values on the operators of c :


One checks that the first two non-strict inequalities are satisfied:

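$$((1 \otimes \delta) \circ \delta)_*(i) = (i, i, i) = ((\delta \otimes 1) \circ \delta)_*(i)$$
$$((1 \otimes \delta) \circ \delta)^*(i, j, k) = i + j + k + 2 = ((\delta \otimes 1) \circ \delta)^*(i, j, k).$$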
Moreover:

[(1 ) ](i, j, k, l) = 2.i + l + k + l + 2
[( 1) ](i, j, k, l) = 2.i + l + k.
Since l +2 > 0, one gets k + l + 2 > k and the required strict inequality. Then, consider
the rule for which the chosen values do not work. One gets the two following equalities:

((1 ) ( 1) (1 )) (i, j, k) = (k, j, i)

= (( 1) (1 ) ( 1)) (i, j, k)
((1 ) ( 1) (1 )) (i, j, k) = (k, j, i)

= (( 1) (1 ) ( 1)) (i, j, k).


And also this equality:
[(1 ) ( 1) (1 )](i, j, k, l, m, n)
= jk.m + m. j + k + i.( j + k).n + n.(i + j + j + k)
= [( 1) (1 ) ( 1)](i, j, k, l, m, n).
To finish with our examples, let us consider the most complicated rule of this presentation,
namely the local duplication rule:

It is this rule that motivates the use of the rather complicated product category
$O(\mathbb{N}^*, \mathbb{N}^*, [\mathbb{N}^*])$ to interpret $\langle\Sigma^c\rangle$. In order to make the computations for this rule, one
must start by proving the following equations, which is done by induction on the integer $n$:

(n ) (i 1 , . . . , i n ) = (i 1 , . . . , i n , i 1 , . . . , i n )

(i 1 , . . . , i n , j1 , . . . , jn ) = (i 1 + j1 + 1, . . . , i n + jn + 1)

n
, . . . , i n , j1 , . . . , jn X
, k1 , . . . , kn )
[n ](i 1X

=
(i u + ku ) +
(i u i v .ku + ku .i u + i v ).

1un

Three diagrams are given for each operator : two represent the functions and (how
transmits the current intensities) and one represents [] (the heat produces). Now, it is
checked that, for every rule f g in R , the inequality F( f )  F(g) holds, except for
the rule : s t, for which F(s) = F(t). Let us check the (in)equalities for three
sample rules. The complete computations are in [5]. Let us start with the coassociativity
rule for :

1u<vn

Then one gets these two equalities:



( ) (i 1 , . . . , i n ) = (i 1 + + i n + 1, i 1 + + i n + 1) = (( ) n )
( ) (i, j) = (i + j + 1, . . . , i + j + 1) = (( ) n ) .
For the strict inequality to be checked:

[ ](i 1 , . . . , i n , j, k) = j + k + 1 + i 1 + + i n + 1 + k

[( ) n ](i 1 , . . . , i n , j, k)
!
X
X
X

i u i v .k +
i u + k.
= j + n+1+

1u<vn


1u<vn

iu + iv .

1u<vn


The multiset order properties allow the conclusion: the left member of this rule is strictly
greater than its right member. Indeed, it is a consequence of the following strict
inequalities that hold in $[\mathbb{N}^*]$:

$$j + k + 1 > j$$
$$j + k + 1 > k$$
$$i_1 + \cdots + i_n + 1 > i_u \quad \text{for every } u$$
$$i_1 + \cdots + i_n + 1 > i_u + i_v \quad \text{for every } u \text{ and } v.$$
The computations for the other rules are handled similarly, albeit more easily. Now, let us
check the equivalence between termination of the 3-polygraphs ( c , R ) and ( c , {}).
Since is a rule of R , one concludes immediately that the termination of ( c , R )
implies the termination of ( c , {}): any infinite reduction path generated by the latter
would also be an infinite reduction path in the former.
Conversely, let us assume that ( c , {}) terminates and that there exists an infinite
reduction path ( f n )nN in ( c , R ). This path yields an infinite decreasing sequence
(F( f n ))n in O(N , N , [N ]), equipped with the order . Since this order terminates, the
sequence is stationary, which means that there exists some natural number n 0 such that
F( f n ) = F( f n+1 ) whenever n n 0 . However, as proved earlier, one can have both
f R g and F( f ) = F(g) only if f g. This implies that the sequence ( f n )nn 0 is
an infinite reduction path in ( c , {}). However, the existence of such an infinite reduction
path is prevented by the termination of ( c , {}). 


We must check that : s t satisfies F(s)  F(t). The computations give, on the
one hand, the two equalities:

((1 ) ( 1) (1 )) (i, j, k) = (k, j + 1, i + 2)

= (( 1) (1 ) ( 1)) (i, j, k)
((1 ) ( 1) (1 )) (i, j, k) = (k, j, i)

= (( 1) (1 ) ( 1)) (i, j, k).


On the other hand, one gets:

[(1 ) ( 1) (1 )](i, j, k, l, m, n) = 2i + 2 j + 2k + 2
[( 1) (1 ) ( 1)](i, j, k, l, m, n) = 2i + 2 j + 2k + 1.
By Corollary 5.3, this gives the result.

Thus, one gets, as a corollary of Lemmas 6.1.1 and 6.1.2:


Proposition 6.1.3. The 3-polygraph ( c , R ) terminates.
We recall the following result from [5, Proposition 5.31]:
Proposition 6.1.4. The 3-polygraph ( c , R ) is locally confluent.
Finally, Newman's lemma [2] is applied to get Theorem 4.6.
6.2. Termination of 3-polygraph built from a terminating rewriting system

Now, there remains to prove that:


Lemma 6.1.2. The 3-polygraph ( c , {}) terminates.
Proof. This is done using the technique from Section 5. The product category considered
for the interpretations is O(N, N, N), where N is the set (or commutative monoid) of natural
numbers, equipped with its natural order. We denote by G the product category functor
from h c i to O(N, N, N) defined by the following values on the operators of c :

This paragraph contains the proof of Proposition 4.11, point 1: if a term rewriting system
$(\Sigma, R)$ terminates, then so does its associated 3-polygraph $(\Sigma^c, R^c)$. The proof once again
uses a termination order obtained with Theorem 5.2. However, integer values cannot be
used here, since the rules in $R$ are unknown. To handle this issue, the following classical result
(see [2]) is used:
Theorem 6.2.1. A term rewriting system terminates if and only if there exists some
mapping $|\cdot|$ from the set of terms $T\Sigma$ to $\mathbb{N}$ such that $|u| > |v|$ whenever $u$ is a term
that reduces into another term $v$. Moreover, in that case, the mapping $|\cdot|$ can be chosen
such that $|u| \geq |u'|$ whenever $u'$ is a subterm of $u$; the mapping can also be chosen so that
it takes its values in any countable set.
Proof. If $(\Sigma, R)$ terminates, one can choose the mapping $|\cdot|$ to send each term $u$ onto the
length of the longest reduction path starting from $u$; this mapping satisfies $|u| \geq |u'|$ if $u'$
is a subterm of $u$, since every reduction path from $u'$ yields a reduction path of the same
length from $u$. Conversely, if such a mapping exists, an infinite reduction path $(u_n)_{n \in \mathbb{N}}$ in
$(\Sigma, R)$ would generate a strictly decreasing infinite sequence $(|u_n|)_{n \in \mathbb{N}}$ in $\mathbb{N}$, which cannot
exist; hence the term rewriting system $(\Sigma, R)$ terminates. If this is the case, the mapping
$|\cdot|$ can be composed with any bijection from $\mathbb{N}$ to $E$, where $E$ is any countable set.
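The measure used in the first half of this proof, the length of the longest reduction path, can be sketched on a small abstract example. The reduction graph below is invented for illustration and stands in for the one-step reduction relation of a terminating system; the sketch assumes that graph is finite and acyclic, and it does not model terms or the subterm property.

```python
from functools import lru_cache

# Hypothetical one-step reductions of a small terminating system:
# each key reduces in one step to each element of its list.
STEPS = {
    "a": ["b", "c", "d"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}


@lru_cache(maxsize=None)
def measure(u):
    """Length of the longest reduction path starting from u."""
    return max((1 + measure(v) for v in STEPS[u]), default=0)


# The measure strictly decreases along every reduction step.
strictly_decreasing = all(
    measure(u) > measure(v) for u in STEPS for v in STEPS[u]
)
```

Normal forms get measure 0, and any object with a successor gets a measure strictly above that of each successor, which is exactly the property required of the mapping in the theorem.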
Hence, from our terminating term rewriting system $(\Sigma, R)$, a mapping $|\cdot| : T\Sigma \to \mathbb{N}$
is assumed to be chosen such that $|u| > |v|$ whenever $u$ reduces to $v$ and $|u| \geq |u'|$
whenever $u'$ is a subterm of $u$. From this mapping, one defines a binary relation $>$ on $T\Sigma$


by u > v if, for every term context c, the inequality |c[u]| > |c[v]| holds. From the fact
that the usual order > on ℕ is a terminating strict order, this binary relation is proved to
satisfy:

Lemma 6.2.2. The aforementioned binary relation > on T is a terminating strict order.

Then, one builds the lexicographical order on T × ℕ*: for this order, (u, i) ≥ (v, j)
if u > v or if u = v and i ≥ j. This order satisfies:

Lemma 6.2.3. This relation is an order on T × ℕ*. Moreover, its strict part > is a
terminating strict order on T × ℕ*.

The set T × ℕ*, together with the aforementioned order, is taken as the first set used
in the interpretation. The second one is a one-element set {∗} with the only possible order.
Finally, the commutative monoid is once again [ℕ*] with its already-used multiset order.
The product category O(T × ℕ*, {∗}, [ℕ*]) is denoted by O.
Sometimes, the two elements ((u_1, i_1), …, (u_n, i_n)) of (T × ℕ*)^n and
(u_1, …, u_n; i_1, …, i_n) of T^n × (ℕ*)^n are identified.
The considered product category functor F from ⟨Σ^c⟩ to O is given by the following
values (only two are given for each operator since the contravariant interpretation is trivial):

There are two steps to check the conditions given in Corollary 5.3: the first one consists
in ensuring that each given operation is monotone; the second one consists in checking that
F(f) > F(g) holds for every rule f → g in R^c.
For the first part, consider, for example, the functions φ_* and [φ] for some fixed
operator φ in Σ(n, 1), n ≥ 1. Let us consider terms u_1, …, u_n, v_1, …, v_n and non-zero
integers i_1, …, i_n, j_1, …, j_n. Let us assume that (u_k, i_k) ≥ (v_k, j_k) for every k. In order
to prove that φ_* is monotone, one must check that either φ(u_1, …, u_n) > φ(v_1, …, v_n) or
both are equal and i_1 + ⋯ + i_n ≥ j_1 + ⋯ + j_n. Let c be a context. Since, for every k, u_k ≥ v_k
and c[φ(v_1, …, v_{k−1}, □, u_{k+1}, …, u_n)] is a context, one gets the following inequality:

|c[φ(v_1, …, v_{k−1}, u_k, …, u_n)]| ≥ |c[φ(v_1, …, v_k, u_{k+1}, …, u_n)]|.

Furthermore, if u_k > v_k for some k, then this inequality is strict for the same k; in this
case:

|c[φ(u_1, …, u_n)]| > |c[φ(v_1, …, v_n)]|.

Consequently, φ(u_1, …, u_n) > φ(v_1, …, v_n). Otherwise, if u_k = v_k and i_k ≥ j_k for
every k, then:

|c[φ(u_1, …, u_n)]| = |c[φ(v_1, …, v_n)]|
i_1 + ⋯ + i_n ≥ j_1 + ⋯ + j_n.

Thus, in both cases:

(φ(u_1, …, u_n), 2.(i_1 + ⋯ + i_n)) ≥ (φ(v_1, …, v_n), 2.(j_1 + ⋯ + j_n)).

In order to prove that [φ] is monotone, let us fix some k in [n]. Then, either u_k > v_k or
u_k = v_k and i_k ≥ j_k. In the first case:

|φ(v_1, …, v_{k−1}, u_k, …, u_n)| > |φ(v_1, …, v_k, u_{k+1}, …, u_n)|.

Thus, by definition of the multiset order on [ℕ*]:

(j_1 + ⋯ + j_{k−1} + i_k + ⋯ + i_n).|φ(v_1, …, v_{k−1}, u_k, …, u_n)|
> (j_1 + ⋯ + j_k + i_{k+1} + ⋯ + i_n).|φ(v_1, …, v_k, u_{k+1}, …, u_n)|.

In the second case, where u_k = v_k and i_k ≥ j_k:

(j_1 + ⋯ + j_{k−1} + i_k + ⋯ + i_n).|φ(v_1, …, v_{k−1}, u_k, …, u_n)|
≥ (j_1 + ⋯ + j_k + i_{k+1} + ⋯ + i_n).|φ(v_1, …, v_k, u_{k+1}, …, u_n)|.

Finally:

(i_1 + ⋯ + i_n).|φ(u_1, …, u_n)| ≥ (j_1 + ⋯ + j_n).|φ(v_1, …, v_n)|.

If φ is a constant in Σ(0, 1) or for operators in Δ, proofs are direct. Furthermore, for each
operator φ in either Σ or Δ, the operation φ^* is the only map from {∗} to itself, and it is
monotone, so that:

Lemma 6.2.4. For every operator φ in Σ^c, the aforementioned functions φ_*, φ^* and [φ]
are monotone.

Then, we must compare F(f) and F(g) for every rule f → g in R^c. Let us recall that
this family of rules consists of three subfamilies, the third one being made of the translations
of the rules of R. For any rule f → g in the first family, one gets F(f) = F(g), except for
left and right counit rules, where F(f) > F(g). Computations for rules in the second family
are more complicated; let us examine, for example, the rule for local duplication and one of
the rules for local permutation:


Let us fix some natural number n ≥ 1 and some φ in Σ(n, 1); for constants in Σ(0, 1),
computations are direct. By iteration on n, the following equalities are proved:
(n ) (E
u , E) = (E
u , uE; E, E) and

[n ](E
u ; E) = i 1 .u 1 + + i n .u n .

This gives, at first:


u , E) = ((E
u ), (E
u ); 2.(i 1 + + i n ), 2.(i 1 + + i n ))
( ) (E
= (( ) n ) (E
u , E).
u )|. To be compared with:
Then: [ ](E
u , E) = 3.(i 1 + + i n ).|(E
[( ) n ](E
u , E) = 2.(i 1 + + i n ).|(E
u )| + i 1 .u 1 + + i n .u n .
u ) for every k, and by assumption on | |, the inequality
Since u k is a subterm of (E
|(E
u )| |u k | holds. Hence, for every k:


[ n (g)](E
u , E)

[( f )](E
u , E) | f uE |. Moreover, point 2 gives
< |g uE + 1|. Finally,
since the reduction f uE guE holds in ( , R) and by properties of ||: | f uE | > |gvE |.
There remains to concatenate these three inequalities to get [( f )] > [ n (g)] and, as a
consequence F ( f )  F n (g). The product category functor F from h c i to O gives
us F( f )  F(g) for every rule f g in (R) and F( f )  F(g) for every rule f g
in R . This yields the following result:
Proposition 6.2.6. If the term rewriting system ( , R) terminates, then termination of
the 3-polygraph ( c , R c ) is equivalent to termination of ( c , R ).
Since we already know that ( c , R ) always terminates, this concludes the proof of
Theorem 4.12.
7. Application 2: A convergent 3-polygraph for a commutative equational theory

u )| i k .u k .
i k .|(E
u )| i 1 .u 1 + + i n .u n . This gives the inequality [ ]
Finally: (i 1 + + i n ).|(E
[( ) n ]. Now, let us consider the first rule for local permutation; the first step is to
prove, by iteration on n:
u , v; E, j) = (v, uE; j, E)
(n,1 ) (E

and [n,1 ](E


u , v; E, j) = 0.

u , v; Ei, j) = (v, (E
u ); j, 2.(i 1 + +i n )) = ((1)n,1 ) (E
u , v; Ei, j).
Then: ( (1)) (E
u )| = [(1 ) n,1 ](E
u , v; Ei, j).
And: [ ( 1)](E
u , v; Ei, j) = (i 1 + + i n ).|(E
The other rules in R are similarly handled and give similar results: for every rule
f g in R , the inequality F( f )  F(g) holds in O. The final part concerns the family
(R) of rules. Let us assume that : f g is a rule in R; its translation by is the rule
() : ( f ) ] f (g). Let us prove that F ( f )  F ] f (g). The first step is to
prove, by iteration on the degree of terms in T , the following lemma:
Lemma 6.2.5. Let u be a term in T , n be an integer such that n ]u, vE a family of
n terms in T and E a family of n non-zero natural numbers. Let us denote by vE the
substitution defined by xk v = vk if k n and xk otherwise. Then:
1. There exists some non-zero integer k such that ( n (u)) (u vE , k).
2. The inequality [ n (u)](E
v , E) < |u vE + 1| holds in [N ].
3. If u is not a variable, then the inequality [ n (u)](E
v , E) |u vE | also holds in [N ].
Point 1 gives, when applied to f and g with n = ] f , the existence of non-zero natural
u , E) = ( f uE , k) and n (g) (E
u , E) = (g uE , k 0 ).
numbers k and k 0 such that ( f ) (E
Let us consider some context c. By definition of the reduction relation generated
by the rule , one gets c[ f uE ] c[g uE ]. Consequently, the properties of | | give
|c[ f uE ]| > |c[g uE ]|. This holds for any context; thus, by definition of > on T , one
gets f uE > g uE . Finally, using the definition of > on T N :

This final section is devoted to giving a convergent presentation of the equational theory
of Z/2Z-vector spaces, which is, as mentioned before, a commutative equational theory
and thus does not have any convergent presentation by a term rewriting system.
In Section 1, we have considered three term rewriting systems (Σ, R_0), (Σ, R_1) and
(Σ, R_2) that respectively present the equational theories of monoids, of commutative
monoids and of Z/2Z-vector spaces. All three have two operators, a product and a unit,
and they have respectively three, four and five rules. Thus, their associated 3-polygraphs
have five operators together with twenty-three rules for (Σ^c, R_0^c), twenty-four for (Σ^c, R_1^c)
and twenty-five for (Σ^c, R_2^c).
Since (Σ, R_0) is a left-linear convergent term rewriting system, Theorem 4.12 ensures,
in particular, that (Σ^c, R_0^c) is a convergent presentation of the theory of monoids,
with explicit resource management. The term rewriting system (Σ, R_1) is left-linear,
non-terminating (due to the commutativity rule) and non-confluent (though it could
be completed to get a confluent rewriting system), hence Theorem 4.12 gives us that
(Σ^c, R_1^c) is a non-terminating and non-confluent presentation of the equational theory
of commutative monoids, with explicit resource management. Finally, the term rewriting
system (Σ, R_2) is non-left-linear, non-terminating and non-confluent: non-left-linearity
denies us any information coming from Theorem 4.12 about this
presentation.
However, there is, in [10], an equivalent 3-polygraph called L(Z2 ). Its signature contains
a sixth operator, called and pictured this way:

This new operator is said to be superfluous since it represents, in a Z/2Z-vector space, the
concrete operation (x, y) = ((x, y), x) that can be expressed in terms of , and . In
the presentation, this relation is enforced by means of the following extra rule:

( f ) > (g).
Let us prove now that [( f )] > [ n (g)]. Since is a term rewrite rule, its source f
is a non-variable term. Hence, point 3 of the previous lemma gives the inequality


The main objective of this new operator and rule is to make the proof of termination easier
(if not simply possible). Then, one has to add a certain number of rules in order to complete
the presentation, to finally obtain the 3-polygraph L(Z_2), discovered and named in [10].
This polygraph has six operators:

and sixty-seven rules:

The chosen values simplify the computations greatly. Indeed, normally, there are three
inequalities to check for each rule: hence, there should be 201 inequalities to check here.
The first reduction comes from the fact that F identifies and : there are 24 rules that can
be dropped since, for each one, there is another rule that is sent to the same image. Thus
there remain 43 rules and 129 inequalities to check.
Moreover, the rules of L(Z2 ) have some interesting symmetries that one can exploit:
indeed, whenever f g is a rule of L(Z2 ), then f o g o is also a rule of L(Z2 ), where
the duality ()o is the involution defined by:
o = ,

o = ,

(g f )o = f o g o ,

o = ,

o = ,

n o = n,

( f g)o = f o g o .

Another way to define this duality is by its action on diagrams: there, it is the top-down
symmetry. Furthermore, the functor F is compatible with this symmetry, in the sense that,
for every arrow f, the functor F sends f° onto F(f)°, where the duality on O is defined
as follows: (f_*, f^*, [f])° = (f^*, f_*, [f]°), with [f]°(x⃗, x⃗′) = [f](x⃗′, x⃗). Note that this
only has a meaning because the two sets X and Y are the same here (both equal to ℕ*).
Thus, if some rule f → g in L(Z_2) satisfies F(f) > F(g), then so does f° → g°.
As a consequence, this reduces the number of rules to study: 18 of the remaining rules
have a distinct dual, hence only 25 rules need to be studied (75 inequalities). Furthermore,
when a rule f g is self-dual, the inequality F( f ) F(g) holds if and only if
F( f ) F(g) holds: 8 of the remaining rules are in that case, which means there still
are 67 inequalities from the former 201 to check. Computations do not raise any difficulty.
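The bookkeeping in the preceding sentences can be replayed mechanically. The following sketch only re-does the arithmetic stated in the text (67 rules, three inequalities per rule, 24 rules dropped by identification, 18 rules with a distinct dual, 8 self-dual rules); the variable names are introduced here for illustration and are not the paper's.

```python
# Figures taken from the text; names are illustrative only.
RULES_TOTAL = 67
INEQUALITIES_PER_RULE = 3
DROPPED_BY_IDENTIFICATION = 24   # rules sharing their image under F with another rule
RULES_WITH_DISTINCT_DUAL = 18    # only one rule of each dual pair needs to be studied
SELF_DUAL_RULES = 8              # one of their inequalities follows from another

naive_inequalities = RULES_TOTAL * INEQUALITIES_PER_RULE

rules_after_identification = RULES_TOTAL - DROPPED_BY_IDENTIFICATION
inequalities_after_identification = rules_after_identification * INEQUALITIES_PER_RULE

rules_after_duality = rules_after_identification - RULES_WITH_DISTINCT_DUAL
inequalities_after_duality = rules_after_duality * INEQUALITIES_PER_RULE

remaining_inequalities = inequalities_after_duality - SELF_DUAL_RULES

print(naive_inequalities, inequalities_after_identification,
      inequalities_after_duality, remaining_inequalities)
```

Each intermediate value matches the count quoted in the text, which is a useful sanity check when reading the reduction from 201 inequalities down to 67.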
For example, let us study the following (self-dual) rule:

From [10], we already know that this presentation is confluent but termination was
still a conjecture. The technique presented in Section 5 now allows us to prove that
it is also terminating, hence convergent. The interpretation product category we use is
O(ℕ*, ℕ*, [ℕ*]), once again denoted by O. The interpretation functor F is given by the
following values on generating operators:


One computes

( ) (i, j) = (2i + j, i + j)
((1 ) ( 1) (1 )) (i, j) = (i + j, i + j).


Since i and j are non-zero natural numbers, the following inequality holds:
( ) > ((1 ) ( 1) (1 )) .
Then

[ ](i, j, k, l) = i + i + j + k + k + l
[(1 ) ( 1) (1 )](i, j, k, l) = 2i + j + 2k + l.
Since i and j are non-zero natural numbers, the inequalities i + j > i and i + j > j
always hold. Thus, by property of the multiset order on [N ], the inequality i + j > i + j
always holds. Similarly, so does k + l > k + l. Finally, the multiset order on [N ] is
compatible with addition, yielding:
[ ] > [(1 ) ( 1) (1 )].
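The multiset step used above (a single element i + j dominates the two-element multiset made of i and j) can be checked with a small comparison routine. The sketch below assumes that the multiset order on [ℕ*] is the standard Dershowitz–Manna multiset extension of the usual order on integers; this is an assumption, since the definition is given earlier in the paper and is not repeated in this section. The function name `multiset_greater` is hypothetical.

```python
from collections import Counter

def multiset_greater(m, n):
    """Dershowitz-Manna comparison (assumed here): m > n if m != n and every
    element occurring more often in n is dominated by a strictly larger
    element occurring more often in m."""
    m, n = Counter(m), Counter(n)
    if m == n:
        return False
    only_m = m - n  # surplus of m over n
    only_n = n - m  # surplus of n over m
    return all(any(y > x for y in only_m) for x in only_n)

# The step from the text, for sample non-zero values of i and j:
i, j = 2, 5
print(multiset_greater([i + j], [i, j]))
```

Under this assumption, the comparison holds for every pair of non-zero naturals, because i + j is strictly larger than both i and j.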
The other rules are studied in a similar way [5], which leads to the following result, proving
that commutative equational theories can admit polygraphic convergent presentations:
Theorem 7.1. The 3-polygraph L(Z2 ) is a convergent presentation of the equational
theory of Z/2Z-vector spaces, with explicit resource management.
8. Comments and future directions
The study of (3-)polygraphs was started by Albert Burroni and Yves Lafont, as an
algebraic model for three-dimensional calculus on two-dimensional objects. Foundations
were laid in [8,3,9]. In [10], rewriting systems generated by 3-polygraphs were considered
and many known equational presentations were studied in order to be completed into
convergent rewriting systems (or, at least, rewriting systems with the unique normal form
property). Discussions with Albert Burroni, Yves Lafont and Philippe Malbos have been
essential in order to achieve the results presented here. Comments from the referee were of
great help in making this paper clearer.
There exist many research paths concerning polygraphs. The first one is about
confluence: as mentioned earlier, there exist theoretical issues with critical pairs of
3-polygraphs; exploration and classification are mandatory in order to achieve some
automated completion procedure for these objects. Such a tool (whose implementation in
Caml has already started) would be very useful since, starting from an equational theory,
one could use the constructions described in Section 4 in order to obtain a 3-polygraph;
then a completion procedure could be applied to correct termination and confluence issues.
As suggested by Pierre Lescanne, other usual techniques for building reduction orders in
term rewriting could also be examined, in order to see if they could also be adapted to
polygraphs. Among the most useful results to be studied are the ones concerning path
orders, see [2], and dependency pairs, see [1].
A second theme to be explored is the study of higher-dimensional polygraphs. For
an example of application, 4-polygraphs provide a categorical framework for proof
transformations in the calculus of structures [4]. Such an approach could yield results such
as proof decompositions or normal forms, given by a convergent 4-polygraph. At least,
it suggests that formulas are two-dimensional objects, proofs are three-dimensional and


computation on them (such as cut elimination) lives in dimension 4. This point of view
is conjectured to yield a new class of objects describing formal proofs, giving a different,
categorical and geometrical way to approach proof theory.
Theoretical studies can also be directed at pursuing the synthesis started in [5] on rewriting
systems: one of the main goals is to have a framework where one can compare two
rewriting systems, regardless of the algebraic structure of their terms. The reduction space
associated to each rewriting system is an algebraico-geometric object (a cubical object in
some category of algebras) and one could use the underlying cubical sets of these objects
to compare rewriting systems, geometrically. Notions of (co)fibrations from the theory of
Quillen model categories (see [6]) could be useful for a better understanding of results such as
the ones of Section 4; since many rewriting systems are special cases of polygraphs, this
study will start with the construction of homotopical tools for these objects.
Still another question is the following: is there some n for which there exists a finite
n-polygraph yielding a calculus with both explicit substitutions and explicit resource
management for the λ-calculus? When n = 3, the answer seems to be negative, since
theoretical results deny the existence of any non-trivial product category that is both
cartesian (for resource management) and sovereign (for substitutions). An equational
description of the structure of closed category (such as the one Albert Burroni has given
for cartesian categories) should be the first step of this work. Another possibility is to use
a three-dimensional interpretation of proofs, together with the links between λ-terms and
proofs.
Finally, 3-polygraphs have the interesting property that they can model computational circuits.
Indeed, both classical and quantum algorithms accept representations as circuits which
are, albeit not in their usual presentation, genuine operators of a 3-polygraph. Furthermore,
equational presentations are known for both kinds of circuits. Questions that can be studied
from this point of view concern the existence of convergent 3-polygraphs for classical or
quantum circuits, thus leading to canonical representations of programs. One can take a
look at [7] for more information on circuits and [10] for their links with polygraphs.
References
[1] T. Arts, J. Giesl, Termination of term rewriting using dependency pairs, Theoretical Computer Science 236
(2000) 133–178.
[2] F. Baader, T. Nipkow, Term Rewriting and All That, Cambridge University Press, 1998.
[3] A. Burroni, Higher-dimensional word problems with applications to equational logic, Theoretical Computer
Science 115 (1) (1993) 46–62.
[4] A. Guglielmi, L. Straßburger, Non-commutativity and MELL in the Calculus of Structures, in: Lecture
Notes in Computer Science, vol. 2142, 2001, pp. 54–68.
[5] Y. Guiraud, Présentations d'opérades et systèmes de réécriture, Thèse de doctorat, Montpellier, 2004.
[6] M. Hovey, Model categories, Mathematical Surveys and Monographs 63 (1999).
[7] A. Kitaev, A. Shen, M. Vyalyi, Classical and quantum computation, Graduate Studies in Mathematics 47
(2002).
[8] Y. Lafont, Penrose Diagrams and 2-dimensional Rewriting, in: London Mathematical Society Lecture Notes
Series, vol. 177, 1992, pp. 191–201.
[9] Y. Lafont, Equational Reasoning with 2-dimensional Diagrams, in: Lecture Notes in Computer Science, vol.
909, 1995, pp. 170–195.
[10] Y. Lafont, Towards an algebraic theory of boolean circuits, Journal of Pure and Applied Algebra 184 (2003)
257–310.
[11] S. MacLane, Categorical algebra, Bulletin of the American Mathematical Society 71 (1965) 40–106.


The complexity of first-order and monadic second-order logic revisited
Markus Frick and Martin Grohe
Annals of Pure and Applied Logic 130 (2004), Pages 3-31

ANNALS OF PURE AND APPLIED LOGIC
Annals of Pure and Applied Logic 112 (2001) 43–115
www.elsevier.com/locate/apal

The Manin–Mumford conjecture and the model theory of difference fields
Ehud Hrushovski 1
Department of Mathematics, Hebrew University, Jerusalem, Israel
Received September 1995; accepted 17 April 2001
Communicated by A.J. Wilkie

Abstract
Using methods of geometric stability (sometimes generalized to finite S1 rank), we determine
the structure of Abelian groups definable in ACFA, the model companion of fields with an
automorphism. We also give general bounds on sets definable in ACFA. We show that these
tools can be used to study torsion points on Abelian varieties; among other results, we deduce
a fairly general case of a conjecture of Tate and Voloch on p-adic distances of torsion points
from subvarieties.
© 2001 Elsevier Science B.V. All rights reserved.
MSC: 03C; 11G; 12H10
Keywords: Abelian varieties; Torsion points; Difference fields; Geometric stability;
Model theory of difference equations

1 The work was begun at MIT, with support from the NSF. The latter part was supported by the ISF.
E-mail address: ehud@math.huji.ac.il (E. Hrushovski).
0168-0072/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved.
PII: S0168-0072(01)00096-3

1. Introduction
This paper extends and specializes the general model theory of difference equations
in characteristic 0, developed in [9]. We investigate the induced structure on definable
Abelian groups of finite dimension. We also give general bounds on the number of
solutions to a finite set of difference equations. As a corollary, we obtain a model-theoretic
proof of the Manin–Mumford conjecture. The proof yields new number-theoretic
information, particularly with respect to p-adic and algebraic uniformities of the bounds
obtained.
1.1. The Manin–Mumford conjecture
The Manin–Mumford conjecture states that if C is a curve of genus two or more,
embedded in its Jacobian J, then the set of torsion points on C is finite, see [18,

Annals of Pure and Applied Logic 130 (2004) 3–31
www.elsevier.com/locate/apal

The complexity of first-order and monadic second-order logic revisited
Markus Frick a,∗, Martin Grohe b
a SAP AG, Neurottstr. 15a, 69190 Walldorf, Germany
b Institut für Informatik, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
Available online 20 July 2004

Abstract
The model-checking problem for a logic L on a class C of structures asks whether a given
L-sentence holds in a given structure in C. In this paper, we give super-exponential lower
bounds for fixed-parameter tractable model-checking problems for first-order and monadic
second-order logic.
We show that unless PTIME = NP, the model-checking problem for monadic second-order
logic on finite words is not solvable in time f(k) · p(n), for any elementary function f and any
polynomial p. Here k denotes the size of the input sentence and n the size of the input word. We
establish a number of similar lower bounds for the model-checking problem for first-order logic, for
example, on the class of all trees.
© 2004 Elsevier B.V. All rights reserved.

∗ Corresponding author.
E-mail addresses: markus.frick@sap.com (M. Frick), grohe@informatik.hu-berlin.de (M. Grohe).
0168-0072/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.apal.2004.01.007

1. Introduction
1.1. Model-checking problems
We study the complexity of a fundamental algorithmic problem, the so-called
model-checking problem: given a sentence of some logic L and a structure A, decide whether
the sentence holds in A. Model-checking and closely related problems are of importance in
several areas of computer science, for example, in database theory, artificial intelligence, and
automated verification. In this paper, we prove new lower bounds on the complexity of the
model-checking problems for first-order and monadic second-order logic.
It is known that model-checking for both first-order and monadic second-order logic is
PSPACE-complete [17,20] and thus most likely not solvable in polynomial time. While this


result shows that the problems are intractable in general, it does not say too much about
their complexity in practical situations. Typically, we have to check whether a relatively
small sentence holds in a large structure. For example, when evaluating a database query,
we usually have a small query and a large database. Similarly, when verifying that a finite
state system satisfies some property, the specification of the property in a suitable logic
will usually be small compared to the huge state space of the system. When analysing the
complexity of the problem, we should take this imbalance between the size of the input
sentence and the size of the input structure into account.
1.2. Parameterized complexity theory

Parameterized complexity theory (see [5]) is a relatively new branch of complexity


theory that provides the framework for a refined complexity analysis of problems whose
instances consist of different parts that typically have different sizes. In this framework, a
parameterized problem is a problem whose instances consist of two parts of sizes n and k,
respectively. k is called the parameter, and the assumption is that k is usually small, small
enough that an algorithm that is exponential in k may still be feasible. A parameterized
problem is called fixed-parameter tractable if it can be solved in time f(k) · p(n) for an
arbitrary computable function f and some polynomial p. The motivation for this definition
is that, since k is assumed to be small, the feasibility of an algorithm for the problem mainly
depends on its behaviour in terms of n. Under this definition, a running time of O(2^k · n)
is considered tractable, but running times of O(n^k) or O(k · 2^n) are not, which seems
reasonable.
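To make the distinction concrete, the following sketch compares two of the running-time shapes named above for a small parameter and a growing input size. It is only an illustration: the function names `fpt_steps` and `xp_steps` are hypothetical, and the values are abstract step counts rather than measurements of any algorithm.

```python
def fpt_steps(k, n):
    """Step count of the form 2^k * n: exponential in k only, linear in n."""
    return 2 ** k * n

def xp_steps(k, n):
    """Step count of the form n^k: the parameter ends up in the exponent of n."""
    return n ** k

# With a small parameter (k = 10), the first shape stays within reach as n
# grows, while the second one explodes.
for n in (10 ** 3, 10 ** 6):
    print(n, fpt_steps(10, n), xp_steps(10, n))
```

The comparison shows why the definition separates the dependence on k from the dependence on n: only the first shape keeps the degree of the polynomial in n independent of the parameter.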
Let us remark that although fixed-parameter tractability has proven to be a valuable
concept allowing fine distinctions on the borderline between tractability and intractability,
it seems somewhat questionable to admit all computable functions f for the parameter
dependence of a fixed-parameter tractable algorithm. If f is doubly exponential or worse,
an O(f(k) · n)-algorithm can hardly be considered tractable. The main contribution of this
paper to parameterized complexity theory is to show that there are natural fixed-parameter
tractable problems requiring parameter dependences f that are doubly exponential or even
non-elementary.
1.3. The parameterized complexity of model-checking problems

Model-checking problems have a natural parameterization in which the size k of the


input sentence is the parameter. We have argued above that k is usually small in the
practical situations we are interested in, so a parameterized complexity analysis is
appropriate. Unfortunately, it turns out that the model-checking problem for first-order logic is
complete for the parameterized complexity class AW[∗], which is conjectured to strictly
contain the class FPT of all fixed-parameter tractable problems. Thus model-checking for
first-order logic is probably not fixed-parameter tractable. Of course this implies that
model-checking for the stronger monadic second-order logic is also most likely not
fixed-parameter tractable. As a matter of fact, since there is a monadic second-order sentence
saying that a graph is 3-colourable, it follows immediately that model-checking for monadic
second-order logic is not fixed-parameter tractable unless P = NP.


It is interesting to compare these intractability results for first-order logic and monadic
second-order logic with the following: the model-checking problem for linear time
temporal logic LTL is solvable in time 2^{O(k)} · n [14], making it fixed-parameter tractable and
also tractable in practice. On the other hand, model-checking for LTL is PSPACE-complete
(as it is for first-order and monadic second-order logic). So parameterized complexity
theory helps us in establishing an important distinction between problems of the same
classical complexity.¹ We may argue, however, that the comparison between LTL
model-checking and first-order model-checking underlying these results is slightly unfair. As
the name "linear time temporal logic" indicates, LTL only speaks about a linearly ordered
sequence of events. On an arbitrary structure, an LTL formula can thus only speak about
the paths through the structure. First-order formulas do not have such a restricted view. It
is therefore more interesting to compare LTL and first-order logic on words, which are the
natural structures describing linear sequences of events. A well-known result of Kamp [12]
states that LTL and first-order logic have the same expressive power on words. And indeed,
model-checking for first-order logic and even for monadic second-order logic is
fixed-parameter tractable if the input structures are restricted to be words. This is a consequence
of Büchi's theorem [2], saying that for every sentence of monadic second-order logic one
can effectively find a finite automaton accepting exactly those words in which the sentence
holds. A fixed-parameter tractable algorithm for monadic second-order model-checking
on words may proceed as follows: it first translates the input sentence into an equivalent
automaton and then tests in linear time whether this automaton accepts the input word. But
note that since there is no elementary bound for the size of a finite automaton equivalent
to a given first-order or monadic second-order sentence [18] (also see [15]), the parameter
dependence of this algorithm is non-elementary, thus it does not even come close to the
2^{O(k)} · n model-checking algorithm for LTL. Of course this does not rule out the existence
of other, better fixed-parameter tractable algorithms for first-order or monadic second-order
model-checking.
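The second phase of the automata-based algorithm described above, testing in linear time whether an automaton accepts the input word, can be sketched as follows. This is a minimal illustration under stated assumptions: the translation from a sentence to an automaton is not shown, and the transition table `NO_AA` is a hand-written, hypothetical automaton (for the property "no two consecutive a's") rather than one produced from a logical sentence. The names `accepts` and `NO_AA` are invented here.

```python
def accepts(transitions, start, accepting, word):
    """Run a deterministic finite automaton over `word`, one table lookup per letter."""
    state = start
    for letter in word:
        state = transitions[(state, letter)]
    return state in accepting

# Hypothetical automaton over the alphabet {a, b}: accepts exactly the words
# that do not contain two consecutive a's.
NO_AA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "dead", ("q1", "b"): "q0",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}

print(accepts(NO_AA, "q0", {"q0", "q1"}, "abab"))
print(accepts(NO_AA, "q0", {"q0", "q1"}, "baab"))
```

The loop visits each letter once, so the cost in n is linear; the whole difficulty discussed in the text lies in the size of the automaton, that is, in the dependence on k.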
1.4. Our results

Our first theorem shows that there is no fundamentally better fixed-parameter tractable
algorithm for first-order and monadic second-order model-checking on the class of words
than the automata based one described in the previous paragraph.

Theorem 1. (1) Assume that PTIME ≠ NP. Let f be an elementary function and p a
polynomial. Then there is no model-checking algorithm for monadic second-order
logic on the class of words whose running time is bounded by f(k) · p(n).
(2) Assume that FPT ≠ AW[∗]. Let f be an elementary function and p a polynomial.
Then there is no model-checking algorithm for first-order logic on the class of words
whose running time is bounded by f(k) · p(n).
1 A critical reader may remark that this distinction between the complexities of LTL model-checking and
first-order model-checking was known before anybody thought of parameterized complexity theory. This is true, but
how can we be sure that there is no 2^{O(k)} · n model-checking algorithm for first-order model-checking? The role
of parameterized complexity theory is to give evidence for this.


Here k denotes the size of the input sentence of the model-checking problem and n the
size of the input word.

Recall that a function f : ℕ → ℕ is elementary if it can be formed from the successor
function, addition, subtraction, and multiplication using compositions, projections,
bounded additions and bounded multiplications (of the form Σ_{z≤y} g(x̄, z) and
Π_{z≤y} g(x̄, z)).
The crucial fact for us is that a function f is bounded by an elementary function if,
and only if, it is bounded by an h-fold exponential function for some fixed h (see, for
example, [4]).
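The h-fold exponential functions mentioned in this characterization are easy to write down. The sketch below uses base 2 and the hypothetical name `tower`; both are choices made here for illustration, not notation taken from the paper.

```python
def tower(h, k):
    """h-fold exponential of k in base 2: tower(0, k) = k, tower(h + 1, k) = 2 ** tower(h, k)."""
    value = k
    for _ in range(h):
        value = 2 ** value
    return value

# For a fixed height h, tower(h, k) is an elementary function of k;
# letting the height grow with k would leave the elementary functions.
print([tower(h, 2) for h in range(4)])
```

A parameter dependence is non-elementary precisely when no fixed height h suffices to bound it, which is the sense in which the lower bounds of Theorem 1 are stated.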
To prove the theorem, we use similar coding tricks as those that can be used to prove
that there is no elementary algorithm for deciding the satisfiability of first-order sentences
over words [18].
Model-checking for first-order and monadic second-order logic is known to be
fixed-parameter tractable on several other classes of structures besides words: model-checking
for monadic second-order logic is also fixed-parameter tractable on trees and graphs of
bounded tree-width [3]. The latter is a well-known theorem due to Courcelle [3] playing a
prominent role in parameterized complexity theory. Theorem 1 implies that the parameter
dependence of monadic second-order model-checking on trees and graphs of bounded
tree-width is also non-elementary. In addition to trees and graphs of bounded tree-width,
model-checking for first-order logic is fixed-parameter tractable on further interesting
classes of graphs such as graphs of bounded degree [16], planar graphs [10], and more
generally locally tree-decomposable classes of structures [10]. Theorem 1(2) does not
imply lower bounds for the parameter dependence here. The reason for that is a peculiar
detail in the encoding of words by relational structures. The standard encoding includes
the linear order of the letters in a word as an explicit relation of the structure. If we omit
the order and just include a successor relation, Theorem 1(1) still holds, because the order
is definable in monadic second-order logic. However, the order is not definable in
first-order logic, and Theorem 1(2) does not extend to words without order. Indeed, we give a
model-checking algorithm for first-order logic on words without order, and more generally
on structures of degree 2, with a running time 2^{2^{O(k)}} · n, that is, with a doubly exponential
parameter dependence. We also give a model-checking algorithm for first-order logic on
structures of bounded degree d ≥ 3 with a triply exponential parameter dependence. We
match these upper bounds by corresponding lower bounds:
Theorem 2. Assume that FPT ≠ AW[∗], and let p be a polynomial.

(1) There is no model-checking algorithm for first-order logic on the class of words without
order whose running time is bounded by
$$2^{2^{o(k)}} \cdot p(n).$$

(2) There is no model-checking algorithm for first-order logic on the class of binary trees
whose running time is bounded by
$$2^{2^{2^{o(k)}}} \cdot p(n).$$

Again, k denotes the size of the input sentence and n the size of the input structure.


Finally, we obtain a non-elementary lower bound for first-order model-checking on
trees, which implies lower bounds for planar graphs and all other classes of graphs that
contain all trees.

Theorem 3. Assume that FPT ≠ AW[∗]. Let f be an elementary function and p a
polynomial. Then there is no model-checking algorithm for first-order logic on the class of
trees whose running time is bounded by $f(k) \cdot p(n)$.
2. Preliminaries

A vocabulary is a finite set of relation, function, and constant symbols. Each relation and
function symbol has an arity. $\tau$ always denotes a vocabulary. A structure $\mathcal{A}$ of vocabulary
$\tau$, or $\tau$-structure, consists of a set $A$ called the universe, and an interpretation $T^{\mathcal{A}}$ of each
symbol $T \in \tau$: relation symbols and function symbols are interpreted by relations and
functions on $A$ of the appropriate arity, and constant symbols are interpreted by elements
of $A$. All structures considered in this paper are assumed to have a finite universe. The
reduct of a $\tau$-structure $\mathcal{A}$ to a vocabulary $\tau' \subseteq \tau$ is the $\tau'$-structure with the same universe
as $\mathcal{A}$ and the same interpretation of all symbols in $\tau'$. An expansion of a structure $\mathcal{A}$ is a
structure $\mathcal{A}'$ such that $\mathcal{A}$ is a reduct of $\mathcal{A}'$. In particular, if $\mathcal{A}$ is a structure and $a \in A$, then
by $(\mathcal{A}, a)$ we denote the expansion of $\mathcal{A}$ by the constant $a$. We write $\mathcal{A} \cong \mathcal{B}$ to denote that
structures $\mathcal{A}$ and $\mathcal{B}$ are isomorphic.
Let $\Sigma$ be a finite alphabet. We let $\tau(\Sigma)$ be the vocabulary consisting of a binary relation
symbol $\le$, a unary function symbol $S$, two constant symbols min and max, and a unary
relation symbol $P_s$ for every $s \in \Sigma$. A word structure over $\Sigma$ is a $\tau(\Sigma)$-structure $\mathcal{W}$ with
the following properties:
• $\le^{\mathcal{W}}$ is a linear order of $W$, $\min^{\mathcal{W}}$ and $\max^{\mathcal{W}}$ are the minimum and maximum element of
$\le^{\mathcal{W}}$, and $S^{\mathcal{W}}$ is the successor function associated with $\le^{\mathcal{W}}$, where we let $S^{\mathcal{W}}(\max^{\mathcal{W}}) = \max^{\mathcal{W}}$.
• For every $a \in W$ there exists precisely one $s \in \Sigma$ such that $a \in P_s^{\mathcal{W}}$.

We refer to elements $a \in W$ as the positions in the word (structure) and, for every position
$a \in W$, to the unique $s$ such that $a \in P_s^{\mathcal{W}}$ as the letter at $a$.
It is obvious how to associate a word from the set $\Sigma^*$ of all words over $\Sigma$ with every
word structure over $\Sigma$ and, conversely, how to associate a word structure, unique up to
isomorphism, with every word in $\Sigma^*$. We identify words with the corresponding word structures
and write $\mathcal{W}$ to refer both to the word and the structure.
The class of all words (or word structures) over any alphabet is denoted by $\mathbb{W}$. The
length of a word $\mathcal{W} \in \mathbb{W}$ is denoted by $|\mathcal{W}|$.
A subword of a word $\mathcal{W} = s_0 \ldots s_{n-1} \in \Sigma^*$ is either the empty word or a word $s_i \ldots s_j$
for some $i, j$ with $0 \le i \le j < n$. We write $\mathcal{V} \sqsubseteq \mathcal{W}$ to denote that $\mathcal{V}$ is a subword of $\mathcal{W}$.
We assume that the reader is familiar with propositional logic, first-order logic FO and
monadic second-order logic MSO (see, for example, [7]). If $\gamma$ is a formula of propositional
logic and $\alpha$ is a truth-value assignment to the variables of $\gamma$, then we write $\alpha \models \gamma$ to
denote that $\alpha$ satisfies $\gamma$. Similarly, if $\varphi(x_1, \ldots, x_k)$ is a first-order or monadic second-order
formula with free variables $x_1, \ldots, x_k$, $\mathcal{A}$ is a structure, and $a_1, \ldots, a_k \in A$, then
we write $\mathcal{A} \models \varphi(a_1, \ldots, a_k)$ to denote that $\mathcal{A}$ satisfies $\varphi$ if the variables $x_1, \ldots, x_k$ are
interpreted by $a_1, \ldots, a_k$, respectively. A sentence is a formula without free variables. The
quantifier-rank of a formula $\varphi$, that is, the maximum number of nested quantifiers in $\varphi$, is
denoted by $\mathrm{qr}(\varphi)$.
The model-checking problem for a logic L on a class C of structures, denoted by
MC(L, C), is the following decision problem:

Input: Structure $\mathcal{A} \in C$, sentence $\varphi \in L$.
Problem: Decide if $\mathcal{A} \models \varphi$.
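To make the problem concrete, the following is a minimal sketch of a brute-force decision procedure for MC(FO, W). It is not taken from the paper: the tuple representation of formulas and the names `evaluate` and `model_check` are assumptions made for illustration, and only the order relation and the letter predicates of the word vocabulary are covered. Each quantifier loops over all n positions, so the running time grows roughly like n to the power of the quantifier rank, which is exactly the kind of parameter dependence in the exponent that a bound of the form f(k) · p(n) avoids.

```python
from typing import Dict

def evaluate(phi: tuple, word: str, env: Dict[str, int]) -> bool:
    """Evaluate formula phi on the word under the variable assignment env.

    Formulas are nested tuples (a hypothetical representation):
      ("letter", s, x)   position x carries letter s
      ("leq", x, y)      x <= y in the linear order of positions
      ("not", f), ("and", f, g), ("or", f, g)
      ("exists", x, f), ("forall", x, f)
    """
    op = phi[0]
    if op == "letter":
        return word[env[phi[2]]] == phi[1]
    if op == "leq":
        return env[phi[1]] <= env[phi[2]]
    if op == "not":
        return not evaluate(phi[1], word, env)
    if op == "and":
        return evaluate(phi[1], word, env) and evaluate(phi[2], word, env)
    if op == "or":
        return evaluate(phi[1], word, env) or evaluate(phi[2], word, env)
    if op in ("exists", "forall"):
        # Try every position of the word as the value of the quantified variable.
        results = (evaluate(phi[2], word, {**env, phi[1]: a})
                   for a in range(len(word)))
        return any(results) if op == "exists" else all(results)
    raise ValueError(f"unknown connective: {op!r}")

def model_check(sentence: tuple, word: str) -> bool:
    """Decide whether the word satisfies the sentence (no free variables)."""
    return evaluate(sentence, word, {})
```

For example, a sentence stating that some position carrying `a` strictly precedes a position carrying `b` holds in the word `ab` but not in `ba`.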

We fix a reasonable encoding of structures and formulas by words over {0, 1}. We denote
the length of the encoding of a structure $\mathcal{A}$ by $\|\mathcal{A}\|$ and the length of the encoding of a
formula $\varphi$ by $\|\varphi\|$. When reasoning about model-checking problems, we usually use n to
denote the size $\|\mathcal{A}\|$ of the input structure and k to denote the size $\|\varphi\|$ of the input sentence.
Our underlying model of computation is the standard RAM-model with addition and
subtraction as arithmetic operations (cf. [1,19]). In our complexity analysis we use the
uniform cost measure.
It is well-known that if we are interested in the complexity of first-order or monadic
second-order model-checking on words, the alphabet is inessential. This can be phrased as
follows:
Fact 4. Let $L \in \{\mathrm{FO}, \mathrm{MSO}\}$. Then there is a linear time algorithm that, given a sentence
$\varphi \in L$ and a word $\mathcal{W} \in \mathbb{W}$, computes a sentence $\varphi' \in L$ of vocabulary $\tau(\{0, 1\})$ and a word
$\mathcal{W}' \in \{0, 1\}^*$ such that $\|\varphi'\| \in O(\|\varphi\|)$, $\|\mathcal{W}'\| \in O(\|\mathcal{W}\|)$, and
$(\mathcal{W} \models \varphi \iff \mathcal{W}' \models \varphi')$.

$\mathbb{N}$ denotes the set of natural numbers (including 0). For all $n, i \in \mathbb{N}$ we let bit(i, n)
denote the i-th bit in the binary representation of n. (Here we count the least significant bit
as the 0th bit.) lg denotes the base-2 logarithm, and, for $i \in \mathbb{N}$, $\lg^{(i)}$ denotes the i-fold
logarithm. More formally, $\lg^{(i)}$ is defined by $\lg^{(0)}(n) = n$ and $\lg^{(i+1)}(n) = \lg \lg^{(i)}(n)$.
We define the tower function $T : \mathbb{N} \times \mathbb{R} \to \mathbb{R}$ by $T(0, r) = r$ and $T(h + 1, r) = 2^{T(h, r)}$
for all $h \in \mathbb{N}$, $r \in \mathbb{R}$. Thus T(h, r) is a tower of 2s of height h with an r sitting on top.
Observe that for all $n, h \in \mathbb{N}$ with $n \ge 1$ we have $T(h, \lg^{(h)} n) = n$.
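The two definitions above can be sketched directly in code. This is only an illustration: the function names `tower` and `iterated_lg` are chosen here and do not appear in the paper, and floating-point arithmetic means the identity $T(h, \lg^{(h)} n) = n$ can only be checked approximately when n is not itself a tower of 2s.

```python
import math

def tower(h: int, r: float) -> float:
    """Tower function: T(0, r) = r and T(h + 1, r) = 2 ** T(h, r)."""
    value = r
    for _ in range(h):
        value = 2 ** value
    return value

def iterated_lg(i: int, n: float) -> float:
    """i-fold base-2 logarithm: lg^(0)(n) = n and lg^(i+1)(n) = lg(lg^(i)(n)).

    The caller must make sure every intermediate value stays positive.
    """
    value = n
    for _ in range(i):
        value = math.log2(value)
    return value
```

With integer arguments, `tower(3, 1)` evaluates to 16 and `tower(4, 1)` to 65536, which shows how quickly the admissible range of numbers grows with the height h.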
3. Succinct encodings

We introduce a sequence of encodings $\mu_h$, for $h \ge 1$, of natural numbers by words
over certain finite alphabets. They are more and more succinct, not in the sense that the
codewords are shorter and shorter, but in the sense that they can be "decoded" by shorter
and shorter first-order formulas. "Decoding" is actually saying too much here; what we mean
is that there are shorter and shorter first-order formulas stating that two words encode the
same number. For example, if we encode numbers in unary, for every n there is a first-order
formula of length O(n) stating that two words encode the same number smaller than $2^n$.
If we encode numbers in binary, there is a first-order formula of length O(n) stating that
two words encode the same number smaller than $2^{2^n}$. We shall give, for every $h \ge 0$, an
encoding such that for every n there is a first-order formula of length O(n) stating that two
words encode the same number smaller than T(h, n). This is what Lemma 8, the key result
of this section, states.
For all $h \ge 1$ we let $\Sigma_h$ = {0, 1, <1>, </1>, . . . , <h>, </h>}. The tags <i> and </i>
represent single letters of the alphabet and are just chosen to improve readability. We define
$L : \mathbb{N} \to \mathbb{N}$ by $L(0) = 0$, $L(1) = 1$, $L(n) = \lfloor \lg(n - 1) \rfloor + 1$ for $n \ge 2$. Note that for
$n \ge 1$, L(n) is precisely the length of the binary representation of $n - 1$.
We are now ready to define our encodings $\mu_h : \mathbb{N} \to \Sigma_h^*$, for $h \ge 1$. We let
$\mu_1(0)$ = <1></1> and

$\mu_1(n)$ = <1> bit(0, n − 1) bit(1, n − 1) . . . bit(L(n) − 1, n − 1) </1>

for $n \ge 1$. For $h \ge 2$, we let $\mu_h(0)$ = <h></h> and

$\mu_h(n)$ = <h>
    $\mu_{h-1}(0)$ bit(0, n − 1)
    $\mu_{h-1}(1)$ bit(1, n − 1)
    . . .
    $\mu_{h-1}(L(n) - 1)$ bit(L(n) − 1, n − 1)
</h>

for $n \ge 1$. Here empty spaces and line breaks are just used to improve readability.
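The recursive definition above translates almost line by line into code. The following is a minimal sketch, assuming that a word is represented as a list of letters and that each tag is a single string; the names `L`, `bit`, and `mu` mirror the notation of the text, while the representation itself is an assumption made for illustration.

```python
from typing import List

def L(n: int) -> int:
    """L(0) = 0, L(1) = 1, and L(n) = floor(lg(n - 1)) + 1 for n >= 2."""
    if n <= 1:
        return n
    return (n - 1).bit_length()  # length of the binary representation of n - 1

def bit(i: int, n: int) -> int:
    """The i-th bit of n, counting the least significant bit as bit 0."""
    return (n >> i) & 1

def mu(h: int, n: int) -> List[str]:
    """Return the codeword mu_h(n) as a list of letters over Sigma_h."""
    word = [f"<{h}>"]
    if n >= 1:
        for i in range(L(n)):
            if h >= 2:
                # At levels h >= 2 every bit is preceded by the encoding
                # of its own index at the next lower level.
                word.extend(mu(h - 1, i))
            word.append(str(bit(i, n - 1)))
    word.append(f"</{h}>")
    return word
```

Joining the letters of `mu(2, 47)` reproduces the codeword worked out in Example 5.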
Example 5.

$\mu_2(47)$ = <2>
    $\mu_1(0)$ 0
    $\mu_1(1)$ 1
    $\mu_1(2)$ 1
    $\mu_1(3)$ 1
    $\mu_1(4)$ 0
    $\mu_1(5)$ 1
</2>

= <2>
    <1></1> 0
    <1>0</1> 1
    <1>1</1> 1
    <1>01</1> 1
    <1>11</1> 0
    <1>001</1> 1
</2>

Lemma 6. $|\mu_h(n)| \in O(h \cdot \lg^2 n)$.

Proof. We define functions $L_i : \mathbb{N} \to \mathbb{N}$ as follows: $L_1(n) = L(n)$ for all $n \in \mathbb{N}$ and
$L_i(n) = L_{i-1}(L(n))$ for all $i, n \in \mathbb{N}$ with $i \ge 2$. Moreover, we define $P_i : \mathbb{N} \to \mathbb{N}$ for
$i \ge 1$ by
$$P_i(n) = \prod_{j=1}^{i} L_j(n).$$
Observe that for all $i \ge 2$ and $n \ge 1$ we have $P_i(n) = L(n) \cdot P_{i-1}(L(n))$.
We first prove, by induction on $h \ge 1$, that for all $n \ge 1$,
$$|\mu_h(n)| \le 4h \cdot P_h(n). \qquad (1)$$
We have $|\mu_1(n)| = 2 + L(n) \le 4L(n) = 4P_1(n)$, so (1) is true for h = 1. Let $h \ge 2$ and
suppose that (1) holds for h − 1. Then
$$\begin{aligned}
|\mu_h(n)| &= 2 + L(n) + \sum_{i=0}^{L(n)-1} |\mu_{h-1}(i)| \\
&= 2 + L(n) + 2 + \sum_{i=1}^{L(n)-1} |\mu_{h-1}(i)| \\
&\le 4 + L(n) + \sum_{i=1}^{L(n)-1} 4(h-1) \cdot P_{h-1}(i) \\
&\le 4 + L(n) + 4(L(n) - 1) \cdot (h-1) \cdot P_{h-1}(L(n)) \\
&\le L(n) + 4(h-1) \cdot L(n) \cdot P_{h-1}(L(n)) \\
&\le L(n) + 4(h-1) \cdot P_h(n) \\
&\le 4h \cdot P_h(n).
\end{aligned}$$
This proves (1).

Since $L(n) \in \Theta(\lg n)$, to complete the proof of the lemma it suffices to show that
there is a constant c such that for all $h, n \ge 1$ we have $P_h(n) \le c \cdot L(n)^2$. Since
$L(L(n)) \in O(\lg \lg n)$ and $L(n) \in \Theta(\lg n)$, there is an $n_0$ such that for all $n \ge n_0$ we
have
$$L(L(n))^2 \le L(n).$$
Note that $P = \{P_h(m) \mid m < n_0, h \ge 1\}$ is a finite set and let $c = \max(P)$.
We prove that $P_h(n) \le c \cdot L(n)^2$ by induction on $h \ge 1$. Since $P_1(n) = L(n)$,
this statement is true for h = 1. For $h \ge 2$, we have $P_h(n) = L(n) \cdot P_{h-1}(L(n))$. If
$L(n) < n_0$, we have $P_{h-1}(L(n)) \le c$ and thus $P_h(n) \le c L(n)$. If $L(n) \ge n_0$, we have
$L(L(n))^2 \le L(n)$. By induction hypothesis, $P_{h-1}(L(n)) \le c \cdot L(L(n))^2$. Thus
$$P_h(n) = L(n) \cdot P_{h-1}(L(n)) \le L(n) \cdot c \cdot L(L(n))^2 \le c \cdot L(n)^2. \qquad \Box$$

Lemma 7. There is an algorithm that, given $h, n \in \mathbb{N}$, computes $\mu_h(n)$ in time
$O(|\mu_h(n)|) = O(h \cdot \lg^2 n)$.

Proof. The algorithm computes $\mu_h(n)$ in a straightforward recursive manner. We get the
following recurrence for the running time R(h, n):
$$R(h, n) \le O(L(n)) + \sum_{i=0}^{L(n)} R(h - 1, L(i)).$$
This recurrence is very similar to the one we obtained in the proof of Lemma 6 and can
easily be solved using the same methods. □

Observe that for all $m \ge 1$ we have
$$2^m = \max\{n \in \mathbb{N} \mid L(n) \le m\}.$$
Recall that $T(h, \ell)$ is a tower of 2s of height h with an $\ell$ on top. Thus, in particular, for all
$h, \ell \ge 1$ we have
$$T(h, \ell) = \max\{n \in \mathbb{N} \mid L(n) \le T(h - 1, \ell)\}. \qquad (2)$$

Lemma 8. Let $h \ge 1$, $\ell \ge 0$. There is a first-order formula $\chi_{h,\ell}(x, y)$ of size $O(h \cdot \lg h + \ell)$
such that for all words $\mathcal{W}$, $a, b \in W$, and $m, n \in \{0, \ldots, T(h, \ell)\}$ the following holds:
If a is the first position of a subword $\mathcal{U} \sqsubseteq \mathcal{W}$ with $\mathcal{U} \cong \mu_h(m)$ and b is the first position
of a subword $\mathcal{V} \sqsubseteq \mathcal{W}$ with $\mathcal{V} \cong \mu_h(n)$, then
$$\mathcal{W} \models \chi_{h,\ell}(a, b) \iff m = n.$$
Furthermore, the formula $\chi_{h,\ell}$ can be computed from h and $\ell$ in time $O(h \cdot \lg h + \ell)$.

Proof. Let h = 1. Recall that the $\mu_1$-encoding of an integer $p \ge 1$ is just the binary
encoding of p − 1 enclosed in <1>, </1>. Hence to say that x and y are $\mu_1$-encodings of
the same numbers, we have to say that for all pairs x + i, y + i of corresponding positions
between x respectively y and the next closing </1>, there are the same letters at x + i
and y + i. For numbers p in $\{0, \ldots, T(1, \ell)\}$, there are at most $L(p) \le \ell$ positions to be
investigated. To express this, we let
$$\begin{aligned}
\chi_{1,\ell}(x, y) = \exists x_1 \ldots \exists x_\ell \, \exists y_1 \ldots \exists y_\ell \Big(
& Sx = x_1 \wedge \bigwedge_{i=1}^{\ell-1} \big( (P_{\texttt{</1>}} x_i \to x_i = x_{i+1}) \wedge (\neg P_{\texttt{</1>}} x_i \to S x_i = x_{i+1}) \big) \\
\wedge\; & Sy = y_1 \wedge \bigwedge_{i=1}^{\ell-1} \big( (P_{\texttt{</1>}} y_i \to y_i = y_{i+1}) \wedge (\neg P_{\texttt{</1>}} y_i \to S y_i = y_{i+1}) \big) \\
\wedge\; & \bigwedge_{i=1}^{\ell} \big( (P_0 x_i \leftrightarrow P_0 y_i) \wedge (P_1 x_i \leftrightarrow P_1 y_i) \big) \Big).
\end{aligned}$$
Now let $h \ge 2$ and suppose that we have already defined $\chi_{h-1,\ell}(x, y)$. It will be convenient
to have the following auxiliary formulas available:
$$\chi^h_{\mathrm{int}}(x, y) = x < y \wedge \forall z \big( (x < z \wedge z \le y) \to \neg P_{\texttt{</h>}} z \big),$$
$$\chi^h_{\mathrm{last}}(x, y) = x < y \wedge P_{\texttt{</h>}} y \wedge \forall z \big( (x < z \wedge z < y) \to \neg P_{\texttt{</h>}} z \big).$$
Intuitively, $\chi^h_{\mathrm{int}}(x, y)$ says that y is in the interior of the subword of the form $\mu_h(p)$ starting
at x and $\chi^h_{\mathrm{last}}(x, y)$ says that y is the last position of the subword of the form $\mu_h(p)$ starting
at x, provided such a subword indeed starts at x.
To say that the subwords starting at x and y are $\mu_h$-encodings of the same numbers, we
have to say that for all positions w between x and the next closing </h> and all positions z
between y and the next closing </h>, if w and z are first positions of subwords isomorphic
to $\mu_{h-1}(q)$ for some $q \in \mathbb{N}$, then the positions following these two subwords are either both
1s or both 0s. For all subwords of $\mu_h(p)$ of the form $\mu_{h-1}(q)$ we have $q \in \{0, \ldots, L(p)\}$.
In order to apply the formula $\chi_{h-1,\ell}$ to test equality of such subwords, we must have
$q \le L(p) \le T(h - 1, \ell)$. By (2), the last inequality holds for all $p \le T(h, \ell)$. Thus for
such p we can use the formula $\chi_{h-1,\ell}$ to test equality of subwords of $\mu_h(p)$ of the form
$\mu_{h-1}(q)$. As a first approximation to our formula $\chi_{h,\ell}$, we let
$$\begin{aligned}
\chi'_{h,\ell}(x, y) = \;& \forall w \Big( \big( \chi^h_{\mathrm{int}}(x, w) \wedge P_{\texttt{<h-1>}} w \big) \to \exists z \big( \chi^h_{\mathrm{int}}(y, z) \wedge P_{\texttt{<h-1>}} z \wedge \chi_{h-1,\ell}(w, z) \big) \Big) \\
\wedge\; & \forall z \Big( \big( \chi^h_{\mathrm{int}}(y, z) \wedge P_{\texttt{<h-1>}} z \big) \to \exists w \big( \chi^h_{\mathrm{int}}(x, w) \wedge P_{\texttt{<h-1>}} w \wedge \chi_{h-1,\ell}(w, z) \big) \Big) \\
\wedge\; & \forall w \forall z \Big( \big( \chi^h_{\mathrm{int}}(x, w) \wedge P_{\texttt{<h-1>}} w \wedge \chi^h_{\mathrm{int}}(y, z) \wedge P_{\texttt{<h-1>}} z \wedge \chi_{h-1,\ell}(w, z) \big) \\
& \qquad \to \forall w' \forall z' \big( (\chi^{h-1}_{\mathrm{last}}(w, w') \wedge \chi^{h-1}_{\mathrm{last}}(z, z')) \to (P_1 Sz' \leftrightarrow P_1 Sw') \big) \Big).
\end{aligned}$$
The first line of this formula says that every subword of the form $\mu_{h-1}(q)$ in the subword
of the form $\mu_h(p)$ starting at x also occurs in the subword of the form $\mu_h(p')$ starting at y.
The second line says that every subword of the form $\mu_{h-1}(q)$ in the subword of the form
$\mu_h(p')$ starting at y also occurs in the subword of the form $\mu_h(p)$ starting at x. The third
and fourth lines say that if w and z are the first positions of isomorphic subwords of the
form $\mu_{h-1}(q)$, then they are either both followed by a 1 or both by a 0 (since the only two
letters that can appear immediately after a subword $\mu_{h-1}(q)$ in a word $\mu_h(p)$ are 0 and 1).
This formula says what we want, but unfortunately it is too large to achieve the desired
bounds. The problem is that there are three occurrences of the subformula $\chi_{h-1,\ell}(w, z)$.
We can easily overcome this problem. We let


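$$\zeta(w, z) = \forall w' \forall z' \big( (\chi^{h-1}_{\mathrm{last}}(w, w') \wedge \chi^{h-1}_{\mathrm{last}}(z, z')) \to (P_1 Sz' \leftrightarrow P_1 Sw') \big)$$
and
$$\begin{aligned}
\chi_{h,\ell}(x, y) = \forall w \exists z \Big( & \big( (\chi^h_{\mathrm{int}}(x, w) \vee \chi^h_{\mathrm{int}}(y, w)) \wedge P_{\texttt{<h-1>}} w \big) \to \\
& \big( (\chi^h_{\mathrm{int}}(x, w) \to \chi^h_{\mathrm{int}}(y, z)) \wedge (\chi^h_{\mathrm{int}}(y, w) \to \chi^h_{\mathrm{int}}(x, z)) \\
& \quad \wedge\; P_{\texttt{<h-1>}} z \wedge \chi_{h-1,\ell}(w, z) \wedge \zeta(w, z) \big) \Big).
\end{aligned}$$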

It is not hard to see that $\chi_{h,\ell}(x, y)$ has the desired meaning.

Observing that $\|\chi_{1,\ell}\| \in O(\ell)$ and that $\|\chi_{h,\ell}\| = \|\chi_{h-1,\ell}\| + c \cdot \lg h$ for some constant c,
we obtain the desired bound on the size of the formulas. To see why we need the factor lg h
here, recall that $\|\chi_{h,\ell}\|$ is the length of a binary encoding of $\chi_{h,\ell}$. The vocabulary of the
formula $\chi_{h,\ell}$ is of size O(h), thus the binary encoding of the symbols in this vocabulary
will require O(lg h) bits.
The fact that $\chi_{h,\ell}$ can be computed in time linear in the size of the output is immediate
from the construction. □


4. Encodings of propositional formulas


In this section, we give a sequence of encodings of propositional formulas in conjunctive
normal form and assignments to the variables of these formulas such that there are shorter
and shorter first-order formulas stating that the encoded assignment satisfies the encoded
formula. The key idea is to use the encodings $\mu_h$ of the natural numbers to encode
propositional variables by their index. Then by Lemma 8, we can check with a very
short first-order formula if two subwords of a codeword that represent variables actually
represent the same variable. This way we can look up the value of a variable in a table
representing the assignment.
The class of all formulas in conjunctive normal form is denoted by CNF, and for every
$k \ge 1$ the class of all formulas in k-conjunctive normal form, that is, conjunctions of
clauses of size at most k, is denoted by k-CNF.
We assume that propositional formulas only contain variables $X_i$, for $i \in \mathbb{N}$. For a set
$\Theta$ of propositional formulas, we let $\Theta(n)$ denote the set of all formulas in $\Theta$ whose variables
are among $X_0, \ldots, X_{n-1}$.
To encode formulas and assignments, we will use an alphabet that is obtained from the
alphabet $\Sigma_h$ introduced in the previous section by adding a number of symbols in several
stages throughout this section. We start by adding the symbols

+, -, <lit>, </lit>, <clause>, </clause>, <cnf>, </cnf>.

We fix h and define an encoding of CNF-formulas by words as follows: For a literal $\lambda$, we
let

$\mu_h(\lambda)$ = <lit> $\mu_h(i)$ + </lit>   if $\lambda = X_i$,
$\mu_h(\lambda)$ = <lit> $\mu_h(i)$ - </lit>   if $\lambda = \neg X_i$

(for every $i \in \mathbb{N}$). For a clause $\delta = (\lambda_1 \vee \ldots \vee \lambda_m)$ we let

$\mu_h(\delta)$ = <clause> $\mu_h(\lambda_1) \ldots \mu_h(\lambda_m)$ </clause>,

and for a CNF-formula $\gamma = (\delta_1 \wedge \ldots \wedge \delta_m)$ we let

$\mu_h(\gamma)$ = <cnf> $\mu_h(\delta_1) \ldots \mu_h(\delta_m)$ </cnf>.

Next, we need to encode assignments. Let A(n) denote the set of all assignments
$$\alpha : \{X_0, \ldots, X_{n-1}\} \to \{\mathrm{TRUE}, \mathrm{FALSE}\}.$$
We add the symbols <val>, </val>, <asn>, </asn>, true, false to our alphabet. For
an assignment $\alpha \in A(n)$, we let

$\mu_h(\alpha)$ = <asn>
    <val> $\mu_h(0)$ $\alpha(X_0)$ </val>
    . . .
    <val> $\mu_h(n - 1)$ $\alpha(X_{n-1})$ </val>
</asn>.

Of course what is meant by $\alpha(X_i)$ here is the symbol true if $\alpha(X_i)$ = TRUE and the
symbol false otherwise. For a pair $(\gamma, \alpha) \in \mathrm{CNF}(n) \times A(n)$ we simply let
$\mu_h(\gamma, \alpha) = \mu_h(\gamma)\, \mu_h(\alpha)$.
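The encoding of a pair consisting of a CNF-formula and an assignment can be sketched as follows. The data representation is an assumption made for illustration: a literal is a pair `(i, positive)`, a clause is a list of literals, a CNF-formula is a list of clauses, and an assignment is a list of Booleans indexed by variable. The helper `mu` repeats the number encoding of Section 3 so that the block is self-contained; the function names are chosen here and do not come from the paper.

```python
from typing import List, Sequence, Tuple

Literal = Tuple[int, bool]  # (i, True) stands for X_i, (i, False) for its negation

def L(n: int) -> int:
    """L(0) = 0, L(1) = 1, L(n) = floor(lg(n - 1)) + 1 for n >= 2."""
    return n if n <= 1 else (n - 1).bit_length()

def mu(h: int, n: int) -> List[str]:
    """Codeword mu_h(n) of Section 3 as a list of letters."""
    word = [f"<{h}>"]
    if n >= 1:
        for i in range(L(n)):
            if h >= 2:
                word.extend(mu(h - 1, i))
            word.append(str(((n - 1) >> i) & 1))
    word.append(f"</{h}>")
    return word

def encode_literal(h: int, literal: Literal) -> List[str]:
    index, positive = literal
    return ["<lit>"] + mu(h, index) + ["+" if positive else "-", "</lit>"]

def encode_cnf(h: int, cnf: Sequence[Sequence[Literal]]) -> List[str]:
    word = ["<cnf>"]
    for clause in cnf:
        word.append("<clause>")
        for literal in clause:
            word.extend(encode_literal(h, literal))
        word.append("</clause>")
    word.append("</cnf>")
    return word

def encode_assignment(h: int, alpha: Sequence[bool]) -> List[str]:
    word = ["<asn>"]
    for index, value in enumerate(alpha):
        # One table row per variable: its index followed by its truth value.
        word += ["<val>"] + mu(h, index) + ["true" if value else "false", "</val>"]
    word.append("</asn>")
    return word

def encode_pair(h: int, cnf: Sequence[Sequence[Literal]],
                alpha: Sequence[bool]) -> List[str]:
    """mu_h(gamma, alpha): the formula encoding followed by the assignment table."""
    return encode_cnf(h, cnf) + encode_assignment(h, alpha)
```

The assignment part acts as a lookup table: a formula that wants the value of a variable occurring in a literal searches for the `<val>` row whose index codeword equals the one inside the literal.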
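The following lemma is an immediate consequence of Lemmas 6 and 7:
Lemma 9. Let $h \in \mathbb{N}$ and $(\gamma, \alpha) \in \mathrm{CNF}(n) \times A(n)$. Then $|\mu_h(\gamma, \alpha)| = O(h \cdot \lg^2 n \cdot (\|\gamma\| + n))$
and there is an algorithm that computes $\mu_h(\gamma, \alpha)$ in time $O(h \cdot \lg^2 n \cdot (\|\gamma\| + n))$ (that
is, linear in the size of the output).

Lemma 10. For all $h, \ell \in \mathbb{N}$ there is a first-order sentence $\varphi_{h,\ell}$ of size $O(h \cdot \lg h + \ell)$ such
that for all $n \le T(h, \ell)$ and $(\gamma, \alpha) \in \mathrm{CNF}(n) \times A(n)$,
$$\mu_h(\gamma, \alpha) \models \varphi_{h,\ell} \iff \alpha \models \gamma.$$
Furthermore, the formula $\varphi_{h,\ell}$ can be computed in time $O(h \cdot \lg h + \ell)$.

Proof. Let $\chi_{h,\ell}(x, y)$ be the formula defined in Lemma 8. Recall that it says that the
subwords of the form $\mu_h(m)$ and $\mu_h(n)$ starting at x, y, respectively, are identical,
provided that such subwords start at x and y and that $n, m \le T(h, \ell)$. Also recall the
formula
$$\chi^h_{\mathrm{last}}(x, y) = x < y \wedge P_{\texttt{</h>}} y \wedge \forall z \big( (x < z \wedge z < y) \to \neg P_{\texttt{</h>}} z \big),$$
defined in the proof of Lemma 8, which says that y is the last position of the subword of
the form $\mu_h(n)$ starting at x.
We first define a formula $\varphi^{\mathrm{lit}}_{h,\ell}(x)$ such that, if the subword of $\mu_h(\gamma, \alpha)$ starting at x is the
encoding of a literal, the formula holds precisely when this literal is satisfied by $\alpha$. We let
$$\varphi^{\mathrm{lit}}_{h,\ell}(x) = \forall y \forall x' \forall y' \Big( \big( P_{\texttt{<val>}} y \wedge \chi_{h,\ell}(Sx, Sy) \wedge \chi^h_{\mathrm{last}}(Sx, x') \wedge \chi^h_{\mathrm{last}}(Sy, y') \big) \to (P_{+} Sx' \leftrightarrow P_{\texttt{true}} Sy') \Big).$$
Suppose that the encoding of the literal $(\neg) X_i$ starts at x. The formula $\varphi^{\mathrm{lit}}_{h,\ell}(x)$ looks for
a y such that the encoding of a pair $(j, \alpha(X_j))$ starts at y, then compares i and j, and if
they are equal, checks that the symbol indicating the sign of the literal is + if, and only if,
$\alpha(X_j)$ = TRUE. Next, we define a formula $\varphi^{\mathrm{clause}}_{h,\ell}(x)$ such that, if the subword of $\mu_h(\gamma, \alpha)$ starting
at x is the encoding of a clause, the formula holds precisely when this clause is satisfied by $\alpha$. We let
$$\varphi^{\mathrm{clause}}_{h,\ell}(x) = \exists y \Big( \forall z \big( (x < z \wedge z \le y) \to \neg P_{\texttt{</clause>}} z \big) \wedge P_{\texttt{<lit>}} y \wedge \varphi^{\mathrm{lit}}_{h,\ell}(y) \Big).$$
It simply says that there is a position y which is still within the boundary of the clause
starting at x such that a literal starts at y and this literal is satisfied. Finally, we let
$$\varphi_{h,\ell} = \forall y \big( P_{\texttt{<clause>}} y \to \varphi^{\mathrm{clause}}_{h,\ell}(y) \big).$$
This formula says that all clauses and thus the whole CNF-formula are satisfied. □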

For reasons that will become clear in the next section, we will also have to encode tuples
$(\gamma, V_1, \ldots, V_t)$, where $\gamma \in \mathrm{CNF}(n)$ and $V_1, \ldots, V_t$ is a partition of {1, . . . , n}. We add
symbols V1, . . . , Vt to the alphabet. So now our alphabet depends on the two parameters
h and t. For every $i \in \{0, \ldots, n - 1\}$ and $1 \le j \le t$ we let part(i) = Vj if $X_i \in V_j$. Then
we let

$\mu_h(\gamma, V_1, \ldots, V_t)$ = $\mu_h(\gamma)$ <asn>
    <val> $\mu_h(0)$ part(0) </val>
    . . .
    <val> $\mu_h(n - 1)$ part(n − 1) </val>
</asn>.

Even in the case t = 1 it will be useful to work with the encoding $\mu_h(\gamma, \{0, \ldots, n - 1\})$
instead of just $\mu_h(\gamma)$, because the word $\mu_h(\gamma, \{0, \ldots, n - 1\})$ already provides
the infrastructure for an assignment. For brevity, we write $\mu_h(\gamma, \ast)$ instead of
$\mu_h(\gamma, \{0, \ldots, n - 1\})$.
5. Satisfiability testing through model-checking
In this section, we prove Theorem 1.

5.1. Monadic second-order logic

Theorem 11. Assume that PTIME ≠ NP. Let $h \in \mathbb{N}$ and p a polynomial. Then there is no
algorithm for MC(MSO, W) whose running time is bounded by
$$T(h, k) \cdot p(n).$$
As usual, k denotes the size of the input sentence and n the size of the input word.

Proof. Suppose that there is an algorithm A for MC(MSO, W) whose running time is
bounded by
$$T(h, k) \cdot p(n),$$
for some $h \in \mathbb{N}$ and polynomial p.
We shall prove that the satisfiability problem for 3-CNF-formulas is in polynomial time,
which, by contradiction, proves the theorem. For all $\ell \in \mathbb{N}$, let
$$\tilde{\varphi}_{h+1,\ell} = \exists X \big( \forall x (Xx \to P_{\texttt{V1}} x) \wedge \varphi'_{h+1,\ell} \big),$$
where $\varphi'_{h+1,\ell}$ is the formula obtained from the formula $\varphi_{h+1,\ell}$ of Lemma 10 by replacing
the subformula $P_{\texttt{true}} Sy'$ by $X Sy'$. Recall that $P_{\texttt{true}} Sy'$ is the only subformula of $\varphi_{h+1,\ell}$
that involves either $P_{\texttt{true}}$ or $P_{\texttt{false}}$. The subformula $\forall x (Xx \to P_{\texttt{V1}} x)$ says that X only
contains elements that are at a position with symbol V1, which may simply be viewed as a
placeholder for true or false in an assignment. The intended meaning of X is to indicate
all variables set to TRUE. It is easy to see that for every $n' \le T(h+1, \ell)$ and $\gamma \in$ 3-CNF$(n')$
we have
$$\mu_{h+1}(\gamma, \ast) \models \tilde{\varphi}_{h+1,\ell} \iff \gamma \text{ is satisfiable.} \qquad (3)$$


Consider the algorithm displayed in Fig. 1, which decides if the input formula $\gamma$ is
satisfiable. The correctness of the algorithm follows from (3) and
$$n' = T(h + 1, \lg^{(h+1)}(n')) \le T(h + 1, \lceil \lg^{(h+1)}(n') \rceil).$$
For the running time analysis, without loss of generality we can assume that $n' \le \|\gamma\| \le O((n')^3)$,
that is, that $\|\gamma\|$ and $n'$ are polynomially related. We claim that the running time
of the algorithm is bounded by $q(n')$ for some polynomial q depending only on the fixed
constant h.
Lines 1–3 of the algorithm can be implemented in time polynomial in h, $n'$. Recall that
by Lemma 9, $|\mu_{h+1}(\gamma, \ast)|$ is polynomially bounded in terms of h and $n'$. Thus by our
assumption on the algorithm A, Line 4 requires time
$$T(h, \|\tilde{\varphi}_{h+1,\ell}\|) \cdot p(|\mu_{h+1}(\gamma, \ast)|) \le T(h, \|\tilde{\varphi}_{h+1,\ell}\|) \cdot p'(h, n'),$$
for some polynomial $p'$. By Lemma 10 and the definition of $\tilde{\varphi}_{h+1,\ell}$ we have
$\|\tilde{\varphi}_{h+1,\ell}\| \in O(h \cdot \lg h + \ell)$, that is,
$\|\tilde{\varphi}_{h+1,\ell}\| \le c(h \cdot \lg h + \ell) \le c(h \cdot \lg h + \lg^{(h+1)}(n') + 1)$ for some
constant c. Since
$$\lim_{m \to \infty} \frac{\lg \lg m}{\lg m} = 0,$$
there is an $n_0$ (depending on c, h) such that for all $n' \ge n_0$ we have
$$c(h \cdot \lg h + \lg^{(h+1)}(n') + 1) \le \lg^{(h)}(n').$$
Thus for $n' \ge n_0$ we have $T(h, \|\tilde{\varphi}_{h+1,\ell}\|) \le T(h, \lg^{(h)}(n')) = n'$. This proves the
polynomial time bound. □
5.2. First-order logic

We need a few preliminaries from parameterized complexity theory. A parameterized
problem is a set $P \subseteq \Sigma^* \times \mathbb{N}$ for some finite alphabet $\Sigma$. If $(x, k) \in \Sigma^* \times \mathbb{N}$ is an
instance of a parameterized problem, we refer to x as the input and to k as the parameter.
A parameterized problem $P \subseteq \Sigma^* \times \mathbb{N}$ is fixed-parameter tractable if there is a computable
function $f : \mathbb{N} \to \mathbb{N}$, a polynomial p, and an algorithm that, given a pair $(x, k) \in \Sigma^* \times \mathbb{N}$,
decides if $(x, k) \in P$ in time at most $f(k) \cdot p(|x|)$ steps. The class of all fixed-parameter
tractable problems is denoted by FPT.

Fig. 1. (The algorithm referred to in the proof of Theorem 11.)

The alternating weighted satisfiability problem for a class $\Theta$ of propositional formulas
is a parameterized version of the satisfiability problem for quantified Boolean formulas
defined as follows:

AWSAT[$\Theta$]
Input: $\gamma \in \Theta$, $t \in \mathbb{N}$, a partition $V_1 \cup \ldots \cup V_t$ of the variables of $\gamma$.
Parameter: $k, t \in \mathbb{N}$.
Problem: Decide if there exists a size k subset $U_1$ of $V_1$ such that
for all size k subsets $U_2$ of $V_2$ there exists . . . such that the
truth assignment setting all variables in $U_1 \cup \ldots \cup U_t$ to
TRUE and all other variables to FALSE satisfies $\gamma$.
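The quantifier alternation in this definition can be made explicit with a brute-force sketch. It is exponential in k and t and is meant only to pin down the semantics, not as an algorithm from the paper. The representation is an assumption made for illustration: a literal is a pair `(i, positive)`, a CNF-formula is a list of clauses, and the partition is a list of lists of variable indices; the names `satisfies` and `awsat` are chosen here.

```python
from itertools import combinations
from typing import FrozenSet, Sequence, Tuple

Literal = Tuple[int, bool]  # (i, True) stands for X_i, (i, False) for its negation

def satisfies(true_vars: FrozenSet[int], cnf: Sequence[Sequence[Literal]]) -> bool:
    """Check the assignment that sets exactly the variables in true_vars to TRUE."""
    return all(any((i in true_vars) == positive for i, positive in clause)
               for clause in cnf)

def awsat(cnf: Sequence[Sequence[Literal]], parts: Sequence[Sequence[int]],
          k: int, level: int = 0,
          chosen: FrozenSet[int] = frozenset()) -> bool:
    """Alternating weighted satisfiability by exhaustive search.

    parts[0] is V_1, parts[1] is V_2, and so on. The size-k subset of V_1 is
    chosen existentially, that of V_2 universally, alternating from there.
    """
    if level == len(parts):
        return satisfies(chosen, cnf)
    branches = (awsat(cnf, parts, k, level + 1, chosen | frozenset(subset))
                for subset in combinations(parts[level], k))
    return any(branches) if level % 2 == 0 else all(branches)
```

With the partition {X_0, X_1}, {X_2, X_3} and k = 1, the formula X_0 ∧ (X_2 ∨ X_3) is a yes-instance (choose U_1 = {X_0}), while (X_0 ∨ X_2) ∧ (X_1 ∨ X_3) is a no-instance, because every choice of U_1 leaves one clause that the universal player can falsify.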

The parameterized complexity class AW[∗] is defined in terms of the alternating
weighted satisfiability problem for a hierarchy of classes of propositional formulas. All
we need to know here, however, is the following theorem:

Theorem 12 (Downey et al. [6], Flum and Grohe [9]). If AWSAT[3-CNF] is fixed-parameter tractable then
AW[∗] = FPT.

We are now ready to prove our theorem:

Theorem 13. Assume that FPT ≠ AW[∗]. Let $h \in \mathbb{N}$ and p a polynomial. Then there is
no algorithm for MC(FO, W) whose running time is bounded by
$$T(h, k) \cdot p(n).$$
As usual, k denotes the size of the input sentence and n the size of the input word.

To prove this theorem, we will use the following alternative characterization of fixed-parameter
tractability. A parameterized problem $P \subseteq \Sigma^* \times \mathbb{N}$ is eventually in polynomial
time if there is a computable function f and an algorithm, whose running time is
polynomial in |x|, that, given an instance $(x, k) \in \Sigma^* \times \mathbb{N}$ of P with $|x| \ge f(k)$, correctly
decides if $(x, k) \in P$. (The behaviour of the algorithm on instances $(x, k) \in \Sigma^* \times \mathbb{N}$ with
$|x| < f(k)$ is irrelevant.)
Lemma 14 (Flum and Grohe [8]). A parameterized problem is fixed-parameter tractable
if, and only if, it is computable and eventually in polynomial time.

Proof of Theorem 13. Suppose that there is an algorithm A for MC(FO, W) whose
running time is bounded by
$$T(h, k) \cdot p(n),$$
for some $h \in \mathbb{N}$ and polynomial p. We shall prove that AWSAT[3-CNF] is in FPT.
For all $h, \ell, k, t \in \mathbb{N}$, let $\varphi'_{h+1,\ell,k,t}$ be the formula obtained from the formula $\varphi_{h+1,\ell}$ of
Lemma 10 by replacing the (unique) subformula $P_{\texttt{true}} Sy'$ by $\bigvee_{i=1}^{t} \bigvee_{j=1}^{k} Sy' = x_{ij}$, for
new variables $x_{ij}$, $1 \le i \le t$, $1 \le j \le k$. Let
$$\begin{aligned}
\tilde{\varphi}_{h+1,\ell,k,t} = \;& \exists x_{11} \ldots \exists x_{1k} \Big( \bigwedge_{i=1}^{k} P_{\texttt{V1}} x_{1i} \wedge \bigwedge_{i=1}^{k-1} x_{1i} < x_{1(i+1)} \;\wedge \\
& \forall x_{21} \ldots \forall x_{2k} \Big( \Big( \bigwedge_{i=1}^{k} P_{\texttt{V2}} x_{2i} \wedge \bigwedge_{i=1}^{k-1} x_{2i} < x_{2(i+1)} \Big) \to \\
& \qquad \vdots \\
& Q x_{t1} \ldots Q x_{tk} \Big( \Big( \bigwedge_{i=1}^{k} P_{\texttt{Vt}} x_{ti} \wedge \bigwedge_{i=1}^{k-1} x_{ti} < x_{t(i+1)} \Big) \circ \varphi'_{h+1,\ell,k,t} \Big) \cdots \Big) \Big).
\end{aligned}$$
Here Q is $\forall$ if t is even and $\exists$ otherwise. Moreover, $\circ$ represents $\to$ if t is even and $\wedge$ if t
is odd.

Fig. 2. (The algorithm referred to later in this proof.)

Then for every $n \le T(h + 1, \ell)$, $\gamma \in$ 3-CNF$(n)$, $k \in \mathbb{N}$, and for every partition
$V_1, \ldots, V_t$ of {0, . . . , n − 1} we have
$$\mu_{h+1}(\gamma, V_1, \ldots, V_t) \models \tilde{\varphi}_{h+1,\ell,k,t} \iff (\gamma, V_1, \ldots, V_t) \text{ with parameters } (k, t) \text{ is a yes-instance of AWSAT[3-CNF].} \qquad (4)$$
To see this, note that the first line of $\tilde{\varphi}_{h+1,\ell,k,t}$ says "there exists a subset $U_1 = \{x_{11}, \ldots, x_{1k}\}$
of $V_1$ of size k" (the inequalities are used to make sure that the $x_{1j}$ are
distinct). The second line says "for all subsets $U_2 = \{x_{21}, \ldots, x_{2k}\}$ of $V_2$ of size k", etc.
Finally, by Lemma 10, the formula $\varphi'_{h+1,\ell,k,t}$ in the last line of $\tilde{\varphi}_{h+1,\ell,k,t}$ says that $\gamma$ is
satisfied if precisely the variables in $U_1 \cup \ldots \cup U_t$ are set to TRUE.
Consider the algorithm displayed in Fig. 2. The correctness of the algorithm follows
from (4) and
$$n' = T(h + 1, \lg^{(h+1)}(n')) \le T(h + 1, \lceil \lg^{(h+1)}(n') \rceil).$$
For the running time analysis, without loss of generality we assume that $n' \le \|\gamma\| \le O((n')^2)$.
We claim that if $n'$ is sufficiently large, then the running time of the algorithm
is bounded by $q(n')$ for some polynomial q. More precisely, we claim that there is a
polynomial q and an $n_0 \in \mathbb{N}$, which is computable from h, $k'$, t, such that for $n' \ge n_0$ the
running time of the algorithm is bounded by $q(n')$. Since h is fixed and since AWSAT[3-CNF]
is computable, by Lemma 14 this implies that AWSAT[3-CNF] is in FPT.
Lines 1–3 of the algorithm can be implemented in time polynomial in h, $n'$. By our
assumption on the algorithm A, Line 4 requires time
$$T(h, \|\tilde{\varphi}_{h+1,\ell,k',t}\|) \cdot p(n) = T(h, \|\tilde{\varphi}_{h+1,\ell,k',t}\|) \cdot p'(h, n'),$$
for some polynomial $p'$, because $n = |\mu_{h+1}(\gamma, V_1, \ldots, V_t)|$ is polynomially bounded in
terms of $n'$ and h. Since we only replace one subformula $P_{\texttt{true}} Sy'$ by the disjunction
$\bigvee_{i=1}^{t} \bigvee_{j=1}^{k'} Sy' = x_{ij}$, we have
$$\|\tilde{\varphi}_{h+1,\ell,k',t}\| \le p''(h, k', t) + O(\ell)$$
for a suitable polynomial $p''$. Using a similar argument as in the proof of Theorem 11, we
can now derive that there is a computable $n_0$ depending on h, $k'$, t such that for all $n' \ge n_0$
we have
$$T(h, \|\tilde{\varphi}_{h+1,\ell,k',t}\|) \le T(h, \lg^{(h)}(n')) = n'.$$
This proves our claim that if $n'$ is sufficiently large, then the running time of the algorithm
is bounded by $q(n')$ for some polynomial q and thus the theorem. □


Remark 15. For readers familiar with least fixed-point logic, let us point out that with the
same techniques it can be proved that there is no model-checking algorithm for monadic
least fixed-point logic on words whose running time is bounded by $T(h, k) \cdot p(n)$, for any
$h \in \mathbb{N}$ and polynomial p, under the weaker assumption that AW[P] ≠ FPT.
AW[P] is a parameterized complexity class that contains AW[∗]. A complete problem
for AW[P] is the alternating weighted satisfiability problem for arbitrary Boolean circuits
(as opposed to bounded depth circuits for AW[∗]).
6. First-order model-checking on structures of bounded degree

In this and the next section, we investigate the parameterized complexity of first-order
model-checking over structures of bounded degree. Let $\mathcal{A}$ be a $\tau$-structure for some
vocabulary $\tau$. We call two elements $a, b \in A$ adjacent if they are distinct and there is
an $R \in \tau$, say, r-ary, and a tuple $a_1 \ldots a_r \in R^{\mathcal{A}}$ such that $a, b \in \{a_1, \ldots, a_r\}$. The degree
of an element $a \in A$ in the structure $\mathcal{A}$ is the number of elements adjacent to a, and the
degree of $\mathcal{A}$ is the maximum degree of its elements. For $d \ge 1$, we denote the class of all
structures of degree at most d by D(d).
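The definition of degree applies to relations of any arity, not just to graphs. The following minimal sketch computes it, assuming that a relational structure is given as a universe together with a dictionary mapping each relation symbol to its set of tuples; this representation and the names `adjacency` and `degree` are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Relations = Dict[str, Set[Tuple[Hashable, ...]]]

def adjacency(universe: Iterable[Hashable],
              relations: Relations) -> Dict[Hashable, Set[Hashable]]:
    """Map every element to the set of elements adjacent to it.

    Two distinct elements are adjacent if they occur together in some tuple
    of some relation of the structure.
    """
    neighbours: Dict[Hashable, Set[Hashable]] = {a: set() for a in universe}
    for tuples in relations.values():
        for tup in tuples:
            for a in tup:
                for b in tup:
                    if a != b:
                        neighbours[a].add(b)
    return neighbours

def degree(universe: Iterable[Hashable], relations: Relations) -> int:
    """Degree of the structure: the maximum number of neighbours of an element."""
    neighbours = adjacency(universe, relations)
    return max((len(adjacent) for adjacent in neighbours.values()), default=0)
```

A word given only by its successor relation and letter predicates has degree at most 2, since every position is adjacent only to its predecessor and its successor; adding the linear order as a relation makes every two positions adjacent, which is why the bounded-degree results concern words without order.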

Theorem 16 (Seese [16]). Let $d \ge 1$. Then there is a function $f : \mathbb{N} \to \mathbb{N}$ and an
algorithm solving MC(FO, D(d)) in time $f(k, d) \cdot n$, where, as usual, k denotes the size of
the input sentence and n the size of the input structure.

It is quite easy to derive from Seese's proof a triply exponential upper bound on f for a
non-uniform version of this theorem, stating that for every fixed first-order sentence $\varphi$ there
is a triply exponential function f and an algorithm checking whether a given structure $\mathcal{A}$
of degree at most d satisfies $\varphi$. We shall prove a uniform version of this result, which has
the additional benefit that our algorithm is quite simple.
The crucial idea, which has also been explored by Seese, is to use the locality of first-order
logic. Without loss of generality we assume that vocabularies only contain relation
and constant symbols. (Functions can easily be simulated by relations.) We need some
additional notation. A path of length l is a sequence of vertices $a_0, \ldots, a_l \in A$ such that
$a_{i-1}, a_i$, i = 1, . . . , l, are adjacent in $\mathcal{A}$. The distance between two elements $a, b \in A$ of
the universe is 0 if a = b, and r if the shortest path between a and b has length r. Let
$r \ge 1$ and $a \in A$. The r-neighbourhood of a in $\mathcal{A}$, denoted by $N_r^{\mathcal{A}}(a)$, is the set of $b \in A$
such that a, b have distance at most r. Let $\mathcal{N}_r^{\mathcal{A}}(a)$ denote the substructure induced by $\mathcal{A}$
on $N_r^{\mathcal{A}}(a)$. For elements a, b of a structure $\mathcal{A}$ we write $a \cong_r^{\mathcal{A}} b$ if there is an isomorphism
from $\mathcal{N}_r^{\mathcal{A}}(a)$ to $\mathcal{N}_r^{\mathcal{A}}(b)$ that maps a to b.
Recall that $\mathrm{qr}(\varphi)$ denotes the quantifier-rank of a formula $\varphi$.
Lemma 17 ([11,13]). For every first-order formula $\varphi(x)$ there is an $r \ge 1$ such that for
every structure $\mathcal{A}$ and $a, b \in A$ we have
$$a \cong_r^{\mathcal{A}} b \implies \big( \mathcal{A} \models \varphi(a) \iff \mathcal{A} \models \varphi(b) \big).$$
Furthermore, r can be chosen to be $2^{\mathrm{qr}(\varphi)}$.

Fig. 3.

Fig. 3 displays a recursive model-checking algorithm for first-order sentences in prenex
normal form that is based on Lemma 17. Since we can easily transform arbitrary first-order
sentences into sentences in prenex normal form (algorithmically, this can be done in linear
time), this also gives us an algorithm for arbitrary sentences.

and MC(FO, D(d)) for $d \ge 3$ in time
$$2^{2^{\lg d \cdot 2^{O(k)}}} \cdot n,$$
where as usual k denotes the size of the input sentence and n the size of the input
structure.
Proof. We denote the running time of model-check($\varphi$, $\mathcal{A}$) by R(n, p, q), where $n = \|\mathcal{A}\|$,
$q = \mathrm{qr}(\varphi)$, and p is the size of the quantifier-free part of $\varphi$. Note that $p + q \le k$ (= $\|\varphi\|$).
Let $r = r(q) = 2^q$,
s(q) =

Note that in the recursive calls model-check((a ), (A, a )) of the algorithm, we
replace all occurrences of x in by a new constant symbol which is interpreted by the
element a A and check if this new sentence holds in the expanded structure (A, a ). The
correctness of the algorithm follows from an easy induction on the structure of the input
formula applying Lemma 17 in each step. Note that this algorithm works for arbitrary
input structures A.
Theorem 18. The algorithm model-check (displayed in Fig. 3) decides MC(FO, D(2))
in time
22

53

O(k)

n,

max NrA (a ),

a A,AC

the maximal size of an r -neighbourhood, and let t (q) denote the number of equivalence
classes of
=rA . Note that there exist upper bounds for s(q) and t (q) only depending on the
degree of the input structure (and not on n or ). Remember that the degree is constant for
the classes under consideration.
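The notion of an $r$-neighbourhood used in the proof can be sketched as a breadth-first search that stops at distance $r$. The adjacency-dictionary representation and the function name are assumptions made for this illustration only; the paper's own algorithm is the one in Fig. 3.

```python
from collections import deque

def neighbourhood(adj, a, r):
    """Elements at distance at most r from a.

    `adj` maps every element to the set of elements adjacent to it
    (an illustrative representation of the adjacency relation).
    """
    dist = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand beyond radius r
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

# A path on 10 vertices has degree 2, so every r-neighbourhood has
# at most 2r + 1 elements.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 10} for i in range(10)}
print(sorted(neighbourhood(path, 5, 2)))
```

In a structure of bounded degree the search touches a number of elements that depends only on the degree and on $r$, which is what makes $s(q)$ independent of $n$.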
Now consider the algorithm displayed in Fig. 3. Line 1 only requires constant time. If
Line 2 is executed, it requires time $O(p \cdot n)$, and the algorithm stops. Otherwise, it proceeds
to Line 3, which can be executed in constant time. To execute Line 4, we maintain a list
of pairs $(\mathcal{N}_r^{\mathcal{A}}(a), a)$ such that no induced substructure $(\mathcal{N}_r^{\mathcal{A}}(a), a)$ occurs twice. The size
of this list never exceeds $t(q)$; hence for each $a$ in turn, we simply compute the induced
substructure and check whether it is already in the list. This requires time $O(n \cdot f(s(q)) \cdot t(q))$,
if we denote the time to check isomorphism of structures of size $m$ by $f(m)$. The loop in
Lines 5-9 requires time
$$O(t(q)) + t(q) \cdot R(n, p, q - 1).$$

Putting everything together, we obtain the following recurrence for $R$:
$$R(n, p, 0) \le c_1 \cdot p \cdot n,$$
$$R(n, p, q) \le c_2 \cdot n \cdot f(s(q)) \cdot t(q) + t(q) \cdot R(n, p, q - 1) \quad (\text{for } q \ge 1),$$
for suitable constants $c_1, c_2$. To solve this recurrence, we use the following simple lemma:

Lemma 19. Let $F, g, h : \mathbb{N} \to \mathbb{N}$ such that
$$F(0) \le g(0),$$
$$F(m) \le g(m) + h(m) \cdot F(m - 1)$$
for all $m \in \mathbb{N}$, $m \ge 1$. Then
$$F(m) \le \sum_{i=0}^{m} g(i) \cdot \prod_{j=i+1}^{m} h(j)$$
for all $m \in \mathbb{N}$.

The lemma can be proved by a straightforward induction on $m$.
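As a sanity check on the lemma, the following sketch unrolls the recurrence in the extremal case where both inequalities hold with equality, and compares the result with the closed sum-of-products form. The function names and the sample functions `g` and `h` are arbitrary choices for illustration.

```python
from math import prod

def closed_form(g, h, m):
    """Right-hand side of the lemma: sum over i of g(i) * prod_{j=i+1..m} h(j)."""
    return sum(g(i) * prod(h(j) for j in range(i + 1, m + 1))
               for i in range(m + 1))

def extremal_F(g, h, m):
    """Largest F allowed by the recurrence: equality in both hypotheses."""
    value = g(0)
    for i in range(1, m + 1):
        value = g(i) + h(i) * value
    return value

g = lambda i: i + 1   # arbitrary sample functions
h = lambda j: j + 2

for m in range(6):
    print(m, extremal_F(g, h, m), closed_form(g, h, m))
```

Since any $F$ satisfying the hypotheses is bounded by the extremal one, agreement of the two columns illustrates that the bound in the lemma is tight.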


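Applied to our function $R$, the lemma yields
$$R(n, p, q) \le c_1 \cdot p \cdot n \cdot \prod_{j=1}^{q} t(j) + \sum_{i=1}^{q} c_2 \cdot n \cdot f(s(i)) \cdot t(i) \cdot \prod_{j=i+1}^{q} t(j)$$
$$\le \prod_{j=1}^{q} t(j) \cdot \Big( c_1 \cdot p \cdot n + \sum_{i=1}^{q} c_2 \cdot n \cdot f(s(i)) \Big).$$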

Degree 2: The size of an $r$-neighbourhood in a structure $\mathcal{A} \in$ D(2) is at most $2r + 1$.
Thus
$$s(q) \le 2^{O(q)} \le 2^{O(k)}.$$

To give an upper bound on $t(q)$, we have to take into account the number $u$ of symbols in
the vocabulary. Since we only have to consider symbols that actually appear in $\varphi$, we can
assume that $u \le k$. Moreover, without loss of generality we can assume that the vocabulary
only contains unary and binary relation symbols (because we are considering structures of
degree 2).
Let us count the number of isomorphism types of an $m$-vertex structure $\mathcal{B}$ of degree 2
whose vocabulary contains $u_1$ unary relation symbols and $u_2$ binary relation symbols.
The unary relations can take at most $2^{u_1 m}$ different values. There are at most $m$ pairs
of elements which can be connected by a binary relation, thus the binary relations can take
at most $2^{u_2 m}$ different values. Thus the overall number of isomorphism types is bounded
by $2^{(u_1 + u_2) m}$.
Our $r$-neighbourhoods have size at most $2r + 1$, so we obtain
$$t(q) \le 2^{O(kr)} = 2^{O(k 2^{q})}.$$
Thus
$$\prod_{j=1}^{q} t(j) \le \prod_{j=1}^{q} 2^{O(k 2^{j})} \le 2^{O(k \sum_{j=1}^{q} 2^{j})} \le 2^{2^{O(k)}}.$$

Since isomorphism of structures of degree 2 can be decided in polynomial time, we
obtain
$$c_1 \cdot p \cdot n + \sum_{i=1}^{q} c_2 \cdot n \cdot f(s(i)) \cdot t(i) \le O\big(2^{2^{O(k)}} \cdot n\big)$$
and thus
$$R(n, p, q) \le 2^{2^{O(k)}} \cdot n.$$

Degree at least 3: The calculations are similar in this case, the only important difference
being that an $r$-neighbourhood may be of size $\Omega(d^{r})$ and thus doubly exponential in $q$,
which yields a triply exponential bound for $R$. $\Box$
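The difference between the two cases can be illustrated numerically. The sketch below computes a standard upper bound on the number of elements within distance $r$ of a point in a structure of degree at most $d$ (one centre, at most $d$ neighbours, and at most $d - 1$ new elements per element on each further layer); the function name is an assumption made for illustration.

```python
def max_ball_size(d, r):
    """Upper bound on the size of an r-neighbourhood in degree at most d (d >= 1)."""
    size, frontier = 1, d
    for _ in range(r):
        size += frontier       # elements at the next distance layer
        frontier *= d - 1      # each of them has at most d - 1 new neighbours
    return size

# With r = 2^q, degree 2 gives 2r + 1 (singly exponential in q), while
# degree 3 grows like d^r (doubly exponential in q).
for q in range(1, 4):
    r = 2 ** q
    print(q, max_ball_size(2, r), max_ball_size(3, r))
```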


7. Lower bounds for first-order model-checking on structures of bounded degree

In this section we prove lower bounds for first-order model-checking on two
particularly simple classes of structures of degree two and three, respectively: the class
of words without order and the class of ordered binary trees.
7.1. Words without order

Formally, a word without order over an alphabet $\Sigma$ is a reduct of a word over $\Sigma$ to
the vocabulary $\tau_S(\Sigma) = \tau(\Sigma) \setminus \{<\}$. We denote the class of all words without order by S.
Since we will only consider words without order in the following, for simplicity we often
just refer to them as words.
In this section we will only work with the encoding $\mu_1$ (recall the definition from
Section 3), but we need a refined version of Lemma 8 for $h = 1$:

Lemma 20. Let $\ell \ge 1$ and let $\Sigma \supseteq \Sigma_1$. There is a first-order formula $\chi_\ell(x, y)$ of
vocabulary $\tau_S(\Sigma_1)$ and size $O(\ell)$ such that for all words without order $W$, $a, b \in W$,
and $m, n \in \{0, \ldots, 2^{2^{\ell}}\}$ the following holds:
If $a$ is the first position of a subword $U \subseteq W$ with $U \cong \mu_1(m)$ and $b$ is the first position
of a subword $V \subseteq W$ with $V \cong \mu_1(n)$, then
$$W \models \chi_\ell(a, b) \iff m = n.$$
Furthermore, the formula $\chi_\ell$ can be computed from $\ell$ in time $O(\ell)$.

Note that Lemma 8 only provides a formula $\chi_{1,\ell}(x, y)$ that works for $m, n \le 2^{\ell}$.
Before we prove the lemma, we define a few basic formulas and notations that we need
in dealing with words without order. Let $\varphi(x, y)$ be a formula. For a structure $\mathcal{A}$, elements
$a, b \in A$, and $\ell \ge 0$, a $\varphi$-path of length $\ell$ from $a$ to $b$ is a sequence $a_0, a_1, \ldots, a_\ell$ of
elements of $A$ such that $a_0 = a$, $a_\ell = b$, and $\mathcal{A} \models \varphi(a_i, a_{i+1})$ for $0 \le i < \ell$. We let
$b -_\varphi a$ be the minimum length of a $\varphi$-path from $a$ to $b$ if there is such a path. If there is
no $\varphi$-path from $a$ to $b$, we let $b -_\varphi a = \infty$.

Lemma 21. Let $\ell \ge 1$ and $\varphi(x, y)$ a first-order formula.

(1) There exists a first-order formula $\delta_\ell(x_1, x_2)$ of size $O(\ell)$ such that for every structure
$\mathcal{A}$ and all $a_1, a_2 \in A$,
$$\mathcal{A} \models \delta_\ell(a_1, a_2) \iff a_2 -_\varphi a_1 \le 2^{\ell}.$$

(2) There exists a first-order formula $\psi_\ell(x_1, x_2, y_1, y_2)$ of size $O(\ell)$ such that for every
structure $\mathcal{A}$ and all elements $a_1, a_2, b_1, b_2 \in A$,
$$\mathcal{A} \models \psi_\ell(a_1, a_2, b_1, b_2) \iff a_2 -_\varphi a_1 \le 2^{\ell} \ \wedge\ a_2 -_\varphi a_1 = b_2 -_\varphi b_1.$$

Proof. We only prove (2); the proof of (1) is similar, but simpler. We let
$$\psi_0(x_1, x_2, y_1, y_2) = (x_1 = x_2 \wedge y_1 = y_2) \vee \big(x_1 \neq x_2 \wedge y_1 \neq y_2 \wedge \varphi(x_1, x_2) \wedge \varphi(y_1, y_2)\big),$$
and for $\ell \ge 1$
$$\psi_\ell(x_1, x_2, y_1, y_2) = \psi_0(x_1, x_2, y_1, y_2) \vee \exists x_3 \exists y_3 \forall x \forall x' \forall y \forall y' \Big( \big( (x = x_1 \wedge x' = x_3 \wedge y = y_1 \wedge y' = y_3) \vee (x = x_3 \wedge x' = x_2 \wedge y = y_3 \wedge y' = y_2) \big) \to \psi_{\ell-1}(x, x', y, y') \Big). \quad \Box$$

Proof of Lemma 20. We let $\varphi(x, y) = (\neg P_{</1>} x \wedge Sx = y) \vee (P_{</1>} x \wedge x = y)$ and
$$\chi_\ell(x, y) = \forall x' \forall y' \Big( \psi_\ell(x, x', y, y') \to \big( (P_0 x' \leftrightarrow P_0 y') \wedge (P_1 x' \leftrightarrow P_1 y') \big) \Big),$$
where $\psi_\ell$ is taken from Lemma 21(2). $\Box$

Fig. 4. [Figure: the algorithm deciding AWSAT[3-CNF].]

Recall that 3-CNF($n$) denotes the set of all formulas in 3-conjunctive normal form
whose variables are among $X_0, \ldots, X_{n-1}$ and that A($n$) denotes the set of all truth-value
assignments to these variables. Recall further the encodings of propositional formulas
introduced in Section 4.

Lemma 22. For all $\ell \in \mathbb{N}$ there is a first-order sentence $\varphi_\ell$ of size $O(\ell)$ such that for all
$n \le 2^{2^{\ell}}$ and $(\gamma, \alpha) \in$ 3-CNF($n$) $\times$ A($n$) we have
$$\mu_1(\gamma, \alpha) \models \varphi_\ell \iff \alpha \models \gamma.$$
Furthermore, $\varphi_\ell$ can be computed in time $O(\ell)$.

Proof. Recall the proof of Lemma 10. Instead of the formula $\chi_h$, we now use $\chi_\ell$ of
Lemma 20. We have to eliminate all occurrences of the order symbol $<$, which is used
in the formulas for the last position and for the clauses in that proof.
Observe that the length of an encoding $\mu_1(n)$ for an $n \le 2^{2^{\ell}}$ is in $O(2^{\ell})$. We have seen
above that we can describe subwords of length up to $2^{\ell}$ by formulas of length $O(\ell)$ that
only use the successor relation. Therefore, we replace the formula for the last position by a
formula of length $O(\ell)$ that only involves the successor relation.
Moreover, since we are only considering 3-CNF($n$) formulas for $n \le 2^{2^{\ell}}$, subwords
describing clauses have length $O(2^{\ell})$. Thus again we can replace the subformulas involving
the order symbol by suitable formulas of length $O(\ell)$. $\Box$

Note that the previous proof does not work for arbitrary CNF-formulas; it is crucial that
the clauses have bounded length.
We are now ready to prove the main result of this section (which is Theorem 2(1)):

Theorem 23. Assume that FPT $\neq$ AW[$*$], and let $p$ be a polynomial. Then there is no
algorithm for MC(FO, S) whose running time is in
$$2^{2^{o(k)}} \cdot p(n),$$
where $k$ denotes the size of the input sentence and $n$ the size of the input word.

Proof. Essentially, we proceed as for words with order. Suppose that there is an algorithm
A for the problem MC(FO, S) whose running time is bounded by
$$2^{2^{f(k)}} \cdot p(n),$$
for some polynomial $p$ and a function $f(k) \in o(k)$. We shall prove that AWSAT[3-CNF]
is in FPT.
For all $\ell, k, t \in \mathbb{N}$, let
$$\varphi_{\ell,k,t} = \exists x_{11} \ldots \exists x_{1k} \Big( \bigwedge_{i=1}^{k} P_{V_1} x_{1i} \wedge \bigwedge_{i=1}^{k-1} x_{1i} < x_{1(i+1)} \wedge \forall x_{21} \ldots \forall x_{2k} \Big( \Big( \bigwedge_{i=1}^{k} P_{V_2} x_{2i} \wedge \bigwedge_{i=1}^{k-1} x_{2i} < x_{2(i+1)} \Big) \to \cdots\ Q x_{t1} \ldots Q x_{tk} \Big( \bigwedge_{i=1}^{k} P_{V_t} x_{ti} \wedge \bigwedge_{i=1}^{k-1} x_{ti} < x_{t(i+1)} \ \ast\ \varphi'_{\ell,k,t} \Big) \cdots \Big) \Big),$$
where the quantifier blocks alternate ($Q = \exists$ and $\ast = \wedge$ if $t$ is odd, $Q = \forall$ and $\ast = \to$ if $t$ is even), and
where $\varphi'_{\ell,k,t}$ is the formula obtained from the formula $\varphi_\ell$ of Lemma 22 by replacing
the (unique) subformula $P_{\mathrm{true}} Sy$ by $\bigvee_{i=1}^{t} \bigvee_{j=1}^{k} Sy = x_{ij}$. Then for every $n \le 2^{2^{\ell}}$,
$\gamma \in$ 3-CNF($n$), $k \in \mathbb{N}$, and for every partition $V_1, \ldots, V_t$ of $\{0, \ldots, n - 1\}$ we have
$$\mu_1(\gamma, V_1, \ldots, V_t) \models \varphi_{\ell,k,t} \iff (\gamma, V_1, \ldots, V_t) \text{ with parameters } (k, t) \text{ is a yes-instance of AWSAT[3-CNF].} \quad (5)$$

The algorithm deciding AWSAT[3-CNF] is displayed as Fig. 4.
The correctness of this algorithm follows from (5). For the analysis, without loss of
generality we assume that $n' \le \|\gamma\| \le O((n')^{2})$. We claim that if $n'$ is sufficiently large,
then the running time of the algorithm is bounded by $q(n')$ for some polynomial $q$. Then
Lemma 14 implies that AWSAT[3-CNF] is in FPT.
Lines 1-3 of the algorithm can be done in time polynomial in $n'$. The crucial part is
Line 4. By the assumption on algorithm A this line requires time
$$2^{2^{f(\|\varphi_{\ell,k,t}\|)}} \cdot p(n),$$
where $n = |\mu_1(\gamma, V_1, \ldots, V_t)|$ is polynomial in $n'$. It follows from Lemma 22 that
$$\|\varphi_{\ell,k,t}\| \le p'(k, t) + c \cdot \ell$$
for some polynomial $p'$ and constant $c$. Hence for sufficiently large $n'$ we have
$\|\varphi_{\ell,k,t}\| \le c' \lg \lg n'$, say, for $c' = 2c$. Since $f(k) \in o(k)$, there is an $n_0$ such that for all $n' \ge n_0$ we
have $f(c' \lg \lg n') \le \lg \lg n'$ and thus
$$2^{2^{f(\|\varphi_{\ell,k,t}\|)}} \le 2^{2^{f(c' \lg \lg n')}} \le 2^{2^{\lg \lg n'}} = n'.$$
This gives us the desired upper bound on the running time of our algorithm. $\Box$

7.2. Ordered binary trees

We view ordered binary trees as $\{S_0, S_1\}$-structures $\mathcal{T}$, with $S_0^{\mathcal{T}}$ and $S_1^{\mathcal{T}}$ being the left
child and right child relations. We allow nodes to only have one child. For a finite alphabet
$\Sigma$, we let $\tau_B(\Sigma) = \{S_0, S_1\} \cup \{P_s \mid s \in \Sigma\}$, where $P_s$, for $s \in \Sigma$, is a unary relation
symbol. An ordered binary tree over $\Sigma$ is a $\tau_B(\Sigma)$-structure whose $\{S_0, S_1\}$-reduct is an ordered
binary tree in which each vertex is contained in precisely one $P_s^{\mathcal{T}}$, for $s \in \Sigma$. We denote
the class of all ordered binary trees over some finite alphabet by B. For a node $a$ of a tree
$\mathcal{T} \in$ B and $d \ge 1$, the depth $d$ subtree below $a$ is the subtree of $\mathcal{T}$ whose nodes are all
descendants of $a$ of distance at most $d$ from $a$.
To proceed as in the word cases, we will encode natural numbers by trees and provide
short formulas allowing to compare large encoded numbers. For $\ell \in \mathbb{N}$, let $\mathcal{T}_\ell$ be the
ordered binary tree with vertex set $\{0, \ldots, \ell\}$ and root 0 in which the children of $i$ are $2i + 1$
and $2i + 2$. Recall that $L(n)$ denotes the length of the binary encoding of $n \in \mathbb{N}$. We let
$\beta(n)$ be the ordered binary tree over $\{0, 1\}$ whose underlying tree is $\mathcal{T}_{L(n)}$ and in which, for
$i = 0, 1$,
$$P_i^{\beta(n)} = \{ j \le L(n) \mid \mathrm{bit}(j, n) = i \}.$$

Example 24. Fig. 5 shows the encoding of 38, the binary representation of which is
100110.

Fig. 5. The tree $\beta(38)$.

The next lemma corresponds to Lemmas 8 and 20.

Lemma 25. Let $\ell \ge 1$. There is a formula $\chi_\ell(x, y)$ of vocabulary $\tau_B(\{0, 1\})$ of size $O(\ell)$
such that for all ordered binary trees $\mathcal{T} \in$ B, $a, b \in T$ and $m, n \in \{0, \ldots, 2^{2^{2^{\ell}}}\}$ the
following holds:
If the depth $2^{\ell}$ subtree below $a$ is isomorphic to $\beta(n)$ and the depth $2^{\ell}$ subtree below $b$
is isomorphic to $\beta(m)$ then
$$\mathcal{T} \models \chi_\ell(a, b) \iff m = n.$$
Furthermore $\chi_\ell(x, y)$ can be computed in time $O(\ell)$.

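Proof. We construct a formula $\chi_\ell(x, y)$ characterizing depth $2^{\ell}$ subtrees up to
isomorphism. This formula identifies binary encodings of length up to $2^{2^{\ell}}$, which proves
the claim. We proceed as in the proof of Lemma 21. First, we say that to go from vertices
$x_1$ to $x_2$ and from $y_1$ to $y_2$ we must follow the same sequence of $S_0$, $S_1$-successors. Let
$$\psi_0(x_1, x_2, y_1, y_2) = (S_0 x_1 x_2 \wedge S_0 y_1 y_2) \vee (S_1 x_1 x_2 \wedge S_1 y_1 y_2) \vee (x_1 = x_2 \wedge y_1 = y_2),$$
and for $\ell \ge 1$
$$\psi_\ell(x_1, x_2, y_1, y_2) = \exists x_3 \exists y_3 \forall x \forall x' \forall y \forall y' \Big( \big( (x_1 = x \wedge x_3 = x' \wedge y_1 = y \wedge y_3 = y') \vee (x_3 = x \wedge x_2 = x' \wedge y_3 = y \wedge y_2 = y') \big) \to \psi_{\ell-1}(x, x', y, y') \Big).$$
Using this formula we let
$$\chi_\ell(x, y) = \forall x' \forall y' \Big( \psi_\ell(x, x', y, y') \to \big( (P_1 x' \leftrightarrow P_1 y') \wedge (P_0 x' \leftrightarrow P_0 y') \big) \Big),$$
which is the sought formula. $\Box$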

Now we proceed as before and encode formulas of 3-CNF($n$) for some $n$ as an ordered
binary tree over some alphabet $\Sigma$. For $\gamma \in$ 3-CNF let $\beta(\gamma)$ be the binary tree $\mathcal{T}$ constructed
as follows: let $W$ be the word without order $\mu_1(\gamma)$, and consider $W$ as a tree of $S_1$-successors
without any $S_0$-successors. To get $\mathcal{T}$ we substitute each subword $U$ of $W$ of
the form $\mu_1(m)$ by a single vertex $v$ such that $v$'s $S_0$-successor is the root of a copy of
$\beta(m)$, while its $S_1$-successor is the first position after $U$ in $W$. The vertex $v$ itself carries the new
symbol var.
We extend the definition of $\beta$ to pairs $(\gamma, \alpha) \in$ 3-CNF($n$) $\times$ A($n$) and tuples
$(\gamma, V_1, \ldots, V_t)$ by applying the same substitution process. This encoding gives us the
following lemma, whose proof is omitted since it resembles the proof of Lemma 10 using
the newly introduced encoding together with the decoding formulas $\chi_\ell(x, y)$.

Lemma 26. For all $\ell \in \mathbb{N}$ there is a first-order sentence $\varphi_\ell$ of size $O(\ell)$ such that for all
$n \le 2^{2^{2^{\ell}}}$ and $(\gamma, \alpha) \in$ 3-CNF($n$) $\times$ A($n$) we have
$$\beta(\gamma, \alpha) \models \varphi_\ell \iff \alpha \models \gamma.$$
Furthermore, $\varphi_\ell$ can be computed in time $O(\ell)$.

Now we are ready to state the second main result of this section, which is Theorem 2(2).
We omit the proof, which is analogous to the proof of Theorem 23.

Theorem 27. Assume that FPT $\neq$ AW[$*$], and let $p$ be a polynomial. Then there is no
algorithm for MC(FO, B) whose running time is in
$$2^{2^{2^{o(k)}}} \cdot p(n),$$
where $k$ denotes the size of the input sentence and $n$ the size of the input tree.

Fig. 6. The tree $\nu_3(40\,961)$.

8. Lower bounds for first-order model-checking on trees

In this last section we prove a non-elementary lower bound for first-order model-checking
over unranked trees. We need the same ingredients as before: suitable encodings
of natural numbers and small formulas for comparing two numbers.
For simplicity, we work with directed labelled trees. In Remark 33 we describe how to
get rid of labels and directed edges in order to transfer the results to plain undirected trees.
But for now we view a tree as an $\{E\}$-structure $\mathcal{T}$ with $E^{\mathcal{T}}$ being the child-relation. For
a finite alphabet $\Sigma$ we let $\tau_T(\Sigma) = \{E\} \cup \{P_s \mid s \in \Sigma\}$. Then a tree over $\Sigma$ is a $\tau_T(\Sigma)$-structure
$\mathcal{T}$ whose $\{E\}$-reduct is a tree and in which each vertex is contained in precisely
one $P_s^{\mathcal{T}}$, for $s \in \Sigma$. We denote the class of all trees over some alphabet by T.
Recall that $T(h, 2)$ denotes a tower of 2s of height $h + 1$ and that $\mathrm{bit}(i, n)$ denotes the
$i$th bit in the binary representation of $n$. For every $h \ge 0$ and $n \in \{0, \ldots, T(h, 2) - 1\}$ we
define $\nu_h(n)$ to be the following tree over $\{0, 1, *\}$:
(1) If $h = 0$, we let $\nu_0(0)$ be a single node labelled by 0. Likewise, let $\nu_0(1)$ be a single
node labelled by 1.
(2) If $h \ge 1$, we let $\nu_h(n)$ be the tree formed by taking a new root, labelling it by $*$, and
attaching to it the tree $\nu_{h-1}(i)$ for each $i$ such that $\mathrm{bit}(i, n) = 1$.
Example 28. Fig. 6 shows the $\nu_3$-encoding of $40\,961 = 2^{15} + 2^{13} + 2^{0}$. The tree is
constructed as follows:
To construct $\nu_3(40\,961)$, by clause (2), we take a new root labelled by $*$ and attach three
trees to this root: $\nu_2(0)$, $\nu_2(13)$, $\nu_2(15)$.
The binary representation of 0 consists of 0s only. Thus to construct $\nu_2(0)$, we take a
new root labelled by $*$ and attach no children. This explains the leftmost leaf labelled $*$.
We have $13 = 2^{0} + 2^{2} + 2^{3}$. Thus to construct $\nu_2(13)$, we take a new root labelled by $*$
and attach the three subtrees $\nu_1(0)$, $\nu_1(2)$, and $\nu_1(3)$.
$\nu_1(0)$ is again a tree consisting of just one node labelled $*$. This explains the second leaf
labelled $*$.
We have $2 = 2^{1}$. Thus to construct $\nu_1(2)$, we take a new root labelled by $*$ and attach
one child, namely $\nu_0(1)$.
$\nu_0(1)$ is the 1-node tree labelled 1.
The remaining subtrees are constructed similarly.
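The recursive definition of the tree encoding and the example can be replayed with a short sketch. Trees are represented as pairs `(label, children)`; this representation and the names `tower`, `encode`, `decode` and `size` are assumptions made for illustration. The function `tower(h)` plays the role of $T(h, 2)$ as used in the definition (so `tower(0)` is 2).

```python
def tower(h):
    """T(h, 2) as used above: tower(0) = 2, tower(h + 1) = 2 ** tower(h)."""
    value = 2
    for _ in range(h):
        value = 2 ** value
    return value

def encode(h, n):
    """Tree encoding of n at level h, as a pair (label, list of subtrees)."""
    assert 0 <= n < tower(h)
    if h == 0:
        return (str(n), [])                 # single node labelled 0 or 1
    children = [encode(h - 1, i)            # one subtree per bit of n set to 1
                for i in range(n.bit_length()) if (n >> i) & 1]
    return ("*", children)

def decode(tree, h):
    """Inverse of encode: recover n from its level-h tree."""
    label, children = tree
    if h == 0:
        return int(label)
    return sum(2 ** decode(child, h - 1) for child in children)

def size(tree):
    """Number of nodes of a tree."""
    return 1 + sum(size(child) for child in tree[1])

tree = encode(3, 40961)
print([decode(child, 2) for child in tree[1]])   # the numbers encoded below the root
print(size(tree))
```

The numbers decoded below the root are exactly the positions of the 1-bits of 40 961, matching the three subtrees named in the example.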
Lemma 29. There is an algorithm that, given $h$ and $n \in \{0, \ldots, T(h, 2)\}$, computes $\nu_h(n)$
in time $O(h \cdot \lg^{2} n)$. Furthermore, $|\nu_h(n)| \in O(h \cdot \lg^{2} n)$.

Proof. A simple recursive procedure will do. The running time analysis uses the same
ideas as the proofs of Lemmas 6 and 7. $\Box$

The next lemma corresponds to Lemmas 8 and 20.

Lemma 30. Let $h \ge 1$. There is a first-order formula $\chi_h(x, y)$ of size $O(h)$ such that for
all trees $\mathcal{T}$ over $\Sigma$, $a, b \in T$, and $m, n \in \{0, \ldots, T(h, 2) - 1\}$ the following holds:
If the subtrees of $\mathcal{T}$ rooted at $a$, $b$ are isomorphic to $\nu_h(m)$ and $\nu_h(n)$, respectively, then
$\mathcal{T} \models \chi_h(a, b)$ if, and only if, $m = n$.

Proof. $\chi_0(x, y)$ simply is the formula $P_0 x \leftrightarrow P_0 y$. Let $\chi_h(x, y)$ already be defined.
$\chi_{h+1}(x, y)$ says that for each successor $x_1$ of $x$ there is a successor $y_1$ of $y$ such that
$\chi_h(x_1, y_1)$ and vice versa. As usual, we have to take care to avoid duplication of the
subformula $\chi_h$. We let
$$\chi_{h+1}(x, y) = \forall z_1 \Big( (E x z_1 \vee E y z_1) \to \exists z_2 \big( (E x z_1 \to E y z_2) \wedge (E y z_1 \to E x z_2) \wedge \chi_h(z_1, z_2) \big) \Big),$$
which has the intended meaning and the desired size. $\Box$
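The semantic content of the comparison formula can be checked by a direct recursive procedure: two encoded trees represent the same number exactly when every subtree below one root has an equal counterpart below the other, and vice versa. The sketch below is an illustration under the assumption that trees are pairs `(label, children)`; `encode` and `same_number` are hypothetical helper names, and the procedure mirrors the meaning of the formula rather than its size-efficient syntax.

```python
def encode(h, n):
    """Tree encoding of n at level h as (label, children); see the definition above."""
    if h == 0:
        return (str(n), [])
    return ("*", [encode(h - 1, i)
                  for i in range(n.bit_length()) if (n >> i) & 1])

def same_number(x, y, h):
    """True iff the level-h trees x and y encode the same number."""
    (label_x, kids_x), (label_y, kids_y) = x, y
    if h == 0:
        # base case: both labelled 0, or both labelled 1
        return (label_x == "0") == (label_y == "0")
    # every child of x has a matching child of y, and vice versa
    forth = all(any(same_number(a, b, h - 1) for b in kids_y) for a in kids_x)
    back = all(any(same_number(a, b, h - 1) for a in kids_x) for b in kids_y)
    return forth and back

print(same_number(encode(3, 40961), encode(3, 40961), 3))
print(same_number(encode(3, 40961), encode(3, 40960), 3))
```

The formula in the proof expresses the same back-and-forth condition with a single occurrence of the level-$h$ subformula, which is what keeps its size linear in $h$.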

We encode 3-CNF-formulas as trees over a suitable alphabet in essentially the same
way we did with binary trees in Section 7.2, using the encoding $\nu_h$ instead of $\beta$. Then for
every $h$ we get an encoding $\nu_h$ of formulas in 3-CNF($n$) for $n < T(h, 2)$. We extend the
definition of $\nu_h$ to pairs $(\gamma, \alpha) \in$ 3-CNF($n$) $\times$ A($n$) and to tuples $(\gamma, V_1, \ldots, V_t)$.

Lemma 31. For all $h \in \mathbb{N}$ there is a first-order sentence $\varphi_h$ of size $O(h)$ such that for all
$n < T(h, 2)$ and $(\gamma, \alpha) \in$ 3-CNF($n$) $\times$ A($n$) we have $\nu_h(\gamma, \alpha) \models \varphi_h \iff \alpha \models \gamma$. Furthermore,
$\varphi_h$ can be computed in time $O(h)$.
We omit the simple proof.

Theorem 32. Assume that FPT $\neq$ AW[$*$]. Let $h \in \mathbb{N}$ and $p$ a polynomial. Then there is
no algorithm for MC(FO, T) whose running time is bounded by
$$T(h, k) \cdot p(n),$$
where $k$ denotes the size of the input sentence and $n$ the size of the input tree.
The proof is analogous to our earlier lower bound proofs.

Remark 33. Even though we only stated the lower bound result for labelled binary trees,
it also holds for unlabelled undirected trees, that is, connected acyclic undirected graphs.
To see this, we first note that the alphabet and thus the vocabulary of the formula $\varphi_h$ of
Lemma 31 does not depend on $h$. Suppose the vocabulary of $\varphi_h$ is $\{E, P_1, \ldots, P_p\}$. To get
rid of the directed edges, we replace each directed edge from a vertex $v$ to a vertex $w$ by
the following subgraph:
[Figure: the undirected subgraph replacing a directed edge.]
To get rid of the unary relations, we attach $(i + 2)$ new children to each node in $P_i$ and
delete $P_i$.

9. Conclusions

It is interesting to observe that the complexity-theoretic assumptions we use to prove
our theorems, that is, PTIME $\neq$ NP for the theorem on MSO and FPT $\neq$ AW[$*$] for the
theorems on FO, are precisely the assumptions needed to prove that the model-checking
problem for the respective logic on arbitrary structures is not in FPT. It remains an open
problem to weaken the complexity-theoretic assumptions to PTIME $\neq$ PSPACE. Note
that PTIME $\neq$ PSPACE is a necessary assumption for all our lower bounds, because if
PTIME = PSPACE then model-checking for monadic second-order logic is in PTIME.
There is a significant gap between the lower bounds for model-checking on words
provided by Theorem 1 and the upper bound $T(O(k), 1) \cdot n$ (a tower of 2s of height $O(k)$).
It would be nice to narrow this gap, maybe by proving that there is no $T(o(k), 1) \cdot p(n)$
algorithm for first-order or monadic second-order model-checking on words.

M. Frick, M. Grohe / Annals of Pure and Applied Logic 130 (2004) 331

References

[1] A.V. Aho, J.E. Hopcroft, J.D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley,
1974.
[2] J.R. Büchi, Weak second-order arithmetic and finite automata, Zeitschrift für Mathematische Logik und
Grundlagen der Mathematik 6 (1960) 66-92.
[3] B. Courcelle, Graph rewriting: an algebraic and logic approach, in: J. van Leeuwen (Ed.), Handbook of
Theoretical Computer Science, vol. B, Elsevier Science Publishers, 1990, pp. 194-242.
[4] N.J. Cutland, Computability, Cambridge University Press, 1980.
[5] R.G. Downey, M.R. Fellows, Parameterized Complexity, Springer-Verlag, 1999.
[6] R.G. Downey, M.R. Fellows, K. Regan, Parameterized circuit complexity and the W-hierarchy, Theoretical
Computer Science 191 (1998) 97-115.
[7] H.-D. Ebbinghaus, J. Flum, W. Thomas, Mathematical Logic, 2nd edition, Springer-Verlag, 1994.
[8] J. Flum, M. Grohe, Describing parameterized complexity classes, in: H. Alt, A. Ferreira (Eds.), Proceedings
of the 19th Annual Symposium on Theoretical Aspects of Computer Science, Lecture Notes in Computer
Science, vol. 2285, Springer-Verlag, 2002, pp. 359-371.
[9] J. Flum, M. Grohe, Model-checking problems as a basis for parameterized intractability, Technical Report
23/2003, Fakultät für Mathematik und Physik, Albert-Ludwigs-Universität Freiburg, 2003.
[10] M. Frick, M. Grohe, Deciding first-order properties of locally tree-decomposable structures, Journal of the
ACM 48 (2001) 1184-1206.
[11] W. Hanf, Model-theoretic methods in the study of elementary logic, in: J. Addison, L. Henkin, A. Tarski
(Eds.), The Theory of Models, North Holland, 1965, pp. 132-145.
[12] H. Kamp, Tense Logic and the theory of linear order, Ph.D. Thesis, University of California, Los Angeles,
1968.
[13] L. Libkin, Logics with counting and local properties, ACM Transactions on Computational Logic 1 (2000)
33-59.
[14] O. Lichtenstein, A. Pnueli, Finite state concurrent programs satisfy their linear specification, in: Proceedings
of the Twelfth ACM Symposium on the Principles of Programming Languages, 1985, pp. 97-107.
[15] K. Reinhardt, The complexity of translating logic to finite automata, in: E. Grädel, W. Thomas, T. Wilke
(Eds.), Automata, Logics, and Infinite Games, Lecture Notes in Computer Science, vol. 2500, Springer-Verlag,
2002, pp. 235-242 (Chapter 13).
[16] D. Seese, Linear time computable problems and first-order descriptions, Mathematical Structures in
Computer Science 6 (1996) 505-526.
[17] L.J. Stockmeyer, The Complexity of Decision Problems in Automata Theory, Ph.D. Thesis, Department of
Electrical Engineering, MIT, 1974.
[18] L.J. Stockmeyer, A.R. Meyer, Word problems requiring exponential time, in: Proceedings of the 5th ACM
Symposium on Theory of Computing, 1973, pp. 1-9.
[19] P. van Emde Boas, Machine models and simulations, in: J. van Leeuwen (Ed.), Handbook of Theoretical
Computer Science, vol. 1, Elsevier Science Publishers, 1990, pp. 1-66.
[20] M.Y. Vardi, The complexity of relational query languages, in: Proceedings of the 14th ACM Symposium on
Theory of Computing, 1982, pp. 137-146.

A lemma for cost attained

Greg Hjorth
Department of Mathematics, MSB 6363, UCLA, Los Angeles, CA 90095-1555, USA
Tel.: +1 310 8255626. E-mail address: greg@math.ucla.edu.

Annals of Pure and Applied Logic 143 (2006) 87-102
www.elsevier.com/locate/apal
doi:10.1016/j.apal.2005.05.034

Received 2 January 2005; accepted 30 May 2005
Available online 26 May 2006

Abstract
A treeable ergodic equivalence relation of integer cost is generated by a free action of the free group on the corresponding
number of generators. Every countable treeable ergodic equivalence relation is induced by the free action of some countable group.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Ergodic theory; Treeable equivalence relations; Free groups; Measure equivalence of groups

1. Introduction
Given an equivalence relation $E$ one can consider the graphings of $E$, consisting of partial functions included in
the graph of $E$ whose various compositions enable us to trace out a path between any two equivalent points. Levitt in
[5] defines the cost of a measure preserving equivalence relation to be the infimum among graphings of the sums of
the measures of the domains of the relevant partial functions.
Here we present a result which, in an unpublished form, has been previously cited by [4] to obtain a kind of
dichotomy theorem for amenability and by [6] in an application to von Neumann algebras. The authors of [4] wrote up a
proof of 1.1, though their organization is very different from the one below.
Proposition 1.1. Let $E$ be an ergodic measure preserving equivalence relation on a standard Borel probability space
$(X, \mu)$; assume that every equivalence class is countable. Let $\Phi$ be a graphing for $E$ with $C_\mu(\Phi) \ge n$.
Then there is an alternate graphing $\Phi'$ for $E$ which has no greater cost and contains $n$ morphisms which are total;
that is to say:
(a) $C_\mu(\Phi') \le C_\mu(\Phi)$;
(b) and there are distinct bijections $\varphi_1, \ldots, \varphi_n$ in $\Phi'$ with $\varphi_i : X \to X$.
One obtains additionally that if $C_\mu(\Phi) \le n + 1$ then we may further conclude $\Phi' = \{\varphi_1, \varphi_2, \ldots, \varphi_n, \psi\}$, where
$\psi : A \to B$ is a bijection, some $A, B \subseteq X$.
Recall that a measurable equivalence relation is treeable if there is a measurable way of assigning the structure of
a tree to each equivalence class. In the next corollary one should bear in mind that Damien Gaboriau has shown in [3]
that an $E$ as above with finite cost is treeable if and only if it admits a graphing which actually attains its cost, and in
this case any treeing will in fact realize the infimum.

Corollary 1.2. For E treeable and as above, the cost of E equals n if and only if there is a measure preserving action
of the free group on n generators, Fn , which is free -a.e. and has E as its orbit equivalence relation.
We also show that in the case that E is treeable with infinite cost one may find a free action of F giving rise to E.
Appealing to the connections made in [2] between orbit equivalence in the ergodic setting and measure equivalence,
this implies that a countable non-amenable group is measure equivalent to a non-abelian free group if and only if it has
a free measure preserving action on some standard Borel probability space which gives rise to a treeable equivalence
relation.
This paper finishes with a comment on a deep theorem of Alex Furmans.
Corollary 1.3. If E is a treeable ergodic measure preserving equivalence relation on a standard Borel probability
space with countable classes, then there is a countable group G acting -a.e. freely giving rise to E as its orbit
equivalence relation.
Moreover, the group G can be chosen solely as a function of the cost of E.

G. Hjorth / Annals of Pure and Applied Logic 143 (2006) 87102

3. Proof
We set about proving 1.1 for n = 2. It should be more or less clear how to extend it to larger n. We organize this
into a series of small technical lemmas, omitting proofs when they resemble earlier arguments.
The first of these lemmas, at 3.1, states that we may find a new morphism 0 included in E and a corresponding
partition of the space up into an infinite array of measurable sets,
B1,0 , B1,1,
B2,0 , B2,1 , B2,2,
...

Furman in [2] had previously obtained ergodic E which are not induced by an a.e. free action of a countable group.
His examples arose by the restriction of a non-treeable equivalence relation to a non-null set, and were therefore
known to be non-treeable.
2. Notation and definitions
We take all the usual notational shortcuts. All sets considered are measurable. All functions are measurable.
All group actions are by measure preserving transformations. We identify functions agreeing a.e. Unless otherwise
warned, the reader should assume that all non-empty sets are non-null. We tend to say everywhere when we only mean
almost everywhere. N begins with the number 0.
Definition. A standard Borel space is a set X equipped with a -algebra B, such that B is the -algebra generated by
some choice of a Polish topology on X. A standard Borel probability space is a standard Borel space equipped with
a probability measure on its Borel sets.
In general we will only be considering uncountable standard Borel spaces, and these are all isomorphic to the unit
interval in its usual Borel structure. Thus one might reasonably think of a standard Borel probability space as just
being some choice of a Borel probability measure on [0,1].
Definition. If E is an equivalence relation on a standard Borel probability space (X, ), and A, B X measurable,
then we say that a function
f :AB
is a morphism (for E) if it is a bijection and
x E f (x)
all x A. We say that E is measure preserving if every morphism is a measure preserving function. We say that E is
ergodic if every E-invariant set is either null or conull.
From [1], the measure preserving equivalence relations with countable classes are exactly those induced by a
countable group of measure preserving transformations. Even in the case that E is ergodic, [2] has shown that in
general we may not be able to choose this group so that it acts freely on the space. Below we will prove that the
additional assumption of treeability does ensure that we can choose the countable group so that it acts freely.
Definition. Given a set $\Phi$ of morphisms, a word built from $\Phi$ is a morphism of the form
$x \mapsto \varphi_1^{\epsilon(1)} \varphi_2^{\epsilon(2)} \cdots \varphi_n^{\epsilon(n)}(x)$,
where each $\varphi_i \in \Phi$, each $\epsilon(i) \in \{-1, 1\}$, and x ranges over a set on which these compositions make sense. The word
is reduced if at no i do we have $\epsilon(i) = -\epsilon(i+1)$ along with $\varphi_i = \varphi_{i+1}$. A set $\Phi$ of morphisms is said to be a
graphing of E if for any x E y there is a word mapping x to y; the graphing is said to be a treeing if there is always a
unique reduced word. Equivalently, $\Phi$ is a treeing if the adjacency relation, x R y if there exists $\varphi \in \Phi$ with $\varphi^{\pm 1}(x) = y$,
provides a tree structure on each equivalence class.
For a collection of morphisms $\Phi$ we let $\Phi^{-1} = \{\varphi^{-1} : \varphi \in \Phi\}$.
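The notion of a reduced word can be illustrated purely symbolically. The following is a minimal sketch, assuming a word is stored as a list of (generator name, exponent) pairs with exponents +1 or -1; the names `reduce_word` and `is_reduced` and the generator labels are hypothetical and do not come from the paper. It cancels adjacent occurrences of the same generator with opposite exponents, which is the pattern excluded by the definition of a reduced word. It ignores domains of definition, so it says nothing about where the corresponding composition of partial morphisms is defined.

```python
def reduce_word(word):
    """Cancel adjacent inverse pairs in a word of (name, exponent) pairs."""
    stack = []
    for name, exp in word:
        if exp not in (1, -1):
            raise ValueError("exponents must be +1 or -1")
        # Same generator with opposite exponent: the two letters cancel.
        if stack and stack[-1][0] == name and stack[-1][1] == -exp:
            stack.pop()
        else:
            stack.append((name, exp))
    return stack


def is_reduced(word):
    """A word is reduced when no cancellation is possible."""
    return reduce_word(word) == list(word)
```

In a treeing, two distinct reduced words never connect the same pair of points; in this symbolic picture that corresponds to the free reduction producing a unique normal form.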

$B_{n,0}, B_{n,1}, B_{n,2}, \ldots, B_{n,n},$
$\ldots,$
such that at each $n \geq 1$ and $k < n$
$\psi^0 | B_{n,k} : B_{n,k} \to B_{n,k+1}$
is a bijection. We also want to do this in such a way as we can obtain a new graphing containing $\psi^0$, for which the
cost has not increased, and such that all the parts of morphisms which have been lost from the older graphing can be
easily recovered as powers of $\psi^0$.
Lemma 3.1. Let E be as above, $\Phi$ a graphing of E. Then there is a graphing $\Phi^0$ of E, $\psi^0 \in \Phi^0$, and $(B_{n,k})_{n \geq 1, k \leq n}$ a
partition of X, such that:
(1) $C_\mu(\Phi^0) \leq C_\mu(\Phi)$ (the cost has not increased);
(2) $\psi^0[B_{n,k}] = B_{n,k+1}$ all $k < n$ (the new morphism moves the elements of the partition in the prescribed manner);
(3) $\mathrm{Dom}(\psi^0) = \bigcup_{k<n, n \in N} B_{n,k}$ (the new morphism has the indicated domain);
(4) for each $\varphi \in \Phi^0$ with $\varphi \neq \psi^0$ we have
$\varphi = \theta | C$,
some $C \subseteq X$, $\theta \in \Phi$ (the new graph consists just of the new morphism and restrictions of the old morphisms);
(5) for each $\theta \in \Phi$ there is a partition $(C_i)_{i \in N}$ of X such that
(i) $\theta | C_0 \in \Phi^0$;
(ii) and at $i > 0$, $\theta | C_i = (\psi^0)^{\ell_i} | C_i$, some $\ell_i \in Z$ (we can recover the missing pieces of the old morphisms as
powers of the new morphism).
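Proof. We assume that for distinct $\varphi, \varphi' \in \Phi$ we always have $\varphi(x) \neq \varphi'(x)$ when both are defined. We may also
assume there is some $\varphi \in \Phi$ with $\mathrm{Ran}(\varphi) \cap \mathrm{Dom}(\varphi)$ empty.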
We build transfinite sequences of graphings
( )< , ( )< ,
along with a choice of morphisms , by induction on so that:
(a) 0 is empty; 0 = ;
(b) 1 consists in a single morphism 0 with Ran(0 )Dom(0 ) = , 0 , (Dom(0 )) = 0; we set 0 = 0 , and
for all and we have Ran( ) Dom( ) empty;
 ;
(c) if + 1 < , > 0, and +1 , then Dom( ) Ran(  ) some 

(d) for we have and at a limit ordinal we have = < ;


(e) if ,  are distinct, then their ranges are disjoint and their domain are disjoint;
(f) each graphs E;


A lemma for cost attained G. Hjorth


G. Hjorth / Annals of Pure and Applied Logic 143 (2006) 87–102


Fig. 1. We need to add again, and we find some 1 1 such that some restriction 1 | A or its inverse 1 |1
A has image disjoint to both the domain
1

and image of 0 . Then, as indicated later, we can find 1 either of the form 1 |
A or 1 | A 1 whose domain is disjoint from the domain of 0

and whose range is disjoint to both its domain and range. We add 1 to 1 to obtain 2 and subtract off 1 | A .

Fig. 2. We insist that 0 , 1 have disjoint domains and that the range of 1 is entirely new. After this we keep going, finding some 2 whose domain
is disjoint from the domains of 0 and 1 and is, relative to these two morphisms, interdefinable with some element of 2 . We do insist that it picks
up some more of the space in its image, though we do not object if its domain is included in the range of one of the earlier functions.

(g) is the only morphism not appearing in +1 and for this there is a partition of Dom( ) into A0 , A1
such that
(A1 ) = 0;
| A1 +1 ;

there are 1 , . . . ,  , 1 , . . . , k 1 , and +1 , with


1

Dom( ) = 1 . . . k [ A0 ]

and we have either


| A0 = 1 2 . . .  1 . . . k | A0

or

( | A0 )1 = 1 2 . . .  1 . . . k | A0
(h) if is a limit then for < and , we have if is in every earlier , or otherwise we have
| A, , where

A, = {A1 : , }.
In rough terms, we begin the construction by taking some 0 in our original graphing which can be assumed to
have disjoint range and image and simply adding it to 1 and subtracting it from 0 to obtain 1 . We just describe
the first few steps, without giving much in the way of proofs yet.
Thus the general idea of this construction is to steadily transfer across pieces of the s to the s, so that
continues to graph E. The crucial part of this is (g). It tells us that when we remove a single piece | A0 of
a morphism then we are compensating by placing into +1 a morphism, , which can reconstruct | A0
using only pre-existing morphisms already placed into . As we continue through the construction, and survey the
construction at ever larger ordinals , the sets only get bigger, and our ability to write | A0 as a word from is
never endangered.
The other parts of this construction are less vital. (a) and (d) are bookkeeping requirements, describing how we add
to the s and remove from the s. (f) actually follows from the other clauses. (e) ensures that partial morphisms in


Fig. 3. The domain of 2 is included in the image of the earlier functions, but its image is new. After this we keep going to add 3 , spreading out to
new domains and so on.

the s will eventually have some morphism as their union, which in turn will give us 0 described in the statement of
3.1. (b) and (c) will enable us to obtain the (Bn,k ) sets with the structure indicated above. (h) and (d) state that at limit
ordinals we take an appropriate limit of the process so far, with a union on the side and a kind of intersection
along the s.
We continue with this construction for as long as possible, eventually arriving at some ( )< , ( )< admitting
no further extension. We will argue that this final ordinal is a successor ordinal, = + 1, and that in some natural
way will yield 0 and 0 as required.
Claim (1). is not a limit ordinal.


Claim (3). For and we have Dom( )




Bn =
Dom( ) Ran( ).
nN

Proof of Claim. By clause (c) in our construction and induction on .




Claim (4). For and a.e. x nN Bn either:

(Claim)

/ Dom( ) all ; or
(1) x
(2) there exists k and 0 , 1 , . . . , k such that k k1 . . . 0 (x) is well defined and not a member of Dom( )
any .



nN Bn ; and thus

Proof of Claim. Otherwise suppose not; we define Bn, to be the set of x Bn such that for every k there exists
0 , 1 , . . . k with k k1 . . . 0 (x) well defined, and observe that this set will have positive measure. It then

follows by clause (c) of our construction that we may at each m define a morphism from Bm, to Bm+1, and thus

for m m  we have (Bm, ) (Bm  , ), and thus (Bm, )mn provides a sequence of disjoint sets with measure
bounded away from zero, and a contradiction to (X) = 1. (Claim)
Fig. 4. The next morphism 3 takes its domain from the image of 0 . We do not rule out returning to the images of much earlier morphisms.

The next claim uses ergodicity for the first time.


Claim (5). X equals the union of the Dom( ), Ran( ) for .
Proof of Claim. Otherwise by ergodicity of E and Claims (3) and (4) we may find some 1 , m N, and
non-null A such that

Bn = ,
[ A]

nN

A Dom() Bm

and either
(1) Dom( ) A = all ; or
(2) there exists k and 0 , 1 , . . . , k such that k k1 . . . 0 (x) is well defined all x A and
k k1 . . . 0 [ A] and Dom( ) are disjoint for any .
We assume that (2) holds and that ; the other cases are exactly similar.



We can then let have domain k k1 . . . 0 [ A] and set

(x) = 01 11 . . . k1 (x).
We let A0 = A, A1 =Dom() \ A0 , +1 = { }, +1 = ( \ {}) {| A1 }. In this way we are able to
contest another round, contradicting the assumption that the construction ground to a halt at . (Claim)
Fig. 5. In this way we eventually ensure that all the space except for Dom(0 ) is in the range of some .

Proof of Claim. Otherwise we could simply let = < , and let consist of all | A, where some

< and as in (h) above A, = {A1 : , < , }. (Claim)

So from now on let us fix with + 1 = . For each let B0 = Dom(0 ), and for each n N let


Bn+1 = { [Bn ]| }.

Claim (2). For and n = m we have Bn , Bm disjoint.


Proof of Claim. By clause (e) in our construction and transfinite induction on .


(Claim)

We can then finish up the proof of the lemma by letting Bn,k be the set of x Bk such that there are
1 , 2 , . . . , nk with 1 . . . nk (x) well defined and n is the largest such integer. (In other words, kn Bn,k
is the set of elements whose orbit under has size exactly n + 1.) We use the disjointness of the morphisms in
to define the longed for 0 : for x Bn,k we consider the unique with x Dom( ) and let 0 (x) = (x). 

We now let B = nN Bn,0 . We are going to repeat the previous step, relativizing the process to B.
Lemma 3.2. There is a graphing of E, containing 0 along with a new morphism , with (Cn,k )nN,kn a
partition of B, such that:
(1) C ( ) C ( 0 );
(2) [Cn,k ] = C
n,k+1 all k < n;
(3) Dom( ) = k<n,nN Cn,k ;



With this granted it is straightforward to check that 3.1 applied to E| B , B =


. 
Now we can define a new morphism 1 with domain




Bn,k x|n x Bn,n , 0n (x)
nN,k<n

| B
(B) ,

, produces the requisite


Cm,k

mN,k<m

This morphism 1 will extend the old 0 , and so we simply set 1 (x) = 0 (x) for x

and 0n (x) mN,k<m Cm,k we let

nN,k<n

Bn,k . For x Bn,n

1 (x) = 0n (x).
From this we obtain a new graphing 1 of E with
1 = ( \ { 0 , }) {1 }.

Bn

Comparing 1 with 0 and 1 with 0 we discover the following:

nN Bn

Fig. 6. A typical case: the image of is what we want, but the domain intersects the domain of morphisms already in .

(4) for each with = , 0 we have


= |C ,
some C X , 0 ;
(5) for each 0 there is a partition (Di )iN of X such that
(i) | D0 ;
(ii) and at i > 0, |Ci equals some word built up from , 0 restricted to Ci .
Proof. We may first of all assume without any loss of generality that for each 0 \ { 0 } there are k,  with
Dom(( 0 )k ( 0 ) ), Ran(( 0 )k ( 0 ) ) B.

C ( 1 ) C ( 0 );
1 is still a graphing of E;
1 extends 0 and (X \ Dom(1 )) 12 (X \ Dom(0 ));
for any 0 we can partition Dom() into (Di )iN such that
(a) | D0 1 ;
(b) for each i > 0 there is i Z such that | Di = ( 1 )i | Di .
Plainly we can continue this indefinitely, obtaining a sequence ( n , n )nN where at each n

C ( n ) C ();
n is a graphing of E;
n extends n1 and (Dom(n )) 1 2n ;
for any n1 we can partition Dom() into (Di )iN such that
(a) | D0 n ;
(b) for each i > 0 there is i Z such that | Di = ( n ) | Di .
In the end we let

=
n
nN

6


and for each 1 we place into the morphism

| A, ,
k

where A, equals

{D : n  n (  , D = Dom(  ))}.
By considering the measure of the domain we actually have

- ?
B
We may then consider the graphing which for each 0 , = 0 , has the appropriate morphism
( 0 )k ( 0 ) for E| B .


: X X
(almost everywhere defined). By the nature of the definition of and the assumptions on the various n we have
that for any n we can partition Dom() into (Di )iN such that
(a) | D0 ;
(b) for each i > 0 there is i Z such that | Di = ( 1 )i | Di .


This gives us a new graphing of E containing a morphism : X X with C ( ) C (). We are now
going to take that whole step over again, adding in a new morphism but keeping hold of and not allowing it to be
changed. This requires relativizing 3.1 to .
0 of E which still contains , and a morphism
Lemma 3.3. Let E, , , be as above. Then there is a graphing
0 , and (Dn,k )nN,kn a partition of X , such that:
0
0 ) C ( );
(1) C (
(2) 0 [Dn,k ] = Dn,k+1 all k < n;

(3) Dom(0) = k<n,nN Dn,k ;
0 with = , 0 we have
(4) for each
= |C ,
some C X , ;
(5) for each there is a partition (Ci )iN of X such that
0;
(i) |C0
(ii) and at i > 0, |Ci equals some word in 0 , restricted to Ci .
Proof. This closely parallels the proof of 3.1. There is a difference in how we show we can continue at inductive
steps.
We build graphings


Again this construction stops at some successor ordinal = + 1, and as before at each and n N we can
let

D0 = Dom(),

Dn+1 = { [Dn ] : , n N}.





Again n = m implies Dn and Dm are


 disjoint. And again {Dom( ) Ran( ) : } equals nN Dn . And
the real battle is to show that nN Dn = X, and for this time around we have some more work. Suppose
again


nN Dn = X, and we try to show that after all we could have continued to define +1 , +1 , , .
Definition. Let F be the equivalence relation on X induced by the graphing { }.
Case 1. F is ergodic.
Then we can choose some , words , built up from { }, A0 A1 partitioning Dom( ), with


Dn \ {Dom( )| }
1 [ A0 ]
nN

[ A0 ] X \

Dn .

nN

1 .
After shrinking we may assume A0 Ran( ) some single , and then we can let = |
[A ]
0

( )< , ( )< ,
and morphisms :
(a) 0 is empty; 0 = \ { };
k [Dom()], we have
(b) 1 consists in a single morphism ,
where for some k, , and A =

A0

(c)
(d)
(e)
(f)
(g)


k

| A ;

for all < and we have Ran( ) Dom( ) empty;


if + 1 < , > 0, and +1 , then Dom( ) Ran(  ) some  ;

for we have and at a limit ordinal we have = < ;
if ,  are distinct, then their ranges are disjoint and their domains are disjoint;
each { } graphs E;
is the only morphism not appearing in +1 and for this there is a partition of Dom( ) into A0 , A1
such that

Dn \

X\

Dom( )

Dn

 

| A1 +1 ;

Fig. 7.

there are 1 , . . . ,  , 1 , . . . , k ( { }) , and +1 , with


Dom( ) = 1 . . . k [ A0 ]
and we have either
| A0 = 1 2 . . .  1 . . . k | A0
or
( | A0 )

= 1 2 . . .  1 . . . k | A0

(h) if is a limit then for < and , we either have or we have | A, , where

A, = {A1 : , }.


Case 2. F is not ergodic.


Then it follows that we may find Y1

nN

Dn , Y2 X \ (

nN

Dn ),

0 < (Y1 ), (Y2 ),


and for all y Y1 the equivalence class [y] F is disjoint from Y2 .
However E is ergodic and graphed by { }, and so we may find words 1 , 2 , . . . ,  from { }
and 1 , 2 , . . . , 1 with

Dn ,
 1 1 2 . . . 1 [Y1 ] X \
1 2 . . . 1 [Y1 ]

nN

Dn .

nN


With this general formulation observed and set to one side, let us continue with the proof for the specific case in
front of us.
of E, containing , 0 , along with a new morphism , and there is
Lemma 3.5. There is a graphing
0
(Hn,k )nN,kn , a partition of D, such that:
1


1


(1)
(2)
(3)
(4)


Dn \

X\

Dom( )

Dn

 

Z1

Fig. 8.

Again after possibly refining Y1 we may assume there is some word from such that


Dn \ {Dom( )| }.
1 2 2 . . . 1 [Y1 ]

= |C ,
0;
some C X,
0 there is a partition (Di )iN of X such that
(5) for each
0;
(i) | D0
(ii) and at i > 0, |Ci equals some word built up from , 0 , restricted to Ci .
0 for , D for A, { , 0 } for {1 , . . . , n } to obtain (Hn,k )nN,kn partitioning A
Proof. We apply the lemma to
and as required. 
With this claim granted, we can mimic earlier arguments and choose 0 with (Dom(0 )) = (Dom( )) +
(Dom( )), and { } graphing the same equivalence relation as { , 0 }. And then we may clearly continue with
n and n such that:
this over and over, obtaining at each n
n ) C (
0 );
C (
n is a graphing of E containing n , ;

n extends n1
and (Dom(n )) 1 2n ;
0 we can partition Dom() into (Di )iN such that
for any

n;
(a) | D0
(b) for each i > 0 | Di can be written in a word in n , .

nN

And we go onto another round with


=  1 1 | Z 1 ,
= 1 ,
where Z 1 = 2 . . . 1 [Y1 ].

We then let D = nN Dn,0 .

Continuing in this fashion we obtain some



=
n ,

Here is probably a good point to pause and formulate the general result. The proof of this general technical lemma
clearly follows from the above argument.
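Lemma 3.4. Let F be an ergodic measure preserving equivalence relation on standard Borel space $(Y, \nu)$, with all
classes countable, $\nu$ a finite Borel measure. Let $\Phi$ be a graphing of F containing morphisms $\{\psi_1, \psi_2, \ldots, \psi_n\}$.
Let $A \subseteq Y$ be a set whose saturation under $\{\psi_1, \psi_2, \ldots, \psi_n\}$ is conull, that is to say, for almost all $x \in Y$
there is some $y \in A$ and word w from $\{\psi_1, \psi_2, \ldots, \psi_n\}$ with $w(y) = x$. Suppose furthermore that
$C_\nu(\Phi) \geq C_\nu(\{\psi_1, \psi_2, \ldots, \psi_n\}) + \nu(A)$ and there is some $\varphi \in \Phi \setminus \{\psi_1, \psi_2, \ldots, \psi_n\}$ having $\mathrm{Dom}(\varphi)$, $\mathrm{Ran}(\varphi)$
disjoint subsets of A.
Then there is a graphing $\hat\Phi$ of F and a morphism $\hat\psi$ for F such that:
(1) $C_\nu(\hat\Phi) \leq C_\nu(\Phi)$; and if $\Phi$ is a treeing then so too is $\hat\Phi$;
(2) for some partition $(Y_{n,k})_{n \in N, k \leq n}$ of A we have
$\hat\psi[Y_{n,k}] = Y_{n,k+1}$
all $k < n$, $n \in N$;
(3) $\mathrm{Dom}(\hat\psi) = \bigcup_{n \in N, k<n} Y_{n,k}$; and $\hat\psi \in \hat\Phi$;
(4) for each $\varphi \in \hat\Phi \setminus \{\psi_1, \ldots, \psi_n, \hat\psi\}$ we have $\varphi = \theta | C$ some $C \subseteq Y$, $\theta \in \Phi$;
(5) for $\theta \in \Phi$ there is a partition $(C_i)_{i \in N}$ such that
(i) $\theta | C_0 \in \hat\Phi$;
(ii) at $i > 0$, $\theta | C_i$ equals some word in $\{\psi_1, \psi_2, \ldots, \psi_n, \hat\psi\}$ restricted to $C_i$.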


) C (
0 );
C (
0
[Hn,k ] = Hn,k+1 all k < n;

Dom( ) = k<n,nN Hn,k ;
with = , 0 , we have
for each

n , thereby completing the proof of 1.1 in the case


and as before we may define  to be the appropriate limit of the
that C () 2.
The general case of cost greater than some arbitrary integer is clearly exactly similar. We may also observe that the
last step from this argument suggests the following modification:
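Lemma 3.6. Let F be an ergodic measure preserving equivalence relation on standard Borel space $(Y, \nu)$, with all
classes countable, $\nu$ a finite Borel measure. Let $\Phi$ be a graphing of F containing morphisms $\{\psi_1, \psi_2, \ldots, \psi_n\}$.
Let $A \subseteq Y$ be a set whose saturation under $\{\psi_1, \psi_2, \ldots, \psi_n\}$ is conull, that is to say, for almost all $x \in Y$
there is some $y \in A$ and word w from $\{\psi_1, \psi_2, \ldots, \psi_n\}$ with $w(y) = x$. Suppose furthermore that
$C_\nu(\Phi) \geq C_\nu(\{\psi_1, \psi_2, \ldots, \psi_n\}) + \nu(A)$ and there is some $\varphi \in \Phi \setminus \{\psi_1, \psi_2, \ldots, \psi_n\}$ having $\mathrm{Dom}(\varphi)$, $\mathrm{Ran}(\varphi)$
disjoint subsets of A.
Then there is a graphing $\hat\Phi$ of F and a morphism $\hat\psi$ for F such that:
(1) $C_\nu(\hat\Phi) \leq C_\nu(\Phi)$; and if $\Phi$ is a treeing then so too is $\hat\Phi$;
(2) $\hat\psi : A \to A$; $\hat\psi \in \hat\Phi$;
(3) for each $\varphi \in \hat\Phi \setminus \{\psi_1, \ldots, \psi_n, \hat\psi\}$ we have $\varphi = \theta | C$ some $C \subseteq Y$, $\theta \in \Phi$;
(4) for $\theta \in \Phi$ there is a partition $(C_i)_{i \in N}$ such that
(i) $\theta | C_0 \in \hat\Phi$;
(ii) at $i > 0$, $\theta | C_i$ equals some word in $\{\psi_1, \psi_2, \ldots, \psi_n, \hat\psi\}$ restricted to $C_i$.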


4. Corollaries
Lemma 4.1. A treeable ergodic measure preserving equivalence relation E with countable classes on a standard
Borel probability space has cost n if and only if it is induced by some free action of $F_n$.
Proof. The "if" direction is known from [3], so we concentrate on the converse.
We begin with a treeing $\Phi$ of E; by [3], $C_\mu(\Phi) = n$. Applying the argument of the last section we can find an
alternative graphing $\hat\Phi$ containing $\psi_1, \ldots, \psi_n$, each
$\psi_i : X \to X$,
and with
$C_\mu(\hat\Phi) \leq C_\mu(\Phi)$.
Since $n = C_\mu(E) \leq C_\mu(\hat\Phi) \leq C_\mu(\Phi) = n$, we have equality throughout and hence $\hat\Phi = \{\psi_1, \ldots, \psi_n\}$. □


Lemma 4.2. Let E be an ergodic measure preserving equivalence relation with countable classes on a standard Borel
probability space $(X, \mu)$. If E has infinite cost and is treeable, then there is a free action of $F_\infty$ giving rise to E as its
orbit equivalence relation.
Proof. Let $\Phi = \{\varphi_1, \ldots, \varphi_n, \ldots\}$ be a treeing of E with infinite cost. Without loss we may assume that each $\mathrm{Dom}(\varphi_n)$
is disjoint from $\mathrm{Ran}(\varphi_n)$.
Then iterating Lemma 3.6 we may find successive treeings
$\Phi_1, \Phi_2, \ldots, \Phi_m, \ldots$
and morphisms $\psi_1, \ldots, \psi_m, \ldots$ and measurable sets $(A_{n,m})_{m<n \in N}$ such that
each $\Phi_m = \{\psi_1, \ldots, \psi_m, \varphi_{m+1} | A_{m+1,m}, \varphi_{m+2} | A_{m+2,m}, \ldots\}$;
each $\psi_i : X \to X$ is total;
each $\varphi_m$ can be written as a word in $\{\psi_1, \ldots, \psi_m\}$;
for $n > m$ we may partition $\mathrm{Dom}(\varphi_n)$ up into $(B_i)_{i \in N}$ such that
(i) $B_0 = A_{n,m}$, and so $\varphi_n | B_0 \in \Phi_m$;
(ii) each $\varphi_n | B_i$ can be written as the restriction of a word in $\{\psi_1, \ldots, \psi_m\}$.

We finish with $\{\psi_i : i \in N\}$ as a graphing of E. Since each $\Phi_m$ is a treeing so too is the limit, $\{\psi_i : i \in N\}$. Since
each $\psi_i$ is total and since they jointly give rise to a treeing, we thus obtain the free action of $F_\infty$. □
Lemma 4.3. Let E be an ergodic measure preserving equivalence relation with countable classes on a standard Borel
probability space $(X, \mu)$. If E is treeable, then there is a free action of a countable group G giving rise to E as its
orbit equivalence relation.
Proof. We at once can assume the cost is finite, or else the result follows with $G = F_\infty$ from the last lemma. By
earlier results we may assume there is a treeing $\Phi$ and some $\psi \in \Phi$ which is total. By Dye's theorem on the orbit
equivalence of ergodic Z-actions, we can assume that there is a sequence of subsets of X, $(A_i)_{i \in N}$, such that each
$\mu(A_i) = 2^{-i}$,
$A_{i+1} \subseteq A_i$,
$\psi^{2^i} : A_{i+1} \to A_i$,
and $\{A_{i+1}, \psi^{2^i}[A_{i+1}]\}$ partitioning $A_i$. At each i we let $\psi_i : A_{i+1} \to A_i$ be given by
$\psi_i = \psi^{2^i} | A_{i+1}$;
note that $\{\psi_i : i \in N\}$ graphs the same equivalence relation as $\{\psi\}$.
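The nested sets $A_i$ can be pictured in a finite toy model. The sketch below is only an illustration under stated assumptions, not the measure-theoretic construction: it takes the space to be the integers mod $2^N$ with uniform measure, the total morphism to be the rotation $x \mapsto x + 1$, and $A_i$ to be the multiples of $2^i$. The names `psi`, `A` and `check_level` are hypothetical. It checks that the $2^i$-th power of the rotation carries $A_{i+1}$ onto the rest of $A_i$, so that the two pieces partition $A_i$.

```python
N = 4                      # toy space X = {0, ..., 2**N - 1}, uniform measure
SIZE = 2 ** N


def psi(x, times=1):
    """The rotation x -> x + 1 (mod 2**N), applied `times` times."""
    return (x + times) % SIZE


def A(i):
    """A_i: the points divisible by 2**i, a set of measure 2**-i."""
    return {x for x in range(SIZE) if x % (2 ** i) == 0}


def check_level(i):
    """Check that A_{i+1} and its image under psi^(2^i) partition A_i."""
    image = {psi(x, 2 ** i) for x in A(i + 1)}
    disjoint = A(i + 1).isdisjoint(image)
    covers = (A(i + 1) | image) == A(i)
    return disjoint and covers
```

In this model the restricted maps from $A_{i+1}$ into $A_i$ are the "carry" steps of binary addition, which is why together they generate the same orbits as the single rotation.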
We then build (ki )iN , k0 N, each ki+1 {0, 1}, and morphisms
i, j : Ai Ai
for j < ki , and treeings n for E such that:


(a) n = {i : i N} {i, j : i n, j < ki } { | B,n : }, each B,n some subset of Dom( );


(b) C ({i : i N} {i, j : i n, j < ki }) C ( ) 2n1 = C (E) 2n1 ;
(c) for each we may partition X into (Ci )iN such that
(i) |C0 n ;
(ii) each |Ci+1 equals some word in {i : i N} {i, j : i n, j < ki }.
n = {i : i
It follows from Lemma 3.6 that we may indeed construct such a sequence. Given the graphing
n = {} {i, j : i n, j < ki }, we can consider
N} {i, j : i n, j < ki }, or the morally equivalent graphing
n1

. If no, we just pass on with kn+1 = 0; if yes, then we apply 3.6 to obtain some
whether C (E) C (n ) + 2
n+1,0 : An+1 An+1

C ( ) C ().

(a)
(b)
(c)
(d)


and graphing
n+1 = {i : i N} {i, j : i n + 1, j < ki } { | B,n+1 : },
and take the process to the next round.
This construction granted it follows from (c) that
= {i : i N} {i, j : i N, j < ki }
graphs E. Since each n is a treeing it follows that is a treeing. We will use it define a group action in some
natural way.
We let G be the group with generators $\{a_i : i \in N\}$, $\{b_{i,j} : i \in N, j < k_i\}$. We will ask that this group be free
subject to the relations
$a_\ell b_{i,j} = b_{i,j} a_\ell$
for $i > \ell$,
$(a_\ell)^2 = 1$,
$a_\ell a_k = a_k a_\ell$
all $k, \ell$. For each $i \in N$ we define a total function $T_i : X \to X$ by first choosing for a.e. x the unique
x
{1, 0} such that
m 0x , m 1x , . . . , m i1
mx

mx

mx

i1
i2
yx = i1
i2
. . . 0 0 (x) Ai

and then letting


m 0x

Ti (x) = 0

m 1x

m x

. . . i1i1 i (yx )

if yx Ai+1 ,
m 0x

Ti (x) = 0

m 1x

m x

. . . i1i1 i1 (yx )

if yx Ai \ Ai+1 . (In other words we recursively define each Ti to be the unique Ti : X X of order 2 which
extends i and commutes with T j all j < i .) Similarly we define for j < ki
m 0x

Si, j (x) = 0

m 1x

m x

. . . i1i1 i, j (yx ).

We want to show that if we let each a act on X via T and each bi, j act via Si, j then firstly it is well defined as an
action of G and secondly that it is free a.e.
Claim (1).
$S_{i,j} T_\ell = T_\ell S_{i,j}$
all $\ell < i$,
$T_\ell = T_\ell^{-1}$,
$T_\ell T_k = T_k T_\ell$
all $\ell, k$.
Proof of Claim. This follows quickly from the definitions.

(Claim)

Thus this assignment a T , bi, j Si, j extends to a homomorphism


G M (X),
g g ,
where M (X) is the group of invertible mpts on X and gives us a measure preserving action of G on X.
Claim (2). If g (x) = x for a non-null collection of x X then g = 1.

Randomness and
the linear degrees
of computability

Proof of Claim. Suppose g is as above and for a non-null set of x we have g (x) = x. We attempt to reduce g down
to 1 using the relations imposed as part of the definition of G.
We may write the group element in the form

Andrew E.M. Lewis and George Barmpalias

g = c0 c1 . . . c M
1
where each ck equals either ai(k) , bi(k), j (k) , or bi(k),
j (k) . After possibly replacing each ck by a suitable
m 0 m 1
a1

a0

i(k)1
i(k)1
. . . ai(k)1
ck ai(k)1
. . . a0 0

we may assume that there is a positive measure set A X such that for all x A

Annals of Pure and Applied Logic


145 (2007), Pages 252-257

g (x) = x
and at each k M we have
Uk+1 Uk+2 . . . U M (x) Ak(i) ,
1
where each U is respectively Ti() , Si(), j (), Si(),
j () , depending on whether c equals either ai() , bi(), j (), or
m

i(k)1
i(k)1
1
0 m 1
bi(),
a1 . . . ai(k)1
ck ai(k)1
. . . a0 0 as a consequence
j (); the key point is that in G the element ck equals a0
of the relations imposed in the definition of the group G. It then follows from being a treeing that we may reduce
the word

U0 U1 . . . U M | A
down to the identity by the operations of canceling various U U+1 when U+1 = U1 . Thus in particular it follows
that c0 c1 . . . c M will easily reduce to the identity in G and we are done. (Claim) 
Acknowledgement
The author was partially supported by NSF grants DMS 99-70403, DMS 01-40503.
References
[1] J. Feldman, C.C. Moore, Ergodic equivalence relations and von Neumann algebras, Transactions of the American Mathematical Society 234 (1977) 289–324.
[2] A. Furman, Orbit equivalence rigidity, Annals of Mathematics 150 (1999) 1083–1108.
[3] D. Gaboriau, Coût des relations d'équivalence et des groupes, Inventiones Mathematicae 139 (2000) 41–98.
[4] A.S. Kechris, B. Miller, Topics in Orbit Equivalence, in: Lecture Notes in Mathematics, vol. 1852, Springer, Berlin, 2004.
[5] G. Levitt, On the cost of generating an equivalence relation, Ergodic Theory and Dynamical Systems 15 (1995) 1173–1181.
[6] S. Popa, A class of cross-product factors by free groups, having trivial fundamental group, preprint UCLA 2001.


Randomness and the linear degrees of computability A. E. M. Lewis and G. Barmpalias


A.E.M. Lewis, G. Barmpalias / Annals of Pure and Applied Logic 145 (2007) 252257


values of $a \neq 1$ are of small relevance in the study of computability theory. From a computational point of view, then,
the linear reducibility can be seen as formalizing the notion of length efficient oracle computation.
Annals of Pure and Applied Logic 145 (2007) 252–257
www.elsevier.com/locate/apal

Randomness and the linear degrees of computability


Andrew E.M. Lewis a,∗,1, George Barmpalias b,2
a Dipartimento di Scienze Matematiche ed Informatiche "Roberto Magari", Pian dei Mantellini 44, 53100 SIENA, Italy
b Department of Pure Mathematics, Leeds University, Leeds, LS2 9JT, United Kingdom

Definition 1.1. We say $\alpha$ is linear reducible to $\beta$ ($\alpha \leq_\ell \beta$) if there is a Turing functional $\Gamma$ and a constant c such that
$\Gamma^\beta = \alpha$ and the use of this computation on any argument n is bounded by n + c. The Turing functionals which have
their use restricted in such a way are called $\ell$-functionals.
The linear reducibility (in particular the case where c = 0) was used in the recent work of Soare, Nabutovsky and
Weinberger on applications of computability theory to differential geometry (see Soare [10]). If we consider partial
computable functionals as operators from $2^{<\omega}$ to itself, the $\ell$-functionals are also closely related to the notion of
Lipschitz continuous operators.
Received 12 March 2006; received in revised form 14 August 2006; accepted 21 August 2006
Available online 18 October 2006
Communicated by R.I. Soare

Definition 1.2. A partial operator $\Gamma$ from a (pseudo-)metric space (X, d) to itself is Lipschitz continuous if there is
a constant C such that
$d(\Gamma(x), \Gamma(y)) \leq C \cdot d(x, y)$    (1)
for all x, y in the domain of $\Gamma$.
We consider the pseudo-metric d on $2^{<\omega}$ such that for incompatible strings $\sigma$ and $\sigma'$, $d(\sigma, \sigma') = 2^{-n}$ where n is the
least position where $\sigma$, $\sigma'$ differ, and such that $d(\sigma, \sigma') = 0$ if $\sigma$ and $\sigma'$ are compatible.
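The pseudo-metric on finite binary strings is easy to compute directly. The following is a minimal sketch, assuming strings are Python `str` values over the characters "0" and "1" and that positions are counted from 0 (the text does not fix the indexing convention); the function name `d` simply mirrors the symbol used above.

```python
def d(s, t):
    """Pseudo-metric on finite binary strings.

    Returns 0 if one string is a prefix of the other (compatible strings),
    and otherwise 2**-n where n is the least position at which they differ.
    """
    for n, (a, b) in enumerate(zip(s, t)):
        if a != b:
            return 2.0 ** -n
    return 0.0
```

Because compatible but distinct strings are at distance 0, this is a pseudo-metric rather than a metric, as the text notes.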

Abstract
We show that there exists a real $\alpha$ such that, for all reals $\beta$, if $\alpha$ is linear reducible to $\beta$ ($\alpha \leq_\ell \beta$, previously denoted as $\alpha \leq_{sw} \beta$)
then $\beta \leq_T \alpha$. In fact, every random real satisfies this quasi-maximality property. As a corollary we may conclude that there exists
no $\ell$-complete $\Delta_2$ real. Upon realizing that quasi-maximality does not characterize the random reals (there exist reals which
are not random but which are of quasi-maximal $\ell$-degree) it is then natural to ask whether maximality could provide such a
characterization. Such hopes, however, are in vain since no real is of maximal $\ell$-degree.
© 2006 Elsevier B.V. All rights reserved.

Proposition 1.1. An $\ell$-functional is a partial computable and Lipschitz continuous operator from $(2^{<\omega}, d)$ to itself.
Conversely, every partial computable and Lipschitz continuous operator $\Gamma : (2^{<\omega}, d) \to (2^{<\omega}, d)$ equals an $\ell$-functional on infinite strings.
Proof. If $\Gamma$ is an $\ell$-functional, it is obviously partial computable but also Lipschitz continuous as a function on $2^{<\omega}$.
Indeed, suppose we are given two finite binary strings $\sigma, \sigma'$ such that $d(\Gamma^\sigma, \Gamma^{\sigma'}) = 2^{-t}$. If the use of $\Gamma$ on n is n + c
for some fixed constant c, $d(\sigma, \sigma')$ must be at least $2^{-(t+c)}$. Hence
$d(\Gamma^\sigma, \Gamma^{\sigma'}) \leq d(\sigma, \sigma') \cdot 2^c$    (2)

Keywords: Computability; Randomness; Degree

1. Introduction
In the process of computing a real $\alpha$ given an oracle for $\beta$ it is natural to consider the condition that for the
computation of the first n bits of $\alpha$ we are only allowed to use the information in the first n bits of $\beta$. It is not
difficult to see that this notion of oracle computation is complexity sensitive in many ways. We can then generalize
this definition in a straightforward way by allowing that, in the computation of $\alpha \upharpoonright n$, access is permitted to $\beta \upharpoonright (n + c)$
for some fixed constant c.
The study of oracle computations of this kind and of the reducibility they induce on $2^\omega$ was initiated by Downey,
Hirschfeldt and LaForte [6,5], the motivation being that they might serve as a measure of relative randomness.
They presented the induced reducibility as a restriction of the weak truth table reducibility and gave it the (perhaps
unfortunate!) name "strong weak truth table reducibility", or sw reducibility for short. After discussions with other
researchers in the area we introduce here the terminology "linear reducible" in place of "strong weak truth table
reducible". While another reasonable contender for this title would certainly be the set of reductions in which the
use on argument n is bounded by an + c for some constants a and c, it would seem that reductions of this type for

Corresponding author.

E-mail address: thelewisboy@hotmail.com (A.E.M. Lewis).


1 The first author was supported by EPSRC grant No. GR/S28730/01.
2 The second author was supported by EPSRC grant No. EP/C001389/1.

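and $\Gamma$ is Lipschitz continuous. On the other hand, if $\Gamma$ is partial computable and Lipschitz continuous (say with
constant $2^c$) we show that we can construct an $\ell$-functional which is equal to $\Gamma$ on infinite strings. To compute a total
$\Gamma^\alpha$ on n knowing the first n + c bits of $\alpha$ we effectively find an extension $\sigma$ of $\alpha \upharpoonright (n + c)$ such that $\Gamma^\sigma(n) \downarrow$. Since
(2) holds, the distance between $\Gamma^\alpha \upharpoonright (n + 1)$ and $\Gamma^\sigma \upharpoonright (n + 1)$ will be less than $2^{-n}$. So $\Gamma^\sigma(n) = \Gamma^\alpha(n)$. □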
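The first half of the argument, that a use bound of n + c forces a Lipschitz constant of $2^c$, can be checked by brute force on a toy example. The sketch below is an illustration under stated assumptions, not a construction from the paper: the functional `gamma` is a hypothetical example whose n-th output bit is the XOR of input bits n and n + c (positions counted from 0), so it reads input positions up to n + c only; `d` is the pseudo-metric on finite strings, and `lipschitz_holds` tests inequality (2) on all pairs of strings of a given length.

```python
from itertools import product


def d(s, t):
    """Pseudo-metric: 0 for compatible strings, else 2**-(first difference)."""
    for n, (a, b) in enumerate(zip(s, t)):
        if a != b:
            return 2.0 ** -n
    return 0.0


def gamma(sigma, c):
    """Toy use-bounded functional: output bit n is sigma[n] XOR sigma[n + c]."""
    return "".join(
        str(int(sigma[n]) ^ int(sigma[n + c])) for n in range(len(sigma) - c)
    )


def lipschitz_holds(length, c):
    """Check d(gamma(s), gamma(t)) <= 2**c * d(s, t) for all s, t of `length`."""
    strings = ["".join(bits) for bits in product("01", repeat=length)]
    return all(
        d(gamma(s, c), gamma(t, c)) <= 2 ** c * d(s, t)
        for s in strings
        for t in strings
    )
```

If two outputs first differ at position t, the inputs must already differ at some position at most t + c, which is exactly the step used to derive (2).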
The following are some results from the literature on the `-degrees (induced by ` ) which are relevant to our
present considerations. For more background on this structure we refer the reader to [7,5].
P
Definition 1.3. A Solovay test is a c.e. set S of binary strings such that S 2| | < . A real number avoids S
if for almost all S, 6 . A real is (Martin-Lof) random if it avoids all Solovay tests.
Definition 1.4. A real number is computably enumerable (c.e.) if it is the limit of a computable increasing sequence
of rationals.
The main justification for $\leq_\ell$ as a measure of relative randomness was the following:
Proposition 1.2 (Downey et al. [6]). If $\alpha \leq_\ell \beta$ then for all $n$, the prefix-free complexity of $\alpha \upharpoonright n$ is less than or equal
to that of $\beta \upharpoonright n$ (plus a constant).
In particular, then, $\leq_\ell$ preserves randomness: if $\alpha$ is a random real and $\alpha \leq_\ell \beta$ then $\beta$ is random, so that any
$\ell$-degree either contains only random or no random reals.
Yu and Ding proved the following:
Theorem 1.1 (Yu and Ding [11]). There is no $\ell$-complete c.e. real.
By a uniformization of their proof they got two c.e. reals which have no c.e. real $\ell$-above them. Hence:

© 2006 Elsevier B.V. All rights reserved.


0168-0072/$ - see front matter
doi:10.1016/j.apal.2006.08.001


Corollary 1.1 (Downey et al. [6]). The structure of $\ell$-degrees is not an upper semi-lattice.


Randomness and the linear degrees of computability A. E. M. Lewis and G. Barmpalias


A.E.M. Lewis, G. Barmpalias / Annals of Pure and Applied Logic 145 (2007) 252–257

The main idea of their proof of Theorem 1.1 can be applied for the case of c.e. sets in order to get an analogous
result. Using different ideas Barmpalias [1] proved the following stronger result.


Corollary 2.2 (The equivalent of the Yu–Ding Theorem for the $\Delta_2$ Reals). There exists no $\ell$-complete $\Delta_2$ real.
Proof. This follows immediately from the previous corollary.

Theorem 1.2 (Barmpalias [1]). There are no $\ell$-maximal c.e. sets. That is, for every c.e. set $A$, there exists a c.e. set
$W$ such that $A <_\ell W$.
Note that since the Solovay degrees and the $\ell$-degrees coincide on the c.e. sets (see [5]) the following also holds.

Proof. Every Turing degree above $\mathbf{0}'$ contains a random real [8].

Corollary 1.2 (Barmpalias [1]). The substructure of the Solovay degrees consisting of the ones with c.e. members
(i.e. containing c.e. sets) has no maximal elements.

Corollary 2.4. The $\ell$-degrees are not an upper semi-lattice, in fact there exists a set of two $\ell$-degrees with no upper
bound.

In Barmpalias and Lewis [2] it was shown that there are c.e. reals $\alpha$ that cannot be $\ell$-computed by any random c.e.
real. That is, for any c.e. real $\beta \geq_\ell \alpha$, $\beta$ is not random. Also, in Barmpalias and Lewis [3] it was shown that strictly
below every random $\ell$-degree there is another random $\ell$-degree. The first aim of this paper is to prove the following
(perhaps rather surprising) result.

Proof. Just choose any $\alpha$ and $\beta$ which are random and Turing incomparable.
Theorem 2.3 below, however, tells us that quasi-maximality does not characterize the random reals.

Theorem 1.3. There exists a (globally) quasi-maximal $\ell$-degree, i.e. there exists a real $\alpha$ such that, for all reals $\beta$, if
$\alpha \leq_\ell \beta$ then $\beta \leq_T \alpha$. In fact every random real satisfies this quasi-maximality property.

Theorem 2.2 (Chaitin [4]). Consider a total computable prediction function $f$ which, given an arbitrary finite initial
segment of a real $\alpha$, returns either "no prediction", "the next bit is a 0", or "the next bit is a 1". If $\alpha$ is random and $f$
predicts infinitely many bits of $\alpha$ then in the limit the proportion of correct predictions to total predictions made tends
to $\frac{1}{2}$.

The fascination of this result lies in the fact that we are generally not used to degree structures possessing anything
like maximal elements in the global sense (where we consider the degrees of all reals).

Theorem 2.3. There exists $\alpha$ of quasi-maximal $\ell$-degree which is not random.


We make the following definitions.

2. Random reals are quasi-maximal


Let $\Psi_i$, the $i$th $\ell$-functional, satisfy the condition that the use in computing argument $n$ is $n + c_i + 1$ (should this
computation converge).
Definition 2.1. For $\sigma \in 2^{<\omega}$ let $\psi(\sigma, i)$ be the number of strings $\tau$ of length $|\sigma| + c_i$ such that $\Psi_i^{\tau} = \sigma$.
Lemma 2.1. For any $\sigma$, $i$ we have $\psi(\sigma 0, i) + \psi(\sigma 1, i) \leq 2\psi(\sigma, i)$.
Proof. Consider the set of all one bit extensions of those strings $\tau$ of length $|\sigma| + c_i$ such that $\Psi_i^{\tau} = \sigma$. There are
$2\psi(\sigma, i)$ strings in this set.
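The counting argument of Lemma 2.1 can be checked by brute force on a small example. The sketch below is a minimal illustration under stated assumptions: `toy_functional` (output bit $n$ is the AND of oracle bits $n$ and $n+1$) and the constant `C = 1` are hypothetical stand-ins for an $\ell$-functional and its constant, and `count_preimages` plays the role of the counting function of Definition 2.1.

```python
from itertools import product
from typing import Tuple

C = 1  # hypothetical constant: n output bits are computed from n + C oracle bits


def toy_functional(tau: Tuple[int, ...]) -> Tuple[int, ...]:
    """Toy total functional: output bit n is tau[n] AND tau[n + 1]."""
    return tuple(tau[n] & tau[n + 1] for n in range(len(tau) - C))


def count_preimages(sigma: Tuple[int, ...]) -> int:
    """Number of strings tau of length |sigma| + C that the functional maps to sigma."""
    length = len(sigma) + C
    return sum(1 for tau in product((0, 1), repeat=length)
               if toy_functional(tau) == tuple(sigma))


def lemma_holds(max_len: int) -> bool:
    """Check count(sigma 0) + count(sigma 1) <= 2 * count(sigma) up to max_len."""
    for n in range(max_len + 1):
        for sigma in product((0, 1), repeat=n):
            extended = count_preimages(sigma + (0,)) + count_preimages(sigma + (1,))
            if extended > 2 * count_preimages(sigma):
                return False
    return True


inequality_ok = lemma_holds(4)
```

Because the toy functional is total, every one bit extension of a preimage is itself a preimage of some one bit extension of the output, so the inequality holds here with equality. A partial functional could lose preimages and make it strict.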
The key to analyzing the relationship between Martin-Löf randomness and quasi-maximality lies in a theorem of
Schnorr's on effective super-martingales.
Definition 2.2. A super-martingale is a function $f : 2^{<\omega} \to \mathbb{R}^{+} \cup \{0\}$ such that for all $\sigma$, $2f(\sigma) \geq f(\sigma 0) + f(\sigma 1)$.
We say that the super-martingale succeeds on a real $\alpha$ if $\limsup_n f(\alpha \upharpoonright n) = \infty$.
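A small Python sketch may help fix the definition. The betting strategy `bet_on_zeros` is a hypothetical example that does not come from the paper: it stakes all capital on the next bit being 0, so it satisfies the super-martingale inequality and its capital grows without bound along the all-zero sequence.

```python
from fractions import Fraction
from itertools import product
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def bet_on_zeros(sigma: Bits) -> Fraction:
    """Toy strategy (assumption): capital 2^|sigma| if sigma is all zeros, else 0."""
    return Fraction(2) ** len(sigma) if all(b == 0 for b in sigma) else Fraction(0)


def is_supermartingale(f: Callable[[Bits], Fraction], max_len: int) -> bool:
    """Check 2 f(sigma) >= f(sigma 0) + f(sigma 1) for all |sigma| <= max_len."""
    for n in range(max_len + 1):
        for sigma in product((0, 1), repeat=n):
            if 2 * f(sigma) < f(sigma + (0,)) + f(sigma + (1,)):
                return False
    return True


property_ok = is_supermartingale(bet_on_zeros, 6)
capital_on_zero_real = [bet_on_zeros((0,) * n) for n in range(6)]
```

The list `capital_on_zero_real` doubles at every step, a finite glimpse of the strategy succeeding on the all-zero real in the sense of Definition 2.2.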
Definition 2.3. We say that the super-martingale $f$ is effective if (i) for all $\sigma$, $f(\sigma)$ is a c.e. real and (ii) there is a
computable function $f'$ such that, for all $\sigma$, $\{f'(\sigma, s)\}_s$ is an increasing sequence of rationals with limit $f(\sigma)$.
Theorem 2.1 (Schnorr [9]). A real $\alpha$ is Martin-Löf random iff no effective super-martingale succeeds on $\alpha$.
The proof of Theorem 1.3. For all $\sigma$, $i$ let $f_i(\sigma) = \psi(\sigma, i)$. Since each function $f_i$ can be effectively approximated
from below, Lemma 2.1 says precisely that every $f_i$ is an effective super-martingale.

So suppose given $\alpha$ and $\beta$ such that $\alpha$ is a random real and $\Psi_i^{\beta} = \alpha$. By Schnorr's theorem we may define $m^{\star}$ to be
the maximum $m$ such that there exist an infinite number of $n$ with $f_i(\alpha \upharpoonright n) = m$. Let $T_n$ be all those strings $\tau$ of length
$n + c_i$ such that $\Psi_i^{\tau}$ is the initial segment of $\alpha$ of length $n$ and let $T = \bigcup_n T_n$. We say that a real lies on $T$ if all but
finitely many initial segments are in $T$. Since there are a finite number of $\beta'$ lying on $T$ there exists $\sigma_0 \subset \beta$ such that
if $\beta' \neq \beta$ lies on $T$ then $\sigma_0 \not\subset \beta'$. Now suppose we are given $\sigma \supseteq \sigma_0$ which is not an initial segment of $\beta$. Using an
oracle for $\alpha$ we can enumerate all tuples $(n, \tau_1, \ldots, \tau_{m^{\star}})$ such that $T_n = \{\tau_1, \ldots, \tau_{m^{\star}}\}$ until we find such a tuple with no
$\tau_m$ compatible with $\sigma$, whereupon we can deduce that $\sigma$ is not an initial segment of $\beta$.
Corollary 2.1. There exist low reals which are of quasi-maximal $\ell$-degree.
Proof. There exist low random reals [7].


Corollary 2.3. Every Turing degree above $\mathbf{0}'$ contains a set of quasi-maximal $\ell$-degree.

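Definition 2.4. For $\sigma \in 2^{<\omega}$ let $\pi(\sigma, i) = \min\{\psi(\sigma', i) : \sigma' \supseteq \sigma\}$. Let $\sigma^{\star}(\sigma, i)$ be the least string $\sigma' \supseteq \sigma$ such that
$\psi(\sigma', i) = \pi(\sigma, i)$.
Lemma 2.2. Given $\sigma_0$, $i$, let $\sigma_1 = \sigma^{\star}(\sigma_0, i)$. For all $\sigma_2 \supseteq \sigma_1$ we have $\psi(\sigma_2, i) = \pi(\sigma_0, i)$.
Proof. By induction on the length of $\sigma_2$. So suppose given $\sigma_2 \supseteq \sigma_1$ such that $\psi(\sigma_2, i) = \pi(\sigma_0, i)$. Now if
$\psi(\sigma_2 0, i) < \pi(\sigma_0, i)$ or $\psi(\sigma_2 1, i) < \pi(\sigma_0, i)$ this would contradict the fact that $\sigma_1 = \sigma^{\star}(\sigma_0, i)$. Thus by Lemma 2.1
$\psi(\sigma_2 0, i) = \psi(\sigma_2 1, i) = \pi(\sigma_0, i)$.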

Lemma 2.3. Given $\sigma_0$, $i$, let $\sigma_1 = \sigma^{\star}(\sigma_0, i)$. For all $\alpha \supset \sigma_1$ and all $\beta$ such that $\Psi_i^{\beta} = \alpha$ we have that $\beta \leq_T \alpha$.
Proof. Given $\alpha$ and $\beta$ as in the statement of the lemma, let $T_n$ be all those strings $\tau$ of length $n + c_i$ such that $\Psi_i^{\tau}$ is
the initial segment of $\alpha$ of length $n$ and let $T = \bigcup_n T_n$. The following facts follow immediately from the fact that, by
Lemma 2.2, there are precisely the same number of strings (actually $\pi(\sigma_0, i)$) in $T_n$ for all sufficiently large $n$.
(i) There are a finite number of reals lying on $T$ (at most $\pi(\sigma_0, i)$).
(ii) We can compute (not just enumerate) $T$ using an oracle for $\alpha$.
By (i) there exists $\tau_0 \subset \beta$ such that if $\beta' \neq \beta$ lies on $T$ then $\tau_0 \not\subset \beta'$. If we are given $\tau_1 \supseteq \tau_0$ which is not an initial
segment of $\beta$ then using an oracle for $\alpha$ it follows by (ii) that we can find $n$ such that there are no extensions of $\tau_1$ in
$T_n$.
For all $\sigma$, define $f(\sigma) = |\{n : \sigma(n) = 0\}|$. If $\alpha$ is a random real then, by Theorem 2.2:
$$(\dagger)\qquad \lim_{n} \frac{f(\alpha \upharpoonright n)}{n} = \frac{1}{2}.$$

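The construction.
Let $\sigma_0$ be the empty string. Given $\sigma_i$ let $\sigma_i' = \sigma^{\star}(\sigma_i, i)$ and then define $\sigma_{i+1}$ to be $\sigma_i'$ concatenated with $2|\sigma_i'|$ zeros.
Define $\alpha = \bigcup_i \sigma_i$.
The verification.

Since $\alpha \supset \sigma_{i+1}$ it follows by Lemma 2.3 that if $\Psi_i^{\beta} = \alpha$ then $\beta \leq_T \alpha$. We have that $\alpha$ is not random since it
clearly does not satisfy $(\dagger)$.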
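The reason the constructed real fails the frequency condition can be seen numerically. The following Python sketch is only an illustration under a loudly stated assumption: the step that finds the least suitable extension is replaced by a hypothetical stand-in that appends one arbitrary bit, since only the zero padding matters for the frequency count.

```python
import random
from typing import List, Tuple


def padded_construction(stages: int, seed: int = 0) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Build a finite prefix by alternating an extension step with zero padding.

    Assumption: the extension step is a stand-in (one arbitrary bit), not the
    least-string search used in the paper. Returns the prefix and, for each
    stage, the pair (prefix length, number of zeros so far).
    """
    rng = random.Random(seed)
    sigma: List[int] = []
    checkpoints: List[Tuple[int, int]] = []
    for _ in range(stages):
        sigma_prime = sigma + [rng.randint(0, 1)]          # stand-in extension step
        sigma = sigma_prime + [0] * (2 * len(sigma_prime))  # pad with 2|sigma'| zeros
        checkpoints.append((len(sigma), sigma.count(0)))
    return sigma, checkpoints


prefix, checkpoints = padded_construction(5)
zero_fractions = [zeros / length for length, zeros in checkpoints]
```

At the end of every stage at least two thirds of the bits are zeros, whatever the extension step did, so the proportion of zeros cannot converge to one half.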


3. Maximality
Having proved that quasi-maximality does not characterize the random reals it is natural to ask whether maximality
might provide such a characterization. With the following theorem, however, we are able to answer this question in
the negative.
Theorem 3.1. No real is of maximal $\ell$-degree.
Proof. Let the $\ell$-functionals $\Phi_0$ and $\Phi_1$ be defined inductively as follows. Suppose $d \in \{0, 1\}$.
(i) For both strings $\sigma$ of length 1 we define $\Phi_d^{\sigma} = d$.
(ii) If $|\sigma|$ is of the form $2^n$ for some $n \geq 1$ then let $\sigma_0$ be the initial segment of $\sigma$ of length $2^n - 1$. There exists a unique
$\sigma_1 \neq \sigma_0$ of length $2^n - 1$ such that $\Phi_d^{\sigma_1} = \Phi_d^{\sigma_0}$. If $\sigma_0$ is the leftmost of $\sigma_0, \sigma_1$ then define $\Phi_d^{\sigma} = \Phi_d^{\sigma_0} 0$ and otherwise
define $\Phi_d^{\sigma} = \Phi_d^{\sigma_0} 1$.
(iii) If $|\sigma|$ is not of the form $2^n$ for any $n \geq 0$ then let $\sigma_0$ be the initial segment of $\sigma$ of length $|\sigma| - 1$. Let $c = \sigma(|\sigma| - 1)$
and define $\Phi_d^{\sigma} = \Phi_d^{\sigma_0} c$.
It is important to have an intuitive picture of the above inductive definition. Consider the range of $\Phi_0$. We begin by
branching the empty sequence with two 0s. From then on, at levels $2^n$ (for any $n$) we extend with either two 1s or two
0s according to whether there is another node of the identity tree which is on the left and which is $\Phi_0$-mapped to the
same string as the node we are on or not. At all other levels we extend the strings as we would the identity tree, that
is, a 0 on the left branch and a 1 on the right branch. It can easily be seen that $\Phi_0$ has the following properties.


[2] G. Barmpalias, A.E.M. Lewis, A c.e. real that cannot be sw-computed by any Ω-number, Notre Dame Journal of Formal Logic 47 (2) (2006)
197–209.
[3] G. Barmpalias, A.E.M. Lewis, Random reals and Lipschitz continuity, Mathematical Structures in Computer Science 16 (5) (2006).
[4] G. Chaitin, Algorithmic Information Theory, Cambridge University Press, 2004.
[5] R. Downey, D. Hirschfeldt, G. LaForte, Randomness and reducibility, Journal of Computer and System Sciences 68 (2004) 96–114.
[6] R. Downey, D. Hirschfeldt, G. LaForte, Randomness and reducibility, in: Mathematical Foundations of Computer Science, in: Lecture Notes
in Comput. Sci., vol. 2136, Springer, Berlin, 2001, pp. 316–327.
[7] R. Downey, D. Hirschfeldt, Algorithmic Randomness and Complexity, Monograph (in preparation).
[8] A. Kučera, Measure, $\Pi^0_1$ Classes and Complete Extensions of PA, in: Springer Lecture Notes in Mathematics, vol. 1141, Springer-Verlag,
1985, pp. 245–259.
[9] C. Schnorr, A unified approach to the definition of a random sequence, Mathematical Systems Theory 5 (1971) 246–258.
[10] R. Soare, Computability theory and differential geometry, Bulletin of Symbolic Logic (in press).
[11] L. Yu, D. Ding, There is no SW-complete c.e. real, Journal of Symbolic Logic 69 (4) (2004) 1163–1170.

Further reading
[1] G. Barmpalias, A.E.M. Lewis, The ibT degrees of computably enumerable sets are not dense, Annals of Pure and Applied Logic (in press).
[2] C.S. Calude, A characterisation of c.e. random reals, Theoretical Computer Science 271 (1–2) (2002) 3–14.
[3] R. Downey, D. Hirschfeldt, A. Nies, Randomness, computability, and density, SIAM Journal of Computation 31 (4) (2002) 1169–1183.
[4] R. Downey, Some recent progress in algorithmic randomness, 2004 (preprint).
[5] P. Odifreddi, Classical Recursion Theory, North-Holland, Amsterdam, Oxford, 1989.
[6] R. Soare, Recursively Enumerable Sets and Degrees, Springer-Verlag, Berlin, London, 1987.

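(1) For every $\sigma$, $\Phi_0^{\sigma}\downarrow$ and is a string of the same length.
(2) For every string $\tau$ which begins with 0 there exist exactly two incompatible $\sigma_0, \sigma_1$ such that $\Phi_0^{\sigma_0} = \Phi_0^{\sigma_1} = \tau$.
(3) If $|\tau| = 2^k + c < 2^{k+1}$ consider the two $\sigma_i$ such that $\Phi_0^{\sigma_0} = \Phi_0^{\sigma_1} = \tau$. Then $\sigma_0, \sigma_1$ differ at their $c$-th bit from the
end, i.e. their $|\tau| - c - 1$ bit. In particular, if $\tau$ is of length $2^k$ they differ on their last bit.
(4) For every real $\alpha$ which begins with 0 there is a unique $\beta$ such that $\Phi_0^{\beta} = \alpha$.
So now suppose given a real $\alpha$ and without loss of generality that $\alpha(0) = 0$. Then there exists a unique $\beta$ such
that $\Phi_0^{\beta} = \alpha$. If $\beta$ is of $\ell$-degree strictly above $\alpha$ then we are done. So suppose instead that we are given $i$ such
that $\Psi_i^{\alpha} = \beta$. We shall define $\Phi_2$ for which there exists a total tree of reals $\beta'$ such that $\Phi_2^{\beta'} = \alpha$. This suffices to
give the result, since then we can pick $\beta'$ on this tree which is not Turing below $\alpha$. Pick $n_0$ large enough such that
$2^{n_0} - c_i > 2^{n_0 - 1} + 1$.
(i) For all $\sigma$ which are of length $< 2^{n_0}$, $\Phi_2^{\sigma} = \Phi_0^{\sigma}$.
(ii) If $|\sigma| > 2^{n_0}$, but $|\sigma|$ is not of the form $2^n$ for any $n \geq 0$ then let $\sigma_0$ be the initial segment of $\sigma$ of length $|\sigma| - 1$. Let
$c = \sigma(|\sigma| - 1)$ and define $\Phi_2^{\sigma} = \Phi_2^{\sigma_0} c$.
(iii) If $|\sigma|$ is of the form $2^n$ for some $n \geq n_0$, then let $\sigma_0$ be the initial segment of $\sigma$ of length $2^n - 1$. Let $\tau = \Phi_2^{\sigma_0}$,
$c = \Psi_i^{\tau}(2^{n-1} - 1)$ and define $\Phi_2^{\sigma} = \Phi_2^{\sigma_0} c$.
Now if $n \geq n_0 - 1$, then for every string $\sigma$ of length $2^n$ such that $\Phi_2^{\sigma}$ is compatible with $\alpha$, there exist two strings
$\sigma' \supset \sigma$ of length $2^{n+1}$ such that $\Phi_2^{\sigma'}$ is compatible with $\alpha$, the point being that $\beta(2^n - 1) = \alpha(2^{n+1} - 1)$.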
Acknowledgements
Both authors were partially supported by the NSFC Grand International Joint Project, No 60310213, New
Directions in the Theory and Applications of Models of Computation.
References
[1] G. Barmpalias, Computably enumerable sets in the Solovay and the strong weak truth table degrees, in: Proceedings of the CiE 2005
Conference in Amsterdam, in: Lecture Notes in Computer Science.


On gaps under GCH type assumptions M. Gitik


Annals of Pure and Applied Logic 119 (2003) 1–18


www.elsevier.com/locate/apal

On gaps under GCH type assumptions


Moti Gitik
School of Mathematical Sciences, Tel Aviv University, 69978 Tel Aviv, Israel

Received 12 October 2000; accepted 20 November 2001


Communicated by T. Jech


Abstract
We prove equiconsistency results concerning gaps between a singular strong limit cardinal $\kappa$
of cofinality $\aleph_0$ and its power under assumptions that $2^{\kappa} = \kappa^{+\delta+1}$ for $\delta < \kappa$ and some weak form
of the Singular Cardinal Hypothesis below $\kappa$. Together with the previous results this basically
completes the study of consistency strength of the various gaps between such $\kappa$ and its power
under GCH type assumptions below.
© 2002 Elsevier Science B.V. All rights reserved.
MSC: Primary 03E35; 03E55
Keywords: Pcf-theory; Extenders; Forcing

0. Introduction
Our first result deals with cardinal gaps.
We continue [8] and show the following:
Theorem 1. Suppose that $\kappa$ is a strong limit cardinal of cofinality $\aleph_0$; $\delta < \kappa$ is a
cardinal of uncountable cofinality. If $2^{\kappa} \geq \kappa^{+\delta}$ and the Singular Cardinal Hypothesis
holds below $\kappa$ at least for cardinals of cofinality cf $\delta$, then in the core model either
(i) $o(\kappa) \geq \kappa^{+\delta+1} + 1$ or
(ii) $\{\alpha < \kappa \mid o(\alpha) \geq \alpha^{+\delta+1} + 1\}$ is unbounded in $\kappa$.
Together with [2,7] this provides the equiconsistency result for cardinal gaps of
uncountable cofinality. Surprisingly the proof uses very little of the indiscernibles theory
for extenders developed in [8]. Instead, basic results of the Shelah pcf-theory play the
crucial role.
E-mail address: gitik@math.tau.ac.il (M. Gitik).
0168-0072/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0168-0072(02)00031-3


M. Gitik / Annals of Pure and Applied Logic 119 (2003) 1–18


Building on the analysis of indiscernibles for uncountable cofinality of [8] and
pcf-theory we show the following:

Results of Section 2 are built on short extender-based Prikry forcings, mainly those
of [2].

Theorem 2. If for a set $a$ of regular cardinals above $2^{|a|^{+}+\aleph_2}$, $|\mathrm{pcf}\, a| > |a| + \aleph_1$, then
there is an inner model with a strong cardinal.

Using this result, we extend Theorem 1 to ordinal gaps.


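Theorem 3. Suppose that $\kappa$ is a strong limit cardinal of cofinality $\aleph_0$; $\delta < \kappa$ is a
cardinal above $\aleph_1$ of uncountable cofinality and $\ell < \omega$. If $2^{\kappa} \geq \kappa^{+\delta^{\ell}}$ and the Singular
Cardinal Hypothesis holds below $\kappa$ at least for cardinals of cofinality cf $\delta$, then in
the core model either
(i) $o(\kappa) \geq \kappa^{+\delta^{\ell}+1} + 1$ or
(ii) $\{\alpha < \kappa \mid o(\alpha) \geq \alpha^{+\delta^{\ell}+1} + 1\}$ is unbounded in $\kappa$.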


If the pcf-structure between $\kappa$ and $2^{\kappa}$ is not "wild" (thus, for example, if there is no
measurable of the core model between $\kappa$ and $2^{\kappa}$), then the result holds also for $\delta = \aleph_1$.
These theorems and related results are proved in Section 1 of the paper. Actually
more general results (Theorems 1.20 and 1.21) are proved for ordinal gaps but the
formulations require technical notions Kinds and Kinds and we will not reproduce
them here. In Section 2 we sketch some complementary forcing constructions based
on [2]. Thus, we are able to deal with cardinal gaps of cofinality $\aleph_0$ and show the
following which together with Theorem 1 provides the equiconsistency for the cases of
cofinality $\aleph_0$.
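Theorem 4. Suppose that in the core model $\kappa$ is a singular cardinal of cofinality $\aleph_0$; $\delta < \kappa$ is a cardinal of cofinality $\aleph_0$ as well and for every
$\tau < \delta$ the set $\{\alpha < \kappa \mid o(\alpha) \geq \alpha^{+\tau}\}$ is unbounded in $\kappa$. Then for every $\alpha < \delta^{+}$ there is a cofinalities
preserving, not adding new bounded subsets to $\kappa$ extension satisfying $2^{\kappa} \geq \kappa^{+\alpha}$.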

1. On the strength of gaps



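Let $SSH^{=\delta}_{<\kappa}$ ($SSH^{\leq\delta}_{<\kappa}$) denote the Shelah Strong Hypothesis below $\kappa$ for cofinality $\delta$
($\leq \delta$) which means that for every singular cardinal $\tau < \kappa$ of cofinality $\delta$ ($\leq \delta$) $\mathrm{pp}(\tau) = \tau^{+}$.
We assume that there is no inner model with a strong cardinal. First we will prove
the following: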

Theorem 1.1. Suppose that $\kappa$ is a singular strong limit cardinal of cofinality $\aleph_0$; $\delta < \kappa$
a cardinal of uncountable cofinality, $2^{\kappa} \geq \kappa^{+\delta}$ and $SSH^{=\mathrm{cf}\,\delta}_{<\kappa}$. Then in the core model
either
(i) $o(\kappa) \geq \kappa^{+\delta+1} + 1$ or
(ii) $\{\alpha < \kappa \mid o(\alpha) \geq \alpha^{+\delta+1} + 1\}$ is unbounded in $\kappa$.
Remark 1.2. (1) In either case we have in the core model a cardinal $\alpha$ carrying an
extender of the length $\alpha^{+\delta+1}$.
(2) By [7] or [2] it is possible to force, using (i) or (ii), the situation assumed in
the theorem. So this provides an equiconsistency result.
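Proof. If $\delta$ is a regular cardinal then let $A$ be the set of cardinals $\kappa^{+\tau+1}$ so that $\tau < \delta$ and
either $o(\alpha) < \kappa^{+\tau}$ for every $\alpha < \kappa^{+\tau}$ or else $\kappa^{+\tau}$ is above every measurable of the core
model smaller than $\kappa^{+\delta}$. The set $A$ is unbounded in $\kappa^{+\delta}$ since there are no overlapping
extenders in the core model. If cf $\delta < \delta$ then we fix $\langle \delta_i \mid i < \mathrm{cf}\,\delta \rangle$ an increasing sequence
of regular cardinals with limit $\delta$. For every $i < \mathrm{cf}\,\delta$ define $A_i$ to be the set of cardinals
$\kappa^{+\tau+1}$ so that $\tau < \delta_i$ and either $o(\alpha) < \kappa^{+\tau}$ for every $\alpha < \kappa^{+\tau}$ or else $\kappa^{+\tau}$ is above every
measurable of the core model smaller than $\kappa^{+\delta_i}$. Again, each of the $A_i$'s will be unbounded
in $\kappa^{+\delta_i}$ since there are no overlapping extenders in the core model.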

The Rado–Milner paradox is used to show the following:


The following fact was proved in [8, 3.24]:
Theorem 5. Suppose that in the core model $\kappa$ is a singular cardinal of cofinality $\aleph_0$; $\delta < \kappa$ is a cardinal of uncountable cofinality and for every $n < \omega$ the set
$\{\alpha < \kappa \mid o(\alpha) \geq \alpha^{+\delta^{n}}\}$ is unbounded in $\kappa$. Then for every $\alpha < \delta^{+}$ there is a cofinalities
preserving, not adding new bounded subsets to $\kappa$ extension satisfying $2^{\kappa} \geq \kappa^{+\alpha}$.
A more general result (2.6) of the same flavor is obtained for ordinal gaps.
In Section 3, we summarize the situation and discuss related open questions and
some further directions.
A knowledge of the basic pcf-theory results is needed for Section 1. We refer to the
Burke–Magidor [1] survey paper or to Shelah's book [13] on these matters. Results
on ordinal gaps and the strength of $|\mathrm{pcf}\, a| > |a|$ require in addition familiarity with
basics of indiscernible structure for extenders. See Gitik–Mitchell [8] on this subject.


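Claim 1.3. If $B \subseteq A$ in case cf $\delta = \delta$ or $B \subseteq A_i$ for some $i < \mathrm{cf}\,\delta$, in case cf $\delta < \delta$ then
$|B| < \inf B$ implies $\max(\mathrm{pcf}(B)) = (\sup B)^{+}$.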

Now for every ++1 A or ++1 icf  Ai (if cf ) we pick a set {cn | n!}
++1
pcf {cn | n!}. Set
of regular cardinals below  so that 
a=


Ai otherwise :
cn | n !; ++1 A if cf  =  or ++1

icf

Removing its bounded part, if necessary, we can assume that min a|a|+ .


n; m =Dm )+ , where

Claim 1.4. For every $b \subseteq a$, $|A \cap \mathrm{pcf}(b)| \leq |b|$ or $|A_i \cap \mathrm{pcf}(b)| \leq |b|$, for every $i < \mathrm{cf}\,\delta$,
if cf $\delta < \delta$.

each n; m is a regular cardinal and for every m! cf (


$D_m$ is an ultrafilter on $\omega$ including all cofinite sets.

Proof. It follows from Shelah's Localization Theorem [13] and Claim 1.3.

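Remark 1.7. The theorem implies results of the following type proved in [8]: if $2^{\kappa} = \kappa^{+m}$
($2 \leq m < \omega$) and GCH below $\kappa$, then $o(\kappa) \geq \kappa^{+m} + 1$, provided that for some $k < \omega$
the set of $\alpha < \kappa$ such that $o(\alpha) \geq \alpha^{+k}$ is bounded in $\kappa$.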

In particular, |a| = .
Let b+ [a] be the pcf-generator
corresponding to + . Consider a = a\b+ [a]. For

every 0, if ++1 A or icf  Ai then ++1 pcf (a ). Hence, |(pcf a ) A| = 
or |pcf (a ) Ai | = i for each icf  and by Claim 1.4, then |a | = .
Claim 1.5. Let 
n | n! be an increasing unbounded in  sequence of limit points
of a of cofinality cf . Then for every ultrafilter D on $\omega$ including all cofinite
sets

+
=D
+ :
cf
n
n!

Proof. For every n!;


n is a singular
cardinal of cofinality cf . So, by the

+
assumption pp(
n ) =
+
t=E), for every unbounded in
n set of regn . Then
n = cf(
ular cardinals with |t|
n and an ultrafilter E on it including all cobounded subsets of

t. In particular,
+
n pcf (a
n ) since
n is a limit point of a .

So {
+
|
n!}

pcf
a
.
By
[13],
then
pcf
{

|
n!}

pcf
(pcf a ) = pcf
n
n
a . But
by the choice of a ; + = pcf a . Hence for every ultrafilter D on $\omega$, cf ( n!
+
n=
D) = 4 .
Now, |a | = ; a = ; cf 0 and cf  = 0 . Hence, there is an increasing unbounded in  sequence 
n | n! of limit points of a so that for every n0 | a
(
n1 ;
n )| =  and |(a
n )\| =  for every 
n . By Claim 1.5, 
+
n | n! are
limits of indiscernibles. We refer to [8] for basic facts on this matter used here. There
+
is a principal indiscernible n 6
n for all but finitely many n's. By the Mitchell Weak
+
Covering Lemma,
+
n in the sense of the core model is the real
n , since
n is singular.
This implies that n 6
n , since a principal indiscernible cannot be successor cardinal
of the core model. Also, n cannot be
n , since again
+
n computed in the core model
correctly and so there is no indiscernibles between measurable now
n and its successor

+
n . Hence n
n . By the choice of
n , the interval (n ;
n ) contains at least  regular
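cardinals. So n is a principal indiscernible of an extender including at least $\delta + 1$ regular
cardinals which either sits over $\kappa$ or below $\kappa$. This implies that either $o(\kappa) \geq \kappa^{+\delta+1} + 1$
or $\{\alpha < \kappa \mid o(\alpha) \geq \alpha^{+\delta+1} + 1\}$ is unbounded in $\kappa$.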
Using the same ideas, let us show the following somewhat more technical result:



Theorem 1.6. Let  = n! n be a strong limit cardinal with 0 1 n .
1
Assume 2 ++ and SSH
(Shelah Strong Hypothesis below $\kappa$ for cofinality $\aleph_1$,
i.e. pp
=
+ for every singular
of cofinality $\aleph_1$). Then there are at most countably many principal indiscernibles n; m | m; n! with indiscernibles n; m | m; n!
so that for each n; m! n 6n; m 6n; m ; n; m is the principal indiscernible of n; m ,


n!

Proof. Suppose otherwise.


Collapsing if necessary 2 to ++ , we can assume that 2 = ++ . Let n; i | n!;
i!1  and n; i | n!; i!1  witness the failure of the theorem. We can assume that
for every n! and ij!1
n;i 6 n;i n;j 6 nj :
a\b + [a]. Then for every i!1 the set
Let a = n; i | n!; i!1 . Consider a =
ci = a {n; i | n!} is infinite, since cf ( n! n; i =Di ) = ++ for some Di .
The following is obvious.
Claim 1.8. There is an infinite set d ! such that for every n d there are uncountably many i's with n; i ci .
For every n d let

n = sup{n;i | n;i Ci }:
Then each such
n is a singular cardinal of uncountable cofinality. Also,

+
+

+
n pcf a for every n d, since pp
n =
n . But then pcf {
n | n d} pcf a . Hence
+
+
|
n

d}.
Now,
this
implies
as
in
the
proof
of
1.1
that

s
are
indiscernibles
 = pcf {
+
n
n
and there are principal indiscernibles for
+
n s below
n . Here this is impossible since
then there should be overlapping extenders. Contradiction.
We will use Theorem 1.6 further in order to deal with ordinal gaps.
As above, we show the following assuming that there is no inner model with a
strong cardinal.
Proposition 1.9. Suppose that 
 |  is an increasing sequence of regular cardinals.  is a regular cardinal 1 and
0 2 . Then there is an unbounded S  such
that for every  of uncountable cofinality which is a limit of points of S the following
holds:
( ) for every ultrafilter D on  S including all cobounded subsets of  S



bd

+1 ;
tcf

 =D = tcf

 =JS
S

S

bd
denotes the ideal of bounded subsets of  S.
where JS

Proof. Here we apply the analysis of indiscernibles of [8] for uncountable cofinality.
Let  | 6 be the increasing enumeration of the closure of 
 | . Let A 


be the set of indexes of all principal indiscernibles for  among  s (). Then A
is a closed subset of . Now split into two cases.
Case 1: A is bounded in .
Let  = sup A. We have a club C  so that for every  C;  ( ; ) if  is a
principal indiscernible, then it is a principal indiscernible for an ordinal below  . Now
let  be a limit point
co1nality. Then by results of [8]. pp  = 
of C of uncountable

and moreover tcf (   =Jbd


)
=

.
So
we
are done.


Case 2: A is unbounded in .

Let A be the set of limit points of A. For every  A we consider +1 . Let +1
be

the principal indiscernible of +1 . Then  6+1


6+1 .
The following is the main case:

Subcase 2.1: For every  in an unbounded set S ; +1


is a principal indiscernible
for  and +1 is an indiscernible belonging to some O+1 over  of cofinality !1
in the core model.
We consider the set B = {O+1 |  S}. If |B|, then we can shrink S to set S  of
the same cardinality such that for every ;  S  O+1 = O+1 . Now projecting down to
limit points of S  of uncountable cofinality we will obtain ( ) of the conclusion of the
theorem. So, suppose now that |B| = . W.l.o.g., we can assume that  implies
O+1 O+1 . Now, by [8], B (or at least its initial segments) is contained in the length
of an extender over  in the core model. There are no overlapping extenders, hence

O+1 =Jbd = (sup({O+1 |  S}))+ ;


tcf

ultrapower and i +1 the image of i which is the critical point of the embedding.
Fix for every i a sequence ci unbounded in i , in the core model and of cardinality cf i there. Take a precovering set including {ci | i}. By [8], assignment
functions can change for this new precovering set only on a bounded subset of i s.
Pick i such that i is above supremum of this set. Again, consider the ultrapower
used to move from i to i +1 . Now we have ci in this ultrapower and its cardinality
is i . Let j : M M  be the embedding. ci M  and M  is an ultrapower by extender. Hence for some
and fci = j(f)(
). Let U
= {X i |
j(X )} and j : M M
i ) by i +1 ; j(
i ) = ci and j(f)([id])

= ci .
be the corresponding ultrapower. Denote j(
Let ci = j(f) )([id]) | )) = cf i = cf i  be increasing enumeration (everything in
the core model). Then for most s (mod U
) f() = f) () | ))  will be a sequence in M co1nal in i of order type ). Which contradicts the assumption that
cf i i .

Let us use Proposition 1.9 in order to deduce the following:


Theorem 1.10. Suppose that there is no inner model with strong cardinal then for
+
every set a of regular cardinals above 2|a| +2 |pcf a|6|a| + 1 .
Remark. If a is an interval then |pcf a| = |a| by [8, 3.24].

where the successor is in sense of the core model or the universe which is the same
by the Mitchell Weak Covering Lemma. Also, for every  which is a limit point of S
of uncountable cofinality

bd
tcf
O+1 =JS = (sup{O+1 |  S })+ :

Proof. Suppose that for some a as in the statement of the theorem |pcf a||a|+1 . Let
 = |a|+ + 2 . Then |pcf a|. Pick an increasing sequence 
 |  inside pcf (a).
By 1.9 we can find an unbounded subset S of  satisfying the conclusion ( ) of 1.9.
including all cobounded subsets of S. Let
= cf
Let D be an ultrafilter on  
( 
 =D). Then, clearly,
( 
 )+ . By the Localization Theorem [13], then
there is a0 {
 |  S}; |a0 |6|a| with

pcf a0 . Consider S\ sup a0 . S\ sup a0 D
since a0 is bounded in S. Hence cf ( S\ sup a0
 =D) =
. Again by the Localization
Theorem, there is a1 S\ sup a0 ; |a1 |6|a| and
pcf a1 . Continue by induction and
define a sequence a | !1  such that for every !1 the following holds:

Projecting down we obtain ( ).

is not a principal indiscernible for


Subcase 2.2: Starting with some   each +1
 or it is but +1 corresponds over  to some O+1 which has cofinality  in the
core model.

is not a principal indiscernible for  , then


Suppose for simplicity that  =0. If +1
we can use functions of the core model to transfer the structure of indiscernibles over

+1 to the interval [ , length of the extender used over  ]. This will replace +1 be a

is a principal
member of the interval. So let us concentrate on the situation when +1
indiscernible for  but O+1 has cofinality 6 ().
Let us argue that this situation is impossible. Thus we have increasing sequences
i | i6; i | i and i | i such that for every i
$\rho_i$ is between i and the length of the extender used over i ; cf i i in the
core model, i is the image of i over i +1 and cf i i +1 in the core model.
Then cf i i again in the core model since i is the image of i in the

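(a) $a_\alpha \subseteq S$
(b) $|a_\alpha| \leq |a|$
(c) $\mu \in \mathrm{pcf}\, a_\alpha$
(d) $\min a_\alpha > \sup a_\beta$ for every $\beta < \alpha$.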

Let  = !1 sup a . Then  is a limit of points of S and cf  = 1 . Hence ( ) of


bd
1.9 applies. Thus tcf ( S
 =JS
) exists below
+1 and is equal to tcf ( S

 =F)
F on  S including all cobounded subsets of  S. Denote
for every ultrafilter
bd
) by +. Let c = pcf (a) and b) [c] | ) pcf (a) = c be a generating
tcf ( S
 =JS
sequence. Clearly both + and
are in c and +
. Consider b = b
[c]\b+ [c].
For every !1 ; b a = , since
pcf (a ). Hence, b  S is unbounded in 
on  S including b  S.
(by (d) of the choice of a s). Let F be an ultrafilter

And all cobounded subsets of  S. Then tcf ( S
 =F) = + but this means that
+ pcf b, which is impossible by the choice of b, see for example [1, 1.2].

S

S


The proof of Theorem 1.10 easily gives a result related to the strength of the negation
of the Shelah Weak Hypothesis (SWH). (SWH says that for every cardinal , the
number of singular cardinals , with pp , is at most countable).

k1
Proof. We prove the statement by induction on ). Fix + . Let ) = 00 k1
,
where 0 = . Set for each .k1

k2 k1 1
n1 1
0

n2 k1
.+00 k2
k1
+1

(.) = ++0
Theorem 1.10.1. Suppose that there is no inner model with strong cardinal. Then for
every cardinal ,22

if (k1) or (k = 1 and 0 1) and


(.) = ++.+1

|{ , | cf   and pp  ,}| 6 1 :


Now we continue the task started in Theorem 1.1 and deal with ordinal gaps.
Let us start with technical definitions.
Kinds = {00

11

k1
k1

Definition 1.11. Let


| k!; 1 6 0 ; : : : ; k1 !; 0 1
k1 are cardinals of uncountable cofinality} {0}, where the operations used are the
ordinals operations.
Remark 1.12. The only kinds around !1 are !1 itself, !12 ; : : : ; !1n (n!). But
already with !2 we can generate in addition to !2 ; !22 ; : : : ; !2n (n!) also !2
!15 !219 !13 etc. Note that between !1! and !2 there are no new kinds. Using the
Rado–Milner paradox we will show in the next section that the consistency strength
of the length of the gap does not change in such an interval.

k2
k1
|  is an ordinal of kind 00 k2
k1
(.) pcf ({
++1
n; i

; ii(n); n!}).

Let E be the set consisting of all regular cardinals of blocks Bn; i (n!; ii(n))
+
together with all regular cardinals between  and min(+ ; 2 ). Set E = pcf E. Then
|pcf E |, since  is strong limit. We can assume also that min E |pcf E |. By
[13] then pcf E = E and there is a set b/ [E ] | / E  of pcf E generators which is
smooth and closed, i.e.
b/ [E ] implies b
[E ] b/ [E ] and pcf (b/ [E ]) = b/ [E ].
Assumption (2) of the lemma implies that for every unbounded in ++) set B
consisting of regular cardinals above  and below ++) max pcf (B) = ++)+1 . In
particular max pcf {(.) | .k1 }) = ++)+1 . Denote ++)+1 by +. Let
A = b+ [E ] {(.) | . k1 }:

Definition 1.13. Let - be an ordinal

Then, |A | = 
k1 and for every , A b, [E ] b+ [E ]. For every , A 1x a sequence
+
,n | n! n! n+1
inside b, [E ] such that

(a) - is of kind 0 if - is a limit ordinal.


(b) - is of kind 0 for a cardinal 0 Kinds if - is a limit of an increasing sequence
of length 0 . In particular, if 0 is regular this means that cf - = 0 .
k1
Kinds, with 0 1, if - is a limit of an increasing
(c) - is of kind 00 11 k1

(a) ,n Bn; i for some ii(n)


and, if ) =  then also
k2
k1 1
k1
.
(b) ,n is of kind 00 k2

k1
sequence of k1 ordinals of kind 00 11 k1

Lemma 1.14. Let  be a strong limit cardinal of cofinality 0 ;  a cardinal of
uncountable cofinality. Assume
6
SSH

(1)
+
(2) there are no measurable cardinals in the core model between  and + .


if k = 1 and 0 = 1, i.e. ) = .
For every .k1 , if ) =  then by induction

It is possible to find ,n s of the right kind using the inductive assumption, as was
observed above.
Claim 1.16. There are infinitely many $n < \omega$ such that
|{,n | , a }| = k1 :

Let 0) Kinds [; + ) and 2 ++) , for some + . Then ++)+1
pcf {
++1
n; i
|  is an ordinal of kind ); ii(n); n!}, where
ni denotes the principal indiscernible of the block Bn; i , as defined in 1.6.

Proof. Otherwise by removing finitely many ns or boundedly many ,n s we can assume that for every n|{,n | , A }|k1 . But cf k1 0 . Hence, the total number of
,n s is less than k1 . Now, pcf {,n | n!; , A } A . So, |A pcf {pn, | n!; ,
A }||A | = k1 . By (2) of the statement of the lemma this situation is impossible.

Remark 1.15. (a) The lemma provides a bit more information than will be needed for
deducing the strength of 2 = +)+1 .
(b) Condition (2) is not very restrictive since we are interested in small () gaps
between  and its power.

Suppose for simplicity that each $n < \omega$ satisfies the conclusion of the claim. If not,
then we can just remove all the bad ns. This will affect less than k1 of s which
in turn affects less than k1 of ,s.


Let us call a cardinal


reasonable, if for some n!
is a limit of k1 -sequence
k1
, since ,n s
of elements of {,n | , A }. Clearly, a reasonable
is of kind 00 k1

find A A of cardinality k1 (or just unbounded in +) and k ; 16k 6 such


that for every , A k(,) = k . Then A b+k [E ]. But, recall that + = max pcf (B)
for every unbounded subset B of A . In particular, + = max pcf (A ). Hence, +
pcf A pcf (b+k [E ]) = b+k [E ].
Lemma 1.14 implies the following:


k2
k1
k1
are of kind 00 11 k2

. The successor of such


is in pcf {,n | , A }
cf 

since cf
= cf k1 and we assumed SSH k1 , i.e. pp
=
+ . Also pp
=
+ implies
that the set {,n | , A }\b
+ [E ] is bounded in
.


Theorem 1.20. Let  be a strong limit cardinal of cofinality 0 ; 0) Kinds. Assume that

Claim 1.17. pcf {


+ |
is reasonable} b+ [E ].
Proof. {,n | n!} b, [E ] for every , A . Also, b, [E ] b+ [E ]. By the above, for
every reasonable
;
+ pcf {,n | , A } for some n!. But pcf (b+ [E ]) = b+ [E ] and
pcf {,n | n!; , A } pcf (b+ [E ]) since the pcf generators are closed and {,n | n
!; , A } b+ [E ]. So, {
+ |
is reasonable} b+ [E ] and again using closedness of
b+ [E ], we obtain the desired conclusion.

6|)|

(1) SSH
(2) there are no measurable cardinals in the core model between  and +)+1 .
If 2 +) , then in the core model either
(i) o()+)+1 + 1 or
(ii) { | o()+)+1 + 1} is unbounded in .

Claim 1.18. For every + pcf {


+ |
is reasonable}; b+ [E ] b+ [E ].
Proof. By the smoothness of the generators b+ [E ] b+ [E] for every + pcf {
+ |

is reasonable}.
In order to conclude the proof we shall argue that there should be + pcf {
+ |
is
reasonable} such that + b+ [E ]. This will imply b+ [E ] = b+ [E ] and hence + = + .
Let us start with the following:

Claim 1.19. |{,n | n!; , A }\ {b
+ [E ]|
is reasonable} |k1 .

Proof. Suppose otherwise. Let S = {,n | n!; , A }\ {b
+ [E ]|
is reasonable}
and |S| = k1 . Then for some n! also {,n | ,n S} has cardinality k1 , since
cf k1 0 . Fix such an n and denote {,n | ,n S} by Sn .
But now there is a reasonable
which is a limit of elements of Sn . pp
=
+
implies that the set {,n | , A }\b
[E ] is bounded in
. In particular, Sn b
+ [E ]
is unbounded. Contradiction, since Sn S which is disjoint to every b
+ [E ] with

reasonable.
{,n

Now, removing if necessary


less than  elements, we can assume that
| n!; ,

A } is contained in {b
+ [E ] |
is reasonable}. Recall that this can affect only less
than  ,s in A, which has no influence on +.


Let b = pcf {
+ |
is reasonable}. Then pcf b = b and b E . By [13], there are
+1 ; : : : ; + pcf b = b such that b b+1 [E ] b+ [E ]. Using the smoothness of
generators, we obtain that for every reasonable

there is k; 16k6 such that b


+ [E ]

b+k [E ]. Now, {,n | n!; , A } {b
+ [E ] |
is reasonable}. Hence, {,n | n!;

, A } k=1 b+k [E ].
For every , A fix an ultrafilter D, on $\omega$ including all cofinite sets so that

tcf ( n! ,n =D, ) = ,. Let , A . There are x, D, and k(,); 16k(,)6 such
that for every n r, ,n b+k(,) [E ]. Then , pcf (b+k(,) [E ]) = b+k(,) [E ]. Finally, we


Proof. By Lemma 1.14, for infinitely many ns for some ik i(n) the length of the
++1
for an ordinal 
block Bn; in will be at least
+)+1
n; in , since it should contain some
n; in
of kind ). Clearly, ) since ) is the least ordinal of kind ).
We would now like to outline a way to remove (2) of Theorem 1.20 at the cost of restricting
the possible )s. First change Definitions 1.11 and 1.13. Thus in Definition 1.11 we replace
"uncountable" by "above $\aleph_1$". Denote by Kinds* the resulting class. Then define the kind
of an ordinal as in Definition 1.13, replacing Kinds by Kinds*.
Theorem 1.21. Let  be a strong limit cardinal of cofinality 0 ; 0) Kinds* . Assume
SSH 6|)| . If 2 +) , then in the core model either
(i) o()+)+1 + 1 or
(ii) { | o()+)+1 + 1} is unbounded in .
The theorem, as in the case of 1.20, will follow from the following:
Lemma 1.22. Let  be a strong limit cardinal of cofinality 0 ;  a cardinal of
cofinality above $\aleph_1$. Assume SSH 6 . Let 0) Kinds* [; + ] and 2 ++) for
some + . Then
|  is an ordinal of kind ); i i(n); n !}
pcf {
++1
ni
[++)+1 ; ++)+)+1 ] = :
Let us first deal with a special case: ) is a cardinal. We split it into two cases: (a) )
is regular and (b) ) is singular. The result will be stronger than that of Lemma 1.22.
Lemma 1.23. Let  be a strong limit cardinal of cofinality 0 ;  is a
regular uncountable cardinal. Assume SSH 6 . Let 2 ++ for some + .


Then
+++1 pcf ({
++1
| i i(n); n ! and  is an ordinal
ni
of cofinality }):
Proof. Let + = +++1 . We choose E and b/ [E ] | / E  as in the proof of
Lemma 1.14. Measurables of a core model between  and 2 are allowed here. So in
contrast to Lemma 1.14 we cannot claim anymore for every unbounded B [; ++ )
consisting of regulars max pcf (B) = +++1 . Hence the choice of A (the set crucial for
the proof of Lemma 1.14) will be more careful.
Set A to be the set of cardinals ++
+1 [++1 ; ++ ) such that either o()++

for every ++


or else ++
is above every measurable of the core model
smaller than ++ . Clearly, |A| = , since there are no overlapping extenders and as
in Theorem 1.1 |(pcf b) A|6|b| for every set of regular cardinals b ; |b|6.
By Claim 1.3, max pcf (B) = +++1 for every unbounded B A. This implies that
A\b+ [E ] is bounded in +++1 . Define A = A b+ [E ]. The rest of the proof repeats that of Lemma 1.14.
Lemma 1.24. Let  be a strong limit cardinal of cofinality 0 ;  is a regular
cardinal of uncountable cofinality. Assume SSH 6 . Let 2 ++ for some + .
Then pcf ({
+1
ni | ii(n); n! and  is a limit of an increasing sequence of the length
}) [+++1 ; ++++1 ] = .
Proof. Let i | icf  be an increasing continuous sequence of limit cardinals unbounded in . Consider the set
B = {++i + | i cf ; i limit and  i }:
Since cf 
0 , the analysis of indiscernibles of [8, Section 3.4] can be applied to show
that {cf ( B=D) | D is an ultrafilter over B extending the filter of cobounded subsets
of B} {+++1 | 66 + }.
We cannot just stick to +++1 alone since we would like to have  cardinals below ++ .
+++1
| })
But once measurables above  are allowed, it is possible
that max pcf ({
+++1 . Still by [13], for a club C cf  tcf ( C +++1 =coboundedC)=+++1 .
Unfortunately, this provides only cf  many cardinals +++1 and not -many.
Define a filter D over B:
+i +)+1
X D iff {icf  | i is limit and {ji | {+
X } is cobounded in +
j |
j }
is cobounded in i} contains a club.

Let D be an ultrafilter extending D. Set + = cf( B=D ). By the choice of D, for
every C B of cardinality less than B\C D. So, + [+++1 ; ++++1 ]. Define E
as before. Set A = B b+ [E ].

Claim 1.25. If A D .


Proof. Otherwise the complement of A is in D . Let A = B\b+ [E ]. Clearly, D J+


[E ] = . By [1, 1.2], then there is S D S J+ [E ]\J+ [E ]. But b+ [E ] generates
J+ [E ] over J+ [E ]. So, S b+ [E ] c for some c J+ [E ]. Hence, S b+ [E ]
D . But A D and A B (S b+ [E ]) = . Contradiction.
Now we continue as in the proof of Lemma 1.14. In order to eliminate possible
effects of less than  cardinals, we use Theorem 1.10. At the final stage of the proof
a set A was defined. Here we pick it to be in D . This ensures that + pcf A and
we are done.
Now we turn to the proof of Lemma 1.22.
Proof. As in Lemma 1.14, we prove the statement by induction on ). Fix + . Let
k1
. The case k = 1 and 0 = 1 (i.e. ) = ) was proved in Lemmas 1.23
) = 00 k1
and 1.24. So assume that 1 or (k = 1 and 0 1). For each .k1 let
(.) pcf ({
++1
| i i(n); n ! and  is an ordinal of kind
n;i

k2
k1
00 k2
k1

}) [++)

.+) +1

; ++)

.+) +) +1

];

where
) =


k2
k1 1

 0 k2
k1

k1
if ) = 00 k1

and

(k 1

or (k = 1 and 0 1)

0;

if k = 1

and

0 = 1

In the last case the inductive assumption ensures the existence of such (.).
Define E and b/ [E ] | / E  as in the proof of Lemma 1.14. We do not know
now if for every unbounded in ++) set B [; ++) ) consisting of regular cardinals

max pcf (B) = ++)+1 . We may consider the set {++) +1 | k1 }. If for club

++) +1
is not a principal indiscernible then by [8] cf ( B=bounded) =
many s 
++)+1
++)

for any unbounded subset B of 
consisting of regular cardinals. Note that
cf k1 0 is crucial here. In this case we define A ={(.) | .k1 } b++)+1 [E ]
and proceed as in the proof of Lemma 1.14. The only difference will be the use
of 1.10 to eliminate a possible influence of k1 cardinals. Here the assumption
k1 1 comes into play. In the general case it is possible to have {(.) | .k1 }

bk ++)+1 [E ] empty. But once for a club of s below k1 +) +1 s are principal
indiscernibles, by [8] we can deduce that
pcf ({(.) | . k1 })\++)
[++)+1 ; ++)+)

+) +1

] [++)+1 ; ++)+)+1 ]:

Let D be an ultrafilter on the set {(.) | .k+1 } containing all cobounded subsets.
Set



+ = cf
{(.) | . k1 }=D :


Define A = b+ [E ] {(.) | .k1 }. By Claim 1.25, then A D. From now on we
continue as in Lemma 1.14, only using Theorem 1.10 in the fashion explained above and
at the final stage picking A inside D.

the measure an () of En . In the present situation we force over the principal indiscernible
n , i.e. one corresponding to the normal measure of En . The extender-based Magidor
forcing changes its cofinality to  and adds for every -n 6-6+n+2
a sequence tnn
of order type  cofinal in n . Actually, tn; +n+2
(i) = +n+2
n; i (i), where ni | i is
n
+n+2
the sequence tnn . Now, if -n
is produced by an (), then we connect  with
the sequence tn- in addition to its connection with -. Using standard arguments about
Prikry-type forcing notions, it is not hard to see that cf ( n! +n+2 n; i ; finite) = ++ for
every i as witnessed by tn- (i)s.


Remark 1.26. The use of Kinds* and not of Kinds in Theorem 1.21 (or actually in
Lemma 1.22) is due only to our inability to extend Theorem 1.10 in order to include
the case of a countable set. Still, in view of Theorem 1.1 and also Lemmas 1.23 and
1.24, the first unclear case will not be $\omega_1$ but rather $\omega_1 + \omega_1$.
2. Some related forcing constructions
In this section, we would like to show that (1) it is impossible to remove SSH assumptions
from Theorem 1.6; (2) the conclusion of Theorem 1.11 is optimal, namely, starting

n
with  = n! n ; 0 1 n and o(n ) = n+ +1 + 1 we can construct a
model satisfying 2 + for every  , where  as in Proposition 1.9 is a cardinal
of uncountable cofinality; (3) the forcing construction for s of cofinality 0 will
be given. All these results are based on the forcing of [2] and we sketch them modulo this
forcing.
+n

Theorem 2.1. Suppose that for every n!{ | o() } is unbounded in .


Then for every  there is a cardinal preserving generic extension such that it
has at least  blocks of principal indiscernibles n;  | n!;  so that


(i) n;  n;  n+1; 0 for every n!;  



n! n;  =  for every  and

++
, for every .
(iii) tcf ( n! +n+2
n;  ; finite) = 
(ii)

Proof. Without loss of generality, we can assume that  is a regular cardinal. We pick
an increasing sequence n | n! converging to  so that for every n! o(n ) =
n+n+2 +  + 1. Fix at each n a coherent sequence of extenders Ein | i6 with Ein of
the length n+n+2 .
We would like to use the forcing of [2, Section 2] with the extenders sequence En | n!
to blow up the power of  to ++ together with extender-based Magidor forcing changing
the cofinality of the principal indiscernible of En to  (for every n!) simultaneously
blowing up its power to the double plus. We refer to Segal [12] or Merimovich [10] for
generalizations of the Magidor forcing to the extender-based Magidor forcing.
The definitions of both of these forcing notions are rather lengthy and we will not
reproduce them here. Instead let us emphasize what happens with indiscernibles and
why (iii) of the conclusion of the theorem will hold.
Fix n!. A basic condition of [2, Section 2] is of the form an ; An ; fn , where an is
an order preserving function from ++ to n+n+2 of cardinality n ; An is a set of measure one for the maximal measure of rngan which is in turn a measure of the extender
En over n . The function fn is an element of the Cohen forcing over a + . Each
 doman is intended to correspond to an indiscernible which would be introduced by


Remark 2.2. Under the assumptions of the theorem, one can obtain 2 + for any
countable . But we do not know whether it is possible to reach uncountable gaps.
See also the discussion in Section 3.
Theorem 2.3. Suppose that  is a cardinal of cofinality $\omega$;  is a cardinal of
uncountable cofinality and for every $n < \omega$ the set { | o()+ n } is unbounded
in . Then for every + there is a cofinality preserving extension, not adding new bounded
subsets to , satisfying 2 + .
Remark 2.4. By the results of the previous section, this is optimal if  [ n! n ; + ),
at least if one forces over the core model.

Proof. Fix an increasing sequence 0 1 n converging to  so that
each n carries an extender En of the length n+ n . Without loss of generality,  n! n . We use
the Rado-Milner paradox (see [9, Chapter 1, ex. 20]) and find Xn ($n < \omega$) such
that  = n! Xn and otp(Xn )6n . Without loss of generality we can assume that each Xn is closed
and Xn Xn+1 ($n < \omega$). Now a forcing similar to that of [2, Section 5.1] will be
applied. Assign cardinals below  to the cardinals {++1 | 166} as follows: at
level n elements of the set {++1 |  + 1 Xn } will correspond to elements of the set
{n+n+-+1 | -n }.
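For orientation, the classical statement of the paradox used in this proof (Kunen [9, Chapter 1, ex. 20]) can be written out as follows. This is only a restatement in generic notation: the letters $\mu$ and $\alpha$ are chosen here for illustration and are not the paper's, and exponentiation is ordinal exponentiation.

```latex
% Milner-Rado (Rado-Milner) paradox, stated in generic notation.
% \mu is an infinite cardinal, \alpha an ordinal; \mu^n is ordinal exponentiation.
\[
  \mu \le \alpha < \mu^{+}
  \;\Longrightarrow\;
  \exists\, (X_n)_{n<\omega},\ X_n \subseteq \alpha :\quad
  \alpha = \bigcup_{n<\omega} X_n
  \quad\text{and}\quad
  \operatorname{otp}(X_n) \le \mu^{n} \ \text{ for every } n < \omega .
\]
```

The point exploited in the proof is that an ordinal which may be far above every $\mu^n$ is still covered by countably many pieces, each of order type at most $\mu^n$, so that level n of the forcing only has to handle the piece $X_n$.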
The next definition repeats 5.2 of [2] with obvious changes taking into account the
present assignment.
Definition 2.5. The forcing notion P() consists of all sequences A0 ; A1 ; F   | 6
so that
(1) A0 ; A1  | 6 is as in 4.14 of [2].
(2) For every 6F  consists of p = pn | n! and for every n(p); pn = an
An :fn  as in 4.14 of [2] with the following changes related only to an ;
(i) an (+ ) = n+n () where n is some order preserving function, fixed in advance,
from successor ordinals in Xn to successor ordinals of [n + 2; n ).
(ii) Only of cardinalities + for  Xn Successors can appear in dom an .
The rest of the argument repeats that of [2].
The following is a more general result that deals with all kinds (i.e. elements of
Kinds) of ordinals and not only with n s.


n1
Theorem 2.6. Let  be a cardinal of cofinality $\omega$ and 00 k1
 Kinds .

} is unbounded in
Suppose that for every n! the set { | o()
k1
+ there is cofinality preserving, not adding new
. Then for every 00 k1

+
bounded subsets to  extension satisfying 2  .
Again, this is optimal by results of the previous section, if


k1
k1
00 k1
n ; 00 k1
+


Table 1

k1 n
+00 k1



=2

o()++
or
n!{ | o()+n }
is unbounded in 

20

o()+ + 1
or
n!{ | o()+n }
is unbounded in 

at least if one forces over the core model in case  = 1 . The construction is parallel to that of Theorem 2.3, only we use the following version of the Rado-Milner
paradox:

k1
k1
n ; 00 k1
+ ) there are Xn (n !) such
For every  [ n! 00 k1

k1
0
n
that  = n! Xn and otp(Xn )0 k1  .

||{ | o()+
}
is unbounded in 

cf || = 0

n!

0

cf ||0

 is a cardinal

o()++1 + 1
or
{ | o()++1 + 1}
is unbounded in 

 = || ,
for some
1!

o()+|| +1 + 1
or

{ | o()+|| +1 + 1}
is unbounded in 

!{ | o()+|| }

Along the same lines we can deal with gaps of the size of a cardinal of countable
cofinality below . Thus the following result holds, which together with the results of the
previous section provides the equiconsistency:

||


!

is unbounded in 

Theorem 2.7. Suppose that  is a cardinal of cofinality $\omega$ and  is a cardinal
of cofinality $\omega$ as well. Assume that for every  the set { | o()+ } is
unbounded in . Then for every + there is a cofinality preserving extension, not adding
new bounded subsets to , satisfying 2 + .
The proof is similar to that of Theorem 2.3. Only notice that we can present  as
an increasing union of sets Xn ($n < \omega$) with |Xn | since + ; cf  = $\omega$ and there is
a function from  onto .

k1
k1
+
0
00 k1
!
k 60 k1 k

k1
k Kinds
for some 00 k1

0
kk kk

n!{ | o()+0

00 kk 600 kk !1

for some 00 kk Kinds

is unbounded in 
0

k +1

o()+0 k
or

+1

+00 kk+1

+ 1}
{ | o()
is unbounded in 


{ | o()+
}
is unbounded in 

3. Concluding remarks and open questions


Let us first summarize in Table 1 the situation under SSH (i.e. for every singular
+ pp + = ++ ) assuming that 2 + for some , where  as usual here is a strong
limit cardinal of cofinality 0 . For  = 1 , for 26!, in the cases dealing with
ordinals in Kinds\Kinds* we assume in addition that there is no measurable of the
core model between  and + .
The proofs are spread through the papers [2-8] and the present paper. The forcing
constructions in these papers give GCH below .
Let us finish with some open problems.
Question 1. Let a be a countable set of regular cardinals. Does $|\operatorname{pcf} a| > |a| = \aleph_0$
imply an inner model with a strong cardinal?


In view of Theorem 1.10, it is natural to understand the situation for countable a.
Recall that the consistency of $|\operatorname{pcf} a| > |a|$ is unknown and it is a major question of
cardinal arithmetic.
The next question is more technical.
Question 2. Can the assumption that there are no measurables in the core model between  and 2 be removed in Definition 1.11?
It looks like this limitation is due only to the weakness of the proof. But probably there is a connection with $|\operatorname{pcf} a| > |a|$. The simplest unclear case is
2
2 +!1 .


The situation without SSH is unclear. In view of 2.1 probably weaker assumptions
than those used in the case of SSH may work. The simplest question in this direction
is as follows.
Question 3. Is { | o()+n } unbounded in  for each $n < \omega$ sufficient for  strong
limit, cf  = 0 and 2 +!1 ?
If the answer is affirmative, then the construction will require a new forcing with
short extenders, which will be interesting by itself. We then conjecture that the same
assumption will work for arbitrary gaps as well.
For uncountable cofinalities (i.e. cf 0 ), as far as we are concerned with consistency
strength, the only unknown case is the case of cofinality 1 . We restate a
question of [8]:
Question 4. What is the exact strength of  is a strong limit, cf  = 1 and 2 , for
a regular ,+ ?
It is known that the strength lies between o() = , and o() = , + !1 , see [8].

The Manin-Mumford conjecture and the model theory of difference fields

Acknowledgements
We are grateful to Saharon Shelah for many helpful conversations and for explanations that he gave on the pcf-theory.
References
[1] M. Burke, M. Magidor, Shelah's pcf theory and its applications, Ann. Pure Appl. Logic 50 (1990) 207-254.
[2] M. Gitik, Blowing-up power of a singular cardinal-wider gaps, Ann. Pure Appl. Logic, to appear.
[3] M. Gitik, Wide gaps with short extenders, math.LO/9906185.
[4] M. Gitik, The negation of SCH from $o(\kappa) = \kappa^{++}$, Ann. Pure Appl. Logic 43 (3) (1989) 209-234.
[5] M. Gitik, The strength of the failure of SCH, Ann. Pure Appl. Logic 51 (3) (1991) 215-240.
[6] M. Gitik, There is no bound for the power of the first fixed point, submitted for publication.
[7] M. Gitik, M. Magidor, The singular cardinals problem revisited, in: H. Judah, W. Just, H. Woodin (Eds.), Set Theory of the Continuum, Springer, Berlin, 1992, pp. 243-279.
[8] M. Gitik, W. Mitchell, Indiscernible sequences for extenders and the singular cardinal hypothesis, Ann. Pure Appl. Logic 82 (1996) 273-316.
[9] K. Kunen, Set Theory: An Introduction to Independence Proofs, North-Holland, Amsterdam, 1983.
[10] C. Merimovich, Extender based Radin forcing, Trans. AMS, to appear.
[11] W. Mitchell, The core model for sequences of measures, Math. Proc. Cambridge Philos. Soc. 95 (1984) 41-58.
[12] M. Segal, Master's Thesis, The Hebrew University, 1993.
[13] S. Shelah, Cardinal arithmetic, Oxford Logic Guides 29 (1994).
[14] S. Shelah, The singular cardinal problem. Independence results, in: A. Mathias (Ed.), London Mathematical Society, Lecture Note Series, Surveys in Set Theory, vol. 87, Cambridge University Press, Cambridge, pp. 116-133.


Ehud Hrushovski
Annals of Pure and Applied Logic
112 (2001), Pages 43-115

Annals of Pure and Applied Logic 112 (2001) 43-115
www.elsevier.com/locate/apal

The Manin-Mumford conjecture and the model theory of difference fields

Ehud Hrushovski (1)

Department of Mathematics, Hebrew University, Jerusalem, Israel

Received September 1995; accepted 17 April 2001
Communicated by A.J. Wilkie

(1) The work was begun at MIT, with support from the NSF. The latter part was supported by the ISF.
E-mail address: ehud@math.huji.ac.il (E. Hrushovski).

Abstract
Using methods of geometric stability (sometimes generalized to finite S1 rank), we determine
the structure of Abelian groups definable in ACFA, the model companion of fields with an
automorphism. We also give general bounds on sets definable in ACFA. We show that these
tools can be used to study torsion points on Abelian varieties; among other results, we deduce
a fairly general case of a conjecture of Tate and Voloch on p-adic distances of torsion points
from subvarieties. © 2001 Elsevier Science B.V. All rights reserved.
MSC: 03C; 11G; 12H10
Keywords: Abelian varieties; Torsion points; Difference fields; Geometric stability;
Model theory of difference equations

1. Introduction
This paper extends and specializes the general model theory of difference equations
in characteristic 0, developed in [9]. We investigate the induced structure on definable
Abelian groups of finite dimension. We also give general bounds on the number of solutions
to a finite set of difference equations. As a corollary, we obtain a model-theoretic
proof of the Manin-Mumford conjecture. The proof yields new number-theoretic information,
particularly with respect to p-adic and algebraic uniformities of the bounds
obtained.
1.1. The Manin-Mumford conjecture
The Manin-Mumford conjecture states that if C is a curve of genus two or more,
embedded in its Jacobian J, then the set of torsion points on C is finite, see [18,
pp. 37-39; 27, p. 73]. This was proved by Raynaud. Indeed he proved (with $K^a$
denoting the algebraic closure of a field K):

Theorem 1.1.1. Let A be an Abelian variety, X a subvariety, over a number field
K. Let T(A) denote the group of torsion points of $A(K^a)$. Then there exist a finite
number of subvarieties $c_i + A_i$ of X, all translates of group subvarieties $A_i$ of A, such
that $T(A) \cap X = \bigcup_i \bigl(T(A_i) + c_i\bigr)$.
Subsequently, Hindry [11] found a version with an explicit bound on the finite number, and McQuillan generalized the theorem to semi-Abelian varieties (in a strong relative version, see Theorem 1.1.4 below). In the case of one-dimensional X (with some
further restrictions) an effective proof was independently given by Buium [4]. Buium
uses a new method, of p-jets, whose relation to ours is unclear and highly intriguing.
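A toy illustration of the semi-Abelian case mentioned above may help fix ideas; it is not taken from the paper. Take the torus $G_m^2$, whose torsion points are pairs of roots of unity, and the curve X given by x + y = 1, which is not a translate of a subgroup. The sketch below searches numerically for roots of unity $\zeta$ of bounded order such that $1 - \zeta$ is again a root of unity. The function names, the order bound and the floating-point tolerance are assumptions made for this example only.

```python
import cmath
from fractions import Fraction
from math import gcd, pi


def is_root_of_unity(w, max_order, tol=1e-9):
    """Check numerically whether w**m == 1 for some 1 <= m <= max_order."""
    if abs(abs(w) - 1.0) > tol:
        return False  # a root of unity must lie on the unit circle
    return any(abs(w**m - 1.0) < tol for m in range(1, max_order + 1))


def torsion_points_on_line(max_order):
    """Return the fractions k/n (lowest terms, n <= max_order) such that
    zeta = exp(2*pi*i*k/n) and 1 - zeta are both roots of unity of
    order at most max_order, i.e. (zeta, 1 - zeta) is a torsion point
    of the torus lying on the curve x + y = 1."""
    hits = set()
    for n in range(1, max_order + 1):
        for k in range(n):
            if gcd(k, n) != 1:
                continue  # primitive n-th roots only, to avoid duplicates
            zeta = cmath.exp(2j * pi * k / n)
            if is_root_of_unity(1 - zeta, max_order):
                hits.add(Fraction(k, n))
    return hits


if __name__ == "__main__":
    # Only the two primitive sixth roots of unity survive.
    print(sorted(torsion_points_on_line(30)))
```

The search finds only $\zeta = e^{\pm i\pi/3}$, in line with the finiteness that the theorem predicts for a curve containing no translate of a positive-dimensional subgroup. A bounded numerical search is of course no substitute for the theorem, which is a statement about torsion of all orders.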
We will give another proof of these statements. All proofs deal first with the points
of order relatively prime to a given prime p. The number theoretic proofs depend on
the following idea: if K is a number field, p a fixed prime, then there exists a constant
c = c(K, A) such that for all prime-to-p torsion points t of A, ct and t are conjugate
over K. Thus if X is defined over K, and $t \in X$ then also $ct \in X$ and this puts t into
a smaller-dimensional variety, contained in $X \cap c^{-1}X$. From our point of view, this
proof can be interpreted as intersecting X with solutions to the difference equation
$\sigma(x) = cx$. Our proof will also use a difference equation, but a higher order one, of the
form $\sum m_i \sigma^i(x) = 0$. We will obtain it from the characteristic equation of a Frobenius
acting on the torsion points of A. To show that torsion points are solutions will be
straightforward, and effective (by contrast to the deep study of the Galois representation
required in the number theoretic approaches). To understand the intersection, however,
both qualitatively and quantitatively, we will require the theory of difference equations.
Our proof does not use completeness, and applies directly to semi-Abelian varieties.
With a little additional work (adding the model theoretic concept of orthogonality, to
the local modularity used before), one obtains:
Theorem 1.1.2. Theorem 1.1.1 is valid for arbitrary commutative algebraic groups.
For example, let G be a nonsplit group extension of an elliptic curve E by the
additive group $G_a$, and let X be any curve on the surface G. Then $X \cap T(G)$ is finite.
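The higher-order difference equation described before Theorem 1.1.2 can be sketched schematically as follows. This is an illustration under stated assumptions, not the paper's own formulation: A is taken to be an Abelian variety of dimension g with good reduction at p and residue field GF(q), and the names F and g are introduced here; the precise construction is the one carried out later in the paper.

```latex
% Sketch under the assumptions named in the lead-in; F and g are illustrative names.
% F is the characteristic polynomial of the q-Frobenius on the reduction of A.
\[
  F(T) \;=\; \sum_{i=0}^{2g} m_i\, T^{i} \in \mathbb{Z}[T],
  \qquad |\alpha| = q^{1/2} \ \text{ for every complex root } \alpha \text{ of } F .
\]
% Let \sigma be an automorphism of K^a lifting the Frobenius. Reduction mod p is
% injective on prime-to-p torsion, and F(Frobenius) = 0 on the reduction, hence:
\[
  \sum_{i=0}^{2g} m_i\, \sigma^{i}(x) \;=\; 0
  \qquad \text{for every prime-to-}p\text{ torsion point } x \text{ of } A(K^a).
\]
```

In contrast with the first-order equation $\sigma(x) = cx$, nothing about the image of Galois is needed here: the coefficients $m_i$ depend only on the reduction of A at p, which is the source of the uniformity in the bounds.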
Effective bounds and uniformity: Fix a projective embedding of A, and hence of any
subvariety X.
Theorem 1.1.3. Let A be a commutative algebraic group, defined over some number
field K. Let X be a subvariety, defined over $K^a$. Then there exist a finite number M
of subvarieties $c_i + A_i$ of X, all translates of group subvarieties $A_i$ of A, such that
$$T(A) \cap X = \bigcup_i \bigl(T(A_i) + c_i\bigr).$$
We have $M \le c \deg(X)^e$, with c and e depending on A but not on X.

0168-0072/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved.
PII: S0168-0072(01)00096-3


c and e can be (and are, in Section 5) written down explicitly; they are doubly
exponential in some natural parameters associated with A.
Compare this to the bound in [11]. The bound there is given for X defined over a
fixed number field K, and is not given uniformly in K. There is a constant, depending
on K and A, whose existence follows from Serre's work on Galois representations, but
was not known to be effective [18, p. 39; 29].
I learned from the referee report that the constant was later shown to be effective, as a
corollary of an alternative transcendence proof of the Tate and Shafarevich conjectures
by Masser and Wüstholz [20], and subsequent work by Bost and David [1].² At all
events, the bounds given here are independent of this Galois theoretic data.
The bound we obtain can be written down more easily if one prime is excluded
from the torsion. Let ZCl(Y) denote the Zariski closure of Y.
Example 1.1.1. Let A be a connected, commutative algebraic group defined over a
number field K. Let p be a prime of good reduction, with residue field GF(q) of
characteristic p. Let $T_p$ be the group of points of $A(K^a)$ of finite order prime to p.
Let X be a subvariety of A. Then $\mathrm{ZCl}(X \cap T_p)$ is a union of at most
4d (2dr +1) log2 (1+q1=2 ) 22dr dim(X )

(deg(X )2dr +1 d+ r

cosets of group subvarieties of A, contained in X .


Here $d_r$ is a certain dimension below dim(A), and $d_+$ is a degree associated to the
graph of addition on A; they will be explained below. If $A = E^m$ for an elliptic curve
E, then $d_r = 1$.
Remark 1.1.2. All the quantities in Example 1.1.1 are constant in algebraic families.
Thus the bound remains uniform if A and X move in an algebraic family, so long as
A remains defined and with good reduction over $K_p$. In particular, A may be moved
in a p-adic neighborhood in moduli.
For the special case when X is a curve, or just contains no cosets of infinite group
subvarieties, the bound obtained in the general case is similar to the restricted case
(Example 1.1.1). In general, however, the bound and proof are more complicated
(though still explicit, and of doubly exponential growth in the totality of parameters).
We also give a difference algebraic/model theoretic proof of McQuillan's theorem
[21]:
Theorem 1.1.4. Let A be a semi-Abelian variety over a number field K. Let X be
a subvariety of A, and let $D = \{a \in A(K^a) : na \in A(K) \text{ for some } n\}$. There exists an
integer m and group subvarieties $A_i$ of A such that
$$(D \cap X) = \bigcup_C (D \cap C).$$
The union is taken over a finite set of subvarieties $C \subseteq X$ of the form $A_i + c$, with
$mc \in A(K)$. The integer m and the subvarieties $A_i$ can be determined effectively.

² References and quote from the very useful referee report, for which I am grateful. The referee adds:
"This might also be a good place to cite the important paper of Faltings [9], which underlay Serre's result."
Tate-Voloch conjecture. Tate and Voloch conjectured that the torsion points on an
Abelian variety A over $\mathbb{C}_p$ that do not lie on a subvariety $V \subset A$ are bounded away
from that variety. Certain special cases were proved by Tate-Voloch, and by Buium
and Silverman. The proof of the Manin-Mumford conjecture given above lends itself
immediately to a proof of the Tate-Voloch conjecture under much weaker restrictions:
A must be assumed defined over a finite extension of $\mathbb{Q}_p$; must have good reduction;
and the prime-to-p torsion points only are considered. We show this in Lemma 6.6.1.
Structure of the proof of Manin-Mumford: The proof moves from number theory
through algebra to model theory; the main work is done there. The first step is to
embed the group of torsion points into a group defined by difference equations. All
the number theory used occurs here. Beyond this point we have a difference algebra
setup, and we study it with model theoretic binoculars. Three levels of model theory
are used.
(1) Quantifier elimination (to a certain level). Consider, for instance, the problem of
showing that if two groups A and B satisfy the conclusion of Manin-Mumford, then
so does $A \times B$. This involves considering projections to B of subvarieties $X \subseteq (A \times B)$,
and their fibers. If A, B are taken to be groups of rational points, or torsion points, the
projections are notoriously undecidable (Gödel). But if we take larger groups defined
in a structure with quantifier elimination, then the projections are defined by similar
formulas. This provides a reasonable context in which to carry out a proof.
(2) A dimension theory is developed to study the de'nable sets. We use S1-rank.
This is much younger than Morley rank, and we need to develop the foundations to
some extent (Section 3). The very existence of the dimension theory suQces for some
purposes. For instance, by comparing dimensions, one sees that any de'nable group
has 'nite index in one arising directly from an algebraic group (A 'ner argument can,
in fact, completely give the structure of de'nable groups in terms of algebraic groups.)
(cf. Remark 4.0.3).
(3) Model theory also permits second-order arguments; properties of the class of all
de'nable sets are often more amenable to devissage, more functorial under interpretations, than of individual de'nable sets. A very simple example occurs in Proposition 3.4.1; see the remark following it. Another instance occurs in [5], and enters the
present paper via the key De'nition 4.1.2 (where a dichotomy is found between groups
satisfying ManinMumford, and those embeddable into the set of points of the 4xed
4eld of an algebraic group). Within [5] one uses a relation, for arbitrary structures
with a certain dimension theory, between Galois theory, amalgamation, and modularity
(modularity is the abstract form of ManinMumford, or MordellLang). This makes
no sense for a single algebraic variety; it can be applied, roughly speaking, to an
appropriate class of varieties, i.e. to a structure interpretable in (enriched) algebraic
geometry.


The model theoretic part of the proof is very similar to that of the geometric Mordell-Lang conjecture in [14]; but the algebraic layer involves fields with automorphisms, rather than derivations, in order to be able to say something about number fields; as a consequence the relevant structures are not stable, leading to the use of a different dimension theory.
1.2. Difference algebra and model theory
A ($k$-fold) difference field is a field with distinguished automorphisms $\sigma_1, \dots, \sigma_k$. The deeper results of [5] are available only for $k = 1$. We will always assume $k = 1$ except where explicitly stated otherwise. We also restrict attention to characteristic zero throughout the paper. The (only) reason is that the deeper results of [5] are available only with this assumption. We expect entirely similar results concerning definable subgroups of semi-Abelian varieties in positive characteristic; for subgroups of vector groups new phenomena are encountered [5, Section 7] (see footnote 3). The direct analogue of the Manin-Mumford conjecture is of course false for Abelian varieties defined over finite fields; once these are ruled out (in the appropriate sense), the result follows from [14].
The theory of difference fields of characteristic zero has a model companion; it plays the same role for difference fields as the algebraic closure does for a field. See [24] or [25] or [7] for this notion in general, and [5] for the case of difference fields (with a single automorphism). We give a quick summary.
Definition 1.2.1. Let $(K, \sigma_1, \dots, \sigma_k)$ be a difference field. $K$ is difference-closed if $K$ is algebraically closed, and the following condition holds:
Let $X$ be an irreducible $K$-variety, $X_i$ the variety obtained by conjugating by $\sigma_i$. Let $Y$ be an irreducible subvariety of $X \times X_1 \times \dots \times X_k$, projecting dominantly to each factor. Then there exists $a \in X(K)$ with $(a, \sigma_1(a), \dots, \sigma_k(a)) \in Y$.
We will say that $(K, \sigma_1, \dots, \sigma_k)$ is a universal domain if whenever $K'$ is a relatively algebraically closed subfield, $\sigma(K') \subset K'$, $K''$ is a countable difference field, and $i : K' \to K''$ is an embedding of difference fields, then there exists an embedding $j : K'' \to K$ of difference fields, with $j \circ i = \mathrm{id}_{K'}$. We will not use any nontrivial properties of universal domains; however it is easy to see that every (countable) difference field embeds into some universal domain. This amounts to saying that the class of algebraically closed difference fields has the amalgamation property; cf. [7].
Lemma 1.2.2. Any universal domain is difference-closed.
Proof. Let $X$, $X_i$ be as in the definition of difference-closed. Let $K'$ be a countable, $\sigma_i$-invariant, algebraically closed subfield of $K$, over which $X$ is defined. Let $(a, a_1, \dots, a_k)$ be a generic point of $Y$ over $K'$. Let $K''$ be an algebraically closed field extending $K'(a, a_1, \dots, a_k)$, and of infinite transcendence degree over $K'$. It is easy to see that there exist automorphisms $\sigma_{i,K''}$ of $K''$ extending the restriction of $\sigma_i$ to $K'$, and with $\sigma_{i,K''}(a) = a_i$. Let this determine a difference field structure on $K''$. Since $K''$ embeds into $K$ over $K'$, there exist also $(a, a_1, \dots, a_k) \in Y(K)$ with $\sigma_i(a) = a_i$.
(Footnote 3) After these lines were written, major parts of [5] were generalized to positive characteristic in [6].
In particular, every countable difference field embeds into a difference-closed difference field. In a difference-closed field, every set or relation definable by a first-order formula is definable by an existential formula, and moreover by one in which the quantifiers range only over a finite set, the set of roots of some polynomial; see [5] for a precise statement. This result follows from the form of the definition of a universal domain above, using standard model theory; see [7]. For the most part we will refer only to sets defined by difference equations. These are equations $f(X) = 0$, where $f(X)$ is a difference polynomial, i.e. a polynomial in the variables $X^{\tau}$, $\tau \in F_0$, with $F_0$ the free semigroup generated by $\sigma_1, \dots, \sigma_k$.
We observe that when $k > 1$ the automorphisms $\sigma_i$ on a universal domain $K$ do not commute. (One can obtain a universal domain for algebraically closed fields with $k$ commuting automorphisms by taking within $K$ the common fixed field $L$ of the commutators $[\sigma_i, \sigma_j]$ and their conjugates. This infinitely definable pseudo-algebraically closed subfield of the universal domain for $k$ noncommuting automorphisms is a substitute for a saturated model for the nonexistent model completion of the theory of $k$ commuting automorphisms. It will play no role in our considerations.)
We now turn to the case $k = 1$. As in the case of fields, one has a correspondence between prime difference ideals in the difference ring of difference polynomials over $K$, in the variables $X = (X_1, \dots, X_n)$, and certain subsets of $K^n$, the zero sets of the ideals. But here, not every definable set is a Boolean combination of such difference varieties. A basic definable set is rather the projection of such a difference variety; one can show that the projection can be taken to be finite-to-one.
Dimensions: It is possible to attach a finite dimension to certain definable sets. In model theory one considers more general ordinal-valued dimension theories, referring to them as the rank. Thus we will speak of groups of finite rank. This is quite distinct from (in very special cases, dual to) the notion of an Abelian group of finite $\mathbb{Q}$-rank. Though sets of infinite rank occur in difference fields (as well as unranked sets, in the case of several automorphisms), we will largely be concerned with sets of finite rank. We refer to [5] for a detailed definition and discussion of the relevant dimension theories. Let us note here that there are three approaches to defining the dimension. Let $K_0$ be a difference field, $D$ the zero set of a (finite) number of difference polynomials over $K_0$ (or more generally a quantifier-free definable set).
(i) The transformal degree of $D$ is defined to be
$$\sup_{a \in D} \ \mathrm{tr.deg.}_{K_0}\, K_0(\{\tau(a) : \tau \in F_0\}).$$
For instance if $k = 1$, and $f$ is a nonzero ordinary ($k = 1$) difference polynomial in one variable $X$, i.e. a polynomial in $X, X^{\sigma}, \dots, X^{\sigma^m}$ for some $m$, and $D$ is the 0-set of $f$, then (if $X^{\sigma^m}$ occurs nontrivially in $f$), $D$ has transformal degree $m$.


(ii) One can use the structure of quantifier-free definable subsets of $D$. The dimension of $D$ is then the maximal $m$ such that there exists a chain $p_0 \subset \dots \subset p_m$ of prime difference ideals, with the defining polynomials of $D$ lying in $p_0$. This could be stated dually in terms of irreducible difference subvarieties of $D$.
(iii) One can use the structure of all definable subsets of $D$. See the discussion of S1-rank below.
The ranks given in these three ways satisfy (iii) $\le$ (ii) $\le$ (i), despite the additional freedom permitted by (iii). It is this fact that will be used, later on, to show that all definable groups embed into algebraic groups, and are simply determined by certain algebraic-group data (up to finite index, in the version given here; a precise determination requires more work; this reflects the blindness of the rank to finite index).
Note that if $G$ is an algebraic group, or generally an algebraic variety, then $G$ has finite (Zariski) dimension as such, but infinite rank in the sense of $\sigma$.
(Classical model theory has developed a convention of denoting a model and its universe by the same symbol. This is gradually becoming cumbersome, as one works more and more with different structures on the same universe. For instance, here it would be much better to have different symbols for $G$ as an algebraic group, and as a group defined in a difference field. However we will stick to this convention in the present paper, and trust to context.)
From now on, unless explicitly stated otherwise, we work in the ordinary case of a single automorphism, $k = 1$.
In our analysis of definable Abelian groups below, we will also require the general classification of difference formulas of finite rank, developed in [5]. With a little extra analysis, the central result there can be phrased as follows. Let $\varphi(x)$ be any difference equation, or formula, of finite rank. One can try to simplify it by substitution, say the substitution of $x'$ for $x$, where $x'$ is a rational or algebraic function of $x, \sigma(x), \sigma^2(x), \dots$. Using such transformations, $\varphi$ can be reduced to an equation $\varphi'$ of one of the following forms. Let $E = \{x' : \varphi'(x')\}$.
(i) $E$ is the fixed field $k$, i.e. $\varphi'$ is the equation $\sigma(x') = x'$.
(ii) $E$ is a one-dimensional $\sigma$-definable subgroup of a simple Abelian variety $A$. Moreover, every $\sigma$-definable subset of $E^n$ is a finite Boolean combination of $\sigma$-definable subgroups.
(iii) $\varphi'$ is trivial in the sense that there are no algebraic relations between pairwise independent solutions of $\varphi'$.
Though this has been proved only in characteristic zero, we believe that an appropriate modification of the theorem holds in all characteristics. (In (i) one must allow more generally the equation for the fixed fields of $\sigma^l \mathrm{Frob}^k$, and in (ii) certain subgroups of vector groups must be taken into account.) Note the special role that this theorem accords to subgroups of Abelian varieties, among all difference equations. This by itself suggests that a closer understanding of such groups could be useful.
Definable groups: In Section 4 we will develop the theory of Abelian groups definable in difference algebra. In principle, these groups may be defined by arbitrary first-order formulas in the language of commutative rings, with a symbol for the automorphism $\sigma$. However, one quickly sees that up to isogeny and finite index subgroups, they can all be defined using "linear" equations $\sum_i e_i \sigma^i(x) = 0$, where $x$ ranges over a commutative algebraic group $A$ (this will be explained in more detail in Section 4). We put "linear" in quotes, since if $A$ is a semi-Abelian variety, and the equation is written out in coordinates, it is not linear at all. We are interested however in the inner structure of these groups, in the model-theoretic sense of induced structure; this includes polynomial relations among elements, intersections with subvarieties, and the behaviour of $\sigma$. We obtain essentially the full story here. As an example, we quote:
Definition 1.2.3. Let $A$ be a semi-Abelian variety over $K$, defined over the fixed field $k$. An equation of the form
$$\sum_{i=0,\dots,n} m_i \sigma^i(x) = 0$$
(or in the inhomogeneous version, $\sum_{i=0,\dots,n} m_i \sigma^i(x) = a$) is said to be of restricted Abelian type if the polynomial $\sum_{i=0}^{n} m_i T^i \in \mathbb{Z}[T]$ has no roots of unity among its zeroes.
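The condition in Definition 1.2.3 can be tested mechanically. The following is a minimal sketch, not taken from the paper: it assumes the equation is given by its integer coefficient list $(m_0, \dots, m_n)$, and it relies on the standard fact that a root of unity of order $N$ is a zero of an integer polynomial exactly when the $N$-th cyclotomic polynomial divides it. The function names are hypothetical, and the search range uses the elementary estimate $\varphi(N) \ge \sqrt{N/2}$ for Euler's function.

```python
from functools import lru_cache


def poly_divmod(num, den):
    """Divide integer polynomials (coefficients listed lowest degree first).

    `den` must be monic, so the division stays within the integers.
    Returns (quotient, remainder) as coefficient lists.
    """
    num = list(num)
    quot = [0] * max(len(num) - len(den) + 1, 0)
    for shift in range(len(num) - len(den), -1, -1):
        coeff = num[shift + len(den) - 1]
        quot[shift] = coeff
        for j, d in enumerate(den):
            num[shift + j] -= coeff * d
    return quot, num[:len(den) - 1]


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial, lowest degree first."""
    poly = [-1] + [0] * (n - 1) + [1]  # T^n - 1
    for d in range(1, n):
        if n % d == 0:
            # Remove the factors belonging to proper divisors of n.
            poly, rem = poly_divmod(poly, cyclotomic(d))
            assert not any(rem)
    return tuple(poly)


def roots_of_unity_orders(m):
    """Orders N such that a primitive N-th root of unity is a zero of sum m[i] T^i."""
    m = list(m)
    while m and m[-1] == 0:
        m.pop()
    if not m:
        raise ValueError("the zero polynomial is not allowed")
    degree = len(m) - 1
    orders = []
    # If phi(N) <= degree then N <= 2 * degree^2, since phi(N) >= sqrt(N / 2).
    for n in range(1, 2 * degree * degree + 1):
        phi_n = cyclotomic(n)
        if len(phi_n) - 1 <= degree:
            _, rem = poly_divmod(m, phi_n)
            if not any(rem):
                orders.append(n)
    return orders


def is_restricted_abelian_type(m):
    """True when sum m[i] T^i has no root of unity among its zeroes."""
    return not roots_of_unity_orders(m)


if __name__ == "__main__":
    # sigma^2(x) - 3 sigma(x) + x = 0: the zeroes of T^2 - 3T + 1 are not roots of unity.
    print(is_restricted_abelian_type([1, -3, 1]))
    # sigma(x) - x = 0 defines the points over the fixed field; T - 1 vanishes at 1.
    print(is_restricted_abelian_type([-1, 1]))
```

In this sketch the equation $\sigma(x) = x$ fails the test, in line with the contrast the text draws between equations of restricted Abelian type and the cyclotomic case.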
Theorem 1.2.1. Let $(K, \sigma)$ be a difference-closed difference field; $A$ be a semi-Abelian variety over $K$; and let $F(\sigma) = a$ be an equation of restricted Abelian type on $A$. Let $B$ be the set of solutions to $F(\sigma) = a$ in $K$; and let $X$ be a subvariety of $A$. Then $X \cap B$ is a finite union of cosets of (definable) subgroups of $A$.
Note the analogy to Theorem 1.1.3.
At the other extreme, if $\sum m_i T^i$ is cyclotomic, the group in question is a possibly twisted algebraic group over the fixed field $k$, and as such carries a lot of structure. It is shown in Section 3.2 that any definable group of finite rank is built up from the two types of example, Abelian type groups and twisted algebraic groups. The two types are orthogonal in the model theoretic sense. This means that every definable subset of their product is a Boolean combination of rectangles. In Sections 3.3 and 3.4, we describe the non-split ways these groups can be put together. Section 3.6 says something about the induced structure in this case.
Quantitative bounds: To obtain our quantitative bounds, we will need an estimate on the number of solutions to a difference equation. It is convenient to bound more generally the number of components of the Zariski closure of the set of solutions to a difference equation, working in the model completion. With an appropriate notion of (total) degree of a subvariety of $(\mathbb{P}^n)^l$, we obtain:
Proposition 1.2.2. Let $X$ be a subvariety of $\mathbb{P}^n$; and let $S$ be a subvariety of $(\mathbb{P}^n)^l$. Let $(K, \sigma)$ be a difference-closed difference field; and let $Z$ be the Zariski closure of
$$\{x \in X(K) : (x, \sigma(x), \dots, \sigma^{l-1}(x)) \in S\}.$$
Then
$$\deg(Z) \le (\deg(X)^l \deg(S))^{2^d}, \qquad d = \min(\dim(S),\ l \dim(X)).$$
In particular, this bounds the number of irreducible components of $Z$.


These doubly exponential bounds will be proved in Section 2. The proof requires only Bezout's theorem and Proposition 2.2.1 (which can be taken as a definition of difference-closed difference fields). They apply a posteriori to any difference variety known to be finite, regardless of the effectivity of the initial proof of such finiteness. In particular they apply to the qualitative finiteness statement of Theorem 1.2.1, and yield the explicit bounds.

1.3. Postscript, November 2000

The proof of Manin-Mumford given in this paper was found in 1994 (cf. [13]), written and submitted in 1995. The present text is a great improvement over the 1995 preprint, thanks to the graceful help of Elisabeth Bouscaren, Zoé Chatzidakis, and Michael McQuillan, as well as Alex Wilkie and Boris Zilber. Aside from many local corrections, the following aspects are new.
Tate-Voloch: I heard about the Tate-Voloch conjecture a few weeks after the original preprint was submitted. It was immediately clear that the methods of this paper, with no further work other than a classical nonstandard analysis type argument, answer significant cases of that conjecture. The result (Proposition 6.6.1), circulated separately in early 1996, is now included in the text.
Uniformity in X: The proof of Manin-Mumford, unlike the number theoretic proofs, does not assume that $X$ lies over the fixed field, and hence gives bounds uniform in $X$. In other words, if $X \subset (A \times Y)$ is an algebraic variety, $X_b = \{a \in A : (a, b) \in X\}$, then there exists $m$ such that the Zariski closure of $X_b \cap \mathrm{Tor}(A)$ is a union of at most $m$ translates of group subvarieties. It seemed at the time of writing that this is a significant contribution of the present approach. But it turns out that this can be deduced directly from the statement! See automatic uniformity, below.
Uniformity in A: Uniformity in $A$ (Theorem 1.1.2) does appear to be an important feature of this proof. In particular, going back to the original statement of Manin-Mumford, fix $p$ and also $g \ge 2$; there exists an absolute bound $b(p, g)$ on the number of prime-to-$p$ torsion points lying on a curve over $\mathbb{Q}_p$ of genus $g$ and with good reduction at $p$.
Improvements in model-theoretic technology: The theory of simple unstable first-order theories has greatly matured in the intervening five years. The main advance was a generalization of the finite-dimensional theory (represented in part here) to the general simple context. But there were also advances within the finite-dimensional context, and in particular Frank Wagner found a much smoother and more general approach to internalizing groups. However I left the original treatment intact.
Automatic uniformity: Statements such as Manin-Mumford, or Mordell-Lang, at the level of all Abelian varieties (or at least all Cartesian powers of a given one), enjoy an automatic uniformity property. As soon as one knows that $\mathrm{ZCl}(X \cap \Gamma^n)$ is a finite union of cosets of group subvarieties for any $n$ and any subvariety $X$ of $A^n$, one also obtains a bound on this finite number, that does not grow when $X$ moves in an algebraic family. This uniformity (Corollary 3.5.9) seems to have gone unobserved by number theorists and model theorists alike, but it is an immediate consequence of a lemma, Lemma 3.5.8, that was folklore in model theory since the mid-1980s, and whose proof is short and requires no technology beyond the compactness theorem. In particular, it has nothing to do with any of the aspects of the present paper, neither difference fields nor model theoretic dimension theories. In itself it seems to be of interest, for instance because of the following corollary:
Corollary 1.3.1. Let $A$ be a simple Abelian variety of dimension 2 over a number field $K$; and $f$ a rational function on $A$. Let $C_a$ be the curve $f^{-1}(a) \subset A$. Then for some integer $M$, $|C_a(K)| \le M$ for all $a \in K$.
Proof. By Faltings + automatic uniformity.


At the end of this section (Lemma 1.3.2) we include a complete and elementary proof, where even the use of compactness can be replaced by an algebraic argument.
Work of Scanlon: Later work of Scanlon amply demonstrated the potential of these model theoretic methods. Extending the ideas in Lemma 6.6.1 and in Theorem 1.1.4, he gave a proof of the full Tate-Voloch conjecture for semi-Abelian varieties over $\mathbb{Q}_p$. Transposing these ideas to Drinfeld modules, he proved Denis's conjecture. No other proofs are presently known.
New in this edition: Originally, the quantitative and qualitative arguments were separated insofar as one automorphism goes, but were given together in the arguments using two automorphisms. They are now separated in all cases; a priori bounds are given first in Section 2.3.1, so that quantitative details need not confuse the actual proof of finiteness. The lemma on automatic uniformity is new here. Section 4.6 is also new (Problem 4.5.2 was previously proved ad hoc).
Lemma 1.3.2 (Automatic uniformity). Let $G$ be a commutative algebraic group over a field $K$. Let $Y$ and $U \subset (G \times Y)$ be varieties, so that $\{U_y : y \in Y\}$ is a constructible family of subvarieties of $G$. Then there exists a finite number of subvarieties $C_i \subset G^{n_i}$ of Cartesian powers of $G$ such that for any group $A \subset G(K)$, if
$$C_i \cap A^{n_i} = F_i \cap A^{n_i}$$
for each $i$, with $F_i$ the union of cosets of group subvarieties of $G^{n_i}$, then for some $M$, for any $b$, the Zariski closure of $U_b \cap A$ is the union of at most $M$ cosets of group subvarieties of $G$.
Proof. For $g = (g_1, \dots, g_n) \in G^n$, let
$$Y(g) = \Big\{ y \in Y : \bigwedge_{i=1}^{n} (g_i, y) \in U \Big\}.$$
Let
$$R_n = \{(g, h) \in G^n \times G : Y(g) \subset Y(h)\},$$
$$E_n = \Big\{ (g_1, \dots, g_n) \in G^n : \bigwedge_{i=1}^{n} Y(g_1, \dots, g_{i-1}) \ne Y(g_1, \dots, g_i) \Big\}.$$
(For $i = 1$, read: $Y \ne Y(g_1)$.) Then $R_n \subset G^n \times G$, $E_n \subset G^n$ are constructible sets. Take the $C_i$ to be Zariski closed sets, so that each $R_n$ is a Boolean combination of some of the $C_i$.
For large enough $N$, we have $E_N = \emptyset$. For otherwise, by compactness, we could find $g_1, g_2, \dots \in G$ with $Y \ne Y(g_1) \ne Y((g_1, g_2)) \ne \dots$, contradicting Noetherianity of $Y$ in the Zariski topology.
Now let $A$ be a subgroup of $G(K)$. Let $b \in Y$. Pick $g_1, g_2, \dots, g_n \in A \cap U_b$ such that $g = (g_1, \dots, g_n) \in E_n$, and $n$ is as large as possible. (We have $n \le N$.) So if $h \in A \cap U_b$, then $h \in R_n(g)$, i.e. $(g, h) \in R_n$. Conversely if $h \in R_n(g)$, then $b \in Y(g) \subset Y(h)$, so $h \in U_b$. Thus
$$A \cap U_b = A \cap R_n(g).$$
By assumption, $R_n \cap A^{n+1} = F_n \cap A^{n+1}$, where $F_n$ is a Boolean combination of cosets. So $R_n(g) \cap A = F_n(g) \cap A$. But $F_n(g)$ is clearly a Boolean combination of boundedly many cosets. Hence so is $F_n(g) \cap A = A \cap U_b$, and thus also the Zariski closure of $A \cap U_b$.

2. The number of solutions to difference equations

2.1. Bezout's theorem

In order to formulate our results, we embed our variety $X$ in projective space $\mathbb{P}^n$. We wish to bound the number of elements of $X$ satisfying some difference equations. This amounts to considering elements $x$ of $X$ such that $(x, \sigma x, \dots, \sigma^l x)$ lies in a prescribed subvariety $S$ of $(\mathbb{P}^n)^l$. We denote by $\deg(S)$ the sum of the multi-degrees of $S$; in particular if $S$ is the set of zeroes of a multi-homogeneous polynomial, then $\deg(S)$ is the total degree of this polynomial.
There is however a more convenient description for our purposes, given in [10, p. 148, Example 8.4.4]. Viewing $\mathbb{P}^n$ as $V^*/\mathbb{G}_m$, where $V^*$ is an $(n+1)$-dimensional vector space with 0 removed, we obtain a representation of $(\mathbb{P}^n)^l$ as $(V^*)^l/(\mathbb{G}_m)^l$. Let $\mathbb{G}_m$ be the diagonal subtorus of $\mathbb{G}_m^l$. Then $(V^*)^l/\mathbb{G}_m$ can be identified with an open subset $U$ of $(V^l)^*/\mathbb{G}_m = \mathbb{P}^{(n+1)l-1}$.
Definition 2.1.1. We have a surjective rational map
$$\varphi : \mathbb{P}^{(n+1)l-1} \to (\mathbb{P}^n)^l$$
given by
$$\varphi((x_{01} : \dots : x_{n1} : x_{02} : \dots : x_{n2} : \dots : x_{nl})) = ((x_{01} : \dots : x_{n1}), \dots, (x_{0l} : \dots : x_{nl})).$$
$\varphi$ is defined outside of the union $L$ of $l$ linear subvarieties. For a subvariety $S$ of $(\mathbb{P}^n)^l$, we let $S^*$ be the Zariski closure of $\varphi^{-1} S$. Let $\deg(S) = \deg(S^*)$, the latter taken in projective space. If $S$ is a reducible variety, we define $\deg(S)$ to be the sum of the degrees of the components.
Our calculations will be based on:
Lemma 2.1.2 (Bezout's Theorem). (1) Let $V_1, \dots, V_r$ be subvarieties of $(\mathbb{P}^n)^l$. Let $Z_1, \dots, Z_t$ be the irreducible components of $V_1 \cap \dots \cap V_r$. Then
$$\sum_{i=1}^{t} \deg(Z_i) \le \prod_{j=1}^{r} \deg(V_j).$$
(2) Let $V$ be a subvariety of $(\mathbb{P}^n)^l \times (\mathbb{P}^n)^k$; and let $\bar{V}$ be the (set-theoretic) projection to $(\mathbb{P}^n)^l$. Then $\deg(\bar{V}) \le \deg(V)$.
(3) Let $X$ be a subvariety of $(\mathbb{P}^n)^l \times (\mathbb{P}^n)^k$ of degree $d$. Let $\mathrm{pr}_1$ be the first projection; $X(a) = X \cap \mathrm{pr}_1^{-1}(a)$. Suppose $\dim X(a) = r$ for generic $a \in \mathrm{pr}_1 X$. Then $\{a \in (\mathbb{P}^n)^l : \dim(X(a)) > r\}$ is contained in a proper Zariski closed subset of $\mathrm{pr}_1 X$ of degree at most $d$.
(4) Let $V$ be a subvariety of $(\mathbb{P}^n)^l \times (\mathbb{P}^n)^l \times (\mathbb{P}^n)^k$; $\Delta = \{(a, b, c) \in (\mathbb{P}^n)^l \times (\mathbb{P}^n)^l \times (\mathbb{P}^n)^k : a = b\}$; and let $\bar{V} = \mathrm{pr}(V \cap \Delta)$ be the (set-theoretic) projection of $V \cap \Delta$ to $(\mathbb{P}^n)^l \times (\mathbb{P}^n)^k$. Then $\deg(\bar{V}) \le \deg(V)$.
Proof. (1) Note that $S = \varphi(S^* \cap U)$, and $S^*$ is irreducible if $S$ is. For $S^*$ is invariant under the action of the torus $\mathbb{G}_m^l$, acting by
$$(\alpha_1, \dots, \alpha_l) \cdot (x_{01} : \dots : x_{n1} : x_{02} : \dots : x_{n2} : \dots : x_{nl}) = (\alpha_1 x_{01} : \dots : \alpha_1 x_{n1} : \alpha_2 x_{02} : \dots : \alpha_2 x_{n2} : \dots : \alpha_l x_{nl}).$$
It follows that some component of $S^*$ of maximal rank must also be invariant; but then it has the form $U^*$ for some subvariety $U$, and necessarily $U = S$ and $U^* = S^*$. Thus the operation $S \mapsto S^*$ takes subvarieties of $(\mathbb{P}^n)^l$ to subvarieties of $\mathbb{P}^{(n+1)l-1}$, injectively and preserving inclusion and irreducibility. Hence if $Z$ is a component of $V_1 \cap V_2$, then $Z^*$ is a component of $V_1^* \cap V_2^*$. The operation also preserves degree by our definition of degree. Thus we are reduced to the case $l = 1$; this is [10, p. 148, Example 8.4.6].
(2) We may assume $V$, and hence $\bar{V}$, are irreducible. Let
$$\varphi_1 : \mathbb{P}^{(n+1)l-1} \to (\mathbb{P}^n)^l$$
and
$$\varphi_2 : \mathbb{P}^{(n+1)(l+k)-1} \to (\mathbb{P}^n)^k \times (\mathbb{P}^n)^l$$
be the maps from Definition 2.1.1 used to define degree on $(\mathbb{P}^n)^{l+k}$ and on $(\mathbb{P}^n)^l$. Let
$$\pi : (\mathbb{P}^n)^l \times (\mathbb{P}^n)^k \to (\mathbb{P}^n)^l$$

be the projection. Then clearly $\pi \varphi_2 = \varphi_1 \lambda$ where $\lambda$ is a linear projection from $\mathbb{P}^{(n+1)(l+k)-1}$ to $\mathbb{P}^{(n+1)l-1}$. It remains to show:
Claim. Let $\lambda$ be a linear projection from $\mathbb{P}^N$ onto $\mathbb{P}^M$, with center $C$. Let $V$ be a subvariety of $\mathbb{P}^N$, not contained in $C$. Let $\bar{V}$ be the Zariski closure of $\lambda(V \setminus C)$. Then $\deg(\bar{V}) \le \deg(V)$.
Proof. Let $\bar{L}$ be a linear subspace of $\mathbb{P}^M$ of complementary dimension to $\bar{V}$. Let $L$ be the pullback to $\mathbb{P}^N$, so $C \subset L$, $L$ is a linear subspace of $\mathbb{P}^N$, and $\lambda(V \cap L \setminus C) = \bar{L} \cap \bar{V}$. $V \cap L$ has at most $\deg(V)$ irreducible components. Hence the Zariski closure of $\lambda(V \cap L)$ has at most $\deg(V)$ irreducible components. Since it is finite, it has size at most $\deg(V)$. So $\deg(\bar{V}) \le \deg(V)$.
(3) Let $\varphi : \mathbb{P}^{k(n+1)-1} \to (\mathbb{P}^n)^k$ be the map from Definition 2.1.1, and let $\lambda = \mathrm{Id}_{(\mathbb{P}^n)^l} \times \varphi$. Let $Y$ be the Zariski closure of $\lambda^{-1} X$. Let $L$ be a generic linear subvariety of $\mathbb{P}^{k(n+1)-1}$ of codimension $r + k$. Let $X' = Y \cap ((\mathbb{P}^n)^l \times L)$. Note that $\deg(X') = \deg(Y) = \deg(X)$. For generic $a \in \mathrm{pr}_1 X$, $X'(a) = (Y(a) \cap L) = \emptyset$. On the other hand by the projective dimension theorem, if $\dim X(a) > r$ then $X'(a) \ne \emptyset$. Thus
$$\{a \in (\mathbb{P}^n)^l : \dim(X(a)) > r\} \subset \mathrm{pr}_1 X' \subset \mathrm{pr}_1 X$$
and we can apply (2).
(4) $\Delta$ is easily seen to have degree $l + 1$; so from (1) and (2) we get $\deg(\bar{V}) \le (l + 1)\deg(V)$. To get the result with coefficient 1, recall the maps
$$\varphi_0 : W_l^2 \times W_k \to (\mathbb{P}^l)^2 \times \mathbb{P}^k,$$
$$\varphi_0 : (x_0, \dots, x_l, y_0, \dots, y_l, z_0, \dots, z_k) \mapsto ((x_0 : \dots : x_l), (y_0 : \dots : y_l), (z_0 : \dots : z_k)),$$
where $W_l$ is a standard $(l + 1)$-dimensional vector space with 0 removed; and also
$$\varphi_0' : W_l \times W_k \to \mathbb{P}^l \times \mathbb{P}^k,$$
$$\varphi_0' : (y_0, \dots, y_l, z_0, \dots, z_k) \mapsto ((y_0 : \dots : y_l), (z_0 : \dots : z_k)).$$
Now
$$\varphi_0'^{-1}(\mathrm{pr}(V \cap \Delta)) = \mathrm{pr}_0(\varphi_0^{-1}(V) \cap \Delta_0),$$
where $\mathrm{pr}_0 : W_l^2 \times W_k \to W_l \times W_k$ is the projection, and $\Delta_0 = \{(a, b, c) \in W_l^2 \times W_k : a = b\}$. The maps $\varphi_0$, $\varphi_0'$, $\mathrm{pr}_0$ induce rational maps $\varphi$, $\varphi'$, $\mathrm{pr}'$ on and between appropriate subsets of the projectivizations $\mathrm{Proj}(W_l \times W_l \times W_k)$, $\mathrm{Proj}(W_l \times W_k)$. We have
$$\varphi'^{-1}(\mathrm{pr}(V \cap \Delta)) = \mathrm{pr}'(\varphi^{-1}(V) \cap \Delta'),$$
where $\Delta'$ is the projectivization of $\Delta_0$. Since $\Delta'$ has degree 1, the result follows as in (1) and (2).

2.2. Ordinary difference equations

We will also use the following characterization of difference-closed difference fields.
Lemma 2.2.1. Let $(K, \sigma)$ be a difference-closed difference field; and let $X$ be an irreducible $K$-variety; $X^{\sigma}$ the conjugate variety; and $Y$ an irreducible $K$-subvariety of $X \times X^{\sigma} \times \dots \times X^{\sigma^{l-1}}$. Let $\pi$, $\pi'$, $\pi_0$ be the projections to $X \times X^{\sigma} \times \dots \times X^{\sigma^{l-2}}$, $X^{\sigma} \times \dots \times X^{\sigma^{l-1}}$, $X$, respectively. Suppose $(\pi Y)^{\sigma} = \pi' Y$ (or just that these two sets have the same Zariski closure). Then
$$\{x \in X(K) : (x, \sigma(x), \dots, \sigma^{l-1}(x)) \in Y\}$$
is Zariski dense in $\pi_0 Y$.
Proof. For $l = 2$ this follows from Definition 1.2.1 of difference-closed difference fields: if $H$ is a hypersurface in $\pi Y$, by Definition 1.2.1 applied to $(\pi Y \setminus H)$, $((\pi Y)^{\sigma} \setminus H^{\sigma})$, $Y \cap ((\pi Y \setminus H) \times ((\pi Y)^{\sigma} \setminus H^{\sigma}))$, there exists $a \in \pi Y \setminus H$ with $(a, \sigma(a)) \in Y$; since this holds for any $H$, $\{a : (a, \sigma(a)) \in Y\}$ is Zariski dense in $\pi Y$. For higher $l$, let $X' = X \times \dots \times X^{\sigma^{l-2}}$, $Y' = \{(\pi \bar{x}, \pi' \bar{x}) : \bar{x} \in Y\}$, and apply the case $l = 2$.
The following proposition yields estimates of finite numbers arising in difference algebra. We allow reducible varieties here.
Notation 2.2.2. $P = \mathbb{P}^n$ = $n$-dimensional projective space,
$$P^l = (\mathbb{P}^n)^l;$$
$$\pi(x_1, \dots, x_l) = (x_1, \dots, x_{l-1});$$
$$\pi'(x_1, \dots, x_l) = (x_2, \dots, x_l);$$
$$\pi_1(x_1, \dots, x_l) = x_1.$$
Proposition 2.2.1. Let $(K, \sigma)$ be a difference-closed difference field; and let $S$ be a subvariety of $P^l$, defined over $K$. Let
$$Z = \text{Zariski closure of } \{x \in P(K) : (x, \sigma(x), \dots, \sigma^{l-1}(x)) \in S\}.$$
Then $\deg(Z) \le \deg(S)^{2^{\dim(S)}}$. In particular, $Z$ has at most $\deg(S)^{2^{\dim(S)}}$ irreducible components.
Say $S$ is defined over $\mathbb{Q}(c)$. Then every irreducible component of $Z$ is defined over $\mathbb{Q}(\sigma^{i}(c), \dots, \sigma^{i - \dim(S)}(c))^a$ for some $i$, $0 \le i \le \dim(S)$.

Proof. We use Notation 2.2.2. Given a (nonempty) irreducible subvariety $W$ of $S$, we define varieties $W'$, $W''$ as follows.
Case 1: $(\pi W)^{\sigma} = \pi' W$.
In this case we let $W' = W$, $W'' = \emptyset$.
Case 2: $\pi' W \not\subseteq (\pi W)^{\sigma}$.
Then let $W'' = (P \times (\pi W)^{\sigma}) \cap W$, $W' = \emptyset$.
Case 3: $(\pi W)^{\sigma} \not\subseteq \pi' W$, or equivalently $\pi W \not\subseteq (\pi' W)^{\sigma^{-1}}$.
Then let $W'' = ((\pi' W)^{\sigma^{-1}} \times P) \cap W$, $W' = \emptyset$.
Claim. (1) $\deg(W' \cup W'') \le \deg(W)^2$.
(2) $\pi_1 W' \subset Z$.
(3) If $\bar{x} = (x, \sigma x, \dots, \sigma^{l-1}(x)) \in W$, then $\bar{x} \in W' \cup W''$.
(4) $\dim(W'') < \dim(W)$.
(5) If $W$ is defined over $\mathbb{Q}(c)$, then $W'$ and $W''$ are defined over $\mathbb{Q}(\sigma^{-1}(c), c)$ or over $\mathbb{Q}(\sigma(c), c)$ (depending on the case).
Proof. (1) Follows from Lemma 2.1.2; projections, product with projective space $P$, and application of $\sigma$ do not increase degree, while the intersection (in Cases 2 and 3) squares it.
(2) Follows from Lemma 2.2.1.
(3) Clear from the definition of $W'$, $W''$ in each case.
(4) Clear since $W$ is nonempty and irreducible, and $W''$ is either empty or a proper subvariety, in each case.
(5) Evident.
Now define $S(i)$, $T(i)$ as follows:
$$S(0) = S; \qquad T(0) = \emptyset;$$
$$S(i + 1) = \bigcup \{W'' : W \text{ a component of } S(i)\};$$
$$T(i + 1) = \bigcup \{W' : W \text{ a component of } S(i)\} \cup T(i).$$
Using Claim (4), it is clear that $S(k) = \emptyset$ for some $k \le \dim(S) + 1$. By Claim (3), if $\bar{x} = (x, \sigma x, \dots, \sigma^{l-1}(x)) \in S$ then $\bar{x} \in S(i) \cup T(i)$ for each $i$, hence $\bar{x} \in T(k)$. Thus in this situation $x \in \pi_1 T(k)$. It follows that $Z$ is contained in $\pi_1 T(k)$. Conversely, by Claim (2), every component of $\pi_1 T(k)$ is contained in $Z$. Hence $Z = \pi_1 T(k)$.
By Claim (1), $\deg(S(i) \cup T(i)) \le \deg(S)^{2^i}$ and $\deg T(i + 1) \le \deg(S)^{2^i}$ for each $i$. The required bound on the degree follows using Lemma 2.1.2(2).
The rationality statement follows by induction from Claim (5).
Corollary 2.2.3. Let $X$ be a subvariety of $\mathbb{P}^n$; and let $S$ be an irreducible subvariety of $(\mathbb{P}^n)^l$. Let $(K, \sigma)$ be a difference-closed difference field; and let
$$Z = \text{Zariski closure of } \{x \in X(K) : (x, \sigma(x), \dots, \sigma^{l-1}(x)) \in S\}.$$
Then
$$\deg(Z) \le (\deg(X)^l \deg(S))^{2^d}, \qquad d = \min(\dim(S),\ l \dim(X)).$$
In particular, this bounds the number of irreducible components of $Z$.
Proof. Let $S' = S \cap (X \times X^{\sigma} \times \dots \times X^{\sigma^{l-1}})$. Then $\deg(S') \le \deg(S)\deg(X)^l$, while the dimension is at most $d$. Thus Proposition 2.2.1 applies.
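To get a feel for the size of these doubly exponential bounds, the following small sketch evaluates the formulas of Proposition 2.2.1 and Corollary 2.2.3 for given numerical invariants. It is an illustration only: the function names are hypothetical, and the inputs (degrees and dimensions) are assumed to be known in advance rather than computed from equations. The third function repeats the squaring step of Claim (1) to show where the exponent $2^{\dim(S)}$ comes from.

```python
def proposition_bound(deg_s: int, dim_s: int) -> int:
    """Bound deg(S)^(2^dim(S)) on deg(Z), as in Proposition 2.2.1."""
    return deg_s ** (2 ** dim_s)


def corollary_bound(deg_x: int, dim_x: int, deg_s: int, dim_s: int, l: int) -> int:
    """Bound (deg(X)^l * deg(S))^(2^d) with d = min(dim(S), l * dim(X)),
    as in Corollary 2.2.3."""
    d = min(dim_s, l * dim_x)
    return (deg_x ** l * deg_s) ** (2 ** d)


def iterated_squaring(deg_s: int, steps: int) -> int:
    """Square the degree bound once per step, mirroring Claim (1):
    passing from W to W', W'' at most squares the degree."""
    bound = deg_s
    for _ in range(steps):
        bound = bound ** 2
    return bound


if __name__ == "__main__":
    # A curve X of degree 3, a surface S of degree 2, and l = 2.
    print(corollary_bound(deg_x=3, dim_x=1, deg_s=2, dim_s=2, l=2))
    # dim(S) squaring steps reproduce the exponent 2^dim(S).
    print(iterated_squaring(2, 3) == proposition_bound(2, 3))
```

Python integers have arbitrary precision, so the values are exact even when they become very large.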
Our original treatment required a uniform version of Proposition 2.2.1, when S is
allowed to vary with a parameter. There may be embedded components that are hidden
to Proposition 2.2.1, in a generic 'ber, but appear upon specialization. The following
somewhat technical lemma deals with this issue. Will not be used in present approach.
Notation 2.2.4. We keep Notation 2.2.2, except that we let $P = P_m \times P_{m'}$, and $\rho : P \to P_m$ the projection. We also denote by $\rho$ the map $\rho \times \cdots \times \rho : P^l \to (P_m)^l$. If $Z$ is a subvariety of $P^l$ and $a \in (P_m)^l$,
$$Z(a) = \{b \in (P_{m'})^l : (a, b) \in Z\};$$
and if $r = \dim Z(a)$ for generic $a \in \rho Z$,
$$Z^* = \{a \in \rho Z : \dim Z(a) > r\}.$$
Lemma 2.2.5. Let $(K, \sigma)$ be a difference-closed difference field, and let $S$ be a subvariety of $P^l$, defined over $K$. There exist irreducible subvarieties $Z_i$ of $P$ such that:
1. $\sum_i \deg(Z_i) \le \sum_{i=0}^{\dim(S)} \deg(S)^{2^i}$.
2. $\{x \in Z_i(K) : (x, \sigma(x), \ldots, \sigma^{l-1}(x)) \in S\}$ is Zariski dense in $Z_i$.
3. If $a \in \{x \in P(K) : (x, \sigma(x), \ldots, \sigma^{l-1}(x)) \in S\}$ then for some $i$, $a \in Z_i$ and $\rho(a) \notin Z_i^*$.
4. (Irredundancy) If $i \ne j$ and $Z_i \subset Z_j$, then $\rho(Z_i) \subseteq Z_j^*$.
Proof. A straightforward variation on the proof of Proposition 2.2.1. $S(i)$, $T(i)$ are treated as collections of irreducible varieties. In Case 1 we let $W'' = W$, $W' = (W^* \times P_{m'}) \cap W$. By Lemma 2.1.2(3) and (1), we have $\deg(W') \le \deg(W)^2$; so Claim 1 must be modified to:
$$\deg(W' \cup W'') \le \deg(W)^2 + \deg(W).$$
As in Proposition 2.2.1, we begin with the variety $S$, and obtain a larger collection of irreducible varieties by closing under the operation: given $W$, add the irreducible components of $W'$ and of $W''$. In addition, to achieve the fourth item, we close under the binary operation: given $W_1$ and $W_2$ such that $W_1 \subset W_2$, add $W_1 \cap (W_2^* \times P_{m'})$. This allows us to simply remove, at the end of the construction, any redundant $Z_i$ as in (4). These operations always decrease dimension. One can show inductively that the sum of the degrees of the $Z_i$ of codimension $e$ in $S$ is bounded by $(\deg S)^{2^e}$.


The Manin-Mumford conjecture and the model theory of difference fields


E. Hrushovski


2.3. Numerical bounds for partial difference equations

In this subsection, we work in a universal domain U for the theory of fields with $r$ automorphisms, $\sigma_1, \ldots, \sigma_r$. F denotes the free group generated by $\sigma_1, \ldots, \sigma_r$.
F acts on itself by conjugation, and on the polynomial rings over U by acting on the coefficients. Putting these actions together we obtain an action of F on the difference polynomial ring. This action can be characterized by $F(a)^{\tau} = F^{\tau}(a^{\tau})$.
Three considerations frame the effectivity picture for partial difference equations.
(1) For a finite number of difference equations, bounds entirely similar to the ordinary case hold (Proposition 2.3.1).
(2) Unlike the ordinary case, prime difference ideals are not finitely generated, and one is typically interested in infinite sets of equations. In particular, if T is a set of points in the universal domain that is defined invariantly over Q, such as the set of torsion points of an Abelian variety over Q, then the set of equations satisfied by T will be invariant under F-conjugation. If $\varphi(X, \sigma(X))$ is an equation involving $\sigma$ alone, the properties of $\varphi$ as an ordinary difference equation correspond better to the properties of $\{\varphi^{\nu} : \nu \in F\}$, than to $\varphi$ as a partial difference equation. (See for instance Problem 4.5.2 and Lemma 4.5.4.)
(3) The Zariski closure of the solution set of an infinite set of partial difference equations in the universal domain is already the Zariski closure of the set of solutions of a finite subsystem. Given (1), the problem becomes to find such a finite subsystem.
One does not expect a general effective solution for conjugation-invariant difference ideals, even when finitely generated as such. The word problem for groups may be coded in such ideals, using equations in one variable of the form $x^w = x$.
In our application, we will directly estimate the number of equations needed in (3), and so (1) will apply.
A path in F is a sequence $\tau_1, \ldots, \tau_n$ such that for each $1 \le m < n$, for some generator $\sigma_i$ of F, $\tau_{m+1} = \sigma_i \tau_m$ or $\tau_{m+1} = \tau_m$ or $\tau_{m+1} = \sigma_i^{-1} \tau_m$. A subset $A \subseteq F$ is called connected if there exists a path connecting any two points of A.
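The notion of a connected subset of F can be made concrete with a small sketch. It assumes, as one reading of the definition, that the connecting path must stay inside A. Elements of F are encoded as reduced words, written as tuples of nonzero integers with the leftmost letter first: `i` stands for $\sigma_i$ and `-i` for $\sigma_i^{-1}$, and `()` is the identity. The names `left_mul` and `is_connected` are hypothetical helpers introduced only for this illustration.

```python
from collections import deque


def left_mul(letter, word):
    """Left-multiply a reduced word by one letter, cancelling an adjacent inverse pair."""
    if word and word[0] == -letter:
        return word[1:]
    return (letter,) + word


def is_connected(subset, r):
    """Return True if any two reduced words in `subset` are joined by a path inside it.

    A step of a path replaces tau by sigma_i * tau or sigma_i^{-1} * tau
    for one of the r generators (breadth-first search over such steps).
    """
    subset = set(subset)
    if not subset:
        return True
    start = next(iter(subset))
    seen = {start}
    queue = deque([start])
    while queue:
        word = queue.popleft()
        for i in range(1, r + 1):
            for letter in (i, -i):
                neighbour = left_mul(letter, word)
                if neighbour in subset and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
    return seen == subset
```

For instance, with $r = 2$ the set $\{1, \sigma_1, \sigma_2\sigma_1\}$, encoded as `{(), (1,), (2, 1)}`, is connected, while `{(), (2, 1)}` is not, since no single generator step leads from the identity to $\sigma_2\sigma_1$.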
Notation 2.3.1. A is a finite, connected subset of F. $P = (P_n)$ is projective space, $P^A$ the Cartesian power of P, with coordinates indexed by the elements of A. For each $i$, let
$$A_{i,-} = \{a \in A : \sigma_i a \in A\},$$
$$A_{i,+} = \{a \in A : \sigma_i^{-1} a \in A\}.$$
Let $\pi_{i,-}$, $\pi_{i,+}$ be the projections
$$\pi_{i,-} : P^A \to P^{A_{i,-}}, \qquad \pi_{i,+} : P^A \to P^{A_{i,+}}.$$
If $a \in P$, let $a^A$ denote the tuple $(\ldots, a^{\varphi}, \ldots) \in P^A$ with $a^{\varphi}$ in the $\varphi$-th place.


Proposition 2.3.1. Let A be a connected, finite subset of F. Let S be a subvariety of $P^A$, defined over U. Let
$$Z = \mathrm{ZCl}\{a \in P(U) : a^A \in S\}.$$
Then
$$\deg(Z) \le \deg(S)^{2^{\dim(S)}}.$$
In particular, Z has at most $\deg(S)^{2^{\dim(S)}}$ irreducible components.
If S is defined over $Q(c)$, then every connected component is defined over $Q(c, c^{\tau_1}, c^{\tau_2}, \ldots, c^{\tau_m})^a$ for some path $1, \tau_1, \ldots, \tau_m$ in F with $m \le \dim(S)$.
Corollary 2.3.2. Let X be a subvariety of P, and let S be an irreducible subvariety of $P^A$. Let
$$Z = \mathrm{ZCl}\{a \in X(U) : a^A \in S\}.$$
Then
$$\deg(Z) \le (\deg(X)^{|A|} \deg(S))^{2^d}, \qquad d = \min(\dim(S),\ |A| \dim(X)).$$
In particular, this bounds the number of irreducible components of Z.


Proof. The proofs of Proposition 2.3.1 and of Corollary 2.3.2 are the same as those of Proposition 2.2.1 and Corollary 2.2.3, replacing the use of Proposition 2.2.1 by that of Lemma 2.3.3 below. In the proof of Proposition 2.3.1, Case 1 is that for each $i$, $(\pi_{i,-} W)^{\sigma_i} = \pi_{i,+} W$. In this case, using Lemma 2.3.3, Z W will be Zariski dense in W. If Case 1 fails, there are $2r$ possible ways it can fail (two for each $i$), and one proceeds as in Proposition 2.2.1.
Lemma 2.3.3. Let X be an irreducible U-variety, $X^{\varphi}$ the conjugate variety ($\varphi \in F$), and Y an irreducible U-subvariety of $\prod_{\varphi \in A} X^{\varphi}$, projecting dominantly to each single factor. Let $\pi_{i,-}$, $\pi_{i,+}$ be as in Proposition 2.3.1. Suppose $(\pi_{i,-} Y)^{\sigma_i} = \pi_{i,+} Y$ for each $i = 1, \ldots, r$ (or just that these two sets have the same Zariski closure). Then $\{x \in X(U) : x^A \in Y\}$ is Zariski dense in X.
Proof. Let $a = (a_{\varphi} : \varphi \in A)$ be a generic element of Y over U, $L = U(a)^a$. Then for any subset $S \subseteq A$, $(a_{\varphi} : \varphi \in S)$ is a generic element of the projection of Y to $\prod_{\varphi \in S} X^{\varphi}$. In particular, for each $i$, $\pi_{i,-}(a)$ is a generic point of $\pi_{i,-} Y$, and $\pi_{i,+}(a)$ is a generic point of $\pi_{i,+} Y$. By assumption, the automorphism $\sigma_i$ of U carries $\mathrm{ZCl}\,\pi_{i,-} Y$ to $\mathrm{ZCl}\,\pi_{i,+} Y$. Thus $\sigma_i$ extends to an automorphism $\sigma_i$ of L with $\sigma_i \pi_{i,-}(a) = \pi_{i,+}(a)$. We obtain an action of F on L. Since $\sigma_i(a_{\varphi}) = a_{\sigma_i \varphi}$ whenever $\varphi, \sigma_i \varphi \in A$, we have $(\sigma_i \varphi)^{-1}(a_{\sigma_i \varphi}) = \varphi^{-1}(a_{\varphi})$; as A is connected, $\varphi^{-1}(a_{\varphi})$ is the same element $a_0$ for any $\varphi \in A$, and we have $\varphi(a_0) = a_{\varphi}$. Now $a_0$ is a generic element of X, and $a_0^A \in Y$. As U is existentially closed, $\{x \in X(U) : x^A \in Y\}$ is Zariski dense in X.


3. Abelian groups of finite S1-rank

We include here some of the general theory of groups of finite S1-rank. Some of this material appeared previously in an unpublished preprint PAC (On PAC and related structures), and some was introduced in [8]. We refer to Chap. 7 of [5].
In these lemmas, all structures are assumed to be of finite S1-rank; there are no other assumptions. To be precise, we assume that we work in a universal domain U, with a map rk on nonempty definable sets, into the nonnegative integers, with the following properties. (Set $\mathrm{rk}(\emptyset) = -\infty$.)
1. Suppose $f : D \to E$ is a definable map. Let
$$E_i = \{e \in E : \mathrm{rk}(f^{-1}(e)) = i\}.$$
Then $E_i$ is definable, and $\mathrm{rk}(D) = \max_{i \in \mathbb{N}} \{\mathrm{rk}(E_i) + i\}$.
2. Suppose $R \subseteq (D \times E)$ is definable, and let $D_e = \{d \in D : (d, e) \in R\}$. Assume $e_i \in E$ for $i = 1, 2, \ldots$; $\mathrm{rk}(D_e) = k$ for $e \in E$, and $\mathrm{rk}(D_{e_i} \cap D_{e_j}) < k$ for $i \ne j$. Then $\mathrm{rk}(D) > k$.
We will write $\mathrm{rk}(Q)$ for $\inf\{\mathrm{rk}(D) : Q \subseteq D\}$ if Q is an $\infty$-definable set. $\mathrm{rk}(a/B)$ denotes the rank of the type of $a$ over B. Note that $\mathrm{rk}(Q)$ is the supremum of $\mathrm{rk}(q)$ over all complete types $q$ extending Q.
Remark 3.0.4. (1) These properties are satisfied by transformal degree in difference fields.
(2) If a rank function with the above properties exists, then one can find a smallest possible rank function satisfying the second property. However the first, definability property need no longer hold. (This indeed occurs in difference fields.)
(3) Define two elements $a, b$ to be independent over a substructure C if $\mathrm{rk}(a/bC) = \mathrm{rk}(a/C)$. This is symmetric and transitive, by the properties of rank. One can show (cf. [5], Chap. 7) that the Independence Theorem holds:
Assume given substructures $B, A_1, A_2, A_3, A_{12}, A_{13}, A_{23}$, with $B, A_1, A_2, A_3$ algebraically closed, and maps:
$g_i : B \to A_i$ and $g_{ij} : B \to A_{ij}$, and
$h_{ij} : A_i \to A_{ij}$, $h_{ji} : A_j \to A_{ij}$ ($i < j$), with $g_{ij} = h_{ij} g_i$ for $i \ne j$.
If $h_{ij}(A_i)$, $h_{ji}(A_j)$ are independent over $g_{ij}(B)$ for $i < j$, there exist embeddings $f_{ij} : A_{ij} \to U$ ($i < j$) with
$$f_{ij} h_{ij} = f_{ik} h_{ik}.$$
Further, the $f_{ij} h_{ij}(A_i)$ are independent over the image of B.
(4) On the other hand, if $a, b$ are not independent over C, $q(x, y) = \mathrm{tp}(a, b/C)$, and $b_1, b_2, \ldots$ are independent realizations of $\mathrm{tp}(b/C)$ over C, then one can show that $\bigwedge_i q(x, b_i)$ is inconsistent. It follows from this and the independence theorem that independence in this sense coincides with independence in the sense of simple theories [17, 30].

3.1. Commensurable families of groups

A point of a definable set D of rank $d$ is said to be generic over a base set B if it does not lie in any B-definable set of rank $< d$.
Definition 3.1.1. Two definable subgroups A, B of a group G are commensurable if they share a common subgroup of finite index in both. Commensurability is an equivalence relation, denoted $A \sim B$. We also write $A \lesssim B$ if $B \cap A$ has finite index in A. If $H, H'$ are each the intersection of a bounded family of definable subgroups, we say $H, H'$ are commensurable if one can write $H = \bigcap H_i$, $H' = \bigcap H'_i$, with all the $H_i$ and $H'_i$ commensurable; equivalently, $H \cap H'$ has bounded index in $H$ and in $H'$.
An $\infty$-definable set (over C) is an intersection of C-definable sets. A generic (over C) element of an $\infty$-definable set Y (over C) is an element $b \in Y$ such that $b$ is not in any C-definable set of smaller rank than Y.
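Commensurability in the sense of Definition 3.1.1 can be tried out in the simplest possible setting, the subgroups $n\mathbb{Z}$ of the integers. This is only a toy sketch: a subgroup is encoded by its nonnegative generator `n` (with `0` for the trivial subgroup), and the helper names `intersection_generator`, `index_in` and `commensurable` are hypothetical, introduced for this illustration.

```python
from math import gcd


def intersection_generator(n, m):
    """Generator of nZ ∩ mZ, i.e. lcm(n, m); 0 encodes the trivial subgroup."""
    if n == 0 or m == 0:
        return 0
    return n * m // gcd(n, m)


def index_in(sub, n):
    """Index of sub*Z inside n*Z (assuming sub*Z ⊆ n*Z), or None if it is infinite."""
    if n == 0:
        return 1        # the trivial subgroup has index 1 in itself
    if sub == 0:
        return None     # the trivial subgroup has infinite index in nZ, n > 0
    return sub // n


def commensurable(n, m):
    """True if nZ ∩ mZ has finite index in both nZ and mZ."""
    k = intersection_generator(n, m)
    return index_in(k, n) is not None and index_in(k, m) is not None
```

Here `commensurable(4, 6)` holds, because $4\mathbb{Z} \cap 6\mathbb{Z} = 12\mathbb{Z}$ has index 3 in $4\mathbb{Z}$ and index 2 in $6\mathbb{Z}$, whereas `commensurable(0, 5)` fails, since the trivial subgroup has infinite index in $5\mathbb{Z}$.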
Lemma 3.1.2. Let Q be the solution set of a complete type over an algebraically closed set C. Let G be a C-$\infty$-definable group. Let $H \subseteq Q \times G$ be $\infty$-definable, such that $H(a)$ is a subgroup of G, for $a \in Q$. Suppose all the $H(a)$ are commensurable. Then there exists a C-$\infty$-definable subgroup H of G, commensurable with each $H(a)$. Moreover,
(i) For independent $a, b \in Q$, $H(a) \cap H(b) \subseteq H$.
(ii) If $h \in H$ is generic, then there exists an element $a$ generic over $h$, with $h \in H(a)$.
Proof. Let $B = \{g \in G : \text{for some } a \in Q,\ g, a \text{ independent},\ g \in H(a)\}$. This is an $\infty$-definable subset of G. An easy application of the independence theorem shows that if $b_1, b_2$ are independent elements of B, then $b_1 b_2^{-1} \in B$. Thus $H = BB$ is a subgroup of G, and a generic element of $BB$ is in B. It follows that $\mathrm{rk}(H) \le \mathrm{rk}(B) \le \mathrm{rk}(H(a))$. Now let $a, b$ be independent elements of Q (over C). We have $\mathrm{rk}(H(a)) = \mathrm{rk}(H(b)) = \mathrm{rk}(H(a) \cap H(b))$. Pick $c \in H(a) \cap H(b)$ with $\mathrm{rk}(c/C \cup \{a, b\}) = \mathrm{rk}(H(b))$. Then $\mathrm{rk}(c/Cab) \ge \mathrm{rk}(H(b)) \ge \mathrm{rk}(c/Cb)$, so $c$ is independent from $a$ over $bC$. Since also $a, b$ are independent over C, $c$ and $a$ are independent over C. Thus by definition $c \in BB$. This shows that a generic element of $H(a) \cap H(b)$ is in $BB$; hence $H(a) \cap H(b) \subseteq BB$. Since $H(a)$, $H(b)$ are commensurable, and $\mathrm{rk}(H) \le \mathrm{rk}(H(a) \cap H(b))$, it follows that all these groups are commensurable.
Remark. The existence of H, and (i), are valid for partial types too.
If one assumes in addition that G and H are definable, one obtains a definable H commensurable with each $H(a)$, and satisfying (i).
(Any $\infty$-definable group in a finite S1-rank theory is an intersection of definable groups (over the same set); see [5, Chap. 7].)
Lemma 3.1.3. Let Q be a possibly incomplete type over a set C. Let G be a C-definable group. Let $H \subseteq Q \times G$ be $\infty$-definable, such that $H(a)$ is a subgroup of G, for $a \in Q$. Suppose all the $H(a)$ are commensurable, and further that all the conjugates of the $H(a)$ are commensurable among themselves. Then there exists a C-$\infty$-definable normal subgroup H of G, commensurable with each $H(a)$. If $h \in H$ is generic, then there exists an element $a$ generic over $h$, with $h \in H(a)$. If H is definable, we can take H to be definable.
Proof. Assume $C = \emptyset$ for convenience. We may assume here that Q is complete. Let $Q_0$ be a 0-definable set containing Q. Let $G_0$ be a copy of G. Let G act on itself by conjugation, on $G_0$ by translation, and on $Q_0$ trivially; call this action on $R = G \cup G_0 \cup Q_0$, as well as the induced action on the powers of R, $\rho$. Let $R'$ be the reduct with universe R, and language $L'$ consisting of the L-0-definable relations that are also $\rho$-equivariant. Observe that $H' \subseteq G \times G_0 \times Q$ defined by: $(a, e, c) \in H'$ iff $e^{-1} a e \in H(c)$ is in $L'$. Also the group structure on G is in $L'$. Similarly, any L-definable subset of R is $L'$-definable with parameters, so $R'$ has the same S1-rank as L, and it is $L'$-definable. By Lemma 3.1.2 applied to $(R', L')$, there exists in this reduct a 0-$\infty$-definable group H commensurable with each $H(a)$. Being 0-definable in $L'$, it must be respected by the action $\rho$ on G, so it is normal. By the remark following Lemma 3.1.2, we can take H to be definable if each $H(a)$ is definable.
Remark. Q could be a partial $*$-type, i.e. a type in infinitely many variables.
3.2. Indecomposability lemma
Lemma 3.2.1. Let U be the universal domain of a theory of finite S1-rank, C a countable elementary submodel. Let G be a C-definable group, P the set of realizations in U of a generic type of G over C. Let $X_2$ be the set of products $gh^{-1}$, with $g, h$ independent elements of P. Then the set of products $\{ef : e, f \in X_2\}$ is an $\infty$-definable subgroup of G of bounded index (intersection of boundedly many definable subgroups of finite index). Hence there exists a bounded set R such that $RPPPP = G$.
The notation $G(C)$ means the set of elements of G definable over C. The last sentence of the lemma states that any element of G is the product of an element of R and four elements of P. The lemma can easily but tediously be proved directly, but follows more pleasantly from the theory of stabilizers, and we refer to [5] in lieu of a proof.
The following lemma, originating in the theory of algebraic groups as the indecomposability theorem, was lifted by Zilber to the finite Morley rank context. The key is the development of a theory of stabilizers valid there. The existence of a (different) theory of stabilizers makes the lemma valid also in the finite S1-rank context, cf. [5, Chap. 7].

Lemma 3.2.2. Let G be a group definable in a structure of finite S1 rank. Let Y be the set of solutions of a complete type over an algebraically closed set. Then for some $n$, $(YY^{-1})^n$ is a subgroup of G.

Lemma 3.2.3. Let G be a group definable in a structure of finite S1 rank. For any definable $Y \subseteq G$ there exist definable subgroups $H_i$ of G and finitely many cosets $C_i$ of $H_i$ such that:
1. $Y \subseteq \bigcup_i C_i$.
2. For some $n$, every element of $H_i$ has the form
$$\prod_{1 \le j \le 2n} a_j^{(-1)^j}$$
with $a_j \in (Y \cap C_i)$.

3.3. Internal groups

Definition 3.3.1. A definable set D is stably embedded if every definable subset of D, with parameters possibly outside D, is definable also with parameters from D.
This notion seems to have arisen first in [8]. See [5] for its basic properties. Note here that in a saturated model U, a 0-definable set D, endowed with the induced structure, is stably embedded iff the restriction map $\mathrm{Aut}(U) \to \mathrm{Aut}(D)$ is surjective. (Given stable embeddedness, surjectivity can be shown by a back-and-forth construction using domains of the form $D \cup X$, X small. Conversely, if the map is surjective, every U-definable subset of D has at most $|U|$ conjugates under $\mathrm{Aut}(D)$; it follows using a standard criterion of Joyal and Shelah that it is D-definable with parameters.)
Definition 3.3.2. A definable set E is internal to a definable set D if there exists a definable (with parameters) surjective map $h : D_1 \to E$, where $D_1 \subseteq D^k$ is definable.
We will use this when D is stably embedded; in this case it is equivalent to: $\sim_{CD}$ is trivial, for some finite (or countable) C, where:
Definition 3.3.3. Let D be a C-definable set. Write $a \sim_{CD} a'$ if $\mathrm{tp}(a/C \cup D) = \mathrm{tp}(a'/C \cup D)$.
In the difference field context, D will usually be the fixed field. When E is an $\infty$-definable set, we will say that E is D-internal if every definable quotient of E is D-internal; in particular, if K is an intersection of a descending sequence of definable subgroups $K_i$ of a definable group G, we will say that $G/K$ is D-internal to mean: $G/K_i$ is D-internal, for each $i$.

The finite case:

Lemma 3.3.4. Let G be a 0-definable group, D a stably embedded 0-definable set, C a finite or countable set, $p$ a generic type of G over C, P the set of realizations of $p$ in the universal domain. If $a \in P$, let
$$E(a) = \{a' : \mathrm{tp}(a/C \cup D) = \mathrm{tp}(a'/C \cup D)\}$$
and suppose $E(a)$ is finite for $a \in P$. Then there exists a finite normal subgroup N of G with $G/N$ D-internal. Moreover, for any $a \in P$ and $n \in N$, $na \in P$ and
$$\mathrm{tp}(na/C \cup D) = \mathrm{tp}(a/C \cup D).$$

Proof. Enlarging C, and replacing $p$ by a complete type over the new base, preserves the hypothesis. Hence we may assume that C is algebraically closed, and that any subgroup of G of finite index, definable over C, has a set of coset representatives in $G(C) = \{a \in G : a \in \mathrm{dcl}(C)\}$. Then by assumption, $E(a)$ is finite, of size $k$ say. Replace C and $p$ by another set and type, in such a way that $k$ is least possible.
Let
$$K = \{a \in G(C) : \text{for some } b \in P,\ ab \in P, \text{ and } \mathrm{tp}(b/C \cup D) = \mathrm{tp}(ab/C \cup D)\}.$$
Since P is the solution set of a complete type, one can equally well say "for all $b \in P$" in the definition of K. It follows that K forms a subgroup of G. By assumption, for $b \in P$, only finitely many $b' \in P$ have the same type over $C \cup D$ as $b$; hence $Kb$ is finite, so K is finite.
Note that if $a \in G$, and for some $b \in P$ with $a, b$ independent over C, $\mathrm{tp}(b/C \cup D) = \mathrm{tp}(ab/C \cup D)$, then $a \in K$. This is because $ab \in E(b) \subseteq \mathrm{acl}(C \cup \{b\})$, so $a \in \mathrm{acl}(C \cup \{b, ab\}) = \mathrm{acl}(C \cup \{b\})$; as $a, b$ are C-independent, $a \in \mathrm{acl}(C) = C$; and hence $a \in K$.
Claim. For $a \in P$, $E(a) = Ka$.
Proof. $Ka \subseteq E(a)$ by definition. Let $a, d$ be independent elements of P over C, $a = db^{-1}$. Let $a' \in E(a)$; let $c = a' a^{-1}$; we will show that $c \in K$. Note that $a' \in \mathrm{acl}(C \cup \{a, a'\}) = \mathrm{acl}(C \cup \{a\})$; and that $a, b$ are C-independent, by genericity of P.
We have
$$\mathrm{tp}(a/C \cup \{b\} \cup D) = \mathrm{tp}(a'/C \cup \{b\} \cup D)$$
for otherwise, replacing C by a bigger set $C'$ containing $b$, and replacing $p$ by $\mathrm{tp}(a/C')$, we will decrease $k$, contradicting the minimal choice of $k$. Thus the two types are equal, and
$$\mathrm{tp}(ab/(C \cup D)) = \mathrm{tp}(a'b/(C \cup D)).$$
Thus $a'b \in \mathrm{acl}(C \cup \{ab\})$, and so $c = a' a^{-1} = (a'b)(ab)^{-1} \in \mathrm{acl}(C \cup \{ab\})$. We also have $c \in \mathrm{acl}(C \cup \{a\})$, and $ab, a$ are independent over C. Thus $c \in \mathrm{acl}(C) = C$. Now by the definition of K it follows that $c \in K$.

Similarly, we may show that K has a normalizer of finite index. Let $a, a'$ be independent generic elements of P, and let $b = a' a^{-1}$. We will show that $b$ normalizes K. Let $c \in K$. Then
$$\mathrm{tp}(a/(C \cup \{b\} \cup D)) = \mathrm{tp}(ca/(C \cup \{b\} \cup D))$$
as in the previous claim. Thus,
$$\mathrm{tp}(ba/(C \cup D)) = \mathrm{tp}(bca/(C \cup D)).$$
So
$$\mathrm{tp}(a'/(C \cup D)) = \mathrm{tp}(bcb^{-1}a'/(C \cup D))$$
and hence $bcb^{-1} \in K$. Now by Lemma 3.2.1, it follows that the normalizer $N(K)$ contains a subgroup of G of finite index.
Now P is divided into finitely many cosets of $N(K)$; so there exists a translate $gP$ of P such that $gP \cap N(K)$ has rank equal to that of G. $N(K)$ has a set of coset representatives in $G(C)$, so we may take $g \in G(C)$. Let $P' = gP \cap N(K)$. If $a \in P'$, in particular $a \in gP$, so every conjugate of $a$ over $C \cup D$ is in $Ka$; hence the image $\bar{a}$ of $a$ in $\bar{G} = (N(K)/K)$ has no proper conjugates over $C \cup D$. Since D is stably embedded, $\bar{a} \in \mathrm{dcl}(C \cup D)$. Let $\bar{P}$ be the set of elements of $\bar{G}$ with the same type over C as $\bar{a}$. Then $\bar{P} \subseteq \mathrm{dcl}(C \cup D)$. By Lemma 3.2.1, there exists a finite subset $\bar{R}$ of $\bar{G}(C)$ such that $\bar{R}(\bar{P})^4 = \bar{G}$. Thus every element of $\bar{G}$ is in $\mathrm{dcl}(C \cup D)$.
However, we wanted this conclusion for a quotient of G itself. Let $G_1$ be the intersection of all conjugates of $N(K)$ in G; since $N(K)$ has finite index in G, so does $G_1$. Let N be the intersection of all conjugates of K; since K is finite, it is a finite intersection. By the assumption on C, we have
$$N = \bigcap_{i=1,\ldots,m} g_i^{-1} K g_i, \qquad g_1, \ldots, g_m \in G(C).$$
We have $(g_i^{-1}(N(K))g_i)/(g_i^{-1} K g_i) \subseteq \mathrm{dcl}(C \cup D)$, so $G_1/(g_i^{-1} K g_i) \subseteq \mathrm{dcl}(C \cup D)$ for each $i$, and it follows that $G_1/N \subseteq \mathrm{dcl}(C \cup D)$. But N is normal in G, and there exists in $(G/N)(C)$ a set of coset representatives for $G_1/N$; thus $G/N \subseteq \mathrm{dcl}(C \cup D)$.
Remark 3.3.5. In Lemma 3.3.4, one can find a 0-definable finite N normal in G, with $G/N$ D-internal.

Proof. Let $N' = \bigcap \{\sigma(N) : \sigma \in \mathrm{Aut}(U)\}$, where U is the universal domain. Then $N' \subseteq N$ so $N'$ is finite. It is $\mathrm{Aut}(U)$-invariant, hence is 0-definable. By finiteness, $N' = \bigcap_{i=1,\ldots,m} \sigma_i(N)$ for some $\sigma_1, \ldots, \sigma_m \in \mathrm{Aut}(U)$. Then $G/(\sigma_i(N))$ is D-internal, and $G/N'$ embeds definably into the product of these groups.
Internalizing groups (General case): Recall that $a \sim_{CD} b$ iff $a, b$ have the same type over $C \cup D$. Observe that $\sim_{CD}$ is $\infty$-definable: $a \sim_{CD} a'$ iff for every formula $\psi(x, y_1, \ldots, y_r)$ over C, $E_{\psi}(a, a')$:
$$(\forall y_1) \cdots (\forall y_r) \Big( \bigwedge_i D(y_i) \rightarrow \big(\psi(a, y_1, \ldots, y_r) \leftrightarrow \psi(a', y_1, \ldots, y_r)\big) \Big).$$
Indeed this shows that $\sim_{CD}$ is an intersection of $\aleph_0 + |C|$ definable equivalence relations, the $E_{\psi}$ written above. We define $a/{\sim_{CD}}$ to be $((a/E_{\psi}) : \psi \in L(C))$. Note that $a \sim_{CD} b$ iff $a/E_{\psi} = b/E_{\psi}$ for each such $\psi$.
Proposition 3.3.1. Let G be a 0-definable group. Let D be a stably embedded, 0-definable set. Then there exists an $\infty$-definable (over $\emptyset$) normal subgroup K of G, such that: $G/K$ is D-internal; and such that if $a \in K$ is generic, then for some generic $b \in G$, $b \sim_D ab$.
Corollary 3.3.6. Let G be a 0-definable group. Let D be a stably embedded, 0-definable set. Suppose every element of G is algebraic over D. Then $G/K$ is D-internal for some finite normal subgroup K.
Proof. This is a special case of Lemma 3.3.4.
Remark 3.3.7. More generally, if D is any 0-definable set, there exists an $\infty$-definable normal subgroup K of G, such that: for some C, $\sim_{CD}$ is trivial on $G/K$; and for any C, for generic $a \in K$, for some generic $b \in G$, $b \sim_{CD} ab$.
Remark 3.3.8. Still more generally, one can prove an analog for any definable equivalence relation E on G; the lemma will give a normal subgroup K, such that the kernel of $G \to (G/K)$ is the generic intersection of all the translates of E.
Proof of Proposition 3.3.1. Consider pairs $(p, C)$ with C the algebraic closure of a finite set (or a set of bounded size), and $p$ a complete generic type of G over C. P will denote the set of realizations of $p$. Let $k(p, C) = k$ be the maximal integer such that for some $a, a' \in P$, $a \sim_{CD} a'$ and $\mathrm{rk}(a'/C \cup \{a\}) = k$. (Then for any $a \in P$, for some $a'$, $a \sim_{CD} a'$ and $\mathrm{rk}(a'/C \cup \{a\}) = k$.) Let $k_0$ be the least possible integer of this form; we will consider only pairs $(p, C)$ with $k(p, C) = k = k_0$.
Let $K(p, C)$ be the set of elements $a \in G$ such that for some generic realization $c$ of $p$ over $C \cup \{a\}$, $ac \sim_{CD} c$. Then $K(p, C)$ is $\infty$-definable over C. Let $\bar{K}(p, C) = K(p, C)K(p, C)$.
Claim 1. $\bar{K} = K(p, C)K(p, C)$ is a subgroup of G. $\mathrm{rk}(\bar{K} \setminus K(p, C)) < \mathrm{rk}(\bar{K})$.
Proof. It suffices to show that if $a, b$ are independent elements of K, then $ab^{-1} \in K$. Let $e \in P$ be such that $ae \sim_{CD} e$ and $be \sim_{CD} e$; this is possible simultaneously by the independence theorem. Then $e' = ae \in P$, $e'$ is generic over $C \cup \{a, b\}$, and $ba^{-1}e' = be \sim_{CD} e \sim_{CD} e'$; so $ba^{-1} \in K$.
Claim 2. $\mathrm{rk}(K) = k_0$.


Proof. Let $(a, d)$ be mutually generic realizations of $p$, $b = a^{-1}d$. Let $C' = C \cup \{b\}$, and let $p' = \mathrm{tp}(a/C')$. Pick $a'$ with $a \sim_{C'D} a'$, and $\mathrm{rk}(a'/C' \cup \{a\}) = k(p', C') \ge k_0$. So $\mathrm{rk}(a, a'/C') \ge k_0 + \mathrm{rk}(p)$. Now clearly $a \sim_{CD} a'$, so $\mathrm{rk}(a, a'/C) \le \mathrm{rk}(p) + k_0$. Thus $(a, a')$ is independent from $b$ over C; hence also $b$ is generic over $C \cup \{a, a'\}$. Let $c = a' a^{-1}$. Then $c$ is independent from $ab$ over C.
Since $\mathrm{tp}(a'/C' \cup D) = \mathrm{tp}(a/C' \cup D)$ and $b \in C'$, we also have $\mathrm{tp}(a'b/C' \cup D) = \mathrm{tp}(ab/C' \cup D)$, and in particular $ab \sim_{CD} a'b$. But $a'b = cab$. Thus $c \in K$.
It follows that $\mathrm{rk}(K) \ge \mathrm{rk}(c/C) \ge \mathrm{rk}(a'/C \cup \{a\}) = k_0$. Conversely, if $c' \in K(p, C)$ then for some generic $a$, $c'a \sim_{CD} a$, so $\mathrm{rk}(c'/a) = \mathrm{rk}(c'a/a) \le k_0$.
Claim 3. All $\bar{K}(p, C)$ are commensurable.
Proof. Recall that we are referring only to $K(p, C)$ with $k(p, C) = k_0$. If we extend C to $C'$ and $p$ to a generic type $p'$ over $C'$, it is clear that $\bar{K}(p', C') \subseteq \bar{K}(p, C)$; by the minimality of $k_0$, they must be commensurable. If $p, q$ are two generic types over C, let $b, c$ be independent elements over C realizing $p, q$. Let $d = b^{-1}c$, and let $C' = C \cup \{d\}$, and $p' = \mathrm{tp}(b/C')$. Now $K(p', C') \subseteq K(q, C)$: to see this let $a \in K(p', C')$. Then $\mathrm{tp}(ab'/C' \cup D) = \mathrm{tp}(b'/C' \cup D)$ for some $b' \in P'$; we may assume $b' = b$. Since $d \in C'$, $abd \sim_{CD} bd$; i.e. $ac \sim_{CD} c$; showing that $a \in K(q, C)$.
Claim 4. If $K = K(p, C)$ and $K' = e^{-1}Ke$ is a conjugate of K, then $K, K'$ are commensurable.
Proof. After increasing C (using the previous claim), we may assume $e \in C$. Let $b \in P$, $c = e^{-1}b$, $q = \mathrm{tp}(c/C)$. We show $K(q, C) \subseteq K'$, so that commensurability follows from the equality of ranks. Let $a \in K(q, C)$. Then we may assume $ac \sim_{CD} c$. Since $e \in C$, it follows that $eac \sim_{CD} ec$. In other words, $(eae^{-1})b \sim_{CD} b$. Thus $eae^{-1} \in K$, so $a \in K'$.
Let C continue to range over sets of the type considered above. By Lemma 3.1.3, there exists a (0-)$\infty$-definable normal subgroup K of G, commensurable with all the $\bar{K}(p, C)$, and such that a generic element of K lies in some generic $\bar{K}(p, C)$ ($a$, C are independent). Moreover (using also the proof of the previous claim), if $a \in \bar{K}(p, C)$ for any $p$ and any generic C, then $a \in K$. It remains only to show that $G/K$ is D-internal. Write $K = \bigcap_j K_j$, where the $K_j$ are definable groups, commensurable with each other. Let $G_j = G/K_j$, and let $\pi_j : G \to G_j$ be the quotient map. Let $P_j$ be the image of P under $\pi_j$.
Claim 5. Let $\sim_{CD}(a)$ be the $\sim_{CD}$-class of $a$. Then $\sim_{CD}(a)$ is contained in finitely many cosets of each $K_j$.
Proof. Some generic $c \in K(p, C)$ satisfies $ca \in {\sim_{CD}}(a)$, so $K(p, C)a \cap {\sim_{CD}}(a)$ has rank $k_0$. Since $\bar{K}(p, C)$ is $\infty$-definable over C, and all elements of $\sim_{CD}(a)$ realize the same type over C, all cosets of $\bar{K}(p, C)$ intersecting $\sim_{CD}(a)$ must intersect it in a set of the same rank $k_0$. Since also $\mathrm{rk}({\sim_{CD}}(a)) = k_0$, $\sim_{CD}(a)$ is contained in finitely many cosets of $\bar{K}(p, C)$. Since $\bar{K}(p, C)$ and K are commensurable, the same is true for K.
Claim 6. If $a', b' \in P_j$ and $a' \sim_{CD} b'$, then there exist $a, b \in P$, $a \sim_{CD} b$, with $\pi_j(a) = a'$, $\pi_j(b) = b'$. (And $a$ may be chosen arbitrarily, with $\pi_j(a) = a'$.)
Proof. Since D is stably embedded, there exists an automorphism $\sigma$ of U fixing $C \cup D$ with $\sigma(a') = b'$. Pick any $a \in P$ with $\pi_j(a) = a'$, and let $b = \sigma(a)$.
It follows from the last two claims that $\sim_{CD}$ has finite classes on $G_j$. By Lemma 3.3.4, there exists a finite normal subgroup N of $G_j$ such that $G_j/N$ is D-internal, and
for any $a \in P$ and $n \in N$, $\mathrm{tp}(na/C \cup D) = \mathrm{tp}(a/C \cup D)$.
I claim that $N = 1$. Let $n' \in N$, and lift it to $n \in G$. We must show that $n \in K$. For this it suffices to show that $n \in K(p, C)$ for generic C. This follows from Claim 6.
Thus $N = 1$ so $G_j$ is D-internal.
3.4. Stability and modularity
We introduce here one of our central notions, of a locally modular group. In ordinary difference fields of characteristic 0, these groups will be stable and stably embedded. It is misleadingly easy to define modularity in group-theoretic terms, but the more abstract point of view of stability is better suited to analyze the relation of such a group to its environment (or of two such groups), a relation that need not a priori be group-theoretic. We start with a property of independence in stable theories.
Lemma 3.4.1. If $a, b$ are independent over $\mathrm{acl}(C \cup \{a\}) \cap \mathrm{acl}(C \cup \{b\})$, and $(a, b)$ is independent from C over $E \subseteq C$, then $a, b$ are independent over $\mathrm{acl}(E \cup \{a\}) \cap \mathrm{acl}(E \cup \{b\})$.
Proof. Let $E' = \mathrm{acl}(E \cup \{a\}) \cap \mathrm{acl}(E \cup \{b\})$, $C' = \mathrm{acl}(C \cup \{a\}) \cap \mathrm{acl}(C \cup \{b\})$. It suffices to show that $a, b$ are independent from $C'$ over $E'$. Note that $a$ is independent from $C' \cup \{b\}$ over $E \cup \{b\}$, since $\mathrm{acl}(C' \cup \{b\}) = \mathrm{acl}(C \cup \{b\})$. Thus $(a, b)$ is independent from $C'$ over $E \cup \{b\}$. Similarly $(a, b)$ is independent from $C'$ over $E \cup \{a\}$. It follows that the canonical base of $\mathrm{tp}(ab/C')$ is contained in $\mathrm{acl}(E \cup \{a\})$ and in $\mathrm{acl}(E \cup \{b\})$, hence in their intersection $E'$.
Definition 3.4.2. A theory T is called 1-based if it is stable, and in any model M of T, any two algebraically closed substructures of $M^{eq}$ are independent over their intersection. A structure is 1-based if its theory is 1-based.
An equivalent condition: A saturated stable structure M is 1-based iff the lattice of algebraically closed substructures of $M^{eq}$ (including imaginary elements) satisfies the modular law: if $X \subseteq Z$, then $\mathrm{acl}(X \cup Y) \cap Z = \mathrm{acl}(X \cup (Y \cap Z))$. Such structures are also called locally modular. It would be better to call them modular, but for historical reasons, this word is reserved for locally modular structures satisfying an additional condition of a weak elimination of imaginaries.
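The modular law can be checked by brute force in a toy analogue. The sketch below works in the finite vector space $\mathbb{F}_2^3$, with linear span playing the role of acl and the lattice of subspaces standing in for the lattice of algebraically closed sets; this substitution is an assumption made for the sake of illustration, not a construction from the text. Vectors are encoded as 3-bit integers and addition is XOR; the names `span`, `SUBSPACES` and `modular_law_holds` are hypothetical.

```python
from itertools import combinations

N = 3  # work in the vector space F_2^3, vectors encoded as 3-bit integers


def span(generators):
    """Linear span over F_2: start from {0} and adjoin each generator in turn."""
    closed = {0}
    for g in generators:
        # closed is a subspace, so closed ∪ (closed + g) is the span of closed and g
        closed |= {v ^ g for v in closed}
    return frozenset(closed)


# Every subspace of F_2^3 is spanned by at most 3 vectors.
SUBSPACES = {span(c) for k in range(N + 1) for c in combinations(range(1 << N), k)}


def modular_law_holds():
    """Check span(X ∪ Y) ∩ Z == span(X ∪ (Y ∩ Z)) for all subspaces with X ⊆ Z."""
    return all(
        span(X | Y) & Z == span(X | (Y & Z))
        for X in SUBSPACES
        for Z in SUBSPACES
        if X <= Z
        for Y in SUBSPACES
    )
```

The check runs over all 16 subspaces of $\mathbb{F}_2^3$ and confirms the law in this small case; the hypothesis $X \subseteq Z$ matters, since the identity can fail for arbitrary triples.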
Corollary 3.4.3. Suppose M is a saturated stable structure, and the result $M'$ of naming constants in M for elements of the countable set C is 1-based. Then M is 1-based.
Proof. Let A, B be algebraically closed subsets of M. We may assume $A \cup B$ is independent from C, by moving them by an appropriate automorphism. But then the result follows from Lemma 3.4.1.
Definition 3.4.4. We will say that a definable group B is LMS (stable, stably embedded, locally modular) if every definable subset of $B^n$ (with parameters possibly outside B) is a finite Boolean combination of cosets of definable subgroups of $B^n$.
Remark 3.4.5. By Hrushovski and Pillay [15], the condition LMS for a stable group B is equivalent to 1-basedness of the induced structure. By quantifier elimination for Abelian structures (Baur, Ziegler), an Abelian group G with extra structure generated by subgroups of $G^n$, for various $n$, is always 1-based.
It is also shown in [15] that there are no infinite definable families of distinct definable subgroups of a 1-based group.
Proposition 3.4.1. (1) Let $0 \to A \to B \to C \to 0$ be an exact sequence of definable Abelian groups and homomorphisms, in a structure W. Then B is LMS iff A and C are LMS.
(2) More generally, let $g : B \to C$ be a surjective definable map in a saturated structure W. Then B is stable, stably embedded, and 1-based iff these three properties hold for C and for every fiber of $g$.
(3) In particular, the union of two 1-based, stably embedded sets again enjoys this property.
Remark 3.4.6. If A is stable, stably embedded and 1-based, then it remains so after a parameter is added; hence every coset of A in B is also stable, stably embedded and 1-based. Thus the first statement is indeed a special case of the second one.
Remark 3.4.7. The proof is an instance of the use of second-order characterizations in model theory. The number of types is a global property of the family of all definable sets; perhaps the simplest one. It is much more amenable to devissage (i.e. reductions as in Proposition 3.4.1) than the geometric description of the structure of each individual definable set. Similarly for the modularity property. We did not find a good direct proof; the difficulty occurs already in case A is finite. We offer a direct proof of a similar statement in Proposition 3.6.1 (Section 3.2); see also Lemma 3.5.11.


Proof of Proposition 3.4.1. We will use the following characterization (cf. [5], lemmas
on stable embeddability).
(∗) B is stable and stably embedded iff for any subset X of the universal domain,
of size λ = λ^{ℵ_0}, there are at most λ types of elements of B over X.
If (∗) holds for B, then it certainly holds for C and for every fiber of g, as they are
interpretable in the structure B (with a named parameter). Conversely, assume (∗) holds
for C and for every fiber of g. Then there are ≤ λ possibilities for tp(c/X), with c ∈ C.
For any given c ∈ C, there are ≤ λ possibilities for tp(a/cX), with a ∈ g^{-1}(c). Thus
there are also at most λ^2 = λ possibilities for tp(bc/X), b ∈ B, c = g(b); equivalently,
≤ λ possibilities for tp(b/X).
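The counting step can be written as one chain of inequalities. This is only a restatement of the argument above, under assumed notation that is not in the text: S_B(X) denotes the set of types over X of elements of B, λ is the cardinal bound from the characterization, and c_q is a fixed realization of the type q.

```latex
% Each type tp(b/X) with b in B determines q = tp(g(b)/X); after moving g(b)
% to the fixed realization c_q by an automorphism over X, it is determined by
% a type over c_q X of an element of the fiber g^{-1}(c_q).
\[
  |S_B(X)|
  \;\le\; \sum_{q \in S_C(X)}
      \bigl|\{\operatorname{tp}(a / c_q X) : a \in g^{-1}(c_q)\}\bigr|
  \;\le\; \lambda \cdot \lambda \;=\; \lambda ,
\]
% using |S_C(X)| \le \lambda, |c_q X| = \lambda, and \lambda^2 = \lambda
% for infinite \lambda.
```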
Now for 1-basedness.
Note first (#): if a definable set D is stably embedded in B and 1-based, and X ⊆ D,
Y ⊆ B, then X, Y are independent over acl(X) ∩ acl(Y).
Indeed with Y′ = acl(Y) ∩ D^eq, we have X, Y′ independent over (acl(X) ∩ D^eq) ∩ Y′,
while by stable embeddability we have Y independent from X ∪ Y′ over Y′, and transitivity applies.
More generally, in (#) we may take X ⊆ acl(D) instead of X ⊆ D; since then with
X′ = X ∩ D^eq, we have acl(X) = acl(X′).
In fact, (##): Suppose that for some set F independent from X and some
F-definable set D, D is stably embedded and X ⊆ acl(D ∪ F). Then for any Y,

X ⫝_{acl(X) ∩ acl(Y)} Y.

For we may assume (X, Y) is independent from F; so (#) is valid over F; and by
Proposition 3.4.1 it holds over ∅.
Moreover in the conclusion of (#) or (##), we may say X, Y are independent over
acl(acl(X) ∩ acl(Y) ∩ D^eq). For if W = acl(W) ⊆ acl(D) then W = acl(W ∩ D^eq).
It follows that if B is stable and 1-based, then so is every interpretable set D. Indeed
if X, Y are relatively algebraically closed subsets of D^eq, then they are independent over
acl(X) ∩ acl(Y) ∩ D^eq.
Let us now prove (3): Let X, Y ⊆ D_1 ∪ D_2 be relatively algebraically closed where
D_1, D_2 are 1-based, stably embedded. By naming parameters, we may assume X ∩ Y ⊆
dcl(∅). We wish to show that X, Y are independent. Let X_i = dcl(X) ∩ dcl(D_i),
Y_i = {b ∈ Y^eq : tp(b) is D_i-internal}.
Then dcl(X) = dcl(X_1 ∪ X_2) and similarly for Y. Moreover D_1 ⫝_{Y_1} Y so X_1 ⫝_{Y_1} Y; but by
1-basedness of D_1, X_1 ⫝ Y_1; so X_1 ⫝ Y. Dually, Y_2 ⫝_{X_2} X, so also Y_2 ⫝ X.

Claim. dcl(D_2) ∩ dcl(X_1 ∪ Y) ⊆ dcl(X_1 ∪ Y_2).

We prove the claim using the theory of germs of definable functions in a stable
theory; cf. [12]. In a stable theory, germs of definable functions into a definable set
D always have a canonical base e of definition with tp(e) D-internal; and there exists
a representative of the germ, defined over e. So if t ∈ dcl(X_1 ∪ Y) ∩ dcl(D_2), write
t = f(a_1, b), a_1 ∈ X_1, b ∈ Y, r = tp(a_1); then the r-germ g of f(·, b) is in dcl(b) and has
a D_2-internal type; hence g ∈ Y_2. Let F be a g-definable function agreeing with f(·, b)
on generic realizations of r; then F(a_1) = f(a_1, b) = t, so t ∈ dcl(a_1, g) ⊆ dcl(X_1 ∪ Y_2).
Remark. A parallel but more formal proof of the same Claim, using algebraic closure
throughout, may be given along the lines of Proposition 3.4.1; see the proof of (2)
below.
Note that any formula φ(u) implying D_2(u) is defined over D_2 (by stable embeddedness of D_2); so if we accept the claim, and if φ is also defined over X_1 ∪ Y, then it
is defined over X_1 ∪ Y_2. Thus tp(Y/X_1 Y_2) implies tp(Y/X_1 D_2), so

Y ⫝_{X_1 Y_2} D_2.

Resuming, we have in particular:

Y ⫝_{X_1 Y_2} X_2,

hence as X ⊆ dcl(X_1 ∪ X_2),

Y ⫝_{X_1 Y_2} X,

but Y_2 ⫝_{X_1} X so

Y ⫝_{X_1} X,

and recalling Y ⫝ X_1 we have X ⫝ Y. This finishes the proof of (3).

Now to prove (2), assume C and every fiber of g are stable, stably embedded, and
1-based. Then every finite union of fibers of g is also stably embedded, and 1-based.
We also already know that B is stable. Let X, Y be algebraically closed subsets of
B^eq. We must show that X, Y are independent over X ∩ Y. By naming parameters, we
may assume X ∩ Y ⊆ dcl(∅). We may assume X = acl(b), where b = (b_1, ..., b_n); denote
gb = (gb_1, ..., gb_n). Then gb ∈ X_0 = X ∩ C^eq. By (#), X_0, Y are independent. Let Y′ be
the canonical base of tp(b/Y). Then b ⫝_{Y′} Y, so we may assume Y = acl(Y′). Now if
b, b′, b″, ... is a sequence of independent realizations of tp(b/Y), then Y′ ⊆ acl(b, b′, ...)
by a basic result of stability (Shelah). So Y ⊆ acl(b, b′, ...). But (gb, gb′, ...) ⫝ Y. And
(b, b′, ...) lies in a (gb, gb′, ...)-definable 1-based stably embedded set (namely the union
of the fibers of g above the elements gb_i, gb′_i, ...). Thus the hypothesis and hence the
conclusion of (##) apply to Y.

Lemma 3.4.8. Suppose X is LMS:

1. If f : Y → X is a 1-1 definable map, then Y is LMS.


2. If g : X^m → Z is surjective, then Z is LMS.

3. If a definable subset X of a definable group G is LMS and X generates G, then
G is LMS.
Proof. The first two items are immediate from the definition of 1-basedness. The third
follows from the second, since by saturation, for some n, the map X^{2n} → G given by

(x_1, ..., x_{2n}) ↦ ∏_i x_i^{(−1)^i}

is surjective.
Recall that two definable sets p, q are orthogonal if for any base set B over which
p, q are defined, and any a realizing p and b realizing q, a, b are independent over
B. The term hereditarily orthogonal is sometimes used, but the distinction this reflects
will not be important for us. Orthogonality is inherited by powers. In a language to
be introduced later, the following lemma says that orthogonality of finite rank groups
implies complete orthogonality.
Lemma 3.4.9. Let G_1, G_2 be definable groups of finite rank. Suppose G_1, G_2 are
orthogonal, and at least one of them is stably embedded. Then every definable
R ⊆ G_1 × G_2 is a finite union of rectangles X_1 × X_2.
Proof. Say G_2 is stably embedded, and work over an algebraically closed base set
B. Let R ⊆ G_1 × G_2 be B-definable. Let (a_1, a_2) ∈ R. Since G_2 is stably embedded,
R(a_1) = {y ∈ G_2 : (a_1, y) ∈ R} is definable with a parameter from G_2. Taking a canonical
parameter c, we have c ∈ acl(B, a_1), hence by the hypothesis of the lemma, c ∈ acl(B) =
B. Thus R(a_1) = X_2 is B-definable. Let X_1 = {a ∈ G_1 : R(a) = X_2}. Then X_1 is a
B-definable subset of G_1, and (a_1, a_2) ∈ X_1 × X_2 ⊆ R. So R is a union of B-definable rectangles. By compactness, it is a union of finitely many.
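The combinatorial core of this proof, grouping the points a of G_1 by their fiber R(a), can be tried out on a finite relation. The sketch below is only a finite toy analogue: the function name `rectangles` and the sample relation are illustrative assumptions, and the model-theoretic content of the lemma (that each fiber is B-definable, so that compactness yields finitely many rectangles) has no counterpart in the finite setting.

```python
from collections import defaultdict


def rectangles(R):
    """Split a finite relation R (a set of pairs) into rectangles X1 x X2.

    Mirrors the proof: X2 ranges over the distinct fibers
    R(a) = {y : (a, y) in R}, and X1 = {a : R(a) = X2}.
    """
    fibers = defaultdict(set)
    for a, y in R:
        fibers[a].add(y)

    # Group first coordinates by their (hashable) fiber.
    by_fiber = defaultdict(set)
    for a, fiber in fibers.items():
        by_fiber[frozenset(fiber)].add(a)

    return [(X1, set(X2)) for X2, X1 in by_fiber.items()]


if __name__ == "__main__":
    # Hypothetical sample relation, chosen only for illustration.
    R = {(0, "x"), (0, "y"), (1, "x"), (1, "y"), (2, "z")}
    for X1, X2 in rectangles(R):
        print(sorted(X1), "x", sorted(X2))
```

The rectangles produced are pairwise disjoint and their union is exactly R, since each a contributes precisely {a} × R(a).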
Remark 3.4.10. By the structure theory we are about to prove, for definable groups of
finite rank in ordinary difference fields of characteristic 0, the assumption that at least
one of G_1, G_2 is stably embedded is unnecessary. For if G_i is not LMS, we will see
that G_i has a definable subquotient of the form H_i(k), H_i an algebraic group over the
constants. But then H_1, H_2 are not orthogonal, and hence neither are G_1, G_2.
3.5. Algebraic modularity
In generalizations to two automorphisms or to positive characteristic, stability is lost,
and different proofs and definitions must be given. Initial steps in this direction were
taken in [8]. Here we will refer to such modularity only in passing, essentially as
a convenient summary of facts about the one-automorphism case. Thus we will not
develop here the theory of modularity in simple theories.


At the level of definitions, we will use the ambient Zariski topology. (We will
use little more about the ambient field, than that every definable set is a Boolean
combination of closed ones.)
Let ZCl denote Zariski closure, and ZClk closure for the k-Zariski-topology. ZCl0 =
ZCl .
Definition 3.5.1. Let A be a set of points of an algebraic variety V′, over an algebraically closed field L. Assume A is Zariski dense in a subvariety V of V′, defined
over k ⊆ L.
Let A_Zar be the structure whose universe is V, and whose basic relations are the
closed sets ZCl(X), X ⊆ A^n; i.e. those Zariski-closed sets W ⊆ V^n such that W ∩ A^n is
Zariski dense in W.
Let A_{Zar_k} denote the structure whose universe is V, and whose basic relations are
the sets ZCl_k(X), X ⊆ A^n.
Definition 3.5.2. Let A_zar denote the structure whose universe is A, and whose basic
relations are the sets U ∩ A^m, where U is a subvariety of V^m, defined over the prime
field.
Lemma 3.5.3. Let X ⊆ A^m be arbitrary. Then ZCl(X) is definable in A_{Zar_k}, using
parameters from A. Conversely, there exists a countable k_0 ⊆ L such that if k_0 ⊆ k,
then every basic relation of A_{Zar_k} is also definable in A_Zar. Thus A_Zar has an essentially
(i.e. up to constants) countable language.
Note as a corollary that for any two sufficiently large fields k′, k″, A_{Zar_k′}, A_{Zar_k″} differ only by
constants.
Proof of Lemma 3.5.3. Let Y ⊆ V^m, X = Y ∩ A^m, Y = ZCl(X). Then as a variety Y is
defined over k(A), hence over k(a_1, ..., a_l) for some a_1, ..., a_l ∈ A. Let a = (a_1, ..., a_l),
and let W = ZCl_k({a} × X). Then W is a subvariety of V^{l+m}, and W(a) = Y. By
definition of A_{Zar_k}, W is one of its basic relations; thus Y is definable in A_{Zar_k}, with
parameters from A.
If k_0 ⊆ k, the same proof shows that an A_{Zar_k}-definable set is A_{Zar_k_0}-definable, with
parameters. Thus it suffices to prove the converse for one countable k. Take k to be
the universe of an elementary submodel of (L, +, ·, A). If U ⊆ V^m is a basic relation
of A_{Zar_k}, let X = U ∩ A^m, Y = ZCl(X), and let W be as above, so that Y = W(a) for
some a. But for any a′ ∈ k^l, if U(k) ∩ A^m ⊆ W(a′) ⊆ U then U = W(a′) (here U(k) is
the set of k-points of U). By elementarity, for any a′ ∈ L^l, if U(L) ∩ A^m ⊆ W(a′) ⊆ U
then U = W(a′). So U = W(a) = Y. Thus every relation of A_{Zar_k} is also one of A_Zar;
we have already seen the other direction.
Lemma 3.5.4. Let U be any subvariety of V^m. Then U ∩ A^m is definable in A_zar (with
parameters).


Proof. Let U′ = ZCl(U ∩ A^m). Then U ∩ A^m = U′ ∩ A^m, so we may assume U = U′.
Now U is a relation of A_Zar, so by Lemma 3.5.3, U = W(a) for some W ∈ A_{Zar_k} and
some a ∈ A^l, where we take k to be the prime field. Now W′ = W ∩ A^{l+m} is a relation
of A_zar, and W′(a) = U ∩ A^m.
Consider an expansion 𝒰 = (U, +, ·, ...) of a field.
Definition 3.5.5. Assume U is a field, with additional structure, V an algebraic variety
over U, and denote V = V(U). Let A be a subset of V. Say that A is algebraically
modular (ALM) if for every X ⊆ A^m (m = 1, 2, ...), there exists a 1-based structure
whose universe is V and one of whose m-ary relations is the Zariski closure of X.
Remark 3.5.6.
If X, X′ ⊆ A^m, then ZCl(X ∪ X′) = ZCl(X) ∪ ZCl(X′). Thus in the above definition,
one may take finitely many X at once. It is thus equivalent to say that A_Zar is
1-based.
If V is an algebraic group and A is a subgroup, the definition agrees with Definition 3.5.7 below. Also, it agrees with: Th(A_zar) is 1-based.
If A is ALM, then so is every subset of A. (Trivially from Definition 3.5.5.)
If U = ZCl(X), then X ⊆ A ∩ U ⊆ U, so U = ZCl(A ∩ U). So the definition would
not change if it referred only to subsets of the form X = A ∩ V with V a subvariety
of G^m.
Definition 3.5.7. Assume U is a field, with additional structure. Let A be a subgroup
of an algebraic group G. Say that A is algebraically modular (ALM) if for every
m = 1, 2, ..., A is m-ALM: for every X ⊆ A^m, the Zariski closure of X is a finite union
of cosets of algebraic subgroups of G^m.
If A is an LMS definable group, it is ALM. For if V is a subvariety of G^m, then
V ∩ A is a finite Boolean combination of cosets. It can be written as a finite union
∪_i (C_i \ F_i), where the sets C_i are pairwise disjoint cosets of groups H_i, and F_i is
contained in a finite union of cosets of subgroups of H_i of infinite index. It follows
that the Zariski closure of V ∩ A equals the union of the Zariski closures of the C_i.
Conversely, as the following lemma shows, A is ALM iff it is LMS in (K, +, ·, A).
Lemma 3.5.8. Let A ⊆ K^n, K an algebraically closed field. Then A_zar is stably embedded in (K, +, ·, A).
Proof. Let (K′, A′) be a saturated model of Th((K, A)). It suffices to show (cf. [5])
that any automorphism f : A′ → A′ (preserving the relations of A_zar) extends to an
automorphism of K′. We may first extend f to an automorphism f_L of the field L
generated by the coordinates of the points in A′. This is possible since any relations
f(X_1, ..., X_n) = 0, f ∈ Z[X_1, ..., X_n], restricted to A′, form part of A_zar so are respected
by f. Let I be a transcendence basis for K′ over L; extend f_L ∪ Id_I to an automorphism
of L(I). Extending further to the algebraic closure, we obtain an automorphism of K′
(preserving A′).
Corollary 3.5.9 (Automatic uniformity). Let G be an algebraic group over an algebraically closed field K, and let A ⊆ G(K) be ALM. Then it is uniformly ALM: if
U_c ⊆ G is an algebraic family of subvarieties of G, then there exist finitely many
algebraic subgroups H_i of G, and an integer n, such that for any c, ZCl(U_c ∩ A) is a
union of at most n cosets of the H_i.
Proof. If not, then by compactness there exists an elementary extension (A*, K*) of
(A, K), and c ∈ K*, such that ZCl(U_c ∩ A*) is not a finite union of cosets of K-definable
subgroups of G. But any basic relation of (A*)_zar, i.e. U ∩ (A*)_zar with U defined over
the prime field, is a Boolean combination of cosets of subgroups. Hence by quantifier
elimination for Abelian structures, every definable relation of (A*)_zar is a Boolean
combination of cosets of subgroups. Now by stable embeddedness, U_c ∩ A* is definable
in (A*)_zar (with parameters), so it is a Boolean combination of cosets of subgroups. By
Hrushovski and Pillay [15], these subgroups are A-definable (indeed acl(∅)-definable
in (A*)_zar), so their Zariski closures are K-definable. So ZCl(U_c ∩ A*) is a finite union
of cosets of K-definable subgroups of G after all.
Products with ALM groups: Algebraic modularity is not in general inherited by
products. For instance, let (a_i : i = 1, 2, ...) be algebraically independent over Q. Let Γ_1
be the multiplicative group generated by the a_i. Then Γ_1 is easily seen to be ALM
(directly, or using the results of the next section, embedding it in a group of the form
σ(x) = x^2). Let b_i = a_i + 1. Then (b_i : i = 1, 2, ...) are also algebraically independent
over Q, so the group Γ_2 generated by them is ALM. However, Γ_1 × Γ_2 is not ALM;
the graph of addition by 1 is not a group subvariety of G_m^2, yet it meets Γ_1 × Γ_2 in a
Zariski dense subset.
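The failure in this example can be checked by hand. The sketch below relies on the standard description of cosets of one-dimensional algebraic subgroups of the two-dimensional torus by a single monomial equation; that description is a well-known fact recalled here, not something proved in the text, and the name C for the curve is introduced only for this illustration.

```latex
% The curve C contains the infinitely many points (a_i, b_i) of
% \Gamma_1 \times \Gamma_2, and C is irreducible of dimension 1, so
\[
  C = \{(x,y) \in \mathbb{G}_m^2 : y = x + 1\}, \qquad
  \mathrm{ZCl}\bigl(C \cap (\Gamma_1 \times \Gamma_2)\bigr) = C .
\]
% If C were a coset of an algebraic subgroup, it would satisfy
\[
  x^p y^q = c \quad \text{on } C, \qquad (p,q) \in \mathbb{Z}^2 \setminus \{(0,0)\},\ c \neq 0 .
\]
% Evaluating at the points (1,2), (2,3) of C in characteristic 0:
\[
  2^q = c, \qquad 2^p\, 3^q = c
  \;\Longrightarrow\; 3^q = 2^{\,q-p}
  \;\Longrightarrow\; q = 0,\ p = 0 ,
\]
% contradicting (p,q) \neq (0,0). Hence C is not a coset of a group subvariety.
```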
In the above example, (Γ_1 × Γ_2)_zar is superstable, with locally modular regular types.
But this is not always the case, and even when it is, without finite rank it will not
serve our purposes. We thus proceed to make a stronger statement in the context of
difference fields. It is an application of the material of this section, but requires a
preliminary remark on transformal degree.
Lemma 3.5.10. Let B be a variety, E ⊆ B × B^σ × ⋯ × B^{σ^k} a subvariety with dim(E) ≤ k.
Then H = {b ∈ B : (b, σ(b), ..., σ^k(b)) ∈ E} has transformal degree ≤ k. Conversely, if
Y is a definable set of transformal degree ≤ k, then ZCl({(y, σ(y), ..., σ^k(y)) : y ∈ Y})
has dimension ≤ k.
Proof. Say E, B are K-definable, K a difference field, and let b ∈ H. Then

tr.deg_K K(b, σ(b), ..., σ^k(b)) ≤ dim(E) ≤ k.

So tr.deg_K K(b, σ(b), ..., σ^{i−1}(b)) = tr.deg_K K(b, σ(b), ..., σ^i(b)) for some i ≤ k, and

σ^i(b) ∈ K(b, σ(b), ..., σ^{i−1}(b))^a.

Applying σ, transitivity of algebraic closure, and induction, we have

σ^j(b) ∈ K(b, σ(b), ..., σ^{i−1}(b))^a for all j ≥ i.

Thus tr.deg_K K(b, σ(b), ...) ≤ k.


Lemma 3.5.11. Let 𝒰 = (U, σ) be a universal domain for difference fields. Let H, G be
commutative algebraic groups over U. Let B be an LMS definable subgroup of G,
of finite transformal degree d. Assume V is a definable subset of H × G, whose
projection to G is finite-to-one. Let E be a subgroup of H, not necessarily definable,
such that E × σ(E) × ⋯ × σ^d(E) is 1-ALM. Let X ⊆ ((E × B) ∩ V). Then ZCl(X) is
a finite union of cosets of group subvarieties.
Proof. We will show that X is a union of finitely many pieces, each contained in an
LMS group. Since LMS groups are ALM, the lemma will follow. First:
Claim. V ∩ (E × B) is contained in a finite union of cosets of definable groups of
finite S1-rank.
Proof. Let Z = ZCl({(b, σ(b), ..., σ^d(b)) : b ∈ B}). Then dim(Z) ≤ d. Write Ṽ for

{((a_0, ..., a_d), (b_0, ..., b_d)) : (a_i, b_i) ∈ V^{σ^i} for 0 ≤ i ≤ d}.

Also write H̃ for H × H^σ × ⋯ × H^{σ^d}, G̃ for G × G^σ × ⋯ × G^{σ^d}, Ẽ for E × σ(E) × ⋯ × σ^d(E).
Let Z̃ = (H̃ × Z) ∩ Ṽ. As V → G is finite-to-one (by the assumption
on V), Z̃ → Z is finite-to-one, so dim(Z̃) ≤ dim(Z) = d. Let V(Z) be the projection
to H̃ of Z̃. So V(Z) is a constructible set, and dim V(Z) ≤ dim Z̃ = d. Since Ẽ is
1-ALM, ZCl(V(Z) ∩ Ẽ) is a finite union of cosets F_i of group subvarieties. As F_i ⊆
ZCl(V(Z)), dim(F_i) ≤ d. By Lemma 3.5.10, F̄_i = {x : (x, σ(x), ..., σ^d(x)) ∈ F_i} is a finite S1-rank
coset. It remains only to show that if (a, b) ∈ V ∩ (E × B), then a ∈ F̄_i for some
i. (So that V ∩ (E × B) ⊆ ∪_i (F̄_i × B).) Let (a, b) ∈ V ∩ (E × B). Then

((a, σ(a), ..., σ^d(a)), (b, σ(b), ..., σ^d(b))) ∈ (H̃ × Z) ∩ Ṽ.

So (a, σ(a), ..., σ^d(a)) ∈ V(Z). But a ∈ E, so (a, σ(a), ..., σ^d(a)) ∈ Ẽ. Thus (a, σ(a),
..., σ^d(a)) ∈ F_i for some i. So a ∈ F̄_i, proving the claim.
By Lemma 3.2.3, there exist a finite number of definable sets P_i and definable groups
H_i, with cosets (H_i + c_i), such that Z = ∪_{i=1}^{l} P_i, P_i ⊆ (H_i + c_i), and (P_i P_i^{-1})^{l_i} = H_i. It suffices
to consider separately each piece P_i ⊆ (H_i + c_i). We have P_i ⊆ V ∩ (E × B) ⊆ acl(B);
so H_i ⊆ acl(B). Now B is a definable LMS group of finite rank. By Proposition 4.0.4,
H_i is also LMS. Hence so are (H_i + c_i) and (H_i + c_i) × B. Since the union of these last
covers V ∩ (E × B), and hence X, the lemma is proved.

3.6. Orthogonality and domination

The following result generalizes Proposition 4.4 of [14]. We will make no stability
assumptions here of any kind. We take the structures to be saturated. We consider
Abelian groups, and write them additively.
Let P, Q be two definable sets within the same structure. The strongest orthogonality
assumption on P, Q in the definable category is that for any n, any 0-definable subset of
P^n × Q^n is a finite union of rectangles D × D′, with D ⊆ P^n and D′ ⊆ Q^n 0-definable. (It
follows that P, Q are stably embedded.) It is easily seen that this condition is inherited
by subsets and quotients under definable equivalence relations. Let us call this complete
orthogonality.
Given two completely orthogonal definable Abelian groups A, B, the induced structure
on the product A × B is known; we wish to understand the possible definable relations
on nonsplit definable extensions of B by A. Let 0 → A → G →^π B → 0 be an exact
sequence of definable groups and homomorphisms.
One possibility is a homomorphic section f : B → G of π : G → B (so that G is
definably split). One may also have a partial section f : Y → G, with Y ⊆ B. In this
case let us say that f is (affine) homomorphic if f extends to an affine homomorphism
on the coset ⟨Y⟩ generated by Y; equivalently, if there exists a subgroup H of B and
a homomorphic section h : H → G, such that f(y) − h(y) is constant on Y.
Equivalently, for m = (m_1, ..., m_n) ∈ Z^n, with Σ_i m_i = 0, let
Y(m) = {(y_1, ..., y_n) ∈ Y^n : Σ_i m_i y_i = 0}. Define

f_m : Y(m) → A

by

f_m(y_1, ..., y_n) = Σ_i m_i f(y_i).

(By applying π, one sees that f_m indeed goes into A.) Then f is affine-homomorphic
iff f_m = 0, for all m.
Say that f is almost homomorphic if f_m has finite image, for all m as above.
Equivalently (in a universal domain), there exists a countable subgroup Λ of G, such
that the composition Y → G → (G/Λ) extends to an affine homomorphism ⟨Y⟩ → (G/Λ).
If f is an almost homomorphic section, we will also say that the image S of f is
an approximately homomorphic section. Note that S determines f.
If Λ can be taken to be a finite subgroup of G, we will call S a virtually homomorphic section. Any definable subset of A, or of a coset of A, is also a definable
subset of G.
One can take sums of subsets of A, and approximately homomorphic sections of
the previous types. One can also do this modulo a subgroup of A. We thus arrive
at the following definition: If N is a definable subgroup of A, we have an induced
definable exact sequence, 0 → (A/N) → (G/N) → B → 0. If T is a definable subset of A/N, and
S ⊆ (G/N) an approximately (virtually) homomorphic section, then the pullback to G
of S + T is called an approximate (virtual) rectangle.
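The criterion in terms of the maps f_m can be tested numerically in a toy case. The sketch below is an illustration under simplifying assumptions, not the setting of the text: the exact sequence is ignored, the group is the finite cyclic group Z/N with N = 12, and the helper name `f_m_values` is hypothetical. It computes the set of values of f_m on Y(m); an affine map should give only the value 0, while a non-affine map generally does not.

```python
from itertools import product


def f_m_values(f, Y, m, modulus):
    """Return the set of values of f_m on Y(m), computed in Z/modulus.

    Y(m) = {(y_1, ..., y_n) in Y^n : sum_i m_i * y_i = 0}, and
    f_m(y_1, ..., y_n) = sum_i m_i * f(y_i).  Requires sum_i m_i = 0.
    """
    if sum(m) != 0:
        raise ValueError("the integer vector m must sum to 0")
    values = set()
    for ys in product(Y, repeat=len(m)):
        if sum(mi * yi for mi, yi in zip(m, ys)) % modulus == 0:
            values.add(sum(mi * f(yi) for mi, yi in zip(m, ys)) % modulus)
    return values


if __name__ == "__main__":
    N = 12                      # toy modulus, an assumption of this sketch
    Y = range(N)
    m = (1, -2, 1)
    affine = lambda y: (3 * y + 5) % N   # affine map: f_m vanishes on Y(m)
    square = lambda y: (y * y) % N       # not affine: f_m takes nonzero values
    print(f_m_values(affine, Y, m, N))
    print(f_m_values(square, Y, m, N))
```

In a finite group every f_m trivially has finite image, so this toy case only illustrates the affine-homomorphic condition f_m = 0, not the distinction between almost and virtually homomorphic sections.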


Proposition 3.6.1. Let 0 → A → G →^π B → 0 be an exact sequence of 0-definable
Abelian groups and maps in a given saturated structure. Assume A is stably embedded in G, and A, B are completely orthogonal. Then every definable subset of G
is a finite union of approximate rectangles.
If moreover the conclusion of Lemma 3.2.3 holds for G, then every definable subset
of G is a finite union of virtual rectangles.

Proof. Let X be a definable subset of G. For b ∈ B, let A(b) = π^{-1}(b), a coset of A.

Claim. There exists a finite partition of B into definable sets B_i, and definable sets
S_i ⊆ A, such that for b ∈ B_i,
(∗) For some g ∈ A(b), A(b) ∩ X = S_i + g.
Proof. For each type q of elements of B, pick b = b_q realizing q, and also pick g ∈ A(b).
Let S_q = {c ∈ A : g + c ∈ X}. Then
(∗_q) For some g ∈ A(b_q), A(b_q) ∩ X = S_q + g.
As S_q is a subset of A and A is stably embedded, it is definable over some C_q ⊆ A.
By complete orthogonality, q implies a complete type over C_q. Thus (∗_q) remains true
if b_q is replaced by any realization of q. So
(∗∗) For any b ∈ B, for some q, for some g ∈ A(b), A(b) ∩ X = S_q + g.
By compactness, finitely many of the S_q suffice. This proves the claim.
Partitioning X according to π^{-1}(B_i), it suffices to show the conclusion for each
piece; thus we may assume X ⊆ π^{-1}(B_1); and even, shrinking B_1 now, that π(X) = B_1.
Let S = S_1. So X ∩ A(b) = S + g, some g. Let K = {c ∈ A : c + S = S}. K is a definable
subgroup of A. We have also c + X = X for c ∈ K. So X is the pullback of a definable
subset of G/K; and by passing to the quotient, we may assume K = 0.
Now if S + g = S + g′, then g − g′ ∈ A (otherwise, S + g and S + g′ will be in different
cosets of A) so g − g′ ∈ K = 0. Thus we have a definable map
f : B_1 → G,
f(b) = g iff A(b) ∩ X = S + g.
f is a section of the natural projection (G/K) → (G/A) = B; we denote this projection
too by π.
Now let f_m be as in the definition of an approximate homomorphism. It is a map
from B_1^n into A/K. By complete orthogonality, the graph of f_m is a finite union of
rectangles. But functions contain only 1 × 1 rectangles. So the image of f_m must
be finite. Thus f is an approximate homomorphism. If Y is the image of f, then
S + Y = X. This proves the lemma in the general case.
Assuming the conclusion of the Indecomposability Theorem 3.2.3, apply it to Y =
f(B_1) ⊆ G. This gives a partition of Y; by partitioning X further, we may assume
B_1 = π(Y) is contained in a single coset of J = π(H), H a definable subgroup of G,
and generated by sums of elements from Y.
All the required properties of X hold, except that we must still show that K is a
subgroup of finite index of H ∩ A (i.e. with our assumption K = 0, that H ∩ A is finite).
Note that each element of J is an alternating sum of 2n elements of B_1. Let R ⊆ J × G
be the relation:

R(c, b) iff there are a_0, ..., a_{2n−1} ∈ B_1 with c = Σ_{0≤i≤2n−1} (−1)^i a_i and b = Σ_{0≤i≤2n−1} (−1)^i f(a_i).

Then for each c ∈ J there are finitely many b ∈ G with R(c, b).
Let W be the subgroup of A generated by ∪_m f_m(C_m). Then W is a countable
group. f extends to a well defined map f̄ : J → G/W, namely

f̄(Σ_i (−1)^i a_i) = Σ_i (−1)^i f(a_i).

Moreover, f̄ is a group homomorphism.
Let E = (f̄)^{-1}(((H ∩ A) + W)/W). f̄ induces an isomorphism between E/ker(f̄)
and ((H ∩ A) + W)/W. The graph of this isomorphism is the image of the definable
relation R, after factoring out the possibly nondefinable subgroup W. The orthogonality
condition (ii) applied to R shows immediately that E/ker(f̄) and ((H ∩ A) + W)/W
must be finite. Hence (H ∩ A)/K is at most countable, so being definable it is finite,
as required.
We include also the analog of the socle lemma, Proposition 4.3 of [14]. Here we
use stabilizers in the sense of theories of finite S1-rank [5]. This lemma will not be
used in our application.
Proposition 3.6.2. Let G be a definable Abelian group of finite S1-rank, in some
structure. Let A be a definable subgroup, X a definable subset of G. Assume:
(i) Every acl(G/A)-definable subgroup of A is commensurable to an acl(0)-definable
subgroup.
(ii) G has no definable subgroup A′ containing A with A′/A infinite and A′ ⊆ acl(Y, A,
C) for some rank-one Y and finite C.
(iii) For any complete type X′ ⊆ X over an algebraically closed set, Stab(X′) ∩ A is
finite.
Then X is contained in finitely many cosets of A, up to a set of smaller rank.
Proof. Using compactness, we may replace X by a complete type over an algebraically
closed base set C; we must show that X is contained in a single coset of A. Let
X/A = {x + A : x ∈ X}. For b ∈ (X/A), let A(b) be b viewed as a coset of A. Let b be
an element of X/A. Then X(b) = X ∩ A(b) ≠ ∅. As X is the solution set of tp(a/C) for
a ∈ X(b), and b ∈ dcl(Ca), X(b) is the solution set of a complete type over C ∪ {b}.
Let X_1 be the solution set of some type over acl(Cb), extending this type.
The stabilizer S of X_1 with respect to the action of A on A(b) is an acl(Cb)-definable
subgroup of A. By (i), S is commensurable to a C-definable group S_0 ⊆ A. Now a
generic element of S ∩ S_0 is independent from b over C, and hence is in the stabilizer
of X_1. But by (iii), this stabilizer is finite. Thus S is finite. In particular S does not have
finite index in A (A is infinite, using (ii), letting A′ ⊆ G be generated by a rank-one
Y). It follows that some finite intersection of A-translates of X_1 is nonempty and finite.
A little combinatorics will show that some finite intersection U of A-translates of X(b)
is nonempty and finite.
Since U is defined over {b} ∪ A, we have U ⊆ acl(b, A). Every element of A(b) has
the form a + x for some a ∈ A, x ∈ U, so A(b) ⊆ acl(b, A). If rk(b) = 0, then X/A is
finite, and being complete, it consists of a single element; in other words X is contained in a single coset. Otherwise, we will get a contradiction. Find F = acl(F) so that
rk(b/F) = 1, and let Y be the locus of b over F, and X′ = {x ∈ X : x + A ∈ Y}. Then
A(b) ⊆ acl(b, A) for b ∈ Y, so X′ ⊆ acl(Y ∪ A). By the lemma on finite generation,
Lemma 3.2.1, for some finite m,
{Σ_i n_i y_i : (y_1, ..., y_m) ∈ Y^m, (n_1, ..., n_m) ∈ Z^m, Σ_i n_i = 0}
is a subgroup of G/A. So
{a + Σ_i n_i b_i : a ∈ A, (b_1, ..., b_m) ∈ (X′)^m, (n_1, ..., n_m) ∈ Z^m, Σ_i n_i = 0}
is a subgroup of G; and evidently it contains A and is contained in
dcl(A ∪ X′) ⊆ acl(Y ∪ A). This contradicts assumption (ii).

4. Groups in difference fields

The Abelian groups definable in difference fields can be described in detail. We
will concentrate on finite rank subgroups of Abelian varieties; this is where the new
minimal sets come in. It is also the only case that will play a role in our number
theoretic application. We will however point out the new phenomena among definable
groups of finite index of other kinds; enough so, we hope, to make an exhaustive
classification straightforward.
From now on (and for the rest of the paper), definable groups are definable in a
universal domain for difference fields. Note that by [5] one has elimination of imaginaries;
so there is no difference between the category of definable groups, and the apparently
more general one of interpretable groups (taken with full induced structure). In particular, the quotient of a definable group by a definable subgroup is definably isomorphic
to a definable group.
Proposition 4.0.3. Let A be an algebraic group, not necessarily commutative. Every
definable subgroup of A is contained as a subgroup of finite index in a group of the
form:
{a ∈ A : (a, σa, ..., σ^n a) ∈ S},
where S is a Zariski closed subgroup of A × A^σ × ⋯ × A^{σ^n}.



Proof. Let A_n = A × A^σ × ⋯ × A^{σ^{n−1}}. Let H be a definable subgroup of A. Let S_n be
the Zariski closure in A_n of {(a, σ(a), ..., σ^{n−1}(a)) : a ∈ H}. Let H_n = {a ∈ A : (a, σ(a),
..., σ^{n−1}(a)) ∈ S_n}, H′ = ∩_n H_n. Then H′ is an ∞-definable subgroup of A, and H ⊆ H′.
Let p′ be a generic type of H′ over some substructure C′, and let p be an extension
of p′ to a bigger set C, with p generic over C in a coset of H. Suppose the index
[H′ : H] is infinite; then SU(p) < SU(p′); p is a forking extension of p′. It follows
by Chatzidakis and Hrushovski [5] that if a realizes p over C, then (a, σ(a), σ^2(a), ...)
forks with C over C′, in the sense of fields. However (a, σ(a), ..., σ^{n−1}(a)) is a generic
point over C′ of S_n; and S_n is defined over C′; a contradiction. Thus [H′ : H] is finite.
By compactness, for some n, [H_n : H] is finite.
Proposition 4.0.4. Let E be a definable group. Then there exists a definable map of
E into an algebraic group A, with finite kernel.
We mention the proposition here as background, but will not require it; we give the
ideas of the proof in passing. Here is a somewhat more general statement.
Proposition. Let U be a saturated structure of finite S1-rank with stable reduct U′.
Assume that for algebraically closed A, B ⊆ U, C = A ∩ B,
(1) acl_U(A ∪ B) = acl_{U′}(A ∪ B).
(2) A, B are independent over C in U iff they are independent in U′.
Then every U-definable group admits a U-definable homomorphism on an ∞-definable
subgroup of bounded index, into a U′-definable group, with finite kernel.
The proof below (as opposed to the statement) was known in the late 1980s,
as an application of the then-new technology of ∗-definable groups. (The compactness argument at the end may be due to Pillay; at all events it was discussed in a
phone conversation with him around 1988.) A ∗-type is a type in infinitely many
variables; in the stable framework, essentially all techniques with types go through
verbatim for ∗-types, and in particular this is true of the algebraic group configuration (cf. [2]). The second salient fact is that a ∗-definable group is pro-definable
(cf. [12]).
Proof. Let G be a U-definable group. For a type p = tp(c) of an element c ∈ G,
let c* enumerate acl(c), and let p* = tp(c*). If a, b are independent generics of G, let
c = ab. Then a*, b*, c* are algebraically dependent in pairs, but independent in pairs, as
sequences in U′. By the group configuration, there exists a ∗-definable group H in U′,
with elements α, β, γ equi-algebraic with a, b, c (in eponymous pairs). Let P = tp(b, β).
So P ⊆ G × H. Let S be the stabilizer. Left multiplication by (a, α) stabilizes P into
tp(c, γ); composing with the inverse of a generic conjugate of (a, α), one concludes
that the stabilizer S of P contains a generic element (d, δ), with d ∈ G generic, and
acl(d) = acl(δ). It follows that S projects to G and to H with bounded kernel; and
the projection f : S → G has image K = f(S) of bounded index.
We may factor out the bounded, infinitely definable (= pro-finite) group of S, the
kernel of the projection to G. Then we get a well-defined map s : K → H, with graph
S; it has bounded kernel.
Now recall that H is a projective limit of a definable projective system (H_i, h_ij)
of definable groups and maps in U′; we have h_i : H → H_i. Then by compactness, for
some i, h_i ∘ s has bounded, i.e. finite kernel.
Remark. (1) The assumption of 'nite S1-rank is used only to make use of the theory
of group generics and stabilizers. The proof is thus valid in any context in which these
are available.
(2) In a 'nite S1-rank theory, an -de'nable group is an intersection of de'nable
groups (unpublished preprint On PAC and related structures). Using this and some
cosmetics, the conclusion of the proposition can be improved to: every U -de4nable
group admits a U -de4nable homomorphism into a U-de4nable group, with 4nite
kernel.
Lemma 4.0.1. Let $G$ be a definable group, defined over a finite or countable set $C$, and suppose any two elements of $G$ have distinct types over $C \cup k$, $k = Fix(\sigma)$. Then there exists a definable homomorphism $F : G \to H(k)$, $H$ a $k$-algebraic group, $F$ injective, with $F(G)$ of finite index in $H(k)$.
Proof. By Chatzidakis and Hrushovski [5], every type over $k$ is definable ($k$ is stably embedded). Hence any element of $G$ is in $dcl(C \cup k)$. By compactness, there is a finite definable partition of $G$ into subsets $G_i$, and definable surjective maps $f_i : R_i \to G_i$, with $R_i$ a definable subset of $k^{n_i}$. Now the relations induced on the $R_i$ by pulling back the graph of multiplication are definable over $k$; this uses again the stable embeddedness of $k$. Putting the pieces back together, it follows that $G$ is definably isomorphic to a definable group $G'$ over $k$. Now every definable group over $k$ has the stated form, by Hrushovski and Pillay [15].
Lemma 4.0.2. Let $G$ be a definable group, defined over a set $C_0$. Suppose $C_0 \subseteq C$, $C$ a finite or countable set, and every element of $G$ is algebraic over $C \cup k$. Then there exists a definable homomorphism $F : G \to H(k)$, $H$ a $k$-algebraic group, $\ker F$ finite, with $F(G)$ of finite index in $H(k)$.
Proof. By Corollary 3.3.6, $G$ has a finite normal subgroup $K$ with $G/K$ internal to $k$. By Lemma 4.0.1, $G/K$ is of the required form.
Remark 4.0.3. In Lemma 4.0.2, one can take $\ker F$ to be $C_0$-definable (but not necessarily $F$ itself).
Proof. Let $F : G \to H(k)$ be as in the conclusion of Lemma 4.0.2. Let $K = \ker(F)$. Let $K' = \bigcap \{\tau(K) : \tau \in Aut(U/C_0)\}$, where $U$ is the universal domain. Then $K' \subseteq K$, so $K'$ is finite. It is $Aut(U/C_0)$-invariant, hence is $C_0$-definable. By finiteness, $K' = \bigcap_{i=1,\dots,m} \tau_i(K)$ for some $\tau_1, \dots, \tau_m \in Aut(U/C_0)$. Let $h_i : G \to H_i(k)$ be the conjugate of $F : G \to H(k)$ by $\tau_i$. Let $h' = (h_1, \dots, h_m)$. Then clearly $h'$ has kernel $K'$. The image of $h'$ need not of course be of finite index in the product of the $H_i$, but is isomorphic to a subgroup of finite index of some $H'(k)$, using again the result in [16].
Lemma 4.0.4. Let $G$ be a definable group, defined over a set $C_0$. Suppose $C_0 \subseteq C$, $C$ a finite or countable set, and every element of $G$ is algebraic over $C \cup D$, where $D$ is a stable, stably embedded, 1-based definable set of finite rank (e.g. an LMS group). Then $G$ is LMS.
Proof. By Corollary 3.3.6, $G$ has a finite normal subgroup $K$ with $G/K$ internal to $D$. So $G/K$ and, certainly, $K$ are LMS. Thus by Proposition 3.4.1, $G$ is LMS.
4.1. Abelian varieties
We seek to classify the definable subgroups of finite rank of an Abelian variety $A$. We refer here to definability in the language of difference fields; in particular, equations may use $\sigma$ as a primitive operation. For more details we refer the reader to [5]. We will also use the notion of the (S1-)rank of a definable set from [5]; it is a kind of dimension theory. It is convenient to work in the universal domain.
We begin with a classification up to commensurability.
Notation 4.1.1. Let $A$ be a semi-Abelian variety. We let $End(A)$ be the group of endomorphisms of $A$ given by rational maps. We let $E(A) = \mathbb{Q} \otimes End(A)$.
$E(A)$ is a semisimple Artin ring; if $A$ is a simple Abelian variety, then $E(A)$ is a division ring.
Definition 4.1.2. A definable group $B$ will be called c-minimal if $B$ is infinite, and every infinite definable subgroup of $B$ has finite index in $B$.
Definition 4.1.3. If $E$ is a ring, and $\sigma$ is an automorphism of $E$, we let $E_\sigma$ be the ring of formal finite sums $\sum_i e_i t^i$, where the index $i$ ranges over $\mathbb{Z}$, and $e_i = 0$ for almost all $i$. Multiplication is defined using the commutation rule: $ta = \sigma(a)t$ for $a \in E$.
Lemma 4.1.4. Let $E$ be a division ring, $\sigma$ an automorphism of $E$, $E' = E_\sigma$. Then every left ideal of $E'$ is principal.
Proof. $E'$ is the direct sum of the $Et^n$. Let $E^+$ be the sum of the $Et^n$ for $n \geq 0$. Then $E^+$ is a subring of $E'$; if $I$ is a left ideal of $E'$, then $I^+ = I \cap E^+$ is a left ideal of $E^+$, and $I$ is generated by $I^+$. Thus it suffices to prove the result for $E^+$. We define $\deg(\sum_{i \leq n} a_i t^i) = n$ where $a_n \neq 0$. The standard proof of the (left) Euclidean algorithm goes through. If $I$ is a left ideal and $f$ is an element of least degree in $I$, and $g \in I$, then $g = rf + s$ with $s = 0$ or $\deg(s) < \deg(f)$ by the Euclidean algorithm. Now $s = g - rf \in I$, so if $s \neq 0$ the minimality of $\deg(f)$ would give $\deg(s) \geq \deg(f)$; hence $s = 0$. Thus $I = E^+ f$.
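The division step used in the proof of Lemma 4.1.4 can be made concrete. The following is a minimal sketch, under assumptions that are not in the text: $E$ is taken to be the commutative field $\mathbb{Q}(i)$, $\sigma$ is complex conjugation, and the names `Gauss`, `sigma`, `mul` and `left_divmod` are hypothetical helpers chosen for illustration. An element of $E^+$ is a coefficient list, with `p[i]` the coefficient of $t^i$ written on the left.

```python
from fractions import Fraction


class Gauss:
    """Element x + y*i of the field Q(i), with exact rational coordinates."""

    def __init__(self, x, y=0):
        self.x, self.y = Fraction(x), Fraction(y)

    def __add__(self, other):
        return Gauss(self.x + other.x, self.y + other.y)

    def __sub__(self, other):
        return Gauss(self.x - other.x, self.y - other.y)

    def __mul__(self, other):
        return Gauss(self.x * other.x - self.y * other.y,
                     self.x * other.y + self.y * other.x)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Gauss({self.x}, {self.y})"

    def is_zero(self):
        return self.x == 0 and self.y == 0

    def inverse(self):
        norm = self.x * self.x + self.y * self.y
        return Gauss(self.x / norm, -self.y / norm)


ZERO = Gauss(0)


def sigma(a, n=1):
    """Apply sigma (assumed here to be complex conjugation) n times."""
    return Gauss(a.x, -a.y) if n % 2 else a


def trim(p):
    """Drop trailing zero coefficients; p[i] is the coefficient of t^i."""
    p = list(p)
    while p and p[-1].is_zero():
        p.pop()
    return p


def add(p, q, sign=1):
    """Return p + q (or p - q when sign == -1)."""
    n = max(len(p), len(q))
    p = list(p) + [ZERO] * (n - len(p))
    q = list(q) + [ZERO] * (n - len(q))
    return trim([a + b if sign == 1 else a - b for a, b in zip(p, q)])


def mul(p, q):
    """Product in E^+, moving t past coefficients via t*a = sigma(a)*t."""
    if not p or not q:
        return []
    out = [ZERO] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = out[i + j] + a * sigma(b, i)
    return trim(out)


def left_divmod(g, f):
    """Return (r, s) with g = r*f + s and deg(s) < deg(f); f must be nonzero."""
    f, s, r = trim(f), trim(g), []
    if not f:
        raise ZeroDivisionError("division by the zero polynomial")
    while len(s) >= len(f):
        d = len(s) - len(f)
        c = s[-1] * sigma(f[-1], d).inverse()   # cancels the leading term of s
        term = [ZERO] * d + [c]
        r = add(r, term)
        s = add(s, mul(term, f), sign=-1)
    return r, s
```

With these helpers, `mul([ZERO, Gauss(1)], [Gauss(0, 1)])` computes $t \cdot i$ and returns the coefficient list of $-i\,t$, which is the commutation rule of Definition 4.1.3 in this toy setting. The sketch only covers the subring $E^+$ and a commutative $E$; the lemma itself allows any division ring.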

Lemma 4.1.5. Suppose $A$ is a simple Abelian variety, and for all $n \geq 1$, $A$ and $A^{\sigma^n}$ are not isogenous Abelian varieties. Then every $\sigma$-definable subgroup of $A$ is commensurable to a Zariski closed subgroup.
Proof. Immediate from Lemma 4.0.3, since every Zariski closed subgroup $S$ of $A \times \cdots \times \sigma^n A$ is commensurable to a product of Zariski closed subgroups of the $\sigma^i A$.
Lemma 4.1.6. Suppose $A_i$ is a simple Abelian variety, with no $A_i$ isogenous to $\sigma^k A_j$ unless $i = j$. Then every $\sigma$-definable subgroup of $\prod_i A_i$ is commensurable to a product of $\sigma$-definable subgroups of the $A_i$.
Proof. Similar to the previous Lemma 4.1.5.
Lemma 4.1.7. Let $A, E$ be Abelian varieties, $A$ simple, $E$ isogenous to $A^n$, and let $B$ be a definable subgroup of $E$. Then there exist definable homomorphisms $F_j : E \to \sigma^{m(j)} A$ such that $B$ is a subgroup of finite index of $\bigcap_{j=1,\dots,l} Ker(F_j)$. For each $j$, $F_j$ has the following form. There exist $L_{ring}$-definable homomorphisms $h_{jk} : \sigma^k E \to \sigma^{m(j)} A$ such that
$$F_j(e) = \sum_k h_{jk}(\sigma^k(e)).$$
Proof. By Remark 4.0.3, $B$ is a finite index subgroup of $\{a \in E : (a, \sigma a, \dots, \sigma^m a) \in S\}$, where $S$ is a Zariski closed subgroup of $E \times \cdots \times \sigma^m E$. By Poincaré complete reducibility, $S$ is a finite index subgroup of the kernel of some homomorphism, or of the joint kernel of some homomorphisms on $E \times \cdots \times \sigma^m E$ into simple Abelian varieties. Thus there exist algebraically definable homomorphisms $F_j$ into $\sigma^{m(j)} A$ such that $B$ is a subgroup of finite index of $\{a \in E : F_j(a, \sigma a, \dots, \sigma^m a) = 0 \text{ for each } j\}$. The lemma follows upon decomposing $F_j$ into a sum of components.
Notation 4.1.8. Let $A$ be a definable, divisible Abelian group. We let $End_\sigma(A)$ denote the ring of definable endomorphisms of $A$, and $E_\sigma(A) = \mathbb{Q} \otimes End_\sigma(A)$.
Lemma 4.1.9. Let $A$ be an Abelian variety. Let
$$A \approx \prod_{i=1,\dots,m} \ \prod_{j=1,\dots,n_i} B_{ij},$$
where $\approx$ denotes isogeny, each $B_{ij}$ is isogenous to some $\sigma^j$-conjugate of $A_i$, and where for $i \neq i'$ and $j, j' \geq 0$, $\sigma^j(A_i) \not\approx \sigma^{j'}(A_{i'})$. (Thus the $A_i$ are representatives for the equivalence relation generated by isogeny and $\sigma$-conjugacy, and $n_i$ is the multiplicity of $A_i$ in $A$ for this notion.) Then
$$E_\sigma(A) \cong \prod_{1 \leq i \leq m} M_{n_i}(E_\sigma(A_i)).$$
Proof. This follows easily from Lemma 4.1.6; it will not be used, except as motivation for considering the case of $A$ simple.

Proposition 4.1.1 (Structure of $E_\sigma(A)$). Let $A$ be a simple Abelian variety. Then:
(1) $E(A)$ is a division ring.
(2) If $A$ is simple, and $A$ and $\sigma^n A$ are non-isogenous for all $n \geq 1$, then $E_\sigma(A) = E(A)$.
Suppose now that $A$ is simple, and $A$ and $\sigma^n A$ are isogenous for some $n \geq 1$. Then
(3) $E_\sigma(A)$ admits a $\mathbb{Z}$-grading. $E(A)$ is the homogeneous component of degree 0. If $\tau$ is any homogeneous element of degree 1, then $\tau$ is invertible, and the homogeneous component of degree $n$ is $\tau^n E(A) = E(A)\tau^n$. In particular $E(A)$ is stabilized by conjugation by $\tau$, and $E_\sigma(A)$ is generated as a ring by $(E(A), \tau, \tau^{-1})$. The powers of $\tau$ are linearly independent over $E(A)$. Thus $E_\sigma(A)$ is isomorphic to $E(A)_\tau$ defined in Definition 4.1.3, where $\tau$ acts on $E(A)$ by conjugation.
(4) Every definable subgroup of $A$ is commensurable to the kernel of a definable endomorphism of $A$.
(5) $B$ is a c-minimal subgroup (up to commensurability) iff $B$ is commensurable to $Ker(h)$ with $h$ a left-irreducible element of $E_\sigma(A)$.
(6) Any nonzero definable endomorphism of $A$ is surjective.
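As an illustration of (3) and (5), here is a worked special case under assumptions that are not made in the text: $A$ is an elliptic curve defined over $Fix(\sigma)$ with $End(A) = \mathbb{Z}$. Then $A = \sigma A$, one may take $\tau = \sigma$ restricted to $A$, and conjugation by $\sigma$ acts trivially on $E(A) = \mathbb{Q}$, so the twisted ring of Definition 4.1.3 is an ordinary Laurent polynomial ring.

```latex
% Assumption (not from the text): A is an elliptic curve over Fix(\sigma), End(A) = \mathbb{Z}.
\[
  E_\sigma(A) \;\cong\; E(A)_\tau \;=\; \mathbb{Q}[\sigma, \sigma^{-1}],
  \qquad \sigma a = a \sigma \ \text{ for } a \in \mathbb{Q}.
\]
% \sigma - 1 has degree 1, so it is irreducible in this ring; by (5) its kernel is c-minimal:
\[
  \mathrm{Ker}(\sigma - 1) \;=\; A(\mathrm{Fix}(\sigma)) \quad \text{is c-minimal.}
\]
```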
Proof. Conditions (1) and (2) have already been noted. For the rest, let $n$ be least such that $A$ and $\sigma^n A$ are isogenous, and fix an isogeny $h$ from $A$ to $\sigma^n A$, and an isogeny $h'$ from $\sigma^n A$ to $A$, such that $h'h = [m]$ (where $[m]$ is multiplication by an integer $m$). We have $hh'(hx) = h(mx) = mh(x)$, so as $h$ is onto, $hh' = [m]$. Mapping $g$ to $(1/m)hgh'$ gives an isomorphism $F$ between $E(A)$ and $E(\sigma^n A)$; the choice of $h, h'$ affects $F$ only up to conjugation. (If $A = A^{\sigma^n}$, it is natural to choose $h = h' = Id$.)
Let $\tau = h'\sigma^n$; it is a definable endomorphism of $A$. Let $\tau' = \sigma^{-n}h$. We have $\tau\tau' = \tau'\tau = [m]$.
We will show that $E_\sigma(A)$ is generated by $E(A)$ and $\tau, \tau'$. For now let $E'(A)$ be the ring generated by $End(A)$ and $\tau, \tau'$, and let $E''(A)$ be the ring of endomorphisms $e$ of $A$ such that $Me \in E'(A)$ for some $M$. So $E'(A) \subseteq E''(A) \subseteq \mathbb{Q} \otimes E'(A) \subseteq E_\sigma(A)$.
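Claim 1. Let $r : \sigma^k A \to \sigma^l A$ be an algebraic homomorphism. Then $\sigma^{-l} r \sigma^k \in E''(A)$. It is a homogeneous element, i.e. a product of a power of $\tau$ or $\tau'$ with an element of $E(A)$.
Proof. If $r = 0$ the statement is trivial. Otherwise $\sigma^k A, \sigma^l A$ are isogenous, so $k \equiv l \pmod n$. We use induction on $|k - l|$. If $k = l$, then $\sigma^{-l} r \sigma^k = \sigma^{-l}(r) \in E(A)$. Say $k < l$ (otherwise use $\tau$ instead of $\tau'$). We have $\sigma^k(h') : \sigma^{k+n} A \to \sigma^k A$. Let $r' : \sigma^{k+n} A \to \sigma^l A$, $r' = r\,\sigma^k(h')$. By induction,
(i) $\sigma^{-l} r' \sigma^{k+n} = \sigma^{-l} r\, \sigma^k(h')\, \sigma^{k+n}$ is a homogeneous element of $E''(A)$. On the other hand, consider $\sigma$ as an automorphism of the structure including $\sigma$ itself, and allow applying it to definable functions in this structure. Then $\sigma^{-n} \sigma^k(h) = \sigma^k(\sigma^{-n} h) = \sigma^k(\tau')$, so
(ii) $\sigma^{-k-n} \sigma^k(h) \sigma^k = \sigma^{-k} \sigma^k(\tau') \sigma^k = \tau'$.
Multiplying (i) and (ii) gives $\sigma^{-l} r\, \sigma^k(h'h)\, \sigma^k = m \cdot \sigma^{-l} r \sigma^k$, and the claim follows.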
Now if $F_j : A^p \to \sigma^{m(j)} A$ is a homomorphism as in Lemma 4.1.7, then each of the summands of $\sigma^{-m(j)} F_j : A^p \to A$ mentioned in Lemma 4.1.7 is of the form appearing in the Claim, hence can be viewed as an element of $E''(A)$; thus $\sigma^{-m(j)} F_j = s_j$ for some $s_j \in E''(A)$, and $Ker(F_j) = Ker(s_j)$. We conclude.
Claim 2. Every definable subgroup of $A^p$ is commensurable with one defined by a finite number of $E''(A)$-linear equations.
Claim 3. An element of $E''(A)$ is a unit in $\mathbb{Q} \otimes E''(A)$ iff it has finite kernel.
Proof. Suppose $g \in E''(A)$ is not invertible in $\mathbb{Q} \otimes E''(A)$. Then it is not a homogeneous element. So after multiplying by a power of $\tau$, it can be written as $\sum_{i=0}^{p} a_i \tau^i$, with $a_0 \neq 0$ and $a_p \neq 0$, $p > 0$. Opening up the definition of $\tau$ and multiplying out, we can also write $g = \sum_{i=0}^{p} b_i \sigma^i$, where $b_i$ is a definable homomorphism from $\sigma^i A$ to $A$, and $b_0 \neq 0$, $b_p \neq 0$. Let $C$ be the principal component of the subgroup of $A \times \cdots \times \sigma^p A$ defined by $\sum_i b_i x_i = 0$, and let $S = \{((x_0, \dots, x_p), (y_0, \dots, y_p)) \in C \times \sigma(C) : x_1 = y_0, x_2 = y_1, \dots, x_p = y_{p-1}\}$. Then $S$ projects onto $C$ and onto $\sigma(C)$, using the surjectiveness of $b_0$ and $b_p$. Thus by the axioms of model completeness, Lemma 2.2.1, there are infinitely many $x \in C$ with $(x, \sigma x) \in S$. This implies that $Ker(g)$ is infinite. The other direction is trivial, since $\ker[n]$ is finite for all $n$.
The same argument shows that the map from $E(A)_\tau$ of Definition 4.1.3 to $\mathbb{Q} \otimes E''(A)$ is injective, hence an isomorphism. By Claim 2 with $p = 1$, every definable subgroup $B$ of $A$ is commensurable with one of the form $\{a \in A : s_i a = 0, \ i = 1, \dots, q\}$. However, by Lemma 4.1.4, the left ideal of $\mathbb{Q} \otimes E''(A)$ generated by $\{s_1, \dots, s_q\}$ is generated by a single element $s$. We may replace $s$ by $Ms$ with $Ms \in E''(A)$; then $B$ is commensurable with $Ker(Ms)$. This proves (4).
Claim 4. Every definable endomorphism of $A$ is in $E''(A)$.
Proof. Let $e$ be a definable endomorphism of $A$, and let $E$ be the graph of $e$, a subgroup of $A^2$. Consider
$$I_2 = \{(f, g) \in E''(A)^2 : f(x) + g(y) = 0 \text{ for any } (x, y) \in E\}.$$
This is a submodule of $E''(A)^2$. By Claim 2, $E$ is commensurable with $\bigcap_j \{(x, y) : f_j(x) + g_j(y) = 0\}$ for certain $(f_j, g_j) \in I_2$. Let $I$ be the second projection of $I_2$, a left ideal of $E''(A)$. We claim that $\mathbb{Q} \otimes I$ is the unit ideal of $\mathbb{Q} \otimes E''(A)$. Otherwise, $\mathbb{Q} \otimes I$ is generated by some $g$ which is not a unit, hence by Claim 3 has infinite kernel $K$. Each $g_j$ is a multiple of $g$ in $\mathbb{Q} \otimes E''(A)$. If $a \in K$, then $g_j(a) = 0$ for each $j$, so a subgroup of finite index of $(0) \times K$ is contained in $E$. This contradicts the fact that $E$ is the graph of a homomorphism. Thus $1 \in \mathbb{Q} \otimes I$, so $M \in I$ for some integer $M$. Thus for some $f \in E''(A)$, $f(x) + My = 0$ for all $(x, y) \in E$. So $Me(x) = -f(x)$, hence $Me \in E''(A)$, and so $e \in E''(A)$.
Condition (3) follows from Claim 4.
Claim 5. Let $f, g \in E''(A)$. Then $Ker(f) \leq Ker(g)$ up to commensurability iff for some $h \in \mathbb{Q} \otimes E''(A)$, $g = hf$.
Proof. One direction is evident. For the other, we may assume upon multiplying by an integer that $Ker(f) \subseteq Ker(g)$. Thus one may define an endomorphism $h$ of $A$ by $g(x) = h(f(x))$. By Claim 4, $h \in E''(A)$.
Proof of (5) and (6). The inclusion ordering on definable subgroups of $A$, up to commensurability, is now known to be isomorphic to the divisibility ordering on elements of $\mathbb{Q} \otimes E''(A)$. Thus (5) is immediate. For (6), suppose $f$ is a definable endomorphism. Then $Im(f)$ is commensurable with $Ker(g)$ for some $g$. So $ngf = 0$ for some $n \neq 0$. But the ring $E_\sigma(A)$ has no zero-divisors, so $f = 0$ or $g = 0$. In the latter case, $f$ is surjective.
Remark 4.1.10. A semi-Abelian variety $S$ has only countably many definable subgroups.
Proof. By Proposition 4.0.3, any definable subgroup of $S$ is commensurable to one determined by an algebraic subgroup of $S \times \cdots \times S^{\sigma^n}$ for some $n$. It is well known that there are at most countably many algebraic subgroups of a semi-Abelian variety (any such subgroup is the Zariski closure of the torsion points within it). Thus there are countably many definable subgroups up to commensurability. If $H$ is a definable subgroup of $S$, for any $n$, the map $x \mapsto nx$ has finite kernel on $H$; so $nH$ has the same rank as $H$; so $[H : nH]$ is finite; thus $H$ has finitely many subgroups of index $n$. Similarly, $S/H$ has only finitely many torsion points of order $n$, so $H$ has only finitely many supergroups of index $n$. Thus there are countably many definable subgroups altogether.
Lemma 4.1.11. Let $A$ be a simple Abelian variety.
(a) Let $D_\sigma(A)$ be the ring of definable homomorphisms $A \to A/T$, where $T$ is a definable subgroup of $A$ of finite rank. One identifies $h$ with $\pi h$ if $h : A \to A/T$, $T \subseteq T'$, and $\pi : A/T \to A/T'$ is the natural projection. Then with the natural addition and multiplication, $D_\sigma(A)$ is a division ring. $E_\sigma(A)$ embeds into $D_\sigma(A)$. Every element of $D_\sigma(A)$ can be written as $fg^{-1}$ with $f, g \in E_\sigma(A)$ (or alternatively as $f^{-1}g$).
(b) $E_\sigma(A)$ is an Ore ring: for any $f, g \in E_\sigma(A) \setminus (0)$, for some $u, v \in E_\sigma(A) \setminus (0)$, $gu = fv$.
Proof. The fact that $D_\sigma(A)$ is a division ring is immediate, by defining inverses. The fact that every element of $D_\sigma(A)$ is a quotient of elements of $E_\sigma(A)$ can be shown as in Claim 4 in the proof of Proposition 4.1.1. Condition (b) follows formally.
Lemma 4.1.12. Let $A$ be an Abelian variety, $B$ a definable subgroup of finite rank. Then $B$ is LMS iff every c-minimal definable subgroup of $gB$ is LMS, for any $g \in E_\sigma(A)$. If $B = Ker(f)$, $f = f_1 \cdots f_r$, $f_i \in E_\sigma(A)$ irreducible, then $B$ is LMS iff $Ker(f_i)$ is LMS for each $i$.
Proof. This reduces easily to the case of simple $A$. Let $E_\sigma(A)$ be the ring of definable endomorphisms of $A$, tensor $\mathbb{Q}$. By Proposition 4.1.1, $B$ is commensurable with $Ker(f)$ for some $f \in E_\sigma(A)$. If $f$ is a unit there is nothing to prove. Otherwise we use induction (say Noetherian induction on the left ideal generated by $f$, or on the rank of $B$). If $f = gh$, with $h$ not a unit, then $Ker(h) \subseteq Ker(f)$. If $hB \subseteq B$, we are done using the exact sequence
$$0 \to Ker(h) \to B \to hB \to 0.$$
Otherwise we have the definable exact sequence
$$0 \to Ker(h) \to B \cap h^{-1}B \to B \to gB \to 0.$$
In this sequence, the map $Ker(h) \to B \cap h^{-1}B$ is the inclusion. The next map is given by $h$. The next is given by $g$. One has $gh = 0$ on $B$. One must also verify that if $x \in B$, $g(x) = 0$, then $x = h(y)$ for some $y \in B$. By Proposition 4.1.1(6), we have $x = h(y)$ for some $y$; and $f(y) = gh(y) = g(x) = 0$, so $y \in B$. Thus the sequence is indeed exact.
Now $g$ annihilates $Ker\,h$, so $gB$ has smaller dimension than $B$. On the other hand $hB$ has smaller dimension than $B$ since $0 \neq Ker\,h \subseteq B$, and we assumed that $hB \not\subseteq B$, so $\dim(B \cap h^{-1}B) < \dim(B)$. In either case we are done by Proposition 3.4.1 and induction.
Proposition 4.1.2 (Structure of c-minimal subgroups). Let $A$ be a simple Abelian variety, $B$ a c-minimal definable subgroup of $A$ (up to commensurability).
(a) Precisely one of the following cases occurs:
(i) $B = A$.
(ii) $B$ is definably isomorphic to a subgroup of finite index of $H(k)$, $k = Fix(\sigma)$, $H$ a $k$-algebraic group.
(iii) $B$ is LMS, of $U$-rank one.
(b) Case (i) occurs iff $A$ is not isogenous to $A^{\sigma^n}$ for any $n > 0$. Case (iii) occurs if $A$ is isogenous to $A^{\sigma^n}$ for some $n > 0$, but not isomorphic to an Abelian variety $A'$ defined over $Fix(\sigma^n)$ for any $n > 0$.
Thus we may assume $A$ is defined over $Fix(\sigma^n)$.
(c) Suppose $A$ is defined over $Fix(\sigma^n)$. Then $B$ is not LMS if and only if $B \subseteq Ker(\sigma^N - 1)$ for some $N$ (with $n \mid N$).
(d) Suppose $B$ is a c-minimal definable subgroup of the multiplicative group $G_m$. Then (c) holds: $B$ is not LMS if and only if $B \subseteq Ker(\sigma^n - 1)$ for some $n$.
Proof. If $A$ is not isogenous to $A^{\sigma^n}$ for any $n$, we have already shown that $A$ has no proper definable subgroups. Conversely, if $A$ is isogenous to $A^{\sigma^n}$, say via an isogeny $f$, then one cannot have $B = A$ since for example $\{a : \sigma^n(a) = f(a)\}$ is a smaller subgroup; and it is clear that any $B$ must have finite rank. Suppose from now on that indeed $B$ has finite rank. If $B$ is LMS, then it has $U$-rank one since every infinite LMS group has a rank one definable subgroup, and $B$ is c-minimal. Suppose $B$ is not LMS; we must show that $A$ is isomorphic to an Abelian variety $A'$ defined over $Fix(\sigma^n)$ for some $n$, and that (c) holds. Choose a base $C$ such that $A, B$ are defined over $C$, and there is a minimal type $X \subseteq B$, also over $C$. By Lemma 3.2.2, $X$ generates (in boundedly many steps) a coset of an infinitely-definable subgroup $B'$ of $B$, and $B'$ is the intersection of countably many definable subgroups of $B$ (each, by c-minimality, of finite index in $B$). Further every type of $B$ is a translate of some type of $B'$. Now if $X$ were locally modular, then so would be $B'$, and hence $B$. Thus $X$ is not, and so by the trichotomy theorem for minimal types, there exists a definable finite-to-one map $h : X \to Fix(\sigma)$ (defined over some finite $C$). Since every element of $B'$ is a finite sum of elements of $X$, every element of $B'$ is algebraic over $C \cup k$. Enlarging $C$ so as to include coset representatives for $B/B'$ (at most continuum), $B \subseteq acl(C \cup k)$. By Lemma 4.0.2, there exists an isogeny between $B$ and a definable subgroup of finite index of $H'(k)$, with $H'$ an algebraic group over $k$.
Choose $H'$ a $k_1$-algebraic group of least possible dimension, such that $k_1$ is a finite extension of $k$, and with $B$ definably isogenous to a subgroup of $H'(k_1)$. Then it is clear that $H'$ is a definably simple commutative algebraic group (if it has proper subgroups, enter them or factor them out). The existence of the isogeny between $B$ and a subgroup of $H'$, together with Lemma 4.1.6, shows that some conjugate of $H'$ by a power of $\sigma$ is isogenous to $A$ as an Abelian variety. Thus $A$ is isomorphic to $H''/F$ for some $H''$ defined over a finite extension of $k$, and some finite subgroup $F$ of $H''$; so $A$ is isomorphic to an Abelian variety defined over a finite extension of $k$.
Finally, to prove (c) and (d), suppose $A$ is defined over $k_1 = Fix(\sigma^n)$. If $B \subseteq Ker(\sigma^N - 1)$ for some $N$ (with $n \mid N$), then $B \subseteq A(Fix(\sigma^N))$, and it is clear that there exists a finite-to-one map of $B$ into $k$, and $B$ is not LMS. Conversely, if $B$ is not LMS, then we saw that there exists a definable isogeny $h : B \to H(k_1)$ (perhaps after enlarging $k_1$). The isogeny $h$ can be viewed as a definable subgroup of $A \times H$, and by Remark 4.1.10 there are only countably many such subgroups; hence $h$ is defined over some finite extension $k_2$ of $k_1$. If $[k_2 : k]$ divides $N$, then clearly every point of $B/Ker(h)$ is fixed by $\sigma^N$. Say $Ker(h)$ has order $m$; then
$$\sigma^N(x) = x \pmod{Ker(h)}$$
for every $x \in B$, so $mx$ is fixed by $\sigma^N$, and since $mB$ has finite index in $B$, every point of $B$ is fixed by some power of $\sigma^N$; thus enlarging $N$ if necessary, $B$ is fixed pointwise by $\sigma^N$.
Corollary 4.1.13. Let $A$ be a semi-Abelian variety, defined over $Fix(\sigma)$. Let $p(T)$ be a polynomial with integer coefficients. Then $Ker(p(\sigma))$ is LMS iff $p$ has no cyclotomic factors, i.e. iff $p(\omega) \neq 0$ for $\omega$ a root of unity.
Proof. Suppose first that $p$ has a cyclotomic factor $q$. Then $Ker(q(\sigma)) \subseteq Ker(p(\sigma))$. But $Ker(q(\sigma))$ is contained in some finite extension of the fixed field, i.e. in $Ker(\sigma^n - 1)$ for some $n$. So $Ker(q(\sigma))$ is not LMS, hence $Ker(p(\sigma))$ is not either.
For the other direction, we may assume $A$ is a simple Abelian variety, or $G_m$. Suppose $Ker(p(\sigma))$ is not LMS. Write $f = p(\sigma) = f_1 \cdots f_r$, $f_i \in E_\sigma(A)$ irreducible. Then by Lemma 4.1.12, $Ker(f_i)$ is not LMS for some $i$. By Proposition 4.1.2, $Ker(f_i) \subseteq Ker(\sigma^N - 1)$ for some $N$. We may choose $N$ large enough so that all endomorphisms of $A$ as a semi-Abelian variety are defined over $Fix(\sigma^N)$. By Proposition 4.1.1(3), $C = Ker(\sigma^N - 1)$ is a submodule of $A$ with respect to the action of $E_\sigma(A)$. Let $R$ be the ring $E_\sigma(A)$ modulo the kernel of this action. Since $p(T)$ has no cyclotomic factors, $p$ and $T^N - 1$ are relatively prime in $\mathbb{Q}[T]$, so $p$ is invertible in $\mathbb{Q}[T]/(T^N - 1)$, and hence $f = p(\sigma)$ is invertible in $R$. However, $Ker(f_i)$ has nonzero rank, and is contained in $C$. Thus $rank(f_i(C)) < rank(C)$, and so $rank(f_1 \cdots f_r(C)) < rank(C)$, contradicting the invertibility.
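The criterion of Corollary 4.1.13 is a finite check on the polynomial $p$. The following is a minimal sketch of that check, not taken from the text: the function names are hypothetical, polynomials are integer coefficient lists with the lowest degree first, and the search range relies on the standard bound $\varphi(n) \geq \sqrt{n/2}$, so that a cyclotomic polynomial of degree at most $\deg p$ has index $n \leq 2(\deg p)^2$.

```python
from functools import lru_cache


def poly_divmod(num, den):
    """Divide integer polynomials (coefficients lowest degree first) by a monic den."""
    num = list(num)
    quot = [0] * max(len(num) - len(den) + 1, 0)
    for shift in range(len(num) - len(den), -1, -1):
        c = num[shift + len(den) - 1]        # current leading coefficient
        quot[shift] = c
        for i, d in enumerate(den):
            num[shift + i] -= c * d
    return quot, num[:len(den) - 1]


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial, lowest degree first."""
    poly = [-1] + [0] * (n - 1) + [1]        # T^n - 1
    for d in range(1, n):
        if n % d == 0:                       # strip Phi_d for each proper divisor d
            poly, _ = poly_divmod(poly, cyclotomic(d))
    return tuple(poly)


def cyclotomic_factors(p):
    """Return every n for which the n-th cyclotomic polynomial divides p."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    if not p:
        raise ValueError("the zero polynomial is divisible by everything")
    deg = len(p) - 1
    hits = []
    # phi(n) >= sqrt(n/2), so deg(Phi_n) <= deg forces n <= 2*deg^2.
    for n in range(1, 2 * deg * deg + 1):
        phi_n = cyclotomic(n)
        if len(phi_n) - 1 <= deg and not any(poly_divmod(p, phi_n)[1]):
            hits.append(n)
    return hits
```

For instance, `cyclotomic_factors([1, -3, 1])` (that is, $T^2 - 3T + 1$) returns an empty list, so the corollary would give that $Ker(\sigma^2 - 3\sigma + 1)$ is LMS, while `cyclotomic_factors([1, 0, 1])` (that is, $T^2 + 1$) returns `[4]`, so $Ker(\sigma^2 + 1)$ is not.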
Corollary 4.1.14. Let $M \in M_n(\mathbb{Z}) \cap GL_n(\mathbb{Q})$. Let $A$ be a semi-Abelian variety defined over $Fix(\sigma)$, with group law written additively. Let $M$ act on $A^n$ by matrix multiplication. Let $A_M = \{a \in A^n : \sigma(a) = Ma\}$. Then $A_M$ is LMS if the characteristic polynomial of $M$ has no roots of unity among its zeroes.
Proof. Let $\pi_i : A^n \to A$ be the $i$th projection, $E_i = \pi_i A_M$. Then $A_M$ is LMS iff each such $E_i$ is LMS. Now $\sigma$ acts on $A_M$ and on $E_i$, and the projection is $\sigma$-invariant. Let $p$ be the characteristic polynomial of $M$. Then $p(\sigma) = 0$ on $A_M$, hence also on $E_i$. Thus the minimal polynomial $p_i$ satisfied by $\sigma$ on $E_i$ divides $p$. So if $p$ has no root of unity roots, neither does $p_i$, and each $E_i$ is LMS. Conversely, if $p$ does have a root of unity root, then so does the minimal polynomial satisfied by $\sigma$, and since this polynomial divides $\prod_i p_i$, so does some $p_i$. This reduces the situation to Corollary 4.1.13.
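Two small instances may help fix ideas for Corollary 4.1.14. They are illustrative only: the matrices $M_1, M_2$ are chosen here and do not appear in the text. Both lie in $M_2(\mathbb{Z}) \cap GL_2(\mathbb{Q})$.

```latex
% Illustrative matrices (not from the text).
\[
  M_1 = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
  \det(T - M_1) = T^2 - 3T + 1, \qquad
  T = \tfrac{3 \pm \sqrt{5}}{2} .
\]
% Both roots are real and different from \pm 1, hence not roots of unity:
% the corollary applies and A_{M_1} = \{(a_1,a_2) : \sigma a_1 = 2a_1 + a_2,\ \sigma a_2 = a_1 + a_2\} is LMS.
\[
  M_2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
  \det(T - M_2) = T^2 + 1, \qquad
  T = \pm i .
\]
% Here the roots are 4th roots of unity; indeed \sigma^2 = -1 and so \sigma^4 = 1 on A_{M_2},
% which places A_{M_2} inside A^2(\mathrm{Fix}(\sigma^4)), the non-LMS situation of Corollary 4.1.13.
```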

4.2. Extensions of Abelian varieties by vector groups

We begin by observing that the finite S1-rank definable subgroups of vector groups are all finite-dimensional vector spaces over the fixed field $k$, hence are not locally modular. (This is the only point that we believe to be really different in positive characteristic; there some of the definable groups appear to be locally modular, though unstable.) We then note new phenomena occurring inside extensions of Abelian varieties by vector groups. We will find there, in particular, definable groups containing a $k$-space as a subgroup, with LMS quotient. We will describe the definable subgroups of $\hat A$, the maximal extension of $A$ by a vector group. Arbitrary extensions of $A$ by a vector group can be viewed as a direct sum of vector groups, and of quotients of $\hat A$ by vector subgroups; so the entire situation is determined by what happens within $\hat A$.

Definition 4.2.1. We say that an algebraic group $G$ is a vector group if it also admits an algebraic vector space structure; equivalently $G$ is isomorphic to $G_a^n$ for some $n$. A definable group $E$ of finite rank is a vector group if it admits a definable vector space structure over the fixed field $k$.

Lemma 4.2.2. Every finite rank definable subgroup of a vector group $G_a^m$ is non-orthogonal to the fixed field $k$, and indeed is a finite-dimensional $k$-space. A finite rank definable group is a vector group iff it embeds into an algebraic vector group.
Proof. If $B$ is a definable subgroup of $G_a^m$, then $\{a : aB \subseteq B\}$ is a definable subring containing the integers, hence containing $k$. Conversely, if $B$ admits a definable vector space structure over $k$, let $S$ be the semi-direct product of $k^*$ and $B$. By Lemma 4.0.4 there exists a definable group homomorphism $\varphi : S \to H$, $H$ an algebraic group, with finite kernel. However there are no finite normal subgroups of $B$, so $\varphi$ is injective on $B$, and one sees that the Zariski closure of $\varphi(B)$ is an algebraic vector group.
Let $A$ be an Abelian variety. We are interested in exact sequences
$$0 \to L \to G \to A \to 0$$
with $L$ a vector group; sometimes we will write $G$ when we have the whole exact sequence in mind.
Definition 4.2.3. Let $0 \to L \to G \xrightarrow{\pi} A \to 0$ be exact, and let $h : L \to L'$ be a homomorphism. We define $h(G)$ and an exact sequence
$$0 \to L' \to h(G) \to A \to 0$$
as follows. Let $H = \{(h(x), -x) : x \in L\} \subseteq L' \times G$. Let $h(G) = (L' \times G)/H$. Define a map $L' \to h(G)$ by sending $x \in L'$ to $(x, 0)/H$. Define a map $h(G) \to A$ by $(x, y)/H \mapsto \pi(y)$. We also have a natural morphism $G \to h(G)$, given by $x \mapsto (0, x)/H$.
Lemma 4.2.4. There exists an exact sequence
$$0 \to L \to \hat A \xrightarrow{\pi_A} A \to 0$$
with $L$ a vector group, with the following universal property: For any extension
$$0 \to L' \to G' \to A \to 0$$
there exists a unique homomorphism $h$ from the $\hat A$ exact sequence to the $G'$ exact sequence, above the identity on $A$. We have $h = (h_L, h, Id_A)$, and $G' = h_L(\hat A)$ in the sense of Definition 4.2.3.
Proof (Serre [28]). We take $L = H^1(A, O_A)^*$. Given any $e \in L'^*$, we form $e(G')$, and obtain an element of $H^1(A, O_A)$. This describes a map $L'^* \to H^1(A, O_A)$, whose dual is $h_L$. One shows that $h_L(\hat A) \cong G'$. Uniqueness of $h$ can be seen by applying Lemma 4.2.5 below: the graph of $h$ is the unique minimal subgroup of $\{(x, y) \in \hat A \times G' : \pi(x) = \pi'(y)\}$ projecting onto the diagonal of $A$.
The map $A \mapsto \hat A$ can be made into a functor; given a homomorphism of Abelian varieties
$$e : A \to B,$$
one can canonically define
$$\hat e : \hat A \to \hat B.$$
Let
$$G'' = \{(g, g') \in \hat A \times \hat B : e(\pi_A(g)) = \pi_B(g')\}.$$
Let $L'' = (Ker(\pi_A) \times Ker(\pi_B)) \cap G''$. Then $0 \to L'' \to G'' \xrightarrow{\pi_A pr_1} A \to 0$ is exact. The universal property of $\hat A$ gives a map $h : \hat A \to G''$. We let $\hat e = pr_2 h$.
We can also describe $\hat e$ using the following lemma.
Lemma 4.2.5. Let $0 \to L \to G \xrightarrow{\pi} A \to 0$ be an exact sequence of algebraic groups, with $L$ linear, and $A$ an Abelian variety. Then there exists a unique minimal $H \subseteq G$ with $\pi(H) = A$.
Proof. $\pi(H) = A$ iff $G/H$ is a linear group. If $H_1, H_2$ have this property, so does their intersection $H_1 \cap H_2$.
Then the graph of $\hat e$ is the unique minimal subgroup of $\hat A \times \hat B$ projecting onto the graph of $e$. In particular, for any algebraic homomorphism $e : A \to A^{\sigma^k}$, we get $\hat e : \hat A \to \hat A^{\sigma^k}$. Using Proposition 4.1.1, we obtain a homomorphism from the definable endomorphisms of $A$ to the definable endomorphisms of $\hat A$: we map $h = \sum_i \sigma^{-i} e_i$ to $\hat h = \sum_i \sigma^{-i} \hat e_i$. Finally, if $N = Ker(h)$, we let $\hat N = Ker(\hat h)$. It is easy to see that this is well defined, and that $\pi(\hat N) = N$.
The following proposition shows that every definable subgroup of $\hat A$ is the pullback of a subgroup of the vector group $\ker(\pi : \hat A \to A)$, under a certain homomorphism, whose kernel is one of the canonical groups $\hat N$, $N$ a subgroup of $A$.
Proposition 4.2.1. Let $A$ be an Abelian variety, $\pi : \hat A \to A$ the maximal vector extension of $A$. Let $N = Ker(h)$ be a definable subgroup of $A$ of finite rank, and let $\hat N, \hat h$ be the corresponding subgroup and endomorphism of $\hat A$. Then:
(1) $\hat N$ has finite rank. There is an exact sequence
$$0 \to (L \cap \hat N) \to \hat N \to N \to 0$$
whose kernel $L \cap \hat N$ is a definable vector group.
(2) There exists a definable exact sequence
$$0 \to \hat N \to \pi^{-1}(N) \to \ker(\pi) \to 0.$$
(3) $\hat N$ projects onto $N$, and is minimal in the following sense: Let $M$ be any definable subgroup of $\hat A$, projecting onto a finite index subgroup of $N$. Then $M$ contains a subgroup of $\hat N$ of finite index.
Proof. (1) Is clear.
(2) The map $\hat h$ takes $\pi^{-1}(N)$ to $\pi^{-1}(h(N)) = \pi^{-1}(0) = \ker(\pi)$. The kernel of this map, by definition, is $\hat N$.
(3) Left to the reader; uses Lemma 4.2.5.

4.3. Semi-Abelian varieties

Let $A$ be an Abelian variety. We are interested in extensions
$$0 \to T \to G \to A \to 0$$
with $T$ a multiplicative torus. Let $X(T)$ be the group of rational homomorphisms of $T$ into $G_m$, defined over an algebraic closure. For $\chi \in X(T)$, we obtain an extension of $A$ by $G_m$; we pass from $G$ to $\chi(G)$, with the structure of extension of $A$ by $G_m$ defined in Definition 4.2.3. Forgetting the group structure, this can be viewed as a line bundle over $A$, with the 0-section removed; by [27], the line bundle is algebraically equivalent to 0. We thus obtain an element of $A^\vee(U)$, where $A^\vee$ is the dual Abelian variety, and $U$ is the universal domain. Let $\gamma(G)$ be the set of all elements of $A^\vee(U)$ obtained in this way, using different $\chi$. This is a finitely generated subgroup of $A^\vee(U)$, with (at most) $rank(T)$ generators.
Fix a finitely generated torsion free subgroup $\Gamma$ of $A^\vee(U)$, and let $\mathcal{G}(\Gamma)$ be the class of extensions $G$ as above with $\gamma(G) \subseteq \Gamma$. This class admits a universal object $A_\Gamma$, in the same sense as in the vector case treated above:
Lemma 4.3.1. There exists an exact sequence in $\mathcal{G}(\Gamma)$,
$$0 \to T_\Gamma \to A_\Gamma \xrightarrow{\pi} A \to 0$$
with $T_\Gamma = Hom(\Gamma, G_m)$. Given an extension
$$0 \to T' \to G' \xrightarrow{\pi'} A \to 0$$
in $\mathcal{G}(\Gamma)$, there exists a unique homomorphism $h : A_\Gamma \to G'$ with $\pi' h = \pi$.
Proof. Let $\{\gamma_i\}$ be a $\mathbb{Z}$-basis for $\Gamma$. By Serre [28], there exists a (unique) extension
$$0 \to G_m \to H_i \xrightarrow{\pi_i} A \to 0$$
with corresponding line bundle $\gamma_i$. Let $A_\Gamma = \prod_{A, \pi_i} H_i$ be the fiber product. The kernel $T = \prod_i G_m$ of $A_\Gamma \to A$ should be identified with $Hom(\Gamma, G_m)$. Given $\gamma \in \Gamma$, evaluation at $\gamma$ gives a homomorphism $e_\gamma : T \to G_m$. We form $e_\gamma(A_\Gamma)$ as in Definition 4.2.3.
Claim. Let $\gamma \in \Gamma$. Then $e_\gamma(A_\Gamma)$ is the extension of $A$ by $G_m$ corresponding to the element $\gamma$ of $A^\vee$.
Proof. Let $s_i$ be a rational section of $\pi_i$. Let $D_i$ be the corresponding divisor (zero divisor minus polar divisor). Then $(s_1, \dots, s_k)$ is a section of $\pi : A_\Gamma \to A$. Composing with the projection $A_\Gamma \to e_\gamma(A_\Gamma)$ we obtain a rational section $s$ of $e_\gamma(A_\Gamma)$. If $e_\gamma(t_1, \dots, t_k) = \prod_i t_i^{m_i}$, then the divisor of $s$ is given by $\sum_i m_i D_i$. Since $\gamma = \sum_i m_i \gamma_i$ within $\Gamma$, $s$ corresponds to $\gamma$ as required.
Now let $G'$ be as in the lemma. Given $\chi \in Hom(T', G_m)$, we obtain as above an extension $\chi(G')$ by $G_m$, corresponding to an element $\gamma(\chi)$ of $\Gamma$. By the claim, we obtain a surjective map $h_\chi : A_\Gamma \to \chi(G')$, with kernel $Ker(e_{\gamma(\chi)})$. Now if $\chi_1, \dots, \chi_r$ form a $\mathbb{Z}$-basis of $Hom(T', G_m)$, then the maps $\chi_i : G' \to G'/Ker(\chi_i)$ induce an isomorphism of $G'$ with the fiber product over $A$ of the groups $G'/Ker(\chi_i)$. Thus the $h_{\chi_i}$ glue together to give a map $h : A_\Gamma \to G'$, with $\pi' h = \pi$.
Uniqueness can be proved as in the vector extension case, using Lemma 4.2.5: the graph of $h$ is the unique minimal subgroup of the pullback $A_\Gamma \times_A G'$ projecting via $(\pi, \pi')$ onto the diagonal of $A$.
Remark 4.3.2. (1) In the situation of the lemma, the restriction of $h$ to $T_\Gamma$ is given as follows: Given $\chi \in Hom(T', G_m)$, we obtain as above $\chi(G')$, an element of $\Gamma$. This gives a map $X(T') \to \Gamma = X(T_\Gamma)$. Dualizing we get a map $T_\Gamma \to T'$; this agrees with $h$.
(2) Torsion points in $A^\vee(U)$ correspond to extensions by $G_m$ that become trivial upon a base change to an isogenous Abelian variety. Thus by considering only the case of torsion free $\Gamma$, we will not miss out any groups.
(3) More generally, let $\Gamma \subseteq \Delta$ be torsion free subgroups of $A^\vee(U)$. We have a surjective restriction map $i : Hom(\Delta, G_m) \to Hom(\Gamma, G_m)$. The following lemma will describe how to extend $i$ to a map $A_\Delta \to A_\Gamma$, with the same kernel. If $\Gamma$ has finite index in $\Delta$, $i$ has finite kernel.
(4) The group $A_\Gamma$ is defined abstractly, but not as a group variety over $k$; if we choose a $\mathbb{Z}$-basis $b = (b_1, \dots, b_r)$ for $\Gamma$, then $A_\Gamma$ can be realized as a group variety over $k(b)$; the torus $T_\Gamma$ can then be identified with $G_m^r$.
Lemma 4.3.3. Let $h : A \to A'$ be a homomorphism of algebraic groups. $h$ induces a map $h^* : A'^\vee \to A^\vee$, by pulling back line bundles. Let $\Xi$ be a finitely generated, torsion free group contained in $(h^*)^{-1}(\Lambda)$. There exists a unique homomorphism of extensions of $A$: $h_\Lambda : A_\Lambda \to A'_\Xi$. Conversely, if $\Xi$ is a finitely generated torsion free subgroup of $A'^\vee$, and $g : A_\Lambda \to A'_\Xi$ is a homomorphism of algebraic group extensions of $A$, then $\Xi \subseteq (h^*)^{-1}(\Lambda)$.
Proof. First let $\Xi$ be a finitely generated torsion free subgroup of $A'^\vee$, and $g : A_\Lambda \to A'_\Xi$ a homomorphism compatible with $h$. Let $\eta \in \Xi$. We wish to show that $h^*(\eta) \in \Lambda$. Let $\Xi'$ be the subgroup generated by $\eta$. Composing $g$ with the natural surjection from $A'_\Xi$ to $A'_{\Xi'}$ we obtain another homomorphism compatible with $h$. Thus we may assume $\Xi$ is generated by $\eta$. In this case $A'_\Xi$ is an extension of $A'$ by $G_m$, corresponding precisely to $\eta \in A'^\vee$. Let $\chi$ be the restriction of $g$ to the torus $T$ of $A_\Lambda$. Then $\chi(A_\Lambda) = A'_\Xi \times_{A',h} A$. As line bundles, $A'_\Xi \times_{A',h} A = h^*(A'_\Xi)$. Thus $h^*(\eta) \in \xi(A_\Lambda) = \Lambda$.
Now assume $\Xi \subseteq (h^*)^{-1}(\Lambda)$. Let $G = A'_\Xi \times_{A',h} A$. $G$ fits into an exact sequence $0 \to T_\Xi \to G \to A \to 0$. We have $G \in \mathcal{G}(\Lambda)$, since $h^*(\eta) \in \Lambda$ for $\eta \in \Xi$. By the universal property, there exists a unique homomorphism $f : A_\Lambda \to G$ compatible with the identity on $A$, and the map $h^* : \Xi \to \Lambda$. Composing $f$ with the natural map of $G$ in $A'_\Xi$, we obtain the desired homomorphism $h_\Lambda$.
For the uniqueness, note first that if $h' : A_\Lambda \to A'_\Xi$ is any homomorphism compatible with $h$, and $\pi : A_\Lambda \to A$ is the structure map, then $h''(x) = (h'(x), \pi(x))$ defines a


homomorphism of extensions of $A$, $h'' : A_\Lambda \to G$. Thus the uniqueness follows from the corresponding statement for $h''$ in the universal property.
The definable subgroups of $A_\Lambda$ can now be determined in a manner analogous to the case of a vector group extension. We first extend the contravariant duality functor $A \mapsto A^\vee$ to the category of Abelian varieties and definable homomorphisms (rather than just algebraic homomorphisms). To do so, using Claim 4 of Proposition 4.1.1, it suffices to set: $(\sigma^n)^* : (A^{\sigma^n})^\vee \to A^\vee$, $(\sigma^n)^*(x) = \sigma^{-n}(x)$.
Next, if $f : A \to A'$ is a definable homomorphism of Abelian varieties, $\Lambda$ is a finitely generated torsion free subgroup of $A^\vee$, and $\Xi$ of $A'^\vee$, and $\Xi \subseteq (f^*)^{-1}(\Lambda)$, we construct a definable map $f_\Lambda : A_\Lambda \to A'_\Xi$, as follows. We can write $f = e \circ (\mathrm{Id}_A, \sigma, \dots, \sigma^m)$. Let $B = A \times \cdots \times A^{\sigma^m}$, and let $G = A_\Lambda \times \cdots \times (A_\Lambda)^{\sigma^m}$. Then $G = B_{\mathrm{pr}_0^*\Lambda + \cdots + \mathrm{pr}_m^*\Lambda^{\sigma^m}}$. By Lemma 4.3.3, the map $e : B \to A'$ lifts to a map $\bar e : G \to A'_\Xi$. Let $f_\Lambda(x) = \bar e((x, \sigma(x), \dots, \sigma^m(x)))$.
The kernels of the maps $f_\Lambda$ are (up to commensurability) the definable subgroups of the groups $A_\Lambda$. We leave the remaining details to the reader.

4.4. Mixed structures


If A is a finite rank subgroup of the fixed field, or is LMS, then the structure on A
is clear. A bit more remains to be said in the mixed case; in the case of subgroups of
vector extensions of Abelian varieties, we saw that a group may have a k-space as a
subgroup, with LMS quotient, yet be indecomposable. Similarly, there are subgroups
of semi-Abelian varieties with an LMS subgroup, whose quotient is an orthogonal
LMS group. In this section we include two lemmas that can be used to determine
the structure in such cases. In the case of finite Morley rank, similar results were
proved in [14]. Curiously, the almost-orthogonality case was important there, and
the full-orthogonality case was included but not used; here just the opposite situation
prevails.
Definition 4.4.1. Let $A$ be a commutative algebraic group. Let $V$ be the maximal vector
subgroup of $A$. By a special subvariety of $A$ we will mean one of the form $C + Y$,
with $C$ a coset of a connected group subvariety of $A$, and $Y$ a subvariety of $V$.
If more generally $A$ is a definable group and $V$ its maximal vector subgroup, a definable
subset of $A$ will be called special if it has the form $C + Y$, with $C$ a coset of a definable
subgroup of $A$ and $Y$ a definable subset of $V$.
Corollary 4.4.2. Let $A$ be a commutative algebraic group, assumed for simplicity
to be defined over the fixed field $\mathrm{Fix}(\sigma)$ in a difference-closed difference field $K$. Let
$F \in \mathbb{Z}[T]$ be an integral polynomial with no cyclotomic factors, and let $G = \{a \in A(K) : F(\sigma)(a) = 0\}$.
Then every definable subset of $G$ is a finite Boolean combination of
special subsets. If $X$ is a subvariety of $A$, then the Zariski closure of $G \cap X$ is a finite
union of special subvarieties of $A$.
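The hypothesis that $F$ has no cyclotomic factors can be tested mechanically. The following is a minimal sketch, not part of the paper: it represents a polynomial as a list of integer coefficients (lowest degree first) and compares it with the cyclotomic polynomials $\Phi_n$. The function names are illustrative, and the search range relies on the standard estimate $\varphi(n) \ge \sqrt{n/2}$, so that $\Phi_n$ can divide a polynomial of degree $d$ only when $n \le 2d^2$.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def divmod_poly(a, b):
    """Quotient and remainder of a by b, computed over the rationals."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        coef = a[-1] / b[-1]
        q[shift] = coef
        for i, c in enumerate(b):
            a[i + shift] -= coef * c
        a = trim(a)
    return trim(q), a

def gcd_poly(a, b):
    """Euclidean algorithm; the result is a gcd up to a rational scalar."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, divmod_poly(a, b)[1]
    return a

def cyclotomic(n, cache={}):
    """Phi_n, obtained by dividing x^n - 1 by Phi_d for proper divisors d of n."""
    if n not in cache:
        num = [-1] + [0] * (n - 1) + [1]
        for d in range(1, n):
            if n % d == 0:
                num, _ = divmod_poly(num, cyclotomic(d))
        cache[n] = num
    return cache[n]

def has_cyclotomic_factor(F):
    """True if some Phi_n shares a nonconstant factor with F."""
    d = len(trim(F)) - 1
    # phi(n) >= sqrt(n/2), so only n <= 2*d*d needs to be examined
    for n in range(1, 2 * d * d + 1):
        if len(gcd_poly(F, cyclotomic(n))) > 1:
            return True
    return False
```

For instance, `has_cyclotomic_factor([1, -3, 1])` tests $T^2 - 3T + 1$, whose roots are not roots of unity, while `[-1, 0, 0, 1]` encodes $T^3 - 1$, which is divisible by $\Phi_1$ and $\Phi_3$.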


Proof. Let $0 \to V \to A \xrightarrow{\pi} B \to 0$ be exact, $V$ the maximal vector subgroup of $A$, $B$ a semi-Abelian variety. Let $0 \to (V \cap G) \to G \to \bar B \to 0$ be the restriction to $G$. By Corollary 4.1.13, $\bar B$ is LMS; on the other hand $V \cap G$ is a definable vector group. Hence, using Lemmas 3.4.8 and 3.4.9, the hypotheses of Proposition 3.6.1 hold for this latter sequence. Moreover, since $\bar B$ is stable, the approximate affine homomorphism in the conclusion is an affine homomorphism, and further since $\bar B$ is LMS, the set $B_1$ in the conclusion is a finite Boolean combination of cosets of group subvarieties of $\bar B$. Thus by Proposition 3.6.1, every definable subset of $G$ is a finite Boolean combination of special definable subsets of $G$. Applying this to $G \cap X$, and taking Zariski closure, we get the desired conclusion.
We observe that there is little difference between special subvarieties and group subvarieties, as far as the torsion points are concerned.
Lemma 4.4.3. Let $A$ be a commutative algebraic group over $\mathbb{Q}^a$, $T$ the group of torsion points (or, prime-to-$p$ torsion points) of $A(\mathbb{Q}^a)$, $X$ a subvariety of $A$. Suppose $X \cap T \subseteq \bigcup_{i=1}^{M} D_i$, where $D_i \subseteq X$ is a special subvariety of $A$. Then the Zariski closure of $X \cap T$ is the union of at most $M$ cosets of connected group subvarieties of $A$.
Proof. Let $D$ be a special subvariety of $A$, with $D \cap T \neq \emptyset$. We will show that the Zariski closure $Z$ of $T \cap D$ is a coset of a connected group subvariety of $A$. Since the Zariski closure of $T \cap X$ equals the union of the Zariski closures of the sets $T \cap D_i$, this will prove the lemma.
Write $D = C + Y$, $C$ a coset of a connected group subvariety $E$ of $A$, and $Y$ a subvariety of a vector subgroup $V$ of $A$. Since $V$ is a vector group, there exists a definable endomorphism $\pi : V \to V$ with kernel $V \cap E$, $\pi^2 = \pi$. We have $D = C + Y = C + E + Y = C + E + \pi(Y) = C + \pi(Y)$. Thus replacing $Y$ by $\pi(Y)$ and $V$ by $\pi(V)$, we may assume $E \cap V = (0)$. Pick $d_0 \in (D \cap T)$. For any $d \in (D \cap T)$, let $d' = d - d_0$, $Y' = Y - d_0$. Then $d' \in (E + Y') \cap T$. Write $d' = e + y$, $e \in E$, $y \in Y'$. Then $me + my = 0$ for some $m > 0$. So $me = -my \in (E \cap V) = (0)$. Hence $my = 0$ so $y = 0$. Thus $d' = e \in E$, so $d \in E + d_0$.
We have shown that $(D \cap T) = (E + d_0) \cap T = (E \cap T) + d_0$, the last equality because $d_0$ is torsion. Let $E'$ be the Zariski closure of $(E \cap T)$, $E_1$ the connected component. Since $E_1$ is divisible, the sequence $0 \to T(E_1) \to T(E') \to T(E'/E_1) \to 0$ is exact, so $E'/E_1$ has a finite torsion group, hence is a vector group and actually has no torsion. Thus $E' = E_1$ is connected. The Zariski closure of $(D \cap T)$ is a coset of $E_1$.
4.5. Several automorphisms
It is illuminating to trace the local modularity demarcation line in the case of several
automorphisms. The results will be obtained in a weak form; algebraic modularity in
place of LMS. The stronger form (including stability of the full induced structure)
does not hold in the more general context. The present results will be deduced from
the ordinary (r = 1) case without reopening [5].


In this subsection, we work in a universal domain $U$ for the theory of fields with $r$ automorphisms, $\sigma_1, \dots, \sigma_r$. $F$ denotes the free group generated by $\sigma_1, \dots, \sigma_r$. If $\tau \in F$, we denote by $(U, \tau)$ the structure consisting of the underlying field of $U$, and the automorphism $\tau$. It is a universal domain for ordinary difference fields. Write $U_i = (U, \sigma_i)$.
Recall the action of $F$ on difference equations. We obtain an induced action on the class of all definable sets, that we denote: $B \mapsto B^\tau$. Also write $B^F = \bigcap_{\tau \in F} B^\tau$.
In the ordinary case one could indifferently study definable subgroups, or infinitely definable ones; the latter were connected components of definable subgroups. For $r > 1$ the situation is different; one must allow infinite intersections of definable subgroups in order to obtain a group of finite rank. It remains true (with the same proof) that every definable group maps into an algebraic one, with finite kernel. Finite rank subgroups of the additive group are vector spaces over the common fixed field $k$ of all the automorphisms. Minimal definable groups thus live (up to isogeny) in simple Abelian varieties, or in $G_m$, as before.
Finite transformal degree: Consider a simple Abelian variety (or torus) $G$ defined over $k$. Let $E = \mathbb{Q} \otimes \mathrm{End}(G)$. The twisted group ring $E[F]$ is defined in the obvious way, taking into account the action of the $\sigma_i$ on $\mathrm{End}(G)$. (Of course this ring is no longer Euclidean.) Let $A$ be an $\infty$-definable subgroup of $G$, of finite transformal degree. Associate to $A$ a left ideal and a two-sided ideal of $E[F]$:
$I_0(A) = \{r : rA \text{ is finite}\}$,
$I(A) = \{r : rsA \text{ is finite for all } s \in E[F]\}$,
$R(A) = E[F]/I(A)$.
Lemma 4.5.1.
$\dim_E E[F]/I_0(A) \le \dim(A)$.
$\dim_{\mathbb Q} E[F]/I(A) \le (\dim(A) \dim_{\mathbb Q} E)^2$.
There exists an $\infty$-definable subgroup $B$ of $G$, of finite transformal degree, containing $A$, such that $I_0(B) = I(B) = I(A)$ and $R(A) = E[F]/I(B)$.
Proof. Let $d = \dim(A)$. Note first that
$\dim_E E[F]/I_0(A) \le \dim(A)$.
To prove this it suffices to find an $E$-dependence relation among any $d + 1$ elements $h_0, \dots, h_d \in E[F]$. Let $h(x) = (h_0(x), \dots, h_d(x))$ and consider the subgroup $h(A)$ of $G^{d+1}$. This is a definable subgroup of transformal degree at most $d$. Thus the Zariski closure has dimension at most $d$. Using the simplicity of $G$ one obtains an $E$-linear dependence relation, as in the case $r = 1$.
Thus $\dim_{\mathbb Q} E[F]/I_0(A) \le \dim_{\mathbb Q}(E) \dim(A)$.
Let $r_1, \dots, r_n$ be a $\mathbb{Q}$-basis for $E[F]/I_0(A)$. Let $B = \sum_{i=1}^{n} r_i(A)$. If $r \in E[F]$, then $mr = \sum a_i r_i + s$ with $s(A)$ finite, $m > 0$ and $m, a_1, \dots, a_n \in \mathbb{Z}$. So $mr(A) \subseteq B + s(A)$, i.e. $r(A) \cap B$ has finite index. Thus $r(B) \cap B$ has finite index for any $r \in E[F]$. Now clearly $I_0(B) = I(B) = I(A)$.


Applying the first item to $B$, the second follows.

Proposition 4.5.1. Let $G$ be a simple Abelian variety defined over $k$. Let $A$ be a definable subgroup of finite transformal degree:
1. If the image of $F$ in $E[F]/I(A)$ is a finite group, then $A$ is definably isomorphic to an $\infty$-definable subgroup of $H(k)$, $H$ a $k$-algebraic group.
2. Otherwise, $A$ is algebraically modular. Moreover, every quantifier-free definable subset of $A$ is a Boolean combination of definable cosets.
Proof. Let $K$ be the kernel of the homomorphism $F \to E[F]/I(A)$. If the image of this homomorphism is finite, of order $n$, then $K$ is finitely generated, say by $g_1, \dots, g_m$. By definition, $(1 - g_i)(A)$ is finite; so there exists a subgroup $A'$ of $A$ of finite index in $A$, such that each $g_i$ fixes $A'$. Thus $A' \subseteq G(k_n)$ where $k_n = \mathrm{Fix}(K)$, a Galois field extension of $k$ of order $n$, $k = \mathrm{Fix}(F)$ being the total fixed field. By reduction of scalars we obtain the desired statement.
Assume the image of $F$ is infinite. A finitely generated periodic linear group over $\mathbb{Q}$ is finite; hence if the image of $F$ is infinite, it must contain an element $\bar\sigma$ of infinite order. $\bar\sigma$ is the image of some $\sigma$ in $F$. Now $F(\bar\sigma) = 0$ for some $F \in \mathbb{Z}[X]$, and we may take $F$ irreducible in $\mathbb{Q}[X]$. Then $F$ has no cyclotomic factors. The solution set in $U_1$ of $F(\sigma) = 0$ is ALM, and contains $A$. Thus $A$ is also ALM.
For the moreover, as $I$ is a two-sided ideal, $F(\tau^{-1}\sigma\tau) = 0$ for any $\tau$. So $A \subseteq \mathrm{Ker}\, F(\tau^{-1}\sigma\tau)$, and $\tau(A) \subseteq \mathrm{Ker}(F(\sigma))$. Thus any finite product of the sets $\tau(A)$ is contained in some power of $\mathrm{Ker}(F(\sigma))$, so it is ALM. Consider the many-sorted structure $V$ whose universes are the groups $\tau(A)$, and whose relations are the intersections of subvarieties of $G^m$ with products of $m$ of these. Then $V$ is a many-sorted Abelian structure. Let $V'$ be the result of adding the isomorphisms $\sigma : \tau(A) \to \nu(A)$, for any pair $\sigma, \tau$ and $\nu = \sigma\tau$. These isomorphisms are additive, so $V'$ is still an Abelian structure. Every quantifier-free definable subset of $A$ (in $U$) is also definable in $V'$, and the conclusion follows.
It seems likely that if no $\bar\sigma$ is a root of unity, $\sigma \neq 1 \in F$, then $A$ is LMS. In case the image of $F$ is infinite, but the image of some $\sigma \neq 1$ is a root of unity, $A$ is ALM by Proposition 4.5.1, but one does not have stability.
Problem 4.5.2. Is there an analog of the proposition, for infinitely-definable sets of
finite rank, not assumed to carry a group structure?
Products of ALM groups: In the case of one automorphism, arbitrary definable
groups could be decomposed into finite rank groups, and algebraic groups; the finite
rank theory is thus decisive. Here the situation is more complex; even where the
automorphisms commute, one has groups of intermediate orders of magnitude; and
beyond that, the theory is not even supersimple. In particular, the finite rank results
of the previous paragraph do not yield a classification of all ALM infinitely definable
subgroups.


We restrict ourselves to pointing out one class of ALM groups. It consists of groups $A^F$, where $A$ is a definable LMS group in some ordinary reduct $U_i$, and of their products. The proof that $A^F$ is ALM is the same as that given in the previous paragraph; but the proof that the property holds for products is different, and may be of use in other situations.
Proposition 4.5.2. Let $G_i$ be a commutative algebraic group over $U$. Let $k = \mathrm{Fix}(F)$. Let $A_i$ be a $U_i$-definable subgroup, LMS of finite dimension as such, defined over $k$. Then $(A_1)^F \times \cdots \times (A_r)^F$ is algebraically modular.
For notational simplicity, we prove Proposition 4.5.2 in case $r = 2$. But in this case we formulate a sharper statement, paying attention to the number of conjugates used. Assume $r = 2$, and write $\sigma = \sigma_1$, $\tau = \sigma_2$. Then Proposition 4.5.2 follows from Lemma 4.5.3 (keeping in mind the proof of 3.5.6(3)).
Lemma 4.5.3. Let $G_i$ be a commutative algebraic group over $U$ ($i = 1, 2$), $G_i$ defined over $\mathrm{Fix}(\sigma_j)$ ($\{i, j\} = \{1, 2\}$), and $V$ a constructible subset of $G_1 \times G_2$. Let $A_i$ be a $U_i$-definable subgroup of $G_i$, LMS of finite transformal degree $d_i$ as such.
Let $E_2 = \bigcap_{n=-d_1}^{d_1} A_2^{\sigma^n}$, $E_1 = \bigcap_{n=0}^{(2d_1+1)d_2} A_1^{\tau^n}$.
Then the Zariski closure of $(E_1 \times E_2) \cap V$ is a finite union of cosets of group subvarieties.
Proof. Assume the data are defined over an algebraically closed difference field $K$. For $b \in G_2$, let $V(b) = \{a \in G_1 : (a, b) \in V\}$. Let $Z(b) = \mathrm{ZCl}(V(b) \cap A_1)$. As $A_1$ is ALM, $A_1 \cap V(b)$ is a finite union of cosets of $U_1$-definable subgroups. By Hrushovski and Pillay [15], in a 1-based group, there are no infinite definable families of definable subgroups, so as $b$ varies only finitely many distinct subgroups arise. Thus also upon taking Zariski closure, there exist $K$-algebraic subgroups $H_1, \dots, H_l \le G_1$, such that for any $b \in G_2$, $Z(b)$ is a finite union of cosets of the $H_i$. By Proposition 2.2.1, each such coset is definable over $K(b^{\sigma^{-d_1}}, \dots, b^{\sigma^{d_1}})^a$. Let $G = G_2^{2d_1+1}$. For $b \in G_2$, let $b^* = (b^{\sigma^{-d_1}}, \dots, b^{\sigma^{d_1}}) \in G$. Let $\pi((y_{-d_1}, \dots, y_{d_1})) = y_0$, $\pi : G \to G_2$. By compactness, there exist constructible sets $W_i \subseteq G_1 \times G$ such that for any $b \in G_2$, $Z(b) = \bigcup_i W_i(b^*)$, and for any $y = (y_{-d_1}, \dots, y_{d_1}) \in G$, $W_i(y)$ is a finite union of cosets of $H_i$, and $W_i(y) \subseteq V(y_0)$.
So far, we have not used $\tau$ at all.
Let $X_i = \{(a, b) \in E_1 \times E_2 : (a, b^*) \in W_i\}$. Then $(E_1 \times E_2) \cap V = \bigcup_i X_i$. So it suffices to prove that $\mathrm{ZCl}(X_i)$ is a finite union of cosets of group subvarieties, for each $i$.
Fix one value of $i$. Let $H = G_1/H_i$, $E$ = the image of $E_1$ in $H$, $B = A_2^{2d_1+1} \le G$, $V^*$ = image of $W_i$ in $H \times G$, $X^* = \{(a, b^*) : (a, b) \in X_i\}$. Note that if $b \in E_2$ then $b^* \in B$.
Applying Lemma 3.5.11 to this data, and to the automorphism $\tau$ (viewing $E$ as a possibly undefinable group of points), we find that $\mathrm{ZCl}(X^*) = \bigcup_j C_j$, with $C_j$ cosets of group subvarieties. Write $\pi$ also for the map $(\mathrm{Id}, \pi) : (G_1 \times G) \to (G_1 \times G_2)$. Then $\pi C_j$ is a constructible coset of $G_1 \times G_2$ (hence a coset of a group subvariety). We have $X_i = \pi X^* \subseteq \bigcup_j \pi C_j$. As $X^* \cap C_j$ is Zariski dense in $C_j$, $\pi(X^* \cap C_j)$ is Zariski dense

in $\pi C_j$; but $\pi(X^* \cap C_j) \subseteq (\pi X^* \cap \pi C_j) = (X_i \cap \pi C_j)$. Thus $\mathrm{ZCl}(X_i) = \bigcup_j \pi C_j$, proving the lemma.

Let $G$ be an algebraic group defined over $U$. Say that a subgroup $H$ of $G(U)$ is difference-algebraically modular if for any difference-algebraic variety $V \subseteq G^m$, $H^m \cap V$ is a finite union of cosets. The following remark applies, in particular, to the group obtained in the conclusion of Proposition 4.5.2, and can be used to strengthen that conclusion.
Lemma 4.5.4. Let $G$ be an algebraic group defined over $k$. Let $H$ be an $F$-invariant group, $\sigma_i H \subseteq H$, and suppose $H$ is algebraically modular. Then $H$ is difference-algebraically modular.
Proof. Let $V \subseteq G$ be a difference-algebraic variety (it suffices to consider this case, applying it to powers of $G$ and of $H$). Then for some finite $F_0 \subseteq F$, and some variety $V' \subseteq G^{F_0}$, $V = \{a \in G : a^{F_0} \in V'\}$. Now $V' \cap H^{F_0}$ is a finite union of cosets, since $H$ is ALM. Thus so is $V \cap H = \{a : a^{F_0} \in (V' \cap H^{F_0})\}$.
4.6. Locally modular subgroups (complements added in proof)
While our main results concern individual locally modular subgroups, we add two remarks on the family of all such subgroups taken together. They are stated for LMS definable subgroups of an Abelian variety, in a universal domain for ordinary difference fields; but should go through in positive characteristic, or with several automorphisms, for ALM groups. This section is not used in the rest of the paper.
Definition 4.6.1. Let $A$ be a semi-Abelian variety defined over a difference field $K$. Let $A_{LM}$ be the union of all LMS definable subgroups of $A$.
Remark 4.6.2. By Remark 4.1.10, every definable subgroup of $A$ is commensurable with one defined over $K^a$, and indeed is a subgroup of finite index of a group defined over $K^a$. The sum of two LMS groups is LMS (Lemma 3.4.1). Thus if $B$ is LMS and defined over $K^a$, the sum of conjugates of $B$ is also LMS, and defined over $K$. So $A_{LM}$ is a union of $K$-definable groups. It is moreover a subgroup of $A$.
If $C$ is an Abelian group, we write $\mathrm{rk}_{\mathbb Q}(C)$ for the $\mathbb{Q}$-dimension of $C \otimes_{\mathbb Z} \mathbb{Q}$.
Lemma 4.6.3. Let $A$ be a semi-Abelian variety defined over a difference field $K = K^a$. Let $L$ be a difference field extension of $K$ and suppose $\mathrm{tr.deg}_K L = n$. Then
$\mathrm{rk}_{\mathbb Q}(A_{LM}(L)/A_{LM}(K)) \le n \dim_{\mathbb Q}(\mathrm{End}(A))$.
Proof. For simplicity of notation, we prove the lemma when $A$ is defined over the fixed field of $\sigma$; then comment on the generalization.


Pick $a_1, \dots, a_r \in A_{LM}(L)$ with $t = \mathrm{tr.deg.}_K(a_1, \dots, a_r)$ as large as possible; so $t \le n$. In particular, $\sigma(a_i) \in k(a_1, \dots, a_r)^a$. Each $a_i$ satisfies an equation of the form $m_k \sigma^k(a_i) = \sum_{j<k} e_j \sigma^j(a_i)$, where the $e_j$ are endomorphisms of $A$, and $m_k$ is an integer. (This reduces to the simple case, and there follows from Proposition 4.1.1, and from the fact that $\mathbb{Q} \otimes \mathrm{End}(A)$ is a division ring there.)
$U = A_{LM}(L)/A_{LM}(K)$ is a divisible Abelian group. Let $V$ be the $\mathbb{Q} \otimes \mathrm{End}(A)$-subspace of $U$ generated by $\{\sigma^k(a_i) : k = 1, 2, \dots;\ i = 1, \dots, r\}$. By the above remark, $V$ is finite dimensional (as a $\mathbb{Q} \otimes \mathrm{End}(A)$-space, hence as a $\mathbb{Q}$-space).
Let $b \in A_{LM}(L)$. We will show that $b \in V$, i.e. that $mb = h(a) + e$ for some $h \in \mathrm{Hom}(A^r, A)$, $e \in A_{LM}(K)$, and some integer $m > 0$.
$a, b$ are both part of the same LM group $B$, of finite rank, and we have $b \in \mathrm{acl}\, k(a_1, \dots, a_r)$. Thus by LM there exists a definable subgroup $S_0 \le B \times B^r$, and $e \in B(k^a)$, with $S_0 \cap (B \times 0)$ finite, and with $(e + b, a) \in S_0$. So there exists a definable homomorphism $h_0 : B^r \to B$, and an integer $m \neq 0$, with $m(e + b) = h_0(a)$.
This proves the lemma when $A$ is defined over the fixed field. For the general case, one may reduce to $A = G_m$ or $A$ a simple Abelian variety. If $A$ has a finite rank subgroup at all, then $A$ is isogenous to $A^\tau$ for some $\tau = \sigma^n$ (take $n > 0$ least possible). An entirely analogous proof then shows that $\mathrm{rk}_{\mathbb Q}(A_{LM}(L)/A_{LM}(K)) \le n \dim_{\mathbb Q}(\mathrm{Hom}(A, A^\tau))$. But $\mathbb{Q} \otimes \mathrm{Hom}(A, A^\tau)$ is a one-dimensional vector space over $\mathbb{Q} \otimes \mathrm{End}(A)$, so $\dim_{\mathbb Q} \mathbb{Q} \otimes \mathrm{Hom}(A, A^\tau) \le \dim_{\mathbb Q} \mathbb{Q} \otimes \mathrm{End}(A)$.
Problem 4.6.4. Suppose $A$ is defined over a difference field $K$, $\mathrm{tr.deg.}_{\mathbb Q}(K) < \infty$. Is $\mathrm{rk}_{\mathbb Q}(A_{LM}(K))$ finite?
This problem is analogous to the Manin-Chai theorem of the kernel, for differential fields. We prove the case: $A$ defined over $k^a$:
Lemma 4.6.5. Let $(L, \sigma)$ be a difference field, $A$ a semi-Abelian variety defined over $k = \mathrm{Fix}(\sigma^l)$ for some $l$. Suppose $\mathrm{tr.deg}_k L = n$. Then $\mathrm{rk}_{\mathbb Q}(A_{LM}(L)) \le n\, \mathrm{rk}_{\mathbb Q}(\mathrm{End}(A))$. In particular, if $L \subseteq k^a$ then $A_{LM}(L)$ is torsion.
Proof. By the relative case, Lemma 4.6.3, it suffices to show that every element $e$ of $A_{LM}(k^a)$ is torsion. This reduces to the case that $A$ is a simple Abelian variety, or $G_m$. We have $\sigma^l(e) = e$ for some $l$. So $(\sigma^l - 1)(e) = 0$. $e$ lies in an LMS group $B$. For some $F \in \mathrm{End}(A)[T]$, $\mathrm{Ker}(F(\sigma))$ is commensurable with $B$; we may assume (replacing $F$ by some $nF$ if necessary) that $B \subseteq \mathrm{Ker}(F(\sigma))$. Now $F$, $T^l - 1$ generate a left ideal $I$ bigger than $E(A)[T]F$, say generated by $G$. Then $\mathrm{Ker}(G(\sigma)) \subseteq \mathrm{Ker}(\sigma^l - 1) \cap \mathrm{Ker}(F)$; as $\mathrm{Ker}(F)$ is LMS, the intersection is finite. So $\mathrm{Ker}(G(\sigma))$ is finite, hence $G \in E(A)$; as also $G \in \mathrm{End}(A)[T]$, $G \in \mathrm{End}(A)$, and $G \neq 0$. Every element $g$ of the ideal $I$ satisfies $g(\sigma)e = 0$. So $Ge = 0$. Now $G \neq 0$, so there exists $G' \in \mathrm{End}(A)$, $G'G = m \in \mathbb{Z} \setminus 0$. Thus $me = 0$, hence $e$ is torsion.
Lemma 4.6.6. Let $A$ be a commutative algebraic group defined over a difference field $K = K^a$. Let $V$ be a constructible subset of $A$ containing no translates of positive-dimensional group subvarieties of $A$. Then there exists a difference field $L$ of finite transcendence degree over $K$, such that every point of $V \cap A_{LM}$, in any difference field extension, lies in $A(L)$.
For simplicity, we stated the lemma in the case when $V$ contains no translates of positive-dimensional group subvarieties of $A$; in general, $\mathrm{ZCl}(V \cap A_{LM})$ is a finite union of $A(L)$-definable cosets.
Proof. Let $L_1$ be a finitely generated difference field over $K$, with $V$ defined over $L_1$. Note that there is no properly increasing chain $K \subseteq L(1) \subset L(2) \subset \cdots$ of relatively algebraically closed difference subfields of $(L_1)^a$. For in such a chain, let $I(n)$ be the ideal of difference polynomials over $L(n)$ vanishing on a given tuple of generators of $L_1$ over $K$. If $I(n)$ is generated by $I(n-1)$, it is easy to see that $L(n) = L(n+1)$. Thus we have an increasing chain of prime difference ideals, contradiction. Hence, there exists a maximal algebraically closed difference subfield $L$ of $(L_1)^a$ satisfying $K \subseteq L$ and $\mathrm{tr.deg.}_K(L) < \infty$. Now if $c \in A_{LM} \cap V$, then $c \in B$ for some LM group $B \le A$, and by LM and the assumption on $V$, $B \cap V$ is finite. So $c \in (L_1)^a$. But $c \in B$, so the difference subfield generated by $c$ over $K$ has finite transcendence degree over $K$. Thus $c \in L$.
Putting together Lemmas 4.6.6, 4.6.5, and the theorem of Faltings-McQuillan, we see that if $A$ is a semi-Abelian variety defined over $\mathrm{Fix}(\sigma^n)$, then $A_{LM}$ is Mordellic, i.e. the Zariski closure of any subset of $A_{LM}$ is a finite union of cosets of subgroups. It would be very good to prove this without quoting Faltings' theorem, and to remove the assumption that $A$ is defined over $\mathrm{Fix}(\sigma^n)$.

5. Finding the difference equations


This section contains all the number theory that we will need for the main results. We
show there that the torsion points are contained in a group defined by an appropriate
difference equation. This equation arises from characteristic equations of Frobenius
maps, lifted to characteristic zero.
We begin by fixing a prime $p$ and restricting attention to the group of points of
finite order prime to $p$. We will be able to get bounds on the full group $T$ of torsion
points by using two different primes; however our bounds for $T_{p'}$ are better.

Definition 5.0.7. Let $A$ be a commutative algebraic group.
1. $\dim_m(A)$ is the dimension of the maximal algebraic torus embedded as a subgroup of $A$.
2. $\dim_{ab}(A)$ is the maximal dimension of an Abelian variety quotient of $A$.
3. $\dim_r(A) \le \dim(A)$ is defined as follows: let $0 = A_0 \le \cdots \le A_m = A$, with $A_i$ a group subvariety of $A$ over $K$, and $A_{i+1}/A_i$ a $K$-simple Abelian variety or torus. Let $\{B_j\}$ be the quotients $A_{i+1}/A_i$, but choose only one $B_j$ for each $K$-isogeny type. Let $\dim_r(A) = \sum_j \dim(B_j)$.

Now suppose $A$ is defined over the ring of integers $R$ of a number field $K$. A prime $p$ of $K$ is a prime of good reduction for $A$ if the following holds. The reduced variety $A_k$ over the residue field $k = R_p/p$ becomes also a commutative, connected algebraic group. Moreover, $\dim_{ab}(A_k) = \dim_{ab}(A)$, and $\dim_m(A_k) = \dim_m(A)$.
Definition 5.0.8. Let $p$ be a rational prime. $T_{p'}(A)$ denotes the group of points of $A(L)$ of finite order prime to $p$, where $L$ is some algebraically closed field over which $A$ is defined. If $p$ is a prime of a number field, we will sometimes write $T_{p'}(A)$ with reference to the residue characteristic of $p$.
We begin with a well-known result of Weil's concerning Abelian varieties; it generalizes without effort to arbitrary commutative algebraic groups. Fix a prime $p$, let $k = GF(q)$ be a finite field of characteristic $p$, and let $k^a$ be an algebraic closure. Let $\varphi_q$ be the $q$-Frobenius automorphism of $k^a$.
Lemma 5.0.9. Let $A$ be a commutative algebraic group over $k = GF(q)$. Then there exists a polynomial $F(T) \in \mathbb{Z}[T]$ with no cyclotomic factors such that $F(\varphi_q)$ vanishes on $T_{p'}(A)$. We have $\deg(F) \le 2 \dim_r(A)$. The sum of the absolute values of the coefficients of $F$ is at most $(1 + q^{1/2})^{2 \dim_r(A)}$.
Proof. Observe that if $f, g$ are two complex polynomials, and $s(f)$ denotes the sum of the absolute values of the coefficients of $f$, then $(*)$: $s(fg) \le s(f)s(g)$. This will be used twice. First $(*)$ permits a decomposition of $A$. Note that there exists over $k$ an exact sequence $0 \to L \to A \to \bar A \to 0$, with $L$ a linear algebraic group and $\bar A$ an Abelian variety. This gives rise to an exact sequence $0 \to T_{p'}(L) \to T_{p'}(A) \to T_{p'}(\bar A)$. Thus if $F_1(\varphi_q)$ vanishes on $T_{p'}(L)$ and $F_2(\varphi_q)$ vanishes on $T_{p'}(\bar A)$, then $F_1F_2(\varphi_q)$ vanishes on $T_{p'}(A)$. This reduces the problem to the three cases of linear tori, commutative unipotent groups, and Abelian varieties. When $A$ is an Abelian variety, the result comes from [31]. Weil actually shows the existence of a monic $F$ of degree $2 \dim_r(A)$ whose eigenvalues are all of absolute value $q^{1/2}$; and then we can use the observation $(*)$ above.
When $A$ is an algebraic torus, there exists an isomorphism $g : A \to G_m^n$, defined over a finite field $GF(q^l)$. The Frobenius conjugate $\varphi_q(g)$ is another such isomorphism, so $\gamma = g\, \varphi_q(g)^{-1}$ is an algebraic automorphism of $G_m^n$, i.e. $\gamma \in GL_n(\mathbb{Z})$. So $\gamma$ is fixed by $\varphi_q$, and
$\gamma\, \varphi_q(\gamma) \cdots \varphi_q^{l-1}(\gamma) = \mathrm{Id}$,
using also that $g$ is fixed by $\varphi_q^l$. Now $(A, \varphi_q)$ is isomorphic (via $g$) to $(G_m^n, g\varphi_q g^{-1})$, and $g\varphi_q g^{-1} = \gamma\varphi_q$. Now $(\gamma\varphi_q)^l = (\varphi_q)^l = \varphi_{q^l}$, so the polynomial $T^l - q^l$ works.
When $A$ is a unipotent group, it has no points of finite order prime to $p$, so the constant polynomial 1 will do.
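The observation $(*)$ and the resulting coefficient bound can be checked numerically. The sketch below is only an illustration under stated assumptions: polynomials are lists of integers (lowest degree first), and the quadratics $T^2 - aT + q$ with $a^2 \le 4q$ are used as stand-ins for Weil-type factors, since their roots have absolute value $q^{1/2}$. The helper names are hypothetical and do not come from the paper.

```python
import math

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

def coeff_sum(f):
    """s(f): the sum of the absolute values of the coefficients."""
    return sum(abs(c) for c in f)

def weil_quadratic(a, q):
    """T^2 - a*T + q; its roots have absolute value sqrt(q) when a^2 <= 4q."""
    assert a * a <= 4 * q
    return [q, -a, 1]

def within_bound(F, q):
    """Check s(F) <= (1 + sqrt(q))^deg(F), with a small floating-point tolerance."""
    deg = len(F) - 1
    return coeff_sum(F) <= (1 + math.sqrt(q)) ** deg + 1e-9
```

With $q = 5$, the factors `weil_quadratic(2, 5)` and `weil_quadratic(-4, 5)` each satisfy the bound with exponent 2, and by $(*)$ their product satisfies it with exponent 4, which is how the bound in Lemma 5.0.9 survives the decomposition of $A$.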
To lift this to characteristic zero, suppose now that $A$ is a connected commutative algebraic group over a number field $K$, and $p$ is a prime of good reduction. Then $T_{p'}(A) \subseteq A(L)$, where $L$ is the maximal unramified extension of the completion $K_p$ of $K$ at $p$. The Frobenius automorphism $\varphi_0$ of $k^a$ lifts to an automorphism $\varphi$ of $L$ (footnote 4). The reduction map from $L$ to $k^a$ induces an injective map on $T_{p'}(A)$ (footnote 5). It follows that $\varphi$ satisfies the same functional equation on $T_{p'}(A)$ as $\varphi_q$ does on $T_{p'}(A_k)$. Thus:
Lemma 5.0.10. With the above assumptions, there exists an automorphism $\sigma_0$ of $K^a$ and an integral polynomial $F$ with no cyclotomic factors, of degree $\dim_r(A)$, absolute coefficient sum bounded by $(1 + q^{1/2})^{2 \dim_r(A)}$, such that $F(\sigma_0)$ vanishes on the prime-to-$p$ torsion points of $A$.
If $(L, \sigma)$ is any difference field extending $(K^a, \sigma_0)$, then $F(\sigma)$ vanishes on the prime-to-$p$ torsion points of $A$, since they all lie in $K^a$. We may thus embed $(K^a, \sigma_0)$ in a universal domain $(L, \sigma)$ for difference fields, and work there.
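For a torus the functional equation is concrete enough to simulate. The sketch below is an illustration under assumptions that are not in the paper: the points of $G_m^n$ of order dividing $N$ (with $N$ prime to $p$) are modelled additively as $(\mathbb{Z}/N)^n$, the Frobenius (or its lift) acts as $x \mapsto q \cdot \gamma x$ with $\gamma \in GL_n(\mathbb{Z})$ of order dividing $l$, and the function names are hypothetical. Under this model $T^l - q^l$ annihilates every such point, as in the torus case of Lemma 5.0.9.

```python
def mat_vec(M, x, N):
    """Apply an integer matrix M to a vector x, modulo N."""
    return tuple(sum(m * xi for m, xi in zip(row, x)) % N for row in M)

def frobenius(x, q, gamma, N):
    """Twisted Frobenius on (Z/N)^n in additive notation: x -> q * gamma(x)."""
    return tuple((q * c) % N for c in mat_vec(gamma, x, N))

def apply_T_l_minus_q_l(x, q, gamma, l, N):
    """Evaluate (phi^l - q^l)(x), where phi is the twisted Frobenius."""
    y = x
    for _ in range(l):
        y = frobenius(y, q, gamma, N)
    ql = pow(q, l, N)
    return tuple((yi - ql * xi) % N for yi, xi in zip(y, x))

# Example data (assumed): q = 3, N = 35 prime to 3, gamma a rotation of order 4.
GAMMA = [[0, -1], [1, 0]]
```

With these values, `apply_T_l_minus_q_l(x, 3, GAMMA, 4, 35)` is the zero vector for every `x`, whereas taking `l = 1` generally is not: the twist $\gamma$ is what forces the exponent $l$ in $T^l - q^l$.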

4 For model theorists, this can be viewed as a consequence of quantifier-elimination for algebraically closed valued fields, in a language with a sort for the residue field; it follows that the residue field is stably embedded, with no additional induced structure.
5 The tangent bundle of $A$ can be trivialized, using the differentials of the translations. Then at any point $a$, the map $x \mapsto p'x$ (with $p'$ an integer prime to $p$) induces the map $(x \mapsto p'x)$ on the tangent spaces. It follows that the roots of $p'x = 0$ are simple (modulo $p$).

6. The theorems on torsion points


In this section we prove the number theoretic applications. For the group $T_{p'}$ of torsion points of order prime to a given prime $p$ (characteristic of a prime of good reduction), we show: (1) $T_{p'}$ is algebraically modular, i.e. the Manin-Mumford conjecture holds for it; (1+) the same is true for vector extensions of semi-Abelian varieties; (2) a relative version, reducing the Mordell-Lang conjecture for groups of finite $\mathbb{Q}$-rank to Faltings' theorem regarding finitely generated groups; (3) for groups over $\mathbb{Q}_p$ with good reduction, the distance of a subvariety to points of $T_{p'}$ not on it is bounded away from 0. This partially confirms a conjecture of Silverman, Tate, Voloch.
We then show (4) that the bounds in (1)-(3) are effective; in the case of (1) and (2) we write them effectively. They are also uniform when the subvariety varies in an algebraic family, and when the Abelian variety is perturbed $p$-adically.
Conditions (1), (2) and (4) also follow from results of Faltings, Serre, Raynaud, Hindry, McQuillan, Bost, David, Masser-Wüstholz, as explained in the introduction.
These results will be easy applications of our theory of definable groups in the universal domain for ordinary difference fields. Condition (1) will be an immediate consequence of embedding $T_{p'}$ in an LMS definable group. For (1+) and (2), orthogonality is also used. For the effectivity, the general numerical bounds of Section 2 apply. With the intervention of a Robinson-style compactness argument, (3) becomes as easy as (1).
If we wish to extend the results to the group of all torsion points, two routes are available. One is to find an automorphism satisfying an appropriate functional equation on all torsion points. This requires more than the elementary number theory we have used so far. If one is willing to quote Serre to show the existence of such an automorphism, results (1), (2) and (4) become equally easy for all torsion points, still using the ordinary theory. (Condition (3) would require a $p$-adically continuous automorphism of this nature.)
The second route keeps the number theory of this paper at its present primitive level, but uses two automorphisms instead of one. Using this method we prove Manin-Mumford for semi-Abelian varieties (5), and find the explicit bounds (6). Since we have not developed orthogonality beyond the finite rank context, this method does not immediately yield the analogs of (1+) or (2). We expect these can be done either by an ad hoc extension of the proof, or by developing orthogonality theory, but will be content with (5) and (6).

6.1. Qualitative results


Proposition 6.1.1. Let $A$ be a semi-Abelian variety defined over a number field $K$. Let $T = T(A)$ be the group of torsion points of $A$. For any subvariety $X \subseteq A$, the Zariski closure of $X \cap T$ is a finite union of cosets of group subvarieties of $A$.
Proof. Pick two primes of good reduction of $A$, of distinct residual characteristics $p, l$. We will treat the group $T$ of all torsion points as a sum $T = T_{p'} + T_{l'}$. Choose $\sigma$ and $\tau$ and polynomials $F_p, F_l$ as in Lemma 5.0.9, independently for the two primes $p, l$, so that $F_p(\sigma)$ vanishes on $T_{p'}$ and $F_l(\tau)$ vanishes on $T_{l'}$. Let $A_p = \mathrm{Ker}(F_p(\sigma))$, $A_l = \mathrm{Ker}(F_l(\tau))$. Let $F$ be the free group generated by $\sigma, \tau$. With notation as in Proposition 4.5.2, by Corollary 4.1.13 and Proposition 4.5.2, $A_p^F \times A_l^F$ is algebraically modular. Hence (see Lemma 6.1.1 below) so is $A_p^F + A_l^F$. Since $T_{p'} \subseteq A_p$ and $T_{p'}$ is $F$-invariant, $T_{p'} \subseteq A_p^F$; likewise $T_{l'} \subseteq A_l^F$; hence $T \subseteq A_p^F + A_l^F$. So $T$ is algebraically modular.
To clarify the relation between $A_p \times A_l$ and $A_p + A_l$, note:
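Lemma 6.1.1. Let $X$ be a subvariety of $A$,
\[ Y = \{(a,b) \in A^2 : a + b \in X\}, \qquad Z = \mathrm{ZCl}\bigl(Y \cap (T_{p'} \times T_{l'})\bigr), \qquad f : A^2 \to A, \quad f(x,y) = x + y. \]
Then $f(Z) = \mathrm{ZCl}(X \cap T)$.

Proof. By the algebraic modularity of $A_p \times A_l$, $Z$ is a finite union of cosets of group subvarieties. For each such coset $C$, $f(C)$ is a constructible coset, hence is Zariski closed. So $f(Z)$ is Zariski closed. Clearly $(X \cap T) \subseteq f(Z)$, so $\mathrm{ZCl}(T \cap X) \subseteq f(Z)$. On the other hand $Y \cap (T_{p'} \times T_{l'}) \subseteq f^{-1}(X \cap T)$, so $Z \subseteq f^{-1}(\mathrm{ZCl}(X \cap T))$, and $f(Z) \subseteq \mathrm{ZCl}(X \cap T)$. So they are equal.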

The Manin-Mumford conjecture and the model theory of difference fields

E. Hrushovski

E. Hrushovski / Annals of Pure and Applied Logic 112 (2001) 43-115

Similarly, one can show:

Proposition 6.1.2. Let $K$ be a difference field. Let $A$ be a semi-Abelian variety over $K$. Let $T$ be the group of torsion points of $A$. Then $T$ is difference-algebraically modular.
Proof. This can be deduced just as in the proof of Proposition 6.1.1, adding two automorphisms in order to define the torsion points, and quoting Proposition 4.5.2 and Lemma 4.5.4.
6.2. A second route
In this subsection we assume ($\ast$) that $S$ is a semi-Abelian variety over a number field $K$, that $p, l$ are two primes of good reduction, and that the fields $K(T_{p'})$, $K(T_p)$ are linearly disjoint over $K$ (here $T_p$ is the group of $p$-power torsion points). It follows from Serre's results that this last hypothesis can always be achieved after a finite field extension of $K$. (And presumably, using [1, 20], the extension can be found effectively if not explicitly.) With the hypothesis in place, we obtain the results of the previous section using one automorphism only. Choose $\sigma$ and $\tau$ and polynomials $F_p, F_l$ as in the previous section, so that $F_p(\sigma)$ vanishes on $T_{p'}$ and $F_l(\tau)$ vanishes on $T_{l'}$. Now $T_p \subseteq T_{l'}$, so $F_l(\tau)$ vanishes on $T_p$. Using ($\ast$), we find a single automorphism $\rho$ of $K^a$ agreeing with $\sigma$ on $T_{p'}$, and with $\tau$ on $T_p$. Then $F_p(\rho)$ vanishes on $T_{p'}$ and $F_l(\rho)$ vanishes on $T_p$. Let $F = F_p F_l$. Then $F(\rho)$ vanishes on $T_{p'} + T_p = T$. And $F$ has no cyclotomic factors.
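As a hedged numerical illustration of the condition that F has no cyclotomic factors: for a fixed n, the resultant of T^n - 1 and F equals the product of F(ζ) over the n-th roots of unity ζ, and it is nonzero exactly when F and T^n - 1 share no root. Since the resultant of two integer polynomials lies in the ideal they generate in Z[T], this is also the kind of nonzero constant invoked in the proof of Lemma 6.5.2. The sketch below evaluates the product in floating point; the sample polynomial T^2 - T + 2 (roots of absolute value √2) and the function name are illustrative assumptions, not taken from Lemma 5.0.9.

```python
import cmath

def resultant_with_cyclotomic(coeffs, n):
    """Approximate Res(T^n - 1, F) as the product of F(zeta) over n-th roots of unity.

    coeffs lists the integer coefficients of F, highest degree first.
    The exact value is an integer; rounding removes floating-point error,
    which is only trustworthy for small n and small coefficients.
    """
    product = complex(1.0, 0.0)
    for k in range(n):
        zeta = cmath.exp(2j * cmath.pi * k / n)
        value = complex(0.0, 0.0)
        for c in coeffs:  # Horner evaluation of F at zeta
            value = value * zeta + c
        product *= value
    return round(product.real)

# Hypothetical sample: F(T) = T^2 - T + 2, whose roots have absolute value sqrt(2).
F = [1, -1, 2]
for n in (1, 2, 3, 4):
    print(n, resultant_with_cyclotomic(F, n))
```

A nonzero value for a given n only rules out common roots with T^n - 1 for that n; it does not by itself establish the absence of cyclotomic factors for all n.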
We obtain:
Proposition 6.2.1. Assume ($\ast$), and let $0 \to U \to A \to S \to 0$ be an extension of $S$ by a vector group. Let $V$ be a subvariety of $A$. Then the torsion points on $V$ lie on a finite union of torsion translates of group subvarieties of $V$.

Proof. Immediate from Corollary 4.4.2 and Lemma 4.4.3.

Observe that the proposition is equally valid without ($\ast$) if one restricts to prime-to-$p$ torsion points.

6.3. Explicit bounds: $p'$-torsion points

Here is a variant applicable to $p'$-torsion points on arbitrary commutative algebraic groups (without the assumption ($\ast$)).

Notation 6.3.1. $A$ is a commutative algebraic group over a number field $K$. $\mathfrak{p}$ is a prime of $K$, with residue field $GF(q)$ of characteristic $p$. $F_q = \sum_i m_i T^i$ is the equation in Lemma 5.0.9. We fix a projective embedding of $A$; we then obtain an embedding of any subvariety of $A^r$ in multi-projective space; degrees will refer to this embedding, as in Section 2. In particular, we have the graph of addition contained in $A^3$; we refer to the degree of this variety as $d_+$. Let $d_r = \dim_r(A)$, $D_r = 2d_r + 1$, and
\[ S_q = \Bigl\{(a_0, \ldots, a_{2d_r}) : \sum_i m_i a_i = 0\Bigr\} \subseteq A^{2d_r+1}, \]
\[ \widetilde{S}_q = \{a \in A : (a, \sigma(a), \ldots, \sigma^{2d_r}(a)) \in S_q\}. \]
$\widetilde{S}_q$ is to be viewed as a difference equation; the points are from a universal domain for difference fields.

Lemma 6.3.2.
$S_q$ is a group subvariety of $A^{2d_r+1}$ of dimension $2d_r(\dim A)$.
\[ \deg(S_q) \le d_+^{\,4d_r(2d_r+1)\log_2(1+q^{1/2})} \le d_+^{\,2D_r^2\log_2(1+q^{1/2})}. \]
There exists an automorphism $\sigma$ of $K^a$ such that $F_q(\sigma)$ vanishes on $T_{p'}$, the prime-to-$p$ torsion points of $A$.
If $A$ is a semi-Abelian variety, $\{a \in A : (a, \ldots, \sigma^{2d_r}(a)) \in S_q\}$ is LMS.

Proof. Everything but the bound on degree has been demonstrated. Multiplication by an integer $M \ge 2$, in an Abelian group, can be achieved by at most $2\log_2(M) - 1$ operations of addition or of multiplication by 2 (express $M$ in base 2). Hence a linear polynomial $\sum_{i=0}^{g} m_i x_i = 0$ can be expressed (using additional variables) by means of at most
\[ g + \sum_i \bigl(2\log_2(|m_i|) - 1\bigr) \le (g+1)\cdot 2\log_2(M) \]
additions or subtractions, where $M = \max\{|m_i|, 2\}$.
In our case by Lemma 5.0.9, $M \le (1 + q^{1/2})^{2d_r}$, $g \le 2d_r$, so we can express $S_q$ as a projection of the intersection of
\[ (2d_r+1)\cdot 2\log_2\bigl((1+q^{1/2})^{2d_r}\bigr) = 4d_r(2d_r+1)\log_2(1+q^{1/2}) \]
varieties of the form $a_j + a_{j'} = a_{j''}$. By Lemma 2.1.2(1) and (2), this has degree at most
\[ d_+^{\,4d_r(2d_r+1)\log_2(1+q^{1/2})}. \]
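The operation count behind the degree bound in Lemma 6.3.2 can be checked on small cases. The sketch below (the function name is an illustrative assumption) computes M·a in an Abelian group by scanning the binary digits of M, counts the doublings and additions used, and prints the count next to 2 log_2(M) - 1; integers under addition stand in for the group.

```python
import math

def multiply_by_doubling(a, M):
    """Return (M * a, ops), using only doublings and additions of a."""
    if M < 1:
        raise ValueError("M must be a positive integer")
    result = a  # accounts for the leading binary digit of M
    ops = 0
    for bit in bin(M)[3:]:  # remaining binary digits, most significant first
        result = result + result  # multiplication by 2
        ops += 1
        if bit == "1":
            result = result + a  # one addition
            ops += 1
    return result, ops

for M in (2, 3, 7, 100):
    value, ops = multiply_by_doubling(5, M)
    print(M, value, ops, 2 * math.log2(M) - 1)
```

The count is one doubling per binary digit after the first, plus one addition per further nonzero digit, which is where the logarithmic factor in the exponent of $d_+$ comes from.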

Proposition 6.3.1. Let $A$ be a connected, commutative algebraic group defined over a number field $K$. Let $\mathfrak{p}$ be a prime of good reduction, with residue field $GF(q)$ of characteristic $p$. Let $T_{p'}$ be the group of points of $A(K^a)$ of finite order prime to $p$. Let $X$ be a subvariety of $A$. Then the Zariski closure of $X \cap T_{p'}$ is a union of at most
\[ \Bigl(\deg(X)^{2d_r+1}\, d_+^{\,4d_r(2d_r+1)\log_2(1+q^{1/2})}\Bigr)^{2^{2d_r\dim(X)}} \]
cosets of group subvarieties of $A$.


Proof. By Corollary 4.4.2, the Zariski closure of $X \cap \widetilde{S}_q$ is a union of a finite number $M$ of special subvarieties of $A$. By Lemma 4.4.3, the Zariski closure of $X \cap T_{p'}$ is a union of at most $M$ cosets of group subvarieties of $A$ (necessarily contained in $X$). It remains only to bound $M$, i.e. to bound the number of components of the Zariski closure $Z$ of $X \cap \widetilde{S}_q$. Let $S = (X \times \cdots \times X^{\sigma^{2d_r}}) \cap S_q$. Then $Z$ is the Zariski closure of $\{x : (x, \sigma(x), \ldots, \sigma^{2d_r}(x)) \in S\}$. By Corollary 2.2.3, $Z$ has a projective embedding of degree at most $\deg(S)^{2^{\dim(S)}}$. We have $\dim(S) \le 2d_r\dim(X)$, $\deg(S) \le \deg(X)^{2d_r+1}\deg(S_q)$. The result follows using Lemma 6.3.2.

Remark 6.3.3. Assuming ($\ast$) of Section 6.2, we obtain a similar bound for all torsion points, using two primes of good reduction but a single automorphism, as in Proposition 6.2.1.

6.4. Explicit bounds: all torsion points

Fix a number field $K$ and a semi-Abelian variety $A$ over $K$, and a subvariety $X \subseteq A$; $X$ need not be defined over $K$. We compute the bound arising from Proposition 6.1.1. It is a question of mechanically putting together Lemma 4.5.3, Proposition 2.3.1, and Lemma 5.0.9.
Fix $q$ and $p$ as in Proposition 6.3.1, and another prime of good reduction $l$, with residue field $GF(l)$; assume $l \le q$. We let $F_q, F_l$ be the equations as in Lemma 5.0.9, and fix automorphisms $\sigma$ and $\tau$ such that $F_q(\sigma) = 0$ on $T_{p'}$ and $F_l(\tau) = 0$ on $T_{l'}$. Let $Y$ be as in Lemma 6.1.1. By Lemma 6.1.1, the number of components of $X \cap T$ is at most that of $Y \cap (T_{p'} \times T_{l'})$. Let notation be as in Proposition 6.3.1. Let $d_1, d_2 \le 2d_r$ be the degrees of $F_q, F_l$, respectively; and let
\[ \mathcal{A} = \{\tau^j\sigma^i : -d_1 \le i \le d_1,\ 0 \le j \le d_2\} \cup \{\sigma^j\tau^i : 0 \le i \le (2d_1+1)d_2,\ 0 \le j \le d_1\}. \]
So $\mathcal{A}$ is a connected subset of $F$ of size
\[ |\mathcal{A}| \le 16(d_r+1)^3. \]
Note that $Y$ is the projection to $A \times A$ of the intersection of the graph of addition on $A$ with $(P \times P \times X)$. Thus
\[ \deg(Y) \le \deg(X)\, d_+. \]
We have as before
\[ \deg(S_q) \le d_+^{\,4d_r(2d_r+1)\log_2(1+q^{1/2})}, \]
so
\[ \deg \bigcap_{i=0}^{(2d_1+1)d_2} S_q \le d_+^{\,4(2d_1+1)d_2 d_r(2d_r+1)\log_2(1+q^{1/2})} \le d_+^{\,16(d_r+1)^4\log_2(1+q^{1/2})}. \]
The last estimate refers to the equations for $E_1 = \bigcap_{i=0}^{(2d_1+1)d_2} A_l^{\tau^i}$; these amount to the equations for $A_l$, applied to $\tau^i(x)$ for each $i$, $0 \le i \le (2d_1+1)d_2$.
Similarly, if $E_2 = \bigcap_{j=-d_1}^{d_1} A_q^{\sigma^j}$, the corresponding equations have degree
\[ \deg \bigcap_{j=-d_1}^{d_1} S_l \le \deg(S_l)^{2d_1+1} \le d_+^{\,32(d_r+1)^3\log_2(1+q^{1/2})}. \]
Let
\[ S' = \bigcap_{i=0}^{(2d_1+1)d_2} S_q \times \bigcap_{j=-d_1}^{d_1} S_l; \qquad \deg(S') \le d_+^{\,2^9(d_r+1)^7(\log_2(1+q^{1/2}))^2}. \]
Using the estimate $\dim(S') \le |\mathcal{A}|\dim(A)$, we get
\[ \deg \mathrm{ZCl}\bigl((E_1 \times E_2) \cap Y\bigr) \le \bigl(\deg(S')\deg(Y)\bigr)^{2^{|\mathcal{A}|\dim(A)}}, \]
where each of the terms $\deg(S')$, $\deg(Y)$, $|\mathcal{A}|$ is estimated above. By Proposition 4.5.2 and Lemma 4.4.3, this also bounds the number of components of $\mathrm{ZCl}((T_{p'} \times T_{l'}) \cap Y)$, and hence of $\mathrm{ZCl}(T \cap X)$. To summarize:
Proposition 6.4.1. The number of components of $\mathrm{ZCl}(T \cap X)$ is at most
\[ \Bigl(d_+^{\,2^9(d_r+1)^7(\log_2(1+q^{1/2}))^2}\, \deg(X)\Bigr)^{2^{16(d_r+1)^3\dim(A)}}. \]

6.5. The relative case; McQuillan's theorem

Let $A$ be a semi-Abelian variety over a number field $K$. Let
\[ \Gamma = \{a \in A(K^a) : ma \in A(K) \text{ for some integer } m\}. \]
The Mordell-Lang conjecture stated that $\Gamma$ is ALM. Raynaud, Hindry and McQuillan reduced it to showing that $A(K)$ is ALM (Faltings' theorem). We wish to show here how to do the same with our methods. If $F(\sigma)(x) = 0$ is an equation capturing the torsion points, $(\sigma - 1)F(\sigma)(x) = 0$ will capture $\Gamma$; and the orthogonality theory applies to $\ker(\sigma - 1)$, $\ker F(\sigma)$.
Pick a prime of $K$, of good reduction, with residue field $k$ of characteristic $p$, and let
\[ \Gamma_{p'} = \{a \in A(K^a) : ma \in A(K) \text{ for some integer } m \text{ prime to } p\}. \]
For variety, we write the statements for varieties containing no translates of infinite group subvarieties. We first give the statement for the prime-to-$p$ torsion points.

Proposition 6.5.1. Let $A$ be a semi-Abelian variety over $K$, $X \subseteq A$ a subvariety of $A$ containing no translates of infinite group subvarieties of $A$. One can effectively find an integer $L$ such that $\Gamma_{p'} \cap X(K^a) \subseteq L^{-1}A(K)$. Moreover one can effectively find coset representatives $r_i$ of $A(K)/LA(K)$, such that $\Gamma_{p'} \cap X(K^a) \subseteq \bigcup_i L^{-1}(r_i + LA(K))$.
Pick a prime of $K$, of good reduction, with residue field $k$ of characteristic $p$. Let $F = F_p$ be the Weil polynomial, and let $\sigma$ be a lifting and extension of Frobenius to $K^a$, so that $F(\sigma)$ vanishes on the group $T_{p'}$ of prime-to-$p$ torsion points of $A$. Let
\[ \Gamma_{p'} = \{a \in A(K^a) : ma \in A(K) \text{ for some integer } m \text{ prime to } p\}. \]

Lemma 6.5.1. $(\sigma - 1)F(\sigma)$ vanishes on $\Gamma_{p'}$.

Proof. $F(\sigma)\Gamma_{p'}$ has no $p'$-torsion. For let $c = F(\sigma)(b)$, $b \in \Gamma_{p'}$, $n_1 c = 0$, $(p, n_1) = 1$. We have $n_2 b \in A(K)$ for some $n_2$ prime to $p$. $\sigma$ fixes $n_2 b$, hence
\[ 0 = n_2 n_1 F(\sigma)(b) = n_1 F(\sigma)(n_2 b) = n_1 F(1) n_2 b, \]
so $b$ is torsion. Let $r$ be the maximal power of $p$ dividing $F(1)$. We can write $b = b_1 + b_2$ with $b_1, b_2 \in \Gamma_{p'}$, $b_1 \in T_{p'}$, and $r b_2 = 0$. Then $F(\sigma)(b_1) = 0$, and $F(1)b_2 = 0$. But also $b_2 \in A(K)$; so $F(\sigma)(b_2) = F(1)b_2 = 0$. Thus $c = F(\sigma)(b) = 0$.
It follows that $\sigma$ fixes $F(\sigma)\Gamma_{p'}$. For if $c \in F(\sigma)\Gamma_{p'}$, then $nc \in A(K)$ for some $n$ prime to $p$; so $n(c - \sigma(c)) = nc - \sigma(nc) = 0$, and thus $c - \sigma(c) = 0$. So $(\sigma - 1)$ annihilates $F(\sigma)\Gamma_{p'}$.

Lemma 6.5.2. The $\sigma$-definable groups $\ker(\sigma^n - 1)$, $\ker(F(\sigma))$ are orthogonal. Their intersection consists of finitely many torsion points.

Proof. $\ker(\sigma^n - 1)$ is internal to the fixed field $k$, while $\ker(F(\sigma))$ is LMS. Hence they are orthogonal, and in particular have finite intersection; the intersection is a finite subgroup, consisting of torsion points. The last statement is also easy to see directly; both $\sigma^n - 1$ and $F(\sigma)$ vanish on the intersection, hence so does $G(\sigma)$ whenever $G$ is a polynomial of $\mathbb{Z}[T]$ in the ideal generated by $F$ and $T^n - 1$. But some constant polynomial is in this ideal.

Lemma 6.5.3. $X \cap \ker((\sigma - 1)F(\sigma))$ is contained in finitely many cosets of $\ker(\sigma - 1)$.

Proof. Let $h : \ker(\sigma - 1) \times \ker(F(\sigma)) \to \ker((\sigma - 1)F(\sigma))$ be the map $h(x,y) = x + y$. $h$ has finite kernel. $h^{-1}(X)$ is a finite union of rectangles $U \times V$, by Lemma 3.4.9. Since $\ker(F(\sigma))$ is LMS, $V$ is a Boolean combination of definable cosets; their Zariski closure is a Zariski closed coset contained in $X$, so by the assumption on $X$, $h(V)$ is finite. The lemma follows.

The cosets of $\ker(\sigma - 1)$ have the form $\{a \in A : (\sigma - 1)(a) = \gamma\}$; finitely many $\gamma$ occur in the conclusion of the lemma. Let $n_1$ be such that $n_1\gamma = 0$ for each of these $\gamma$ that is torsion. (Actually, we only need that $n_1 p^m \gamma = 0$ for some $m$; for this, if one is interested in the explicit version, one can take $n_1$ to be the order of the group $A(k')$, where $k'$ is the field extension of $k$ of degree $[K(\gamma) : K]$.)
Note that if $a \in \mathrm{Ker}((\sigma - 1)F(\sigma)) \cap A(K^a)$, and $(\sigma - 1)(a) = \gamma$, then $\gamma \in A(K^a)$, and $F(\sigma)(\gamma) = 0$; it follows by Lemma 6.5.2 that $\gamma$ is torsion. Thus $n_1\gamma = 0$, so $n_1 a \in \mathrm{Fix}(\sigma)$. We have
\[ X \cap \ker((\sigma - 1)F(\sigma)) \cap A(K^a) \subseteq \{x : n_1 x \in \mathrm{Fix}(\sigma)\}. \]
Hence by Lemma 6.5.1
\[ n_1(X \cap \Gamma_{p'}) \subseteq \mathrm{Fix}(\sigma). \]
The same proof applies to any automorphism $\sigma'$ with the same properties as $\sigma$; and since we chose a priori difference-field bounds on the finite numbers involved, the same number $n_1$ applies to all such $\sigma'$. Thus
\[ n_1(X \cap \Gamma_{p'}) \subseteq \bigcap\{\mathrm{Fix}(\tau) : \tau \text{ a conjugate of } \sigma\} = B. \]
Now the group $B$ on the right is invariant under $\mathrm{Aut}(K^a/K)$. If $a \in B \cap T_{p'}$, then the reduction map takes $a$ into $A(k)$ (the fixed field of Frobenius). Let $m_p$ be the order of $A(k)$; then $m_p a$ reduces to 0; since the reduction map is injective on $T_{p'}$, $m_p a = 0$. So $H = m_p B \cap K^a$ is $p'$-torsion free, and $\mathrm{Aut}(K^a/K)$-invariant. It follows that if $b \in H$, and $mb \in A(K)$, $(m,p) = 1$, then $b \in A(K)$; for $b$ is the unique $m$th root of $mb$ in $H$, hence is invariant under $\mathrm{Aut}(K^a/K)$. Since every element of $\Gamma_{p'}$ has a $p'$-multiple in $A(K)$, we obtain:
\[ n_1 m_p (X \cap \Gamma_{p'}) \subseteq A(K). \]
This finishes the proof of Proposition 6.5.1.

Remark 6.5.4. Assuming ($\ast$), Proposition 6.5.1 holds also for $\Gamma$.

Proof. Identical, using a single $\sigma$-equation capturing all torsion points as in Section 6.2. (No doubt a proof can also be given without ($\ast$), using two automorphisms.)

6.6. Tate-Voloch conjecture

Tate and Voloch conjectured that the torsion points on an Abelian variety $A$ over $\mathbb{C}_p$ that do not lie on a subvariety $V \subseteq A$ are bounded away from that variety. Certain special cases were proved by Tate-Voloch, and by Buium and Silverman. The proof of the Manin-Mumford conjecture given above lends itself immediately to a proof of the Tate-Voloch conjecture under some restrictions: $A$ must be assumed defined over a finite extension of $\mathbb{Q}_p$; must have good reduction; and the prime-to-$p$ torsion points only are considered. We show this easy deduction here. In later work, using much more $p$-adic Galois theory, Scanlon removed the last two constraints. See his papers [26] for this and for references to the history of the problem.
Assumptions, Notation. $\mathbb{C}_p$ is the completion of the algebraic closure of $\mathbb{Q}_p$; the ring of integers of $L$ is denoted $O_L$, the residue field $k$. $|x|$ is the $p$-adic absolute value.
$L$ is a finite extension of $\mathbb{Q}_p$; $O_L$ is the ring of integers of $L$, $k_L$ the residue field.
$K = L^a$ is the field of algebraic numbers.
$\mathcal{S}$ is a group scheme over $O_L$, with generic and special fibers $S$ and $S_p$ respectively. $S$ is a semi-Abelian variety, and has good reduction in the sense of Lemma 5.0.10.
\[ T_{p'}(S) = \{a \in S(K) : ma = 0 \text{ for some } m,\ (p,m) = 1\}, \]
\[ \bar{T} = \{a \in S_p(k) : ma = 0 \text{ for some } m,\ (p,m) = 1\}. \]
Because of the good reduction assumption, there is a bijective reduction map $r_S : T_{p'}(S) \to \bar{T}$.
We assume a notion of a distance from a point of $A$ to a subvariety $X$, such that if $d(a_i, X) \to 0$, then for any affine open $U$ of $A$, and any $f$ in the affine polynomial ring of $U$ vanishing on $X$, $|f(a_i)| \to 0$.
Proposition 6.6.1. Let $X$ be a closed subvariety of $S$. There exists a bound $b > 0$ such that for $a \in T_{p'}(S)$, either $a \in X(K)$ or the distance from $a$ to $X$ is at least $b$.
Proof. By Lemma 5.0.10, there exists an automorphism $\sigma$ of $K = L^a$ and an integral polynomial $F$ with no cyclotomic factors, such that $F(\sigma_0)$ vanishes on the prime-to-$p$ torsion points of $S_p$. As the reduction map is injective, $F(\sigma)$ vanishes on $T_{p'}(S)$: (#)
By Corollary 4.1.13, $F(\sigma) = 0$ is LMS, and therefore ALM:
Lemma 6.6.1. Let $K$ be an algebraically closed field of characteristic 0, with an automorphism $\sigma$. Let $A$ be a semi-Abelian variety over the fixed field of $\sigma$. Let $F \in \mathbb{Z}[T]$ be a polynomial with no cyclotomic factors. Let $B$ be the set of solutions to $F(\sigma) = 0$ in $K$, and let $X$ be a subvariety of $A$. Then there are finitely many group subvarieties $A_i$ of $A$, and cosets $Y_i$ of $A_i$, defined over $K$, with $Y_i \subseteq X$, and such that with $Y = \bigcup_i Y_i$,
\[ X \cap B = Y \cap B. \]
Moreover, this remains true if $B$ is replaced by the set of solutions in any difference field $(K', \sigma')$ extending $(K, \sigma)$.
Proof. In a universal domain extending $(K, \sigma)$, the equation $F(\sigma) = 0$ is LMS (Corollary 4.1.13), so the solution set is a finite union of definable subgroups. Let the $Y_i$ be the components of the Zariski closure of these definable subgroups. They are defined over $K$ in the sense of difference fields. Since $K$ is algebraically closed and an inversive difference field, it is also algebraically closed as a difference field [5], so the $Y_i$ are defined over $K$ algebraically.
Returning to the proof of the proposition, let $a_i \in T_{p'}(S)$, with $d(a_i, X) \to 0$. We will show that $a_i \in X$ for almost all $i$. Suppose otherwise, and let $U$ be an ultrafilter, concentrating on the indices $i$ with $a_i \notin X$. Let $R = l^\infty(K)$ be the ring of bounded sequences $(b_0, b_1, \ldots)$ from $K$. Let
\[ I_n = \{r \in R : r = (b_0, b_1, \ldots),\ \{i : |b_i| \le p^{-n}\} \in U\}, \qquad I = \bigcap_n I_n. \]
Then $I$ is an ideal of $R$. Evidently, $\sigma$ lifts to $R$ and respects $I$; so it induces an automorphism of the domain $D = R/I$. It is also easy to see that $R/I$ is a field. (If $r = (b_0, b_1, \ldots) \notin I$, then $r \notin I_n$ for some $n$, so $|b_i| > p^{-n}$ for $i \in X$, $X \in U$. Letting $c_i = b_i^{-1}$ for $i \in X$, $c_i = 0$ for $i \notin X$, $s = (c_0, c_1, \ldots)$, we see that $rs - 1 \in I$.) Every diagonal sequence is in $R$, and we obtain an embedding $j : K \to D$. Since $O$ is bounded, $\prod_i O \subseteq R$, and we obtain a map $\prod_i O \to D$, yielding a natural map $\theta : \prod_i S(O) \to S(D)$.
Let $a^* \in S(D)$ denote the image of $(a_0, a_1, \ldots)$ there. The assumption $d(a_i, X) \to 0$ implies that for any rational $f$ regular near $a^*$ and vanishing on $X$, $|f(a_i)| \to 0$, hence the sequence $(f(a_i))_i$ lies in $I$. It follows that $f(a^*) = 0$, hence $a^* \in X$.
Note that since each $a_i \in T_{p'}(S)$, by (#) above, $F(\sigma)a_i = 0$; and so $F(\sigma)a^* = 0$. By Lemma 6.6.1, $a^* \in Y$ for some $K$-defined coset $Y$ of a connected subgroup $W$ of $S$. Let $\pi : S \to S/W$ be the projection, $c = \pi(Y)$. So $\pi(a^*) = c \in (S/W)(K)$. But $\pi(a^*) = \theta((\pi(a_0), \pi(a_1), \ldots))$. It follows that the sequence $\pi(a_i)$ comes arbitrarily close to $c$, and in particular, for large $i$, $r_{S/W}(\pi a_i) = r_{S/W}(c)$. Now $r_{S/W}$ is injective on the prime-to-$p$ torsion points of $S/W$, so $\pi(a_i) = c$ for large $i$. Thus for large $i$, $a_i \in Y$, and in particular, $a_i \in X$.

References
[1] J.-B. Bost, Périodes et isogénies des variétés abéliennes sur les corps de nombres, d'après D. Masser et G. Wüstholz. Séminaire Bourbaki Exp. No. 795.
[2] E. Bouscaren, E. Hrushovski, On one-based theories, J. Symbolic Logic 59 (2) (1994) 579-595.
[3] S. Buechler, Locally modular theories of finite rank, Ann. Pure Appl. Logic 30 (1) (1986) 83-94.
[4] A. Buium, Geometry of p-adic jets, Duke J. Math 82 (2) (1996) 349-367.
[5] Z. Chatzidakis, E. Hrushovski, Model theory of difference fields, Trans. AMS 351 (8) (2000) 2997-3071.
[6] Z. Chatzidakis, E. Hrushovski, Y. Peterzil, Model theory of difference fields II: periodic ideals and the trichotomy in all characteristics, Trans. AMS, to appear.
[7] C.C. Chang, J. Keisler, Model Theory, 3rd ed., North-Holland, Amsterdam, Tokyo, 1990.
[8] G. Cherlin, E. Hrushovski, Quasi-finite structures (preprint available in www.math.rutgers.edu/~cherlin).
[9] G. Faltings, Endlichkeitssätze für abelsche Varietäten über Zahlkörpern, Invent. Math. 73 (3) (1983) 349-366.
[10] W. Fulton, Intersection Theory, Springer, Berlin, Tokyo, 1984.
[11] M. Hindry, Autour d'une conjecture de Serge Lang, Invent. Math. 94 (3) (1988) 575-603.
[12] E. Hrushovski, Unidimensional theories are superstable, Ann. Pure Appl. Logic 50 (1990) 117-138.
[13] E. Hrushovski, The Manin-Mumford conjecture and the model theory of difference fields, extended abstract (5pp), in: M. Jarden (Ed.), Proc. Field Arithmetic conf., Institute for Advanced Study, Jerusalem, 1995.
[14] E. Hrushovski, The Mordell-Lang conjecture for function fields, J. AMS 9 (3) (1996) 667-690.
[15] E. Hrushovski, A. Pillay, Weakly normal groups, in: Logic Colloquium '85, North-Holland, Amsterdam, 1986.
[16] E. Hrushovski, A. Pillay, Definable subgroups of algebraic groups over finite fields, J. Reine Angew. Math. 462 (1995) 69-91.
[17] B. Kim, A. Pillay, Simple theories, Ann. Pure Appl. Logic 88 (1997) 149-164.
[18] S. Lang, in: Number Theory III: Diophantine Geometry, Encyclopaedia of Mathematical Sciences, Vol. 60, Springer, Berlin, Heidelberg, 1991.
[19] S. Lang, J. Tate, Principal homogeneous spaces over abelian varieties, Amer. J. Math. 80 (1958) 659-684.
[20] D. Masser, G. Wüstholz, Factorisation estimates for abelian varieties, Pub. Math. IHES 81 (1995) 5-24.
[21] M. McQuillan, Division points on semi-abelian varieties, Invent. Math. 120 (1995) 143-149.
[22] A. Pillay, Model theory, stability theory, and stable groups, in: A. Nesin, A. Pillay (Eds.), The Model Theory of Groups, Notre Dame Mathematical Lectures 11, University of Notre Dame Press, Notre Dame, Indiana, 1989.
[23] M. Raynaud, Around the Mordell conjecture for function fields and a conjecture of Serge Lang, in: Proc. Algebraic Geometry of Tokyo, Lecture Notes, vol. 1016, Springer, Berlin, 1982.
[24] A. Robinson, Introduction to Model Theory and the Metamathematics of Algebra, North-Holland, Amsterdam, 1963.
[25] G. Sacks, Saturated Model Theory, W.A. Benjamin, Reading, MA, 1972.
[26] T. Scanlon, Conjecture of Tate and Voloch on p-adic proximity to torsion, International Math. Research Notices 1999 no 17, 909-914; p-adic distance from torsion points of semi-Abelian varieties, J. Reine Angew. Math. 499 (1998) 225-236.
[27] J.-P. Serre, Lectures on the Mordell-Weil Theorem, Vieweg, Braunschweig/Wiesbaden, 1997.
[28] J.-P. Serre, in: Groupes algébriques et corps de classes, Actualités scientifiques et industrielles, Vol. 1264, Hermann, Paris, 1959.
[29] J.-P. Serre, Oeuvres, Vol. IV, 1985-1998, Springer, Berlin, 2000.
[30] S. Shelah, Simple unstable theories, Ann. Math. Logic 19 (1980) 177-203.
[31] A. Weil, Variétés abéliennes et courbes algébriques, Hermann, Paris, 1948.