ISSN: 0168-0072
Imprint: Elsevier
Commenced publication in 1969
Subscriptions for the year 2008,
Volumes 149-154, 18 issues
Submitting Articles
Contributions, which should be in English, may be
submitted to any editor, preferably in electronic form.
Final decision for publication will be taken by a Managing Editor.
www.elsevier.com/locate/apal/authorinstructions
www.elsevier.com/locate/apal
Editor selects
Articles from Annals of Pure and Applied Logic
It is a pleasure to present to you this
collection of selected papers which have
appeared in the Annals of Pure and Applied
Logic. These papers illustrate the vitality of
the field of mathematical logic, as well as
its many interactions with other parts of
mathematics and computer science.
Guiraud's paper forms a good illustration of the new
categorical, geometric and topological aspects of
logical proofs, absent in the more traditional
approaches to proof theory. The contribution by Frick
and Grohe, about lower bounds for the complexity of
model checking, is a fine example on the borderline of
logic and computer science. Hjorth's paper relates
basic concepts of descriptive set theory to ergodic
theory, while Lewis and Barmpalias explore relations
between computability and various notions of
randomness. Gitik's paper deals with the arithmetic of
singular cardinals, and brings together many
important concepts and techniques of modern set
theory. Finally, Hrushovski's seminal paper on the
Manin-Mumford conjecture is a striking illustration of
the impact of modern model theory on problems of
algebraic geometry and number theory.
I am convinced that many of you will enjoy this
present from the publisher, showing how lively and
exciting mathematical logic nowadays is.
I. Moerdijk
Contents
Yves Guiraud
Journal of Pure and Applied Algebra,
207 (2006), Pages 341-371
Greg Hjorth
Annals of Pure and Applied Logic
143 (2006), Pages 87-102
Moti Gitik
Annals of Pure and Applied Logic
119 (2003), Pages 1-18
Ehud Hrushovski
Annals of Pure and Applied Logic
112 (2001), Pages 43-115
Coordinating Editor
Annals of Pure and Applied Logic
Termination orders for three-dimensional rewriting
Yves Guiraud
Abstract
0. Outline
This paper starts with the introductory Section 1 on equational theories and term
rewriting systems. It gives notations and graphical representations that are used in the
following. Then, it focuses on one major restriction of term rewriting, namely the fact that
it cannot provide convergent presentations for commutative equational theories: equational
theories that contain a commutative binary operator.
Section 2 studies the resource management operations of permutation, erasure and
duplication: they are implicit and global in term rewriting and it is sketched there how
to make them explicit. However, the framework for rewriting in algebraic structures needs
to be extended to include this change; Section 3 proposes 3-polygraphs to fulfil this role.
E-mail address: guiraud@iml.univ-mrs.fr.
© 2005 Elsevier B.V. All rights reserved.
0022-4049/$ - see front matter
doi:10.1016/j.jpaa.2005.10.011
Here, these objects, introduced in [3], are used as equational presentations of a special case
of 2-categories: Mac Lane's product categories, called PROs, for short, in [11].
These first three sections do not introduce new material, but focus on the notations,
representations, terminology and philosophy of this paper. Then Section 4 gives some
relations between term rewriting systems and 3-polygraphs: a translation from the former
to the latter is built and some properties are given. The main result of the section is the
proof of a conjecture from [10]: any left-linear convergent term rewriting system can be
translated into a convergent 3-polygraph.
To prove some of these results, one needs new tools, suited to the more
complicated structure of polygraphs. In particular, Section 5 introduces a recipe to build
termination orders for them. Section 6 applies this technique to prove
some termination results of Section 4. Finally, Section 7 applies the same technique to
prove the termination of the 3-polygraph $L(\mathbb{Z}_2)$, which was introduced in [10] and
was already known to be a confluent presentation of the equational theory of Z/2Z-vector
spaces. It is therefore the first known convergent presentation of a commutative
equational theory.
$E_0 = \{\, \mu(\mu(x, y), z) = \mu(x, \mu(y, z)),\ \mu(\eta, x) = x,\ \mu(x, \eta) = x \,\}$.
Each operator has a finite number of inputs and of outputs. When each one has exactly one
output, which is the case here, the signature is said to be algebraic. The given equational
theory $(\Sigma, E_0)$ is said to be the theory of monoids since monoids are exactly sets endowed
with a binary operation and a constant, such that the operation is associative and admits
the constant as a left and right unit.
The formal operations one can form on any set with a binary operation and a constant
are called the terms built from the signature $\Sigma$. There exist numerous ways to build the
set $T$ of such terms, and each one gives a different representation for them. Two are used
here, a syntactic one and a diagrammatic one. For each one, a fixed countable set $V$ is
needed; its elements are called variables.
The classical representation of terms defines them inductively with the following
construction rules: each variable is a term; the constant $\eta$ is a term; and, for any
two terms $u$ and $v$, the formal expression $\mu(u, v)$ is a term.
The diagrammatic representation starts with the assignment, for each operator with $n$
inputs, of an arbitrarily chosen tree of height one with $n$ leaves. For example, one can fix
the following trees:
Then, the terms are all the trees one can build from these two generating trees and whose
leaves are labelled with variables. As an example, the following figure pictures terms built
from the signature $\Sigma$, with the two representations for each one:
The equations from the theory of monoids generate equalities between terms that represent
the same operation, through a rewriting process. Let us sketch how this works. For
example, the following term contains the tree-part of the left member of the associativity
rule, which has been greyed out:
Hence, the associativity equation generates an equality between the chosen term and
another. To determine which one, let us use the following method, which consists of
three steps: at first, the remaining (black) part of the term is copied; then, in the space left
empty, the other member of the rule is placed; finally, the two parts obtained are joined (by
dotted lines), according to the respective position of the variables in each member of the
equation. Concerning our example, this process is pictured as follows:
Note that each variable appears once and in the same position in each member of the
associativity rule, so that the links are direct. When the second term is compacted, the
following equality holds and is said to be generated by the associativity equation:
In order to study the computational properties of these rewriting processes, term rewriting
systems are useful; they can be defined as oriented equational theories. Indeed, such a
rewriting system is defined from an equational theory by keeping the same operators
and replacing each equation by a rewrite rule: it is an oriented version of the equation,
which can only be used in one way. As an example, starting from the equational theory
of monoids, one can form the term rewriting system $(\Sigma, R_0)$, where $\Sigma$ is still the same
algebraic signature made of a product and a unit and $R_0$ is the following set of three
rules:
$\mu(\mu(x, y), z) \to \mu(x, \mu(y, z))$,
$\mu(\eta, x) \to x$,
$\mu(x, \eta) \to x$.
Rewrite rules generate reductions instead of equalities, and a graph containing terms
as vertices and reductions as edges is called a reduction graph. Some geometrical
properties of reduction graphs are of particular interest since they have consequences on
computational properties of the rewriting process. Among these geometrical properties,
three are particularly studied: termination, confluence and convergence.
A rewriting system terminates if it contains no infinite reduction path such as:
$u_0 \to u_1 \to u_2 \to \cdots \to u_n \to u_{n+1} \to \cdots$
Intuitively, this means that the rewriting calculus must end after a finite time, whatever
the input is. This is formalized by the following consequence of termination: every term $u$
has at least one normal form $\hat{u}$; this means that $\hat{u}$ is a term such that there exists a finite
reduction path from $u$ to $\hat{u}$ (denoted by $u \to^* \hat{u}$) and $\hat{u}$ is irreducible (no rule can
apply to it).
A rewriting system is confluent if, whenever there exist three terms $u$, $v$ and $w$ such
that $u \to^* v$ and $u \to^* w$, then there exists a fourth term $t$ such that $v \to^* t$ and $w \to^* t$.
Intuitively, this means that choices made between two rules that can transform the same
term do not have any consequence on a potential final result; equivalently, this means that
any term has at most one normal form.
Thus, one defines the last property: a rewriting system is convergent when it is both
terminating and confluent. One immediate consequence is that any term has exactly one
normal form. This property is very useful for several purposes.
One of the best known is the following usage: let us assume that $(\Sigma, E)$ is
an equational theory and that $(\Sigma, R)$ is a rewriting system that is a finite convergent
presentation of $(\Sigma, E)$, which means that it is a convergent rewriting system with a finite
number of rules and such that two terms are equal in the equational theory if and only if
there exists a non-oriented reduction path between these two terms in the rewriting system.
Then there exists a decision procedure to check if two terms $u$ and $v$ are equal or not.
Indeed, one computes their unique normal forms $\hat{u}$ and $\hat{v}$. Note that this is where the
finiteness condition is useful: it allows one to check if a term is a normal form. Then the
two normal forms $\hat{u}$ and $\hat{v}$ are compared: $u$ and $v$ are equal in the equational theory if and
only if $\hat{u}$ and $\hat{v}$ are (syntactically) equal.
However, term rewriting systems have a major restriction in this field: there is a large
class of equational theories for which they cannot provide a convergent presentation. These
are the commutative theories, fairly frequent in algebra, which are equational theories with
a commutative binary operator. As an example, let us take a look at one of the simplest,
namely the equational theory of commutative monoids. Its signature is still $\Sigma$; its set $E_1$
of equations is made of the same three as the ones for monoids (associativity and left and
right units) plus the following one expressing the commutativity of the product:
$\mu(x, y) = \mu(y, x)$.
From this theory, one can form a number of term rewriting systems, such as the one with
$\Sigma$ as signature and with the following choice $R_1$ of orientations for equations:
$\mu(\mu(x, y), z) \to \mu(x, \mu(y, z))$,
$\mu(x, \eta) \to x$,
$\mu(\eta, x) \to x$,
$\mu(x, y) \to \mu(y, x)$.
Note that the last rule could have been chosen in the reverse direction, but it would not
change the following fact: this rule generates infinite reduction paths. Indeed, for any two
terms $u$ and $v$, the commutativity rule generates:
$\mu(u, v) \to \mu(v, u) \to \mu(u, v) \to \mu(v, u) \to \cdots$
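The loop created by the commutativity rule can be observed directly. The following Python sketch (the tuple encoding and the helper names `commute` and `reduction_path` are assumptions made for illustration) applies the rule at the root of a term several times and records the path.

```python
def mu(u, v):
    # Assumed encoding of the product: ("mu", left, right).
    return ("mu", u, v)

def commute(t):
    """Apply the oriented commutativity rule mu(x, y) -> mu(y, x) at the root."""
    _, u, v = t
    return mu(v, u)

def reduction_path(t, steps):
    """List the terms met when the rule is applied `steps` times in a row."""
    path = [t]
    for _ in range(steps):
        path.append(commute(path[-1]))
    return path

path = reduction_path(mu("u", "v"), 4)
```

The path returns to its starting term after every second step, so the rule can be applied forever and no orientation of commutativity gives a terminating term rewriting system.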
The purpose of this paper is to provide a framework where some commutative equational
theories admit convergent presentations: 3-polygraphs. Links between term rewriting
systems and 3-polygraphs are studied and a new tool to prove termination is given and
applied on some examples.
The equational theory that provides the main example here is the one of Z/2Z-vector
spaces: it has the same operators as the previous ones (the binary product embodies the
sum and the unit is the zero) and a set $E_2$ of five equations made of the four from $E_1$
(associativity, left and right units and commutativity) plus the following fifth equation:
$\mu(x, x) = \eta$.
It expresses the fact that, in a Z/2Z-vector space, any element is its own opposite. This
theory is preferred to the theory of commutative monoids for two reasons. The first one
is theoretical: any boolean algebra has an underlying Z/2Z-vector space, so that any
convergent presentation for Z/2Z-vector spaces is a first step towards one for boolean
circuits. The second one concerns the application range of the tools developed here: this
fifth equation has some nasty computational effects and is thus important to encompass in
the new framework, so that it can be used for other applications.
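As a sanity check on the five equations, one can verify them by brute force in the smallest non-trivial model, the field Z/2Z seen as a one-dimensional vector space over itself. The Python sketch below does so; the names `mu`, `ETA`, `EQUATIONS` and `holds` are illustrative assumptions, and the check covers this single model only, not the equational theory as a whole.

```python
from itertools import product

ETA = 0                      # the unit is the zero of Z/2Z

def mu(a, b):
    """The product embodies the sum: addition modulo 2."""
    return (a + b) % 2

# Each entry: (number of variables, predicate stating the equation).
EQUATIONS = {
    "associativity": (3, lambda x, y, z: mu(mu(x, y), z) == mu(x, mu(y, z))),
    "left unit":     (1, lambda x: mu(ETA, x) == x),
    "right unit":    (1, lambda x: mu(x, ETA) == x),
    "commutativity": (2, lambda x, y: mu(x, y) == mu(y, x)),
    "self-inverse":  (1, lambda x: mu(x, x) == ETA),
}

def holds(arity, equation):
    """Check an equation on every assignment of its variables in {0, 1}."""
    return all(equation(*values) for values in product((0, 1), repeat=arity))

results = {name: holds(arity, eq) for name, (arity, eq) in EQUATIONS.items()}
```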
From the theory of Z/2Z-vector spaces, the term rewriting system $(\Sigma, R_2)$ is built,
where $R_2$ is the following choice of orientations:
$\mu(\mu(x, y), z) \to \mu(x, \mu(y, z))$,
$\mu(x, \eta) \to x$,
$\mu(\eta, x) \to x$,
$\mu(x, y) \to \mu(y, x)$,
$\mu(x, x) \to \eta$.
Note that this rewriting system is neither terminating nor confluent but will serve as a
starting point to build a convergent presentation. This transformation will start with the
study of the so-called resource management operations. For further information on (term)
rewriting systems, one can refer to [2].
2. Resource management operations
Let us recall the last step of the term rewriting process: one has to draw links between
two parts of a term, according to the variables occurring in the corresponding rule. As
mentioned earlier, the rewriting example in Section 1 is the simplest case: indeed, the
variables occur once each and in the same order in each member of the associativity rule.
However, if this is not the case, one has to use additional operations before links are drawn:
these operations are called the resource management operations and there are three of their
kind, permutation, erasure and duplication.
Permutation is used, for example, when the commutativity rule is applied. Indeed,
in this case, one has to use a permutation operation that will exchange the two grey
subterms in any term such as the following generic one:
The second operation, erasure, is used in the following case, for example: let us consider a
theory containing a binary operator and a constant which is a right absorbing element. The
following figure displays a rule which expresses this property (on the right) together with
a generic application of this rule (on the left); this requires an intermediate operation that
erases the grey subterm:
Finally, the last operation, called duplication, can occur in the following case: let us
consider a theory containing two binary operators, one of which is left-distributive with
respect to the other. Then, when applied, a rule that expresses this property (such as the
one pictured on the right) requires the use of an operation that can duplicate the greymost
subterm (and exchange one of its copies with another subterm, but this is the
already-encountered permutation):
Thus, in term rewriting, these three operations are both implicit (they are not specified
by rules) and global (they act immediately on subterms of any size). We are now going
to sketch how one can make them explicit and local: only the idea is given here, the full
translation is postponed to Section 4.
Let us start with the following observation: the use of the three resource management
operations is specified both by the number of occurrences and the order of appearance of
each variable in each member of a rewrite rule. Thus, in order to make these operations
explicit, variables will be replaced by some additional operators that will represent
local permutations, erasers and duplicators; furthermore, rules will guarantee the global
behaviour of these local operators.
In order to give an idea of how the translation works, let us start with the study of this
term, which represents the operation $(x, y, z) \mapsto \mu(\mu(x, z), x)$:
This diagram will be formalized as a composite of new operators and the term will be
translated this way (with some explanations below):
Variables in the term have been replaced by ordinals; indeed, we have seen that variables
are just labels corresponding to the first, second, third, etc. arguments taken by the
corresponding operation. Hence, they will be replaced by ordinals whenever it makes the
translation clearer. The second remark is also about variables, but in the translated diagram:
they will always appear, after translation, in order: 1, 2, 3, etc. Thus, they have no purpose
anymore; they will therefore vanish, as in the diagram.
Finally, let us see what operators will be added to the signature and sketch how to
translate terms and rules. One operator is added for each resource management operation:
indeed, in order to formalize our previous diagram, one must be able to exchange two
arguments, erase one or duplicate another one. Thus, we fix a (non-algebraic) signature
made of the following three resource management operators:
Each one has a representation that makes explicit the operation one wishes it to embody.
Some rules will be added to ensure their global behaviour, but they will be given in
Section 4. For the moment, the only thing we need to know is that these rules give the
following interpretations to these three operators:
$\tau(x, y) = (y, x)$,
$\varepsilon(x) =$ (nothing),
$\delta(x) = (x, x)$.
Now, let us sketch how terms are translated: first, the tree-part is copied; then and
progressively, resource management operators are added on the top of the copy, according
to the variables that appear in the term.
The following figure gives four sample translations (the translating map is denoted by
thereafter):
Then, let us see how to translate the five rules of our term rewriting system derived from the
theory of Z/2Z-vector spaces. Each rule is pictured in order (associativity, left and right
units, commutativity and self-inverse), has been given a name (A, L, R, C and S) and has
its translation written just below:
Note that several cases may occur. For the first three rules, no resource management
operator is added during translation: these three rules are linear (or left- and right-linear).
When translated, the commutativity rule has one operator added on its right side and none
on its left side: it is a left-linear but not right-linear rule. Finally, the self-inverse rule has
one operator added on each of its members during translation: it is neither left- nor
right-linear.
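The case analysis above can be mimicked by inspecting the variables of each member of a rule. The following Python sketch is a rough approximation under stated assumptions: terms are nested tuples, variables are strings, and the function `resource_ops` (a hypothetical helper, not the translation defined in Section 4) only reports which kinds of resource management operations a member needs, not how many operators the translation adds.

```python
def leaves(t):
    """Variables of a term, read from left to right (variables are strings)."""
    if isinstance(t, str):
        return [t]
    return [x for sub in t[1:] for x in leaves(sub)]

def resource_ops(member, variables):
    """Kinds of operations needed to feed `member` from the inputs `variables`."""
    occurrences = leaves(member)
    ops = set()
    if any(occurrences.count(x) > 1 for x in variables):
        ops.add("duplication")
    if any(occurrences.count(x) == 0 for x in variables):
        ops.add("erasure")
    positions = [variables.index(x) for x in occurrences]
    if positions != sorted(positions):      # some inputs must cross each other
        ops.add("permutation")
    return ops

ETA = ("eta",)
def mu(u, v):
    return ("mu", u, v)

# The five rules named A, L, R, C and S in the text, as (left, right, variables).
RULES = {
    "A": (mu(mu("x", "y"), "z"), mu("x", mu("y", "z")), ["x", "y", "z"]),
    "L": (mu(ETA, "x"), "x", ["x"]),
    "R": (mu("x", ETA), "x", ["x"]),
    "C": (mu("x", "y"), mu("y", "x"), ["x", "y"]),
    "S": (mu("x", "x"), ETA, ["x"]),
}
analysis = {name: (resource_ops(l, vs), resource_ops(r, vs))
            for name, (l, r, vs) in RULES.items()}
```

With this encoding, the rules A, L and R need no operation on either side, C needs a permutation on its right member only, and S needs a duplication on the left and an erasure on the right, in agreement with the discussion above.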
Some issues have now arisen. The first one concerns the rules to be added in order both
to describe the behaviour of our local permutation, eraser and duplicator and to ensure the
global coherence of these local rules.
The next issue is about the respective computational properties of the starting term
rewriting system and of the rewriting system one gets as a result of making the resource
management operations explicit. These first two issues are addressed in Section 4.
For the moment, we are concerned with a third issue: where does rewriting take place
now? Indeed, starting from a term rewriting system, we have crafted another rewriting
system which is not a term one, and for two reasons. The first one is that its signature
contains non-algebraic operators, that is, operators that do not have exactly one output
(the resource management operators have zero or two outputs). The second reason is that
variables have been dropped to be replaced by these new operators: this is also a step
outside term rewriting. Hence, our new object is not a term rewriting system and Section 3
recalls a notion from [3] used to describe it.
3. Three-dimensional polygraphs
Like equational theories, 3-polygraphs are useful objects in universal algebra, in
the sense that they allow one to present algebraic structures by generators and
relations. However, they are far more general than equational theories, and this has two
consequences: on the one hand, they can handle more general objects, like the rewriting
system sketched in Section 2, or the structure of quantum groups; but, on the other hand,
their generality comes with an increase in the structural complexity: the development of
new tools is mandatory to prove termination, for example.
Polygraphs are genuine categorical objects but we prefer a diagrammatic definition here.
For this paper, a 3-polygraph is made of a signature, that is a set of operators with a finite
number of inputs and a finite number of outputs, together with a family of rules: in fact,
this is just a special case of 3-polygraph, one with only one 0-cell and one 1-cell. For the
complete theory of n-polygraphs, the interested reader should check [3].
The operators are once again represented by fixed diagrams of size one, with as many
free edges at the top as the operator inputs and as many free edges at the bottom as the
operator outputs. For example, some usual diagram shapes are pictured here:
Some of them have already been encountered, some of the others are less algebraic: one
has zero input and zero output (it is useful to describe Petri nets, see [5]), while one has
two inputs and zero output (it is used together with its dual, with zero input and two
outputs, to represent knots and tangles).
Here, the terms one considers are all the circuits one can build with all these
elementary diagrams: these are the Penrose diagrams (or circuits) one can build with the
size one diagrams representing the operators, such as:
Each of these circuits has a finite number of inputs (on the top) and of outputs (on the
bottom) but has no variable. Furthermore, they need not be connected, as shown by the
wire-only circuit with three inputs and three outputs.
These circuits, which are also called diagrams or arrows, have an algebraic structure.
To explain it, let us use the notation $f : m \to n$ to express that $f$ is a circuit with $m$
inputs and $n$ outputs. For any circuit $f$, $s(f)$ is its number of inputs and $t(f)$ its number
of outputs. The following constructions and properties are valid for circuits:
Let $f : m \to n$ and $g : n \to p$. Then, one can connect each output of $f$ with the
corresponding input of $g$, in the same order, to form a new circuit with $m$ inputs and $p$
outputs denoted by $g \circ f$.
This composition operation admits local units: a circuit $f : m \to n$ satisfies $f \circ m = f$
and $n \circ f = f$, where $p$ is the wire-only circuit with $p$ inputs and $p$ outputs.
Let $f : m \to n$ and $g : p \to q$. Then, one can put $f$ and $g$ side by side to form a new
circuit with $m + p$ inputs and $n + q$ outputs, denoted by $f \otimes g$.
This product operation admits a bilateral neutral element: the empty circuit 0 with neither
input nor output, represented by an empty diagram.
Finally, the composition and product are related by the exchange relations. They are
given by the following equality, that is required to hold for any two circuits $f : m \to n$
and $g : p \to q$:
$(t(f) \otimes g) \circ (f \otimes s(g)) = f \otimes g = (f \otimes t(g)) \circ (s(f) \otimes g)$.
Definition 3.1. A family C of circuits endowed with this structure $\circ$ and $\otimes$, satisfying the
aforementioned unit and exchange relations, is called a product category; the subset of
circuits with $m$ inputs and $n$ outputs is denoted by C(m, n). When the circuits of C are
freely built from a signature $\Sigma$, this object is the free product category generated by $\Sigma$,
denoted by $\langle \Sigma \rangle$.
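To make the two operations and the exchange relations concrete, here is a minimal Python sketch of one particular product category: circuits are modelled as functions from tuples of length `src` to tuples of length `tgt`. This model, the class `Circuit` and the helpers `wires`, `compose` and `tensor` are assumptions chosen for illustration; the circuits of a free product category are syntactic objects, of which this is only one possible interpretation.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class Circuit:
    src: int                              # number of inputs,  s(f)
    tgt: int                              # number of outputs, t(f)
    run: Callable[[Tuple], Tuple]         # behaviour on a tuple of inputs

def wires(p):
    """The wire-only circuit with p inputs and p outputs (a local unit)."""
    return Circuit(p, p, lambda xs: tuple(xs))

def compose(g, f):
    """Plug the outputs of f into the inputs of g (written g o f)."""
    assert f.tgt == g.src, "outputs of f must match inputs of g"
    return Circuit(f.src, g.tgt, lambda xs: g.run(f.run(xs)))

def tensor(f, g):
    """Put f and g side by side (the product of f and g)."""
    def run(xs):
        return f.run(xs[:f.src]) + g.run(xs[f.src:])
    return Circuit(f.src + g.src, f.tgt + g.tgt, run)

# Two sample circuits: a product with two inputs, a duplicator with two outputs.
add = Circuit(2, 1, lambda xs: (xs[0] + xs[1],))
dup = Circuit(1, 2, lambda xs: (xs[0], xs[0]))

# The three members of the exchange relation for f = add and g = dup.
left = compose(tensor(wires(add.tgt), dup), tensor(add, wires(dup.src)))
middle = tensor(add, dup)
right = compose(tensor(add, wires(dup.tgt)), tensor(wires(add.src), dup))
```

On any sample input such as `(1, 2, 5)`, the three circuits `left`, `middle` and `right` compute the same tuple, as the exchange relations require.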
Remark 3.2. Product categories, or PROs, were defined in [11]. An alternative definition
is: a product category is a strict monoidal category whose underlying monoid of objects is
$(\mathbb{N}, +, 0)$, the one of natural numbers with addition and zero. In [5], such a category was
called a (monochromatic) operad, for this structure is a common generalization of many
universal algebra objects: May's operads, Lawvere's algebraic theories and Mac Lane's
PROs and PROPs.
Product categories are also a special case of 2-monoids or 2-categories with only one
0-cell. A generalization of this paper's results should be possible, since circuit-like diagrams
extend to general 2-cells. For this paper, we stick to Mac Lane's product categories, but all
this terminology will be made clear in subsequent work.
A rewrite rule on a product category C is a pair $f \to g$ of parallel arrows (they have the
same number of inputs and the same number of outputs). Such a rule generates reductions
on circuits: whenever an arrow h contains f , the rule generates a reduction from h to k,
where k is the same as h, except that f has been replaced by g. The fact that f and g have
the same number of inputs and the same number of outputs ensures that one can connect
the unchanged part of the circuit with the changed part, without using implicit operations
before.
Definition 3.3. A 3-polygraph is a pair $(\Sigma, R)$ where $\Sigma$ is a signature and $R$ is a family of
rewrite rules on $\langle \Sigma \rangle$.
One way to formalize the reduction relation generated by rules on a free product
category $\langle \Sigma \rangle$ is to define contexts. We just explain here what they are, avoiding digging
further into the technical aspects, developed in [5]. Let $\Sigma$ be a signature. Then, a context
on $\langle \Sigma \rangle$ is a circuit $c$ with a hole inside: this hole has a finite number of inputs and
outputs where one can paste a circuit $f$ with corresponding numbers of inputs and outputs;
this pasting operation results in a circuit denoted by $c[f]$. Then, a rule $f \to g$ generates a
reduction from each circuit $c[f]$, with $c$ any context, to the circuit $c[g]$.
Finally, given two product categories C and D, a product category functor from C to D
is a map which sends each circuit of C onto a circuit of D with the same number of inputs
and outputs, and which preserves identities, products and compositions. When C is the
free product category $\langle \Sigma \rangle$, then a classical categorical argument tells us that any product
2. The family E , made of three equations for each integer $n$ and each arrow $f : n \to 1$
in C:
The following recursively defined arrow families (n )nN and (n,1 )nN have been used:
It remains to prove that every family $(y_1, \dots, y_k)$ of variables in $\{x_1, \dots, x_m\}$ can be
uniquely written (modulo E ) with the three arrows $i(\tau)$, $i(\delta)$ and $i(\varepsilon)$. This can be done
in two steps.
Let us define the sub-product category $V$ of $T_\Sigma$ by restricting ourselves to families of
variables: this is $T_\emptyset$, where $\emptyset$ denotes the signature with no operator. One also defines the
cartesian category $F^o$ of finite sets: the arrows of $F^o(m, n)$ are in bijective correspondence
with the functions from the finite set $[n] = \{1, \dots, n\}$ to $[m]$. Then:
Lemma 4.4. The cartesian categories $V$ and $F^o$ are isomorphic.
Proof. Let $(y_1, \dots, y_n)$ be a family of variables taken in $\{x_1, \dots, x_m\}$. Then, there exists
a unique function $f$ from $[n]$ to $[m]$ such that $y_i = x_{f(i)}$ for each $i$. To the family
$(y_1, \dots, y_n)$, let us associate the arrow of $F^o$ that corresponds to $f$. Conversely, to an
arrow of $F^o(m, n)$, with corresponding function $f$ from $[n]$ to $[m]$, one associates the family
$(x_{f(1)}, \dots, x_{f(n)})$. It remains to check that these two assignments are cartesian functors
which are inverse to one another, which is straightforward.
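The bijection used in this proof is easy to experiment with. In the Python sketch below, a variable $x_j$ is encoded as the string `"xj"` and a function from $[n]$ to $[m]$ as a dictionary; both encodings and the names `family_to_function` and `function_to_family` are assumptions made for illustration, and only the bijection on arrows is shown, not its functoriality.

```python
def family_to_function(family):
    """Send (y_1, ..., y_n), with y_i = x_{f(i)}, to the function f as a dict."""
    return {i: int(y[1:]) for i, y in enumerate(family, start=1)}

def function_to_family(f, n):
    """Send a function f from [n] to [m] to the family (x_{f(1)}, ..., x_{f(n)})."""
    return tuple(f"x{f[i]}" for i in range(1, n + 1))

family = ("x2", "x1", "x2")          # a family of 3 variables taken in {x1, x2}
f = family_to_function(family)       # the function 1 -> 2, 2 -> 1, 3 -> 2
back = function_to_family(f, len(family))
```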
The second step uses another result from [3]:
Theorem 4.5 (Burroni). The cartesian categories Fo and hi/E are isomorphic.
Hence, the cartesian categories V and hi/E are isomorphic. Consequently, each
family (y1 , . . . , yk ) of variables taken in {x1 , . . . , xn } corresponds to a unique arrow
in hi/E . Furthermore, each arrow f in T (m, n) admits a unique decomposition
f = f f with f in h i and f in V.
Finally, one gets that the cartesian functor F from h c i/E to T is an isomorphism.
However, we want a map from T to h c i: let us find a convergent 3-polygraph
( c , R ) such that h c i/R is isomorphic to h c i/E and use the unique normal
form property.
A conjecture from [10] is proved:
Theorem 4.6. For any algebraic signature , the 3-polygraph ( c , R ) is convergent
and h c i/R is isomorphic to the free cartesian category h c i/E generated by ,
where the family of rules R is made of the following two subfamilies:
1. The family R :
Definition 4.8. For every term u in T and for every integer n ]u, the term u can
be seen as an arrow u n in T (n, 1). One denotes by n (u) the arrow (u n ) of h c i and
by (u) the particular case ]u (u). If = (u, v) is a rewrite rule on T , the notation ()
is used for the rewrite rule ((u), ]u (v)) on h c i.
As an immediate consequence of the definition, one gets:
Lemma 4.9. For any algebraic signature , any term u in T and any integer n ]u,
the arrow n (u) is a normal form for the resource management rules R .
The rest of this section is devoted to the comparison of a term rewriting system ( , R)
with the 3-polygraph ( c , R c ), where R c is the union of the family R of resource
management rules and of the family (R) made of the translations by of the rules R.
2. The family R given, for each integer n and each operator in (n, 1), by:
Remark 4.7. Three families of verifications need to be done. The first one consists in
checking that the new rules are derivable from E , which is straightforward.
The second one is much more complicated: one needs to check that the 3-polygraph
terminates. However, the structural complexity of polygraphs requires new techniques
since the usual ones used in rewriting do not work. One way to craft reduction orders
for 3-polygraphs is made explicit in Section 5 and used in Section 6 in order to prove the
termination of ( c , R ).
Finally, one needs to check that this 3-polygraph is confluent. Here, this is equivalent
to computing all of its critical pairs and check that each one is confluent. Once again,
the structural complexity of polygraphs generates problems unknown with other kinds
of rewriting theories. For example, a finite 3-polygraph can produce an infinite number
of critical pairs; this is the case here. However, among these critical pairs, some have
properties that allow us to finally have only a finite number of computations to do. Critical
pairs of 3-polygraphs need to be further studied and classified according to properties of
this kind; this will be addressed in subsequent work.
The present case is discussed in Section 6 and fully studied in [5].
From Theorem 4.6, one concludes the existence of a map from T to h c i. Indeed, if f
is an arrow in the cartesian category T , then ( f ) will be the R -normal form of any
representant in h c i of the arrow F( f ) in the product category h c i/E . This map ,
which could not be proved to exist until Theorem 4.6, allows the formal definition of the
translation of terms into circuits.
Remark 4.10. Before stating the result, let us qualify by uniformized a rule $(u, v)$ on $T_\Sigma$
such that $u = f(y_1, \dots, y_k)$ with $f$ an arrow in $\langle \Sigma \rangle$ and $(y_1, \dots, y_k)$ a family of variables
with the following property: $y_1$ is $x_1$; then, for each $i$ in $\{1, \dots, k - 1\}$, the variable $y_{i+1}$
is either in $\{y_1, \dots, y_i\}$, or $y_{i+1}$ is $x_{p+1}$ if $\{y_1, \dots, y_i\} = \{x_1, \dots, x_p\}$.
Note that any rule on T can be replaced by a uniquely defined uniformized rule that
generates the same reduction relation. Furthermore, if a left-linear rule is replaced by its
uniformized rule, this one is also left-linear.
Hence, for what follows, (left-linear) term rewriting systems can always be considered
uniformized: if they are not, they are replaced by their uniformized equivalent version, with
no consequence on the results.
This choice simplifies the translations: a rule $(u, v)$ that is both left-linear and
uniformized satisfies $u = f(x_1, \dots, x_{\sharp u})$, with $f$ an arrow in $\langle \Sigma \rangle$, uniquely defined; hence,
the translation of such a $u$ is $f$ and thus is an arrow of $\langle \Sigma \rangle$.
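Uniformizing a rule amounts to renaming its variables in the order of their first occurrence in the left member. A possible Python sketch follows, with terms encoded as nested tuples and variables as strings; the helper names `rename` and `uniformize` are hypothetical, and the sketch assumes every variable of the right member also occurs in the left one.

```python
def rename(t, names):
    """Rename variables of t, numbering new ones x1, x2, ... as they are met."""
    if isinstance(t, str):
        return names.setdefault(t, f"x{len(names) + 1}")
    return (t[0],) + tuple(rename(sub, names) for sub in t[1:])

def uniformize(rule):
    """Rename a rule (u, v) so that the variables of u first appear in order."""
    left, right = rule
    names = {}
    new_left = rename(left, names)      # the left member fixes the numbering
    new_right = rename(right, names)    # the right member reuses it
    return new_left, new_right

def mu(u, v):
    return ("mu", u, v)

uniform_rule = uniformize((mu("b", mu("a", "b")), mu("a", "b")))
```

Here the left member becomes `mu("x1", mu("x2", "x1"))` and the right member `mu("x2", "x1")`: the reduction relation is unchanged since only the names of the variables differ.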
Proposition 4.11. If $(\Sigma, R)$ is a term rewriting system, then:
1. If the term rewriting system $(\Sigma, R)$ terminates, so does the 3-polygraph $(\Sigma^c, R^c)$.
2. The translation preserves the reduction steps generated by any left-linear rule , that
is: for any pair (u, v) of terms such that u v and any integer n ]u, there exists an
arrow f in h c i such that
n (u) () f R n (v).
Proof. Point 1 uses the technique to be introduced in Section 5. Its proof is thus postponed
until Section 6. Point 2 requires lengthy and cumbersome though intuitively simple
computations that can be found in [5].
Theorem 4.12. A left-linear term rewriting system $(\Sigma, R)$ terminates (resp. is confluent)
if and only if its associated 3-polygraph $(\Sigma^c, R^c)$ terminates (resp. is confluent).
Proof. Let us assume that the 3-polygraph ( c , R c ) terminates while the term rewriting
system ( , R) does not. Consequently, there exists some sequence (u n )nN of terms in T
such that u n R u n+1 for every n. From 4.11, since every rule in R is left-linear, one
concludes that, for every k ]u 0 :
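$$\Phi_k(u_0) \to^{+}_{R^c} \Phi_k(u_1) \to^{+}_{R^c} \cdots \to^{+}_{R^c} \Phi_k(u_n) \to^{+}_{R^c} \Phi_k(u_{n+1}) \to^{+}_{R^c} \cdots$$
where $\Phi_k$ stands for the translation of terms into circuits.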
However, in the case of 3-polygraphs, this classical interpretation technique does not
yield reduction orders in general. Indeed, it is not always possible to send each operator
of the signature onto a strictly monotone map: for example, the erasure operator will be
sent to a function from $\mathbb{N}$ to $\mathbb{N}^0$, that is, to a single-element set; this function is unique
and monotone, but not strictly. Consequently, even if a rule $f \to g$ is strictly decreasing under
the interpretation, the interpretations of both sides become equal once they are composed with
$n$ erasures, where $n$ is the number of outputs of both $f$ and $g$.
One could also consider contravariant interpretations: the erasure operator would then be sent to a
constant natural number. But, in the most interesting 3-polygraphs, such as the ones
we are concerned with, there is a constant operator which cannot be contravariantly sent
to a strictly monotone map. The interpretation technique must be adapted to the polygraph
structure in order to yield termination orders.
We face a choice between two possible directions. The first one consists
in interpreting arrows into functions between objects equipped with a monoidal product
rather than a cartesian one, such as vector spaces. But, when examined, this led to
horrendous computations that did not produce any reduction order. Nonetheless, this trail
should not be forgotten and shall be reexamined when a computational tool adapted
to polygraphs is available.
The other path consists in using classical interpretations, both covariant and
contravariant, as tools to build a third interpretation: this one will give the desired reduction
orders. Let us present images that describe the intuition behind the formalism. Each
arrow in the considered product category is seen as an electrical circuit whose elementary
components are the operators it is built from, as suggested by the diagrammatic
representation used. Then, a heat production value is associated with each circuit: each of
its inputs and outputs receives a current with a fixed intensity; hence there are two types of
currents: some are descending (they come from the inputs and propagate downwards to the
outputs) and some are ascending (they propagate upwards, from the outputs to the inputs).
The heat produced by a fixed circuit is calculated this way: an operator is arbitrarily
chosen. Then, currents are propagated through the other operators to the chosen one. This
requires that choices have been made for each operator: for each one, one must be able to
compute the intensities of descending currents transmitted when one knows the intensities
of incoming descending current, and similarly with ascending currents. When one knows
the intensities of each current coming into the chosen operator, one computes the heat it
produces, according to values fixed in advance. Then, one repeats the same procedure for
each operator, and sums the results to get the heat produced by the considered circuit, for
the chosen current intensities.
Two circuits with the same number of inputs and the same number of outputs are
compared this way: if, for each family of (ascending and descending) current intensities,
one produces more heat than the other one, then the first one is said to be greater. The
goal of this section is twofold: firstly, to formalize the objects required to compute such an
order; secondly, to obtain sufficient conditions for this order to be a reduction order.
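The heat intuition above can be sketched in code. The composition rules used below (heat of a composite is the heat of each part, evaluated on the currents propagated through the other part, then summed) are one natural reading of the description, not a quotation of the paper's definitions. The operator `MERGE`, its numerical values, and all names are hypothetical, and heat values are plain integers under ordinary addition.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Tuple

Vec = Tuple[int, ...]


@dataclass(frozen=True)
class Circuit:
    m: int                           # number of inputs
    n: int                           # number of outputs
    down: Callable[[Vec], Vec]       # descending currents: inputs -> outputs
    up: Callable[[Vec], Vec]         # ascending currents: outputs -> inputs
    heat: Callable[[Vec, Vec], int]  # heat for descending x (inputs), ascending y (outputs)


def then(f: Circuit, g: Circuit) -> Circuit:
    """Plug the outputs of f into the inputs of g (f on top, g below)."""
    assert f.n == g.m
    return Circuit(
        f.m, g.n,
        down=lambda x: g.down(f.down(x)),
        up=lambda y: f.up(g.up(y)),
        # f sees ascending currents after they crossed g; g sees descending ones after f.
        heat=lambda x, y: f.heat(x, g.up(y)) + g.heat(f.down(x), y),
    )


def beside(f: Circuit, g: Circuit) -> Circuit:
    """Put f and g side by side; heats simply add up."""
    return Circuit(
        f.m + g.m, f.n + g.n,
        down=lambda x: f.down(x[:f.m]) + g.down(x[f.m:]),
        up=lambda y: f.up(y[:f.n]) + g.up(y[f.n:]),
        heat=lambda x, y: f.heat(x[:f.m], y[:f.n]) + g.heat(x[f.m:], y[f.n:]),
    )


WIRE = Circuit(1, 1, lambda x: x, lambda y: y, lambda x, y: 0)

# Hypothetical two-input operator with made-up current and heat values.
MERGE = Circuit(
    2, 1,
    down=lambda x: (x[0] + x[1] + 1,),
    up=lambda y: (y[0], y[0]),
    heat=lambda x, y: x[0] + y[0],
)

LEFT = then(beside(MERGE, WIRE), MERGE)   # merge the two leftmost wires first
RIGHT = then(beside(WIRE, MERGE), MERGE)  # merge the two rightmost wires first


def strictly_hotter(f: Circuit, g: Circuit, values=range(1, 4)) -> bool:
    """Sampled comparison: currents weakly larger and heat strictly larger at every sample."""
    def geq(a, b):
        return all(p >= q for p, q in zip(a, b))
    for x in product(values, repeat=f.m):
        for y in product(values, repeat=f.n):
            if not (geq(f.down(x), g.down(x)) and geq(f.up(y), g.up(y))):
                return False
            if not f.heat(x, y) > g.heat(x, y):
                return False
    return True
```

With these made-up values, `LEFT` produces strictly more heat than `RIGHT` on every sampled family of currents. Checking finitely many samples is of course only an illustration; the paper's order quantifies over all current intensities.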
Let us describe the required materials. The first one is the object where the
interpretations take their values: this will be a product category equipped with a strict
order. In order to build it, one considers (non-empty) ordered sets X and Y to express the
current intensities, one for descending currents, one for ascending currents (for one of the
applications to be described, two different sets of values are needed). Then, a commutative
monoid M will contain the possible values of heats; moreover, it is supposed to be equipped
with an order such that the addition is strictly monotone in both variables.
From the data $X$, $Y$ and $M$, one builds a somewhat weird product category $O(X, Y, M)$
this way: an arrow from $m$ to $n$ in $O(X, Y, M)$ is a triple $f = (f_*, f^*, [f])$ consisting of
three monotone functions
$f_* : X^m \to X^n$,
$f^* : Y^n \to Y^m$,
$[f] : X^m \times Y^n \to M$.
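The identity of $n$ is the triple $(\mathrm{id}_{X^n}, \mathrm{id}_{Y^n}, 0)$. Two arrows $f$ and $g$ from $m$ to $n$, with covariant components $f_*$, $g_*$ and contravariant components $f^*$, $g^*$, are compared as follows: $f$ is strictly greater than $g$ when, for every $\vec{x}$ in $X^m$ and every $\vec{y}$ in $Y^n$,
$f_*(\vec{x}) \geq g_*(\vec{x})$,
$f^*(\vec{y}) \geq g^*(\vec{y})$,
$[f](\vec{x}, \vec{y}) > [g](\vec{x}, \vec{y})$.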
arrow in $\langle\Sigma\rangle(m, n)$ such that $f \succ f$; let us fix any elements $\vec{x}$ and $\vec{y}$ respectively in the
non-empty sets $X^m$ and $Y^n$; then, by definition of $\succ$, one gets the following strict inequality
in $M$:
$[F(f)](\vec{x}, \vec{y}) > [F(f)](\vec{x}, \vec{y})$.
However, this inequality cannot hold in $M$ since $>$ is the strict part of an order relation. The
termination is proved by a similar argument: any infinite and strictly decreasing sequence
in $\langle\Sigma\rangle$ yields, through the non-emptiness of $X$ and $Y$, at least one infinite strictly decreasing
sequence in $M$, whose existence is denied by the assumed termination of the strict part of
its order. The transitivity comes from that of the orders on $X$, $Y$ and $M$. Finally,
compatibility with the product category structure is checked through computations which
use the monotonicity of each $f_*$, $f^*$ and $[f]$ in $O(X, Y, M)$, together with the facts
that $M$ is a commutative monoid and $F$ is a product category functor.
For concrete applications, presented in the next two sections, the following corollary
will be used instead of Theorem 5.2:
Corollary 5.3. Let us consider a 3-polygraph $(\Sigma, R)$. Let us assume that there exist:
1. Two non-empty ordered sets $X$ and $Y$.
2. A commutative monoid $M$ equipped with an order such that its strict part
is terminating and such that the sum is strictly monotone in both variables.
3. For each operator $\varphi$ in $\Sigma(m, n)$, three monotone functions:
$\varphi_* : X^m \to X^n$,
$\varphi^* : Y^n \to Y^m$,
$[\varphi] : X^m \times Y^n \to M$.
If the strict order $\succ$ on arrows of $\langle\Sigma\rangle$ built from these data, in the aforementioned
manner, satisfies $f \succ g$ for every rule $f \to g$ in $R$, then the 3-polygraph $(\Sigma, R)$
terminates.
We denote by $\mathbb{N}^*$ the set of non-zero natural numbers with its natural order relation.
The commutative monoid freely generated by $\mathbb{N}^*$ is denoted by $[\mathbb{N}^*]$ and is
equipped with the multiset order generated by the usual order relation on natural numbers.
The elements of $[\mathbb{N}^*]$ are all the finite formal sums of non-zero natural numbers; a natural
number $n$, seen as a generator of $[\mathbb{N}^*]$, is denoted by $\mathbf{n}$.
The multiset order is defined in two steps: first, any sum
$a = \sum_i k_i.\mathbf{n_i}$ satisfies the inequality $\mathbf{n} > a$ if $n > n_i$ for each $i$; then, the multiset order is
taken as the reflexive and structure-compatible closure of this relation.
This implies that the addition is strictly monotone in both variables; furthermore, since
the strict order $>$ on $\mathbb{N}^*$ terminates, so does the strict part of the multiset order. Here is an
example of some strict inequalities that hold in $[\mathbb{N}^*]$:
$0 < 127.\mathbf{1} < \mathbf{2} < 4.\mathbf{1} + 2.\mathbf{3} < \mathbf{4}$.
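The chain of inequalities above can be checked mechanically. The sketch below uses the standard Dershowitz-Manna characterization of the strict multiset order, which is assumed here to coincide with the order defined in the text; finite formal sums are represented as Python lists of positive integers, and the function name `multiset_greater` is an illustrative choice.

```python
from collections import Counter


def multiset_greater(a, b):
    """Strict multiset order on finite multisets of positive integers.

    a > b when a differs from b and every element that b has in excess
    is dominated by a strictly larger element that a has in excess.
    """
    ca, cb = Counter(a), Counter(b)
    keys = set(ca) | set(cb)
    if all(ca[n] == cb[n] for n in keys):
        return False  # equal multisets are not strictly ordered
    for n in keys:
        if cb[n] > ca[n]:
            if not any(ca[m] > cb[m] for m in keys if m > n):
                return False
    return True


# The example chain: 0 < 127.1 < 2 < 4.1 + 2.3 < 4
chain = [[], [1] * 127, [2], [1] * 4 + [3] * 2, [4]]
chain_holds = all(multiset_greater(chain[k + 1], chain[k]) for k in range(len(chain) - 1))
```

In this encoding, `[1] * 127` stands for the sum of 127 copies of the generator 1, and `[4]` for the single generator 4, which dominates any finite sum of strictly smaller generators.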
Lemma 6.1.1. The 3-polygraph ( c , R ) terminates if and only if the 3-polygraph
( c , {}) terminates.
Proof. Let us consider the product category O(N , N , [N ]) together with the termination
order as defined in Section 5. Let us denote by F the product category functor from h c i
into O(N , N , [N ]) defined by the following values on the operators of c :
One checks that the first two non-strict inequalities are satisfied:
((1 ) ) (i) = (i, i, i) = (( 1) ) (i)
((1 ) ) (i, j, k) = i + j + k + 2 = (( 1) ) (i, j, k).
Moreover:
[(1 ) ](i, j, k, l) = 2.i + l + k + l + 2
[( 1) ](i, j, k, l) = 2.i + l + k.
Since l +2 > 0, one gets k + l + 2 > k and the required strict inequality. Then, consider
the rule for which the chosen values do not work. One gets the two following equalities:
= (( 1) (1 ) ( 1)) (i, j, k)
((1 ) ( 1) (1 )) (i, j, k) = (k, j, i)
It is this rule that motivates the use of the rather complicated product category
$O(\mathbb{N}^*, \mathbb{N}^*, [\mathbb{N}^*])$ to interpret $\langle\Sigma^c\rangle$. In order to carry out the computations for this rule, one
must start by proving the following equations, which is done by induction on the integer $n$:
(n ) (i 1 , . . . , i n ) = (i 1 , . . . , i n , i 1 , . . . , i n )
(i 1 , . . . , i n , j1 , . . . , jn ) = (i 1 + j1 + 1, . . . , i n + jn + 1)
n
, . . . , i n , j1 , . . . , jn X
, k1 , . . . , kn )
[n ](i 1X
=
(i u + ku ) +
(i u i v .ku + ku .i u + i v ).
1un
Three diagrams are given for each operator $\varphi$: two represent the functions $\varphi_*$ and $\varphi^*$ (how $\varphi$
transmits the current intensities) and one represents $[\varphi]$ (the heat it produces). Now, it is
checked that, for every rule f g in R , the inequality F( f ) F(g) holds, except for
the rule : s t, for which F(s) = F(t). Let us check the (in)equalities for three
sample rules. The complete computations are in [5]. Let us start with the coassociativity
rule for :
1u<vn
[ ](i 1 , . . . , i n , j, k) = j + k + 1 + i 1 + + i n + 1 + k
[( ) n ](i 1 , . . . , i n , j, k)
!
X
X
X
i u i v .k +
i u + k.
= j + n+1+
1u<vn
1u<vn
iu + iv .
1u<vn
The properties of the multiset order allow us to conclude: the left member of this rule is strictly
greater than its right member. Indeed, this is a consequence of the following strict
inequalities, which hold between the corresponding generators of $[\mathbb{N}^*]$:
$j + k + 1 > j$,
$j + k + 1 > k$,
$i_1 + \cdots + i_n + 1 > i_u$ for every $u$,
$i_1 + \cdots + i_n + 1 > i_u + i_v$ for every $u$ and $v$.
The computations for the other rules are handled similarly, albeit more easily. Now, let us
check the equivalence between termination of the 3-polygraphs ( c , R ) and ( c , {}).
Since is a rule of R , one concludes immediately that the termination of ( c , R )
implies the termination of ( c , {}): any infinite reduction path generated by the latter
would also be an infinite reduction path in the former.
Conversely, let us assume that ( c , {}) terminates and that there exists an infinite
reduction path ( f n )nN in ( c , R ). This path yields an infinite decreasing sequence
(F( f n ))n in O(N , N , [N ]), equipped with the order . Since this order terminates, the
sequence is stationary, which means that there exists some natural number n 0 such that
F( f n ) = F( f n+1 ) whenever n n 0 . However, as proved earlier, one can have both
f R g and F( f ) = F(g) only if f g. This implies that the sequence ( f n )nn 0 is
an infinite reduction path in ( c , {}). However, the existence of such an infinite reduction
path is prevented by the termination of ( c , {}).
We must check that : s t satisfies F(s) F(t). The computations give, on the
one hand, the two equalities:
= (( 1) (1 ) ( 1)) (i, j, k)
((1 ) ( 1) (1 )) (i, j, k) = (k, j, i)
This paragraph contains the proof of Proposition 4.11, point 1: if a term rewriting system
$(\Sigma, R)$ terminates, then so does its associated 3-polygraph $(\Sigma^c, R^c)$. The proof once again
uses a termination order obtained with Theorem 5.2. However, integer values cannot be
used here, since the rules in $R$ are unknown. To handle this issue, the following classical result
(see [2]) is used:
Theorem 6.2.1. A term rewriting system terminates if and only if there exists some
mapping $|\cdot|$ from the set of terms $T$ to $\mathbb{N}$ such that $|u| > |v|$ whenever $u$ is a term
that reduces into another term $v$. Moreover, in that case, the mapping $|\cdot|$ can be chosen
such that $|u| \geq |u'|$ whenever $u'$ is a subterm of $u$; the mapping can also be chosen so that
it takes its values in any countable set.
Proof. If $(\Sigma, R)$ terminates, one can choose the mapping $|\cdot|$ to send each term $u$ onto the
length of the longest reduction path starting from $u$; this mapping satisfies $|u| \geq |u'|$ if $u'$
is a subterm of $u$, since every reduction path from $u'$ yields a reduction path of the same
length from $u$. Conversely, if such a mapping exists, an infinite reduction path $(u_n)_{n \in \mathbb{N}}$ in
$(\Sigma, R)$ would generate a strictly decreasing infinite sequence $(|u_n|)_{n \in \mathbb{N}}$ in $\mathbb{N}$, which cannot
exist; hence the term rewriting system $(\Sigma, R)$ terminates. If this is the case, the mapping
$|\cdot|$ can be composed with any bijection from $\mathbb{N}$ to $E$, where $E$ is any countable set.
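The measure used in the first half of this proof, the length of the longest reduction path, can be illustrated on a toy example. The sketch below works on a small, finite, hypothetical one-step reduction relation given as a dictionary; it does not model terms, contexts or subterms, so it only illustrates the inequality $|u| > |v|$ for each reduction step.

```python
from functools import lru_cache

# Hypothetical terminating one-step reduction relation on abstract objects.
STEPS = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": ["e"],
    "e": [],
}


@lru_cache(maxsize=None)
def measure(u):
    """Length of the longest reduction path starting from u (0 for normal forms)."""
    return max((1 + measure(v) for v in STEPS[u]), default=0)


# Every reduction step strictly decreases the measure.
decreasing = all(measure(u) > measure(v) for u in STEPS for v in STEPS[u])
```

The recursion is well defined only because the toy relation is terminating and finitely branching; on a relation with a cycle, `measure` would recurse forever, which mirrors the converse direction of the theorem.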
Hence, from our terminating term rewriting system $(\Sigma, R)$, a mapping $|\cdot| : T \to \mathbb{N}^*$
is assumed to be chosen such that $|u| > |v|$ whenever $u$ reduces to $v$ and $|u| \geq |u'|$
whenever $u'$ is a subterm of $u$. From this mapping, one defines a binary relation $>$ on $T$
by u > v if, for every term context c, the inequality |c[u]| > |c[v]| holds. From the fact
that the usual order > on N is a terminating strict order, this binary relation is proved to
satisfy:
Furthermore, if u k > vk for some k, then this inequality is strict for the same k; in this
case:
|c[(u 1 , . . . , u n )]| > |c[(v1 , . . . , vn )]|.
Lemma 6.2.2. The aforementioned binary relation > on T is a terminating strict order.
Then, one builds the lexicographical order on $T \times \mathbb{N}^*$: $(u, i) \geq (v, j)$
if $u > v$ or if $u = v$ and $i \geq j$. This order satisfies:
Lemma 6.2.3. This relation is an order on $T \times \mathbb{N}^*$. Moreover, its strict part $>$ is a
terminating strict order on $T \times \mathbb{N}^*$.
The set $T \times \mathbb{N}^*$, together with the aforementioned order, is taken as the first set used
in the interpretation. The second one is a one-element set $\{*\}$ with the only possible order.
Finally, the commutative monoid is once again $[\mathbb{N}^*]$ with its already-used multiset order.
The product category $O(T \times \mathbb{N}^*, \{*\}, [\mathbb{N}^*])$ is denoted by $O$.
Sometimes, the two elements $((u_1, i_1), \dots, (u_n, i_n))$ of $(T \times \mathbb{N}^*)^n$ and $(u_1, \dots, u_n; i_1, \dots, i_n)$
of $T^n \times (\mathbb{N}^*)^n$ are identified.
The considered product category functor $F$ from $\langle\Sigma^c\rangle$ to $O$ is given by the following
values (only two are given for each operator since the contravariant interpretation is trivial):
In order to prove that [] is monotone, let us fix some k in [n]. Then, either u k > vk or
u k = vk and i k jk . In the first case:
|(v1 , . . . , vk1 , u k , . . . , u n )| > |(v1 , . . . , vk , u k+1 , . . . , u n )|.
Thus, by definition of the multiset order on [N ]:
( j1 + + jk1 + i k + + i n ).|(v1 , . . . , vk1 , u k , . . . , u n )|
>
Finally:
(i 1 + + i n ).|(u 1 , . . . , u n )| ( j1 + + jn ).|(v1 , . . . , vn )|.
If is a constant in (0, 1) or for operators in , proofs are direct. Furthermore, for each
operator in either or , the operation is the only map from {} to itself, and it is
monotone, so that:
There are two steps to check the conditions given in Corollary 5.3: the first one consists
in ensuring that each given operation is monotone; the second part is about computing if
F( f ) > F(g) holds for every rule f g in R c .
For the first part, consider, for example, the functions and [] for some fixed
operator in (n, 1), n 1. Let us consider terms u 1 , . . . , u n , v1 , . . . , vn and non-zero
integers i 1 , . . . , i n , j1 , . . . , jn . Let us assume that (u k , i k ) (vk , jk ) for every k. In order
to prove that is monotone, one must check that either (u 1 , . . . , u n ) > (v1 , . . . , vn ) or
both are equal and i 1 + +i n j1 + + jn . Let c be a context. Since, for every k, u k vk
and c (v1 , . . . , vk1 , , u k+1 , . . . , u n ) is a context, one gets the following inequality:
Let us fix some natural number n 1 and some in (n, 1); for constants in (0, 1),
computations are direct. By iteration on n, the following equalities are proved:
(n ) (E
u , E) = (E
u , uE; E, E) and
[n ](E
u ; E) = i 1 .u 1 + + i n .u n .
[ n (g)](E
u , E)
[( f )](E
u , E) | f uE |. Moreover, point 2 gives
< |g uE + 1|. Finally,
since the reduction f uE guE holds in ( , R) and by properties of ||: | f uE | > |gvE |.
There remains to concatenate these three inequalities to get [( f )] > [ n (g)] and, as a
consequence F ( f ) F n (g). The product category functor F from h c i to O gives
us F( f ) F(g) for every rule f g in (R) and F( f ) F(g) for every rule f g
in R . This yields the following result:
Proposition 6.2.6. If the term rewriting system ( , R) terminates, then termination of
the 3-polygraph ( c , R c ) is equivalent to termination of ( c , R ).
Since we already know that ( c , R ) always terminates, this concludes the proof of
Theorem 4.12.
7. Application 2: A convergent 3-polygraph for a commutative equational theory
u )| i k .u k .
i k .|(E
u )| i 1 .u 1 + + i n .u n . This gives the inequality [ ]
Finally: (i 1 + + i n ).|(E
[( ) n ]. Now, let us consider the first rule for local permutation; the first step is to
prove, by iteration on n:
u , v; E, j) = (v, uE; j, E)
(n,1 ) (E
u , v; Ei, j) = (v, (E
u ); j, 2.(i 1 + +i n )) = ((1)n,1 ) (E
u , v; Ei, j).
Then: ( (1)) (E
u )| = [(1 ) n,1 ](E
u , v; Ei, j).
And: [ ( 1)](E
u , v; Ei, j) = (i 1 + + i n ).|(E
The other rules in R are similarly handled and give similar results: for every rule
f g in R , the inequality F( f ) F(g) holds in O. The final part concerns the family
(R) of rules. Let us assume that : f g is a rule in R; its translation by is the rule
() : ( f ) ] f (g). Let us prove that F ( f ) F ] f (g). The first step is to
prove, by iteration on the degree of terms in T , the following lemma:
Lemma 6.2.5. Let u be a term in T , n be an integer such that n ]u, vE a family of
n terms in T and E a family of n non-zero natural numbers. Let us denote by vE the
substitution defined by xk v = vk if k n and xk otherwise. Then:
1. There exists some non-zero integer k such that ( n (u)) (u vE , k).
2. The inequality [ n (u)](E
v , E) < |u vE + 1| holds in [N ].
3. If u is not a variable, then the inequality [ n (u)](E
v , E) |u vE | also holds in [N ].
Point 1 gives, when applied to f and g with n = ] f , the existence of non-zero natural
u , E) = ( f uE , k) and n (g) (E
u , E) = (g uE , k 0 ).
numbers k and k 0 such that ( f ) (E
Let us consider some context c. By definition of the reduction relation generated
by the rule , one gets c[ f uE ] c[g uE ]. Consequently, the properties of | | give
|c[ f uE ]| > |c[g uE ]|. This holds for any context; thus, by definition of > on T , one
gets f uE > g uE . Finally, using the definition of > on T N :
This final section is devoted to giving a convergent presentation of the equational theory
of $\mathbb{Z}/2\mathbb{Z}$-vector spaces, which is, as mentioned before, a commutative equational theory
and thus does not have any convergent presentation by a term rewriting system.
In Section 1, we have considered three term rewriting systems $(\Sigma, R_0)$, $(\Sigma, R_1)$ and
$(\Sigma, R_2)$ that respectively present the equational theories of monoids, of commutative
monoids and of $\mathbb{Z}/2\mathbb{Z}$-vector spaces. All three have two operators, a product and a unit,
and they have respectively three, four and five rules. Thus, their associated 3-polygraphs
have five operators together with twenty-three rules for $(\Sigma^c, R_0^c)$, twenty-four for $(\Sigma^c, R_1^c)$
and twenty-five for $(\Sigma^c, R_2^c)$.
Since $(\Sigma, R_0)$ is a left-linear convergent term rewriting system, Theorem 4.12 ensures,
in particular, that $(\Sigma^c, R_0^c)$ is a convergent presentation of the theory of monoids,
with explicit resource management. The term rewriting system $(\Sigma, R_1)$ is left-linear,
non-terminating (due to the commutativity rule) and non-confluent (though it could
be completed to get a confluent rewriting system), hence Theorem 4.12 gives us that
$(\Sigma^c, R_1^c)$ is a non-terminating and non-confluent presentation of the equational theory
of commutative monoids, with explicit resource management. Finally, the term rewriting
system $(\Sigma, R_2)$ is non-left-linear, non-terminating and non-confluent:
non-left-linearity denies us any information coming from Theorem 4.12 about this
presentation.
However, there is, in [10], an equivalent 3-polygraph called L(Z2 ). Its signature contains
a sixth operator, called and pictured this way:
This new operator is said to be superfluous since it represents, in a Z/2Z-vector space, the
concrete operation (x, y) = ((x, y), x) that can be expressed in terms of , and . In
the presentation, this relation is enforced by means of the following extra rule:
( f ) > (g).
Let us prove now that [( f )] > [ n (g)]. Since is a term rewrite rule, its source f
is a non-variable term. Hence, point 3 of the previous lemma gives the inequality
The main objective of this new operator and rule is to make proof of termination easier (if
not just possible). Then, one has to add a certain amount of rules in order to complete the
presentation, to finally obtain the 3-polygraph L(Z2 ), discovered and baptized in [10].
This polygraph has six operators:
The chosen values simplify the computations greatly. Indeed, normally, there are three
inequalities to check for each rule: hence, there should be 201 inequalities to check here.
The first reduction comes from the fact that F identifies and : there are 24 rules that can
be dropped since, for each one, there is another rule that is sent to the same image. Thus
there remain 43 rules and 129 inequalities to check.
Moreover, the rules of $L(\mathbb{Z}_2)$ have some interesting symmetries that one can exploit:
indeed, whenever $f \to g$ is a rule of $L(\mathbb{Z}_2)$, then $f^o \to g^o$ is also a rule of $L(\mathbb{Z}_2)$, where
the duality $(\cdot)^o$ is the involution defined by:
o = ,
o = ,
(g f )o = f o g o ,
o = ,
o = ,
n o = n,
( f g)o = f o g o .
Another way to define this duality is by its action on diagrams: there, it is the top-down
symmetry. Furthermore, the functor $F$ is compatible with this symmetry, in the sense that,
for every arrow $f$, the functor $F$ sends $f^o$ onto $F(f)^o$, where the duality on $O$ is defined
that way: $(f_*, f^*, [f])^o = (f^*, f_*, [f]^o)$, with $[f]^o(\vec{x}, \vec{x}') = [f](\vec{x}', \vec{x})$. Note that this
only has a meaning because the two sets $X$ and $Y$ are the same here (both equal to $\mathbb{N}^*$).
Thus, if some rule $f \to g$ in $L(\mathbb{Z}_2)$ satisfies $F(f) > F(g)$, then so does $f^o \to g^o$.
As a consequence, this reduces the number of rules to study: 18 of the remaining rules
have a distinct dual, hence only 25 rules need to be studied (75 inequalities). Furthermore,
when a rule $f \to g$ is self-dual, the inequality $F(f)_* \geq F(g)_*$ holds if and only if
$F(f)^* \geq F(g)^*$ holds: 8 of the remaining rules are in that case, which means there still
are 67 inequalities from the former 201 to check. Computations do not raise any difficulty.
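The bookkeeping behind these counts can be replayed as plain arithmetic. The sketch below only re-derives the figures quoted in the text (201, 24, 43, 129, 18, 25, 75, 8, 67); the total of 67 rules is inferred here from 201 inequalities at three per rule, and the variable names are illustrative.

```python
INEQUALITIES_PER_RULE = 3  # two current maps and one heat map per rule

total_inequalities = 201
total_rules = total_inequalities // INEQUALITIES_PER_RULE    # inferred: 67 rules

# Step 1: 24 rules share their image under F with another rule.
rules_after_identification = total_rules - 24                # 43 rules
inequalities_after_identification = rules_after_identification * INEQUALITIES_PER_RULE  # 129

# Step 2: 18 rules have a distinct dual that is handled by symmetry.
rules_after_duality = rules_after_identification - 18        # 25 rules
inequalities_after_duality = rules_after_duality * INEQUALITIES_PER_RULE  # 75

# Step 3: each of the 8 self-dual rules saves one inequality.
remaining_inequalities = inequalities_after_duality - 8      # 67
```

It is a coincidence of the numbers that the final count of inequalities, 67, equals the inferred initial number of rules.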
For example, let us study the following (self-dual) rule:
From [10], we already know that this presentation is confluent but termination was
still a conjecture. The technique presented in Section 5 now allows us to prove that
it is also terminating, hence convergent. The interpretation product category we use is
$O(\mathbb{N}^*, \mathbb{N}^*, [\mathbb{N}^*])$, once again denoted by $O$. The interpretation functor $F$ is given by the
following values on generating operators:
One computes
( ) (i, j) = (2i + j, i + j)
((1 ) ( 1) (1 )) (i, j) = (i + j, i + j).
Since i and j are non-zero natural numbers, the following inequality holds:
( ) > ((1 ) ( 1) (1 )) .
Then
[ ](i, j, k, l) = i + i + j + k + k + l
[(1 ) ( 1) (1 )](i, j, k, l) = 2i + j + 2k + l.
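Since $i$ and $j$ are non-zero natural numbers, the inequalities $i + j > i$ and $i + j > j$
always hold. Thus, by property of the multiset order on $[\mathbb{N}^*]$, the generator associated with $i + j$
is always strictly greater than the sum of the generators associated with $i$ and $j$. Similarly, the generator
associated with $k + l$ is strictly greater than the sum of the generators associated with $k$ and $l$.
Finally, the multiset order on $[\mathbb{N}^*]$ is compatible with addition, yielding: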
[ ] > [(1 ) ( 1) (1 )].
The other rules are studied in a similar way [5], which leads to the following result, proving
that commutative equational theories can admit polygraphic convergent presentations:
Theorem 7.1. The 3-polygraph L(Z2 ) is a convergent presentation of the equational
theory of Z/2Z-vector spaces, with explicit resource management.
8. Comments and future directions
The study of (3-)polygraphs was started by Albert Burroni and Yves Lafont, as an
algebraic model for three-dimensional calculus on two-dimensional objects. Foundations
were laid in [8,3,9]. In [10], rewriting systems generated by 3-polygraphs were considered
and many known equational presentations were studied in order to be completed into
convergent rewriting systems (or, at least, rewriting systems with the unique normal form
property). Discussions with Albert Burroni, Yves Lafont and Philippe Malbos have been
essential in achieving the results presented here. Comments from the referee were of
great help in making this paper clearer.
There exist many research paths concerning polygraphs. The first one is about
confluence: as mentioned earlier, there exist theoretical issues with critical pairs of
3-polygraphs; exploration and classification are mandatory in order to achieve some
automated completion procedure for these objects. Such a tool (whose implementation in
Caml has already started) would be very useful since, starting from an equational theory,
one could use the constructions described in Section 4 in order to obtain a 3-polygraph;
then a completion procedure could be applied to correct termination and confluence issues.
Suggested by Pierre Lescanne, other usual techniques for building reduction orders in
term rewriting could also be examined, in order to see if they could also be adapted to
polygraphs. Among the most useful results to be studied are the ones concerning path
orders, see [2], and dependency pairs, see [1].
A second theme to be explored is the study of higher-dimensional polygraphs. For
an example of application, 4-polygraphs provide a categorical framework for proof
transformations in the calculus of structures [4]. Such an approach could yield results such
as proof decompositions or normal forms, given by a convergent 4-polygraph. At least,
it suggests that formulas are two-dimensional objects, proofs are three-dimensional and
computation on them (such as cut elimination) lives in dimension 4. This point of view
is conjectured to yield a new class of objects describing formal proofs, giving a different,
categorical and geometrical way to approach proof theory.
Theoretical studies can also be directed at pursuing the synthesis started in [5] on rewriting
systems: one of the main goals is to have a framework where one can compare two
rewriting systems, regardless of the algebraic structure of their terms. The reduction space
associated with each rewriting system is an algebraico-geometric object (a cubical object in
some category of algebras) and one could use the underlying cubical sets of these objects
to compare rewriting systems geometrically. Notions of (co)fibrations from the theory of Quillen model
categories (see [6]) could be useful for a better understanding of results such as
the ones of Section 4; since many rewriting systems are special cases of polygraphs, this
study will start with the construction of homotopical tools for these objects.
Still another question is the following: is there some $n$ for which there exists a finite
$n$-polygraph yielding a calculus with both explicit substitutions and explicit resource
management for the $\lambda$-calculus? When $n = 3$, the answer seems to be negative, since
theoretical results deny the existence of any non-trivial product category that is both
cartesian (for resource management) and sovereign (for substitutions). An equational
description of the structure of closed category (such as the one Albert Burroni has given
for cartesian categories) should be the first step of this work. Another possibility is to use
a three-dimensional interpretation of proofs, together with the links between $\lambda$-terms and
proofs.
Finally, 3-polygraphs have the interesting property of modelling computational circuits.
Indeed, both classical and quantum algorithms accept representations as circuits which
are, albeit not in their usual presentation, genuine operators of a 3-polygraph. Furthermore,
equational presentations are known for both kinds of circuits. Questions that can be studied
from this point of view concern the existence of convergent 3-polygraphs for classical or
quantum circuits, thus leading to canonical representations of programs. One can take a
look at [7] for more information on circuits and [10] for their links with polygraphs.
References
[1] T. Arts, J. Giesl, Termination of term rewriting using dependency pairs, Theoretical Computer Science 236 (2000) 133-178.
[2] F. Baader, T. Nipkow, Term Rewriting and All That, Cambridge University Press, 1998.
[3] A. Burroni, Higher-dimensional word problems with applications to equational logic, Theoretical Computer Science 115 (1) (1993) 43-62.
[4] A. Guglielmi, L. Straßburger, Non-commutativity and MELL in the Calculus of Structures, in: Lecture Notes in Computer Science, vol. 2142, 2001, pp. 54-68.
[5] Y. Guiraud, Présentations d'opérades et systèmes de réécriture, Thèse de doctorat, Montpellier, 2004.
[6] M. Hovey, Model categories, Mathematical Surveys and Monographs 63 (1999).
[7] A. Kitaev, A. Shen, M. Vyalyi, Classical and quantum computation, Graduate Studies in Mathematics 47 (2002).
[8] Y. Lafont, Penrose Diagrams and 2-dimensional Rewriting, in: London Mathematical Society Lecture Notes Series, vol. 177, 1992, pp. 191-201.
[9] Y. Lafont, Equational Reasoning with 2-dimensional Diagrams, in: Lecture Notes in Computer Science, vol. 909, 1995, pp. 170-195.
[10] Y. Lafont, Towards an algebraic theory of boolean circuits, Journal of Pure and Applied Algebra 184 (2003) 257-310.
[11] S. MacLane, Categorical algebra, Bulletin of the American Mathematical Society 71 (1965) 40-106.
The complexity of first-order and monadic second-order logic revisited
Markus Frick and Martin Grohe
Annals of Pure and Applied Logic
130 (2004), Pages 3-31
Annals of Pure and Applied Logic 112 (2001) 43-115
www.elsevier.com/locate/apal
The Manin-Mumford conjecture and the model theory of difference fields
Ehud Hrushovski
Department of Mathematics, Hebrew University, Jerusalem, Israel
The work was begun at MIT, with support from the NSF. The latter part was supported by the ISF.
E-mail address: ehud@math.huji.ac.il (E. Hrushovski).
0168-0072/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved.
PII: S0168-0072(01)00096-3
1. Introduction
This paper extends and specializes the general model theory of difference equations
in characteristic 0, developed in [9]. We investigate the induced structure on definable
Abelian groups of finite dimension. We also give general bounds on the number of solutions
to a finite set of difference equations. As a corollary, we obtain a model-theoretic
proof of the Manin-Mumford conjecture. The proof yields new number-theoretic
information, particularly with respect to p-adic and algebraic uniformities of the bounds
obtained.
1.1. The Manin-Mumford conjecture
The Manin-Mumford conjecture states that if C is a curve of genus two or more,
embedded in its Jacobian J, then the set of torsion points on C is finite, see [18,

Annals of Pure and Applied Logic 130 (2004) 3-31
www.elsevier.com/locate/apal
The complexity of first-order and monadic second-order logic revisited
Markus Frick, Martin Grohe
Corresponding author.
E-mail addresses: markus.frick@sap.com (M. Frick), grohe@informatik.hu-berlin.de (M. Grohe).
0168-0072/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.apal.2004.01.007
1. Introduction
1.1. Model-checking problems
We study the complexity of a fundamental algorithmic problem, the so-called model-checking
problem: given a sentence $\varphi$ of some logic L and a structure A, decide whether
$\varphi$ holds in A. Model-checking and closely related problems are of importance in several areas
of computer science, for example, in database theory, artificial intelligence, and automated
verification. In this paper, we prove new lower bounds on the complexity of the model-checking
problems for first-order and monadic second-order logic.
It is known that model-checking for both first-order and monadic second-order logic is
PSPACE-complete [17,20] and thus most likely not solvable in polynomial time. While this
result shows that the problems are intractable in general, it does not say too much about
their complexity in practical situations. Typically, we have to check whether a relatively
small sentence holds in a large structure. For example, when evaluating a database query,
we usually have a small query and a large database. Similarly, when verifying that a finite
state system satisfies some property, the specification of the property in a suitable logic
will usually be small compared to the huge state space of the system. When analysing the
complexity of the problem, we should take this imbalance between the size of the input
sentence and the size of the input structure into account.
1.2. Parameterized complexity theory
It is interesting to compare these intractability results for first-order logic and monadic second-order logic with the following: the model-checking problem for linear time temporal logic LTL is solvable in time $2^{O(k)} \cdot n$ [14], making it fixed-parameter tractable and also tractable in practice. On the other hand, model-checking for LTL is PSPACE-complete (as it is for first-order and monadic second-order logic). So parameterized complexity theory helps us in establishing an important distinction between problems of the same classical complexity.¹ We may argue, however, that the comparison between LTL model-checking and first-order model-checking underlying these results is slightly unfair. As the name linear time temporal logic indicates, LTL only speaks about a linearly ordered sequence of events. On an arbitrary structure, an LTL formula can thus only speak about the paths through the structure. First-order formulas do not have such a restricted view. It is therefore more interesting to compare LTL and first-order logic on words, which are the natural structures describing linear sequences of events. A well-known result of Kamp [12] states that LTL and first-order logic have the same expressive power on words. And indeed, model-checking for first-order logic and even for monadic second-order logic is fixed-parameter tractable if the input structures are restricted to be words. This is a consequence of Büchi's theorem [2], saying that for every sentence of monadic second-order logic one can effectively find a finite automaton accepting exactly those words in which the sentence holds. A fixed-parameter tractable algorithm for monadic second-order model-checking on words may proceed as follows: it first translates the input sentence into an equivalent automaton and then tests in linear time whether this automaton accepts the input word. But note that since there is no elementary bound for the size of a finite automaton equivalent to a given first-order or monadic second-order sentence [18] (also see [15]), the parameter dependence of this algorithm is non-elementary, thus it does not even come close to the $2^{O(k)} \cdot n$ model-checking algorithm for LTL. Of course this does not rule out the existence of other, better fixed-parameter tractable algorithms for first-order or monadic second-order model-checking.
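The second, linear-time phase of the automata-based procedure described above can be sketched as follows. The translation from a sentence to an automaton is not shown. The automaton below is a hand-written, hypothetical example (it accepts the words over {a, b} with an even number of b's), and the names `Dfa`, `accepts` and `EVEN_B` are assumptions made only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dfa:
    """A deterministic finite automaton (hypothetical helper type)."""
    start: str
    accepting: frozenset
    delta: dict  # maps (state, letter) -> state


def accepts(dfa: Dfa, word: str) -> bool:
    """Run the automaton once over the word: one table lookup per letter."""
    state = dfa.start
    for letter in word:
        state = dfa.delta[(state, letter)]
    return state in dfa.accepting


# Hand-written example automaton: even number of b's.
EVEN_B = Dfa(
    start="even",
    accepting=frozenset({"even"}),
    delta={
        ("even", "a"): "even",
        ("even", "b"): "odd",
        ("odd", "a"): "odd",
        ("odd", "b"): "even",
    },
)
```

The loop touches every position of the word exactly once, so the cost is linear in n for a fixed automaton; the whole parameter dependence of the procedure sits in the size of the automaton.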
1.4. Our results
Our first theorem shows that there is no fundamentally better fixed-parameter tractable
algorithm for first-order and monadic second-order model-checking on the class of words
than the automata based one described in the previous paragraph.
Theorem 1. (1) Assume that $\mathrm{PTIME} \neq \mathrm{NP}$. Let f be an elementary function and p a polynomial. Then there is no model-checking algorithm for monadic second-order logic on the class of words whose running time is bounded by $f(k) \cdot p(n)$.
(2) Assume that $\mathrm{FPT} \neq \mathrm{AW}[*]$. Let f be an elementary function and p a polynomial. Then there is no model-checking algorithm for first-order logic on the class of words whose running time is bounded by $f(k) \cdot p(n)$.
¹ A critical reader may remark that this distinction between the complexities of LTL model-checking and first-order model-checking was known before anybody thought of parameterized complexity theory. This is true, but how can we be sure that there is no $2^{O(k)} \cdot n$ model-checking algorithm for first-order model-checking? The role of parameterized complexity theory is to give evidence for this.
Here k denotes the size of the input sentence of the model-checking problem and n the
size of the input word.
(1) There is no model-checking algorithm for first-order logic on the class of words without order whose running time is bounded by
$2^{2^{o(k)}} \cdot p(n)$.
(2) There is no model-checking algorithm for first-order logic on the class of binary trees whose running time is bounded by
$2^{2^{2^{o(k)}}} \cdot p(n)$.
Again, k denotes the size of the input sentence and n the size of the input structure.
A vocabulary is a finite set of relation, function, and constant symbols. Each relation and function symbol has an arity. $\tau$ always denotes a vocabulary. A structure $\mathcal{A}$ of vocabulary $\tau$, or $\tau$-structure, consists of a set $A$ called the universe, and an interpretation $T^{\mathcal{A}}$ of each symbol $T \in \tau$: relation symbols and function symbols are interpreted by relations and functions on $A$ of the appropriate arity, and constant symbols are interpreted by elements of $A$. All structures considered in this paper are assumed to have a finite universe. The reduct of a $\tau$-structure $\mathcal{A}$ to a vocabulary $\tau' \subseteq \tau$ is the $\tau'$-structure with the same universe as $\mathcal{A}$ and the same interpretation of all symbols in $\tau'$. An expansion of a structure $\mathcal{A}$ is a structure $\mathcal{A}'$ such that $\mathcal{A}$ is a reduct of $\mathcal{A}'$. In particular, if $\mathcal{A}$ is a structure and $a \in A$, then by $(\mathcal{A}, a)$ we denote the expansion of $\mathcal{A}$ by the constant $a$. We write $\mathcal{A} \cong \mathcal{B}$ to denote that structures $\mathcal{A}$ and $\mathcal{B}$ are isomorphic.
Let $\Sigma$ be a finite alphabet. We let $\tau(\Sigma)$ be the vocabulary consisting of a binary relation symbol $\le$, a unary function symbol $S$, two constant symbols min and max, and a unary relation symbol $P_s$ for every $s \in \Sigma$. A word structure over $\Sigma$ is a $\tau(\Sigma)$-structure $W$ with the following properties:
$\le^{W}$ is a linear order of $W$, $\min^{W}$ and $\max^{W}$ are the minimum and maximum element of $\le^{W}$, and $S^{W}$ is the successor function associated with $\le^{W}$, where we let $S^{W}(\max^{W}) = \max^{W}$.
For every $a \in W$ there exists precisely one $s \in \Sigma$ such that $a \in P_s^{W}$.
We refer to elements $a \in W$ as the positions in the word (structure) and, for every position $a \in W$, to the unique $s \in \Sigma$ such that $a \in P_s^{W}$ as the letter at $a$.
It is obvious how to associate a word from the set $\Sigma^*$ of all words over $\Sigma$ with every word structure over $\Sigma$ and, conversely, how to associate an up to isomorphism unique word structure with every word in $\Sigma^*$. We identify words with the corresponding word structures and write $W$ to refer both to the word and the structure.
The class of all words (or word structures) over any alphabet is denoted by $\mathcal{W}$. The length of a word $W$ is denoted by $|W|$.
A subword of a word $W = s_0 \ldots s_{n-1} \in \Sigma^*$ is either the empty word or a word $s_i \ldots s_j$ for some $i, j$, $0 \le i \le j < n$. We write $V \sqsubseteq W$ to denote that $V$ is a subword of $W$.
We assume that the reader is familiar with propositional logic, first-order logic FO and monadic second-order logic MSO (see, for example, [7]). If $\theta$ is a formula of propositional logic and $\alpha$ is a truth-value assignment to the variables of $\theta$, then we write $\alpha \models \theta$ to denote that $\alpha$ satisfies $\theta$. Similarly, if $\varphi(x_1, \ldots, x_k)$ is a first-order or monadic second-order formula with free variables $x_1, \ldots, x_k$, $\mathcal{A}$ is a structure, and $a_1, \ldots, a_k \in A$, then
Structure $\mathcal{A} \in C$, sentence $\varphi \in L$
Decide if $\mathcal{A} \models \varphi$.
We fix a reasonable encoding of structures and formulas by words over {0, 1}. We denote the length of the encoding of a structure $\mathcal{A}$ by $\|\mathcal{A}\|$ and the length of the encoding of a formula $\varphi$ by $\|\varphi\|$. When reasoning about model-checking problems, we usually use $n$ to denote the size $\|\mathcal{A}\|$ of the input structure and $k$ to denote the size $\|\varphi\|$ of the input sentence.
Our underlying model of computation is the standard RAM-model with addition and
subtraction as arithmetic operations (cf. [1,19]). In our complexity analysis we use the
uniform cost measure.
It is well-known that if we are interested in the complexity of first-order or monadic
second-order model-checking on words, the alphabet is inessential. This can be phrased as
follows:
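Fact 4. Let $L \in \{\mathrm{FO}, \mathrm{MSO}\}$. Then there is a linear time algorithm that, given a sentence $\varphi \in L$ and a word $W \in \mathcal{W}$, computes a sentence $\varphi' \in L$ of vocabulary $\tau(\{0, 1\})$ and a word $W' \in \{0, 1\}^*$ such that $\|\varphi'\| \in O(\|\varphi\|)$, $\|W'\| \in O(\|W\|)$, and $(W \models \varphi \iff W' \models \varphi')$.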
$\mathbb{N}$ denotes the set of natural numbers (including 0). For all $n, i \in \mathbb{N}$ we let bit(i, n) denote the i-th bit in the binary representation of n. (Here we count the least significant bit as the 0th bit.) lg denotes the base-2 logarithm, and, for $i \in \mathbb{N}$, $\lg^{(i)}$ denotes the i-fold logarithm. More formally, $\lg^{(i)}$ is defined by $\lg^{(0)}(n) = n$ and $\lg^{(i+1)}(n) = \lg \lg^{(i)}(n)$.
We define the tower function $T : \mathbb{N} \times \mathbb{R} \to \mathbb{R}$ by $T(0, r) = r$ and $T(h + 1, r) = 2^{T(h, r)}$ for all $h \in \mathbb{N}$, $r \in \mathbb{R}$. Thus $T(h, r)$ is a tower of 2s of height h with an r sitting on top.
Observe that for all $n, h \in \mathbb{N}$ with $n \ge 1$ we have $T(h, \lg^{(h)} n) = n$.
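The tower function and the iterated logarithm just defined can be tried out numerically. The following is a minimal sketch; the function names `tower` and `iterated_lg` are chosen here for illustration, and floating-point logarithms are used, so the identity is only checked on inputs where every intermediate value is an exact power of two.

```python
import math


def tower(h: int, r: float) -> float:
    """T(h, r): a tower of 2s of height h with r on top."""
    value = r
    for _ in range(h):
        value = 2 ** value
    return value


def iterated_lg(i: int, n: float) -> float:
    """The i-fold base-2 logarithm of n (n must stay positive throughout)."""
    value = n
    for _ in range(i):
        value = math.log2(value)
    return value
```

For instance, `tower(2, iterated_lg(2, 65536))` evaluates the identity T(h, lg^(h) n) = n for h = 2 and n = 65536. Note that the iterated logarithm is only defined as long as the intermediate values remain positive.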
3. Succinct encodings
words encode the same number smaller than T(h, n). This is what Lemma 8, the key result
of this section, states.
For all $h \ge 1$ we let $\Sigma_h = \{0, 1, <1>, </1>, \ldots, <h>, </h>\}$. The tags <i> and </i> represent single letters of the alphabet and are just chosen to improve readability. We define $L : \mathbb{N} \to \mathbb{N}$ by $L(0) = 0$, $L(1) = 1$, $L(n) = \lfloor \lg(n - 1) \rfloor + 1$ for $n \ge 2$. Note that for $n \ge 1$, $L(n)$ is precisely the length of the binary representation of $n - 1$.
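We are now ready to define our encodings $\mu_h : \mathbb{N} \to \Sigma_h^*$, for $h \ge 1$. We let
$\mu_1(0) =$ <1></1> and
$\mu_1(n) =$ <1> bit(0, n − 1) bit(1, n − 1) . . . bit(L(n) − 1, n − 1) </1>
for $n \ge 1$. For $h \ge 2$ we let $\mu_h(0) =$ <h></h> and
$\mu_h(n) =$ <h>
$\mu_{h-1}(0)$ bit(0, n − 1)
$\mu_{h-1}(1)$ bit(1, n − 1)
...
$\mu_{h-1}(L(n) - 1)$ bit(L(n) − 1, n − 1)
</h>
for $n \ge 1$. Here empty spaces and line breaks are just used to improve readability.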
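The encodings defined above are easy to compute recursively. The sketch below represents a word as a list of letters, so that each tag counts as a single letter. The function names are chosen for this illustration, and the case of the number 0 at height h ≥ 2 is assumed to be the empty pair of tags, by analogy with height 1.

```python
def bit(i: int, n: int) -> int:
    """The i-th bit of n, counting the least significant bit as bit 0."""
    return (n >> i) & 1


def length_L(n: int) -> int:
    """L(0) = 0, L(1) = 1, and L(n) = floor(lg(n - 1)) + 1 for n >= 2."""
    if n <= 1:
        return n
    return (n - 1).bit_length()


def encode(h: int, n: int) -> list:
    """The height-h encoding of n as a list of letters (tags are single letters)."""
    open_tag, close_tag = f"<{h}>", f"</{h}>"
    if n == 0:
        return [open_tag, close_tag]  # assumption: same shape for every height
    letters = [open_tag]
    for i in range(length_L(n)):
        if h > 1:
            # Each bit is preceded by the encoding of its own position.
            letters.extend(encode(h - 1, i))
        letters.append(str(bit(i, n - 1)))
    letters.append(close_tag)
    return letters
```

Calling `"".join(encode(2, 47))` yields the word for the number 47 at height 2: the six bits of 46, lowest first, each preceded by the height-1 encoding of its position.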
Example 5.
$\mu_2(47) =$
<2>
$\mu_1(0)$ 0
$\mu_1(1)$ 1
$\mu_1(2)$ 1
$\mu_1(3)$ 1
$\mu_1(4)$ 0
$\mu_1(5)$ 1
</2>
=
<2>
<1></1> 0
<1>0</1> 1
<1>1</1> 1
<1>01</1> 1
<1>11</1> 0
<1>001</1> 1
</2>
Lemma 6.
$P_i(n) = \prod_{j=1}^{i} L^{(j)}(n)$.
Observe that for all $i \ge 2$ and $n \ge 1$ we have $P_i(n) = L(n) \cdot P_{i-1}(L(n))$.
We first prove, by induction on $h \ge 1$, that for all $n \ge 1$,
$|\mu_h(n)| \le 4h \cdot P_h(n)$. (1)
We have $|\mu_1(n)| = 2 + L(n) \le 4L(n) = 4P_1(n)$, so (1) is true for $h = 1$. Let $h \ge 2$ and suppose that (1) holds for $h - 1$. Then
$|\mu_h(n)| = 2 + L(n) + \sum_{i=0}^{L(n)-1} |\mu_{h-1}(i)|$
$= 2 + L(n) + 2 + \sum_{i=1}^{L(n)-1} |\mu_{h-1}(i)|$
$\le 4 + L(n) + \sum_{i=1}^{L(n)-1} 4(h - 1) \cdot P_{h-1}(i)$
Note that $P = \{P_h(m) \mid m < n_0, h \ge 1\}$ is a finite set and let $c = \max(P)$.
We prove that $P_h(n) \le c \cdot L(n)^2$ by induction on $h \ge 1$. Since $P_1(n) = L(n)$, this statement is true for $h = 1$. For $h \ge 2$, we have $P_h(n) = L(n) \cdot P_{h-1}(L(n))$. If $L(n) < n_0$, we have $P_{h-1}(L(n)) \le c$ and thus $P_h(n) \le c \cdot L(n)$. If $L(n) \ge n_0$, we have $L(L(n))^2 \le L(n)$. By induction hypothesis, $P_{h-1}(L(n)) \le c \cdot L(L(n))^2$. Thus
Lemma 7. There is an algorithm that, given $h, n \in \mathbb{N}$, computes $\mu_h(n)$ in time $O(|\mu_h(n)|) = O(h \cdot \lg^2 n)$.
Proof. The algorithm computes $\mu_h(n)$ in a straightforward recursive manner. We get the following recurrence for the running time $R(h, n)$:
$R(h, n) \le O(L(n)) + \sum_{i=0}^{L(n)} R(h - 1, i)$.
This recurrence is very similar to the one we obtained in the proof of Lemma 6 and can easily be solved using the same methods.
Observe that for all $m \ge 1$ we have
$L(2^m) = m$. (2)
Recall that $T(h, \ell)$ is a tower of 2s of height h with an $\ell$ on top. Thus, in particular, for all $h, \ell \ge 1$ we have
$L(T(h, \ell)) = T(h - 1, \ell)$.
$W \models \chi_{h,\ell}(a, b) \iff m = n$.
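Proof. Let $h = 1$. Recall that the $\mu_1$-encoding of an integer $p \ge 1$ is just the binary encoding of $p - 1$ enclosed in <1>, </1>. Hence to say that x and y are $\mu_1$-encodings of the same numbers, we have to say that for all pairs $x + i$, $y + i$ of corresponding positions between x respectively y and the next closing </1>, there are the same letters at $x + i$ and $y + i$. For numbers p in $\{0, \ldots, T(1, \ell)\}$, there are at most $L(p) \le \ell$ positions to be investigated. To express this, we let
$\chi_{1,\ell}(x, y) = \exists x_1 \ldots \exists x_\ell \, \exists y_1 \ldots \exists y_\ell \Bigl( Sx = x_1 \wedge \bigwedge_{i=1}^{\ell-1} \bigl( (P_{</1>} x_i \to x_i = x_{i+1}) \wedge (\neg P_{</1>} x_i \to Sx_i = x_{i+1}) \bigr)$
$\wedge \; Sy = y_1 \wedge \bigwedge_{i=1}^{\ell-1} \bigl( (P_{</1>} y_i \to y_i = y_{i+1}) \wedge (\neg P_{</1>} y_i \to Sy_i = y_{i+1}) \bigr)$
$\wedge \; \bigwedge_{i=1}^{\ell} \bigl( (P_0 x_i \leftrightarrow P_0 y_i) \wedge (P_1 x_i \leftrightarrow P_1 y_i) \bigr) \Bigr)$.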
Now let $h \ge 2$ and suppose that we have already defined $\chi_{h-1,\ell}(x, y)$. It will be convenient to have the following auxiliary formulas available:
$\chi^{h}_{\mathrm{int}}(x, y) = x < y \wedge \forall z \bigl( (x < z \wedge z \le y) \to \neg P_{</h>} z \bigr)$,
$\chi^{h}_{\mathrm{last}}(x, y) = x < y \wedge P_{</h>} y \wedge \forall z \bigl( (x < z \wedge z < y) \to \neg P_{</h>} z \bigr)$.
Intuitively, $\chi^{h}_{\mathrm{int}}(x, y)$ says that y is in the interior of the subword of the form $\mu_h(p)$ starting at x and $\chi^{h}_{\mathrm{last}}(x, y)$ says that y is the last position of the subword of the form $\mu_h(p)$ starting at x, provided such a subword indeed starts at x.
To say that the subwords starting at x and y are $\mu_h$-encodings of the same numbers, we have to say that for all positions w between x and the next closing </h> and all positions z between y and the next closing </h>, if w and z are first positions of subwords isomorphic to $\mu_{h-1}(q)$ for some $q \in \mathbb{N}$, then the positions following these two subwords are either both 1s or both 0s. For all subwords of $\mu_h(p)$ of the form $\mu_{h-1}(q)$ we have $q \in \{0, \ldots, L(p)\}$. In order to apply the formula $\chi_{h-1,\ell}$ to test equality of such subwords, we must have $q \le L(p) \le T(h - 1, \ell)$. By (2), the last inequality holds for all $p \le T(h, \ell)$. Thus for such p we can use the formula $\chi_{h-1,\ell}$ to test equality of subwords of $\mu_h(p)$ of the form
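$\chi'_{h,\ell}(x, y) = \forall w \Bigl( \bigl( \chi^{h}_{\mathrm{int}}(x, w) \wedge P_{<h-1>} w \bigr) \to \exists z \bigl( \chi^{h}_{\mathrm{int}}(y, z) \wedge P_{<h-1>} z \wedge \chi_{h-1,\ell}(w, z) \bigr) \Bigr)$
$\wedge \; \forall z \Bigl( \bigl( \chi^{h}_{\mathrm{int}}(y, z) \wedge P_{<h-1>} z \bigr) \to \exists w \bigl( \chi^{h}_{\mathrm{int}}(x, w) \wedge P_{<h-1>} w \wedge \chi_{h-1,\ell}(w, z) \bigr) \Bigr)$
$\wedge \; \forall w \forall z \Bigl( \bigl( \chi^{h}_{\mathrm{int}}(x, w) \wedge P_{<h-1>} w \wedge \chi^{h}_{\mathrm{int}}(y, z) \wedge P_{<h-1>} z \wedge \chi_{h-1,\ell}(w, z) \bigr)$
$\to \forall w' \forall z' \bigl( ( \chi^{h-1}_{\mathrm{last}}(w, w') \wedge \chi^{h-1}_{\mathrm{last}}(z, z') ) \to ( P_1 Sz' \leftrightarrow P_1 Sw' ) \bigr) \Bigr)$.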
The first line of this formula says that every subword of the form $\mu_{h-1}(q)$ in the subword of the form $\mu_h(p)$ starting at x also occurs in the subword of the form $\mu_h(p)$ starting at y. The second line says that every subword of the form $\mu_{h-1}(q)$ in the subword of the form $\mu_h(p)$ starting at y also occurs in the subword of the form $\mu_h(p)$ starting at x. The third and fourth lines say that if w and z are the first positions of isomorphic subwords of the form $\mu_{h-1}(q)$, then they are either both followed by a 1 or both by a 0 (since the only two letters that can appear immediately after a subword $\mu_{h-1}(q)$ in a word $\mu_h(p)$ are 0 and 1).
This formula says what we want, but unfortunately it is too large to achieve the desired bounds. The problem is that there are three occurrences of the subformula $\chi_{h-1,\ell}(w, z)$. We can easily overcome this problem. We let
h1
h1
(w, w
) last
(z, z
) P1 Sz
P1 Sw
(w, z) = w
z
last
and
h, (x, y) = wz
h
h
(x, w) int
(y, z)
int
h
int
(y, w) int
(x, z)
P<h - 1> w P<h - 1> z
h
h
int (y, w) int
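$\mu_h(\delta) =$ <clause> $\mu_h(\lambda_1) \cdots \mu_h(\lambda_m)$ </clause>,
$\mu_h(\gamma) =$ <cnf> $\mu_h(\delta_1) \cdots \mu_h(\delta_m)$ </cnf>.
Next, we need to encode assignments. Let A(n) denote the set of all assignments
$\alpha : \{X_0, \ldots, X_{n-1}\} \to \{\mathrm{TRUE}, \mathrm{FALSE}\}$.
We add the symbols <val>, </val>, <asn>, </asn>, true, false to our alphabet. For an assignment $\alpha \in A(n)$, we let
$\mu_h(\alpha) =$ <asn>
<val>$\mu_h(0)$ $\alpha(X_0)$</val>
...
<val>$\mu_h(n - 1)$ $\alpha(X_{n-1})$</val>
</asn>.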
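Of course what is meant by $\alpha(X_i)$ here is the symbol true if $\alpha(X_i) = \mathrm{TRUE}$ and the symbol false otherwise. For a pair $(\gamma, \alpha) \in \mathrm{CNF}(n) \times A(n)$ we simply let $\mu_h(\gamma, \alpha) = \mu_h(\gamma)\, \mu_h(\alpha)$.
The following lemma is an immediate consequence of Lemmas 6 and 7:
Lemma 9. Let $h \in \mathbb{N}$ and $(\gamma, \alpha) \in \mathrm{CNF}(n) \times A(n)$. Then $|\mu_h(\gamma, \alpha)| = O(h \cdot \lg^2 n \cdot (\|\gamma\| + n))$ and there is an algorithm that computes $\mu_h(\gamma, \alpha)$ in time $O(h \cdot \lg^2 n \cdot (\|\gamma\| + n))$ (that is, linear in the size of the output).
Lemma 10. For all $h, \ell \in \mathbb{N}$ there is a first-order sentence $\varphi_{h,\ell}$ of size $O(h \lg h + \ell)$ such that for all $n \le T(h, \ell)$ and $(\gamma, \alpha) \in \mathrm{CNF}(n) \times A(n)$,
$\mu_h(\gamma, \alpha) \models \varphi_{h,\ell} \iff \alpha \models \gamma$.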
Proof. Let $\chi_{h,\ell}(x, y)$ be the formula defined in Lemma 8. Recall that it says that the subwords of the form $\mu_h(m)$ and $\mu_h(n)$ starting at x, y, respectively, are identical, provided that such subwords start at x and y and that $n, m \le T(h, \ell)$. Also recall the formula
$\chi^{h}_{\mathrm{last}}(x, y) = x < y \wedge P_{</h>} y \wedge \forall z \bigl( (x < z \wedge z < y) \to \neg P_{</h>} z \bigr)$,
defined in the proof of Lemma 8, which says that y is the last position of the subword of the form $\mu_h(n)$ starting at x.
lit
We first define a formula h,
(x) such that if the subword of starting at x is the
encoding of a literal, then it is satisfied by . We let
lit
h
h
h,
(x) = yx
y
(P<val> y h, (Sx, Sy) last
(Sx, x
) last
(Sy, y
)
It simply says that there is a position y which is still within the boundary of the clause
starting at x such that a literal starts at y and this literal is satisfied. Finally, we let
clause
h, (x) = y(P<clause> y h,
(y)).
This formula says that all clauses and thus the whole CNF-formula are satisfied.
For reasons that will become clear in the next section, we will also have to encode tuples $(\gamma, V_1, \ldots, V_t)$, where $\gamma \in \mathrm{CNF}(n)$ and $V_1, \ldots, V_t$ is a partition of $\{1, \ldots, n\}$. We add symbols V1, . . . , Vt to the alphabet. So now our alphabet depends on the two parameters h and t. For every $i \in \{0, \ldots, n - 1\}$ and $1 \le j \le t$ we let part(i) = Vj if $X_i \in V_j$. Then we let
$\mu_h(\gamma, V_1, \ldots, V_t) = \mu_h(\gamma)$<asn>
Even in the case $t = 1$ it will be useful to work with the encoding $\mu_h(\gamma, \{0, \ldots, n - 1\})$ instead of just $\mu_h(\gamma)$, because the word $\mu_h(\gamma, \{0, \ldots, n - 1\})$ already provides
the infrastructure for an assignment. For brevity, we write h ( , ) instead of
h ( , {0, . . . , n 1}).
5. Satisfiability testing through model-checking
In this section, we prove Theorem 1.
Theorem 11. Assume that $\mathrm{PTIME} \neq \mathrm{NP}$. Let $h \in \mathbb{N}$ and p a polynomial. Then there is no algorithm for MC(MSO, W) whose running time is bounded by
$T(h, k) \cdot p(n)$.
As usual, k denotes the size of the input sentence and n the size of the input word.
Proof. Suppose that there is an algorithm A for MC(MSO, W) whose running time is
bounded by
T(h, k) p(n),
where h+1,
is the formula obtained from the formula h+1, of Lemma 10 by replacing
the subformula Ptrue Sy
by X Sy
. Recall that Ptrue Sy
is the only subformula of h+1,
that involves either Ptrue or Pfalse . The subformula x(X x PV1 x) says that X only
contains elements that are at a position with symbol V1, which may simply be viewed as a
placeholder for true or false in an assignment. The intended meaning of X is to indicate
all variables set to TRUE. It is easy to see that for every n
T(h+1, ) and 3-CNF(n
)
we have
h+1,
h+1 ( , ) |=
is satisfiable.
(3)
Consider the algorithm displayed in Fig. 1, which decides if the input formula is
satisfiable. The correctness of the algorithm follows from (3) and
n = T(h + 1, lg
(h+1)
(n )) T(h + 1, lg
(h+1)
For the running time analysis, without loss of generality we can assume that n
O((n
)3 ), that is, that and n
are polynomially related. We claim that the running time
of the algorithm is bounded by q(n
) for some polynomial q depending only on the fixed
constant h.
Lines 13 of the algorithm can be implemented in time polynomial in h, n
. Recall that
by Lemma 9, |h+1 ( , )| is polynomially bounded in terms of h and n
. Thus by our
assumption on the algorithm A, Line 4 requires time
(n )!).
h+1, ) p
(h, n
),
T(h,
h+1, ) p(|h+1 ( , )|) T(h,
Fig. 1.
Theorem 12 (Downey et al. [6], Flum and Grohe [9]). If AWSAT[3-CNF] is fixed-parameter tractable then $\mathrm{AW}[*] = \mathrm{FPT}$.
Theorem 13. Assume that $\mathrm{FPT} \neq \mathrm{AW}[*]$. Let $h \in \mathbb{N}$ and p a polynomial. Then there is no algorithm for MC(FO, W) whose running time is bounded by
$T(h, k) \cdot p(n)$.
As usual, k denotes the size of the input sentence and n the size of the input word.
To prove this theorem, we will use the following alternative characterization of fixed-parameter tractability. A parameterized problem $P \subseteq \Sigma^* \times \mathbb{N}$ is eventually in polynomial time if there is a computable function f and an algorithm, whose running time is polynomial in $|x|$, that, given an instance $(x, k) \in \Sigma^* \times \mathbb{N}$ of P with $|x| \ge f(k)$, correctly decides if $(x, k) \in P$. (The behaviour of the algorithm on instances $(x, k) \in \Sigma^* \times \mathbb{N}$ with $|x| < f(k)$ is irrelevant.)
Lemma 14 (Flum and Grohe [8]). A parameterized problem is fixed-parameter tractable
if, and only if, it is computable and eventually in polynomial time.
Proof of Theorem 13. Suppose that there is an algorithm A for MC(FO, W) whose running time is bounded by
$T(h, k) \cdot p(n)$,
for some $h \in \mathbb{N}$ and polynomial p. We shall prove that AWSAT[3-CNF] is in FPT.
For all $h, \ell, k, t \in \mathbb{N}$, let $\varphi'_{h+1,\ell,k,t}$ be the formula obtained from the formula $\varphi_{h+1,\ell}$ of Lemma 10 by replacing the (unique) subformula $P_{\mathrm{true}} Sy'$ by $\bigvee_{i=1}^{t} \bigvee_{j=1}^{k} Sy' = x_{ij}$, for
Fig. 2.
for a suitable polynomial p
. Using a similar argument as in the proof of Theorem 11, we
can now derive that there is a computable n 0 depending on h, k
, t such that for all n
n 0
we have
i=1
Qxt 1 . . . Qxt k
k
i=1
i=1
PV2 x2i
k
i=1
..
.
k1
i=1
PVt xt i
k1
i=1
T(h,
h+1,,k
,t ) T(h, lg(h) (n
)) n
.
This proves our claim that if n
is sufficiently large, then the running time of the algorithm
is bounded by q(n
) for some polynomial q and thus the theorem.
xt i < xt (i+1)
h+1,,k,t
.
$\mu_{h+1}(\gamma, V_1, \ldots, V_t) \models \varphi_{h+1,\ell,k,t} \iff (\gamma, V_1, \ldots, V_t)$ with parameters $(k, t)$ is a yes-instance of AWSAT[3-CNF]. (4)
For the running time analysis, without loss of generality we assume that n
O((n
)2 ). We claim that if n
is sufficiently large, then the running time of the algorithm is bounded by $q(n')$ for some polynomial q. More precisely, we claim that there is a polynomial q and an $n_0 \in \mathbb{N}$, which is computable from $h, k', t$, such that for $n' \ge n_0$ the running time of the algorithm is bounded by $q(n')$. Since h is fixed and since AWSAT[3-CNF] is computable, by Lemma 14 this implies that AWSAT[3-CNF] is in FPT.
Remark 15. For readers familiar with least fixed-point logic, let us point out that with the same techniques it can be proved that there is no model-checking algorithm for monadic least fixed-point logic on words whose running time is bounded by $T(h, k) \cdot p(n)$, for any $h \in \mathbb{N}$ and polynomial p, under the weaker assumption that $\mathrm{AW}[P] \neq \mathrm{FPT}$.
AW[P] is a parameterized complexity class that contains $\mathrm{AW}[*]$. A complete problem for AW[P] is the alternating weighted satisfiability problem for arbitrary Boolean circuits (as opposed to bounded depth circuits for $\mathrm{AW}[*]$).
6. First-order model-checking on structures of bounded degree
In this and the next section, we investigate the parameterized complexity of first-order model-checking over structures of bounded degree. Let $\mathcal{A}$ be a $\tau$-structure for some vocabulary $\tau$. We call two elements $a, b \in A$ adjacent if they are distinct and there is an $R \in \tau$, say, r-ary, and a tuple $a_1 \ldots a_r \in R^{\mathcal{A}}$ such that $a, b \in \{a_1, \ldots, a_r\}$. The degree of an element $a \in A$ in the structure $\mathcal{A}$ is the number of elements adjacent to a, and the degree of $\mathcal{A}$ is the maximum degree of its elements. For $d \ge 1$, we denote the class of all structures of degree at most d by D(d).
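The definition of adjacency and degree given above can be made concrete with a small sketch. The representation is an assumption made for this example only: a structure is given by its universe and a dictionary mapping each relation symbol to a set of tuples, and the function name `degree_of_structure` is hypothetical.

```python
from itertools import combinations


def degree_of_structure(universe, relations):
    """Maximum number of elements adjacent to a single element.

    Two distinct elements are adjacent if they occur together in some tuple
    of some relation of the structure.
    """
    adjacent = {a: set() for a in universe}
    for tuples in relations.values():
        for tup in tuples:
            # set(tup) drops repeated entries, so an element is never
            # counted as adjacent to itself.
            for a, b in combinations(set(tup), 2):
                adjacent[a].add(b)
                adjacent[b].add(a)
    return max((len(nbrs) for nbrs in adjacent.values()), default=0)
```

For example, with universe {0, 1, 2, 3}, a binary relation {(0, 1), (1, 2)} and a ternary relation {(1, 2, 3)}, the element 1 is adjacent to 0, 2 and 3, so the structure has degree 3.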
It is quite easy to derive from Seese's proof a triply-exponential upper bound on f for a non-uniform version of this theorem, stating that for every fixed first-order sentence $\varphi$ there is a triply exponential function f and an algorithm checking whether a given structure $\mathcal{A}$ of degree at most d satisfies $\varphi$. We shall prove a uniform version of this result, which has the additional benefit that our algorithm is quite simple.
The crucial idea, which has also been explored by Seese, is to use the locality of first-order logic. Without loss of generality we assume that vocabularies only contain relation
lg d2 O(k)
21
n,
where as usual k denotes the size of the input sentence and n the size of the input
structure.
Proof. We denote the running time of model-check($\varphi$, $\mathcal{A}$) by $R(n, p, q)$, where $n = \|\mathcal{A}\|$, $q = \mathrm{qr}(\varphi)$, and p is the size of the quantifier-free part of $\varphi$. Note that $p + q \le k$ $(= \|\varphi\|)$. Let $r = r(q) = 2^q$,
$s(q) =$
Fig. 3.
and constant symbols. (Functions can easily be simulated by relations.) We need some additional notation. A path of length l is a sequence of vertices $a_0, \ldots, a_l \in A$ such that $a_{i-1}, a_i$, $i = 1, \ldots, l$, are adjacent in $\mathcal{A}$. The distance between two elements $a, b \in A$ of the universe is 0, if $a = b$, and r, if the shortest path between a and b has length r. Let $r \ge 1$ and $a \in A$. The r-neighbourhood of a in $\mathcal{A}$, denoted by $N_r^{\mathcal{A}}(a)$, is the set of $b \in A$ such that a, b have distance at most r. Let $\mathcal{N}_r^{\mathcal{A}}(a)$ denote the substructure induced by $\mathcal{A}$ on $N_r^{\mathcal{A}}(a)$. For elements a, b of a structure $\mathcal{A}$ we write $a \cong_r^{\mathcal{A}} b$ if there is an isomorphism from $\mathcal{N}_r^{\mathcal{A}}(a)$ to $\mathcal{N}_r^{\mathcal{A}}(b)$ that maps a to b.
Recall that $\mathrm{qr}(\varphi)$ denotes the quantifier-rank of a formula $\varphi$.
Lemma 17 ([11,13]). For every first-order formula $\varphi(x)$ there is an $r \ge 1$ such that for every structure $\mathcal{A}$ and $a, b \in A$ we have $\bigl( a \cong_r^{\mathcal{A}} b \implies ( \mathcal{A} \models \varphi(a) \iff \mathcal{A} \models \varphi(b) ) \bigr)$.
Furthermore, r can be chosen to be $2^{\mathrm{qr}(\varphi)}$.
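The r-neighbourhood of an element can be collected by a breadth-first search that stops at depth r. The sketch below assumes that adjacency is already given as a dictionary from each element to the set of its adjacent elements; this representation and the function name `neighbourhood` are choices made for the illustration.

```python
from collections import deque


def neighbourhood(adjacent, a, r):
    """All elements at distance at most r from a."""
    dist = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand beyond distance r
        for v in adjacent[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)
```

On a path 0, 1, 2, 3, 4 (a structure of degree 2), the 1-neighbourhood of 2 is {1, 2, 3}. In a structure of degree 2 the search can grow in at most two directions, so an r-neighbourhood has at most 2r + 1 elements.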
O(k)
n,
a A,AC
the maximal size of an r-neighbourhood, and let $t(q)$ denote the number of equivalence classes of $\cong_r^{\mathcal{A}}$. Note that there exist upper bounds for $s(q)$ and $t(q)$ only depending on the degree of the input structure (and not on n or $\varphi$). Remember that the degree is constant for the classes under consideration.
Now consider the algorithm displayed in Fig. 3. Line 1 only requires constant time. If Line 2 is executed, it requires time $O(p \cdot n)$, and the algorithm stops. Otherwise, it proceeds to Line 3, which can be executed in constant time. To execute Line 4, we maintain a list of pairs $(\mathcal{N}_r^{\mathcal{A}}(a), a)$ such that no induced substructure $(\mathcal{N}_r^{\mathcal{A}}(a), a)$ occurs twice. The size of this list never exceeds $t(q)$, hence for each a in turn, we simply compute the induced substructure, and look if it is already in the list. This requires time $O(n \cdot f(s(q)) \cdot t(q))$, if we denote the time to check isomorphism of structures of size m by $f(m)$. The loop in Lines 5-9 requires time
$O(t(q)) + t(q) \cdot R(n, p, q - 1)$.
(for $q \ge 1$),
for suitable constants $c_1, c_2$. To solve this equation, we use the following simple lemma:
$\sum_{i=0}^{m} g(i) \prod_{j=i+1}^{m} h(j)$
for all $m \in \mathbb{N}$.
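The closed form above can be checked numerically. The statement of the lemma is incomplete in the text, so the recurrence used below is an assumption: F(0) = g(0) and F(m) = g(m) + h(m) · F(m − 1) for m ≥ 1. Under that assumption, unrolling gives exactly the sum of products shown above. The function names are chosen for this sketch.

```python
from math import prod


def by_recurrence(g, h, m):
    """Evaluate F(m) with F(0) = g(0) and F(i) = g(i) + h(i) * F(i - 1)."""
    value = g(0)
    for i in range(1, m + 1):
        value = g(i) + h(i) * value
    return value


def by_closed_form(g, h, m):
    """Evaluate the sum over i of g(i) times the product of h(j) for i < j <= m."""
    return sum(
        g(i) * prod(h(j) for j in range(i + 1, m + 1))
        for i in range(m + 1)
    )
```

Both functions agree for any integer-valued g and h; in the running time analysis, h plays the role of the number t(j) of neighbourhood types and g the role of the per-level cost.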
$R(n, p, q) \le \prod_{j=1}^{q} t(j) \cdot c_1 \, p \, n + \sum_{i=1}^{q} c_2 \, n \, f(s(i)) \, t(i) \prod_{j=i+1}^{q} t(j)$
$\le \prod_{j=1}^{q} t(j) \cdot \Bigl( c_1 \, p \, n + \sum_{i=1}^{q} c_2 \, n \, f(s(i)) \Bigr)$.
To give an upper bound on $t(q)$, we have to take into account the number u of symbols in the vocabulary. Since we only have to consider symbols that actually appear in $\varphi$, we can assume that $u \le k$. Moreover, without loss of generality we can assume that the vocabulary only contains unary and binary relation symbols (because we are considering structures of degree 2).
Let us count the number of isomorphism types of an m-vertex structure $\mathcal{B}$ of degree 2 whose vocabulary contains $u_1$ unary relation symbols and $u_2$ binary relation symbols. The unary relations can take at most $2^{u_1 m}$ different values. There are at most m pairs of elements which can be connected by a binary relation, thus the binary relations can take at most $2^{u_2 m}$ different values. Thus the overall number of isomorphism types is bounded by $2^{(u_1 + u_2) m}$.
Our r-neighbourhoods have size at most $2r + 1$, so we obtain
$t(q) \le 2^{O(k \cdot 2^{q})}$.
Thus
$\prod_{j=1}^{q} t(j) \le \prod_{j=1}^{q} 2^{O(k \cdot 2^{j})} \le 2^{O(k \sum_{j=1}^{q} 2^{j})} \le 2^{2^{O(k)}}$
and thus
$R(n, p, q) \le 2^{2^{O(k)}} \cdot n$.
Degree at least 3: The calculations are similar in this case, the only important difference
being that an r -neighbourhood may be of size (d r ) and thus doubly exponential in q,
which yields a triply exponential bound for R.
m = n.
Note that Lemma 8 only provides a formula $\chi_{1,l}(x, y)$ that works for $m, n \le 2^{l}$.
Before we prove the lemma, we define a few basic formulas and notations that we need in dealing with words without order. Let $\psi(x, y)$ be a formula. For a structure $\mathcal{A}$, elements $a, b \in A$, and $\ell \ge 0$, a $\psi$-path of length $\ell$ from a to b is a sequence $a_0, a_1, \ldots, a_\ell$ of elements of A such that $a_0 = a$, $a_\ell = b$, and $\mathcal{A} \models \psi(a_i, a_{i+1})$ for $0 \le i < \ell$. We let $b -_{\psi} a$ be the minimum length of a $\psi$-path from a to b if there is such a path. If there is no $\psi$-path from a to b, we let $b -_{\psi} a = \infty$.
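Lemma 21. Let $\ell \ge 1$ and $\psi(x, y)$ a first-order formula.
(1) There exists a first-order formula $\psi^{\ell}(x_1, x_2)$ of size $O(\ell)$ such that for every structure $\mathcal{A}$ and all $a_1, a_2 \in A$,
$\mathcal{A} \models \psi^{\ell}(a_1, a_2) \iff a_2 -_{\psi} a_1 \le 2^{\ell}$.
(2) There exists a first-order formula $\psi^{\ell}_{=}(x_1, x_2, y_1, y_2)$ of size $O(\ell)$ such that for every structure $\mathcal{A}$ and all elements $a_1, a_2, b_1, b_2 \in A$,
$\mathcal{A} \models \psi^{\ell}_{=}(a_1, a_2, b_1, b_2) \iff a_2 -_{\psi} a_1 \le 2^{\ell} \wedge a_2 -_{\psi} a_1 = b_2 -_{\psi} b_1$.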
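The idea behind part (1) of the lemma can be illustrated by the usual halving argument: a path of length at most 2^l exists exactly if there is a midpoint that is reachable within 2^(l−1) steps and from which the target is reachable within 2^(l−1) steps. The sketch below evaluates this definition directly on a finite relation given as a set of pairs. It is an illustration of the semantics only, under the assumption that the omitted proof of (1) follows this pattern, and the function name is hypothetical.

```python
def within_power_of_two(edges, nodes, a, b, l):
    """True iff some path from a to b along `edges` has length at most 2**l."""
    if l == 0:
        # Length 0 (a equals b) or length 1 (a single step).
        return a == b or (a, b) in edges
    # Guess a midpoint c and check both halves with the smaller bound.
    return any(
        within_power_of_two(edges, nodes, a, c, l - 1)
        and within_power_of_two(edges, nodes, c, b, l - 1)
        for c in nodes
    )
```

The code makes two recursive calls per level, which mirrors the meaning of the formula but not its size. A formula of size linear in l has to mention the formula of the previous level only once, which is what the universally quantified variables in the proof of part (2) achieve.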
Proof. We only prove (2); the proof of (1) is similar, but simpler. We let
$\psi^{0}_{=}(x_1, x_2, y_1, y_2) = (x_1 = x_2 \wedge y_1 = y_2) \vee \bigl( x_1 \neq x_2 \wedge y_1 \neq y_2 \wedge \psi(x_1, x_2) \wedge \psi(y_1, y_2) \bigr)$,
and for $\ell \ge 1$
$\psi^{\ell}_{=}(x_1, x_2, y_1, y_2) = \psi^{0}_{=}(x_1, x_2, y_1, y_2) \vee \exists x_3 \exists y_3 \forall x \forall x' \forall y \forall y' \Bigl( \bigl( (x = x_1 \wedge x' = x_3 \wedge y = y_1 \wedge y' = y_3)$
$\vee \; (x = x_3 \wedge x' = x_2 \wedge y = y_3 \wedge y' = y_2) \bigr) \to \psi^{\ell-1}_{=}(x, x', y, y') \Bigr)$.
Fig. 4.
Recall that 3-CNF(n) denotes the set of all formulas in 3-conjunctive normal form whose variables are among $X_0, \ldots, X_{n-1}$ and that A(n) denotes the set of all truth-value assignments to these variables. Recall further the encodings of propositional formulas introduced in Section 4.
Lemma 22. For all $l \in \mathbb{N}$ there is a first-order sentence $\varphi_l$ of size $O(l)$ such that for all $n \le 2^{2^{l}}$ and $(\gamma, \alpha) \in 3\text{-CNF}(n) \times A(n)$ we have $\mu_1(\gamma, \alpha) \models \varphi_l \iff \alpha \models \gamma$. Furthermore, $\varphi_l$ can be computed in time $O(l)$.
Proof. Essentially, we proceed as for words with order. Suppose that there is an algorithm
A for the problem MC(FO, W) whose running time is bounded by
22
Observe that the length of an encoding $\mu_1(n)$ for an $n \le 2^{2^{\ell}}$ is in $O(2^{\ell})$. We have seen above that we can describe subwords of length up to $2^{\ell}$ by formulas of length $O(\ell)$ that only use the successor relation. Therefore, replace $\chi^{h}_{\mathrm{last}}(x, y)$ by a formula of length $O(\ell)$ that only involves the successor relation.
Moreover, since we are only considering 3-CNF(n) formulas for n 22 , subwords
describing clauses have length O(). Thus again we can replace the subformulas involving
the order symbol by suitable formulas of length O().
Note that the previous proof does not work for arbitrary CNF-formulas; it is crucial that
the clauses have bounded length.
We are now ready to prove the main result of this section (which is Theorem 2(1)):
Theorem 23. Assume that $\mathrm{FPT} \neq \mathrm{AW}[*]$, and let p be a polynomial. Then there is no algorithm for MC(FO, S) whose running time is in
$2^{2^{o(k)}} \cdot p(n)$,
where k denotes the size of the input sentence and n the size of the input word.
p(n),
for some polynomial p and a function f (k) o(k). We shall prove that AWS AT[3-CNF]
is in FPT.
For all , k, t N, let
k
k1
PV1 x1i
x1i < x1(i+1)
,k,t = x11 . . . x1k
x21 . . . x2k
Proof. Recall the proof of Lemma 10. Instead of the formula $\chi_{h,\ell}$ we now use the formula of Lemma 20. We have to eliminate all occurrences of the order symbol <, which is used in the formulas $\chi^{h}_{\mathrm{last}}(x, y)$ and $\varphi^{\mathrm{clause}}_{h,\ell}$.
2
f (k)
i=1
Qxt 1 . . . Qxt k
k
i=1
i=1
PV2 x2i
k
i=1
..
.
k1
i=1
PVt xt i
k1
i=1
xt i < xt (i+1)
,k,t
,
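where $\varphi'_{\ell,k,t}$ is the formula obtained from the formula $\varphi_{\ell}$ of Lemma 22 by replacing the (unique) subformula $P_{\mathrm{true}} Sy'$ by $\bigvee_{i=1}^{t} \bigvee_{j=1}^{k} Sy' = x_{ij}$. Then for every $n \le 2^{2^{\ell}}$, $\gamma \in 3\text{-CNF}(n)$, $k \in \mathbb{N}$, and for every partition $V_1, \ldots, V_t$ of $\{0, \ldots, n - 1\}$ we have
$\mu_1(\gamma, V_1, \ldots, V_t) \models \varphi_{\ell,k,t} \iff (\gamma, V_1, \ldots, V_t)$ with parameters $(k, t)$ is a yes-instance of AWSAT[3-CNF]. (5)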
f (
,k
)
p(n),
Lemma 25. Let 1. There is a formula (x, y) of vocabulary B ({0, 1}) of size O()
such that for all ordered binary trees T B, a , b T and m, n {0, . . . , 22 } the
following holds:
If the depth 2 subtree below a is isomorphic to (n) and the depth 2 subtree below b
is isomorphic to (m) then
2l
T |= (a , b)
l,k
for some polynomial p
and constant c. Hence for sufficiently large n
we have
c
lg lg n
, say, for c
= 2c. Since f (k) o(k), there is an n 0 such that for all n
n 0 we
have f (c
lg lg n
) lg lg n
and thus
22
f (
l,k
)
22
f (c lg lg n )
22
lg lg n
n .
This gives us the desired upper bound on the running time of our algorithm.
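The last step of the estimate can be spelled out as a short worked chain. This is only a restatement of the inequalities above, assuming that lg denotes the base-2 logarithm and that n ≥ n_0:

```latex
\[
  f(c' \lg\lg n) \le \lg\lg n
  \quad\Longrightarrow\quad
  2^{2^{f(c' \lg\lg n)}} \;\le\; 2^{2^{\lg\lg n}} \;=\; 2^{\lg n} \;=\; n .
\]
```

So the doubly exponential factor in the assumed running time collapses to a factor of at most n, which is what makes the overall bound polynomial in n.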
7.2. Ordered binary trees
We view ordered binary trees as $\{S_0, S_1\}$-structures T, with $S_0^T$ and $S_1^T$ being the left
child and right child relations. We allow nodes to have only one child. For a finite alphabet
Σ, we let $\tau_B(\Sigma) = \{S_0, S_1\} \cup \{P_s \mid s \in \Sigma\}$, where $P_s$, for s ∈ Σ, is a unary relation
symbol. An ordered binary tree over Σ is a $\tau_B(\Sigma)$-structure whose $\{S_0, S_1\}$-reduct is an ordered
binary tree in which each vertex is contained in precisely one $P_s^T$, for s ∈ Σ. We denote
the class of all ordered binary trees over some finite alphabet by B. For a node a of a tree
T ∈ B and d ≥ 1, the depth d subtree below a is the subtree of T whose nodes are all
descendants of a of distance at most d from a.
To proceed as in the word cases, we will encode natural numbers by trees and provide
short formulas allowing to compare large encoded numbers. For N, let T be the
ordered binary tree with vertex set {0, . . . , } and root 0 in which the children of i are 2i +1
and 2i + 2. Recall that L(n) denotes the length of the binary encoding of n N. We let
(n) be the ordered binary tree over {0, 1} whose underlying tree is T L(n) and in which, for
i = 0, 1,
Pi
T (n)
= { j L(n) | bit( j, n) = i }.
Example 24. Fig. 5 shows the encoding of 38, the binary representation of which is
100110.
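The encoding can be sketched in a few lines of Python. This is a minimal illustration, not the authors' construction verbatim: the function name `binary_tree_encoding` is hypothetical, the tree is assumed to have one vertex per bit position 0, ..., L(n) − 1, and bit(j, n) is assumed to be the coefficient of 2^j in n (the convention used in Example 28).

```python
def binary_tree_encoding(n):
    """Heap-shaped ordered binary tree whose vertex j carries the label bit(j, n).

    Assumptions (not fixed by the text): vertices are 0..L(n)-1, L(0) is taken
    to be 1, and bit j is the coefficient of 2**j in n.
    """
    length = max(n.bit_length(), 1)  # L(n), the length of the binary encoding
    tree = {}
    for j in range(length):
        left, right = 2 * j + 1, 2 * j + 2  # children of vertex j
        tree[j] = {
            "label": (n >> j) & 1,                     # which of P_0, P_1 holds at j
            "S0": left if left < length else None,     # left child, if present
            "S1": right if right < length else None,   # right child, if present
        }
    return tree


# Example 24: 38 is 100110 in binary, so bits 1, 2 and 5 are set.
encoding_of_38 = binary_tree_encoding(38)
labels_of_38 = [encoding_of_38[j]["label"] for j in range(6)]
```

Under these assumptions vertex 2 has a left child (vertex 5) but no right child, which matches the remark that nodes may have only one child. The depth of the tree grows only logarithmically in L(n).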
The next lemma corresponds to Lemmas 8 and 20.
m = n.
and for l 1
(S1 x1 x2 S1 y1 y2 )
(x1 = x2 y1 = y2 ),
l (x1 , x2 , y1 , y2 ) = x3 y3 xx
yy
((x1 = x x3 = x
y1 = y y3 = y
)
(x3 = x x2 = x
y3 = y y2 = y
)
l1 (x, x , y, y )).
Now we proceed as before and encode formulas of 3-CNF(n) for some n as an ordered
binary tree over some alphabet . For 3-CNF let ( ) be the binary tree T constructed
as follows: let W be the word without order 1 ( ), and consider W as a tree of S1-successors without any S0-successors. To get T we substitute each subword U of W of
the form 1 (m) by a single vertex v such that v's S0-successor is the root of a copy of
(m), while its S1 -successor is the first position after U in W. v itself carries the new
symbol var.
We extend the definition of to pairs ( , ) 3-CNF(n) A(n) and tuples
( , V1 , . . . , Vt ) by applying the same substitution process. This encoding gives us the
following lemma, whose proof is omitted since it resembles the proof of Lemma 10 using
the newly introduced encoding together with the decoding formulas (x, y).
Lemma 26. For all N there is a first-order sentence of size O(l) such that for all
|= . Furthermore,
Now we are ready to state the second main result of this section, which is Theorem 2(2).
We omit the proof, which is analogous to the proof of Theorem 23.
Theorem 27. Assume that FPT ≠ AW[∗], and let p be a polynomial. Then there is no
algorithm for MC(FO, B) whose running time is in
$$2^{2^{2^{o(k)}}} \cdot p(n),$$
where k denotes the size of the input sentence and n the size of the input tree.
8. Lower bounds for first-order model-checking on trees
In this last section we prove a non-elementary lower bound for first-order model-checking over unranked trees. We need the same ingredients as before: suitable encodings
of natural numbers and small formulas for comparing two numbers.
For simplicity, we work with directed labelled trees . In Remark 33 we describe how to
get rid of labels and directed edges in order to transfer the results to plain undirected trees.
But for now we view a tree as an {E}-structure T with $E^T$ being the child relation. For
a finite alphabet Σ we let $\tau_T(\Sigma) = \{E\} \cup \{P_s \mid s \in \Sigma\}$. Then a tree over Σ is a $\tau_T(\Sigma)$-structure T whose {E}-reduct is a tree and in which each vertex is contained in precisely
one $P_s^T$, for s ∈ Σ. We denote the class of all trees over some alphabet by T.
Recall that T(h, 2) denotes a tower of 2s of height h + 1 and that bit(i, n) denotes the
i th bit in the binary representation of n. For every h 0 and n {0, . . . , T(h, 2) 1} we
define h (n) to be the following tree over {0, 1, *}:
(1) If h = 0, we let 0 (0) be a single node labelled by 0. Likewise, let 0 (1) be a single
node labelled by 1.
(2) If h 1, we let h (n) be the tree formed by taking a new root, labelling it by *, and
attaching to it the tree h1 (i ) for each i such that bit(i, n) = 1.
Example 28. Fig. 6 shows the 3 -encoding of 40 961 = $2^{15} + 2^{13} + 2^{0}$. The tree is
constructed as follows:
To construct 3 (40 961), by clause (2), we take a new root labelled by * and attach three
trees to this root: 2 (0), 2 (13), 2 (15).
The binary representation of 0 consists of 0s only. Thus to construct 2 (0), we take a
new root labelled by * and attach no children. This explains the leftmost leaf labelled *.
We have 13 = $2^0 + 2^2 + 2^3$. Thus to construct 2 (13), we take a new root labelled by *
and attach three children labelled 1 (0), 1 (2), and 1 (3).
1 (0) is again a tree consisting of just one node labelled *. This explains the second leaf
labelled *.
We have 2 = $2^1$. Thus to construct 1 (2), we take a new root labelled by * and attach
one child labelled by 0 (1).
0 (1) is the 1-node tree labelled 1.
The remaining subtrees are constructed similarly.
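The recursive definition, and the walk through Example 28, can be checked with a short Python sketch. The names `tower`, `encode`, `decode` and `size` are hypothetical helpers introduced here for illustration; trees are represented as (label, children) pairs, and T(h, 2) is computed from the convention T(0, 2) = 2, T(h, 2) = 2^{T(h−1, 2)}, which agrees with the ranges used in clauses (1) and (2).

```python
def tower(h, top=2):
    """T(h, top): T(0, top) = top and T(h, top) = 2 ** T(h - 1, top)."""
    value = top
    for _ in range(h):
        value = 2 ** value
    return value


def encode(h, n):
    """Height-h tree encoding of n, returned as a (label, children) pair."""
    if not 0 <= n < tower(h):
        raise ValueError("n must lie in {0, ..., T(h, 2) - 1}")
    if h == 0:
        return (str(n), [])  # clause (1): a single node labelled 0 or 1
    # clause (2): a new root labelled '*' with one subtree per set bit of n
    children = [encode(h - 1, i) for i in range(n.bit_length()) if (n >> i) & 1]
    return ("*", children)


def decode(tree, h):
    """Recover n from its height-h encoding (inverse of encode)."""
    label, children = tree
    if h == 0:
        return int(label)
    return sum(2 ** decode(child, h - 1) for child in children)


def size(tree):
    """Number of nodes of an encoded tree."""
    return 1 + sum(size(child) for child in tree[1])


example_28 = encode(3, 40961)  # 40961 = 2**15 + 2**13 + 2**0
```

Here the root of `example_28` has three children, the encodings of 0, 13 and 15 at height 2, and the first of them is a single leaf labelled *, as described above. That `decode` inverts `encode` reflects the fact that distinct numbers receive non-isomorphic trees, which is what an equality-testing formula has to exploit.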
Lemma 29. There is an algorithm that, given h and n {0, . . . , T(h, 2)}, computes h (n)
in time O(h lg2 n). Furthermore, |h (n)| O(h lg2 n).
Proof. A simple recursive procedure will do. The running time analysis uses the same
ideas as the proofs of Lemmas 6 and 7.
The next lemma corresponds to Lemmas 8 and 20.
Lemma 30. Let h 1. There is a first-order formula h (x, y) of size O(h) such that for
all trees T over , a , b T, and m, n {0, . . . , T(h, 2) 1} the following holds:
If the subtrees of T rooted at a , b are isomorphic to h (m) and h (n), respectively, then
T |= h (a , b) if, and only if, m = n.
Lemma 31. For all h N there is a first-order sentence h of size O(h) such that for all
n < T(h, 2) and ( , ) 3-CNF A(n) we have h ( , ) |= h |= . Furthermore,
h can be computed in time O(h).
We omit the simple proof.
Theorem 32. Assume that FPT ≠ AW[∗]. Let h ∈ N and p a polynomial. Then there is
no algorithm for MC(FO, T) whose running time is bounded by
$$T(h, k) \cdot p(n),$$
where k denotes the size of the input sentence and n the size of the input tree.
The proof is analogous to our earlier lower bound proofs.
Remark 33. Even though we only stated the lower bound result for labelled binary trees,
it also holds for unlabelled undirected trees, that is, connected acyclic undirected graphs.
To see this, we first note that the alphabet and thus the vocabulary of the formula h of
Lemma 31 does not depend on h. Suppose the vocabulary of h is {E, P1 , . . . , P p }. To get
rid of the directed edges, we replace each directed edge from a vertex v to a vertex w by
the following subgraph:
To get rid of the unary relations, we attach (i + 2) new children to each node in Pi and
delete Pi .
9. Conclusions
M. Frick, M. Grohe / Annals of Pure and Applied Logic 130 (2004) 331
31
[9] J. Flum, M. Grohe, Model-checking problems as a basis for parameterized intractability, Technical Report
23/2003, Fakultat fur Mathematik und Physik, Albert-Ludwigs-Universitat Freiburg, 2003.
[10] M. Frick, M. Grohe, Deciding first-order properties of locally tree-decomposable structures, Journal of the
ACM 48 (2001) 11841206.
[11] W. Hanf, Model-theoretic methods in the study of elementary logic, in: J. Addison, L. Henkin, A. Tarski
(Eds.), The Theory of Models, North Holland, 1965, pp. 132145.
[12] H. Kamp, Tense Logic and the theory of linear order, Ph.D. Thesis, University of California, Los Angeles,
1968.
[13] L. Libkin, Logics with counting and local properties, ACM Transactions on Computational Logic 1 (2000)
3359.
[14] O. Lichtenstein, A. Pnueli, Finite state concurrent programs satisfy their linear specification, in: Proceedings
of the Twelfth ACM Symposium on the Principles of Programming Languages, 1985, pp. 97107.
[15] K. Reinhardt, The complexity of translating logic to finite automata, in: E. Gradel, W. Thomas, T. Wilke
(Eds.), Automata, Logics, and Infinite Games, Lecture Notes in Computer Science, vol. 2500, SpringerVerlag, 2002, pp. 235242 (Chapter 13).
[16] D. Seese, Linear time computable problems and first-order descriptions, Mathematical Structures in
Computer Science 6 (1996) 505526.
[17] L.J. Stockmeyer, The Complexity of Decision Problems in Automata Theory, Ph.D. Thesis, Department of
Electrical Engineering, MIT, 1974.
[18] L.J. Stockmeyer, A.R. Meyer, Word problems requiring exponential time, in: Proceedings of the 5th ACM
Symposium on Theory of Computing, 1973, pp. 19.
[19] P. van Emde Boas, Machine models and simulations, in: J. van Leeuwen (Ed.), Handbook of Theoretical
Computer Science, vol. 1, Elsevier Science Publishers, 1990, pp. 166.
[20] M.Y. Vardi, The complexity of relational query languages, in: Proceedings of the 14th ACM Symposium on
Theory of Computing, 1982, pp. 137146.
References
[1] A.V. Aho, J.E. Hopcroft, J.D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley,
1974.
[2] J.R. Buchi, Weak second-order arithmetic and finite automata, Zeitschrift fur Mathematische Logik und
Grundlagen der Mathematik 6 (1960) 6692.
[3] B. Courcelle, Graph rewriting: an algebraic and logic approach, in: J. van Leeuwen (Ed.), Handbook of
Theoretical Computer Science, vol. B, Elsevier Science Publishers, 1990, pp. 194242.
[4] N.J. Cutland, Computability, Cambridge University Press, 1980.
[5] R.G. Downey, M.R. Fellows, Parameterized Complexity, Springer-Verlag, 1999.
[6] R.G. Downey, M.R. Fellows, K. Regan, Parameterized circuit complexity and the W-hierarchy, Theoretical
Computer Science 191 (1998) 97115.
[7] H.-D. Ebbinghaus, J. Flum, W. Thomas, Mathematical Logic, 2nd edition, Springer-Verlag, 1994.
[8] J. Flum, M. Grohe, Describing parameterized complexity classes, in: H. Alt, A. Ferreira (Eds.), Proceedings
of the 19th Annual Symposium on Theoretical Aspects of Computer Science, Lecture Notes in Computer
Science, vol. 2285, Springer-Verlag, 2002, pp. 359371.
63
64
A lemma for cost attained
Greg Hjorth
Annals of Pure and Applied Logic
143 (2006), Pages 87-102
Abstract
A treeable ergodic equivalence relation of integer cost is generated by a free action of the free group on the corresponding
number of generators. Every countable treeable ergodic equivalence relation is induced by the free action of some countable group.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Ergodic theory; Treeable equivalence relations; Free groups; Measure equivalence of groups
1. Introduction
Given an equivalence relation E one can consider the graphings of E, consisting of partial functions included in
the graph of E whose various compositions enable us to trace out a path between any two equivalent points. Levitt in
[5] defines the cost of a measure preserving equivalence relation to be the infimum among graphings of the sums of
the measures of the domains of the relevant partial functions.
Here we present a result, which in an unpublished form has been previously cited by [4] to obtain a kind of
dichotomy theorem for amenability and [6] in an application to von Neumann algebras. The authors of [4] wrote up a
proof of 1.1, though their organization is very different to the one below.
Proposition 1.1. Let E be an ergodic measure preserving equivalence relation on a standard Borel probability space
(X, ); assume that every equivalence class is countable. Let be a graphing for E with C () n.
Then there is an alternate graphing for E which has no greater cost and contains n morphisms which are total
that is to say:
(a) C ( ) C ();
(b) and there are distinct bijections 1 , . . . , n in with i : X X.
One obtains additionally that if C () n + 1 then we may further conclude = {1 , 2 , . . . , n , }, where
: A B is a bijection, some A, B X.
Recall that a measurable equivalence relation is treeable if there is a measurable way of assigning the structure of
a tree to each equivalence class. In the next corollary one should bear in mind that Damien Gaboriau has shown in [3]
that an E as above with finite cost is treeable if and only if it admits a graphing which actually attains its cost, and in
this case any treeing will in fact realize the infimum.
Corollary 1.2. For E treeable and as above, the cost of E equals n if and only if there is a measure preserving action
of the free group on n generators, $F_n$, which is free μ-a.e. and has E as its orbit equivalence relation.
We also show that in the case that E is treeable with infinite cost one may find a free action of $F_\infty$ giving rise to E.
Appealing to the connections made in [2] between orbit equivalence in the ergodic setting and measure equivalence,
this implies that a countable non-amenable group is measure equivalent to a non-abelian free group if and only if it has
a free measure preserving action on some standard Borel probability space which gives rise to a treeable equivalence
relation.
This paper finishes with a comment on a deep theorem of Alex Furman's.
Corollary 1.3. If E is a treeable ergodic measure preserving equivalence relation on a standard Borel probability
space with countable classes, then there is a countable group G acting μ-a.e. freely giving rise to E as its orbit
equivalence relation.
Moreover, the group G can be chosen solely as a function of the cost of E.
3. Proof
We set about proving 1.1 for n = 2. It should be more or less clear how to extend it to larger n. We organize this
into a series of small technical lemmas, omitting proofs when they resemble earlier arguments.
The first of these lemmas, at 3.1, states that we may find a new morphism 0 included in E and a corresponding
partition of the space up into an infinite array of measurable sets,
B1,0 , B1,1,
B2,0 , B2,1 , B2,2,
...
Furman in [2] had previously obtained ergodic E which are not induced by an a.e. free action of a countable group.
His examples arose by the restriction of a non-treeable equivalence relation to a non-null set, and were therefore
known to be non-treeable.
2. Notation and definitions
We take all the usual notational shortcuts. All sets considered are measurable. All functions are measurable.
All group actions are by measure preserving transformations. We identify functions agreeing a.e. Unless otherwise
warned, the reader should assume that all non-empty sets are non-null. We tend to say everywhere when we only mean
almost everywhere. N begins with the number 0.
Definition. A standard Borel space is a set X equipped with a σ-algebra B, such that B is the σ-algebra generated by
some choice of a Polish topology on X. A standard Borel probability space is a standard Borel space equipped with
a probability measure on its Borel sets.
In general we will only be considering uncountable standard Borel spaces, and these are all isomorphic to the unit
interval in its usual Borel structure. Thus one might reasonably think of a standard Borel probability space as just
being some choice of a Borel probability measure on [0,1].
Definition. If E is an equivalence relation on a standard Borel probability space (X, μ), and A, B ⊆ X measurable,
then we say that a function
$$f : A \to B$$
is a morphism (for E) if it is a bijection and
$$x \mathrel{E} f(x)$$
for all x ∈ A. We say that E is measure preserving if every morphism is a measure preserving function. We say that E is
ergodic if every E-invariant set is either null or conull.
From [1], the measure preserving equivalence relations with countable classes are exactly those induced by a
countable group of measure preserving transformations. Even in the case that E is ergodic, [2] has shown that in
general we may not be able to choose this group so that it acts freely on the space. Below we will prove that the
additional assumption of treeability does ensure that we can choose the countable group so that it acts freely.
Definition. Given a set Φ of morphisms, a word built from Φ is a morphism of the form
$$x \mapsto \varphi_1^{\epsilon(1)} \varphi_2^{\epsilon(2)} \cdots \varphi_n^{\epsilon(n)}(x),$$
where each $\varphi_i \in \Phi$, each $\epsilon(i) \in \{-1, 1\}$, and x ranges over a set on which these compositions make sense. The word
is reduced if at no i do we have $\epsilon(i) = -\epsilon(i+1)$ along with $\varphi_i = \varphi_{i+1}$. A set Φ of morphisms is said to be a
graphing of E if for any x E y there is a word mapping x to y; the graphing is said to be a treeing if there is always a
unique reduced word. Equivalently, Φ is a treeing if the adjacency relation x R y, holding if there exists φ ∈ Φ with $\varphi^{\pm 1}(x) = y$,
provides a tree structure on each equivalence class.
For a collection Φ of morphisms we let $\Phi^{-1} = \{\varphi^{-1} : \varphi \in \Phi\}$.
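The notions of graphing, treeing and cost can be illustrated on a finite toy model. The sketch below is only an analogy under loudly stated assumptions: the space is a four-point set with the uniform probability measure instead of a standard Borel probability space, morphisms are Python dictionaries (partial bijections), and the helper names `cost`, `classes` and `is_treeing` are hypothetical. The cost of a graphing is taken, as in the introduction, to be the sum of the measures of the domains of its morphisms.

```python
from fractions import Fraction


def cost(graphing, measure):
    """Sum over all morphisms of the measure of the morphism's domain."""
    return sum(sum(measure[x] for x in phi) for phi in graphing)


def classes(points, graphing):
    """Equivalence classes generated by the partial bijections (union-find)."""
    parent = {x: x for x in points}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for phi in graphing:
        for x, y in phi.items():
            parent[find(x)] = find(y)
    groups = {}
    for x in points:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


def is_treeing(points, graphing):
    """The graph with an edge {x, phi(x)} per pair is a forest exactly when
    the number of edges equals the number of points minus the number of classes."""
    edges = sum(len(phi) for phi in graphing)
    return edges == len(points) - len(classes(points, graphing))


points = [0, 1, 2, 3]
measure = {x: Fraction(1, 4) for x in points}
phi = {0: 1, 1: 2, 2: 3}   # partial bijection with domain {0, 1, 2}
psi = {3: 0}               # closes the path 0-1-2-3 into a cycle
```

In this toy model `[phi]` graphs the single class {0, 1, 2, 3} as a tree with cost 3/4, while `[phi, psi]` graphs the same class with cost 1 and is no longer a treeing, since the point 0 can be reached from 3 by two different reduced words.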
some C X, (the new graph consists just of the new morphism and restrictions of the old morphisms);
(5) for each there is a partition (Ci )iN of X such that
(i) |C0 0 ;
(ii) and at i > 0, |Ci = (0 )i |Ci , some i Z (we can recover the missing pieces of the old morphisms as
powers of the new morphism).
Proof. We assume that for distinct , we always have (x)
= (x) when both are defined. We may also
Dom()
empty.
assume there is some with Ran()
We build transfinite sequences of graphings
( )< , ( )< ,
along with a choice of morphisms , by induction on so that:
(a) 0 is empty; 0 = ;
(b) 1 consists in a single morphism 0 with Ran(0 )Dom(0 ) = , 0 , (Dom(0 ))
= 0; we set 0 = 0 , and
for all and we have Ran( ) Dom( ) empty;
;
(c) if + 1 < , > 0, and +1 , then Dom( ) Ran( ) some
Fig. 1. We need to add again, and we find some 1 1 such that some restriction 1 | A or its inverse 1 |1
A has image disjoint to both the domain
1
and image of 0 . Then, as indicated later, we can find 1 either of the form 1 |
A or 1 | A 1 whose domain is disjoint from the domain of 0
and whose range is disjoint to both its domain and range. We add 1 to 1 to obtain 2 and subtract off 1 | A .
Fig. 2. We insist that 0 , 1 have disjoint domains and that the range of 1 is entirely new. After this we keep going, finding some 2 whose domain
is disjoint from the domains of 0 and 1 and is, relative to these two morphisms, interdefinable with some element of 2 . We do insist that it picks
up some more of the space in its image, though we do not object if its domain is included in the range of one of the earlier functions.
(g) is the only morphism not appearing in +1 and for this there is a partition of Dom( ) into A0 , A1
such that
(A1 )
= 0;
| A1 +1 ;
Dom( ) = 1 . . . k [ A0 ]
or
( | A0 )1 = 1 2 . . . 1 . . . k | A0
(h) if is a limit then for < and , we have if is in every earlier , or otherwise we have
| A, , where
A, = {A1 : , }.
In rough terms, we begin the construction by taking some 0 in our original graphing which can be assumed to
have disjoint range and image and simply adding it to 1 and subtracting it from 0 to obtain 1 . We just describe
the first few steps, without giving much in the way of proofs yet.
Thus the general idea of this construction is to steadily transfer across pieces of the s to the s, so that
continues to graph E. The crucial part of this is (g). It tells us that when we remove a single piece | A0 of
a morphism then we are compensating by placing into +1 a morphism, , which can reconstruct | A0
using only pre-existing morphisms already placed into . As we continue through the construction, and survey the
construction at ever larger ordinals , the sets only get bigger, and our ability to write | A0 as a word from is
never endangered.
The other parts of this construction are less vital. (a) and (d) are bookkeeping requirements, describing how we add
to the s and remove from the s. (f) actually follows from the other clauses. (e) ensures that partial morphisms in
Fig. 3. The domain of 2 is included in the image of the earlier functions, but its image is new. After this we keep going to add 3 , spreading out to
new domains and so on.
the s will eventually have some morphism as their union, which in turn will give us 0 described in the statement of
3.1. (b) and (c) will enable us to obtain the (Bn,k ) sets with the structure indicated above. (h) and (d) state that at limit
ordinals we take an appropriate limit of the process so far, with a union on the side and a kind of intersection
along the s.
We continue with this construction for as long as possible, eventually arriving at some ( )< , ( )< admitting
no further extension. We will argue that this final ordinal is a successor ordinal, = + 1, and that in some natural
way will yield 0 and 0 as required.
Claim (1). is not a limit ordinal.
(Claim)
/ Dom( ) all ; or
(1) x
(2) there exists k and 0 , 1 , . . . , k such that k k1 . . . 0 (x) is well defined and not a member of Dom( )
any .
3
2
93
nN Bn ; and thus
Proof of Claim. Otherwise suppose not; we define Bn, to be the set of x Bn such that for every k there exists
0 , 1 , . . . k with k k1 . . . 0 (x) well defined, and observe that this set will have positive measure. It then
follows by clause (c) of our construction that we may at each m define a morphism from Bm, to Bm+1, and thus
for m m we have (Bm, ) (Bm , ), and thus (Bm, )mn provides a sequence of disjoint sets with measure
bounded away from zero, and a contradiction to (X) = 1. (Claim)
Fig. 4. The next morphism 3 takes its domain from the image of 0 . We do not rule out returning to the images of much earlier morphisms.
nN
A Dom() Bm
and either
(1) Dom( ) A = all ; or
(2) there exists k and 0 , 1 , . . . , k such that k k1 . . . 0 (x) is well defined all x A and
k k1 . . . 0 [ A] and Dom( ) are disjoint for any .
We assume that (2) holds and that ; the other cases are exactly similar.
(x) = 01 11 . . . k1 (x).
We let A0 = A, A1 =Dom() \ A0 , +1 = { }, +1 = ( \ {}) {| A1 }. In this way we are able to
contest another round, contradicting the assumption that the construction ground to a halt at . (Claim)
Fig. 5. In this way we eventually ensure that all the space except for Dom(0 ) is in the range of some .
Proof of Claim. Otherwise we could simply let = < , and let consist of all | A, where some
< and as in (h) above A, = {A1 : , < , }. (Claim)
So from now on let us fix with + 1 = . For each let B0 = Dom(0 ), and for each n N let
Bn+1 = { [Bn ]| }.
(Claim)
We can then finish up the proof of the lemma by letting Bn,k be the set of x Bk such that there are
1 , 2 , . . . , nk with 1 . . . nk (x) well defined and n is the largest such integer. (In other words, kn Bn,k
is the set of elements whose orbit under has size exactly n + 1.) We use the disjointness of the morphisms in
to define the longed for 0 : for x Bn,k we consider the unique with x Dom( ) and let 0 (x) = (x).
We now let B = nN Bn,0 . We are going to repeat the previous step, relativizing the process to B.
Lemma 3.2. There is a graphing of E, containing 0 along with a new morphism , with (Cn,k )nN,kn a
partition of B, such that:
(1) C ( ) C ( 0 );
(2) [Cn,k ] = C
n,k+1 all k < n;
(3) Dom( ) = k<n,nN Cn,k ;
| B
(B) ,
Cm,k
mN,k<m
This morphism 1 will extend the old 0 , and so we simply set 1 (x) = 0 (x) for x
and 0n (x) mN,k<m Cm,k we let
nN,k<n
1 (x) = 0n (x).
From this we obtain a new graphing 1 of E with
1 = ( \ { 0 , }) {1 }.
Bn
nN Bn
Fig. 6. A typical case: the image of is what we want, but the domain intersects the domain of morphisms already in .
C ( 1 ) C ( 0 );
1 is still a graphing of E;
1 extends 0 and (X \ Dom(1 )) 12 (X \ Dom(0 ));
for any 0 we can partition Dom() into (Di )iN such that
(a) | D0 1 ;
(b) for each i > 0 there is i Z such that | Di = ( 1 )i | Di .
Plainly we can continue this indefinitely, obtaining a sequence ( n , n )nN where at each n
C ( n ) C ();
n is a graphing of E;
n extends n1 and (Dom(n )) 1 2n ;
for any n1 we can partition Dom() into (Di )iN such that
(a) | D0 n ;
(b) for each i > 0 there is i Z such that | Di = ( n ) | Di .
In the end we let
=
n
nN
6
| A, ,
k
where A, equals
{D : n n ( , D = Dom( ))}.
By considering the measure of the domain we actually have
- ?
B
We may then consider the graphing
which for each 0 ,
= 0 , has the appropriate morphism
( 0 )k ( 0 ) for E| B .
: X X
(almost everywhere defined). By the nature of the definition of and the assumptions on the various n we have
that for any n we can partition Dom() into (Di )iN such that
(a) | D0 ;
(b) for each i > 0 there is i Z such that | Di = ( 1 )i | Di .
This gives us a new graphing of E containing a morphism : X X with C ( ) C (). We are now
going to take that whole step over again, adding in a new morphism but keeping hold of and not allowing it to be
changed. This requires relativizing 3.1 to .
0 of E which still contains , and a morphism
Lemma 3.3. Let E, , , be as above. Then there is a graphing
0 , and (Dn,k )nN,kn a partition of X , such that:
0
0 ) C ( );
(1) C (
(2) 0 [Dn,k ] = Dn,k+1 all k < n;
(3) Dom(0) = k<n,nN Dn,k ;
0 with
= , 0 we have
(4) for each
= |C ,
some C X , ;
(5) for each there is a partition (Ci )iN of X such that
0;
(i) |C0
(ii) and at i > 0, |Ci equals some word in 0 , restricted to Ci .
Proof. This closely parallels the proof of 3.1. There is a difference in how we show we can continue at inductive
steps.
We build graphings
Again this construction stops at some successor ordinal = + 1, and as before at each and n N we can
let
D0 = Dom(),
nN Dn
= X, and we try to show that after all we could have continued to define +1 , +1 , , .
Definition. Let F be the equivalence relation on X induced by the graphing { }.
Case 1. F is ergodic.
Then we can choose some , words , built up from { }, A0 A1 partitioning Dom( ), with
Dn \ {Dom( )| }
1 [ A0 ]
nN
[ A0 ] X \
Dn .
nN
1 .
After shrinking we may assume A0 Ran( ) some single , and then we can let = |
[A ]
0
( )< , ( )< ,
and morphisms :
(a) 0 is empty; 0 = \ { };
k [Dom()], we have
(b) 1 consists in a single morphism ,
where for some k, , and A =
A0
(c)
(d)
(e)
(f)
(g)
k
| A ;
Dn \
X\
Dom( )
Dn
| A1 +1 ;
Fig. 7.
= 1 2 . . . 1 . . . k | A0
(h) if is a limit then for < and , we either have or we have | A, , where
A, = {A1 : , }.
nN
Dn , Y2 X \ (
nN
Dn ),
nN
Dn .
nN
With this general formulation observed and set to one side, let us continue with the proof for the specific case in
front of us.
of E, containing , 0 , along with a new morphism , and there is
Lemma 3.5. There is a graphing
0
(Hn,k )nN,kn , a partition of D, such that:
1
1
(1)
(2)
(3)
(4)
Dn \
X\
Dom( )
Dn
Z1
Fig. 8.
Again after possibly refining Y1 we may assume there is some word from such that
Dn \ {Dom( )| }.
1 2 2 . . . 1 [Y1 ]
= |C ,
0;
some C X,
0 there is a partition (Di )iN of X such that
(5) for each
0;
(i) | D0
(ii) and at i > 0, |Ci equals some word built up from , 0 , restricted to Ci .
0 for , D for A, { , 0 } for {1 , . . . , n } to obtain (Hn,k )nN,kn partitioning A
Proof. We apply the lemma to
and as required.
With this claim granted, we can mimic earlier arguments and choose 0 with (Dom(0 )) = (Dom( )) +
(Dom( )), and { } graphing the same equivalence relation as { , 0 }. And then we may clearly continue with
n and n such that:
this over and over, obtaining at each n
n ) C (
0 );
C (
n is a graphing of E containing n , ;
n extends n1
and (Dom(n )) 1 2n ;
0 we can partition Dom() into (Di )iN such that
for any
n;
(a) | D0
(b) for each i > 0 | Di can be written in a word in n , .
nN
Here is probably a good point to pause and formulate the general result. The proof of this general technical lemma
clearly follows from the above argument.
Lemma 3.4. Let F be an ergodic measure preserving equivalence relation on standard Borel space (Y, ), with all
classes countable, a finite Borel measure. Let be a graphing of F containing morphisms {1 , 2 , . . . , n }.
Let A Y be a set whose saturation under {1 , 2 , . . . , n } is conull that is to say, for almost all x Y
there is some y A and word w from {1 , 2 , . . . , n } with w (y) = x. Suppose furthermore that C ( )
Ran()
disjoint subsets
C ({1 , 2 , . . . , n }) + (A) and there is some \ {1 , 2 , . . . , n } having Dom(),
of A.
) C (
0 );
C (
0
[Hn,k ] = Hn,k+1 all k < n;
Dom( ) = k<n,nN Hn,k ;
with
= , 0 , we have
for each
: A A; ;
for each \ {1 , . . . , n , } we have = |C some C Y , ;
for there is a partition (Ci )iN such that
(i) |C0 ;
(ii) at i > 0, |Ci equals some word in {1 , 2 , . . . , n , } restricted to Ci .
4. Corollaries
Lemma 4.1. A treeable ergodic measure preserving equivalence relation E with countable classes on a standard
Borel probability space has cost n if and only if it is induced by some free action of Fn .
Proof. The if direction is known from [3], so we concentrate on the converse.
We begin with a treeing of E; by [3], C () = n. Applying the argument of the last section we can find an
alternative graphing containing 1 , . . . , n , each
i : X X,
and with
We finish with {i : i N} as a graphing of E. Since each m is a treeing so too is the limit, {i : i N}. Since
each i is total and since they jointly give rise to a treeing, we thus obtain the free action of $F_\infty$.
Lemma 4.3. Let E be an ergodic measure preserving equivalence relation with countable classes on a standard Borel
probability space (X, ). If E is treeable, then there is a free action of a countable group G giving rise to E as its
orbit equivalence relation.
Proof. We at once can assume the cost is finite, or else the result follows with G = $F_\infty$ from the last lemma. By
earlier results we may assume there is a treeing and some which is total. By Dye's theorem on the orbit
equivalence of ergodic Z-actions, we can assume that there is a sequence of subsets of X, (Ai )iN , such that each
(Ai ) = 2i ,
Ai+1 Ai ,
i
2 : Ai+1 Ai ,
i
i = 2 | Ai+1 ;
note that {i : i N} graphs the same equivalence relation as {}.
We then build (ki )iN , k0 N, each ki+1 {0, 1}, and morphisms
i, j : Ai Ai
for j < ki , and treeings n for E such that:
. If no, we just pass on with kn+1 = 0; if yes, then we apply 3.6 to obtain some
whether C (E) C (n ) + 2
n+1,0 : An+1 An+1
C ( ) C ().
(a)
(b)
(c)
(d)
and graphing
n+1 = {i : i N} {i, j : i n + 1, j < ki } { | B,n+1 : },
and take the process to the next round.
This construction granted it follows from (c) that
= {i : i N} {i, j : i N, j < ki }
graphs E. Since each n is a treeing it follows that is a treeing. We will use it to define a group action in some
natural way.
We let G be the group with generators $\{a_i : i \in N\}$, $\{b_{i,j} : i \in N, j < k_i\}$. We will ask that this group be free
subject to the relations
$$a_\ell b_{i,j} = b_{i,j} a_\ell \quad \text{for } i > \ell,$$
$$(a_\ell)^2 = 1,$$
$$a_\ell a_k = a_k a_\ell$$
for all k, ℓ. For each i ∈ N we define a total function $T_i : X \to X$ by first choosing for a.e. x the unique
x
{1, 0} such that
m 0x , m 1x , . . . , m i1
mx
mx
mx
i1
i2
yx = i1
i2
. . . 0 0 (x) Ai
Ti (x) = 0
m 1x
m x
. . . i1i1 i (yx )
if yx Ai+1 ,
m 0x
Ti (x) = 0
m 1x
m x
. . . i1i1 i1 (yx )
if yx Ai \ Ai+1 . (In other words we recursively define each Ti to be the unique Ti : X X of order 2 which
extends i and commutes with T j all j < i .) Similarly we define for j < ki
m 0x
Si, j (x) = 0
m 1x
m x
. . . i1i1 i, j (yx ).
We want to show that if we let each a act on X via T and each bi, j act via Si, j then firstly it is well defined as an
action of G and secondly that it is free a.e.
Claim (1).
$$S_{i,j} T_\ell = T_\ell S_{i,j}$$
all ℓ < i,
$$T_\ell = T_\ell^{-1},$$
$$T_\ell T_k = T_k T_\ell$$
all ℓ, k.
Proof of Claim. This follows quickly from the definitions.
(Claim)
Randomness and
the linear degrees
of computability
Proof of Claim. Suppose g is as above and for a non-null set of x we have g (x) = x. We attempt to reduce g down
to 1 using the relations imposed as part of the definition of G.
We may write the group element in the form
g = c0 c1 . . . c M
1
where each ck equals either ai(k) , bi(k), j (k) , or bi(k),
j (k) . After possibly replacing each ck by a suitable
m 0 m 1
a1
a0
i(k)1
i(k)1
. . . ai(k)1
ck ai(k)1
. . . a0 0
we may assume that there is a positive measure set A X such that for all x A
g (x) = x
and at each k M we have
Uk+1 Uk+2 . . . U M (x) Ak(i) ,
1
where each U is respectively Ti() , Si(), j (), Si(),
j () , depending on whether c equals either ai() , bi(), j (), or
m
i(k)1
i(k)1
1
0 m 1
bi(),
a1 . . . ai(k)1
ck ai(k)1
. . . a0 0 as a consequence
j (); the key point is that in G the element ck equals a0
of the relations imposed in the definition of the group G. It then follows from being a treeing that we may reduce
the word
U0 U1 . . . U M | A
down to the identity by the operations of canceling various U U+1 when U+1 = U1 . Thus in particular it follows
that c0 c1 . . . c M will easily reduce to the identity in G and we are done. (Claim)
Acknowledgement
The author was partially supported by NSF grants DMS 99-70403, DMS 01-40503.
References
[1] J. Feldman, C.C. Moore, Ergodic equivalence relations and von Neumann algebras, Transactions of the American Mathematical Society 234 (1977) 289–324.
[2] A. Furman, Orbit equivalence rigidity, Annals of Mathematics 150 (1999) 1083–1108.
[3] D. Gaboriau, Coût des relations d'équivalence et des groupes, Inventiones Mathematicae 139 (2000) 41–98.
[4] A.S. Kechris, B. Miller, Topics in Orbit Equivalence, in: Lecture Notes in Mathematics, vol. 1852, Springer, Berlin, 2004.
[5] G. Levitt, On the cost of generating an equivalence relation, Ergodic Theory and Dynamical Systems 15 (1995) 1173–1181.
[6] S. Popa, A class of cross-product factors by free groups, having trivial fundamental group, preprint UCLA 2001.
A.E.M. Lewis, G. Barmpalias / Annals of Pure and Applied Logic 145 (2007) 252–257
values of $a \neq 1$ are of little relevance in the study of computability theory. From a computational point of view, then,
linear reducibility can be seen as formalizing the notion of length-efficient oracle computation.
Annals of Pure and Applied Logic 145 (2007) 252–257
www.elsevier.com/locate/apal
Definition 1.1. We say $\alpha$ is linear reducible to $\beta$ ($\alpha \le_\ell \beta$) if there is a Turing functional $\Gamma$ and a constant $c$ such that
$\Gamma^{\beta} = \alpha$ and the use of this computation on any argument $n$ is bounded by $n + c$. The Turing functionals which have
their use restricted in such a way are called $\ell$-functionals.
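The use bound in Definition 1.1 can be illustrated with a toy computation. The sketch below is not taken from the paper: the function name `shift_functional` and the choice of a "shift by c" functional are assumptions made purely for illustration, and finite Python strings of 0s and 1s stand in for initial segments of the oracle.

```python
def shift_functional(oracle_prefix: str, n: int, c: int = 1) -> str:
    """Return the first n output bits, reading at most n + c oracle bits.

    Hypothetical example functional (an assumption, not from the paper):
    output bit k is oracle bit k + c.
    """
    use = n + c  # the use bound of Definition 1.1
    if len(oracle_prefix) < use:
        raise ValueError("need at least n + c oracle bits")
    visible = oracle_prefix[:use]  # nothing beyond position n + c is consulted
    return visible[c:]  # n bits, each determined by the visible prefix only


# Two oracles that agree on their first n + c = 4 bits give the same 3 output bits.
print(shift_functional("0110101", 3, 1))
print(shift_functional("0110000", 3, 1))
```

Any change to the oracle beyond position n + c leaves the first n output bits untouched, which is the sense in which the computation is "length efficient".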
The linear reducibility (in particular the case where $c = 0$) was used in the recent work of Soare, Nabutovsky and
Weinberger on applications of computability theory to differential geometry (see Soare [10]). If we consider partial
computable functionals as operators from $2^{<\omega}$ to itself, the $\ell$-functionals are also closely related to the notion of
Lipschitz continuous operators.
Definition 1.2. A partial operator $\Gamma$ from a (pseudo-)metric space $(X, d)$ to itself is Lipschitz continuous if there is
a constant $C$ such that
$$d(\Gamma(x), \Gamma(y)) \le C \cdot d(x, y). \tag{1}$$
Received 12 March 2006; received in revised form 14 August 2006; accepted 21 August 2006
Available online 18 October 2006
Communicated by R.I. Soare
Abstract
We show that there exists a real $\alpha$ such that, for all reals $\beta$, if $\alpha$ is linear reducible to $\beta$ ($\alpha \le_\ell \beta$, previously denoted as $\alpha \le_{sw} \beta$)
then $\beta \le_T \alpha$. In fact, every random real satisfies this quasi-maximality property. As a corollary we may conclude that there exists
no $\ell$-complete $\Delta_2$ real. Upon realizing that quasi-maximality does not characterize the random reals (there exist reals which
are not random but which are of quasi-maximal $\ell$-degree) it is then natural to ask whether maximality could provide such a
characterization. Such hopes, however, are in vain since no real is of maximal $\ell$-degree.
© 2006 Elsevier B.V. All rights reserved.
Proposition 1.1. An $\ell$-functional is a partial computable and Lipschitz continuous operator from $(2^{<\omega}, d)$ to itself.
Conversely, every partial computable and Lipschitz continuous operator $\Gamma : (2^{<\omega}, d) \to (2^{<\omega}, d)$ equals an $\ell$-functional on infinite strings.
Proof. If $\Gamma$ is an $\ell$-functional, it is obviously partial computable but also Lipschitz continuous as a function on $2^{<\omega}$.
Indeed, suppose we are given two finite binary strings $\sigma, \sigma'$ such that $d(\Gamma^{\sigma}, \Gamma^{\sigma'}) = 2^{-t}$. If the use of $\Gamma$ on $n$ is $n + c$
for some fixed constant $c$, $d(\sigma, \sigma')$ must be at least $2^{-(t+c)}$. Hence
$$d(\Gamma^{\sigma}, \Gamma^{\sigma'}) \le d(\sigma, \sigma') \cdot 2^{c} \tag{2}$$
1. Introduction
In the process of computing a real $\alpha$ given an oracle for $\beta$ it is natural to consider the condition that for the
computation of the first $n$ bits of $\alpha$ we are only allowed to use the information in the first $n$ bits of $\beta$. It is not
difficult to see that this notion of oracle computation is complexity sensitive in many ways. We can then generalize
this definition in a straightforward way by allowing that, in the computation of $\alpha \upharpoonright n$, access is permitted to $\beta \upharpoonright (n + c)$
for some fixed constant $c$.
The study of oracle computations of this kind and of the reducibility they induce on $2^{\omega}$ was initiated by Downey,
Hirschfeldt and LaForte [6,5], the motivation being that they might serve as a measure of relative randomness.
They presented the induced reducibility as a restriction of the weak truth table reducibility and gave it the (perhaps
unfortunate!) name 'strong weak truth table reducibility', or sw reducibility for short. After discussions with other
researchers in the area we introduce here the terminology 'linear reducible' in place of 'strong weak truth table
reducible'. While another reasonable contender for this title would certainly be the set of reductions in which the
use on argument $n$ is bounded by $an + c$ for some constants $a$ and $c$, it would seem that reductions of this type for
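and $\Gamma$ is Lipschitz continuous. On the other hand, if $\Gamma$ is partial computable and Lipschitz continuous (say with
constant $2^{c}$) we show that we can construct an $\ell$-functional which is equal to $\Gamma$ on infinite strings. To compute a total
$\Gamma^{\beta}$ on $n$ knowing the first $n + c$ bits of $\beta$ we effectively find an extension $\tau$ of $\beta \upharpoonright (n + c)$ such that $\Gamma^{\tau}(n)\downarrow$. Since
(2) holds, the distance between $\Gamma^{\tau} \upharpoonright (n + 1)$ and $\Gamma^{\beta} \upharpoonright (n + 1)$ will be less than $2^{-n}$. So $\Gamma^{\tau}(n) = \Gamma^{\beta}(n)$.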
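The Lipschitz bound with constant $2^{c}$ can be checked exhaustively on short strings. The sketch below rests on two assumptions that are not stated in the surviving text: the distance between two strings is taken to be $2^{-k}$ where $k$ is the first position at which they differ (and 0 if they never differ on their common part), and the functional is a hypothetical "drop the first c bits" map with use $n + c$.

```python
from fractions import Fraction
from itertools import product


def d(s: str, t: str) -> Fraction:
    """Assumed (pseudo-)metric: 2^-k for the first differing position k, else 0."""
    for k, (a, b) in enumerate(zip(s, t)):
        if a != b:
            return Fraction(1, 2 ** k)
    return Fraction(0)


def shift(s: str, c: int) -> str:
    """Hypothetical functional with use n + c: output bit k is input bit k + c."""
    return s[c:]


def lipschitz_holds(length: int, c: int) -> bool:
    """Check d(shift(s), shift(t)) <= 2^c * d(s, t) for all strings of one length."""
    strings = ["".join(bits) for bits in product("01", repeat=length)]
    return all(
        d(shift(s, c), shift(t, c)) <= 2 ** c * d(s, t)
        for s in strings
        for t in strings
    )


print(lipschitz_holds(5, 2))
```

The pair "00100" and "00000" shows that the factor $2^{c}$ cannot be dropped for this functional: the inputs are at distance 1/4 while the outputs are at distance 1.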
The following are some results from the literature on the $\ell$-degrees (induced by $\le_\ell$) which are relevant to our
present considerations. For more background on this structure we refer the reader to [7,5].
Definition 1.3. A Solovay test is a c.e. set $S$ of binary strings such that $\sum_{\sigma \in S} 2^{-|\sigma|} < \infty$. A real number $\alpha$ avoids $S$
if for almost all $\sigma \in S$, $\sigma \not\subset \alpha$. A real is (Martin-Löf) random if it avoids all Solovay tests.
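For a finite collection of strings, both quantities in Definition 1.3 are directly computable. The following sketch is only an illustration on finite data (a genuine Solovay test is an infinite c.e. set); the helper names `weight` and `prefixes_hit` are assumptions introduced here.

```python
from fractions import Fraction
from typing import Iterable, List


def weight(strings: Iterable[str]) -> Fraction:
    """Sum of 2^-|sigma| over the given strings, computed exactly."""
    return sum((Fraction(1, 2 ** len(s)) for s in strings), Fraction(0))


def prefixes_hit(strings: Iterable[str], real_prefix: str) -> List[str]:
    """Members of the collection that are initial segments of real_prefix."""
    return [s for s in strings if real_prefix.startswith(s)]


test_strings = ["0", "10", "110"]
print(weight(test_strings))
print(prefixes_hit(test_strings, "1101"))
```

A real avoids the test when only finitely many members of $S$ are initial segments of it; on a finite sample this amounts to inspecting the list returned by `prefixes_hit`.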
Definition 1.4. A real number is computably enumerable (c.e.) if it is the limit of a computable increasing sequence
of rationals.
The main justification for $\le_\ell$ as a measure of relative randomness was the following:
Proposition 1.2 (Downey et al. [6]). If $\alpha \le_\ell \beta$ then for all $n$, the prefix-free complexity of $\alpha \upharpoonright n$ is less than or equal
to that of $\beta \upharpoonright n$ (plus a constant).
In particular, then, $\le_\ell$ preserves randomness: if $\alpha$ is a random real and $\alpha \le_\ell \beta$ then $\beta$ is random, so that any
$\ell$-degree either contains only random or no random reals.
Yu and Ding proved the following:
Theorem 1.1 (Yu and Ding [11]). There is no $\ell$-complete c.e. real.
By a uniformization of their proof they obtained two c.e. reals which have no c.e. real $\ell$-above them. Hence:
Corollary 1.1 (Downey et al. [6]). The structure of $\ell$-degrees is not an upper semi-lattice.
The main idea of their proof of Theorem 1.1 can be applied for the case of c.e. sets in order to get an analogous
result. Using different ideas Barmpalias [1] proved the following stronger result.
Corollary 2.2 (The equivalent of the Yu–Ding Theorem for the $\Delta_2$ reals). There exists no $\ell$-complete $\Delta_2$ real.
Proof. This follows immediately from the previous corollary.
Theorem 1.2 (Barmpalias [1]). There are no $\ell$-maximal c.e. sets. That is, for every c.e. set $A$, there exists a c.e. set
$W$ such that $A <_\ell W$.
Note that since the Solovay degrees and the $\ell$-degrees coincide on the c.e. sets (see [5]) the following also holds.
Corollary 1.2 (Barmpalias [1]). The substructure of the Solovay degrees consisting of the ones with c.e. members
(i.e. containing c.e. sets) has no maximal elements.
Corollary 2.4. The $\ell$-degrees are not an upper semi-lattice; in fact there exists a set of two $\ell$-degrees with no upper
bound.
In Barmpalias and Lewis [2] it was shown that there are c.e. reals $\alpha$ that cannot be $\ell$-computed by any random c.e.
real. That is, for any c.e. real $\beta \ge_\ell \alpha$, $\beta$ is not random. Also, in Barmpalias and Lewis [3] it was shown that strictly
below every random $\ell$-degree there is another random $\ell$-degree. The first aim of this paper is to prove the following
(perhaps rather surprising) result.
Proof. Just choose any $\alpha$ and $\beta$ which are random and Turing incomparable.
Theorem 2.3 below, however, tells us that quasi-maximality does not characterize the random reals.
Theorem 1.3. There exists a (globally) quasi-maximal $\ell$-degree, i.e. there exists a real $\alpha$ such that, for all reals $\beta$, if
$\alpha \le_\ell \beta$ then $\beta \le_T \alpha$. In fact every random real satisfies this quasi-maximality property.
Theorem 2.2 (Chaitin [4]). Consider a total computable prediction function $f$ which, given an arbitrary finite initial
segment of a real $\alpha$, returns either 'no prediction', 'the next bit is a 0', or 'the next bit is a 1'. If $\alpha$ is random and $f$
predicts infinitely many bits of $\alpha$ then in the limit the proportion of correct predictions to total predictions made tends
to $\frac{1}{2}$.
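The ratio in Theorem 2.2 is easy to compute for a concrete predictor on a finite bit string. The sketch below is illustrative only: `score` and the predictor `repeat_last` are hypothetical names introduced here, and a finite string can of course say nothing about the limit behaviour that the theorem describes.

```python
from typing import Callable, Optional, Tuple


def score(predict: Callable[[str], Optional[str]], bits: str) -> Tuple[int, int]:
    """Count (correct predictions, total predictions) along a finite bit string."""
    correct = total = 0
    for k in range(len(bits)):
        guess = predict(bits[:k])  # the predictor sees only the initial segment
        if guess is None:          # 'no prediction'
            continue
        total += 1
        if guess == bits[k]:
            correct += 1
    return correct, total


def repeat_last(prefix: str) -> Optional[str]:
    """Hypothetical predictor: guess that the next bit repeats the last one."""
    return prefix[-1] if prefix else None


print(score(repeat_last, "0011010"))
```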
The fascination of this result lies in the fact that we are generally not used to degree structures possessing anything
like maximal elements in the global sense (where we consider the degrees of all reals).
So suppose given and such that is a random real and i = . By Schnorrs theorem we may define m ? to be
= m. Let Tn be all those strings of length
the maximum m such that there exist an infinite number of n, i ( n)S
n + ci such that i is the initial segment of of length n and let T = n Tn . We say that a real lies on T if all but
finitely many initial segments are in T . Since there are a finite number of 0 lying on T there exists 0 such that
if 0 6= lies on T then 0 6 0 . Now suppose we are given 0 which is not an initial segment of . Using an
oracle for we can enumerate all tuples (n, 1 , .., m ? ) such that Tn = {1 , .., m ? } until we find such a tuple with no
m compatible with whereupon we can deduce that is not an initial segment of .
Corollary 2.1. There exist low reals which are of quasi-maximal $\ell$-degree.
Proof. There exist low random reals [7].
Corollary 2.3. Every Turing degree above $0'$ contains a set of quasi-maximal $\ell$-degree.
Definition 2.4. For 2< let (, i) = min{ ( 0 , i) 0 }. Let ? (, i) be the least string 0 such that
( 0 , i) = (, i).
Lemma 2.2. Given 0 , i, let 1 = ? (0 , i). For all 2 1 we have (2 , i) = (0 , i).
Proof. By induction on the length of 2 . So suppose given 2 1 such that (2 , i) = (0 , i). Now if
(2 0, i) < (0 , i) or (2 1, i) < (0 , i) this would contradict the fact that 1 = ? (0 , i). Thus by Lemma 2.1
(2 0, i) = (2 1, i) = (0 , i).
Lemma 2.3. Given 0 , i, let 1 = ? (0 , i). For all 1 and all such that i = we have that T .
Proof. Given and as in the statement of theS
lemma, let Tn be all those strings of length n + ci such that i is
the initial segment of of length n and let T = n Tn . The following facts follow immediately from the fact that, by
Lemma 2.2, there are precisely the same number of strings (actually (0 , i)) in Tn for all sufficiently large n.
(i) There are a finite number of reals lying on T (at most (0 , i)).
(ii) We can compute (not just enumerate) T using an oracle for .
By (i) there exists 0 such that if 0 6= lies on T then 0 6 0 . If we are given 1 0 which is not an initial
segment of then using an oracle for it follows by (ii) that we can find n such that there are no extensions of 1 in
Tn .
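For all $\sigma$, define $f(\sigma) = |\{n : \sigma(n) = 0\}|$. If $\alpha$ is a random real then, by Theorem 2.2:
$$(\dagger) \qquad \lim_{n \to \infty} \frac{f(\alpha \upharpoonright n)}{n} = \frac{1}{2}.$$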
The construction.
Let 0 beS
the empty string. Given i let i0 = ? (i , i) and then define i+1 to be i0 concatenated with 2i0 zeros.
Define = i i .
The verification.
Since i+1 it follows by Lemma 2.3 that if = i then T . We have that is not random since it
clearly does not satisfy ().
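The reason the constructed real fails the frequency condition can be seen numerically. The sketch below makes two assumptions that should be read as such: the garbled exponent in the construction is taken to mean that each stage appends $2^{m}$ zeros to a string of length $m$, and the string-valued extension step of the construction is replaced by a hypothetical stand-in `append_one`, since its definition depends on the functionals being diagonalized against.

```python
from fractions import Fraction
from typing import Callable


def pad_with_zeros(sigma: str) -> str:
    """Append 2^|sigma| zeros (assumed reading of the construction)."""
    return sigma + "0" * (2 ** len(sigma))


def build(stages: int, extend: Callable[[str, int], str]) -> str:
    """Run the stage-by-stage construction with a caller-supplied extension step."""
    sigma = ""
    for i in range(stages):
        sigma = pad_with_zeros(extend(sigma, i))
    return sigma


def zero_fraction(sigma: str) -> Fraction:
    """Proportion of zeros, i.e. f(sigma) / |sigma|."""
    return Fraction(sigma.count("0"), len(sigma))


def append_one(sigma: str, i: int) -> str:
    """Hypothetical stand-in for the extension step: any extension would do here."""
    return sigma + "1"


print(zero_fraction(build(1, append_one)))
print(zero_fraction(build(2, append_one)))
```

Whatever the extension step returns, a string of length $m$ followed by $2^{m}$ zeros has zero-proportion at least $2^{m}/(m + 2^{m})$, so the proportion of zeros comes arbitrarily close to 1 along the construction and cannot converge to 1/2.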
3. Maximality
Having proved that quasi-maximality does not characterize the random reals it is natural to ask whether maximality
might provide such a characterization. With the following theorem, however, we are able to answer this question in
the negative.
Theorem 3.1. No real is of maximal $\ell$-degree.
Proof. Let the $\ell$-functionals $\Gamma_0$ and $\Gamma_1$ be defined inductively as follows. Suppose $d \in \{0, 1\}$.
(i) For both strings of length 1 we define d = d.
(ii) If is of the form 2n for some n 1 then let 0 be the initial segment of of length 2n 1. There exists a unique
define d = d0 1.
(iii) If is not of the form 2n for any n 0 then let 0 be the initial segment of of length 1. Let c = ( 1)
and define d = d0 c.
It is important to have an intuitive picture of the above inductive definition. Consider the range of $\Gamma_0$. We begin by
branching the empty sequence with two 0s. From then on, at levels $2^{n}$ (for any $n$) we extend with either two 1s or two
0s according to whether there is another node of the identity tree which is on the left and which is $\Gamma_0$-mapped to the
same string as the node we are on or not. At all other levels we extend the strings as we would the identity tree, that
is, a 0 on the left branch and a 1 on the right branch. It can easily be seen that $\Gamma_0$ has the following properties.
[2] G. Barmpalias, A.E.M. Lewis, A c.e. real that cannot be sw-computed by any Ω-number, Notre Dame Journal of Formal Logic 47 (2) (2006) 197–209.
[3] G. Barmpalias, A.E.M. Lewis, Random reals and Lipschitz continuity, Mathematical Structures in Computer Science 16 (5) (2006).
[4] G. Chaitin, Algorithmic Information Theory, Cambridge University Press, 2004.
[5] R. Downey, D. Hirschfeldt, G. LaForte, Randomness and reducibility, Journal of Computer and System Sciences 68 (2004) 96–114.
[6] R. Downey, D. Hirschfeldt, G. LaForte, Randomness and reducibility, in: Mathematical Foundations of Computer Science, in: Lecture Notes in Comput. Sci., vol. 2136, Springer, Berlin, 2001, pp. 316–327.
[7] R. Downey, D. Hirschfeldt, Algorithmic Randomness and Complexity, Monograph (in preparation).
[8] A. Kučera, Measure, $\Pi^0_1$ Classes and Complete Extensions of PA, in: Springer Lecture Notes in Mathematics, vol. 1141, Springer-Verlag, 1985, pp. 245–259.
[9] C. Schnorr, A unified approach to the definition of a random sequence, Mathematical Systems Theory 5 (1971) 246–258.
[10] R. Soare, Computability theory and differential geometry, Bulletin of Symbolic Logic (in press).
[11] L. Yu, D. Ding, There is no SW-complete c.e. real, Journal of Symbolic Logic 69 (4) (2004) 1163–1170.
Further reading
[1] G. Barmpalias, A.E.M. Lewis, The ibT degrees of computably enumerable sets are not dense, Annals of Pure and Applied Logic (in press).
[2] C.S. Calude, A characterisation of c.e. random reals, Theoretical Computer Science 271 (1–2) (2002) 3–14.
[3] R. Downey, D. Hirschfeldt, A. Nies, Randomness, computability, and density, SIAM Journal of Computation 31 (4) (2002) 1169–1183.
[4] R. Downey, Some recent progress in algorithmic randomness, 2004 (preprint).
[5] P. Odifreddi, Classical Recursion Theory, North-Holland, Amsterdam, Oxford, 1989.
[6] R. Soare, Recursively Enumerable Sets and Degrees, Springer-Verlag, Berlin, London, 1987.
00 = 01 = .
If | | = 2k + c < 2k+1 consider the two i such that 00 = 01 = . Then 0 , 1 differ at their c-th bit from the
end, i.e. their | | c 1 bit. In particular, if is of length 2k they differ on their last bit.
For every real which begins with 0 there is a unique such that 0 = .
So now suppose given a real and without loss of generality that (0) = 0. Then there exists a unique such
that 0 = . If is of `-degree strictly above then we are done. So suppose instead that we are given i such
0
that i = . We shall define 2 for which there exists a total tree of reals 0 such that 2 = . This suffices to
give the result, since then we can pick 0 on this tree which is not Turing below . Pick n 0 large enough such that
2n 0 ci > 2n 0 1 + 1.
(i) For all which are of length < 2n 0 , 2 = 0 .
(ii) If > 2n 0 , but is not of the form 2n for any n 0 then let 0 be the initial segment of of length 1. Let
c = ( 1) and define 2 = 20 c.
(iii) If is of the form 2n for some n n 0 , then let 0 be the initial segment of of length 2n 1. Let = 20 ,
On gaps under
GCH type
assumptions
Moti Gitik
Annals of Pure and Applied Logic
119 (2003), Pages 1-18
Abstract
We prove equiconsistency results concerning gaps between a singular strong limit cardinal $\kappa$
of cofinality $\aleph_0$ and its power under assumptions that $2^{\kappa} = \kappa^{+\delta+1}$ for $\delta < \kappa$ and some weak form
of the Singular Cardinal Hypothesis below $\kappa$. Together with the previous results this basically
completes the study of consistency strength of the various gaps between such $\kappa$ and its power
under GCH type assumptions below.
© 2002 Elsevier Science B.V. All rights reserved.
MSC: Primary 03E35; 03E55
Keywords: Pcf-theory; Extenders; Forcing
0. Introduction
Our first result deals with cardinal gaps.
We continue [8] and show the following:
Theorem 1. Suppose that $\kappa$ is a strong limit cardinal of cofinality $\aleph_0$; $\delta < \kappa$ is a
cardinal of uncountable cofinality. If $2^{\kappa} \ge \kappa^{+\delta}$ and the Singular Cardinal Hypothesis
holds below $\kappa$ at least for cardinals of cofinality $\mathrm{cf}\,\delta$, then in the core model either
(i) $o(\kappa) \ge \kappa^{+\delta+1} + 1$ or
(ii) $\{\alpha < \kappa \mid o(\alpha) \ge \alpha^{+\delta+1} + 1\}$ is unbounded in $\kappa$.
Together with [2,7] this provides the equiconsistency result for cardinal gaps of
uncountable cofinality. Surprisingly the proof uses very little of the indiscernibles theory
for extenders developed in [8]. Instead, basic results of the Shelah pcf-theory play the
crucial role.
E-mail address: gitik@math.tau.ac.il (M. Gitik).
0168-0072/03/$ - see front matter
PII: S 0 1 6 8 - 0 0 7 2 ( 0 2 ) 0 0 0 3 1 - 3
Results of Section 2 are built on short extender-based Prikry forcings, mainly those
of [2].
+2
(i) o()+ +1 + 1 or
. We assume that there is no inner model with a strong cardinal. First we will prove
the following:
Ai otherwise :
cn | n !; ++1 A if cf = or ++1
icf
Removing its bounded part, if necessary, we can assume that min a|a|+ .
Claim 1.4. For every b a | A pcf (b)|6|b| or |Ai pcf (b)|6|b|, for every icf ,
if cf .
Proof. It follows from Shelah's Localization Theorem [13] and Claim 1.3.
Remark 1.7. The theorem implies results of the following type proved in [8]: if 2 =
+m (2m!) and GCH below , then o()+m + 1, provided that for some k!
the set of such that o()+ is bounded in .
In particular, |a| = .
Let b+ [a] be the pcf-generator
corresponding to + . Consider a = a\b+ [a]. For
every 0, if ++1 A or icf Ai then ++1 pcf (a ). Hence, |(pcf a ) A| =
or |pcf (a ) Ai | = i for each icf and by Claim 1.4, then |a | = .
Claim 1.5. Let
n | n! be an increasing unbounded in sequence of limit points
of a of co#nality cf . Then for every ultra#lter D on ! including all co#nite
sets
+
=D
+ :
cf
n
n!
t. In particular,
+
n pcf (a
n ) since
n is a limit point of a .
So {
+
|
n!}
pcf
a
.
By
[13],
then
pcf
{
|
n!}
pcf
(pcf a ) = pcf
n
n
a . But
by the choice of a ; + = pcf a . Hence for every ultra1lter D on !, cf. ( n!
+
n=
D) = 4 .
Now, |a | = ; a = ; cf 0 and cf = 0 . Hence, there is an increasing unbounded in sequence
n | n! of limit points of a so that for every n0 | a
(
n1 ;
n )| = and |(a
n )\| = for every
n . By Claim 1.5,
+
n | n! are
limits of indiscernibles. We refer to [8] for basic facts on this matter used here. There
+
is a principal indiscernible n 6
n for all but finitely many ns. By the Mitchell Weak
+
Covering Lemma,
+
n in the sense of the core model is the real
n , since
n is singular.
This implies that n 6
n , since a principal indiscernible cannot be successor cardinal
of the core model. Also, n cannot be
n , since again
+
n computed in the core model
correctly and so there is no indiscernibles between measurable now
n and its successor
+
n . Hence n
n . By the choice of
n , the interval (n ;
n ) contains at least regular
cardinals. So n is a principal indiscernible of extender including at least + 1 regular
cardinals which either seats over or below . This implies that either o()++1 +1
or { | o()++1 + 1} is unbounded in .
Using the same ideas, let us show the following somewhat more technical result:
Theorem 1.6. Let = n! n be a strong limit cardinal with 0 1 n .
1
Assume 2 ++ and SSH
(Shelah Strong Hypothesis below for co#nality 1 ,
i.e. pp
=
+ for every singular
of co#nality 1 ). Then there are at most countably many principal indiscernibles n; m | m; n! with indiscernibles n; m | m; n!
so that for each n; m! n 6n; m 6n; m ; n; m is the principal indiscernible of n; m ,
n!
n = sup{n;i | n;i Ci }:
Then each such
n is a singular cardinal of uncountable cofinality. Also,
+
+
+
n pcf a for every n d, since pp
n =
n . But then pcf {
n | n d} pcf a . Hence
+
+
|
n
d}.
Now,
this
implies
as
in
the
proof
of
1.1
that
s
are
indiscernibles
= pcf {
+
n
n
and there are principal indiscernibles for
+
n s below
n . Here this is impossible since
then there should be overlapping extenders. Contradiction.
We will use Theorem 1.6 further in order to deal with ordinal gaps.
As above, we show the following assuming that there is no inner model with a
strong cardinal.
Proposition 1.9. Suppose that
| is an increasing sequence of regular cardinals. is a regular cardinal 1 and
0 2 . Then there is an unbounded S such
that for every of uncountable co#nality which is a limit of points of S the following
holds:
( ) for every ultra#lter D on S including all cobounded subsets of S
bd
+1 ;
tcf
=D = tcf
=JS
S
S
bd
denotes the ideal of bounded subsets of S.
where JS
Proof. Here we apply the analysis of indiscernibles of [8] for uncountable cofinality.
Let | 6 be the increasing enumeration of the closure of
| . Let A
be the set of indexes of all principal indiscernibles for among s (). Then A
is a closed subset of . Now split into two cases.
Case 1: A is bounded in .
Let = sup A. We have a club C so that for every C; ( ; ) if is a
principal indiscernible, then it is a principal indiscernible for an ordinal below . Now
let be a limit point of C of uncountable cofinality. Then by results of [8], pp =
Let A be the set of limit points of A. For every A we consider +1 . Let +1
be
ultrapower and i +1 the image of i which is the critical point of the embedding.
Fix for every i a sequence ci unbounded in i , in the core model and of cardinality cf i there. Take a precovering set including {ci | i}. By [8], assignment
functions can change for this new precovering set only on a bounded subset of i s.
Pick i such that i is above supremum of this set. Again, consider the ultrapower
used to move from i to i +1 . Now we have ci in this ultrapower and its cardinality
is i . Let j : M M be the embedding. ci M and M is an ultrapower by extender. Hence for some
and fci = j(f)(
). Let U
= {X i |
j(X )} and j : M M
i ) by i +1 ; j(
i ) = ci and j(f)([id])
= ci .
be the corresponding ultrapower. Denote j(
Let ci = j(f) )([id]) | )) = cf i = cf i be increasing enumeration (everything in
the core model). Then for most s (mod U
) f() = f) () | )) will be a sequence in M co1nal in i of order type ). Which contradicts the assumption that
cf i i .
where the successor is in sense of the core model or the universe which is the same
by the Mitchell Weak Covering Lemma. Also, for every which is a limit point of S
of uncountable cofinality
bd
tcf
O+1 =JS = (sup{O+1 | S })+ :
Proof. Suppose that for some a as in the statement of the theorem |pcf a||a|+1 . Let
= |a|+ + 2 . Then |pcf a|. Pick an increasing sequence
| inside pcf (a).
By 1.9 we can find an unbounded subset S of satisfying the conclusion ( ) of 1.9.
including all cobounded subsets of S. Let
= cf
Let D be an ultrafilter on
(
=D). Then, clearly,
(
)+ . By the Localization Theorem [13], then
there is a0 {
| S}; |a0 |6|a| with
pcf a0 . Consider S\ sup a0 . S\ sup a0 D
since a0 is bounded in S. Hence cf ( S\ sup a0
=D) =
. Again by the Localization
Theorem, there is a1 S\ sup a0 ; |a1 |6|a| and
pcf a1 . Continue by induction and
define a sequence a | !1 such that for every !1 the following holds:
+1 to the interval [ , length of the extender used over ]. This will replace +1 be a
is a principal
member of the interval. So let us concentrate on the situation when +1
indiscernible for but O+1 has cofinality 6 ().
Let us argue that this situation is impossible. Thus we have increasing sequences
i | i6; i | i and i | i such that for every i
rhoi is between i and the length of the extender used over i ; cf i i in the
core model, i is the image of i over i +1 and cf i i +1 in the core model.
Then cf i i again in the core model since i is the image of i in the
a S
|a |6|a|
pcf a
min a sup a for every .
Let = !1 sup a . Then is a limit of points of S and cf = 1 . Hence ( ) of
bd
1.9 applies. Thus tcf ( S
=JS
) exists below
+1 and is equal to tcf ( S
=F)
F on S including all cobounded subsets of S. Denote
for every ultrafilter
bd
) by +. Let c = pcf (a) and b) [c] | ) pcf (a) = c be a generating
tcf ( S
=JS
sequence. Clearly both + and
are in c and +
. Consider b = b
[c]\b+ [c].
For every !1 ; b a = , since
pcf (a ). Hence, b S is unbounded in
on S including b S.
(by (d) of the choice of a s). Let F be an ultrafilter
And all cobounded subsets of S. Then tcf ( S
=F) = + but this means that
+ pcf b, which is impossible by the choice of b, see for example [1, 1.2].
S
S
(a)
(b)
(c)
(d)
The proof of Theorem 1.10 easily gives a result related to the strength of the negation
of the Shelah Weak Hypothesis (SWH). (SWH says that for every cardinal , the
number of singular cardinals , with pp , is at most countable).
k1
Proof. We prove the statement by induction on ). Fix + . Let ) = 00 k1
,
where 0 = . Set for each .k1
k2 k1 1
n1 1
0
n2 k1
.+00 k2
k1
+1
(.) = ++0
Theorem 1.10.1. Suppose that there is no inner model with strong cardinal. Then for
every cardinal ,22
11
k1
k1
k2
k1
| is an ordinal of kind 00 k2
k1
(.) pcf ({
++1
n; i
; ii(n); n!}).
Let E be the set consisting of all regular cardinals of blocks Bn; i (n!; ii(n))
+
together with all regular cardinals between and min(+ ; 2 ). Set E = pcf E. Then
|pcf E |, since is strong limit. We can assume also that min E |pcf E |. By
[13] then pcf E = E and there is a set b/ [E ] | / E of pcf E generators which is
smooth and closed, i.e.
b/ [E ] implies b
[E ] b/ [E ] and pcf (b/ [E ]) = b/ [E ].
Assumption (2) of the lemma implies that for every unbounded in ++) set B
consisting of regular cardinals above and below ++) max pcf (B) = ++)+1 . In
particular max pcf {(.) | .k1 }) = ++)+1 . Denote ++)+1 by +. Let
A = b+ [E ] {(.) | . k1 }:
Then, |A | =
k1 and for every , A b, [E ] b+ [E ]. For every , A fix a sequence
+
,n | n! n! n+1
inside b, [E ] such that
k1
sequence of k1 ordinals of kind 00 11 k1
(1)
+
(2) there are no measurable cardinals in the core model between and + .
if k = 1 and 0 = 1, i.e. ) = .
For every .k1 , if ) = then by induction
It is possible to find ,n s of the right kind using the inductive assumption, as was
observed above.
Claim 1.16. There are infinitely many n! such that
|{,n | , a }| = k1 :
Let 0) Kinds [; + ) and 2 ++) , for some + . Then ++)+1
| is an ordinal of kind ); ii(n); n!}, where
ni denotes the principcf {
++1
n; i
pal indiscernible of the block Bn; i , as de#ned in 1.6.
Proof. Otherwise by removing finitely many ns or boundedly many ,n s we can assume that for every n|{,n | , A }|k1 . But cf k1 0 . Hence, the total number of
,n s is less than k1 . Now, pcf {,n | n!; , A } A . So, |A pcf {pn, | n!; ,
A }||A | = k1 . By (2) of the statement of the lemma this situation is impossible.
Remark 1.15. (a) The lemma provides a bit more information than will be needed for
deducing the strength of 2 = +)+1 .
(b) Condition (2) is not very restrictive since we are interested in small () gaps
between and its power.
Suppose for simplicity that each n! satisfies the conclusion of the claim. If not,
then we can just remove all the bad ns. This will affect less than k1 of s which
in turn affects less than k1 of ,s.
k2
k1
k1
are of kind 00 11 k2
since cf
= cf k1 and we assumed SSH k1 , i.e. pp
=
+ . Also pp
=
+ implies
that the set {,n | , A }\b
+ [E ] is bounded in
.
Theorem 1.20. Let be a strong limit cardinal of cofinality 0 ; 0) Kinds. Assume that
6|)|
(1) SSH
(2) there are no measurable cardinals in the core model between and +)+1 .
If 2 +) , then in the core model either
(i) o()+)+1 + 1 or
(ii) { | o()+)+1 + 1} is unbounded in .
is reasonable}.
In order to conclude the proof we shall argue that there should be + pcf {
+ |
is
reasonable} such that + b+ [E ]. This will imply b+ [E ] = b+ [E ] and hence + = + .
Let us start with the following:
Claim 1.19. |{,n | n!; , A }\ {b
+ [E ]|
is reasonable} |k1 .
Proof. Suppose otherwise. Let S = {,n | n!; , A }\ {b
+ [E ]|
is reasonable}
and |S| = k1 . Then for some n! also {,n | ,n S} has cardinality k1 , since
cf k1 0 . Fix such an n and denote {,n | ,n S} by Sn .
But now there is a reasonable
which is a limit of elements of Sn . pp
=
+
implies that the set {,n | , A }\b
[E ] is bounded in
. In particular, Sn b
+ [E ]
is unbounded. Contradiction, since Sn S which is disjoint to every b
+ [E ] with
reasonable.
{,n
Proof. By Lemma 1.14, for infinitely many ns for some ik i(n) the length of the
++1
for an ordinal
block Bn; in will be at least
+)+1
n; in , since it should contain some
n; in
of kind ). Clearly, ) since ) is the least ordinal of kind ).
We would now like to outline a way to remove (2) of Theorem 1.20 at the cost of restricting
possible )s. First change Definitions 1.11 and 1.13. Thus in Definition 1.11 we replace
uncountable by above 1 . Denote by Kinds the resulting class. Then define kind
of ordinal as in Definition 1.13 replacing Kinds by Kinds .
Theorem 1.21. Let be a strong limit cardinal of cofinality 0 ; 0) Kinds . Assume
SSH 6|)| . If 2 +) , then in the core model either
(i) o()+)+1 + 1 or
(ii) { | o()+)+1 + 1} is unbounded in .
The theorem, as in the case of 1.20, will follow from the following:
Lemma 1.22. Let be a strong limit cardinal of cofinality 0 ; a cardinal of
cofinality above 1 . Assume SSH 6 . Let 0) Kinds [; + ] and 2 ++) for
some + . Then
| is an ordinal of kind ); i i(n); n !}
pcf {
++1
ni
[++)+1 ; ++)+)+1 ] = :
Let us first deal with a special case: ) is a cardinal. We split it into two cases: (a) )
is regular and (b) ) is singular. The result will be stronger than those of Remark 1.22.
Lemma 1.23. Let be a strong limit cardinal of cofinality 0 ; is a
regular uncountable cardinal. Assume SSH 6 . Let 2 ++ for some + .
Then
+++1 pcf ({
++1
| i i(n); n ! and is an ordinal
ni
of cofinality }):
Proof. Let + = +++1 . We choose E and b/ [E ] | / E as in the proof of
Lemma 1.14. Measurables of a core model between and 2 are allowed here. So in
contrast to Lemma 1.14 we cannot claim anymore for every unbounded B [; ++ )
consisting of regulars max pcf (B) = +++1 . Hence the choice of A (the set crucial for
the proof of Lemma 1.14) will be more careful.
Set A to be the set of cardinals ++
+1 [++1 ; ++ ) such that either o()++
Claim 1.25. If A D .
k2
k1
00 k2
k1
}) [++)
.+) +1
; ++)
.+) +) +1
];
where
) =
k2
k1 1
0 k2
k1
k1
if ) = 00 k1
and
(k 1
or (k = 1 and 0 1)
0;
if k = 1
and
0 = 1
In the last case the inductive assumption ensures the existence of such (.).
Define E and b/ [E ] | / E as in the proof of Lemma 1.14. We do not know
now if for every unbounded in ++) set B [; ++) ) consisting of regular cardinals
max pcf (B) = ++)+1 . We may consider the set {++) +1 | k1 }. If for club
++) +1
is not a principal indiscernible then by [8] cf ( B=bounded) =
many s
++)+1
++)
for any unbounded subset B of
consisting of regular cardinals. Note that
cf k1 0 is crucial here. In this case we define A ={(.) | .k1 } b++)+1 [E ]
and proceed as in the proof of Lemma 1.14. The only difference will be the use
of 1.10 to eliminate a possible influence of k1 cardinals. Here the assumption
k1 1 comes into play. In the general case it is possible to have {(.) | .k1 }
bk ++)+1 [E ] empty. But once for a club of s below k1 +) +1 s are principal
indiscernibles, by [8] we can deduce that
pcf ({(.) | . k1 })\++)
[++)+1 ; ++)+)
+) +1
] [++)+1 ; ++)+)+1 ]:
Let D be an ultrafilter on the set {(.) | .k+1 } containing all cobounded subsets.
Set
+ = cf
{(.) | . k1 }=D :
the measure an () of En . In present situation we force over the principal indiscernible
n , i.e. one corresponding to the normal measure of En . The extender-based Magidor
forcing changes its cofinality to and adds for every -n 6-6+n+2
a sequence tnn
of order type cofinal in n . Actually, tn; +n+2
(i) = +n+2
n; i (i), where ni | i is
n
+n+2
the sequence tnn . Now, if -n
is produced by an (), then we connect with
the sequence tn- in addition to its connection with -. Using
about
standard arguments++
for
Prikry-type forcing notions, it is not hard to see that cf ( n! +n+2
n; i ; finite) =
n; i ; 1nite) =
every i as witnessed by tn- (i)s.
Remark 1.26. The use of Kinds and not of Kinds in Theorem 1.21 (or actually in
Lemma 1.22) is due only to our inability to extend Theorem 1.10 in order to include
the case of a countable set. Still in view of Theorem 1.1 and also Lemmas 1.23 and
1.24, the first unclear case will not be !1 but rather !1 + !1 .
2. Some related forcing constructions
In this section, we would like to show that (1) it is impossible to remove the SSH assumptions
from Theorem
1.6; (2) the conclusion of Theorem 1.11 is optimal, namely, starting
n
with = n! n ; 0 1 n and o(n ) = n+ +1 + 1 we can construct a
model satisfying 2 + for every , where as in Proposition 1.9 is a cardinal
of uncountable cofinality; (3) the forcing construction for s of cofinality 0 will
be given. All these results are based on the forcing of [2], and we sketch them modulo this
forcing.
+n
Proof. Without loss of generality, we can assume that is a regular cardinal. We pick
an increasing sequence n | n! converging to so that for every n! o(n ) =
n+n+2 + + 1. Fix at each n a coherent sequence of extenders Ein | i6 with Ein of
the length n+n+2 .
We would like to use the forcing of [2, Section 2] with the extender sequence En | n!
to blow up the power of to ++ together with extender-based Magidor forcing changing
the cofinality of the principal indiscernible of En to (for every n!) and simultaneously
blowing up its power to the double plus. We refer to Segal [12] or Merimovich [10] for
generalizations of the Magidor forcing to the extender-based Magidor forcing.
The definitions of both of these forcing notions are rather lengthy and we will not
reproduce them here. Instead let us emphasize what happens with indiscernibles and
why (iii) of the conclusion of the theorem will hold.
Fix n!. A basic condition of [2, Section 2] is of the form an ; An ; fn , where an is
an order preserving function from ++ to n+n+2 of cardinality n ; An is a set of measure one for the maximal measure of rngan which is in turn a measure of the extender
En over n . The function of fn is an element of the Cohen forcing over a + . Each
doman is intended to correspond to indiscernible which would be introduced by
Remark 2.2. Under the assumptions of the theorem, one can obtain 2 + for any
countable . But we do not know whether it is possible to reach uncountable gaps.
See also the discussion in Section 3.
Theorem 2.3. Suppose that is a cardinal of cofinality !; is a cardinal of
n
uncountable cofinality and for every n! the set { | o()+ } is unbounded
in . Then for every + there is cofinality preserving, not adding new bounded
subsets to extension satisfying 2 + .
Remark 2.4. By the results of the previous section, this is optimal if [
at least if one forces over the core model.
n!
n ; + ),
n1
Theorem 2.6. Let be a cardinal of cofinality ! and 00 k1
Kinds .
} is unbounded in
Suppose that for every n! the set { | o()
k1
+ there is cofinality preserving, not adding new
. Then for every 00 k1
+
bounded subsets to extension satisfying 2 .
Again, this is optimal by results of the previous section, if
k1
k1
00 k1
n ; 00 k1
+
Table 1
k1 n
+00 k1
=2
o()++
or
n!{ | o()+n }
is unbounded in
20
o()+ + 1
or
n!{ | o()+n }
is unbounded in
at least if one forces over the core model in case = 1 . The construction is parallel to that of Theorem 2.3, only we use the following version of the Rado-Milner
Paradox:
k1
k1
n ; 00 k1
+ ) there are Xn (n !) such
For every [ n! 00 k1
k1
0
n
that = n! Xn and otp(Xn )0 k1 .
||{ | o()+
}
is unbounded in
cf || = 0
n!
0
cf ||0
is a cardinal
o()++1 + 1
or
{ | o()++1 + 1}
is unbounded in
= || ,
for some
1!
o()+|| +1 + 1
or
{ | o()+|| +1 + 1}
is unbounded in
!{ | o()+|| }
Along the same lines we can deal with gaps of size of a cardinal of countable
cofinality below . Thus the following result holds, which together with the results of the
previous section provides the equiconsistency:
||
!
is unbounded in
k1
k1
+
0
00 k1
!
k 60 k1 k
k1
k Kinds
for some 00 k1
0
kk kk
n!{ | o()+0
is unbounded in
0
k +1
o()+0 k
or
+1
+00 kk+1
+ 1}
{ | o()
is unbounded in
{ | o()+
}
is unbounded in
The situation without SSH is unclear. In view of 2.1, assumptions weaker
than those used in the case of SSH may probably work. The simplest question in this direction
is as follows.
Question 3. Is { | o()+n } unbounded in for each n! sufficient for strong
limit, cf = 0 and 2 +!1 ?
If the answer is affirmative, then the construction will require a new forcing with
short extenders, which will be interesting in itself. We then conjecture that the same
assumption will work for arbitrary gaps as well.
For uncountable cofinalities (i.e. cf 0 ), as far as we are concerned with consistency strength, the only unknown case is the case of cofinality 1 . We restate a
question of [8]:
Question 4. What is the exact strength of is a strong limit, cf = 1 and 2 , for
a regular ,+ ?
It is known that the strength lies between o() = , and o() = , + !1 , see [8].
The Manin-Mumford
conjecture and the
model theory of
difference fields
Acknowledgements
We are grateful to Saharon Shelah for many helpful conversations and for explanations that he gave on the pcf-theory.
References
[1] M. Burke, M. Magidor, Shelah's pcf theory and its applications, Ann. Pure Appl. Logic 50 (1990) 207-254.
[2] M. Gitik, Blowing-up power of a singular cardinal-wider gaps, Ann. Pure Appl. Logic, to appear.
[3] M. Gitik, Wide gaps with short extenders, math.LO/9906185.
[4] M. Gitik, The negation of SCH from $o(\kappa) = \kappa^{++}$, Ann. Pure Appl. Logic 43 (3) (1989) 209-234.
[5] M. Gitik, The strength of the failure of SCH, Ann. Pure Appl. Logic 51 (3) (1991) 215-240.
[6] M. Gitik, There is no bound for the power of the first fixed point, submitted for publication.
[7] M. Gitik, M. Magidor, The singular cardinals problem revisited, in: H. Judah, W. Just, H. Woodin (Eds.), Set Theory of the Continuum, Springer, Berlin, 1992, pp. 243-279.
[8] M. Gitik, W. Mitchell, Indiscernible sequences for extenders and the singular cardinal hypothesis, Ann. Pure Appl. Logic 82 (1996) 273-316.
[9] K. Kunen, Set Theory: An Introduction to Independence Proofs, North-Holland, Amsterdam, 1983.
[10] C. Merimovich, Extender based Radin forcing, Trans. AMS, to appear.
[11] W. Mitchell, The core model for sequences of measures, Math. Proc. Cambridge Philos. Soc. 95 (1984) 41-58.
[12] M. Segal, Master's Thesis, The Hebrew University, 1993.
[13] S. Shelah, Cardinal arithmetic, Oxford Logic Guides 29 (1994).
[14] S. Shelah, The singular cardinal problem. Independence results, in: A. Mathias (Ed.), London Mathematical Society, Lecture Note Series, Surveys in Set Theory, vol. 87, Cambridge University Press, Cambridge, pp. 116-133.
Ehud Hrushovski
Annals of Pure and Applied Logic
112 (2001), Pages 43-115
E. Hrushovski
pp. 37-39; 27, p. 73]. This was proved by Raynaud. Indeed he proved (with $K^a$
denoting the algebraic closure of a field K):
Abstract
Using methods of geometric stability (sometimes generalized to finite S1 rank), we determine
the structure of Abelian groups definable in ACFA, the model companion of fields with an
automorphism. We also give general bounds on sets definable in ACFA. We show that these
tools can be used to study torsion points on Abelian varieties; among other results, we deduce
a fairly general case of a conjecture of Tate and Voloch on p-adic distances of torsion points
from subvarieties.
© 2001 Elsevier Science B.V. All rights reserved.
MSC: 03C; 11G; 12H10
Keywords: Abelian varieties; Torsion points; Difference fields; Geometric stability;
Model theory of difference equations
1. Introduction
This paper extends and specializes the general model theory of difference equations
in characteristic 0, developed in [9]. We investigate the induced structure on definable
Abelian groups of finite dimension. We also give general bounds on the number of solutions to a finite set of difference equations. As a corollary, we obtain a model-theoretic
proof of the Manin-Mumford conjecture. The proof yields new number-theoretic information, particularly with respect to p-adic and algebraic uniformities of the bounds
obtained.
1.1. The Manin-Mumford conjecture
The Manin-Mumford conjecture states that if C is a curve of genus two or more,
embedded in its Jacobian J, then the set of torsion points on C is finite, see [18,
1 The work was begun at MIT, with support from the NSF. The latter part was supported by the ISF.
E-mail address: ehud@math.huji.ac.il (E. Hrushovski).
c and e can be (and are, in Section 5) written down explicitly; they are doubly
exponential in some natural parameters associated with A.
Compare this to the bound in [11]. The bound there is given for X defined over a
fixed number field K, and is not given uniformly in K. There is a constant, depending
on K and A, whose existence follows from Serre's work on Galois representations, but
was not known to be effective [18, p. 39; 29].
I learned from the referee report that the constant was later shown to be effective, as a
corollary of an alternative transcendence proof of the Tate and Shafarevich conjectures
by Masser and Wüstholz [20], and subsequent work by Bost and David [1]. 2 At all
events, the bounds given here are independent of this Galois theoretic data.
The bound we obtain can be written down more easily if one prime is excluded
from the torsion. Let ZCl(Y) denote the Zariski closure of Y.
Example 1.1.1. Let A be a connected, commutative algebraic group defined over a
number field K. Let p be a prime of good reduction, with residue field GF(q) of
characteristic p. Let $T_p$ be the group of points of $A(K^a)$ of finite order prime to p.
Let X be a subvariety of A. Then $\mathrm{ZCl}(X \cap T_p)$ is a union of at most
4d (2dr +1) log2 (1+q1=2 ) 22dr dim(X )
(deg(X )2dr +1 d+ r
References and quote from the very useful referee report, for which I am grateful. The referee adds:
This might also be a good place to cite the important paper of Faltings [9], which underlay Serre's result.
The union is taken over a finite set of subvarieties $C \subset X$ of the form $A_i + c$, with
$mc \in A(K)$. The integer m and the subvarieties $A_i$ can be determined effectively.
Tate-Voloch conjecture. Tate and Voloch conjectured that the torsion points on an
Abelian variety A over $\mathbb{C}_p$ that do not lie on a subvariety $V \subset A$ are bounded away
from that variety. Certain special cases were proved by Tate-Voloch, and by Buium
and Silverman. The proof of the Manin-Mumford conjecture given above lends itself
immediately to a proof of the Tate-Voloch conjecture under much weaker restrictions:
A must be assumed defined over a finite extension of $\mathbb{Q}_p$; must have good reduction;
and the prime-to-p torsion points only are considered. We show this in Lemma 6.6.1.
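The phrase "bounded away from that variety" can be spelled out as follows. This is only a sketch of one way to write the statement: the distance function $d(t, V)$ is notation assumed here for illustration and is not taken from the text, and the quantifier over torsion points must be restricted as described above in the cases actually proved.

```latex
% Assumed notation (not from the text): d(t, V) denotes a p-adic distance
% from the point t to the subvariety V; Tor(A) is the torsion subgroup of A(C_p).
\[
  \exists\, \varepsilon > 0 \quad
  \forall\, t \in \mathrm{Tor}(A) : \qquad
  t \notin V \;\Longrightarrow\; d(t, V) \ge \varepsilon .
\]
```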
Structure of the proof of Manin-Mumford: The proof moves from number theory
through algebra to model theory; the main work is done there. The first step is to
embed the group of torsion points into a group defined by difference equations. All
the number theory used occurs here. Beyond this point we have a difference algebra
setup, and we study it with model theoretic binoculars. Three levels of model theory
are used.
(1) Quantifier elimination (to a certain level). Consider, for instance, the problem of
showing that if two groups A and B satisfy the conclusion of Manin-Mumford, then
so does $A \times B$. This involves considering projections to B of subvarieties $X \subset (A \times B)$,
and their fibers. If A, B are taken to be groups of rational points, or torsion points, the
projections are notoriously undecidable (Gödel). But if we take larger groups defined
in a structure with quantifier elimination, then the projections are defined by similar
formulas. This provides a reasonable context in which to carry out a proof.
(2) A dimension theory is developed to study the definable sets. We use S1-rank.
This is much younger than Morley rank, and we need to develop the foundations to
some extent (Section 3). The very existence of the dimension theory suffices for some
purposes. For instance, by comparing dimensions, one sees that any definable group
has finite index in one arising directly from an algebraic group. (A finer argument can,
in fact, completely give the structure of definable groups in terms of algebraic groups;
cf. Remark 4.0.3.)
(3) Model theory also permits second-order arguments; properties of the class of all
definable sets are often more amenable to devissage, more functorial under interpretations, than of individual definable sets. A very simple example occurs in Proposition 3.4.1; see the remark following it. Another instance occurs in [5], and enters the
present paper via the key Definition 4.1.2 (where a dichotomy is found between groups
satisfying Manin-Mumford, and those embeddable into the set of points of the fixed
field of an algebraic group). Within [5] one uses a relation, for arbitrary structures
with a certain dimension theory, between Galois theory, amalgamation, and modularity
(modularity is the abstract form of Manin-Mumford, or Mordell-Lang). This makes
no sense for a single algebraic variety; it can be applied, roughly speaking, to an
appropriate class of varieties, i.e. to a structure interpretable in (enriched) algebraic
geometry.
The model theoretic part of the proof is very similar to that of the geometric Mordell-Lang
conjecture in [14]; but the algebraic layer involves fields with automorphisms,
rather than derivations, in order to be able to say something about number fields; as
a consequence the relevant structures are not stable, leading to the use of a different
dimension theory.
1.2. Difference algebra and model theory
A (k-fold) difference field is a field with distinguished automorphisms $\sigma_1, \ldots, \sigma_k$. The
deeper results of [5] are available only for k = 1. We will always assume k = 1 except where explicitly stated otherwise. We also restrict attention to characteristic zero
throughout the paper. The (only) reason is that the deeper results of [5] are available only with this assumption. We expect entirely similar results concerning definable
subgroups of semi-Abelian varieties in positive characteristic; for subgroups of vector
groups new phenomena are encountered [5, Section 7]. 3 The direct analogue of the
Manin-Mumford conjecture is of course false for Abelian varieties defined over finite
fields; once these are ruled out (in the appropriate sense), the result follows from [14].
The theory of difference fields of characteristic zero has a model companion; it plays
the same role for difference fields as the algebraic closure does for a field. See [24] or
[25] or [7] for this notion in general, and [5] for the case of difference fields (with a
single automorphism). We give a quick summary.
Definition 1.2.1. Let $(K, \sigma_1, \ldots, \sigma_k)$ be a difference field. K is difference-closed if K is
algebraically closed, and the following condition holds:
Let X be an irreducible K-variety, $X_i$ the variety obtained by conjugating by $\sigma_i$. Let
Y be an irreducible subvariety of $X \times X_1 \times \cdots \times X_k$, projecting dominantly to each
factor. Then there exists $a \in X(K)$ with $(a, \sigma_1(a), \ldots, \sigma_k(a)) \in Y$.
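A small worked instance may help in reading the condition of Definition 1.2.1. The choice of X, of Y and of the polynomial $x^2 + 1$ below is made purely for illustration and does not come from the text; $\sigma$ denotes the distinguished automorphism in the case k = 1.

```latex
% Illustration only: X, Y and the polynomial x^2 + 1 are chosen for this example.
% Take k = 1 and X = \mathbb{A}^1, which is defined over \mathbb{Q}, so X_1 = X. Let
\[
  Y = \{(x, y) : y = x^2 + 1\} \subset X \times X_1 .
\]
% Y is an irreducible curve, and both coordinate projections Y -> X, Y -> X_1
% are dominant. In a difference-closed (K, \sigma) there is therefore a point a with
\[
  (a, \sigma(a)) \in Y, \qquad\text{i.e.}\qquad \sigma(a) = a^2 + 1 .
\]
```

In this reading, being difference-closed says that difference equations that are consistent at the level of algebraic varieties already have solutions in K.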
We will say that (K;
1 ; : : : ;
k ) is a universal domain if whenever K is a relatively algebraically closed sub'eld,
(K ) K , K is a countable di&erence 'eld,
and i : K K is an embedding of di&erence 'elds, then there exists an embedding
j : K K of di&erence 'elds, with j i = idK . We will not use any nontrivial properties of universal domains; however it is easy to see that every (countable) di&erence
field embeds into some universal domain. This amounts to saying that the class of
algebraically closed difference fields has the amalgamation property; cf. [7].
Lemma 1.2.2. Any universal domain is difference-closed.
Proof. Let X; Xi be as in the de'nition of di&erence-closed. Let K be a countable,
i -invariant, algebraically closed sub'eld of K, over which X is de'ned. Let (a; a1 ; : : : ;
ak ) be a generic point of Y , over K . Let K be an algebraically closed 'eld extending
K (a; a1 ; : : : ; ak ), and of in'nite transcendence degree over K . It is easy to see that
3 After these lines were written, major parts of [5] were generalized to positive characteristic in [6].
(ii) One can use the structure of quantifier-free definable subsets of D. The dimension
of D is then the maximal m such that there exists a chain $p_0 \subset \cdots \subset p_m$ of prime
difference ideals, with the defining polynomials of D lying in $p_0$. This could be stated
dually in terms of irreducible difference subvarieties of D.
(iii) One can use the structure of all definable subsets of D. See the discussion of
S1-rank below. The ranks given in these three ways satisfy (iii) ≤ (ii) ≤ (i), despite the
additional freedom permitted by (iii). It is this fact that will be used, later on, to show
that all definable groups embed into algebraic groups, and are simply determined by
certain algebraic-group data (up to finite index, in the version given here; a precise
determination requires more work; this reflects the blindness of the rank to finite index).
Note that if G is an algebraic group, or generally an algebraic variety, then G has
finite (Zariski) dimension as such, but infinite rank in the sense of
.
(Classical model theory has developed a convention of denoting a model and its
universe by the same symbol. This is gradually becoming cumbersome, as one works
more and more with different structures on the same universe. For instance, here it
would be much better to have different symbols for G as an algebraic group, and as
a group defined in a difference field. However we will stick to this convention in the
present paper, and trust to context.)
From now on, unless explicitly stated otherwise, we work in the ordinary case of
a single automorphism, k = 1.
In our analysis of definable Abelian groups below, we will also require the general
classification of difference formulas of finite rank, developed in [5]. With a little extra
analysis, the central result there can be phrased as follows. Let $\varphi(x)$ be any difference
equation, or formula, of finite rank. One can try to simplify it by substitution, say the
substitution of $x'$ for x, where $x'$ is a rational or algebraic function of $x, \sigma(x), \sigma^2(x), \ldots$.
Using such transformations, $\varphi$ can be reduced to an equation $\varphi'$ of one of the following
forms. Let $E = \{x' : \varphi'(x')\}$.
(i) E is the fixed field k, i.e. $\varphi'$ is the equation $\sigma(x') = x'$.
(ii) E is a one-dimensional $\sigma$-definable subgroup of a simple Abelian variety A. Moreover, every $\sigma$-definable subset of $E^n$ is a finite Boolean combination of $\sigma$-definable
subgroups.
(iii) $\varphi'$ is trivial in the sense that there are no algebraic relations between pairwise
independent solutions of $\varphi'$.
Though this has been proved only in characteristic zero, we believe that an appropriate
modification of the theorem holds in all characteristics. (In (i) one must allow more
generally the equation for the fixed fields of $\sigma^l \mathrm{Frob}^k$, and in (ii) certain subgroups
of vector groups must be taken into account.) Note the special role that this theorem
accords to subgroups of Abelian varieties, among all difference equations. This by itself
suggests a closer understanding of such groups could be useful.
Definable groups: In Section 4 we will develop the theory of Abelian groups definable in difference algebra. In principle, these groups may be defined by arbitrary
first-order formulas in the language of commutative rings, with a symbol for the automorphism
$\sigma$. However, one quickly sees that up to isogeny and finite index subgroups,
they can all be defined using "linear" equations $\sum_i e_i \sigma^i(x) = 0$, where x ranges over a
commutative algebraic group A (this will be explained in more detail in Section 4).
We put "linear" in quotes, since if A is a semi-Abelian variety, and the equation is
written out in coordinates, it is not linear at all. We are interested however in the inner
structure of these groups, in the model-theoretic sense of induced structure; this includes polynomial relations among elements, intersections with subvarieties, behaviour
of $\sigma$. We obtain essentially the full story here. As an example, we quote:
Definition 1.2.3. Let A be a semi-Abelian variety over K, defined over the fixed field
k. An equation of the form
$\sum_{i=0,\ldots,n} m_i \sigma^i(x) = 0$
(or in the inhomogeneous version, $\sum_{i=0,\ldots,n} m_i \sigma^i(x) = a$) is said to be of restricted
Abelian type if the polynomial $\sum_{i=0,\ldots,n} m_i T^i$
These doubly exponential bounds will be proved in Section 2. The proof requires
only Bézout's theorem and Proposition 2.2.1 (which can be taken as a definition of
difference-closed difference fields). They apply a posteriori to any difference variety
known to be finite, regardless of the effectivity of the initial proof of such finiteness.
In particular they apply to the qualitative finiteness statement of Theorem 1.2.1, and
yield the explicit bounds.
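To get a feeling for how fast a doubly exponential bound grows, the following minimal sketch evaluates an expression of the shape $\deg(S)^{2^{\dim(S)}}$, which is the form the degree bound takes in Section 2. The function name `component_bound` and the sample values are hypothetical and chosen only for illustration; the sketch does not compute any invariant of an actual variety.

```python
def component_bound(deg_s: int, dim_s: int) -> int:
    """Evaluate deg(S) ** (2 ** dim(S)) for given integers.

    Hypothetical helper for illustration: it only evaluates the closed-form
    expression and assumes deg_s >= 1 and dim_s >= 0.
    """
    if deg_s < 1 or dim_s < 0:
        raise ValueError("expected deg_s >= 1 and dim_s >= 0")
    return deg_s ** (2 ** dim_s)


# Tabulate the bound for a fixed degree and growing dimension.
for dim_s in range(5):
    print(dim_s, component_bound(3, dim_s))
```

Each increase of the dimension by one squares the value, which is what makes the bound doubly exponential in the dimension.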
The proof of Manin-Mumford given in this paper was found in 1994 (cf. [13]),
written and submitted in 1995. The present text is a great improvement over the 1995
preprint, thanks to the graceful help of Elisabeth Bouscaren, Zoé Chatzidakis, and
Michael McQuillan, as well as Alex Wilkie and Boris Zilber. Aside from many local
corrections, the following aspects are new.
Tate-Voloch: I heard about the Tate-Voloch conjecture a few weeks after the original preprint was submitted. It was immediately clear that the methods of this paper,
with no further work other than a classical nonstandard analysis type argument, answer
significant cases of that conjecture. The result (Proposition 6.6.1), circulated separately
in early 1996, is now included in the text.
Uniformity in X: The proof of Manin-Mumford, unlike the number theoretic proofs,
does not assume that X lies over the fixed field, and hence gives bounds uniform in X.
In other words, if $X \subset (A \times Y)$ is an algebraic variety, $X_b = \{a \in A : (a, b) \in X\}$, then
there exists m such that the Zariski closure of $X_b \cap \mathrm{Tor}(A)$ is a union of at most m
translates of group subvarieties. It seemed at the time of writing that this is a significant
contribution of the present approach. But it turns out that this can be deduced directly
from the statement! See automatic uniformity, below.
Uniformity in A: Uniformity in A (Theorem 1.1.2) does appear to be an important
feature of this proof. In particular, going back to the original statement of Manin-Mumford,
fix p and also $g \ge 2$; there exists an absolute bound b(p, g) on the number
of prime-to-p torsion points lying on a curve over $\mathbb{Q}_p$ of genus g and with good
reduction at p.
Improvements in model-theoretic technology: The theory of simple unstable first-order theories has greatly matured in the intervening five years. The main advance
was a generalization of the finite-dimensional theory (represented in part here) to the
general simple context. But there were also advances within the finite-dimensional
context, and in particular Frank Wagner found a much smoother and more general
approach to internalizing groups. However I left the original treatment intact.
Automatic uniformity: Statements such as Manin-Mumford, or Mordell-Lang, at
the level of all Abelian varieties (or at least all Cartesian powers of a given one),
enjoy an automatic uniformity property. As soon as one knows that ZCl(X ,n ) is
a finite union of cosets of group subvarieties for any n and any subvariety X of $A^n$,
one also obtains a bound on this finite number, that does not grow when X moves in
an algebraic family. This uniformity (Corollary 3.5.9) seems to have gone unobserved
Y (g) =
yY:
n
(gi ; y) U
i=1
' is de'ned outside of the union L of l linear subvarieties. For a subvariety S of Pnl ,
we let S be the Zariski closure of '1 S. Let deg(S) = deg(S ), the latter taken in
projective space. If S is a reducible variety, we de'ne deg(S) to be the sum of the
degrees of the components.
Let
Rn = {(g; h) G n G : Y (g) Y (h)};
n
En = (g1 ; : : : ; gn ) G n :
Y (g1 ; : : : ; gi1 )
= Y (g1 ; : : : ; gi ) :
i=1
deg(Zi ) 6
r
deg(Vj )
j=1
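(2) Let V be a subvariety of $(\mathbb{P}^n)^l \times (\mathbb{P}^n)^k$, and let $\bar{V}$ be the (set-theoretic) projection to $(\mathbb{P}^n)^l$. Then $\deg(\bar{V}) \le \deg(V)$.
(3) Let X be a subvariety of $(\mathbb{P}^n)^l \times (\mathbb{P}^n)^k$ of degree d. Let $\mathrm{pr}_1$ be the first
projection; $X(a) = X \cap \mathrm{pr}_1^{-1}(a)$. Suppose $\dim X(a) = r$ for generic $a \in \mathrm{pr}_1 X$. Then
$\{a \in (\mathbb{P}^n)^l : \dim(X(a)) > r\}$ is contained in a proper Zariski closed subset of $\mathrm{pr}_1 X$ of
degree at most d.
(4) Let V be a subvariety of $(\mathbb{P}^n)^l \times (\mathbb{P}^n)^l \times (\mathbb{P}^n)^k$; $\Delta = \{(a, b, c) \in (\mathbb{P}^n)^l \times (\mathbb{P}^n)^l \times
(\mathbb{P}^n)^k : a = b\}$; and let $\bar{V} = \mathrm{pr}(V \cap \Delta)$ be the (set-theoretic) projection of $V \cap \Delta$ to
$(\mathbb{P}^n)^l \times (\mathbb{P}^n)^k$. Then $\deg(\bar{V}) \le \deg(V)$.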
Proof. (1) Note that S = '(S U ), and S is irreducible if S is. For S is invariant
under the action of the torus Gml , acting by
: ::: :
xn1
x02
: ::: :
xn2
: ::: :
xnl )
((x01
: ::: :
xn1 ); : : : ; (x01
given by
: ::: :
xnl )))
It follows that some component of S of maximal rank must also be invariant; but then
it has the form U for some subvariety U , and necessarily U = S and U = S . Thus the
operation S S takes subvarieties of Pnl to subvarieties of P(n+1)l1 , injectively and
preserving the inclusion and irreducibility. Hence if Z is a component of V1 V2 , then
Z is a component of V1 V2 . The operation also preserves degree by our de'nition
of degree. Thus we are reduced to the case l = 1; this is [10, p. 148, Example 8:4:6].
(2) We may assume V , and hence VU , are irreducible. Let
'1 : P(n+1)l1 (Pn )l
and
'2 : P(n+1)(l+k)1 (Pn )k (Pn )l
be the maps from De'nition 2.1.1 used to de'ne degree on (Pn )l+k and on (Pn )l . Let
4 : (Pn )l (Pn )k (Pn )l
be the projection. Then clearly 4'2 = '1 5 where 5 is a linear projection from
P(n+1)(l+k)1 to P(n+1)l1 . It remains to show:
Wl2
Wk (Pl ) (Pk );
Lemma 2.2.1. Let (K;
) be a di6erence-closed di6erence 4eld; and let X be an irreducible K-variety; X
the conjugate variety; and Y an irreducible K-subvariety of
l
X X
X
:
l1
'0 : Wl Wk Pl Pk ;
'0
4 (x1 ; : : : ; xl ) = (x2 ; : : : ; xl );
41 (x1 ; : : : ; xl ) = x1 :
Now
1
pr0 ('1
0 (V )
20 );
Let 4; 4 ; 40 be the projections to X X
X
; X
: : : X
; X; respectively. Suppose (4Y )
= 4 Y (or just that these two sets have the same Zariski
closure). Then
Then $\deg(Z) \le \deg(S)^{2^{\dim(S)}}$. In particular, Z has at most $\deg(S)^{2^{\dim(S)}}$
irreducible components.
Say S is defined over Q(c). Then every irreducible component of Z is defined over
Q(
i (c); : : : ;
idim(S) (c))a for some i; 06i6dim(S).
T (0) = ;
{W : W a component of S(i)};
T (i + 1) = {W : W a component of S(i)} T (i):
S(i + 1) =
Using Claim (4), it is clear that $S(k) = \emptyset$ for some $k \le \dim(S) + 1$. By Claim (3), if
$\bar{x} = (x, \sigma(x), \ldots, \sigma^{l-1}(x)) \in S$ then $\bar{x} \in S(i) \cup T(i)$ for each i, hence $\bar{x} \in T(k)$. Thus in this
situation $x \in \pi_1 T(k)$. It follows that Z is contained in $\pi_1 T(k)$. Conversely, by Claim (2),
every component of $\pi_1 T(k)$ is contained in Z. Hence $Z = \pi_1 T(k)$.
By Claim (1), $\deg(S(i) \cup T(i)) \le \deg(S)^{2^i}$ and $\deg T(i+1) \le \deg(S)^{2^i}$ for each i. The
required bound on the degree follows using Lemma 2.1.2(2).
The rationality statement follows by induction from Claim (5).
Corollary 2.2.3. Let X be a subvariety of $\mathbb{P}^n$, and let S be an irreducible subvariety
of $(\mathbb{P}^n)^l$. Let $(K, \sigma)$ be a difference-closed difference field, and let
Z = Zariski closure of $\{x \in X(K) : (x, \sigma(x), \ldots, \sigma^{l-1}(x)) \in S\}$.
Then
d
4i;+ : PA PAi;+ :
dim(S)
:
dim(S)
We include here some of the general theory of groups of finite S1-rank. Some of this
material appeared previously in an unpublished preprint PAC (On PAC and related
structures), and some was introduced in [8]. We refer to Chap. 7 of [5].
In these lemmas, all structures are assumed to be of finite S1-rank; there are no other
assumptions. To be precise, we assume that we work in a universal domain U, with a
map rk on nonempty definable sets, into the nonnegative integers, with the following
properties. (set rk() = )
A point of a definable set D of rank d is said to be generic over a base set B if it
does not lie in any B-definable set of rank < d:
1. Suppose $f : D \to E$ is a definable map. Let
Lemma 3.2.3. Let G be a group definable in a structure of finite S1 rank. For any
definable $Y \subset G$ there exist definable groups $H_i$ of G and finitely many cosets $C_i$ of
$H_i$ such that:
1. $Y \subset \bigcup_i C_i$.
2. For some n, every element of $H_i$ has the form
$\prod_{1 \le i \le 2n} a_i^{(-1)^i}$
with $a_i \in (Y \cap C_i)$.
Similarly, we may show that K has a normalizer of 'nite index. Let a; a be independent generic elements of P, and let b = a a1 . We will show that b normalizes K.
Let c K. Then
Proof. Enlarging C, and replacing p by a complete type over the new base, preserves
the hypothesis. Hence we may assume that C is algebraically closed, and that any
subgroup of G of finite index, definable over C, has a set of coset representatives in
$G(C) = \{a \in G : a \in \mathrm{dcl}(C)\}$. Then by assumption, E(a) is finite, of size k say. Replace
C and p by another set and type, in such a way that k is least possible.
Let
$K = \{a \in G(C) : \text{for some } b \in P,\ ab \in P, \text{ and } \mathrm{tp}(b/C \cup D) = \mathrm{tp}(ab/C \cup D)\}$.
Since P is the solution set of a complete type, one can equally well say for all $b \in P$
in the definition of K. It follows that K forms a subgroup of G. By assumption, for
$b \in P$, only finitely many $b' \in P$ have the same type over $C \cup D$ as b; hence Kb is
finite, so K is finite.
Note that if $a \in G$, and for some $b \in P$ with a, b independent over C, $\mathrm{tp}(b/C \cup D) = \mathrm{tp}(ab/C \cup D)$, then $a \in K$. This is because $ab \in E(b) \subset \mathrm{acl}(C \cup \{b\})$, so
$a \in \mathrm{acl}(C \cup \{b, ab\}) = \mathrm{acl}(C \cup \{b\})$; as a, b are C-independent, $a \in \mathrm{acl}(C) = C$; and hence
$a \in K$.
Claim. For a P, E(a) = Ka.
Proof. Ka E(a) by de'nition. Let a; d be independent elements of P over C,
a = db1 . Let a E(a); let c = a a1 ; we will show that c K. Note that a acl
(C {a; a }) = acl(C {a}); and that a; b are C-independent, by genericity of P.
We have
tp(a=C {b} D) = tp(a =C {b} D)
So
tp(a =(C D)) = tp((bcb1 a =(C D))
and hence bcb1 K. Now by Lemma 3.2.1, it follows that the normalizer N (K)
contains a subgroup of G of 'nite index.
Now P is divided into 'nitely many cosets of N (K); so there exists a translate gP of
P such that gP N (K) has rank equal to G. N (K) has a set of coset representatives in
G(C), so we may take g G(C). Let P = gP N (K). If a P , in particular a gP, so
every conjugate of a over C D is in Ka; hence the image aU of a in GU = (N (K)=K) has
no proper conjugates over C D. Since D is stably embedded, aU dcl(C D). Let PU
be the set of elements of GU with the same type over C as a.
U Then PU dcl(C D). By
U
U Thus every
U P)
U 4 = G.
Lemma 3.2.1, there exists a 'nite subset RU of G(C)
such that R(
element of GU is in dcl(C D).
However, we wanted this conclusion for a quotient of G itself. Let $G_1$ be the intersection of all conjugates of N(K) in G; since N(K) has finite index in G, so does
$G_1$. Let N be the intersection of all conjugates of K; since K is finite, it is a finite
intersection. By the assumption on C, we have
$N = \bigcap_{i=1,\ldots,m} g_i^{-1} K g_i$, with $g_1, \ldots, g_m \in G(C)$.
We have $(g_i^{-1} N(K) g_i)/(g_i^{-1} K g_i) \subset \mathrm{dcl}(C \cup D)$, so $G_1/(g_i^{-1} K g_i) \subset \mathrm{dcl}(C \cup D)$ for each
i, and it follows that $G_1/N \subset \mathrm{dcl}(C \cup D)$. But N is normal in G, and there exists in
(G/N)(C) a set of coset representatives for $G_1/N$; thus $G/N \subset \mathrm{dcl}(C \cup D)$.
Remark 3.3.5. In Lemma 3.3.4, one can find a 0-definable finite $N'$ normal in $G$, with
$G/N'$ $D$-internal.
Proof. Let $N' = \bigcap \{\sigma(N) : \sigma \in \mathrm{Aut}(U)\}$, where $U$ is the universal domain. Then $N' \subseteq N$,
so $N'$ is finite. It is $\mathrm{Aut}(U)$-invariant, hence is 0-definable. By finiteness,
$N' = \bigcap_{i=1,\ldots,m} \sigma_i(N)$ for some $\sigma_1, \ldots, \sigma_m \in \mathrm{Aut}(U)$. Then each $G/\sigma_i(N)$ is $D$-internal, and $G/N'$
embeds definably into the product of these groups.
Internalizing groups (General case): Recall that $a \sim_{CD} b$ iff $a, b$ have the same
type over $C \cup D$. Observe that $\sim_{CD}$ is $\infty$-definable: $a \sim_{CD} a'$ iff for every formula
5(x; y1 ; : : : ; yr ) over C, E 5 :
Let $C$ continue to range over sets of the type considered above. By Lemma 3.1.3,
there exists a (0-)$\infty$-definable normal subgroup $K$ of $G$, commensurable with all the
$\bar{K}(p, C)$, and such that a generic element of $K$ lies in some generic $\bar{K}(p, C)$ ($a$, $C$ are
independent). Moreover (using also the proof of the previous claim), if $a \in \bar{K}(p, C)$
for any $p$ and any generic $C$, then $a \in K$. It remains only to show that $G/K$ is
set of the same rank $k_0$. Since also $\mathrm{rk}(\sim_{CD}(a)) = k_0$, $\sim_{CD}(a)$ is contained in finitely
many cosets of $\bar{K}(p, C)$. Since $\bar{K}(p, C)$ and $K$ are commensurable, the same is true
for $K$.
Claim 6. If $a', b' \in P_j$ and $a' \sim_{CD} b'$, then there exist $a, b \in P$, $a \sim_{CD} b$, with $\pi_j(a) = a'$,
$\pi_j(b) = b'$. (And $a$ may be chosen arbitrarily, with $\pi_j(a) = a'$.)
Proof. Since $D$ is stably embedded, there exists an automorphism $\sigma$ of $U$ fixing $C \cup D$
with $\sigma(a') = b'$. Pick any $a \in P$ with $\pi_j(a) = a'$, and let $b = \sigma(a)$.
It follows from the last two claims that $\sim_{CD}$ has finite classes on $G_j$. By Lemma
3.3.4, there exists a finite normal subgroup $N$ of $G_j$ such that $G_j/N$ is $D$-internal,
and
for any $a \in P$ and $n \in N$, $\mathrm{tp}(na/C \cup D) = \mathrm{tp}(a/C \cup D)$.
I claim that $N = 1$. Let $n \in N$, and lift it to $\hat{n} \in G$. We must show that $\hat{n} \in K$. For
this it suffices to show that $\hat{n} \in \bar{K}(p, C)$ for generic $C$. This follows from Claim 6.
Thus $N = 1$, so $G_j$ is $D$-internal.
3.4. Stability and modularity
We introduce here one of our central notions, that of a locally modular group. In ordinary
difference fields of characteristic 0, these groups will be stable and stably embedded.
It is misleadingly easy to define modularity in group-theoretic terms, but the more
abstract point of view of stability is better suited to analyze the relation of such a
group to its environment (or of two such groups), a relation that need not a priori be
group-theoretic. We start with a property of independence in stable theories.
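Lemma 3.4.1. If $a, b$ are independent over $\mathrm{acl}(C \cup \{a\}) \cap \mathrm{acl}(C \cup \{b\})$, and $(a, b)$ is
independent from $C$ over $E \subseteq C$, then $a, b$ are independent over $\mathrm{acl}(E \cup \{a\}) \cap \mathrm{acl}(E \cup \{b\})$.
Proof. Let $E' = \mathrm{acl}(E \cup \{a\}) \cap \mathrm{acl}(E \cup \{b\})$, $C' = \mathrm{acl}(C \cup \{a\}) \cap \mathrm{acl}(C \cup \{b\})$. It suffices to show that $(a, b)$ is independent from $C'$ over $E'$. Note that $a$ is independent
from $C' \cup \{b\}$ over $E \cup \{b\}$, since $\mathrm{acl}(C' \cup \{b\}) = \mathrm{acl}(C \cup \{b\})$. Thus $(a, b)$ is independent from $C'$ over $E \cup \{b\}$. Similarly $(a, b)$ is independent from $C'$ over $E \cup \{a\}$.
It follows that the canonical base of $\mathrm{tp}(ab/C')$ is contained in $\mathrm{acl}(E \cup \{a\})$ and in
$\mathrm{acl}(E \cup \{b\})$, hence in their intersection $E'$.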
Definition 3.4.2. A theory $T$ is called 1-based if it is stable, and in any model $M$ of $T$,
any two algebraically closed substructures of $M^{eq}$ are independent over their intersection. A structure is 1-based if its theory is 1-based.
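In symbols, the condition of Definition 3.4.2 can be written as below. This is a notational sketch only: the symbol $\downarrow$ for independence (nonforking) is an assumption, since the definition states the condition in words.

```latex
% 1-basedness; \downarrow is an assumed symbol for independence.
\[
  A = \mathrm{acl}(A),\quad B = \mathrm{acl}(B),\quad A, B \subseteq M^{\mathrm{eq}}
  \quad\Longrightarrow\quad
  A \mathop{\downarrow}_{A \cap B} B .
\]
```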
An equivalent condition: A saturated stable structure $M$ is 1-based iff the lattice
of algebraically closed substructures of $M^{eq}$ (including imaginary elements) satisfies
Proof of Proposition 3.4.1. We will use the following characterization (cf. [5], lemmas
on stable embeddability).
$(\ast)$ $B$ is stable and stably embedded iff for any subset $X$ of the universal domain,
of size $\lambda = \lambda^{\aleph_0}$, there are at most $\lambda$ types of elements of $B$ over $X$.
If $(\ast)$ holds for $B$, then it certainly holds for $C$ and for every fiber of $g$, as they are
interpretable in the structure $B$ (with a named parameter). Conversely, assume $(\ast)$ holds
for $C$ and for every fiber of $g$. Then there are $\le \lambda$ possibilities for $\mathrm{tp}(c/X)$, with $c \in C$.
For any given $c \in C$, there are $\le \lambda$ possibilities for $\mathrm{tp}(a/cX)$, with $a \in g^{-1}(c)$. Thus
there are also at most $\lambda^2 = \lambda$ possibilities for $\mathrm{tp}(bc/X)$, $b \in B$, $c = g(b)$; equivalently,
$\le \lambda$ possibilities for $\mathrm{tp}(b/X)$.
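The counting step just made can be spelled out as follows. This is a sketch under stated assumptions: $\lambda$ denotes the infinite cardinal bound of the characterization, and $c_q$ (a name introduced here) is a fixed realization of a type $q$ of an element of $C$ over $X$; any $c$ realizing $q$ can be moved to $c_q$ by an automorphism fixing $X$.

```latex
% Counting types of pairs (b, c) with c = g(b); c_q is a hypothetical
% fixed realization of the type q.
\[
  \bigl|\{\mathrm{tp}(bc/X) : b \in B,\ c = g(b)\}\bigr|
  \;\le\; \sum_{q \,\in\, \{\mathrm{tp}(c/X) \,:\, c \in C\}}
     \bigl|\{\mathrm{tp}(a/c_q X) : a \in g^{-1}(c_q)\}\bigr|
  \;\le\; \lambda \cdot \lambda \;=\; \lambda .
\]
```

Since $c = g(b)$ lies in the definable closure of $b$, the type of $b$ over $X$ determines the type of $bc$ over $X$, which is why the two counts agree.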
Now for 1-basedness.
Note first (#): if a definable set $D$ is stably embedded in $B$ and 1-based, and $X \subseteq D$,
$Y \subseteq B$, then $X, Y$ are independent over $\mathrm{acl}(X) \cap \mathrm{acl}(Y)$.
Indeed with $Y' = \mathrm{acl}(Y) \cap D^{eq}$, we have $X, Y'$ independent over $(\mathrm{acl}(X) \cap D^{eq}) \cap Y'$,
while by stable embeddability we have $Y$ independent from $X \cup Y'$ over $Y'$, and transitivity applies.
More generally, in (#) we may take X acl(D) instead of X D; since then with
X = X Deq , we have acl(X ) = acl(X ).
In fact, (##): Suppose that for some set $F$ independent from $X$ and some
$F$-definable set $D$, $D$ is stably embedded and $X \subseteq \mathrm{acl}(D \cup F)$. Then for any $Y$,
X
$D$ always have a canonical base $e$ of definition with $\mathrm{tp}(e)$ $D$-internal; and there exists
a representative of the germ, defined over $e$. So if $t \in \mathrm{dcl}(X_1 \cup Y) \cap \mathrm{dcl}(D_2)$, write
$t = f(a_1, b)$, $a_1 \in X_1$, $b \in Y$, $r = \mathrm{tp}(a_1)$; then the $r$-germ $g$ of $f(\cdot, b)$ is in $\mathrm{dcl}(b)$ and has
a $D_2$-internal type; hence $g \in Y_2$. Let $F$ be a $g$-definable function agreeing with $f(\cdot, b)$
on generic realizations of $r$; then $F(a_1) = f(a_1, b) = t$, so $t \in \mathrm{dcl}(a_1, g) \subseteq \mathrm{dcl}(X_1 \cup Y_2)$.
Remark. A parallel but more formal proof of the same Claim, using algebraic closure
throughout, may be given along the lines of Proposition 3.4.1; see the proof of (2)
below.
Note that any formula $\varphi(u)$ implying $D_2(u)$ is defined over $D_2$ (by stable embeddedness of $D_2$); so if we accept the claim, and if $\varphi$ is also defined over $X_1 \cup Y$, then it
is defined over $X_1 \cup Y_2$. Thus $\mathrm{tp}(Y/X_1 \cup Y_2)$ implies $\mathrm{tp}(Y/X_1 \cup D_2)$, so
D2 :
X1 Y2
hence as X dcl(X1 X2 ),
Y X
Y:
X1 Y 2
(acl(X )acl(Y ))
but Y2 X so
For we may assume $(X, Y)$ is independent from $F$; so (#) is valid over $F$; and by
Proposition 3.4.1 it holds over $\emptyset$.
Moreover in the conclusion of (#) or (##), we may say $X, Y$ are independent over
$\mathrm{acl}(\mathrm{acl}(X) \cap \mathrm{acl}(Y) \cap D^{eq})$. For if $W = \mathrm{acl}(W) \subseteq \mathrm{acl}(D)$ then $W = \mathrm{acl}(W \cap D^{eq})$.
It follows that if $B$ is stable and 1-based, then so is every interpretable set $D$. Indeed
if $X, Y$ are relatively algebraically closed subsets of $D^{eq}$, then they are independent over
$\mathrm{acl}(X) \cap \mathrm{acl}(Y) \cap D^{eq}$.
Let us now prove (3): Let $X, Y \subseteq D_1 \cup D_2$ be relatively algebraically closed, where
$D_1, D_2$ are 1-based, stably embedded. By naming parameters, we may assume $X \cap Y \subseteq \mathrm{dcl}(\emptyset)$.
We wish to show that $X, Y$ are independent. Let $X_i = \mathrm{dcl}(X) \cap \mathrm{dcl}(D_i)$,
$Y_i = \{b \in Y^{eq} : \mathrm{tp}(b) \text{ is } D_i\text{-internal}\}$.
X1
Now to prove (2), assume $C$ and every fiber of $g$ are stable, stably embedded, and
1-based. Then every finite union of fibers of $g$ is also stably embedded, and 1-based.
We also already know that $B$ is stable. Let $X, Y$ be algebraically closed subsets of
$B^{eq}$. We must show that $X, Y$ are independent over $X \cap Y$. By naming parameters, we
may assume $X \cap Y \subseteq \mathrm{dcl}(\emptyset)$. We may assume $X = \mathrm{acl}(b)$, where $b = (b_1, \ldots, b_n)$; denote
$gb = (gb_1, \ldots, gb_n)$. Then $gb \in X_0 = X \cap C^{eq}$. By (#), $X_0, Y$ are independent. Let Y be
the canonical base of tp(b=Y ). Then b Y , so we may assume Y = acl(Y ). Now if
Y
Y1
$(b, b', \ldots)$ lies in a $(gb, gb', \ldots)$-definable 1-based stably embedded set (namely the union
of the fibers of $g$ above the elements $gb_i, gb'_i, \ldots$). Thus the hypothesis and hence the
conclusion of (##) apply to $Y$.
We prove the claim using the theory of germs of definable functions in a stable
theory; cf. [12]. In a stable theory, germs of definable functions into a definable set
X1
Y X
X1
is surjective.
Recall that two definable sets $p, q$ are orthogonal if for any base set $B$ over which
$p, q$ are defined, and any $a$ realizing $p$ and $b$ realizing $q$, $a, b$ are independent over
$B$. The term hereditarily orthogonal is sometimes used, but the distinction this reflects
will not be important for us. Orthogonality is inherited by powers. In a language to
be introduced later, the following lemma says that orthogonality of finite rank groups
implies complete orthogonality.
Lemma 3.4.9. Let $G_1, G_2$ be definable groups of finite rank. Suppose $G_1, G_2$ are
orthogonal, and at least one of them is stably embedded. Then every definable
$R \subseteq G_1 \times G_2$ is a finite union of rectangles $X_1 \times X_2$.
Proof. Say $G_2$ is stably embedded, and work over an algebraically closed base set
$B$. Let $R \subseteq G_1 \times G_2$ be $B$-definable. Let $(a_1, a_2) \in R$. Since $G_2$ is stably embedded,
$R(a_1) = \{y \in G_2 : (a_1, y) \in R\}$ is definable with a parameter from $G_2$. Taking a canonical
parameter $c$, we have $c \in \mathrm{acl}(B, a_1)$, hence by the hypothesis of the lemma, $c \in \mathrm{acl}(B) = B$.
Thus $R(a_1) = X_2$ is $B$-definable. Let $X_1 = \{a \in G_1 : R(a) = X_2\}$. Then $X_1$ is a
$B$-definable subset of $G_1$, and $(a_1, a_2) \in X_1 \times X_2 \subseteq R$. So $R$ is a union of $B$-definable rectangles. By compactness, it is a union of finitely many.
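The conclusion of Lemma 3.4.9 can be displayed as follows. This is a sketch; the superscript indexing $X_1^{(i)}, X_2^{(i)}$ and the number $n$ of rectangles are notation introduced here for illustration, and `\text` assumes the amsmath package.

```latex
% Finite rectangle decomposition obtained from the compactness step.
\[
  R \;=\; \bigcup_{i=1}^{n} X_1^{(i)} \times X_2^{(i)},
  \qquad X_1^{(i)} \subseteq G_1,\quad X_2^{(i)} \subseteq G_2
  \quad \text{definable over } B .
\]
```

The proof shows that every point of $R$ lies in some $B$-definable rectangle contained in $R$; compactness then selects finitely many such rectangles that already cover $R$.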
Remark 3.4.10. By the structure theory we are about to prove, for definable groups of
finite rank in ordinary difference fields of characteristic 0, the assumption that at least
one of $G_1, G_2$ is stably embedded is unnecessary. For if $G_i$ is not LMS, we will see
that $G_i$ has a definable subquotient of the form $H_i(k)$, $H_i$ an algebraic group over the
constants. But then $H_1, H_2$ are not orthogonal, and hence neither are $G_1, G_2$.
3.5. Algebraic modularity
In generalizations to two automorphisms or to positive characteristic, stability is lost,
and different proofs and definitions must be given. Initial steps in this direction were
taken in [8]. Here we will refer to such modularity only in passing, essentially as
a convenient summary of facts about the one-automorphism case. Thus we will not
develop here the theory of modularity in simple theories.
At the level of definitions, we will use the ambient Zariski topology. (We will
use little more about the ambient field than that every definable set is a Boolean
combination of closed ones.)
Let ZCl denote Zariski closure, and ZClk closure for the k-Zariski-topology. ZCl0 =
ZCl .
Definition 3.5.1. Let $A$ be a set of points of an algebraic variety $V'$, over an algebraically closed field $L$. Assume $A$ is Zariski dense in a subvariety $V$ of $V'$, defined
over $k \subseteq L$.
Let $A_{\mathrm{Zar}}$ be the structure whose universe is $V$, and whose basic relations are the
closed sets $\mathrm{ZCl}(X)$, $X \subseteq A^n$; i.e. those Zariski-closed sets $W \subseteq V^n$ such that $W \cap A^n$ is
Zariski dense in $W$.
Let $A_{\mathrm{Zar}_k}$ denote the structure whose universe is $V$, and whose basic relations are
the sets $\mathrm{ZCl}_k(X)$, $X \subseteq A^n$.
Definition 3.5.2. Let $A_{\mathrm{zar}}$ denote the structure whose universe is $A$, and whose basic
relations are the sets $U \cap A^n$, where $U$ is a subvariety of $V^n$, defined over the prime
field.
Lemma 3.5.3. Let $X \subseteq A^m$ be arbitrary. Then $\mathrm{ZCl}(X)$ is definable in $A_{\mathrm{Zar}_k}$, using
parameters from $A$. Conversely, there exists a countable $k_0 \subseteq L$ such that if $k_0 \subseteq k$,
then every basic relation of $A_{\mathrm{Zar}_k}$ is also definable in $A_{\mathrm{Zar}}$. Thus $A_{\mathrm{Zar}}$ has an essentially
(i.e. up to constants) countable language.
Note as a corollary that for any two sufficiently large fields $k', k''$, $A_{\mathrm{Zar}_{k'}}$ and $A_{\mathrm{Zar}_{k''}}$ differ only by
constants.
Proof of Lemma 3.5.3. Let $Y' \subseteq V^m$, $X = Y' \cap A^m$, $Y = \mathrm{ZCl}(X)$. Then as a variety $Y$ is
defined over $k(A)$, hence over $k(a_1, \ldots, a_l)$ for some $a_1, \ldots, a_l \in A$. Let $a = (a_1, \ldots, a_l)$,
and let $W = \mathrm{ZCl}_k(\{a\} \times X)$. Then $W$ is a subvariety of $V^{l+m}$, and $W(a) = Y$. By
definition of $A_{\mathrm{Zar}_k}$, $W$ is one of its basic relations; thus $Y$ is definable in $A_{\mathrm{Zar}_k}$, with
parameters from $A$.
If $k_0 \subseteq k$, the same proof shows that an $A_{\mathrm{Zar}_k}$-definable set is $A_{\mathrm{Zar}_{k_0}}$-definable, with
parameters. Thus it suffices to prove the converse for one countable $k$. Take $k$ to be
the universe of an elementary submodel of $(L, +, \cdot, A)$. If $U \subseteq V^m$ is a basic relation
of $A_{\mathrm{Zar}_k}$, let $X = U \cap A^m$, $Y = \mathrm{ZCl}(X)$, and let $W$ be as above, so that $Y = W(a)$ for
some $a$. But for any $a' \in k^l$, if $U(k) \cap A^m \subseteq W(a') \subseteq U$ then $U = W(a')$ (here $U(k)$ is
the set of $k$-points of $U$). By elementarity, for any $a' \in L^l$, if $U(L) \cap A^m \subseteq W(a') \subseteq U$
then $U = W(a')$. So $U = W(a) = Y$. Thus every relation of $A_{\mathrm{Zar}_k}$ is also one of $A_{\mathrm{Zar}}$;
we have already seen the other direction.
Lemma 3.5.4. Let $U$ be any subvariety of $V^m$. Then $U \cap A^m$ is definable in $A_{\mathrm{zar}}$ (with
parameters).
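So $\mathrm{tr.deg}_K\, K(b, \sigma(b), \ldots, \sigma^{i-1}(b)) = \mathrm{tr.deg}_K\, K(b, \sigma(b), \ldots, \sigma^{i}(b))$ for some $i \le k$, and
$\sigma^{i}(b) \in K(b, \sigma(b), \ldots, \sigma^{i-1}(b))^{a}.$
Applying $\sigma$, transitivity of algebraic closure, and induction, we have
$\sigma^{j}(b) \in K(b, \sigma(b), \ldots, \sigma^{i-1}(b))^{a}$ for all $j \ge i$.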
06i6d
d
on V ), Z Z is 'nite-to-one, so dim(Z)6dim(Z)
= d. Let V (Z) be the projection
So V (Z) is a constructible set, and dim V (Z)6dim Z = d. Since E is
to H of Z.
by
$f_m(y_1, \ldots, y_n) = \sum_i m_i f(y_i).$
(By applying $\pi$, one sees that $f_m$ indeed goes into $A$.) Then $f$ is affine-homomorphic
iff $f_m = 0$, for all $m$.
Say that $f$ is almost homomorphic if $f_m$ has finite image, for all $m$ as above.
Equivalently (in a universal domain), there exists a countable subgroup $\Gamma$ of $G$, such
that the composition $Y \to G \to (G/\Gamma)$ extends to an affine homomorphism $Y \to (G/\Gamma)$.
If $f$ is an almost homomorphic section, we will also say that the image $S$ of $f$ is
an approximately homomorphic section. Note that $S$ determines $f$.
If $\Gamma$ can be taken to be a finite subgroup of $G$, we will call $S$ a virtually homomorphic section. Any definable subset of $A$, or of a coset of $A$, is also a definable
subset of $G$.
One can take sums of subsets of $A$, and approximately homomorphic sections of
the previous types. One can also do this modulo a subgroup of $A$. We thus arrive
at the following definition: If $N$ is a definable subgroup of $A$, we have an induced
definable exact sequence, $0 \to (A/N) \to (G/N) \to B \to 0$. If $T$ is a definable subset of $A/N$, and
$S \subseteq (G/N)$ an approximately (virtually) homomorphic section, then the pullback to $G$
of $S + T$ is called an approximate (virtual) rectangle.
All the required properties of $X$ hold, except that we must still show that $K$ is a
subgroup of finite index of $H \cap A$ (i.e. with our assumption $K = 0$, that $H \cap A$ is finite).
Note that each element of $J$ is an alternating sum of $2n$ elements of $B_1$. Let $R \subseteq J \times G$
be the relation:
$R(c, b)$ iff there are $a_0, \ldots, a_{2n-1} \in B_1$ with $c = \sum_{0 \le i \le 2n-1} (-1)^i a_i$ and $b = \sum_{0 \le i \le 2n-1} (-1)^i f(a_i)$.
Then for each $c \in J$ there are finitely many $b \in G$ with $R(c, b)$.
Let $W$ be the subgroup of $A$ generated by $\bigcup_m f_m(C_m)$. Then $W$ is a countable
group. $f$ extends to a well defined map $\bar{f} : J \to G/W$, namely
$\bar{f}\bigl(\sum (-1)^i a_i\bigr) = \sum (-1)^i f(a_i).$
Moreover, $\bar{f}$ is a group homomorphism.
Let $E = (\bar{f})^{-1}(((H \cap A) + W)/W)$. $\bar{f}$ induces an isomorphism between $E/\ker(\bar{f})$
and $((H \cap A) + W)/W$. The graph of this isomorphism is the image of the definable
relation $R$, after factoring out the possibly nondefinable subgroup $W$. The orthogonality
condition (ii) applied to $R$ shows immediately that $E/\ker(\bar{f})$ and $((H \cap A) + W)/W$
must be finite. Hence $(H \cap A)/K$ is at most countable, so being definable it is finite,
as required.
We include also the analog of the socle lemma, Proposition 4.3 of [14]. Here we
use stabilizers in the sense of theories of finite S1-rank [5]. This lemma will not be
used in our application.
Proposition 3.6.2. Let $G$ be a definable Abelian group of finite S1-rank, in some
structure. Let $A$ be a definable subgroup, $X$ a definable subset of $G$. Assume:
(i) Every $\mathrm{acl}(G/A)$-definable subgroup of $A$ is commensurable to an $\mathrm{acl}(0)$-definable
subgroup.
(ii) $G$ has no definable subgroup $A'$ containing $A$ with $A'/A$ infinite and $A' \subseteq \mathrm{acl}(Y, A, C)$
for some rank-one $Y$ and finite $C$.
(iii) For any complete type $X' \subseteq X$ over an algebraically closed set, $\mathrm{Stab}(X') \cap A$ is
finite.
Then $X$ is contained in finitely many cosets of $A$, up to a set of smaller rank.
Proof. Using compactness, we may replace $X$ by a complete type over an algebraically
closed base set $C$; we must show that $X$ is contained in a single coset of $A$. Let
$X/A = \{x + A : x \in X\}$. For $b \in (X/A)$, let $A(b)$ be $b$ viewed as a coset of $A$. Let $b$ be
an element of $X/A$. Then $X(b) = X \cap A(b) \ne \emptyset$. As $X$ is the solution set of $\mathrm{tp}(a/C)$ for
$a \in X(b)$, and $b \in \mathrm{dcl}(Ca)$, $X(b)$ is the solution set of a complete type over $C \cup \{b\}$.
Let $X_1$ be the solution set of some type over $\mathrm{acl}(Cb)$, extending this type.
We may factor out the bounded, infinitely definable (= pro-finite) group of $S$, the
kernel of the projection to $G$. Then we get a well-defined map $s : K \to H$, with graph
$S$; it has bounded kernel.
Now recall that $H$ is a projective limit of a definable projective system $(H_i, h_{ij})$
of definable groups and maps in $U$; we have $h_i : H \to H_i$. Then by compactness, for
some $i$, $h_i \circ s$ has bounded, i.e. finite, kernel.
Remark. (1) The assumption of finite S1-rank is used only to make use of the theory
of group generics and stabilizers. The proof is thus valid in any context in which these
are available.
(2) In a finite S1-rank theory, an $\infty$-definable group is an intersection of definable
groups (unpublished preprint On PAC and related structures). Using this and some
cosmetics, the conclusion of the proposition can be improved to: every U -definable
group admits a U -definable homomorphism into a U-definable group, with finite
kernel.
Lemma 4.0.1. Let $G$ be a definable group, defined over a finite or countable set
$C$, and suppose any two elements of $G$ have distinct types over $C \cup k$, $k = \mathrm{Fix}(\sigma)$.
Then there exists a definable homomorphism $F : G \to H(k)$, $H$ a $k$-algebraic group, $F$
injective, with $FG$ of finite index in $H(k)$.
Proof. By Chatzidakis and Hrushovski [5], every type over $k$ is definable ($k$ is stably
embedded). Hence any element of $G$ is in $\mathrm{dcl}(C \cup k)$. By compactness, there is a finite
definable partition of $G$ into subsets $G_i$, and definable surjective maps $f_i : R_i \to G_i$, with
$R_i$ a definable subset of $k^{n_i}$. Now the relations induced on the $R_i$ by pulling back the
graph of multiplication are definable over $k$; this uses again the stable embeddedness
of $k$. Putting the pieces back together, it follows that $G$ is definably isomorphic to
a definable group $G'$ over $k$. Now every definable group over $k$ has the stated form,
by Hrushovski and Pillay [15].
Lemma 4.0.2. Let $G$ be a definable group, defined over a set $C_0$. Suppose $C_0 \subseteq C$, $C$
a finite or countable set, and every element of $G$ is algebraic over $C \cup k$. Then there
exists a definable homomorphism $F : G \to H(k)$, $H$ a $k$-algebraic group, $\ker F$ finite,
with $FG$ of finite index in $H(k)$.
Proof. By Corollary 3.3.6, $G$ has a finite normal subgroup $K$ with $G/K$ internal to $k$.
By Lemma 4.0.1, $G/K$ is of the required form.
Remark 4.0.3. In Lemma 4.0.2, one can take $\ker F$ to be $C_0$-definable (but not necessarily $F$ itself).
Proof. Let $F : G \to H(k)$ be as in the conclusion of Lemma 4.0.2. Let $K = \ker(F)$. Let
$K' = \bigcap \{\tau(K) : \tau \in \mathrm{Aut}(U/C_0)\}$, where $U$ is the universal domain. Then $K' \subseteq K$, so $K'$
is finite. It is $\mathrm{Aut}(U/C_0)$-invariant, hence is $C_0$-definable. By finiteness, $K' = \bigcap_{i=1,\ldots,m} \tau_i(K)$
for some $\tau_1, \ldots, \tau_m \in \mathrm{Aut}(U/C)$. Let $h_i : G \to H_i(k)$ be the conjugate of $h : G$
n
Lemma 4.1.5. Suppose A is a simple Abelian variety; and for all n; A and A
are not
isogenous Abelian varieties. Then every
-de4nable subgroup of A is commensurable
to a Zariski closed subgroup.
Proof. Immediate from Lemma 4.0.3; since every Zariski closed subgroup S of
A
n A is commensurable to a product of Zariski closed subgroups of the
i A.
Lemma 4.1.6. Suppose Ai is a simple Abelian variety; with no Ai isogenous to
k Aj
unless i = j. Then every
-de4nable subgroup of Bi Ai is commensurable to a product
of
-de4nable subgroups of Ai .
Proof. Similar to the previous Lemma 4.1.5.
Lemma 4.1.7. Let A; E be Abelian varieties; A simple; E isogenous to An ; and let B be
a de4nable subgroup of E. Then there exist de4nable homomorphisms Fj : E
m( j) A
such that B is a subgroup of 4nite index of i=1;:::;l Ker(Fi ). For each j; Fj has the
k
m( j)
such that
where
for i
= i and j; j 0; (Ai )
(A ) . (Thus the Ai are representatives for the equivalence relation generated by isogeny and
-conjugacy; and the ni is the multiplicity
of Ai in A for this notion.) Then
E (A)
Mni E (Ai ):
i
16i6m
Proof. This follows easily from Lemma 4.1.6; it will not be used, except as motivation
for considering the case of A simple.
the Claim, hence can be viewed as an element of E (A); thus
m( j) Fj = sj for some
sj E (A), and Ker(Fj ) = Ker(sj ). We conclude.
Claim 2. Every definable subgroup of $A^p$ is commensurable with one defined by a
finite number of $E_\sigma(A)$-linear equations.
Claim 3. An element of $E_\sigma(A)$ is a unit in $\mathbb{Q} \otimes E_\sigma(A)$ iff it has finite kernel.
Proof. Suppose $g \in E_\sigma(A)$ is not invertible in $\mathbb{Q} \otimes E_\sigma(A)$. Then it is not a homogeneous
element. So after multiplying by a power of $\tau$, it can be written as $\sum_{i=0}^{p} a_i \tau^i$, with $a_0 \ne 0$
and $a_p \ne 0$, $p > 0$. Opening up the definition of $\tau$ and multiplying out, we can also write
$g = \sum_{i=0}^{p} b_i \sigma^i$, where $b_i$ is a definable homomorphism from $A^{\sigma^i}$ to $A$, and $b_0 \ne 0$, $b_p \ne 0$.
Let $C$ be the principal component of the subgroup of $A \times A^{\sigma} \times \cdots \times A^{\sigma^p}$ defined by $\sum b_i x_i = 0$,
and let $S = \{((x_0, \ldots, x_p), (y_0, \ldots, y_p)) \in C \times C^{\sigma} : x_1 = y_0,\ x_2 = y_1,\ \ldots,\ x_p = y_{p-1}\}$. Then
$S$ projects onto $C$ and onto $C^{\sigma}$, using the surjectiveness of $b_0$ and $b_p$. Thus by the
axioms of model completeness, Lemma 2.2.1, there are infinitely many $x \in C$ with
$(x, \sigma x) \in S$. This implies that $\mathrm{Ker}(g)$ is infinite. The other direction is trivial, since
$\ker[n]$ is finite for all $n$.
The same argument shows that the map from $E(A)_\tau$ of Definition 4.1.3 to $\mathbb{Q} \otimes E_\sigma(A)$
is injective, hence an isomorphism. By Claim 2 with $p = 1$, every definable subgroup
$B$ of $A$ is commensurable with one of the form $\{a \in A : s_i a = 0,\ i = 1, \ldots, q\}$. However,
by Lemma 4.1.4, the left ideal of $\mathbb{Q} \otimes E_\sigma(A)$ generated by $\{s_1, \ldots, s_q\}$ is generated by
a single element $s$. We may replace $s$ by $Ms$ with $Ms \in E_\sigma(A)$; then $B$ is commensurable with $\mathrm{Ker}(Ms)$. This
proves (4).
Claim 4. Every definable endomorphism of $A$ is in $E_\sigma(A)$.
Proof. Let $e$ be a definable endomorphism of $A$, and let $E$ be the graph of $e$, a subgroup
of $A^2$. Consider
$I_2 = \{(f, g) \in E_\sigma(A)^2 : f(x) + g(y) = 0 \text{ for any } (x, y) \in E\}.$
This is a submodule of $E_\sigma(A)^2$. By Claim 2, $E$ is commensurable with $\bigcap_j \{(x, y) : f_j(x) + g_j(y) = 0\}$ for
certain $(f_j, g_j) \in I_2$. Let $I$ be the second projection of $I_2$, an ideal of $E_\sigma(A)$. We claim
that $\mathbb{Q} \otimes I$ is the unit ideal of $\mathbb{Q} \otimes E_\sigma(A)$. Otherwise, $\mathbb{Q} \otimes I$ is generated by some $g$
which is not a unit, hence by Claim 3 has infinite kernel $K$. Each $g_j$ is a multiple of $g$ in
$\mathbb{Q} \otimes E_\sigma(A)$. If $a \in K$, then $g_j(a) = 0$ for each $j$, so a subgroup of finite index of $(0) \times K$
is contained in $E$. This contradicts the fact that $E$ is the graph of a homomorphism.
Thus $1 \in \mathbb{Q} \otimes I$, so $M \in I$ for some integer $M$. Thus for some $f \in E_\sigma(A)$, $f(x) + My = 0$
for all $(x, y) \in E$. So $Me(x) = -f(x)$, hence $Me \in E_\sigma(A)$, and so $e \in E_\sigma(A)$.
Condition (3) follows from Claim 4.
Claim 5. Let $f, g \in E_\sigma(A)$. Then $\mathrm{Ker}(f) \le \mathrm{Ker}(g)$ iff for some $h \in \mathbb{Q} \otimes E_\sigma(A)$,
$g = hf$.
Proof. One direction is evident. For the other, we may assume upon multiplying by
an integer that $\mathrm{Ker}(f) \subseteq \mathrm{Ker}(g)$. Thus one may define an endomorphism $h$ of $A$ by
$g(x) = h(f(x))$. By Claim 4, $h \in E_\sigma(A)$.
Proof of (5) and (6). The inclusion ordering on definable subgroups of $A$, up to commensurability, is now known to be isomorphic to the divisibility ordering on elements
of $\mathbb{Q} \otimes E_\sigma(A)$. Thus (5) is immediate. For (6), suppose $f$ is a definable endomorphism.
Then $\mathrm{Im}(f)$ is commensurable with $\mathrm{Ker}(g)$ for some $g$. So $ngf = 0$ for some $n \ne 0$. But the ring $E_\sigma(A)$ has
no zero-divisors, so $f = 0$ or $g = 0$. In the latter case, $f$ is surjective.
Remark 4.1.10. A semi-Abelian variety $S$ has only countably many definable subgroups.
Proof. By Proposition 4.0.3, any subgroup of $S$ is commensurable to one determined
by an algebraic subgroup of $S \times S^{\sigma} \times \cdots \times S^{\sigma^n}$ for some $n$. It is well known that there
are at most countably many algebraic subgroups of a semi-Abelian variety (any such
subgroup is the Zariski closure of the torsion points within it). Thus there are countably
many definable subgroups up to commensurability. If $H$ is a definable subgroup of $S$,
for any $n$, the map $x \mapsto nx$ has finite kernel on $H$; so $nH$ has the same rank as $H$; so
$[H : nH]$ is finite; thus $H$ has finitely many subgroups of index $n$. Similarly, $S/H$ has
only finitely many torsion points of order $n$, so $H$ has only finitely many supergroups
of index $n$. Thus there are countably many definable subgroups altogether.
Lemma 4.1.11. Let $A$ be a simple Abelian variety.
(a) Let $D_\sigma(A)$ be the ring of definable homomorphisms $A \to A/T$, where $T$ is a
definable subgroup of $A$ of finite rank. One identifies $h$ with $\pi h$ if $h : A \to A/T$, $T \subseteq T'$,
and $\pi : A/T \to A/T'$ is the natural projection. Then with the natural addition and
multiplication, $D_\sigma(A)$ is a division ring. $E_\sigma(A)$ embeds into $D_\sigma(A)$. Every element of
$D_\sigma(A)$ can be written as $fg^{-1}$ with $f, g \in E_\sigma(A)$ (or alternatively as $f^{-1}g$).
(b) $E_\sigma(A)$ is an Ore ring: for any $f, g \in E_\sigma(A) \setminus (0)$, for some $u, v \in E_\sigma(A) \setminus (0)$,
$gu = fv$.
Proof. The fact that $D_\sigma(A)$ is a division ring is immediate, by defining inverses. The
fact that every element of $D_\sigma(A)$ is a quotient of elements of $E_\sigma(A)$ can be shown as
in Claim 4 of Definition 4.1.2. Condition (b) follows formally.
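The step "Condition (b) follows formally" can be sketched as below. The names $E$ and $D$ are shorthand introduced here for the ring of part (b) and the division ring of part (a); the sketch assumes only that $E$ embeds in $D$ and that every element of $D$ is a right fraction $vu^{-1}$ of elements of $E$. `\text` assumes the amsmath package.

```latex
% Ore condition from the existence of fractions (f, g nonzero in E).
\begin{align*}
  f^{-1} g &= v u^{-1}
     && \text{for some } u, v \in E,\ u \neq 0, \text{ by part (a);}\\
  g u &= f v
     && \text{multiplying by } f \text{ on the left and by } u \text{ on the right.}
\end{align*}
```

Here $v \neq 0$ as well, since $f^{-1}g \neq 0$ when $g \neq 0$.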
Lemma 4.1.12. Let $A$ be an Abelian variety, $B$ a definable subgroup of finite rank.
Then $B$ is LMS iff every c-minimal definable subgroup of $gB$ is LMS, for any
$g \in E_\sigma(A)$. If $B = \mathrm{Ker}(f)$, $f = f_1 \cdots f_r$, $f_i \in E_\sigma(A)$ irreducible, then $B$ is LMS iff
$\mathrm{Ker}(f_i)$ is LMS for each $i$.
Proof. This reduces easily to the case of simple $A$. Let $E_\sigma(A)$ be the ring of definable
endomorphisms of $A$, tensor $\mathbb{Q}$. By Proposition 4.1.1 we have $B$ commensurable with $\mathrm{Ker}(f)$ for some
$f \in E_\sigma(A)$. If $f$ is a unit there is nothing to prove. Otherwise we use induction (say
Noetherian induction on the left ideal generated by $f$, or on the rank of $B$). If $f = gh$,
with $h$ not a unit, then $\mathrm{Ker}(h) \subseteq \mathrm{Ker}(f)$. If $hB \subseteq B$, we are done using the exact
sequence
$0 \to \mathrm{Ker}(h) \to B \to hB \to 0.$
Otherwise we have the definable exact sequence
$0 \to \mathrm{Ker}(h) \to B \cap h^{-1}B \to B \to gB \to 0.$
In this sequence, the map $\mathrm{Ker}(h) \to B \cap h^{-1}B$ is the inclusion. The next map is given
by $h$. The next is given by $g$. One has $gh = 0$ on $B$. One must also verify that if $x \in B$,
$g(x) = 0$, then $x = h(y)$ for some $y \in B$. By Proposition 4.1.1(6), we have $x = h(y)$ for
some $y$; and $f(y) = gh(y) = g(x) = 0$, so $y \in B$. Thus the sequence is indeed exact.
Now $g$ annihilates $\mathrm{Ker}\, h$, so $gB$ has smaller dimension than $B$. On the other hand
$hB$ has smaller dimension than $B$ since $0 \ne \mathrm{Ker}\, h \subseteq B$, and we assumed that $hB \not\subseteq B$,
so $\dim(B \cap h^{-1}B) < \dim(B)$. In either case we are done by Proposition 3.4.1 and
induction.
Proposition 4.1.2 (Structure of c-minimal subgroups). Let $A$ be a simple Abelian
variety, $B$ a c-minimal definable subgroup of $A$ (up to commensurability).
(a) Precisely one of the following cases occurs:
(i) $B = A$.
(ii) $B$ is definably isomorphic to a subgroup of finite index of $H(k)$, $k = \mathrm{Fix}(\sigma)$,
$H$ a $k$-algebraic group.
(iii) $B$ is LMS, of $U$-rank one.
(b) Case (i) occurs iff $A$ is not isogenous to $A^{\sigma^n}$ for any $n > 0$. Case (iii) occurs if
$A$ is isogenous to $A^{\sigma^n}$ for some $n > 0$, but not isomorphic to an Abelian variety
$A'$ defined over $\mathrm{Fix}(\sigma^n)$ for some $n > 0$.
Thus we may assume $A$ is defined over $\mathrm{Fix}(\sigma^n)$.
(c) Suppose $A$ is defined over $\mathrm{Fix}(\sigma^n)$. Then $B$ is not LMS if and only if $B \subseteq \mathrm{Ker}(\sigma^N - 1)$
for some $N$ (with $n \mid N$).
(d) Suppose $B$ is a c-minimal definable subgroup of the multiplicative group $G_m$.
Then (c) holds: $B$ is not LMS if and only if $B \subseteq \mathrm{Ker}(\sigma^n - 1)$ for some $n$.
Proof. If $A$ is not isogenous to $A^{\sigma^n}$ for any $n$, we have already shown that $A$ has no
proper definable subgroups. Conversely, if $A$ is isogenous to $A^{\sigma^n}$, say via an isogeny $f$,
then one cannot have $B = A$ since for example $\{a : \sigma^n(a) = f(a)\}$ is a smaller subgroup;
and it is clear that any $B$ must have finite rank. Suppose from now on that indeed $B$
has finite rank. If $B$ is LMS, then it has $U$-rank one since every infinite LMS group
has a rank one definable subgroup, and $B$ is c-minimal. Suppose $B$ is not LMS; we
must show that $A$ is isomorphic to an Abelian variety $A'$ defined over $\mathrm{Fix}(\sigma^n)$ for some
$n$, and that (c) holds. Choose a base $C$ such that $A, B$ are defined over $C$, and there is
a minimal type $X \subseteq B$, also over $C$. By Lemma 3.2.2, $X$ generates (in boundedly many
steps) a coset of an infinitely-definable subgroup $B'$ of $B$, and $B'$ is the intersection
0 L A 4A A 0
With L a vector group; with the following universal property: For any extension
0 L G A 0
there exists a unique homomorphism h from the A exact sequence to the G exact
sequence; above the identity on A. We have h = (hL ; h; IdA ); and G = hL (G) in the
sense of De4nition 4:2:3.
Proof (Serre [28]). We take L = H 1 (A; OA ) . Given any e L ; we form e(G ); and
obtain an element of H 1 (A; OA ). This describes a map L H 1 (A; OA ); whose dual is
hL . One shows that hL (G) G . Uniqueness of h can be seen by applying Lemma 4.2.5
below: the graph of h is the unique minimal subgroup of {(x; y) G G : 4(x) = 4(y)}
projecting onto the diagonal of A.
The map A A can be made into a functor; given a homomorphism of Abelian
varieties
e : A B;
Lemma 4.2.2. Every finite rank definable subgroup of a vector group $G_a^m$ is non-orthogonal to the fixed field $k$, and indeed is a finite-dimensional $k$-space. A finite
rank definable group is a vector group iff it embeds into an algebraic vector group.
e : A B:
Let
G = {(g; g ) A B : e(4A (g)) = 4B (g )}:
Let L = (Ker(4A ) Ker(4B )) G . Then 0 L G 4A pr1 A 0 is exact. The
universal property of Ga gives a map h : A G . We let e = pr2 h.
We can also describe e using the following lemma.
Lemma 4.2.5. Let $0 \to L \to G \xrightarrow{\pi} A \to 0$ be an exact sequence of algebraic groups,
with $L$ linear, and $A$ an Abelian variety. Then there exists a unique minimal $H \le G$
with $\pi(H) = A$.
Proof. $\pi(H) = A$ iff $G/H$ is a linear group. If $H_1, H_2$ have this property, so does their
intersection $H_1 \cap H_2$.
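The closure under intersection used in this proof can be made explicit as follows. This is a sketch, assuming $H_1$ and $H_2$ are normal algebraic subgroups of $G$ so that the quotients are algebraic groups; it uses the standard fact that a closed subgroup of a linear group is linear.

```latex
% G/(H_1 \cap H_2) embeds in a product of two linear groups.
\[
  G/(H_1 \cap H_2) \;\hookrightarrow\; G/H_1 \times G/H_2,
  \qquad x\,(H_1 \cap H_2) \;\longmapsto\; (xH_1,\; xH_2).
\]
```

The map is well defined and injective because its kernel in $G$ is exactly $H_1 \cap H_2$. Since algebraic subgroups satisfy the descending chain condition, a minimal subgroup with the property exists, and closure under intersection makes it unique.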
Then the graph of e is the unique minimal subgroup of A B projecting onto the
k
k
graph of e. In particular, for any algebraic homomorphism A A
; we get e : A A
.
Using Proposition 4.1.1, we obtain a homomorphism from the de'nable endo i
i
we map h =
morphisms of A to the de'nable endomorphisms of A:
ei
to h =
ei
.
Finally, if N = Ker(h); we let N = Ker(h). It is easy to see that this is well de'ned;
and that 4(N ) = N .
The following proposition shows that every de'nable subgroup of A is the pullback
of a subgroup of the vector group ker(4 : A A); under a certain homomorphism,
whose kernel is one of the canonical groups N ; N a subgroup of A.
Proposition 4.2.1. Let A be an Abelian variety; 4 : A A the maximal vector extension A. Let N = Ker(h) be a de4nable subgroup of A of 4nite rank; and let N ; h be
Then:
the corresponding subgroup and endomorphism of A.
(1) N has 4nite rank. There is an exact sequence
0 (L N ) N N 0
whose kernel L N is a de4nable vector group.
(2) There exists a de4nable exact sequence
0 N 41 (N ) ker(4) 0
(3) 4(N ) projects onto N; and is minimal in the following sense: Let M be any
de4nable subgroup of G; projecting onto a 4nite index subgroup of N . Then M
contains a subgroup of N of 4nite index.
Proof. (1) Is clear.
(2) The map h takes 41 (N ) to 41 (h(N )) = 41 (0) = ker(4). The kernel of this
map, by de'nition, is N .
(3) Left to the reader; uses Lemma 4.2.5.
(x); : : : ;
m (x))).
The kernels of the maps f are (up to commensurability) the de'nable subgroups of
the groups A, . We leave the remaining details to the reader.
In this subsection, we work in a universal domain $U$ for the theory of fields with $r$
automorphisms, $\sigma_1, \ldots, \sigma_r$. $F$ denotes the free group generated by $\sigma_1, \ldots, \sigma_r$. If $\tau \in F$,
we denote by $(U, \tau)$ the structure consisting of the underlying field of $U$, and the automorphism $\tau$. It is a universal domain for ordinary difference fields. Write $U_i = (U, \sigma_i)$.
Recall the action of $F$ on difference equations. We obtain an induced action on the
class of all definable sets, that we denote: $B \mapsto B^{\tau}$. Also write $B^F = \bigcap_{\tau \in F} B^{\tau}$.
In the ordinary case one could indifferently study definable subgroups, or infinitely
definable ones; the latter were connected components of definable subgroups. For $r > 1$
the situation is different; one must allow infinite intersections of definable subgroups in
order to obtain a group of finite rank. It remains true (with the same proof) that every
definable group maps into an algebraic one, with finite kernel. Finite rank subgroups
of the additive group are vector spaces over the common fixed field $k$ of all the
automorphisms. Minimal definable groups thus live (up to isogeny) in simple Abelian
varieties, or in $G_m$, as before.
Finite transformal degree: Consider a simple Abelian variety (or torus) $G$ defined
over $k$. Let $E = \mathbb{Q} \otimes \mathrm{End}(G)$. The twisted group ring $E[F]$ is defined in the obvious
way, taking into account the action of the $\sigma_i$ on $\mathrm{End}(G)$. (Of course this ring is no
longer Euclidean.) Let $A$ be an $\infty$-definable subgroup of $G$, of finite transformal degree.
Associate to $A$ a left ideal and a two-sided ideal of $E[F]$:
$I_0(A) = \{r : rA \text{ is finite}\}$,
$I(A) = \{r : rsA \text{ is finite for all } s \in E[F]\}$,
$R(A) = E[F]/I(A)$.
Lemma 4.5.1.
$\dim_E E[F]/I_0(A) \le \dim(A)$.
$\dim_{\mathbb{Q}} E[F]/I(A) \le (\dim(A)\dim_{\mathbb{Q}} E)^2$.
There exists an $\infty$-definable subgroup $B$ of $G$, of finite transformal degree, containing $A$, such that $I_0(B) = I(B) = I(A)$ and $R(A) = E[F]/I(B)$.
Proof. Let $d = \dim(A)$. Note first that
$\dim_E E[F]/I_0(A) \le \dim(A)$.
To prove this it suffices to find an $E$-dependence relation among any $d + 1$ elements $h_0, \ldots, h_d \in E[F]$. Let $h(x) = (h_0(x), \ldots, h_d(x))$ and consider the subgroup $h(A)$
of $G^{d+1}$. This is a definable subgroup of transformal degree at most $d$. Thus the Zariski
closure has dimension at most $d$. Using the simplicity of $G$ one obtains an $E$-linear
dependence relation, as in the case $r = 1$.
Thus $\dim_{\mathbb{Q}} E[F]/I_0(A) \le \dim_{\mathbb{Q}}(E)\dim(A)$.
Let $r_1, \ldots, r_n$ be a $\mathbb{Q}$-basis for $E[F]/I_0(A)$. Let $B = \sum_{i=1}^{n} r_i(A)$. If $r \in E[F]$, then
$mr = \sum a_i r_i + s$ with $s(A)$ finite, $m \neq 0$ and $m, a_1, \ldots, a_n \in \mathbb{Z}$. So $mr(A) \subseteq B + s(A)$, i.e.
$r(A) \cap B$ has finite index. Thus $r(B) \cap B$ has finite index for any $r \in E[F]$. Now clearly
$I_0(B) = I(B) = I(A)$.
We restrict ourselves to pointing out one class of ALM groups. It consists of groups
$A^F$, where $A$ is a definable LMS group in some ordinary reduct $U_i$, and of their products. The proof that $A^F$ is ALM is the same as that given in the previous paragraph;
but the proof that the property holds for products is different, and may be of use in
other situations.
Proposition 4.5.2. Let $G_i$ be a commutative algebraic group over $U$. Let $k = \mathrm{Fix}(F)$.
Let $A_i$ be a $U_i$-definable subgroup, LMS of finite dimension as such, defined over $k$.
Then $(A_1)^F \times \cdots \times (A_r)^F$ is algebraically modular.
For notational simplicity, we prove Proposition 4.5.2 in case $r = 2$. But in this
case we formulate a sharper statement, paying attention to the number of conjugates
used. Assume $r = 2$, and write $\sigma = \sigma_1$, $\tau = \sigma_2$. Then Proposition 4.5.2 follows from
Lemma 4.5.3 (keeping in mind the proof of 3.5.6(3)).
Lemma 4.5.3. Let $G_i$ be a commutative algebraic group over $U$ ($i = 1, 2$), $G_i$ defined
over $\mathrm{Fix}(\sigma_j)$ ($\{i, j\} = \{1, 2\}$), and $V$ a constructible subset of $G_1 \times G_2$. Let $A_i$ be a
$U_i$-definable subgroup of $G_i$, LMS of finite transformal degree $d_i$ as such.
Let $E_2 = \bigcap_{n=-d_1}^{d_1} A_2^{\sigma^n}$, $E_1 = \bigcap_{n=0}^{(2d_1+1)d_2} A_1^{\tau^n}$.
Then the Zariski closure of $(E_1 \times E_2) \cap V$ is a finite union of cosets of group
subvarieties.
Proof. Assume the data are defined over an algebraically closed difference field $K$. For
$b \in G_2$, let $V(b) = \{a \in G_1 : (a, b) \in V\}$. Let $Z(b) = \mathrm{ZCl}(V(b) \cap A_1)$. As $A_1$ is ALM,
$A_1 \cap V(b)$ is a finite union of cosets of $U_1$-definable subgroups. By Hrushovski and
Pillay [15], in a 1-based group, there are no infinite definable families of definable subgroups, so as $b$ varies only finitely many distinct subgroups arise. Thus also upon taking Zariski closure, there exist $K$-algebraic subgroups $H_1, \ldots, H_l \subseteq G_1$, such that for any
$b \in G_2$, $Z(b)$ is a finite union of cosets of the $H_i$. By Proposition 2.2.1, each such coset is
definable over $K(b^{\sigma^{-d_1}}, \ldots, b^{\sigma^{d_1}})^a$. Let $G = G_2^{2d_1+1}$. For $b \in G_2$, let $b^* = (b^{\sigma^{-d_1}}, \ldots, b^{\sigma^{d_1}}) \in G$.
Let $\pi((y_{-d_1}, \ldots, y_{d_1})) = y_0$, $\pi : G \to G_2$. By compactness, there exist constructible
sets $W_i \subseteq G_1 \times G$ such that for any $b \in G_2$, $Z(b) = \bigcup_i W_i(b^*)$; and for any $y = (y_{-d_1}, \ldots, y_{d_1}) \in G$,
$W_i(y)$ is a finite union of cosets of $H_i$, and $W_i(y) \subseteq V(y_0)$.
So far, we have not used $\tau$ at all.
Let $X_i = \{(a, b) \in E_1 \times E_2 : (a, b^*) \in W_i\}$. Then $(E_1 \times E_2) \cap V = \bigcup_i X_i$. So it suffices
to prove that $\mathrm{ZCl}(X_i)$ is a finite union of cosets of group subvarieties, for each $i$.
Fix one value of $i$. Let $H = G_1/H_i$, $E$ = the image of $E_1$ in $H$, $B = A_2^{2d_1+1} \subseteq G$, $V^*$ =
image of $W_i$ in $H \times G$, $X^* = \{(a, b^*) : (a, b) \in X_i\}$. Note that if $b \in E_2$ then $b^* \in B$.
Applying Lemma 3.5.11 to this data, and to the automorphism $\tau$ (viewing $E$ as a
possibly undefinable group of points), we find that $\mathrm{ZCl}(X^*) = \bigcup_j C_j$, with $C_j$ cosets
of group subvarieties. Write $\pi$ also for the map $(\mathrm{Id}, \pi) : (G_1 \times G) \to (G_1 \times G_2)$. Then
$\pi C_j$ is a constructible coset of $G_1 \times G_2$ (hence a coset of a group subvariety). We
have $X_i = \pi X^* \subseteq \bigcup_j \pi C_j$. As $X^* \cap C_j$ is Zariski dense in $C_j$, $\pi(X^* \cap C_j)$ is Zariski dense
Now suppose $A$ is defined over the ring of integers $R$ of a number field $K$. A prime
$p$ of $K$ is a prime of good reduction for $A$ if the following holds. The reduced variety
$A_k$ over the residue field $k = R_p/p$ becomes also a commutative, connected algebraic
group. Moreover, $\dim_{ab}(A_k) = \dim_{ab}(A)$, and $\dim_m(A_k) = \dim_m(A)$.
Definition 5.0.8. Let $p$ be a rational prime. $T_{p'}(A)$ denotes the group of points of $A(L)$
of finite order prime to $p$, where $L$ is some algebraically closed field over which $A$
is defined. If $p$ is a prime of a number field, we will sometimes write $T_{p'}(A)$ with
reference to the residue characteristic of $p$.
We begin with a well-known result of Weil's concerning Abelian varieties; it generalizes without effort to arbitrary commutative algebraic groups. Fix a prime $p$; let
$k = GF(q)$ be a finite field of characteristic $p$, and let $k^a$ be an algebraic closure. Let
$\varphi_q$ be the $q$-Frobenius automorphism of $k^a$.
Lemma 5.0.9. Let $A$ be a commutative algebraic group over $k = GF(q)$. Then there
exists a polynomial $F(T) \in \mathbb{Z}[T]$ with no cyclotomic factors such that $F(\varphi_q)$ vanishes on $T_{p'}(A)$. We have $\deg(F) \le 2\dim_r(A)$. The sum of the absolute values of the
coefficients of $F$ is at most $(1 + q^{1/2})^{2\dim_r(A)}$.
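As a concrete instance of the lemma, the following sketch takes $A$ to be an elliptic curve $y^2 = x^3 + ax + b$ over a prime field $GF(p)$, counts its points by brute force, and forms the Frobenius polynomial $F(T) = T^2 - tT + p$ with $t = p + 1 - \#A(GF(p))$. The particular curve, the function names, and the reading $\dim_r(A) = 1$ for an elliptic curve are assumptions made for illustration; they are not taken from the paper.

```python
import math

def count_points(p, a, b):
    """Number of points of y^2 = x^3 + a*x + b over GF(p), including infinity.

    Assumes p is an odd prime and the curve is nonsingular mod p.
    """
    # For each residue class, record how many y in GF(p) square to it.
    roots = {}
    for y in range(p):
        roots[y * y % p] = roots.get(y * y % p, 0) + 1
    total = 1  # the point at infinity
    for x in range(p):
        total += roots.get((x * x * x + a * x + b) % p, 0)
    return total

def frobenius_polynomial(p, a, b):
    """Coefficients (constant term first) of F(T) = T^2 - t*T + p."""
    t = p + 1 - count_points(p, a, b)
    return [p, -t, 1]

def coefficient_sum(f):
    """Sum of the absolute values of the coefficients of f."""
    return sum(abs(c) for c in f)

F = frobenius_polynomial(7, 1, 1)   # illustrative curve y^2 = x^3 + x + 1 over GF(7)
bound = (1 + math.sqrt(7)) ** 2     # (1 + q^(1/2))^(2 dim), assuming dim = 1
print(F, coefficient_sum(F), bound)
```

By Hasse's bound $|t| \le 2p^{1/2}$, the coefficient sum $1 + |t| + p$ stays below $(1 + p^{1/2})^2$, and the two roots of $F$ have absolute value $p^{1/2} > 1$, so $F$ has no cyclotomic factors.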
Proof. Observe that if $f, g$ are two complex polynomials, and $s(f)$ denotes the sum
of the absolute values of the coefficients of $f$, then $(*)$: $s(fg) \le s(f)s(g)$. This will be
used twice. First $(*)$ permits a decomposition of $A$. Note that there exists over $k$ an
exact sequence $0 \to L \to A \to \bar{A} \to 0$, with $L$ a linear algebraic group and $\bar{A}$ an Abelian
variety. This gives rise to an exact sequence $0 \to T_{p'}(L) \to T_{p'}(A) \to T_{p'}(\bar{A})$. Thus if $F_1(\varphi_q)$
vanishes on $T_{p'}(L)$ and $F_2(\varphi_q)$ vanishes on $T_{p'}(\bar{A})$, then $F_1F_2(\varphi_q)$ vanishes on $T_{p'}(A)$. This
reduces the problem to the three cases of linear tori, commutative unipotent groups,
and Abelian varieties. When $A$ is an Abelian variety, the result comes from [31]. Weil
actually shows the existence of a monic $F$ of degree $2\dim_r(A)$ whose roots are
all of absolute value $q^{1/2}$; applying the observation $(*)$ to the linear factors of $F$ then gives the bound on the coefficient sum.
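Observation $(*)$ and its use with Weil's bound can be checked numerically. The sketch below multiplies polynomials given as coefficient lists, verifies $s(fg) \le s(f)s(g)$ on a sample pair, and then builds a monic polynomial whose roots all have absolute value $q^{1/2}$, so that $(*)$ applied to its linear factors gives $s(F) \le (1 + q^{1/2})^{\deg F}$. The sample polynomials, the choice $q = 4$ and the four roots are illustrative assumptions.

```python
import cmath

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def s(f):
    """Sum of the absolute values of the coefficients of f."""
    return sum(abs(c) for c in f)

def monic_from_roots(roots):
    """Monic polynomial with the given roots, built one linear factor at a time."""
    f = [1]
    for r in roots:
        f = poly_mul(f, [-r, 1])
    return f

# (*) on a sample pair of integer polynomials.
f, g = [1, -3, 2], [-1, 0, 4]
assert s(poly_mul(f, g)) <= s(f) * s(g)

# A monic polynomial with all roots of absolute value q^(1/2).
q = 4
angles = [0.3, -0.3, 1.1, -1.1]                     # two conjugate pairs
roots = [cmath.rect(q ** 0.5, t) for t in angles]
F = monic_from_roots(roots)
assert s(F) <= (1 + q ** 0.5) ** len(roots) + 1e-9  # small slack for rounding
```

Each linear factor $T - \alpha$ with $|\alpha| = q^{1/2}$ has $s = 1 + q^{1/2}$, which is where the factor $(1 + q^{1/2})$ per unit of degree comes from.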
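When $A$ is an algebraic torus, there exists an isomorphism $g : A \to G_m^n$, defined over
a finite field $GF(q^l)$. The Frobenius conjugate $\varphi_q(g)$ is another such isomorphism, so
$\gamma = g\,\varphi_q(g)^{-1}$ is an algebraic automorphism of $G_m^n$, i.e. $\gamma \in GL_n(\mathbb{Z})$. So $\gamma$ is fixed
by $\varphi_q$, and
$\gamma^l = \gamma\,\varphi_q(\gamma) \cdots \varphi_q^{l-1}(\gamma) = \mathrm{Id}$
using also that $g$ is fixed by $\varphi_{q^l}$. Now $(A, \varphi_q)$ is isomorphic (via $g$) to $(G_m^n, g\varphi_q g^{-1})$,
and $g\varphi_q g^{-1} = \gamma\varphi_q$. Now $(\gamma\varphi_q)^l = (\varphi_q)^l = \varphi_{q^l}$; so the polynomial $T^l - q^l$ works.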
When $A$ is a unipotent group, it has no points of finite order prime to $p$, so the
constant polynomial 1 will do.
To lift this to characteristic zero, suppose now that $A$ is a connected commutative
algebraic group over a number field $K$, and $p$ is a prime of good reduction. Then
$T_{p'}(A) \subseteq A(L)$, where $L$ is the maximal unramified extension of the completion $K_p$ of
$K$ at $p$. The Frobenius automorphism $\varphi_0$ of $k^a$ lifts to an automorphism $\varphi$ of $L$. The
reduction map from $L$ to $k^a$ induces an injective map on $T_{p'}(A)$. It follows that $\varphi$
satisfies the same functional equation on $T_{p'}(A)$ as $\varphi_q$ does on $T_{p'}(A_k)$. Thus:
Lemma 5.0.10. With the above assumptions, there exists an automorphism $\sigma_0$ of $K^a$
and an integral polynomial $F$ with no cyclotomic factors, of degree at most $2\dim_r(A)$ (as in Lemma 5.0.9), absolute
coefficient sum bounded by $(1 + q^{1/2})^{2\dim_r(A)}$, such that $F(\sigma_0)$ vanishes on the prime-to-$p$ torsion points of $A$.
If $(L, \sigma)$ is any difference field extending $(K^a, \sigma_0)$, then $F(\sigma)$ vanishes on the prime-to-$p$ torsion points of $A$, since they all lie in $K^a$. We may thus embed $(K^a, \sigma_0)$ in a
universal domain $(L, \sigma)$ for difference fields, and work there.
on all torsion points. This requires more than the elementary number theory we have
used so far. If one is willing to quote Serre to show the existence of such an automorphism, results (1), (2) and (4) become equally easy for all torsion points, still
using the ordinary theory. (Condition (3) would require a p-adically continuous automorphism of this nature.)
The second route keeps the number theory of this paper at its present primitive
level, but uses two automorphisms instead of one. Using this method we prove Manin-Mumford
for semi-Abelian varieties (5), and find the explicit bounds (6). Since we
have not developed orthogonality beyond the finite rank context, this method does not
immediately yield the analogs of (1+ ) or (2). We expect these can be done either by
an ad hoc extension of the proof, or by developing orthogonality theory, but will be
content with (5) and (6).
6.1. Qualitative results
$f(x, y) = x + y$.
2D 2 log (1+q1=2 )
2
deg(Sq )6d+ r r
d+ r 2
:
There exists an automorphism $\sigma$ of $K^a$ such that $F(\sigma)$ vanishes on $T_{p'}$, the prime-to-$p$ torsion points of $A$.
If $A$ is a semi-Abelian variety, $\{a \in A : (a, \ldots, \sigma^{2d_r}(a)) \in S\}$ is LMS.
Proof. Everything but the bound on degree has been demonstrated. Multiplication by
an integer $M \ge 2$, in an Abelian group, can be achieved by $\le 2\log_2(M) - 1$ operations of addition or of multiplication by 2 (express $M$ in base 2). Hence a linear
polynomial $\sum_{i=0}^{g} m_i x_i = 0$ can be expressed (using additional variables) by means of
at most $g + \sum_i (2\log_2(|m_i|) - 1) \le (g + 1) \cdot 2\log_2(M)$ additions or subtractions, where
$M = \max\{|m_i|, 2\}$.
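The base-2 argument can be made explicit. The sketch below computes $Mx$ by scanning the binary digits of $M$ from the top, using only doublings and additions, and counts the operations performed. The function name and the callback interface are assumptions for illustration; ordinary integers stand in for the Abelian group.

```python
import math

def multiply_by(M, x, add, double):
    """Compute M*x using only `add` and `double`; return (result, operation count).

    Reads the binary digits of M after the leading 1: each digit costs one
    doubling, and each digit equal to 1 costs one further addition of x.
    """
    if M < 1:
        raise ValueError("M must be a positive integer")
    acc = x          # accounts for the leading binary digit of M
    ops = 0
    for bit in bin(M)[3:]:   # bin(M) looks like '0b1...'; skip '0b' and the leading 1
        acc = double(acc)
        ops += 1
        if bit == "1":
            acc = add(acc, x)
            ops += 1
    return acc, ops

# Check the operation count against the bound 2*log2(M) - 1 for a range of M.
for M in range(2, 2000):
    result, ops = multiply_by(M, 5, lambda a, b: a + b, lambda a: 2 * a)
    assert result == 5 * M
    assert ops <= 2 * math.log2(M) - 1 + 1e-9
```

The worst case is $M = 2^k - 1$, where every digit after the first costs both a doubling and an addition, giving $2(k - 1)$ operations.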
In our case by Lemma 5.0.9, $M \le (1 + q^{1/2})^{2d_r}$, $g \le 2d_r$, so we can express $S_q$ as a
projection of the intersection of
$(2d_r + 1) \cdot 2\log_2((1 + q^{1/2})^{2d_r}) = 4d_r(2d_r + 1)\log_2(1 + q^{1/2})$
varieties of the form $a_j + a_{j'} = a_{j''}$. By Lemma 2.1.2(1) and (2), this has degree at most
$d_+^{4d_r(2d_r+1)\log_2(1+q^{1/2})}$.
(deg(X )2dr +1 d+ r
(2d1 +1)d2
d1
the Zariski closure of $\{x : (x, \sigma(x), \ldots, \sigma^{2d_r}(x)) \in S\}$. By Corollary 2.2.3, $Z$ has a
projective embedding of degree at most $\deg(S)^{2^{\dim(S)}}$. We have $\dim(S) \le 2d_r \dim(X)$,
$\deg(S) \le \deg(X)^{2d_r+1} \deg(S_q)$. The result follows using Lemma 6.3.2.
Let S =
Remark 6.3.3. Assuming () of Section 6.2, we obtain a similar bound for all
torsion points, using two primes of good reduction but a single automorphism, as in
Proposition 6.2.1.
i=0
deg(S ) 6
Sq
j=d1
Sl ;
2|A| dim(A)
where each of the terms deg(S ), deg(Y), |A| is estimated above. By Proposition 4.5.2
and Lemma 4.4.3, this also bounds the number of components of $\mathrm{ZCl}((T_{p'} \times T_{l'}) \cap Y)$,
and hence of $\mathrm{ZCl}(T \cap X)$. To summarize:
Proposition 6.4.1. The number of components of $\mathrm{ZCl}(T \cap X)$ is at most
29 (dr +1)7 (log22 (1+q1=2 ))
d+
deg(X )2
$A = \{\tau^j\sigma^i : -d_1 \le i \le d_1,\ 0 \le j \le d_2\} \cup \{\sigma^j\tau^i : 0 \le i \le (2d_1 + 1)d_2,\ 0 \le j \le d_1\}$.
So $A$ is a connected subset of $F$ of size
$|A| \le 16(d_r + 1)^3$.
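The size bound can be checked by direct enumeration. The sketch below lists the elements of $A$ as reduced words in the free group on $\sigma, \tau$ and counts them, under the assumption (ours, not stated in this passage) that $d_1 = d_2 = 2d_r$. The word representation and the function names are likewise illustrative.

```python
def reduced_word(*syllables):
    """Reduced word for a product of powers of two different generators.

    Each syllable is (generator, exponent); syllables with exponent 0 are
    dropped. With at most two syllables on distinct generators, no further
    cancellation can occur in a free group, so the tuple is a normal form.
    """
    return tuple((g, e) for g, e in syllables if e != 0)

def index_set(d1, d2):
    """A = {tau^j sigma^i} union {sigma^j tau^i}, with the ranges from the text."""
    first = {reduced_word(("tau", j), ("sigma", i))
             for i in range(-d1, d1 + 1)
             for j in range(0, d2 + 1)}
    second = {reduced_word(("sigma", j), ("tau", i))
              for i in range(0, (2 * d1 + 1) * d2 + 1)
              for j in range(0, d1 + 1)}
    return first | second

for d_r in range(1, 6):
    d1 = d2 = 2 * d_r   # assumed relation between d1, d2 and d_r
    assert len(index_set(d1, d2)) <= 16 * (d_r + 1) ** 3
```

Under this assumption the two families have $(2d_1 + 1)(d_2 + 1)$ and $((2d_1 + 1)d_2 + 1)(d_1 + 1)$ elements respectively and share only a few pure powers of $\sigma$ and $\tau$, so the cubic bound holds with room to spare.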
Note that $Y$ is the projection to $A \times A$ of the intersection of the graph of addition on
$A$, with $(P \times P \times X)$. Thus
$\deg(Y) \le \deg(X)\, d_+$.
The Mordell-Lang conjecture stated that $\Gamma$ is ALM. Raynaud, Hindry and McQuillan
reduced it to showing that $A(K)$ is ALM (Faltings' theorem). We wish to show here
how to do the same with our methods. If $F(\sigma)(x) = 0$ is an equation capturing the
torsion points, $(\sigma - 1)F(\sigma)(x) = 0$ will capture $\Gamma$, and the orthogonality theory applies
to $\ker(\sigma - 1)$, $\ker F(\sigma)$.
Pick a prime of $K$, of good reduction, with residue field $k$ of characteristic $p$, and
let
$\Gamma_{p'} = \{a \in A(K^a) : ma \in A(K) \text{ for some integer } m \text{ prime to } p\}$.
We have as before
$\deg(S_q) \le d_+^{4d_r(2d_r+1)\log_2(1+q^{1/2})}$
For variety, we write the statements for varieties containing no translates of infinite
group subvarieties. We first give the statement for the prime-to-p torsion points.
so
deg
(2d1 +1)d2
Sq
1=2
6 d+ 16(dr +1)
log2 (1+q1=2 )
i=0
The last estimate refers to the equations for $E_1 = \bigcap_{i=0}^{(2d_1+1)d_2} A_l^{\tau^i}$; these amount to the
equations for $A_l$, applied to $\tau^i(x)$ for each $i$, $0 \le i \le (2d_1 + 1)d_2$.
Similarly, if $E_2 = \bigcap_{j=-d_1}^{d_1} A_q^{\sigma^j}$, the corresponding equations have degree
d1
deg
Sl 6 deg(Sl )2d1 +1 6 32(dr + 1)3 :
j=d1
The same proof applies to any automorphism $\sigma'$ with the same properties as $\sigma$; and
since we chose a priori difference-field bounds on the finite numbers involved, the same
number $n$ applies to all such $\sigma'$. Thus
$n_1(X \cap \Gamma_{p'}) \subseteq \bigcap\{\mathrm{Fix}(\tau) : \tau \text{ a conjugate of } \sigma\} = B$.
Now the group $B$ on the right is invariant under $\mathrm{Aut}(K^a/K)$. If $a \in B \cap T_{p'}$, then the
reduction map takes $a$ into $A(k)$ (the fixed field of Frobenius). Let $m_p$ be the order
of $A(k)$; then $m_p a$ reduces to 0; since the reduction map is injective on $T_{p'}$, $m_p a = 0$.
So $H = m_p B \cap K^a$ is $p'$-torsion free, and $\mathrm{Aut}(K^a/K)$-invariant. It follows that if $b \in H$,
and $mb \in A(K)$, $(m, p) = 1$, then $b \in A(K)$; for $b$ is the unique $m$th root of $mb$ in $H$,
hence is invariant under $\mathrm{Aut}(K^a/K)$. Since every element of $\Gamma_{p'}$ has a $p'$-multiple in
$A(K)$, we obtain:
$n m_p (X \cap \Gamma_{p'}) \subseteq A(K)$.
This finishes the proof of Proposition 6.5.1.
Proof. $\ker(\sigma^n - 1)$ is internal to the fixed field $k$, while $\ker(F(\sigma))$ is LMS. Hence
they are orthogonal, and in particular have finite intersection; the intersection is a finite
subgroup, consisting of torsion points. The last statement is also easy to see directly;
both $\sigma^n - 1$ and $F(\sigma)$ vanish on the intersection, hence so does $G(\sigma)$ whenever $G$
is a polynomial of $\mathbb{Z}[T]$ in the ideal generated by $F$ and $T^n - 1$. But, since $F$ has no cyclotomic factors, some nonzero constant
polynomial is in this ideal.
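The last remark can be illustrated on a small example: dividing $T^n - 1$ by a monic $F \in \mathbb{Z}[T]$ stays inside $\mathbb{Z}[T]$, and when $F$ is linear the remainder is an integer lying in the ideal generated by $F$ and $T^n - 1$. The sketch below does this for the hypothetical choice $F(T) = T - 2$, $n = 3$ (our example, not the paper's); the function names are illustrative.

```python
def divmod_monic(a, f):
    """Divide a by a monic polynomial f over the integers.

    Polynomials are coefficient lists, constant term first.
    Returns (quotient, remainder) with a = quotient * f + remainder.
    """
    assert f[-1] == 1, "f must be monic"
    a = list(a)
    quotient = [0] * max(len(a) - len(f) + 1, 1)
    for i in range(len(a) - len(f), -1, -1):
        c = a[i + len(f) - 1]        # current leading coefficient
        quotient[i] = c
        for j, fj in enumerate(f):   # subtract c * T^i * f
            a[i + j] -= c * fj
    return quotient, a[:len(f) - 1]

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

n = 3
F = [-2, 1]                          # F(T) = T - 2
cyclo = [-1] + [0] * (n - 1) + [1]   # T^n - 1
Q, R = divmod_monic(cyclo, F)

# (T^n - 1) - Q * F equals the constant R, so R lies in the ideal (F, T^n - 1).
product = poly_mul(Q, F)
difference = [c - p for c, p in zip(cyclo, product)]
assert difference == R + [0] * n
print(Q, R)
```

In this toy case the constant is 7, so any element annihilated by both $\sigma^3 - 1$ and $\sigma - 2$ is annihilated by 7 and is therefore a torsion point.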
(No doubt a proof can also be given without (), using two automorphisms.)
Lemma 6.5.3. $X \cap \ker((\sigma - 1)F(\sigma))$ is contained in finitely many cosets of $\ker(\sigma - 1)$.
6.6. Tate-Voloch conjecture
Proof. Let $h : \ker(\sigma - 1) \times \ker(F(\sigma)) \to \ker((\sigma - 1)F(\sigma))$ be the map $h(x, y) = x + y$.
$h$ has finite kernel. $h^{-1}(X)$ is a finite union of rectangles, $U \times V$, by Lemma 3.4.9.
Since $\ker(F(\sigma))$ is LMS, $V$ is a Boolean combination of definable cosets; their Zariski
closure is a Zariski closed coset contained in $X$, so by the assumption on $X$, $h(V)$ is
finite. The lemma follows.
The cosets of $\ker(\sigma - 1)$ have the form $\{a \in A : (\sigma - 1)(a) = \xi\}$; finitely many $\xi$
occur in the conclusion of the lemma. Let $n_1$ be such that $n_1\xi = 0$ for each of these $\xi$
that is torsion. (Actually, we only need that $n_1 p^m \xi = 0$ for some $m$; for this, if one is
interested in the explicit version, one can take $n_1$ to be the order of the group $A(k')$,
where $k'$ is the field extension of $k$ of degree $[K(\xi) : K]$.)
Note that if $a \in \ker((\sigma - 1)F(\sigma)) \cap A(K^a)$, and $(\sigma - 1)(a) = \xi$, then $\xi \in A(K^a)$, and
$F(\sigma)(\xi) = 0$; it follows by Lemma 6.5.2 that $\xi$ is torsion. Thus $n_1\xi = 0$, so $n_1 a \in \mathrm{Fix}(\sigma)$.
We have
$X \cap \ker((\sigma - 1)F(\sigma)) \cap A(K^a) \subseteq \{x : n_1 x \in \mathrm{Fix}(\sigma)\}$.
Hence by Proposition 6.5.1
$n_1(X \cap \Gamma_{p'}) \subseteq \mathrm{Fix}(\sigma)$.
Tate and Voloch conjectured that the torsion points on an Abelian variety $A$ over
$C_p$ that do not lie on a subvariety $V \subseteq A$ are bounded away from that variety. Certain
special cases were proved by Tate-Voloch, and by Buium and Silverman. The proof
of the Manin-Mumford conjecture given above lends itself immediately to a proof of
the Tate-Voloch conjecture under some restrictions: $A$ must be assumed defined over
a finite extension of $\mathbb{Q}_p$, must have good reduction, and the prime-to-$p$ torsion points
only are considered. We show this easy deduction here. In later work, using much
more p-adic Galois theory, Scanlon removed the last two constraints. See his papers
[26] for this and for references to the history of the problem.
Assumptions, Notation. $C_p$ is the completion of the algebraic closure of $\mathbb{Q}_p$; the ring
of integers of $L$ is denoted $O_L$, the residue field $k$. $|x|$ is the p-adic absolute value.
$L$ is a finite extension of $\mathbb{Q}_p$; $O_L$ is the ring of integers of $L$, $k_L$ the residue field.
$K = L^a$ is the field of algebraic numbers.
$\mathcal{S}$ is a group scheme over $O_L$, with generic and special fibers $S$ and $S_p$ respectively.
$S$ is a semi-Abelian variety, and has good reduction in the sense of Lemma 5.0.10.
$T_{p'}(S) = \{a \in S(K) : ma = 0 \text{ for some } m,\ (p, m) = 1\}$,
$I_n$:
$I_n = \{r \in R : r = (b_0, b_1, \ldots),\ \{i : |b_i| \le p^{-n}\} \in U\}$, $I = \bigcap_n I_n$
$c_i = b_i^{-1}$ for $i \in X$, $c_i = 0$ for $i \notin X$, $s = (c_0, c_1, \ldots)$, we see that $rs - 1 \in I$.) Every diagonal
sequence is in $R$, and we obtain an embedding $j : K \to D$. Since $O$ is bounded,
$\prod_i O \subseteq R$, and we obtain a map $\prod_i O \to D$, yielding a natural map $\rho : \prod_i S(O) \to S(D)$.
Let $a^* \in S(D)$ denote the image of $(a_0, a_1, \ldots)$ there. The assumption $d(a_i, X) \to 0$ implies that for any rational $f$ regular near $a^*$ and vanishing on $X$, $|f(a_i)| \to 0$, hence
the sequence $(f(a_i))_i$ lies in $I$. It follows that $f(a^*) = 0$, hence $a^* \in X$.
Note that since each $a_i \in T_{p'}(S)$, by (#) above, $F(\sigma)a_i = 0$; and so $F(\sigma)a^* = 0$. By
Lemma 6.6.1, $a^* \in Y$ for some $K$-defined coset $Y$ of a connected subgroup $W$ of $S$. Let
$\pi : A \to S/W$ be the projection, $c = \pi(Y)$. So $\pi(a^*) = c \in (S/W)(K)$. But $\pi(a^*) = \rho((\pi(a_0), \pi(a_1), \ldots))$.
It follows that the sequence $\pi(a_i)$ comes arbitrarily close to $c$, and in particular, for large $i$, $r_{S/W}(\pi a_i) = r_{S/W}(c)$. Now $r_{S/W}$ is injective on the prime-to-$p$ torsion
points of $S/W$, so $\pi(a_i) = c$ for large $i$. Thus for large $i$, $a_i \in Y$, and in particular,
$a_i \in X$.
References
[1] J.-B. Bost, Périodes et isogénies des variétés abéliennes sur les corps de nombres, d'après D. Masser
et G. Wüstholz. Séminaire Bourbaki Exp. No. 795.
[2] E. Bouscaren, E. Hrushovski, On one-based theories, J. Symbolic Logic 59 (2) (1994) 579-595.
[3] S. Buechler, Locally modular theories of finite rank, Ann. Pure Appl. Logic 30 (1) (1986) 83-94.
[4] A. Buium, Geometry of p-adic jets, Duke J. Math 82 (2) (1996) 349-367.
[5] Z. Chatzidakis, E. Hrushovski, Model theory of difference fields, AMS Trans. 351 (8) (2000) 2997-3071.
[6] Z. Chatzidakis, E. Hrushovski, Y. Peterzil, Model Theory of difference fields II: periodic ideals and the
trichotomy in all characteristics, Trans. AMS, to appear.
[7] C.C. Chang, J. Keisler, Model Theory, 3rd ed., North-Holland, Amsterdam, Tokyo, 1990.
[8] G. Cherlin, E. Hrushovski, Quasi-finite structures (preprint available in www.math.rutgers.edu==cherlin).
[9] G. Faltings, Endlichkeitssätze für abelsche Varietäten über Zahlkörpern, Invent. Math. 73 (3) (1983)
349-366.
[10] W. Fulton, Intersection Theory, Springer, Berlin, Tokyo, 1984.
[11] M. Hindry, Autour d'une conjecture de Serge Lang, Invent. Math. 94 (3) (1988) 575-603.
[12] E. Hrushovski, Unidimensional theories are superstable, Ann. Pure Appl. Logic 50 (1990) 117-138.
[13] E. Hrushovski, The Manin-Mumford conjecture and the model theory of difference fields, extended
abstract (5pp), in: M. Jarden (Ed.), Proc. Field Arithmetic conf., Institute for Advanced Study,
Jerusalem, 1995.
[14] E. Hrushovski, The Mordell-Lang conjecture for function fields, J. AMS 9 (3) (1996) 667-690.
[15] E. Hrushovski, A. Pillay, Weakly normal groups, in: Logic Colloquium 85, North-Holland, Amsterdam,
1986.
[16] E. Hrushovski, A. Pillay, Definable subgroups of algebraic groups over finite fields, J. Reine Angew.
Math. 462 (1995) 69-91.
[17] B. Kim, A. Pillay, Simple theories, Ann. Pure Appl. Logic 88 (1997) 149-164.
[18] S. Lang, in: Number Theory III: Diophantine Geometry, Encyclopaedia of Mathematical Sciences, Vol.
60, Springer, Berlin, Heidelberg, 1991.
[19] S. Lang, J. Tate, Principal homogeneous spaces over abelian varieties, Amer. J. Math. 80 (1958) 659-684.
[20] D. Masser, G. Wüstholz, Factorisation estimates for abelian varieties, Pub. Math. IHES 81 (1995) 5-24.
[21] M. McQuillan, Division points on semi-abelian varieties, Invent. Math. 120 (1995) 143-149.
[22] A. Pillay, Model theory, stability theory, and stable groups, in: A. Nesin, A. Pillay (Eds.), The Model
Theory of Groups, Notre Dame Mathematical Lectures 11, University of Notre Dame Press, Notre
Dame, Indiana, 1989.
[23] M. Raynaud, Around the Mordell conjecture for function fields and a conjecture of Serge Lang, in:
Proc. Algebraic Geometry of Tokyo, Lecture Notes, vol. 1016, Springer, Berlin, 1982.
[24] A. Robinson, Introduction to Model Theory and the Metamathematics of Algebra, North-Holland,
Amsterdam, 1963.
[25] G. Sacks, Saturated Model Theory, W.A. Benjamin, Reading, MA, 1972.
[26] T. Scanlon, Conjecture of Tate and Voloch on p-adic proximity to torsion, International Math. Research
Notices 1999 no 17, 909-914; p-adic distance from torsion points of semi-Abelian varieties, J. Reine
Angew. Math. 499 (1998) 225-236.
[27] J.-P. Serre, Lectures on the Mordell-Weil Theorem, Vieweg, Braunschweig/Wiesbaden, 1997.
[28] J.-P. Serre, in: Groupes algébriques et corps de classes, Actualités scientifiques et industrielles, Vol.
1264, Hermann, Paris, 1959.
[29] J.-P. Serre, Oeuvres, Vol. IV, 1985-1998, Springer, Berlin, 2000.
[30] S. Shelah, Simple unstable theories, Ann. Math. Logic 19 (1980) 177-203.
[31] A. Weil, Variétés abéliennes et courbes algébriques, Hermann, Paris, 1948.