Massimiliano Tomassoli
June 3, 2008
Abstract
This document was written in a hurry, therefore the bibliography is missing and,
above all, the algorithms and methods described in this document were never
implemented nor tested in any way. The proofs of the theorems presented are
only sketched and no one has ever read them besides me.
I do not take any responsibility for any harm which may result from using
the information contained in this document.
Copyright 2008 by Massimiliano Tomassoli. This document may be freely
distributed and duplicated as long as this copyright notice remains intact. For
any comments: mtomassoli@alice.it.
Contents

1 Introduction
  1.1 The one-dimensional case
  1.2 The multidimensional case
      1.2.1 Some common approaches
      1.2.2 UB Trees and space-filling curves

2 Static Indexing
  2.1 Logarithmic Decomposition
  2.2 First method: O(log² n)
  2.3 Second method: O(log n)
  2.4 The Multidimensional Case

3 Dynamic Indexing
  3.1 Splitting and Fusion
  3.2 Point Insertion and Deletion
Chapter 1
Introduction
1 → 5 → 8 → 9 → 20 → 23 → 28 → 30 (1.1)
We notice that each single number partitions the list into two parts in a
natural way. For instance, the number 20 partitions list (1.1) into the two lists
1→5→8→9 (1.2)
and
20 → 23 → 28 → 30 (1.3)
[Figure: the sorted list 1 5 8 9 20 23 28 30 arranged as a balanced binary tree: 20 at the root, 8 and 28 below it, and 1, 9, 23, 30 below them.]
Problem 1.1.1 asks us to find all the integers x in S such that a ≤ x ≤ b.
If we insert the elements of S into a balanced search tree or, even better, sort
them in time O(n log n) and then either build a balanced tree in time O(n) or
put them in an array and perform binary searches on it, we can find the smallest
integer in S greater than or equal to a in time O(log n). Each of the other
integers can then be found in time O(1).
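As an illustration only (the function name and the toy list below are ours, not part of the original text), the whole one-dimensional method fits in a few lines of Python:

```python
from bisect import bisect_left

def range_query(sorted_s, a, b):
    """Report every x in sorted_s with a <= x <= b.

    One O(log n) binary search finds the smallest element >= a;
    each further element is reported in O(1) by scanning right.
    """
    out = []
    i = bisect_left(sorted_s, a)          # O(log n)
    while i < len(sorted_s) and sorted_s[i] <= b:
        out.append(sorted_s[i])           # O(1) per reported element
        i += 1
    return out

s = sorted([1, 5, 8, 9, 20, 23, 28, 30])
print(range_query(s, 8, 23))  # → [8, 9, 20, 23]
```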
[Figure: the letters A B C D E F G arranged as a balanced binary tree: D at the root, B and F below it, and A, C, E, G below them.]
Figure 1.5: This figure shows how, in UB-Trees, we can tell which one of two
points is the smaller. In this case, we clearly see that the point in h1,4,1 is
smaller than the point in h1,4,4 .
Figure 1.6: This figure shows some regions induced by the total ordering used
in the UB-Trees (based on the Z-curve).
The regions induced by this kind of subdivision scheme (see figure 1.6) cannot
be as thin as the ones induced by the naive subdivision scheme described above
(see figure 1.4).
An ordering can also be induced by a space-filling curve. In mathematics,
a space-filling curve is a surjective continuous curve f : [0, 1] → S, where S
is a topological space. Usually, S is [0, 1]ᵈ (S cannot be the whole of Rᵈ,
since the continuous image of [0, 1] is compact). These curves are called
space-filling because they actually fill the entire space. Given p, q in S, we
could say that p < q ⇐⇒ min(f⁻¹(p)) < min(f⁻¹(q)), i.e. p < q if and only if
the curve f reaches p “before” it reaches q. Some of these curves are recursive
in nature. If f is a recursive space-filling curve, there is an infinite sequence
of functions f0 , f1 , f2 , . . . where f0 is given and, for each i ≥ 0, fi+1 is
obtained by applying a construction step to fi , and f is the limit of fn as
n → ∞. Usually we can tell whether p < q just by looking at a single function
of the infinite sequence above. Let us look at an example. The order imposed on
the space by the UB-Trees is induced by the so-called Z-curve. Figure 1.7 shows
the first “approximations”
Figure 1.7: This figure shows the first four “approximations” of the Z-curve.
The recursive nature of the curve is quite evident.
of the Z-curve. It should be evident that, for instance, points in h1 intersect the
curve “before” points in h2 do.
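For integer coordinates, the position at which a point first meets the Z-curve is given by its Morton code, obtained by interleaving the bits of its coordinates. The following sketch assumes non-negative integer coordinates and fixes one of the two possible conventions (the first axis supplies the low-order bit); the names are ours, not the document's:

```python
def z_value(x, y, bits=16):
    """Morton code: interleave the low `bits` bits of x and y.

    Convention (an assumption of this sketch): x supplies the even
    bit positions, y the odd ones.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # x fills the even bits
        z |= ((y >> i) & 1) << (2 * i + 1)  # y fills the odd bits
    return z

def z_less(p, q):
    """p < q in the total order induced by the Z-curve."""
    return z_value(*p) < z_value(*q)

print(z_less((1, 0), (0, 1)))  # → True under this bit convention
```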
Chapter 2
Static Indexing
This chapter deals with the static version of the problem, i.e. where all points
in our space are known in advance and no points can be inserted or deleted.
Example 2.1.5. If S = (5, 1, 7, 2, 8, 2, 9, 6, 3, 4), then δS = {δS1 , δS2 , δS4 , δS8 , δS16 }.
Definition 2.1.6. Let S = (s1 , s2 , . . . , sm ) and P = (p1 , p2 , . . . , pn ) be two
sequences. The concatenation S · P , or simply SP if no confusion arises, is the
sequence (s1 , s2 , . . . , sm , p1 , p2 , . . . , pn ).
[Figure 2.1: a sequence P spanning the elements s6 , . . . , s15 of S, together with the (shaded) sequences of the logarithmic decomposition that cover it.]
if P = ()
    return ()
γ(T ) := ()  // initially empty
d := concatenation of δSc
while true
    if P = d
        return d
    if P ⊑ L(d)
        d := L(d)
    else if P ⊑ R(d)
        d := R(d)
    else
        d1 := L(d)
        P1 := P \ R(d)
        while P1 ≠ ()
            if P1 ⊏ R(d1)
                d1 := R(d1)
            else
                γ(T ) := (R(d1)) · γ(T )
                P1 := P1 \ R(d1)
                d1 := L(d1)
        d2 := R(d)
        P2 := P \ L(d)
        while P2 ≠ ()
            if P2 ⊏ L(d2)
                d2 := L(d2)
            else
                γ(T ) := γ(T ) · (L(d2))
                P2 := P2 \ L(d2)
                d2 := R(d2)
        return γ(T )
Algorithm 2.1.12.1: This algorithm is used in the proof of theorem 2.1.12.
Figure 2.2: This figure shows the same situation as figure 2.1, but here a binary
tree connecting the sequences of the logarithmic decomposition is shown. Note
that the arrows of the tree let us reach all the shaded sequences in a total time
of O(log n).
2. P ⊑ L(d)
3. P ⊑ R(d)
Proof. (sketch) Let S = (s1 , s2 , . . . , sn ) and c = 2^⌈log₂ |S|⌉. It is clear that every
sequence δSi can be built from S in time O(n). Because |δS | = O(log n), δS
can be built in time O(n log n). We also build a binary search tree B such that,
for each valid i, every sequence d ∈ δS_{2^i} (or its first element) points to the
sequences (or their first elements) in L(d) and R(d). Have a look at figure 2.2.
We can now solve the problem by using the algorithm described in the proof
of theorem 2.1.12. B makes the implementation of the algorithm completely
straightforward.
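Under the reading of δS suggested by example 2.1.5 (level 2^i stores the consecutive blocks of S of length 2^i, each block sorted), a naive construction can be sketched as follows. Note that sorting every level independently costs O(n log² n); the proof above obtains O(n log n) by merging each level from the one below instead. The function name and this reading of the definition are our own assumptions:

```python
def log_decomposition(s):
    """Level 2**i holds the consecutive blocks of s of length 2**i
    (the last block may be shorter), each block sorted.

    A guess at the chapter's delta-S, chosen to match example 2.1.5;
    not taken verbatim from the text.
    """
    levels = {}
    size = 1
    while size < 2 * len(s):              # up to c = 2^ceil(log2 n)
        levels[size] = [sorted(s[j:j + size])
                        for j in range(0, len(s), size)]
        size *= 2
    return levels

d = log_decomposition([5, 1, 7, 2, 8, 2, 9, 6, 3, 4])
print(sorted(d))  # → [1, 2, 4, 8, 16], the five levels of example 2.1.5
```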
2.2 First method: O(log² n)
Let us generalize problem 1.2.1 a little.
Definition 2.2.1. If p ∈ A1 × A2 × · · · × An and p = (p1 , p2 , . . . , pn ), then
pai = τai p = pi .
Example 2.2.2. If p ∈ U × V and p = (x, y), then pu = τu p = x and pv =
τv p = y.
Definition 2.2.3. If S = {s1 , s2 , . . .} ⊂ A1 × A2 × · · · × An , then τai S =
{τai s1 , τai s2 , . . .}. For sequences replace ‘{’ and ‘}’ with ‘(’ and ‘)’, respectively.
Problem 2.2.4. Let U and V be two totally ordered sets. Given a set S =
{s1 , s2 , . . . , sn } ⊂ U × V and a, b ∈ U × V, find {(u, v) ∈ S | au ≤ u ≤ bu ∧ av ≤
v ≤ bv }.
This generalization makes sure that our method does not take advantage of
anything but the fact that the sets are totally ordered.
Definition 2.2.5. Given two sequences a = (a1 , a2 , . . .) and b = (b1 , b2 , . . .), a
is lexicographically smaller than b, written a <l b or simply a < b, if and only
if there exists an integer i ∈ [1, min{|a|, |b|} + 1[ such that, for each 0 < k < i,
ak = bk and ai < bi .
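Python compares tuples in essentially this way; the sketch below (the function name is ours) restricts definition 2.2.5 to the positions the two sequences share. Note that Python's built-in tuple comparison additionally treats a proper prefix as smaller, a case the definition above leaves out:

```python
def lex_less(a, b):
    """a < b in the sense of definition 2.2.5, restricted to the
    positions both sequences share: they must actually differ at
    some index i with a[i] < b[i]."""
    for ak, bk in zip(a, b):
        if ak != bk:
            return ak < bk
    return False  # no differing position within the common length

print(lex_less((1, 5, 8), (1, 7, 2)))  # → True (first difference: 5 < 7)
```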
Let us solve problem 2.2.4. Let P = τv S and let Q = (q1 , q2 , . . . , qn ) be the
sequence containing all the elements of P such that qi < qj ⇐⇒ i < j for each
valid i, j (see figure 2.3).
The idea is to group the points in S by looking at the logarithmic decompo-
sition of Q.
Definition 2.2.6. Let φ : Q → 2S be an injective function that associates each
connected subsequence of Q to a different subset of S. For each subsequence
q of Q, φ(q) = φq = s, where s is the biggest subset of S such that τv s and
q contain the same elements. For every sequence X = (x1 , x2 , . . .), let φX =
(φx1 , φx2 , . . .). The sequence G = (G_{2^0} , G_{2^1} , G_{2^2} , . . . , G_{2^{|δQ|−1}} ), where, for each
valid i, Gi = φδQi , is called the group (logarithmic) decomposition associated to
Q.
Example 2.2.7. Figure 2.4 shows how δQ and G, i.e. the group decomposi-
tion associated to Q, are interrelated. For the sake of clarity, we will prefer
representations such as that in figure 2.5.
Figure 2.3: This figure shows a set S of points in the space U×V and the ordered
projection Q of S onto V. Q is ordered because it is a sequence (q1 , q2 , . . . , qn )
such that qi < qj ⇐⇒ i < j for each valid i, j.
Figure 2.4: This figure shows the same points as figure 2.3, but here a represen-
tation of the logarithmic decomposition δQ and a representation of the group
logarithmic decomposition G are also shown.
Figure 2.5: This figure shows the same points as figure 2.4, but represents the
group logarithmic decomposition in a more practical way.
Definition 2.2.11. If a ∈ A1 × A2 × · · · × An and a = (a1 , a2 , . . . , an ), let
ā = (an , an−1 , . . . , a1 ) ∈ An × An−1 × · · · × A1 .
Proof. (sketch) According to theorem 2.2.15 we can find T in O(log n) after a
precomputation step (independent of T ) that takes O(n log n). Let a and b be
the two points in problem 2.2.4. We know that |T | = O(log n) and that, for
each t ∈ T and p ∈ t, av ≤ pv ≤ bv . Because each sequence in T is sorted, for
each sequence t ∈ T we can find the smallest point p such that pu ≥ au in time
O(log n). Therefore the algorithm returns the first point in time O(log² n) and
each one of the others in O(1). Note that since S is static we can actually let
the client find the other points by itself and say that the problem can indeed be
solved in time O(log² n).
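The query step of this proof can be sketched as follows: given the 2-selection T, one binary search per sequence locates the leftmost point whose u-coordinate is at least au. The representation (each t a list of (u, v) tuples sorted by u) and the function name are our own assumptions:

```python
def first_hits(T, a_u):
    """For each sorted sequence t in the 2-selection T, find the
    leftmost point p with p[0] >= a_u: one O(log n) binary search
    per sequence, O(log^2 n) in total when |T| = O(log n)."""
    hits = []
    for t in T:                     # each t sorted by u-coordinate
        lo, hi = 0, len(t)
        while lo < hi:              # binary search on the u-coordinate
            mid = (lo + hi) // 2
            if t[mid][0] < a_u:
                lo = mid + 1
            else:
                hi = mid
        if lo < len(t):
            hits.append(t[lo])
    return hits

T = [[(2, 1), (5, 1)], [(1, 3), (4, 4), (9, 3)]]
print(first_hits(T, 3))  # → [(5, 1), (4, 4)]
```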
Figure 2.6: Let us assume we are to search for all the points contained in the
smaller rectangle above and that we are only provided with an ordered group
logarithmic decomposition. Let us assume we have just found a. We can find
each one of the other points in the smaller rectangle in time O(1).
Figure 2.7: Let us assume we are to search for all the points contained in the
smaller rectangle above and that we are only provided with an ordered group
logarithmic decomposition. Let us assume we have just found a1 . We can find
each one of the other points in the smaller rectangle in time O(1).
Figure 2.8: Let us assume we are to search for all the points contained in the
smaller rectangle above and that we are only provided with an ordered group
logarithmic decomposition. Let us assume we have just found a. We cannot
find each one of the other points in the smaller rectangle in time O(1); indeed,
the length of the shortest path between a and b, c or d is O(n).
Figure 2.9: This figure shows a proximity graph. Note that arrows of level (or
length) l connect brothers of level l. Note also that b < a.
i := 1
j := 1
while i < s ∧ xi+1 < yj
    i := i + 1
while j < t ∧ yj+1 < xi
    j := j + 1
add arrow max{xi , yj } → min{xi , yj }
while i < s ∧ j < t
    // invariant: max{xi , yj } points to min{xi , yj }.
    if xi+1 < yj+1
        i := i + 1
        add arrow xi → yj
    else
        j := j + 1
        add arrow yj → xi
while i < s
    // invariant: max{xi , yj } points to min{xi , yj }.
    i := i + 1
    add arrow xi → yj
while j < t
    // invariant: max{xi , yj } points to min{xi , yj }.
    j := j + 1
    add arrow yj → xi
Algorithm 2.3.5.1: This algorithm is used in the proof of theorem 2.3.5.
but a > b because we are under the lexicographic order. That is why c and d
point to a rather than to b.
Theorem 2.3.5. Let U and V be two totally ordered sets, and let S be a subset
of U × V. The proximity graph AS of S takes space O(n log n) and can be built
in time O(n log n).
Proof. (sketch) Look again at figure 2.9. AS contains |S| points. Each point in
AS can point to at most one brother per level; therefore, because there are
O(log n) levels, it can point to O(log n) brothers at most. This means that, for
each point p in AS , the set of arrows that start from p has cardinality O(log n),
so we need at most O(log n) pointers per point.
HS takes space O(n log n) and, by theorem 2.2.13, can be built in time
O(n log n). We first build HS and then transform it into AS in the following
way. Let hi be the sequence of level i in HS , and hi,j the j-th sequence in
hi . For each valid level l, all the pointers representing the arrows of level l
in AS can be set by considering every pair (hl,i , hl,i+1 ) such that there exists
h2l,k that has the same elements as hl,i hl,i+1 . Let hl,i = (x1 , x2 , . . . , xs ) and
hl,i+1 = (y1 , y2 , . . . , yt ). We use algorithm 2.3.5.1. Algorithm 2.3.5.1 never
spends more than O(1) time on each pair (xi , yj ) and, because the indices i
and j are only incremented, never considers more than s + t pairs.
Since AS has O(n log n) arrows, AS can be built in time O(n log n). Note
that, in a real implementation, we can, and should, build AS directly, without
first building HS .
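Algorithm 2.3.5.1 is essentially a coordinated merge of the two sorted brother sequences. A direct Python transcription (0-based indices; add_arrow is supplied by the caller; the names are ours) might look like this:

```python
def add_level_arrows(x, y, add_arrow):
    """0-based transcription of algorithm 2.3.5.1: given two sorted
    brother sequences x and y, emit an 'arrow' from the larger to the
    smaller element of each advancing pair, in O(len(x) + len(y))."""
    i, j = 0, 0
    while i < len(x) - 1 and x[i + 1] < y[j]:
        i += 1
    while j < len(y) - 1 and y[j + 1] < x[i]:
        j += 1
    add_arrow(max(x[i], y[j]), min(x[i], y[j]))
    while i < len(x) - 1 and j < len(y) - 1:
        # invariant: max(x[i], y[j]) points to min(x[i], y[j])
        if x[i + 1] < y[j + 1]:
            i += 1
            add_arrow(x[i], y[j])
        else:
            j += 1
            add_arrow(y[j], x[i])
    while i < len(x) - 1:
        i += 1
        add_arrow(x[i], y[j])
    while j < len(y) - 1:
        j += 1
        add_arrow(y[j], x[i])

arrows = []
add_level_arrows([1, 4, 9], [2, 6, 7], lambda a, b: arrows.append((a, b)))
print(arrows)  # → [(2, 1), (4, 2), (6, 4), (7, 4), (9, 7)]
```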
Problem 2.3.6. Let U and V be two totally ordered sets, and S a subset of
U × V. Let a, b ∈ U × V, and P = {(u, v) ∈ S | av ≤ v ≤ bv }. Given a and b,
find a 2-selection T = (t1 , t2 , . . . , tm ) of HS of length at most O(log2 n), such
that the concatenation of T contains the same points as P . For each ti , find
also, if there is such a point, the biggest point p in ti such that pu ≤ bu .
Theorem 2.3.7. There exists an algorithm that solves problem 2.3.6 in time
O(log n) and space O(n log n) with a precomputation step, independent of a and
b, taking time O(n log n).
Proof. (sketch) According to theorem 2.3.5 we can build the proximity graph
AS in time O(n log n). To simplify this discussion, let us assume that AS and
HS are joined to form a single data structure. We do not really need HS ,
but it simplifies the explanation somewhat. We do, however, need the single
sequence in the last level of HS (i.e. the lexicographically ordered sequence
which contains all the points in S). Let us call it Z.
Let hi,j be the j-th sequence in the i-th sequence in HS . For each valid i and
j, let Mi,j be the biggest point in hi,j such that (Mi,j )u ≤ bu . If i > 1 then there
is a k such that Mi,j = Mi−1,k . Let q be the integer such that hi,j contains the
same points as hi−1,k hi−1,q . Since Mi−1,k and Mi−1,q are brothers of level i − 1
and Mi−1,k = Mi,j > Mi−1,q , it is clear that, by definition of AS , Mi−1,k points
to Mi−1,q .
If s is a sequence, let Ms be the biggest point in s such that (Ms )u ≤ bu .
The proof of theorem 2.2.15 (which refers to the proof of theorem 2.1.13) shows
how T can be found in time O(log n) by starting from Z and, for each sequence
s, considering the subsequences L(s) and R(s). Since we can find MZ with a
single O(log n)-time search in Z, and since, for each sequence s, given Ms we
can find ML(s) and MR(s) in time O(1), we can indeed solve problem 2.3.6 in
time O(log n).
Theorem 2.3.8. Problem 2.2.4 can be solved in space O(n log n) and time
O(log n) with a precomputation step (independent of a and b) which takes time
O(n log n).
Proof. (sketch) Let a and b be the two points in problem 2.2.4. According to
theorem 2.3.7, we can solve problem 2.3.6 for the same two points a and b in
time O(log n) and space O(n log n) with a precomputation step, independent
of a and b, taking time O(n log n). Let P = {p1 , p2 , . . . , pk } be the set of the
O(log n) points we are asked to find in problem 2.3.6. All the points in the
rectangle identified by the points a and b can be found by searching for the
biggest points on the left of each one of the points in P . We can do this by
simply following the arrows in AS as we did to find the points pi in the first
instance.
There is only one small obstacle: points with the same v-coordinate do not
point to each other in AS , but we can promptly solve the problem by adding
the required O(n) arrows to AS during its construction.
Note that since S is static we can actually let the client find the other points
by itself and say that the problem can indeed be solved in time O(log n).
Remark 2.3.9. In a real implementation, we should not solve problem 2.2.4
as described in the proof of theorem 2.3.8. Let a and b be the two points in
problem 2.2.4 and let R be the set {(u, v) ∈ U × V | au ≤ u ≤ bu }. We should
Figure 2.10: This figure shows a partial proximity graph. Let us assume that
we want to find all the points in the search-rectangle above. The biggest point
in the search-rectangle is p, therefore our search starts exactly from it. The
important thing to note is that as soon as we reach p2 , p6 and p8 we know that
p and p5 are the only two points in the search-rectangle. There is no need to
proceed any further.
check whether the points Mi,j , defined as in the proof of theorem 2.3.7, are
included in R as we find them. If Mi,j is not in R, there is no need to follow the
arrows that start from it: the pointed points will be clearly out of R as well.
Let us consider the case depicted in figure 2.10. After an O(log n)-time
search we find the point p and then start taking advantage of AS to determine
the other points. In a real implementation, as soon as we see that p2 is not in
R, we stop following that branch and go back to p immediately and follow some
lower-level arrow. Similarly, as soon as we reach p6 and see that it is not in R,
we go back to p5 .
Figure 2.11: This figure shows how a logarithmic decomposition along the third
axis can be used to partition the space in such a way that problem 2.4.2 may
be solved by solving O(log n) instances of problem 2.4.1.
Problem 2.4.2. Let U, V and W be three totally ordered sets. Given a set
S = {s1 , s2 , . . . , sn } ⊂ U × V × W and a, b ∈ U × V × W, find {(u, v, w) ∈ S |
au ≤ u ≤ bu ∧ av ≤ v ≤ bv ∧ aw ≤ w ≤ bw }.
Theorem 2.4.3. Problem 2.4.2 can be solved in space O(n log² n) and time
O(log² n) with a precomputation step (independent of a and b) taking time
O(n log² n).
Proof. (sketch) First of all, note that problem 2.4.1 can be solved by slight
variations of the two methods described in the previous sections. The impor-
tant thing is that the lexicographic ordering be extended to include the third
coordinate: that way, given two points, one is always smaller than the other.
This suggests that if we partition the three-dimensional space along the
third axis by using the logarithmic decomposition, we can solve problem 2.4.2,
by solving at most O(log n) instances of problem 2.4.1.
Now note also that the addition of a third dimension does not substantially
alter the structures and the algorithms used to handle the group decompositions
and the proximity graphs. Once again, the additional dimension is taken care
of by the lexicographic ordering extended to include it. A group decomposition
partitions the space in O(log n) different ways (in groups of length 1, 2, 4, etc.).
Each one of these O(log n) partitioned spaces has to be further partitioned
along the second axis as we do in the two-dimensional case, therefore we need
space and time O(n log² n).
An example should help. Look at figure 2.11. First we partition the box (con-
taining all the points in S) into 8 boxes of height 1: a1 , a2 , . . . , a8 . Now, we de-
compose each one of them along the second axis as we did in the two-dimensional
case. This operation takes time O(|a1 | log n)+O(|a2 | log n)+. . .+O(|a8 | log n) =
O(n log n), where |ai | is the number of points within ai . We then partition the
box into 4 boxes of height 2: b1 , b2 , b3 , b4 . We decompose each one of them in
time O(|b1 | log n) + O(|b2 | log n) + O(|b3 | log n) + O(|b4 | log n) = O(n log n). We
repeat the same procedure for c1 , c2 and d1 . The total time required to build a
three-dimensional version of HS or AS is therefore O(n log² n). Note that if we
merge all these proximity graphs into a single graph, we end up with a graph
whose nodes each have O(log² n) arrows.
Let us say we want to find all the points p such that au ≤ pu ≤ bu ∧ av ≤
pv ≤ bv ∧ aw ≤ pw ≤ bw . Let Z be the single sequence in the last level
of HS . We know that Z contains the same points as S. If s is a sequence,
let Ms be the biggest point in s such that (Ms )u ≤ bu . Let Hw be the group
decomposition of S along the third axis. We know that there is a 2-selection
Tw = (t1 , t2 , . . . , tk ) of Hw such that the concatenation of Tw contains the same
points as the set {(u, v, w) ∈ S | aw ≤ w ≤ bw }. We first find the point MZ ,
then use the “third-axis proximity graph” to find the points Mt1 , Mt2 , . . . , Mtk .
For each Mti we can now use the “second-axis proximity graphs” to solve the k
(i.e. O(log n)) 2.5D sub-problems. We call them “2.5D” because they are 2D
problems whose points are immersed in a three-dimensional space.
Note that since S is static, we can actually find the biggest points, let the
client find the other points by itself, and say that the problem can indeed be
solved in time O(log² n).
Remark 2.4.4. What we said in remark 2.3.9 applies to the multidimensional
case as well. For instance, if we are walking through the “third-axis proximity
graph” and see that the current biggest point p has a u-coordinate smaller
than that of a, we can abandon p immediately without even examining the
“second-axis proximity graphs” it leads to.
Problem 2.4.5. Let U1 , U2 ,. . . , Ud be totally ordered sets and let U = U1 ×
U2 × · · · × Ud , where d ≥ 2. Given a set S = {s1 , s2 , . . . , sn } ⊂ U and a, b ∈ U,
find {(x1 , x2 , . . . , xd ) ∈ S | ∀i ∈ {1, 2, . . . , d} aui ≤ xi ≤ bui }.
Theorem 2.4.6. Problem 2.4.5 can be solved in space O(n logᵈ⁻¹ n) and time
O(logᵈ⁻¹ n) with a precomputation step (independent of a and b) taking time
O(n logᵈ⁻¹ n). Moreover, if the restrictions are imposed only on k ≥ 2 coordi-
nates, the search takes time O(logᵏ⁻¹ n + d − k).
Proof. (sketch) Problem 2.4.5 is an almost straightforward generalization of
problem 2.4.2. We can prove it by induction on d by generalizing the reasoning
of the proof of theorem 2.4.3. To be precise, we should also account for the fact
that a lexicographic comparison takes time O(d) in dimension d. We can rule
out the d-factor by noting that the asymptotic estimates are dominated by the
cost of the O(logᵈ⁻² n) instances of the two-dimensional problem: instead of
keeping the points distinct by extending the lexicographic ordering as proposed
in the proof of theorem 2.4.3, we consider only the first two coordinates (exactly
as if we were to solve problem 2.2.4) and, whenever two points coincide, we put
them in the same multinode. That is no different from what one has to do to
handle repeated keys with structures that do not support them natively.
If the restrictions are imposed only on k ≥ 2 coordinates, we will need
to find only k 2-selections because, in the other d − k cases, we will immediately
choose the group that contains all the points (in the subspace we are in at that
moment).
Chapter 3
Dynamic Indexing
(because we change the problem): if our space is the Cartesian product of metric
spaces (not necessarily bounded), then we can build the proximity graph by
connecting the points with arrows whose length is directly proportional to the
distance (along a single axis) of the points to be connected. Because the spaces
may be unbounded, we should determine a unit length l based on the first
two points we receive and then let the other arrows be of length 2ⁱ l, for some
i ∈ Z.
Another possible solution is to relax the proximity graph as much as
possible and perform a rebalancing step each time we add or delete a point.
There is a little problem here: a group of level 1 can have many points, so
many steps may be necessary to handle it. We can overcome this obstacle
by (conceptually) perturbing the points so that no two points lie on the same
axis-aligned line. Note that this does not alter our asymptotic estimates. That
way, the work needed to rebalance the proximity graph should be proportional
to the number of points added and deleted.
You might have noticed that something is missing here: how do we insert
and delete points? If we cannot perform these operations efficiently, then there
is no point in talking of splitting and joining groups.
Definition 3.1.1. A relaxed proximity graph is balanced if no splittings or
fusions can be performed on it.
Figure 3.1: This figure shows a small portion of a proximity graph into which
a point p has just been inserted. Let us assume that q1 < p < q2 and that all
the arrows in the figure are of the same level. It is clear that, in this case, r4 ,
r5 and r6 must stop pointing to q1 and start pointing to p.
you can see by looking at figure 3.2, the points r1 , r2 , . . . , rn point both to q1
and to the nodes in Tq1 ,l that represent them (or directly to other elements ri
contained in the tree: it depends on the type of tree used). Figure 3.3 shows
what happens when Tq1 ,l is split. Note that, after the splitting, the points in
Tp,l still point to q1 . Because no direct pointers to q1 need to be modified, the
splitting and the rebalancing can be performed in time O(log n).
We have two ways of determining where a point points to:
(i) we can follow its direct link in time O(1) or
(ii) we can reach the root of the tree it belongs to in time O(log n) and then
see to which point the tree is associated in time O(1).
The latter method always gives the correct answer, but is slower, so the former
method is always tried first. Let us say we want to determine which point r3
points to. We first follow the direct link of r3 which, in our example, leads us to
q1 . We then check whether r3 is in Tq1 ,l in time O(1) (note that we just have to
look at the last element of the tree). Because r3 is not in Tq1 ,l , we start moving
toward the root of Tp,l and, for each node s we visit along the path, we check
whether s points directly to the correct node p. If this is the case, we are done,
otherwise we move on to the next parent. After we have determined the node
p one way or the other, we make all the nodes we have visited point directly to
p. This is called path compression and it greatly speeds up successive searches.
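Stripped of the Tp,l machinery, the redirection just described is the classic path-compression idea. A minimal standalone sketch, with a plain parent map standing in for the chains of links (an illustration, not the document's exact structure):

```python
def find(parent, x):
    """Follow parent links to the representative of x, then flatten
    the visited chain so later lookups are nearly O(1)
    (path compression)."""
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:          # second pass: compress the path
        parent[x], x = root, parent[x]
    return root

parent = {1: 1, 2: 1, 3: 2, 4: 3}
print(find(parent, 4))  # → 1
print(parent[4])        # → 1 (the link is now direct)
```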
Because we must redirect arrows of O(log n) different lengths, we need time
O(log² n).
We note that P is compact if and only if all the direct links are correct. Since
P is balanced, the arrows that start from p are never more than O(log n) and we
can decide to which elements they must point by first searching for the biggest
point less than p in P and then determining the other points by consulting P as
usual. If the proximity graph is compact we can find all those elements in time
O(log n), otherwise we need time O(log² n). Let b1 , b2 , . . . , bk be the O(log n)
elements to which p must point. For each i ∈ {1, 2, . . . , k}, we must insert p in
one of the trees associated to bi , therefore the total time needed is O(log² n).
Remark 3.2.3. Of course, in a real implementation we will associate a tree Tp,l
of level l to a point p only if p is pointed to by a sufficiently large number of
brothers of level l.
Figure 3.2: This figure shows the tree Tq1 ,l of level l associated to the point
q1 . This tree is used to handle all the arrows of level l that point to q1 . In the
figure, the points r1 , r2 , . . . , r7 are the points from which the arrows of level l
start.
Figure 3.3: This figure shows the trees of level l associated to the points q1 and
p. Before the point p was inserted in the proximity graph, the situation was
that shown in figure 3.2. Here the original tree Tq1 ,l has been split into the new
tree Tq1 ,l and the tree Tp,l , so that now the points r3 , r6 and r4 point to p (see
the proof of theorem 3.2.2).