Nearest and Furthest Points in M-dimensional Space


Shashidhar M. Sugur
III sem. M.Tech. Department of Studies in Computer Science
University of Mysore,
Mysore.

Abstract: In this paper, we present an algorithm to find
the k nearest and k furthest points from a particular point
in a data set of n points in m-dimensional space. The
distance measure and the neighborhood function have many
applications in various fields of computing and
engineering, such as artificial intelligence, image
processing and pattern recognition.
Keywords: Distance measure, k-nearest-neighbors,
k-farthest-neighbors.
1. INTRODUCTION
Given n points in m-dimensional space, we want to find the
distances from a selected point $P(x_1, x_2, \ldots, x_m)$ to all
other points. The distance between P and each other point is
computed with the distance formula (the Euclidean distance in
the example of Section 2).


We tag each point, compute its distance from P, and store the
distances together with their tags in an array a[1..n]. We then
sort the array by distance, in either ascending or descending
order, using any sorting technique and keeping each tag attached
to its distance. From the positions in the sorted array we select
the k-th nearest point and the k-th farthest point.
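The tag-and-sort idea described above can be sketched in Python as follows. This is a minimal illustration rather than the paper's implementation: the function names `tagged_distances` and `kth_nearest_and_farthest`, the sample coordinates, and the use of Python's built-in sort are all assumptions made for the example, and Euclidean distance is assumed.

```python
import math

def tagged_distances(p, points):
    """Return (distance, tag) pairs; tags are numbered from 1 in input order."""
    return [(math.dist(p, q), tag) for tag, q in enumerate(points, start=1)]

def kth_nearest_and_farthest(p, points, k):
    """Return the tags of the k-th nearest and the k-th farthest point from p."""
    a = sorted(tagged_distances(p, points))  # ascending; each tag stays with its distance
    return a[k - 1][1], a[-k][1]

# Hypothetical sample data, not taken from the paper.
points = [(0.0, 5.0), (1.0, 1.0), (4.0, 0.0)]
nearest_tag, farthest_tag = kth_nearest_and_farthest((0.0, 0.0), points, k=1)
print(nearest_tag, farthest_tag)
```

Storing the distance first in each pair lets the sort order the pairs by distance while carrying the tag along, which is the role of the tags in the description above.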
2. ALGORITHM DEVELOPMENT
To trace the operation of the algorithm, we take a simple
example in two-dimensional space and explain how the k-th
nearest and farthest points from the selected point are found.
Consider, for instance, the six points (1,9), (2,6), (6,5),
(3,3), (4,4) and (5,7) in two-dimensional space, as shown in
the graph below.


Note: For two-dimensional graphs the distance formula
is

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
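As a worked instance of the two-dimensional distance formula, take the points (3,3) and (4,4) from the list above:

```latex
d = \sqrt{(4 - 3)^2 + (4 - 3)^2} = \sqrt{2} \approx 1.41
```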
Let the chosen point be (3,3). The remaining five points are
tagged as given in the table below.
Tag    1      2      3      4      5
Point  (1,9)  (2,6)  (4,4)  (5,7)  (6,5)

The algorithm uses the general distance formula

$$d = \left( \sum_{i=1}^{m} |x_i - y_i|^p \right)^{1/p}$$

with p = 2 to determine the distance d between the chosen
point and each of its neighbors, and stores the distances in a
one-dimensional array as shown below. Here $x_i$ and $y_i$ are
the i-th coordinates of the two points.
Tag       1     2     3     4     5
Distance  6.32  3.16  1.41  4.47  3.61

The distances are then sorted in ascending order using any
sorting technique; in our case, we have used bubble sort. The
result is shown below.
PrevTag   3     2     5     4     1
NewTag    1     2     3     4     5
Distance  1.41  3.16  3.61  4.47  6.32

For k = 1, the k-th nearest point is previous tag 3 and the
k-th farthest point is previous tag 1; both are fetched from
the sorted array.
Algorithm: k nearest and k farthest points.
Input: n, m, f[i][j] (i = 1..n, j = 1..m), x[j] (j = 1..m), k.
Output: k nearest and k farthest points from point x.
Method:
{
    // compute the distance of every point from x
    for (i ← 1 to n)
    {
        s ← 0
        for (j ← 1 to m)
        {
            q ← x[j] - f[i][j]
            s ← s + q*q
        }
        d[i] ← square root(s)
        c[i] ← i    // tag of point i
    }

    // sort the distances in ascending order, moving the tags with them
    for (i ← 1 to n-1)
    {
        for (j ← i+1 to n)
        {
            if (d[i] > d[j])
            {
                temp ← d[j]
                d[j] ← d[i]
                d[i] ← temp
                temp ← c[j]
                c[j] ← c[i]
                c[i] ← temp
            }
        }
    }

    // display the k nearest points
    for (i ← 1 to k)
    {
        for (j ← 1 to m)
        {
            Display (f[c[i]][j])
        }
    }

    // display the k farthest points
    for (i ← n down to (n-k+1))
    {
        for (j ← 1 to m)
        {
            Display (f[c[i]][j])
        }
    }
}
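The pseudocode above could be translated into Python roughly as follows. This is a sketch under stated assumptions, not the author's program: the function name `k_nearest_and_farthest` is hypothetical, indices are 0-based instead of 1-based, and the points are returned as lists instead of being displayed. The array names f, x, d and c follow the pseudocode.

```python
import math

def k_nearest_and_farthest(f, x, k):
    """Return (k nearest points, k farthest points) of f relative to x.

    f is a list of n points with m coordinates each; x is the query point.
    Indices are 0-based here, unlike the 1-based pseudocode.
    """
    n, m = len(f), len(x)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    d = [0.0] * n  # d[i]: distance of the point at sorted position i
    c = [0] * n    # c[i]: tag (original index) of that point
    for i in range(n):
        s = 0.0
        for j in range(m):
            q = x[j] - f[i][j]
            s += q * q
        d[i] = math.sqrt(s)
        c[i] = i

    # Same pairwise-exchange loop as the pseudocode: ascending by distance,
    # with each tag swapped together with its distance.
    for i in range(n - 1):
        for j in range(i + 1, n):
            if d[i] > d[j]:
                d[i], d[j] = d[j], d[i]
                c[i], c[j] = c[j], c[i]

    nearest = [f[c[i]] for i in range(k)]
    farthest = [f[c[i]] for i in range(n - 1, n - k - 1, -1)]
    return nearest, farthest

# The five neighbors of (3,3) listed in Section 2.
neighbors = [(1, 9), (2, 6), (4, 4), (5, 7), (6, 5)]
nearest, farthest = k_nearest_and_farthest(neighbors, (3, 3), k=2)
print("nearest:", nearest)
print("farthest:", farthest)
```

The farthest points are read from the end of the sorted array backwards, so they come out in order of decreasing distance.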

3. A PRIORI ANALYSIS

After an analytical review of the algorithm, we obtained
the following time complexity. The full step-count table
is omitted because it would be too lengthy.
The time complexity for the algorithm is as follows:
Best case:

$$t = 3n^2 + 12mn + 15n$$

For large n, $t \propto n^2$, so $t = \Omega(n^2)$.

Worst case:

$$t = 11n^2 + 12mn + 15n + 3$$

For large n, $t \propto n^2$, so $t = O(n^2)$.

$\Omega(n^2) \cap O(n^2) = \Theta(n^2)$.
The total time taken by the algorithm therefore depends
on two factors: the total number of points and the
number of dimensions of the space.
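To illustrate where the gap between the best and worst cases comes from, the sketch below counts the comparisons and swaps made by the sorting loop of Section 2. It is an illustration only: the function name `exchange_sort_counts` and the choice n = 15 are assumptions, and it counts loop events rather than the individual steps used in the analysis above.

```python
def exchange_sort_counts(d):
    """Sort a copy of d in ascending order; return (comparisons, swaps)."""
    d = list(d)
    n = len(d)
    comparisons = swaps = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            comparisons += 1
            if d[i] > d[j]:
                d[i], d[j] = d[j], d[i]
                swaps += 1
    return comparisons, swaps

n = 15  # hypothetical number of points
best = exchange_sort_counts(range(n))          # distances already ascending
worst = exchange_sort_counts(range(n, 0, -1))  # distances strictly descending
print("best case (comparisons, swaps):", best)
print("worst case (comparisons, swaps):", worst)
```

The number of comparisons is n(n-1)/2 for any input, so both cases are quadratic in n; only the number of swaps differs. The distance loop, which runs n·m times, is what brings the number of dimensions m into the running time.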
4. PROFILING
As seen in the a priori analysis, the total execution time
of the program depends on two factors: the total number
of points in the space and the number of dimensions of
the space.
For this reason, the a posteriori analysis was done by
first keeping the number of dimensions constant and
varying only the number of points. The resulting values
are shown in the table below.


n    Best case   Worst case   Average
5    270         430          356.5833
10   690         1410         1081.833
15   1260        2970         2093.667
20   1980        5020         3582
25   2850        7650         5541.833
30   3870        10830        6807.5
Table 4.1 Values obtained with the number of dimensions kept constant.
The graph below was generated from the values in the table,
which were obtained from the experiment.

The graph shows a quadratic relationship between the
computing time and the size of the list for both the best and
worst cases. The average case lies between the best and
worst cases.
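One simple way to check the quadratic relationship numerically is to take second differences of the values in Table 4.1: for equally spaced n, a quadratic function has constant second differences. The sketch below does this for the best-case and worst-case columns. The helper name `second_differences` is an assumption made for the example.

```python
def second_differences(values):
    """Return the second differences of a sequence of equally spaced samples."""
    first = [b - a for a, b in zip(values, values[1:])]
    return [b - a for a, b in zip(first, first[1:])]

# Columns of Table 4.1 for n = 5, 10, 15, 20, 25, 30.
best_case = [270, 690, 1260, 1980, 2850, 3870]
worst_case = [430, 1410, 2970, 5020, 7650, 10830]

print("best case:", second_differences(best_case))
print("worst case:", second_differences(worst_case))
```

The best-case second differences are all equal to 150, which is what an exactly quadratic function gives. The worst-case values are only roughly constant (between 490 and 580), which is still consistent with quadratic growth.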
Lastly, the number of dimensions is varied while the number
of points is kept constant, as shown in the table of values
below.
m    Time
2    1260
3    1440
4    1620
5    1800
6    1980
7    2160
Table 4.2 Values obtained with the number of points kept constant.
The graph below was generated from the values in the table,
which were obtained from the experiment.

We observe that the time complexity is linear in the number
of dimensions.
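The linear trend can be compared with the a priori analysis. The sketch below reads the best-case count of Section 3 as t = 3n^2 + 12mn + 15n and assumes n = 15 points. Both are assumptions: the paper does not state how many points were used for Table 4.2, and 15 is inferred from the matching entry in Table 4.1. The function name `best_case_steps` is hypothetical.

```python
def best_case_steps(n, m):
    """Best-case step count as read from Section 3: 3n^2 + 12mn + 15n."""
    return 3 * n * n + 12 * m * n + 15 * n

# Table 4.2: number of dimensions m -> measured time.
table_4_2 = {2: 1260, 3: 1440, 4: 1620, 5: 1800, 6: 1980, 7: 2160}

n = 15  # assumed; not stated in the paper
for m, measured in table_4_2.items():
    print(m, measured, best_case_steps(n, m))
```

Under these assumptions each extra dimension adds 12n = 180 steps, which matches the constant increment between consecutive rows of the table.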
5. CONCLUSION
In conclusion, the time complexity of the algorithm
depends on the sorting method used. Here we use
bubble sort, so the time complexity is $\Theta(n^2)$,
where n represents the number of points in the
neighborhood. If a better sorting technique such as
merge sort were used in the above algorithm, the time
complexity would be $n \log n$.
The space complexity depends on the number of
dimensions of the space. The calculated distance values
are stored in a one-dimensional array whose size depends
on the size of the input, together with some temporary
variables.
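The merge sort alternative mentioned above could look roughly like the following sketch, which sorts (distance, tag) pairs so that each tag stays attached to its distance. The function name `merge_sort` and the sample pairs are assumptions made for the example. This is not part of the algorithm as presented in Section 2.

```python
def merge_sort(pairs):
    """Return the (distance, tag) pairs sorted in ascending order."""
    if len(pairs) <= 1:
        return list(pairs)
    mid = len(pairs) // 2
    left = merge_sort(pairs[:mid])
    right = merge_sort(pairs[mid:])
    merged = []
    i = j = 0
    # Merge the two sorted halves by repeatedly taking the smaller head.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

# Hypothetical (distance, tag) pairs, not taken from the paper.
pairs = [(4.0, 1), (1.5, 2), (3.0, 3), (0.5, 4)]
print(merge_sort(pairs))
```

With this sort the sorting step takes on the order of n log n comparisons. The distance computation still visits all n·m coordinates, so the number of dimensions continues to affect the total running time.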
