
A Thinning Algorithm for Digital Figures of Characters

Michio SHIMIZU, Hiroshi FUKUDA*, Gisaku NAKAMURA**


Nagano Prefectural College, University of Shizuoka*, Tokai University**
E-mail: mitio@ns.nagano-kentan.ac.jp, fukuda@u-shizuoka-ken.ac.jp*

Abstract

Thinning is a useful pre-processing step in image processing. Various algorithms have been proposed to produce the skeleton of a digital binary pattern. However, they have undesirable properties: they tend to cause shrinking or vanishing of segments, the appearance of a beard (spurious short branches), and warping where segments intersect. In this paper, we propose a parallel Hilditch algorithm to obtain a more stable output. In particular, we introduce two kinds of masks that are effective in the thinning of digital figures of characters. We then evaluate the performance of our scheme by thinning 432 commonly used font characters. We conclude that the skeletons obtained from our method are better than those of the other major thinning algorithms compared.

1. Introduction

The graphic data taken from an image scanner or a similar device and processed by a computer are mainly letters, map information, and design drawings. Thinning is often an important step in processing these graphics. For example, a letter is input as an image whose strokes have a certain width, but to simplify the task of character recognition it is desirable to pre-process it into a line drawing that is as thin as possible. The technique of making a line thin without spoiling the information in the original is called thinning. Various algorithms have been proposed to produce the skeleton of a digital binary pattern [2]-[5]. However, they have undesirable properties that tend to cause shrinking or vanishing of segments, the appearance of a beard, and warping where segments intersect, and their benefits may depend on the processed objects. Because there is no clear definition of thinning, the quality criterion tends to be subjective. Nor can it be said that these algorithms have been fully validated by experiments on concrete processing objects.

In this paper, a new thinning algorithm for digital figures of characters is proposed. The classic Hilditch algorithm [2] is parallelized, and two kinds of masks for performing local operations are introduced. We then evaluate the performance of our scheme by thinning 432 commonly used font characters. We find that the skeletons obtained from our method are better than those of the other major thinning algorithms compared.

2. Definition

First, the terms needed to describe the algorithms are defined. A processing object is a binary graphic that takes the value 0 or 1 on a plane square lattice. Thinning reduces the line width of this binary graphic to 1, and the graphic of width 1 generated by thinning is called a core line or skeleton. Figure 1 shows the 8 neighbors of the objective point (pixel 0) on a 3 x 3 pattern. Among these, pixels 1, 3, 5, and 7 are called the 4 neighbors. The 8 neighbors take the value 0 or 1, and they are represented by x_k (k = 1, 2, ..., 8).

4 3 2
5 0 1
6 7 8

Figure 1. 8 neighbors

In what follows, the value of the objective point is set to 1, and in the figures 1 is shown in gray and 0 in white. If two pixels adjoin each other in the vertical, horizontal, or diagonal direction, they are said to be connected. Figure 2 shows an example of connected pixels.

Figure 2. Connection of pixels
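The notion of connection defined above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `are_connected` and `count_components` and the representation of the binary graphic as a list of rows of 0/1 values are hypothetical and do not come from the paper.

```python
def are_connected(p, q):
    """True when pixels p and q, given as (row, col), adjoin each other
    in the vertical, horizontal, or diagonal direction."""
    return p != q and max(abs(p[0] - q[0]), abs(p[1] - q[1])) == 1


def count_components(img):
    """Count the groups of 1-pixels that are linked through connected
    pixels (8-neighbor connection), using an iterative flood fill."""
    ones = {(y, x)
            for y, row in enumerate(img)
            for x, v in enumerate(row) if v == 1}
    count = 0
    while ones:
        stack = [ones.pop()]          # start a new group
        count += 1
        while stack:
            y, x = stack.pop()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    n = (y + dy, x + dx)
                    if n in ones:     # unvisited connected 1-pixel
                        ones.remove(n)
                        stack.append(n)
    return count
```

Under this definition two diagonally adjacent pixels form a single group, whereas two pixels separated by a 0 in the same row form two groups.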
The connectivity number Nc is defined by the equation:

$$N_c = \sum_{k \in \{1,3,5,7\}} \bar{x}_k \left(1 - \bar{x}_{k+1}\,\bar{x}_{k+2}\right)$$

where $\bar{x}_k = 1 - x_k$, and k + 2 = 1 for k = 7 [6]. The connectivity number takes the value 0, 1, 2, 3, or 4. A point with Nc = 1 is called a boundary point, and in general it becomes a candidate point for deletion. However, a boundary point whose 8 neighbors sum to one or less is called an endpoint, and it cannot be deleted. If the graphic contains no boundary points other than endpoints, it is called a complete 8 connection or complete 8 connected core line.

The complete 8 connected core line is generated by the Hilditch algorithm or by the parallel Hilditch algorithm described in the next section. The core line ultimately generated by our algorithm, however, is called an incomplete 8 connection. It is defined by the intersection number:

$$d = \sum_{k=1}^{8} \left| x_k - x_{k+1} \right|, \qquad x_9 = x_1$$

d expresses the number of changes between 0 and 1 around the 8 neighbors. Since d is an even number (2, 4, 6, or 8), d' = d/2 is used instead of d. If every pixel of the core line satisfies (d' ≠ 1) or (d' = 1 and Σ x_k ≤ 1), the core line is said to be an incomplete 8 connection.

3. Parallel Hilditch algorithm

A thinning algorithm consists of the repeated shaving of the boundary and a stop test that decides whether a core line has been generated. The repeated shaving process is roughly divided into the sequential model and the parallel model, depending on the timing of deletion. For the sequential model, the raster-scan type, which scans the pixels one line at a time from the upper left toward the lower right, is the most common. Although the algorithm is simple, the scanning direction introduces an asymmetry. A parallel model processes all pixels simultaneously and has no problem of asymmetry, but it needs some device to keep a graphic of line width 2 from vanishing. Therefore, each repetition is further divided into several cycles.

Figure 3. Four cycles (cycle 1, cycle 2, cycle 3, cycle 4)

The classical Hilditch algorithm is a fundamental sequential algorithm using the connectivity number [2]. In this paper, in order to introduce the masks described later, the Hilditch algorithm is converted to a parallel type, and this is the kernel of our algorithm. The operation consists of a pre-processing stage and a scan stage. In the pre-processing stage, boundary points with connectivity number Nc = 1 are searched for. The scan stage is divided into 4 cycles as illustrated in Figure 3. In each cycle, only those pixels whose neighbor x_c associated with that cycle is 0 are checked, and if the two conditions

(1) $\sum_{k=1}^{8} x_k > 1$: not an endpoint,
(2) $N_c = 1$: boundary point,

are satisfied, the pixel is removed. The algorithm stops when there are no more points to delete. The flow of the parallel Hilditch algorithm is shown in Figure 4, where point 2 denotes a candidate point for deletion and point 3 a point to delete.

Input of binary pattern (point 1 → point 2)
Boundary points except edge points among 2 → point 3
Deletion of point 3 by 4 cycles
Satisfy stop condition? (N: repeat)

Figure 4. Parallel Hilditch algorithm

4. Masks

Parallelization makes it possible to introduce two kinds of masks for the processing of exceptional points. Masks are already applied in some other parallel thinning algorithms. In this study, various masks were added to the parallel Hilditch algorithm for character graphics, and their usefulness was examined. Because the processing time of an algorithm is affected by the number and the size of the masks, suitable masks need to be chosen. Consequently, we decided to add internal point masks and cross point masks, described in the rest of this section.

Figure 5. Exceptional point: (a), (b)
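As a reference point for the masks, the mask-free kernel of Section 3 can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, pixels outside the image are treated as 0, the four cycles are assumed to test the 4 neighbors x1, x3, x5, x7 in that order (the content of Figure 3 is not reproduced in the text), and the two deletion conditions are re-evaluated on the current image in every cycle.

```python
# Offsets (dy, dx) of the neighbors x1..x8, following Figure 1:
#   4 3 2
#   5 0 1
#   6 7 8
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
           (0, -1), (1, -1), (1, 0), (1, 1)]


def neighbours(img, y, x):
    """Return [x1, ..., x8]; pixels outside the image count as 0 (assumption)."""
    h, w = len(img), len(img[0])
    out = []
    for dy, dx in OFFSETS:
        yy, xx = y + dy, x + dx
        out.append(img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0)
    return out


def connectivity_number(nb):
    """Nc = sum over k = 1,3,5,7 of xbar_k * (1 - xbar_{k+1} * xbar_{k+2})."""
    bar = [1 - v for v in nb]
    return sum(bar[k] * (1 - bar[k + 1] * bar[(k + 2) % 8])
               for k in (0, 2, 4, 6))      # list index 0 holds x1


def intersection_number(nb):
    """d = number of 0/1 changes when walking once around the 8 neighbors."""
    return sum(abs(nb[k] - nb[(k + 1) % 8]) for k in range(8))


def thin(img, cycle_order=(0, 2, 4, 6)):
    """Repeat the 4 cycles until no pixel is deleted; returns a new image."""
    img = [row[:] for row in img]
    changed = True
    while changed:
        changed = False
        for c in cycle_order:              # assumed cycle order: x1, x3, x5, x7
            doomed = []
            for y, row in enumerate(img):
                for x, v in enumerate(row):
                    if v != 1:
                        continue
                    nb = neighbours(img, y, x)
                    # x_c = 0, not an endpoint, and a boundary point
                    if (nb[c] == 0 and sum(nb) > 1
                            and connectivity_number(nb) == 1):
                        doomed.append((y, x))
            for y, x in doomed:            # delete simultaneously (parallel step)
                img[y][x] = 0
            changed = changed or bool(doomed)
    return img
```

For instance, `thin` leaves a one-pixel-wide horizontal line unchanged, because its interior pixels have Nc = 2 and its two ends are endpoints. Collecting the deletions of a cycle in a list before applying them is what makes each cycle parallel rather than sequential.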
The number of masks becomes 48 as shown below. The notations used in the masks are as follows:

●: objective point
+: point whose value is greater than or equal to one
-: point whose value is zero
△, ▲: each of the two points is +
space: arbitrary point

First, in order to protect the angle of the characters "L" or "V", the internal point masks are introduced. The gray-shaded pixels in Figure 5(a) are the candidate points for deletion found by pre-processing, and they do not include the black point in the corner. If it is left in place, this point will cause a warp in the corner in the course of successive processing. Therefore, the internal point mask of character L detects this pixel and adds it to the candidate points for deletion. The internal point masks of character L are 4 masks: Figure 6(a) and its rotations by π/2, π, and 3π/2. For the corner of the character V, which has a smaller angle than L, 24 internal point masks are prepared. They are Figure 6(b), (c), (d), their rotations by π/2, π, and 3π/2, and their mirror images. Here, ★ denotes an internal point of character L. So the sum of internal point masks is 28.

Figure 6. Internal point mask: (a) char. L, (b) char. V(1), (c) char. V(2), (d) char. V(3)

Second, we introduce the cross point masks used in the final step of producing a core line. Figure 5(b) illustrates the crossing point at the center of the "T" figure. This point would be removed because Nc = 1, but it may be visually preferable to leave it. In that case, a cross point mask marks such a point as non-deletable. There are two kinds of masks, for character L and character T. The cross point masks of character L are 8 masks, given by Figure 7(a) and its rotations by π/2, π, and 3π/2. For the intersection of the character T, 12 cross point masks are prepared. They are the 4 masks of Figure 7(b) and its rotations by π/2, π, and 3π/2, and the 8 masks of Figure 7(c), its rotations by π/2, π, and 3π/2, and their mirror images. Therefore, the sum of cross point masks is 20.

Figure 7. Cross point mask: (a) char. L, (b) char. T(1), (c) char. T(2)

The classical Hilditch algorithm and our parallel Hilditch algorithm described in Section 3 produce a complete 8 connected core line, but once the masks are introduced, an incomplete 8 connected core line is produced. The flow of our parallel Hilditch algorithm with masks is shown in Figure 8. This is an extension of Figure 4, and point 1 denotes a permanently fixed point.

Input of binary pattern (point 1 → point 2)
Boundary points except edge points among 2 → point 3
Points searched by internal point mask among 2 → point 3
Points searched by cross point mask among 2 or 3 → point 1
Deletion of point 3 by 4 cycles
Satisfy stop condition? (N: repeat)

Figure 8. Parallel Hilditch algorithm with mask

5. Evaluation

To judge the performance of our algorithm, we compared it with some other major algorithms. The algorithms, character font, evaluation items, and evaluation methods of the experiment are as follows.

① Algorithms: The 4 existing algorithms of Hilditch [2] (abbreviated HD), Deutsch [3] (DA), Tamura [4] (TA), and Tsuruoka [5] (TS), plus our Parallel Hilditch (PH) kernel and Parallel Hilditch with masks (PHM). Thus, 6 algorithms are compared. The parallel Hilditch algorithm is included as a reference in order to clarify the benefits of the masks.
② Character font: MS Gothic font in 3 sizes (18 points, 36 points, 72 points). Each size consists of 144 characters comprising the alphabet (uppercase, lowercase), katakana, and hiragana. Thus the total number of characters is 432.
③ Evaluation items: Seven quality evaluation items [4] of the core line are used. They are deviation from the central position, shrinking of segments, appearance of beard, warp in the intersection part of character L, character T, and character +, and others.
④ Evaluation methods: It is impossible to evaluate the quality of core lines quantitatively. So, we estimate their quality by visual evaluation on a 5-level scale: 1 (good), 2 (somewhat good), 3 (usual), 4 (somewhat bad), and 5 (bad). For the 432 characters and 6 algorithms, the same person performs the evaluation twice in order to control the subjective factor.

A system handling the bitmap data of a character font was developed on Windows, and the thinning experiment was performed. Figure 9 shows the results of thinning "ア" with the 6 algorithms. In this example, shrinking for TA, beards for HD, DA, and TS, and a warp at the intersection for PH can be seen. The results of the experiment are shown in Table 1. The unit of the processing time is msec. There is very little difference between the results when the experiment is repeated.

Figure 9. Results produced through thinning. Upper: HD, DA, TA. Lower: TS, PH, PHM.

The main results are:

(a) The Hilditch algorithm is likely to produce a beard, and the parallel Hilditch algorithm is likely to produce a warp in the intersection part.
(b) Skeletons produced by the parallel Hilditch algorithm with masks showed the best quality for most of the items mentioned above, at the cost of a longer processing time (0.39 msec versus 0.17 msec for PH in Table 1).

This shows the advantage of our algorithm, which corrects the beard through parallel processing and corrects the warp in the intersection part through the masks. The processing time increases because time is spent on pattern matching between a mask and a part of the letter graphic, since the mask is extended to 5 x 5.

Table 1. Evaluations for thinning algorithms

method  property                           average evaluation  frequency (bad)  processing time (msec)
HD      beard                              2.530               13               0.11
DA      multiple pixels at crossing point  2.370               21               0.11
TA      shrinking                          2.640               28               0.17
TS      many beards                        3.167               76               0.17
PH      warp in crossing point             2.465               13               0.17
PHM     increased processing time          1.954                9               0.39
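The values in Table 1 can be compared directly with a few lines of Python. The sketch below only reuses the numbers of the table; the variable names and the choice of the two summary ratios are illustrative assumptions, not part of the original evaluation.

```python
# (average evaluation, frequency of "bad", processing time in msec), from Table 1.
# A lower average evaluation is better: 1 = good, 5 = bad.
TABLE_1 = {
    "HD":  (2.530, 13, 0.11),
    "DA":  (2.370, 21, 0.11),
    "TA":  (2.640, 28, 0.17),
    "TS":  (3.167, 76, 0.17),
    "PH":  (2.465, 13, 0.17),
    "PHM": (1.954, 9, 0.39),
}

# Algorithm with the best (lowest) average evaluation.
best = min(TABLE_1, key=lambda name: TABLE_1[name][0])

ph_avg, _, ph_time = TABLE_1["PH"]
phm_avg, _, phm_time = TABLE_1["PHM"]

# Effect of adding the masks to the parallel Hilditch kernel.
time_ratio = phm_time / ph_time            # processing time, PHM relative to PH
avg_gain = (ph_avg - phm_avg) / ph_avg     # relative drop in average evaluation

print(f"best average evaluation: {best}")
print(f"PHM / PH processing time: {time_ratio:.2f}")
print(f"relative improvement of the average evaluation: {avg_gain:.1%}")
```

With the values of Table 1, PHM has the lowest average evaluation, its average is about 21% lower than that of PH, and its processing time is roughly 2.3 times that of PH, an absolute increase of 0.22 msec.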
6. Conclusion

We have attempted to develop a thinning algorithm that produces a visually excellent thin letter. In particular, various masks were introduced and their effects were examined. We were able to establish that a relatively small number of masks can produce a desirable result. Moreover, the core line satisfies the condition of incomplete 8 connection, which produces a visually excellent core line.

Our algorithm may be applicable not only to letter graphics but also to general binary image patterns. However, for the thinning of two lines that cross at an angle, as in the letter "X", the intersecting part could probably still be improved.

References

[1] A. Rosenfeld, "Connectivity in digital pictures", J.ACM, 17, 1, pp.146-160 (1970).
[2] C. J. Hilditch, "Linear skeletons from square cupboards", Machine Intelligence 4, (Edinburgh Univ. Press), pp.403-420 (1969).
[3] E. S. Deutsch, "Thinning algorithms on rectangular, hexagonal, and triangular arrays", C.ACM, vol.15, no.9, pp.827-837 (1972).
[4] H. Tamura, "A comparison of line thinning algorithms from digital geometry viewpoint", Proceedings of 4th Int. Joint Conf. on Pattern Recognition, pp.715-719 (1978).
[5] S. Tsuruoka, F. Kimura, M. Yoshimura, Y. Miyake, "A thinning algorithm for digital binary pictures", PRL 78-47, pp.41-49 (1978).
[6] S. Yokoi, J. Toriwaki, T. Fukumura, "Topological properties in digitized binary pictures", ibid., 56-D, 11, pp.662-669 (1973).
