Ibrahim F. Tarrad
Hesham F. A. Hamed
Faculty of Engineering
Faculty of Engineering
Faculty of Engineering
Faculty of Engineering
Minia University
Al Azhar University
Al Azhar University
Al Azhar University
Minia, Egypt
Qena, Egypt
Cairo, Egypt
Qena, Egypt
Email: ahmedf.abdelfatah@gmail.com Email: tarradif@gmail.com Email: aawad@ieee.org Email: hfah66@yahoo.com
Abstract: Data encryption has become vital for protecting user data in most areas of communication. The Advanced
Encryption Standard (AES) algorithm has become the preferred
choice for various security services in numerous applications due
to its reliability and flexibility. AES implementations face two main
challenges: encryption/decryption speed and the consumed
implementation area. This paper presents an implementation of
the AES algorithm that is optimized with respect to the consumed
implementation area by combining the data expansion and
key expansion approaches. The optimized implementation
increases the applicability of AES in small devices such as mobile
phones and smart cards. The experimental results show that the
proposed approach consumes less area than the available
approaches in the literature, with acceptable frequency and
throughput for low-throughput applications.
I. INTRODUCTION
Data encryption is an important process in almost all data
transaction applications, such as e-commerce, electronic banking, and even simple web-based applications. Data encryption is the process of transforming data into a scrambled format
while still allowing the intended recipient to restore
the original data using a secret key. Data encryption and
decryption are the two major functions in any cryptographic
system. The encryption process transforms data into an unintelligible
format using a secret key to guarantee user privacy. Decryption
is the opposite process, which restores the scrambled
data to its original format using the same secret key [1], [2].
The Advanced Encryption Standard (AES) algorithm is a
symmetric-key block cipher with a fixed block size of 128
bits and key lengths of 128, 192, and 256
bits [3], [4]. Although AES provides an adequate security
level with a 256-bit key, some AES applications, such as
smart cards and cellular phones, still struggle with the small
implementation area available. Therefore, the implementation
area is an important factor for the real-time
deployment of the AES algorithm. Optimizing the area consumed
by AES is an interesting problem, and reducing
the AES implementation area is a compulsory requirement for such devices.
A Field Programmable Gate Array (FPGA) [5] is an Integrated
Circuit (IC) that can be repeatedly reconfigured according
to the needs of the implemented application. It produces
different behaviors through simple configuration changes [6].
Owing to this property and its low cost, FPGA is
Fig. 1.
shifting the bytes in the second, the third, and the fourth rows
by one, two, and three positions, respectively. It is worth mentioning
that this function requires no hardware resources, and it can be
implemented on the FPGA as plain wiring [10]. MixColumns
is a linear transformation applied to the State
matrix column by column. The key schedule operation of the
AES algorithm generates a total of
Nb(Nr + 1) words, which are needed to accomplish the encryption and the
decryption operations [14], [15], [16].
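
The row shifts and the key schedule word count described above can be illustrated with a minimal software sketch. It is not the hardware design of this paper: the State is modelled as four Python lists (one per row), the helper names `shift_rows` and `expanded_key_words` are hypothetical, and Nb = 4 is assumed as in the AES standard [4].

```python
def shift_rows(state):
    """Rotate row r of a 4x4 State left by r bytes (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def expanded_key_words(nr, nb=4):
    """Number of 32-bit words produced by the key schedule: Nb * (Nr + 1)."""
    return nb * (nr + 1)


# Illustrative State: each entry encodes (row, column) as a hex digit pair.
state = [
    [0x00, 0x01, 0x02, 0x03],
    [0x10, 0x11, 0x12, 0x13],
    [0x20, 0x21, 0x22, 0x23],
    [0x30, 0x31, 0x32, 0x33],
]
shifted = shift_rows(state)

# Nr = 10, 12, 14 for 128-, 192-, and 256-bit keys respectively.
word_counts = {bits: expanded_key_words(nr)
               for bits, nr in ((128, 10), (192, 12), (256, 14))}
```

In software the shift is a list rotation; on an FPGA the same byte reordering costs no logic because it only changes which wire feeds which register.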
The decryption process is the inverse of the
encryption process: it inverts the round transformations in
order to restore the original plain data. The round
transformations of the decryption process consist of four functions: AddRoundKey, InvMixColumns, InvShiftRows,
and InvSubBytes. AddRoundKey is an
XOR function. InvShiftRows performs the same function as
ShiftRows but in the opposite direction. Thus, the first
row is unchanged, the second row is shifted
by one position, the third by two, and the last by
three. The InvSubBytes transformation is performed using
a permutation table called the InvS-box, which has 256 entries
(holding the values 0 to 255) [9]. Fig. 1 shows the block diagram
of the modules used in the decryption process. It is worth
noting that the AES algorithm can operate in four modes:
Cipher Block Chaining (CBC), Cipher Feedback (CFB), Output
Feedback (OFB), and Electronic Codebook (ECB) [1], [17].
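
Because the InvS-box is the inverse permutation of the S-box, it can be derived from the S-box rather than stored independently. The following sketch builds the S-box from its definition in the AES standard [4] (multiplicative inverse in GF(2^8) followed by an affine transformation) and then inverts the table. The function names are hypothetical, and the code is a software illustration rather than the table layout used in the proposed hardware.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8), computed as a^254; 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def build_sbox():
    """S-box entry: affine transformation of the field inverse."""
    sbox = []
    for x in range(256):
        b = gf_inv(x)
        sbox.append(b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3)
                    ^ rotl8(b, 4) ^ 0x63)
    return sbox


def invert_table(table):
    """Inverse permutation: inv[table[i]] = i for every index i."""
    inv = [0] * 256
    for i, v in enumerate(table):
        inv[v] = i
    return inv


SBOX = build_sbox()
INV_SBOX = invert_table(SBOX)
```

Applying `INV_SBOX` after `SBOX` returns every byte to its original value, which is the property InvSubBytes relies on.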
III. RELATED WORK
Optimizing the implementation area of the AES algorithm is
an interesting problem, and many attempts to tackle it
are found in the literature. Hamalainen et al. [13] reduced
the implementation area (number of gates) by parallelizing
the AES operations on the FPGA. The high-level architecture
consists of byte permutation, MixColumn multiplier, parallel
Fig. 2. The proposed key expansion approach. The 128-bit key is
divided into four 32-bit words and processed using a single S-box.
Fig. 3. The Register Transfer Level (RTL) block diagram of the proposed
area optimization approach with key expansion and data expansion modules.
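
As a software illustration of the key expansion summarized in the Fig. 2 caption, the sketch below splits a 128-bit key into four 32-bit words and expands it into 44 words, following the key schedule of the AES standard [4]. All substitutions go through one shared table, `SBOX`, looked up one byte at a time, which is an assumption about how a single S-box would be reused and not a description of the paper's RTL. The function names are hypothetical, and the key is the example cipher key from the standard.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Build the AES S-box: field inverse followed by the affine transform."""
    def rotl8(x, n):
        return ((x << n) | (x >> (8 - n))) & 0xFF

    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse; 0 maps to 0
            inv = gf_mul(inv, x)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = build_sbox()


def expand_key_128(key):
    """Expand a 16-byte key into 44 words, each a list of 4 bytes."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]       # RotWord
            temp = [SBOX[b] for b in temp]   # SubWord, one byte per lookup
            temp[0] ^= rcon                  # round constant
            rcon = gf_mul(rcon, 2)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return words


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
words = expand_key_128(key)
```

Only one word in four passes through SubWord, so a 128-bit key needs 40 S-box lookups in total; serializing them through a single table trades clock cycles for area.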
TABLE I
THE SPECIFICATIONS OF AES WITH DIFFERENT KEY LENGTHS. Nk
AND Nb ARE MEASURED IN 32-BIT WORDS [4].

                      AES128    AES192    AES256
Key length (Nk)       4         6         8
Block size (Nb)       4         4         4
# of rounds (Nr)      10        12        14

TABLE II

                      [13]      [18]      [19]      [20]      Proposed
Area                  3100      2699      3576      1838      1226
Frequency (MHz)       152       45        194.7     50.2      157
Throughput (Gbps)     0.121     0.010     24.922    0.642     0.050
Fig. 4. The timing diagram of the encryption process produced from the simulation of the optimized AES algorithm. The encryption process consumes 169
clock cycles with a clock period of 6.37 ns. The total consumed time is 169 × 6.37 ns = 1076.53 ns ≈ 1.08 µs. The figure shows the input data, the key, and the cipher text.
V. PERFORMANCE EVALUATION
This paper has presented an optimization of the AES algorithm in terms of the consumed implementation area. The
presented optimization reduces the consumed implementation
area by combining the input block expansion with the key
expansion. The combination of the two expansions reduces
the total number of S-boxes used from 20 to 2 for both input
block and key management. The simulation of the proposed
approach on FPGA hardware shows that it consumes less area
than the other approaches. Although the presented approach
optimizes the consumed area, it takes many clock cycles to
accomplish the encryption process. Therefore, it is suitable
for applications with low throughput. Balancing the
tradeoff between the AES area and the encryption speed, as a
complete AES optimization, will be tackled in future work.
A. Evaluation Results
The proposed approach has been implemented on an FPGA
Spartan3 (XC3s400-4pq208) board using the VHDL hardware
description language. The Register Transfer Level (RTL) block diagram
of the implemented approach is shown in Fig. 3. The figure shows the three major functional blocks: the
key expansion, the input data expansion, and the main data
encryption block. The key expansion and the input data
expansion blocks work as supporting blocks that run prior to the
main encryption block.
The simulation of the implementation shown in Fig. 3
produces the timing diagram represented in Fig. 4. The timing
diagram shows the input data as {32 43 F6 A8 88 5A 30
8D 31 31 98 A2 E0 37 07 34}, the key in hexadecimal as
{2B 7E 15 16 28 AE D2 A6 AB F7 15 88 09 CF 4F 3C}, and the
output cipher text as {39 25 84 1D 02 DC 09 FB DC 11 85 97
19 6A 0B 32}, which matches the example vector in [4]. Additionally, it shows that the execution
process consumes 169 clock cycles with a 6.37 ns clock period.
Therefore, the total consumed time can be calculated as 169
× 6.37 ns = 1076.53 ns ≈ 1.08 µs. The maximum achieved frequency is the
reciprocal of the clock period, which is approximately 157 MHz.
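
The timing figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the cycle count and clock period reported for Fig. 4; the variable names are illustrative.

```python
CYCLES = 169             # clock cycles per encryption, as reported for Fig. 4
CLOCK_PERIOD_NS = 6.37   # clock period in nanoseconds, as reported for Fig. 4

# Total encryption time: number of cycles times the period of one cycle.
total_time_ns = CYCLES * CLOCK_PERIOD_NS
total_time_us = total_time_ns / 1000.0

# Maximum frequency is the reciprocal of the period: 1 / ns = GHz = 1000 MHz.
max_frequency_mhz = 1000.0 / CLOCK_PERIOD_NS

print(f"total time: {total_time_ns:.2f} ns ({total_time_us:.2f} us)")
print(f"max frequency: {max_frequency_mhz:.1f} MHz")
```

The product evaluates to about 1076.5 ns, that is, roughly 1.08 µs per block, and the reciprocal of the period rounds to 157 MHz.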
A comparison between the performance of the proposed optimization approach and the other approaches in the
literature in terms of consumed area, achieved frequency, and
achieved throughput is shown in Table II. The table confirms
that the proposed AES optimization approach consumes the
lowest area, 1226 slices, with an achieved frequency higher than
that of the approach that consumes 1838 slices. The drawback of the
proposed optimization is its low throughput, 0.05 Gbps,
but this value is acceptable for low-throughput applications.
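
To put the area comparison in relative terms, the following sketch computes the percentage reduction of the proposed design against each entry of Table II. The area figures are copied from the table and taken at face value; the cited works may not all report area in the same unit, so the percentages are indicative only. The variable names are illustrative.

```python
# Area figures copied from Table II (units as reported by each work).
AREA = {"[13]": 3100, "[18]": 2699, "[19]": 3576, "[20]": 1838}
PROPOSED_AREA = 1226

# Relative reduction: share of the reference area that the proposed design saves.
savings_percent = {
    ref: 100.0 * (1.0 - PROPOSED_AREA / area) for ref, area in AREA.items()
}

for ref, saving in savings_percent.items():
    print(f"{ref}: {saving:.1f}% smaller")
```

Against the smallest competing design in the table, [20] with 1838 slices, the reduction is about one third.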
REFERENCES
[1] H. C. v. Tilborg, Encyclopedia of Cryptography and Security. Secaucus,
NJ, USA: Springer-Verlag New York, Inc., 2005.
[2] C. Paar and J. Pelzl, Understanding Cryptography: A Textbook for
Students and Practitioners, 1st ed. Springer Publishing Company,
Incorporated, 2009.
[3] J. Daemen and V. Rijmen, The Design of Rijndael: AES - The Advanced
Encryption Standard. Springer-Verlag, 2002.
[4] National Institute of Standards and Technology, "Advanced
encryption standard (AES)," FIPS Publication 197, pp. 1-51,
2001, last visited on 24/08/2013. [Online]. Available:
http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
[5] S. Kilts, Advanced FPGA Design: Architecture, Implementation, and
Optimization. Wiley-IEEE Press, 2007.
[6] O. Gomes, R. Moreno, and T. Pimenta, "A fast cryptography pipelined
hardware developed in FPGA with VHDL," in the 3rd International
Congress on Ultra Modern Telecommunications and Control Systems
and Workshops (ICUMT), October 2011, pp. 1-6.
[7] M. H. A. Mijalli, "Efficient realization of S-Box based reduced residue
of prime numbers using Virtex-5 and Virtex-6 FPGAs," American
Journal of Applied Sciences, vol. 8, no. 8, pp. 754-757, 2011.
[8] National Institute of Standards and Technology, last visited on
24/08/2013. [Online]. Available: http://www.nist.gov/index.html
[9] T. Hoang and V. L. Nguyen, "An efficient FPGA implementation of
the advanced encryption standard algorithm," in the IEEE RIVF International Conference on Computing and Communication Technologies,
Research, Innovation, and Vision for the Future (RIVF), March 2012,
pp. 1-4.