Microprocessor Performance
1 Introduction
Microprocessors are found all over the world and, more precisely, in nearly every
embedded system we can imagine. They are also the main component to examine when
evaluating an existing system or planning a new one. For instance, evaluating
microprocessor performance gives the designer a better basis for selecting the device
that fits the requirements of an embedded system design. However, assessing the
performance of a microprocessor can be challenging (Patterson and Hennessy, 2005).
This report evaluates the performance of ten different processors.
2 Scope
This report will evaluate the performance of the following 10 different processors:
68HC12
6805
8051
DS89C420
ST62
AT90S8515
PIC12C5xx
TMS370
Siemens C166
ARM7
Performance can be defined and measured in different ways. In this report, performance
is measured in terms of time: every processor is evaluated by the amount of time it
takes to execute a program. The following formula will be used to evaluate execution
time (Patterson and Hennessy, 2005):
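A standard form of this relation, consistent with the clock cycle times and cycle counts used in the later sections, is sketched below. The symbol names are assumptions chosen for this illustration: T_exec is the execution time, N_cycles the total number of clock cycles of the program, T_clk the clock cycle time, n_k the number of times instruction k is executed and c_k its cost in cycles.

```latex
\[
T_{\text{exec}} = N_{\text{cycles}} \times T_{\text{clk}},
\qquad
N_{\text{cycles}} = \sum_{k} n_k \, c_k
\]
```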
Figure 3: Bubble sort algorithm and its implementation in C. (Beckett, P., EEET2039, Embedded System Lecture
Notes)
3 Assembler Code
The way to compare performance between different processors is to take a program and
translate it into the assembly code of each one. In this case the program is the bubble
sort algorithm, written in C. However, “coding in assembler is an art” (Beckett, P.,
EEET2039 Embedded System Design Lecture, 11th September). One of the factors that
affects execution time is therefore the assembler code itself, which depends on the
programmer.
4 Cycle Calculations
In order to calculate the number of cycles per program it is necessary to get the number of
cycles per instruction. Using the following table it is possible to get the total number of
cycles that the bubble sort algorithm takes in each processor.
Line  Label   Pseudo Code                         Cycles/PseudoCode  Cycles Calculation
1     START:  load i, 1;                          ZVar               = ZVar
2     _L5
3             compare i, sizeof(Data)             AVar               = AVar*SzData
4             branch to _L1, if i>=sizeof(Data)   BVar, BVarN        = BVar*1 + BVarN*(SzData-1)
5             clear j                             CVar               = CVar*(SzData-1)
6     _L4
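Where:
\[ \mathrm{NumIterations} = \sum_{i=1}^{\mathrm{sizeof(Data)}-1} i = 21 \]
and NumSwaps is the number of swaps performed, which depends on the initial order of
the data. For the purpose of this report the numbers were organized in the following
way: {7,6,3,4,5,2,1}. For this ordering:
\[ \mathrm{NumSwaps} = 18 \]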
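The two counts above can be checked with a small instrumented bubble sort. The following is a minimal sketch in C, assuming the loop structure suggested by the pseudo code (an outer loop on i from 1 and an inner loop on j while i + j is less than the data size); the names sort_stats and bubble_sort_count are hypothetical and do not come from the report.

```c
#include <assert.h>
#include <stddef.h>

/* Result of one instrumented bubble sort run. */
struct sort_stats {
    unsigned iterations;  /* inner-loop passes (comparisons) */
    unsigned swaps;       /* exchanges actually performed    */
};

/* Sorts data[0..n-1] in ascending order and counts the work done.
 * Loop structure assumed from the pseudo code:
 *   for (i = 1; i < n; i++) for (j = 0; i + j < n; j++) ... */
struct sort_stats bubble_sort_count(unsigned char *data, size_t n)
{
    struct sort_stats s = {0, 0};
    assert(data != NULL);
    for (size_t i = 1; i < n; i++) {
        for (size_t j = 0; i + j < n; j++) {
            s.iterations++;
            if (data[j] > data[j + 1]) {
                /* Exchange the two neighbours. */
                unsigned char tmp = data[j];
                data[j] = data[j + 1];
                data[j + 1] = tmp;
                s.swaps++;
            }
        }
    }
    return s;
}
```

Calling bubble_sort_count on the array {7,6,3,4,5,2,1} with n = 7 yields 21 iterations and 18 swaps, matching the values of NumIterations and NumSwaps given above.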
5 Performance Evaluation
5.1 68HC12
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 2 = 2\,\mu\text{s} \]
Label    Instruction            Comment                        Bytes  Cycles
EndData:
SzData:  EQU   (EndData-Data)
         ORG   $FF00            ; ROM memory
RAM memory: 3 bytes
START:
BSort:
         LDAA  #1               ; i = 1                        2      1
         STAA  Var_i            ;                              2      2
_L5:     LDAA  Var_i            ;                              2      3
         CMPA  #SzData          ; if (i < SzData)              2      1
         BGE   _L1              ; {                            2      3/1
         CLR   Var_j            ;   j = 0                      3      3
_L4:     LDAA  Var_j            ;                              2      3
         ADDA  Var_i            ;   i + j                      2      3
         CMPA  #SzData          ;   if (i + j < SzData)        2      1
         BGE   _L2              ;   {                          2      3/1
         LDAB  Var_j            ;                              2      3
         LDX   #Data            ;                              3      2
         ABX                    ;     X = &Data[j]             2      2
         LDAA  0,X              ;     A = Data[j]              3      3
         LDAB  1,X              ;     B = Data[j+1]            3      3
         CBA                    ;     if (A > B), unsigned     2      2
         BLO   _L3              ;     {                        2      3/1
         STAA  1,X              ;       A -> Data[j+1]         3      3
         STAB  ,X               ;       B -> Data[j]  }        3      3
_L3:     INC   Var_j            ;     j++                      3      4
         BRA   _L4              ;   }                          3      3
_L2:     INC   Var_i            ;   i++                        3      4
         BRA   _L5              ; }                            2      3
_L1:     BRA   _L1              ; end: loop forever            2      3
         ORG   $FFFE            ;
         FDB   START            ; reset vector, set to start program
ROM memory: 57 bytes
Table 2: Assembler code for 68HC12
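The Cycles column of Table 2 can be turned into a total cycle count by following the control flow of the listing and adding the listed cost of every instruction executed. The sketch below does this in C. It assumes that an entry such as 3/1 means 3 cycles when the branch is taken and 1 cycle when it is not, and that BLO skips the swap only when A is strictly lower than B. The function name cycles_68hc12 is hypothetical, and the sketch illustrates the method rather than reproducing the report's own calculation.

```c
#include <assert.h>
#include <stddef.h>

/* Runs the bubble sort of Table 2 on data[0..n-1] and returns the number
 * of cycles spent, using the per-instruction costs listed in the table. */
unsigned long cycles_68hc12(unsigned char *data, int n)
{
    unsigned long cycles = 0;
    int i, j;

    assert(data != NULL && n > 0);
    cycles += 1 + 2;                      /* LDAA #1 ; STAA Var_i          */
    i = 1;
    for (;;) {
        cycles += 3 + 1;                  /* _L5: LDAA Var_i ; CMPA        */
        if (i >= n) {
            cycles += 3;                  /* BGE _L1 taken: sort finished  */
            break;
        }
        cycles += 1;                      /* BGE _L1 not taken             */
        cycles += 3;                      /* CLR Var_j                     */
        j = 0;
        for (;;) {
            cycles += 3 + 3 + 1;          /* _L4: LDAA ; ADDA ; CMPA       */
            if (i + j >= n) {
                cycles += 3;              /* BGE _L2 taken: pass finished  */
                break;
            }
            cycles += 1;                  /* BGE _L2 not taken             */
            cycles += 3 + 2 + 2 + 3 + 3 + 2; /* LDAB LDX ABX LDAA LDAB CBA */
            if (data[j] < data[j + 1]) {
                cycles += 3;              /* BLO _L3 taken: no swap        */
            } else {
                unsigned char tmp = data[j];
                cycles += 1;              /* BLO _L3 not taken             */
                data[j] = data[j + 1];
                data[j + 1] = tmp;
                cycles += 3 + 3;          /* STAA 1,X ; STAB ,X            */
            }
            cycles += 4 + 3;              /* _L3: INC Var_j ; BRA _L4      */
            j++;
        }
        cycles += 4 + 3;                  /* _L2: INC Var_i ; BRA _L5      */
        i++;
    }
    return cycles;
}
```

For the data set {7,6,3,4,5,2,1} this model gives 925 cycles, which at a clock cycle time of 2 µs corresponds to 1850 µs. The final endless loop at _L1 is not counted.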
5.2 6805
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 2 = 2\,\mu\text{s} \]
5.3 8051
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 12 = 12\,\mu\text{s} \]
5.4 DS89C420
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 1 = 1\,\mu\text{s} \]
5.5 ST62
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 13 = 13\,\mu\text{s} \]
5.6 AT90S8515
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 1 = 1\,\mu\text{s} \]
5.7 PIC12C5xx
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 4 = 4\,\mu\text{s} \]
5.8 TMS370
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 1 = 1\,\mu\text{s} \]
5.9 C166
\[ \text{Clock Cycle Time} = \frac{1}{1\,\text{MHz}} \times 1 = 1\,\mu\text{s} \]
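The clock cycle times of sections 5.1 to 5.9 all follow the same pattern: the oscillator period multiplied by a processor-specific factor. A minimal C sketch of that calculation is given below. The multipliers are the ones used in this report; the names cpu_clock, cycle_time_ns and execution_time_ns are hypothetical, and times are kept in nanoseconds so that integer arithmetic is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct cpu_clock {
    const char *name;
    unsigned clocks_per_cycle;  /* oscillator periods per clock cycle */
};

/* Multipliers taken from sections 5.1 to 5.9 of this report. */
static const struct cpu_clock CPU_CLOCKS[] = {
    {"68HC12", 2}, {"6805", 2}, {"8051", 12}, {"DS89C420", 1}, {"ST62", 13},
    {"AT90S8515", 1}, {"PIC12C5xx", 4}, {"TMS370", 1}, {"C166", 1},
};

/* Clock cycle time in ns for an oscillator of osc_hz; 0 if name is unknown. */
unsigned long long cycle_time_ns(const char *name, unsigned long long osc_hz)
{
    assert(name != NULL && osc_hz > 0);
    for (size_t k = 0; k < sizeof CPU_CLOCKS / sizeof CPU_CLOCKS[0]; k++) {
        if (strcmp(CPU_CLOCKS[k].name, name) == 0)
            return CPU_CLOCKS[k].clocks_per_cycle * 1000000000ULL / osc_hz;
    }
    return 0;
}

/* Execution time in ns: number of cycles times the clock cycle time. */
unsigned long long execution_time_ns(const char *name,
                                     unsigned long long osc_hz,
                                     unsigned long long cycles)
{
    return cycles * cycle_time_ns(name, osc_hz);
}
```

With a 1 MHz oscillator, cycle_time_ns("8051", 1000000) returns 12000 ns, that is, the 12 µs of section 5.3.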
6 Performance Comparison
6.1 Time
[Bar chart: ExecutionTime(ms) for each microprocessor. Vertical axis: Time (ms), 0.000 to 30.000. Horizontal axis: Microprocessor.]
Figure 4: Performance Comparison (time)
6.2 Memory
[Bar chart: Memory Comparison, ROM(Bytes) and RAM(Bytes) for each microprocessor. Vertical axis: Bytes, 0 to 70. Horizontal axis: Microprocessor.]
Figure 5: Memory Comparison
7 Conclusions
Coding each program was somewhat challenging because each processor has its own
architecture and its own instruction set.
In general, for each processor assessed in this report it was possible to translate the
high-level language into machine code. Furthermore, each processor has its own
instruction set, whose instructions can be categorized in the following way (Stallings, W., 2006):
Data processing: instructions that perform arithmetic and logic operations.
Data storage: instructions that store information in memory.
Data movement: instructions that move data in and out (input and output).
Control: instructions that test conditions and execute branches.
Some processors have an instruction set that makes the algorithm easy to implement, with
instructions that allow data to be manipulated more easily. For example, the DS89C420
and the 8051 were the two that used the least program ROM, which suggests that their
instructions for moving data (especially for swapping) are very useful. Indeed, the
DS89C420 is one of the processors that executed the program fastest.
Addressing is the next factor that affects the performance of a processor. Although
there are different types of addressing, not every processor offers the same kinds of
instructions to access data in memory. Generally, seven different types of addressing
can be defined (Stallings, W., 2006):
Immediate
Direct
Indirect
Register
Register Indirect
Displacement
Stack
Regarding addressing modes, some microprocessors have an instruction set that allows an
effective way of implementing and executing the for loop.
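To make the seven addressing modes listed above concrete, the sketch below resolves an operand on a toy machine written in C. The machine (256 bytes of memory, four registers, a stack pointer) and all names are hypothetical and chosen only for illustration; real processors differ in which modes they offer and how they encode them. The displacement case corresponds to an access such as LDAB 1,X in the 68HC12 listing.

```c
#include <assert.h>
#include <stdint.h>

enum addr_mode {
    IMMEDIATE, DIRECT, INDIRECT, REGISTER,
    REGISTER_INDIRECT, DISPLACEMENT, STACK
};

/* Toy machine state, for illustration only. */
struct toy_cpu {
    uint8_t mem[256];
    uint8_t reg[4];
    uint8_t sp;       /* address of the item on top of the stack */
};

/* Returns the operand selected by mode m.
 * a is the address (or constant) field, r the register number. */
uint8_t fetch_operand(const struct toy_cpu *c, enum addr_mode m,
                      uint8_t a, uint8_t r)
{
    assert(c != 0 && r < 4);
    switch (m) {
    case IMMEDIATE:         return a;                    /* operand is in the instruction */
    case DIRECT:            return c->mem[a];            /* a is the operand's address    */
    case INDIRECT:          return c->mem[c->mem[a]];    /* a points to the address       */
    case REGISTER:          return c->reg[r];            /* operand is in a register      */
    case REGISTER_INDIRECT: return c->mem[c->reg[r]];    /* register holds the address    */
    case DISPLACEMENT:      return c->mem[(uint8_t)(c->reg[r] + a)]; /* base + offset     */
    case STACK:             return c->mem[c->sp];        /* operand is on top of stack    */
    }
    return 0;
}
```

The more of these modes an instruction set supports directly, the fewer instructions are needed to reach an element such as Data[j+1] inside the loop.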
Instruction execution always follows the same steps. No matter which architecture is
being used, the process is always the same (Stallings, W., 2006):
Fetch Instruction
Interpret Instruction
Fetch Data
Process Data
Write Data
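The five steps above can be sketched as a small interpreter loop. The following C example models a hypothetical accumulator machine with four opcodes and two-byte instructions (opcode, address); none of these details come from the processors evaluated in this report, they only make the fetch, interpret, fetch data, process and write sequence visible.

```c
#include <assert.h>
#include <stdint.h>

enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

/* Hypothetical accumulator machine; program and data share one memory. */
struct toy_machine {
    uint8_t mem[256];
    uint8_t pc;    /* program counter */
    uint8_t acc;   /* accumulator     */
};

/* Runs until OP_HALT, an unknown opcode, or max_steps instructions.
 * Returns the number of instructions executed. */
unsigned toy_run(struct toy_machine *m, unsigned max_steps)
{
    unsigned steps = 0;
    assert(m != 0);
    while (steps < max_steps) {
        /* 1. Fetch instruction: opcode byte followed by address byte. */
        uint8_t opcode = m->mem[m->pc];
        uint8_t addr = m->mem[(uint8_t)(m->pc + 1)];
        m->pc = (uint8_t)(m->pc + 2);

        /* 2. Interpret instruction. */
        if (opcode != OP_LOAD && opcode != OP_ADD && opcode != OP_STORE)
            break;                      /* OP_HALT or unknown opcode */

        /* 3. Fetch data. */
        uint8_t operand = m->mem[addr];

        /* 4. Process data and 5. write data. */
        if (opcode == OP_LOAD)
            m->acc = operand;
        else if (opcode == OP_ADD)
            m->acc = (uint8_t)(m->acc + operand);
        else
            m->mem[addr] = m->acc;      /* OP_STORE */
        steps++;
    }
    return steps;
}
```

A program that loads one value, adds a second and stores the sum therefore runs through the loop three times before the halt instruction stops it.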
The key factor in how fast a processor executes a set of instructions is its architecture.
For instance, depending on the architecture, the instruction set can be classified as
RISC or CISC. However, generally speaking, a RISC architecture is not necessarily simpler
than a CISC architecture, and in some respects it is more demanding: a RISC processor
needs a large number of general-purpose registers to compensate for its reduced
instruction set.
Another aspect that affects performance is the kind of memory architecture implemented,
Von Neumann or Harvard. The main difference lies in the memory of each machine: on the
one hand, a Von Neumann architecture has a single memory for both data and program; on
the other hand, a Harvard architecture has separate memories for data and program.
However, nowadays it is often not possible to say that a processor follows only one
specific architecture. Indeed, it is possible to find RISC processors with some CISC
characteristics and vice versa.
The most important factor in improving performance in some processors (such as the ARM)
is instruction pipelining. Pipelining allows the processor to execute several
instructions in “parallel”. However, there are some aspects that should be taken into
consideration when designing an application with those processors (Evans, J. and Eckhouse, R.).
In conclusion, it is possible to evaluate many different aspects of each processor, but
it is the real application workload that will determine which processor is better for a
certain kind of task (Evans, J. and Eckhouse, R.). However, the activity carried out in
this report gives an initial sense of the main advantages and disadvantages of using a
given microprocessor for a specific application.
8 References
Atmel Corporation 2000, AT90S8515 - 8-Bit AVR Microcontroller, www.atmel.com , San Jose.
Atmel Corporation 1999, ARM7TDMI (Thumb) DataSheet, www.atmel.com, San Jose
Beckett, P 2006, Embedded Systems Design, course notes from EEET2039, RMIT University,
Melbourne.
Dallas Semiconductor 2005, DS89C420 Ultra-High Speed Microcontroller, www.maxim-ic.com
Evans, JS and Eckhouse, RH 1999, Alpha RISC Architecture for Programmers, Prentice Hall,
New Jersey.
Freescale semiconductor 2006, CPU12 Reference Manual, www.freescale.com
Infineon Technologies 2001, Instruction Set Manual for the C166 Family of Infineon 16-Bit
single-chip microcontrollers, www.infineon.com. München.
Intel Corporation 1994, MCS® 51 Microcontroller Family User’s Manual, Intel Corporation,
Illinois.
Microchip Technology Inc.1999, PIC12C5XX 8-Pin, 8-Bit CMOS Microcontrollers,
www.microchip.com, Chandler.
Motorola 1999, MC68HC05 Technical Data, Motorola, East Kilbride.
Patterson, DA and Hennessy, JL 2005, Computer Organization and Design, 3rd Edition,
Morgan Kaufmann Publishers, San Francisco.
Patterson, DA and Hennessy, JL 1998, Computer Organization and Design, 2nd Edition,
Morgan Kaufmann Publishers, San Francisco.
SGS-Thomson Microelectronics 1993, ST62 – ST63 Programming Manual.
Stallings, W 2006, Computer Organization and Architecture Designing for Performance, 7th
Edition, Prentice Hall, New Jersey.
Texas Instruments 1996, TMS370 Microcontroller Family User’s Guide, www.ti.com
Texas Instruments 1996, TMS370 and TMS370C8 8 Bit Microcontroller Family Optimizing C
Compiler User’s Guide, www.ti.com
Wunderlich C 2006. “C166: Call-segmented to a ram-address does not work”, start up
assembly code for C166 using Keil Software, 18 August, “Keil Discussion Forum” , viewed
September 11, 2006, www.keil.com/forum/docs/thread8211.asp.
SYMBOL TABLE:
BSORT -FF00 DATA -0003 ENDDATA -000A START -FF00 SZDATA -0003
VAR_I -0000 VAR_J -0001 VAR_TEMP-0002 _L1 -FF32 _L2 -FF2D
_L3 -FF28 _L4 -FF0D _L5 -FF04
Figure 6: Assembler list for 68HC12
The Engineers Collaborative, Inc. WASM05 V2.2 68HC05 Cross Assembler V2.1 (C)1986-1997
SYMBOL TABLE:
DATA -0000 ENDDATA -0007 START -0060 SZDATA -0007 VAR_I -0003
VAR_J -0004 _L1 -008A _L2 -0086 _L3 -0082 _L4 -006D
_L5 -0063
Figure 8: Assembler list for 8051
SYMBOL TABLE:
DATA -0000 ENDDATA -0007 START -0060 SZDATA -0007 VAR_I -0003
VAR_J -0004 _L1 -008A _L2 -0086 _L3 -0082 _L4 -006D
_L5 -0063
Figure 9: Assembler list for DS89C420
SYMBOL TABLE:
DATA -0000 ENDDATA -0007 START -0060 SZDATA -0007 VAR_I -0003
VAR_J -0004 _L1 -008A _L2 -0086 _L3 -0082 _L4 -006D
_L5 -0063
Figure 10: Assembler list for PIC12C5X
The simulator used to test the assembler code was: µvision3 V3.30a - www.keil.com
---------------------------------------------------------------------------
ldiv_t . . . . . . . . . . . . . . . . type struct ----- 8
rem. . . . . . . . . . . . . . . . . member long 000000H 4
quot . . . . . . . . . . . . . . . . member long 000004H 4
_ldiv_t. . . . . . . . . . . . . . . . *tag* struct ----- 8
rem. . . . . . . . . . . . . . . . . member long 000000H 4
quot . . . . . . . . . . . . . . . . member long 000004H 4
div_t. . . . . . . . . . . . . . . . . type struct ----- 8
rem. . . . . . . . . . . . . . . . . member int 000000H 4
quot . . . . . . . . . . . . . . . . member int 000004H 4
_div_t . . . . . . . . . . . . . . . . *tag* struct ----- 8
rem. . . . . . . . . . . . . . . . . member int 000000H 4
quot . . . . . . . . . . . . . . . . member int 000004H 4
wchar_t. . . . . . . . . . . . . . . . type uchar ----- 1
size_t . . . . . . . . . . . . . . . . type uint ----- 4
main . . . . . . . . . . . . . . . . . public code funct 000000H
i. . . . . . . . . . . . . . . . . . *reg* int ----- 4
j. . . . . . . . . . . . . . . . . . *reg* int ----- 4
Data . . . . . . . . . . . . . . . . auto data array 000000H 28
tmp. . . . . . . . . . . . . . . . . *reg* uint ----- 4
BubbleSort?T . . . . . . . . . . . . . public code funct 000000H
?tpl?0001. . . . . . . . . . . . . . . static const array 000000H 28