EE382N-4 Embedded Systems Architecture

The ARM Instruction Set Architecture
Mark McDermott, with help from our good friends at ARM
Fall 2008
8/22/2008

Main features of the ARM Instruction Set
All instructions are 32 bits long.
Most instructions execute in a single cycle.
Most instructions can be conditionally executed.
A load/store architecture
    Data processing instructions act only on registers
        Three-operand format
        Combined ALU and shifter for high-speed bit manipulation
    Specific memory access instructions with powerful auto-indexing addressing modes
        32-bit and 8-bit data types, and also 16-bit data types on ARM Architecture v4
        Flexible multiple-register load and store instructions
Instruction set extension via coprocessors
Very dense 16-bit compressed instruction set (Thumb)

Coprocessors
Up to 16 coprocessors can be defined
Expands the ARM instruction set
Each coprocessor can have up to 16 private registers of any reasonable size
Load/store architecture

Thumb
Thumb is a 16-bit instruction set
    Optimized for code density from C code
    Improved performance from narrow memory
    Subset of the functionality of the ARM instruction set
Core has two execution states: ARM and Thumb
    Switch between them using the BX instruction
Thumb has characteristic features:
    Most Thumb instructions are executed unconditionally
    Many Thumb data processing instructions use a 2-address format
    Thumb instruction formats are less regular than ARM instruction formats, as a result of the dense encoding.

Processor Modes
The ARM has six operating modes:
    User (unprivileged mode under which most tasks run)
    FIQ (entered when a high-priority (fast) interrupt is raised)
    IRQ (entered when a low-priority (normal) interrupt is raised)
    Supervisor (entered on reset and when a Software Interrupt instruction is executed)
    Abort (used to handle memory access violations)
    Undef (used to handle undefined instructions)
ARM Architecture Version 4 adds a seventh mode:
    System (privileged mode using the same registers as User mode)

The Registers
ARM has 37 registers in total, all of which are 32 bits long.
    1 dedicated program counter
    1 dedicated current program status register
    5 dedicated saved program status registers
    30 general-purpose registers
However, these are arranged into several banks, with the accessible bank being governed by the processor mode. Each mode can access
    a particular set of r0-r12 registers
    a particular r13 (the stack pointer) and r14 (link register)
    r15 (the program counter)
    cpsr (the current program status register)
And privileged modes can also access
    a particular spsr (saved program status register)

The ARM Register Set
[Figure: the currently visible registers (r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr, spsr) for the selected mode (User, FIQ, IRQ, SVC, Undef, or Abort), shown next to the banked-out registers of the other modes: r8-r12, r13 (sp), r14 (lr) for User and FIQ; r13 (sp), r14 (lr) for IRQ, SVC, Undef, and Abort; and one spsr for each of the five exception modes.]

Register Organization Summary
Mode     Registers private to the mode                   Registers shared with User mode
User     r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr      (all)
FIQ      r8-r12, r13 (sp), r14 (lr), spsr                r0-r7, r15, and cpsr
IRQ      r13 (sp), r14 (lr), spsr                        r0-r12, r15, and cpsr
SVC      r13 (sp), r14 (lr), spsr                        r0-r12, r15, and cpsr
Undef    r13 (sp), r14 (lr), spsr                        r0-r12, r15, and cpsr
Abort    r13 (sp), r14 (lr), spsr                        r0-r12, r15, and cpsr
The figure also marks the Thumb state low registers (r0-r7) and the Thumb state high registers (r8-r15).
Note: System mode uses the User mode register set

Accessing Registers using ARM Instructions
No breakdown of currently accessible registers.
    All instructions can access r0-r14 directly.
    Most instructions also allow use of the PC.
Specific instructions to allow access to CPSR and SPSR.
Note: When in a privileged mode, it is also possible to load/store the (banked-out) User mode registers to or from memory.

The Program Status Registers (CPSR and SPSRs)
Bit:    31 30 29 28 | 27 ... 8 | 7  6  5 | 4 ... 0
Field:  N  Z  C  V  | (unused) | I  F  T | Mode
Condition Code Flags: copies of the ALU status flags (latched if the instruction has the "S" bit set).
    N = Negative result from ALU
    Z = Zero result from ALU
    C = ALU operation Carried out
    V = ALU operation oVerflowed
Interrupt Disable bits
    I = 1 disables the IRQ.
    F = 1 disables the FIQ.
T Bit (Architecture v4T only)
    T = 0, processor in ARM state
    T = 1, processor in Thumb state
Mode Bits
    M[4:0] define the processor mode.
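The field layout above can be exercised with a small Python model. This is only a sketch: the function name `decode_psr` and the `MODES` table are illustrative, and the M[4:0] encodings listed are the standard ARM values, which the slide itself does not spell out.

```python
# Standard ARM M[4:0] encodings (not listed on the slide; added for illustration).
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undef",
    0b11111: "System",
}

def decode_psr(psr):
    """Split a 32-bit CPSR/SPSR value into the fields shown on the slide."""
    return {
        "N": (psr >> 31) & 1,
        "Z": (psr >> 30) & 1,
        "C": (psr >> 29) & 1,
        "V": (psr >> 28) & 1,
        "I": (psr >> 7) & 1,   # 1 = IRQ disabled
        "F": (psr >> 6) & 1,   # 1 = FIQ disabled
        "T": (psr >> 5) & 1,   # 1 = Thumb state
        "mode": MODES.get(psr & 0x1F, "unknown"),
    }
```

For instance, `decode_psr(0x600000D3)` reports Z and C set, both interrupts disabled, ARM state, and Supervisor mode.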

Condition Flags
Flag              Logical instruction                                  Arithmetic instruction
Negative (N=1)    No meaning                                           Bit 31 of the result has been set; indicates a negative number in signed operations
Zero (Z=1)        Result is all zeroes                                 Result of operation was zero
Carry (C=1)       After shift operation, '1' was left in carry flag    Result was greater than 32 bits
oVerflow (V=1)    No meaning                                           Result was greater than 31 bits; indicates a possible corruption of the sign bit in signed numbers
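To make the arithmetic column of the flag table concrete, here is a minimal Python sketch of how a flag-setting 32-bit addition could compute N, Z, C, and V. The function name `adds` and the dictionary result are illustrative choices, not something the slides define.

```python
MASK32 = 0xFFFFFFFF

def adds(a, b):
    """Model a flag-setting 32-bit add; a and b are unsigned 32-bit values."""
    full = (a & MASK32) + (b & MASK32)
    result = full & MASK32
    n = result >> 31                  # bit 31 of the result
    z = int(result == 0)              # result is zero
    c = int(full > MASK32)            # result needed more than 32 bits
    # Signed overflow: operands have the same sign, result has the other sign.
    v = ((~(a ^ b) & (a ^ result)) >> 31) & 1
    return result, {"N": n, "Z": z, "C": c, "V": v}
```

Adding 1 to 0x7FFFFFFF sets N and V (the sign bit was corrupted), while adding 1 to 0xFFFFFFFF sets Z and C (the result wrapped to zero).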

The Program Counter (R15)
When the processor is executing in ARM state:
    All instructions are 32 bits in length
    All instructions must be word aligned
    Therefore the PC value is stored in bits [31:2] with bits [1:0] equal to zero (as an instruction cannot be halfword or byte aligned).
R14 is used as the subroutine link register (LR) and stores the return address when Branch with Link operations are performed, calculated from the PC.
Thus to return from a linked branch:
    MOV r15,r14
or
    MOV pc,lr

Exception Handling and the Vector Table
When an exception occurs, the core:
    Copies CPSR into SPSR_<mode>
    Sets appropriate CPSR bits
        If core implements ARM Architecture 4T and is currently in Thumb state, then ARM state is entered.
        Mode field bits
        Interrupt disable flags if appropriate.
    Maps in appropriate banked registers
    Stores the return address in LR_<mode>
    Sets PC to vector address
To return, exception handler needs to:
    Restore CPSR from SPSR_<mode>
    Restore PC from LR_<mode>

The Original Instruction Pipeline
The ARM uses a pipeline in order to increase the speed of the flow of instructions to the processor.
    Allows several operations to be undertaken simultaneously, rather than serially.
Stage      Address    Activity
FETCH      PC         Instruction fetched from memory
DECODE     PC - 4     Decoding of registers used in instruction
EXECUTE    PC - 8     Register(s) read from register bank; shift and ALU operation; write register(s) back to register bank
Rather than pointing to the instruction being executed, the PC points to the instruction being fetched.

Pipeline changes for ARM9TDMI
ARM7TDMI
    FETCH      Instruction fetch
    DECODE     Thumb-to-ARM decompress; ARM decode; register select
    EXECUTE    Register read; shift; ALU; register write
ARM9TDMI
    FETCH      Instruction fetch
    DECODE     ARM or Thumb instruction decode; register decode; register read
    EXECUTE    Shift + ALU
    MEMORY     Memory access
    WRITE      Register write

Pipeline changes for ARM10 vs. ARM11 Pipelines
ARM10
    FETCH      Branch prediction; instruction fetch
    ISSUE      ARM or Thumb instruction decode
    DECODE     Register read
    EXECUTE    Shift + ALU; multiply
    MEMORY     Memory access; multiply add
    WRITE      Register write
ARM11
    Fetch 1 -> Fetch 2 -> Decode -> Issue -> one of three parallel pipelines -> Write back
        ALU pipeline:           Shift -> ALU -> Saturate
        Multiply pipeline:      MAC 1 -> MAC 2 -> MAC 3
        Load/store pipeline:    Address -> Data Cache 1 -> Data Cache 2

ARM Instruction Set Format
Every format starts with the Condition field in bits [31:28]; the remaining fields are listed from bit 27 down to bit 0.
Instruction type              Fields
Data processing               Cond | 0 0 | I | Opcode [24:21] | S | Rn [19:16] | Rd [15:12] | Operand2 [11:0]
Multiply                      Cond | 0 0 0 0 0 0 | A | S | Rd [19:16] | Rn [15:12] | Rs [11:8] | 1 0 0 1 | Rm [3:0]
Long Multiply                 Cond | 0 0 0 0 1 | U | A | S | RdHigh [19:16] | RdLow [15:12] | Rs [11:8] | 1 0 0 1 | Rm [3:0]
Swap                          Cond | 0 0 0 1 0 | B | 0 0 | Rn [19:16] | Rd [15:12] | 0 0 0 0 | 1 0 0 1 | Rm [3:0]
Load/Store Byte/Word          Cond | 0 1 | I | P | U | B | W | L | Rn [19:16] | Rd [15:12] | Offset [11:0]
Load/Store Multiple           Cond | 1 0 0 | P | U | S | W | L | Rn [19:16] | Register list [15:0]
Halfword Transfer: Imm Off    Cond | 0 0 0 | P | U | 1 | W | L | Rn [19:16] | Rd [15:12] | Offset1 [11:8] | 1 | S | H | 1 | Offset2 [3:0]
Halfword Transfer: Reg Off    Cond | 0 0 0 | P | U | 0 | W | L | Rn [19:16] | Rd [15:12] | 0 0 0 0 | 1 | S | H | 1 | Rm [3:0]
Branch                        Cond | 1 0 1 | L | Branch offset [23:0]
Branch Exchange               Cond | 0 0 0 1 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 | Rn [3:0]
Coprocessor Data Xfer         Cond | 1 1 0 | P | U | N | W | L | Rn [19:16] | CRd [15:12] | CPNum [11:8] | Offset [7:0]
Coprocessor Data Op           Cond | 1 1 1 0 | Op1 [23:20] | CRn [19:16] | CRd [15:12] | CPNum [11:8] | Op2 [7:5] | 0 | CRm [3:0]
Coprocessor Reg Xfer          Cond | 1 1 1 0 | Op1 [23:21] | L | CRn [19:16] | Rd [15:12] | CPNum [11:8] | Op2 [7:5] | 1 | CRm [3:0]
Software Interrupt            Cond | 1 1 1 1 | SWI number [23:0]

Conditional Execution
Most instruction sets only allow branches to be executed conditionally.
However, by reusing the condition evaluation hardware, ARM effectively increases the number of instructions.
    All instructions contain a condition field which determines whether the CPU will execute them.
    Non-executed instructions consume 1 cycle.
        Can't collapse the instruction like a NOP. Still have to complete the cycle so as to allow fetching and decoding of the following instructions.
This removes the need for many branches, which stall the pipeline (3 cycles to refill).
    Allows very dense in-line code, without branches.
    The time penalty of not executing several conditional instructions is frequently less than the overhead of the branch or subroutine call that would otherwise be needed.

The Condition Field
The condition field occupies bits [31:28] of every instruction (shown here for the data processing format: Condition | ... | Opcode | ... | Rn | Rd | Operand2).
    0000 = EQ - Z set (equal)
    0001 = NE - Z clear (not equal)
    0010 = HS / CS - C set (unsigned higher or same)
    0011 = LO / CC - C clear (unsigned lower)
    0100 = MI - N set (negative)
    0101 = PL - N clear (positive or zero)
    0110 = VS - V set (overflow)
    0111 = VC - V clear (no overflow)
    1000 = HI - C set and Z clear (unsigned higher)
    1001 = LS - C clear or Z set (unsigned lower or same)
    1010 = GE - N set and V set, or N clear and V clear (>=)
    1011 = LT - N set and V clear, or N clear and V set (<)
    1100 = GT - Z clear, and either N set and V set, or N clear and V clear (>)
    1101 = LE - Z set, or N set and V clear, or N clear and V set (<=)
    1110 = AL - always
    1111 = NV - reserved.
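The condition table can be written as a small Python predicate over the four flags. This is a sketch for checking one's reading of the table; the name `condition_passed` and the 0/1 flag arguments are assumptions made for the example.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if the 4-bit condition code holds for flags N, Z, C, V (each 0 or 1)."""
    if cond == 0b1111:
        raise ValueError("1111 (NV) is reserved")
    checks = {
        0b0000: z,                      # EQ
        0b0001: not z,                  # NE
        0b0010: c,                      # HS / CS
        0b0011: not c,                  # LO / CC
        0b0100: n,                      # MI
        0b0101: not n,                  # PL
        0b0110: v,                      # VS
        0b0111: not v,                  # VC
        0b1000: c and not z,            # HI
        0b1001: (not c) or z,           # LS
        0b1010: n == v,                 # GE
        0b1011: n != v,                 # LT
        0b1100: (not z) and n == v,     # GT
        0b1101: z or n != v,            # LE
        0b1110: True,                   # AL
    }
    return bool(checks[cond])
```

Note that the signed conditions reduce to comparing N with V: GE is N == V, and LT is N != V.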

Using and updating the Condition Field
To execute an instruction conditionally, simply postfix it with the appropriate condition:
    For example an add instruction takes the form:
        ADD r0,r1,r2      ; r0 = r1 + r2 (ADDAL)
    To execute this only if the zero flag is set:
        ADDEQ r0,r1,r2    ; If zero flag set then...
                          ; ... r0 = r1 + r2
By default, data processing operations do not affect the condition flags (apart from the comparisons where this is the only effect). To cause the condition flags to be updated, the S bit of the instruction needs to be set by postfixing the instruction (and any condition code) with an "S".
    For example to add two numbers and set the condition flags:
        ADDS r0,r1,r2     ; r0 = r1 + r2
                          ; ... and set flags

Conditional Execution and Flags
ARM instructions can be made to execute conditionally by postfixing them with the appropriate condition code field.
This improves code density and performance by reducing the number of forward branch instructions.
    With a branch:                  With conditional execution:
        CMP   r3,#0                     CMP   r3,#0
        BEQ   skip                      ADDNE r0,r1,r2
        ADD   r0,r1,r2
    skip
By default, data processing instructions do not affect the condition code flags but the flags can be optionally set by using "S". CMP does not need "S".
    loop
        ...
        SUBS  r1,r1,#1    ; decrement r1 and set flags
        BNE   loop        ; if Z flag clear then branch

Branch instructions (1)
Branch:              B{<cond>} label
Branch with Link:    BL{<cond>} sub_routine_label
Encoding:  Cond [31:28] | 1 0 1 [27:25] | L [24] | Branch offset [23:0]
    Cond: condition field
    L: link bit (0 = Branch, 1 = Branch with link)
The offset for branch instructions is calculated by the assembler:
    By taking the difference between the branch instruction and the target address minus 8 (to allow for the pipeline).
    This gives a 26-bit offset which is right shifted 2 bits (as the bottom two bits are always zero as instructions are word aligned) and stored into the instruction encoding.
    This gives a range of ±32 Mbytes.

Branch instructions (2)
When executing the instruction, the processor:
    shifts the offset left two bits, sign extends it to 32 bits, and adds it to PC.
Execution then continues from the new PC, once the pipeline has been refilled.
The "Branch with link" instruction implements a subroutine call by writing PC-4 into the LR of the current bank.
    i.e. the address of the next instruction following the branch with link (allowing for the pipeline).
To return from subroutine, simply need to restore the PC from the LR:
    MOV pc,lr
    Again, pipeline has to refill before execution continues.
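The assembler's offset calculation and the processor's decoding of it can be sketched in Python as follows. The function names are hypothetical; the arithmetic follows the description above (target minus branch address minus 8, shifted right by 2, stored in 24 bits).

```python
def encode_branch_offset(branch_addr, target_addr):
    """Return the 24-bit offset field the assembler would store for B/BL."""
    byte_offset = target_addr - (branch_addr + 8)   # PC reads 8 bytes ahead
    if byte_offset % 4:
        raise ValueError("target must be word aligned relative to the branch")
    if not -(1 << 25) <= byte_offset < (1 << 25):
        raise ValueError("target outside the +/-32 Mbyte branch range")
    return (byte_offset >> 2) & 0xFFFFFF

def decode_branch_target(branch_addr, field):
    """Recover the target address from the 24-bit field, as the processor does."""
    if field & 0x800000:          # sign extend the 24-bit field
        field -= 1 << 24
    return (branch_addr + 8 + (field << 2)) & 0xFFFFFFFF
```

A branch to itself therefore encodes as 0xFFFFFE (an offset of -2 words), which is why the infinite loop `B .` assembles to 0xEAFFFFFE.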

Branch instructions (3)
The "Branch" instruction does not affect LR.
Note: Architecture 4T offers a further ARM branch instruction, BX
    See Thumb Instruction Set Module for details.
BL <subroutine>
    Stores return address in LR
    Returning implemented by restoring the PC from LR
    For non-leaf functions, LR will have to be stacked
            :
            BL func1
            :
    func1
            STMFD sp!,{regs,lr}
            :
            BL func2
            :
            LDMFD sp!,{regs,pc}
    func2
            :
            :
            MOV pc,lr

Conditional Branches
Branch   Interpretation      Normal uses
B        Unconditional       Always take this branch
BAL      Always              Always take this branch
BEQ      Equal               Comparison equal or zero result
BNE      Not equal           Comparison not equal or non-zero result
BPL      Plus                Result positive or zero
BMI      Minus               Result minus or negative
BCC      Carry clear         Arithmetic operation did not give carry-out
BLO      Lower               Unsigned comparison gave lower
BCS      Carry set           Arithmetic operation gave carry-out
BHS      Higher or same      Unsigned comparison gave higher or same
BVC      Overflow clear      Signed integer operation; no overflow occurred
BVS      Overflow set        Signed integer operation; overflow occurred
BGT      Greater than        Signed integer comparison gave greater than
BGE      Greater or equal    Signed integer comparison gave greater or equal
BLT      Less than           Signed integer comparison gave less than
BLE      Less or equal       Signed integer comparison gave less than or equal
BHI      Higher              Unsigned comparison gave higher
BLS      Lower or same       Unsigned comparison gave lower or same

Data processing Instructions
Largest family of ARM instructions, all sharing the same instruction format.
Contains:
    Arithmetic operations
    Comparisons (no results, just set condition codes)
    Logical operations
    Data movement between registers
Remember, this is a load/store architecture
    These instructions only work on registers, NOT memory.
They each perform a specific operation on one or two operands.
    First operand always a register: Rn
    Second operand sent to the ALU via barrel shifter.
We will examine the barrel shifter shortly.

Arithmetic Operations
Operations are:
    ADD    operand1 + operand2                ; Add
    ADC    operand1 + operand2 + carry        ; Add with carry
    SUB    operand1 - operand2                ; Subtract
    SBC    operand1 - operand2 + carry - 1    ; Subtract with carry
    RSB    operand2 - operand1                ; Reverse subtract
    RSC    operand2 - operand1 + carry - 1    ; Reverse subtract with carry
Syntax:
    <Operation>{<cond>}{S} Rd, Rn, Operand2
Examples
    ADD r0, r1, r2
    SUBGT r3, r3, #1
    RSBLES r4, r5, #5

Comparisons
The only effect of the comparisons is to update the condition flags. Thus no need to set S bit.
Operations are:
    CMP    operand1 - operand2      ; Compare
    CMN    operand1 + operand2      ; Compare negative
    TST    operand1 AND operand2    ; Test
    TEQ    operand1 EOR operand2    ; Test equivalence
Syntax:
    <Operation>{<cond>} Rn, Operand2
Examples:
    CMP   r0, r1
    TSTEQ r2, #5

Logical Operations
Operations are:
    AND    operand1 AND operand2
    EOR    operand1 EOR operand2
    ORR    operand1 OR operand2
    ORN    operand1 OR NOT operand2 (Thumb-2 only)
    BIC    operand1 AND NOT operand2 [ie bit clear]
Syntax:
    <Operation>{<cond>}{S} Rd, Rn, Operand2
Examples:
    AND   r0, r1, r2
    BICEQ r2, r3, #7
    EORS  r1, r3, r0

Data Movement
Operations are:
    MOV    operand2
    MVN    NOT operand2
Note that these make no use of operand1.
Syntax:
    <Operation>{<cond>}{S} Rd, Operand2
Examples:
    MOV   r0, r1
    MOVS  r2, #10
    MVNEQ r1, #0

The Barrel Shifter
The ARM doesn't have actual shift instructions.
Instead it has a barrel shifter which provides a mechanism to carry out shifts as part of other instructions.
So what operations does the barrel shifter support?

Barrel Shifter - Left Shift
Shifts left by the specified amount (multiplies by powers of two), e.g.
    LSL #5 => multiply by 32
Logical Shift Left (LSL)
    [Figure: the bits of Destination move left, and the last bit shifted out goes to the carry flag (CF).]

Barrel Shifter - Right Shifts
Logical Shift Right (LSR)
    Shifts right by the specified amount (divides by powers of two), e.g.
        LSR #5 = divide by 32
    [Figure: zero shifted in at the top of Destination; the last bit shifted out goes to the carry flag (CF).]
Arithmetic Shift Right (ASR)
    Shifts right (divides by powers of two) and preserves the sign bit, for 2's complement operations, e.g.
        ASR #5 = divide by 32
    [Figure: sign bit shifted in at the top of Destination; the last bit shifted out goes to the carry flag (CF).]

Barrel Shifter - Rotations
Rotate Right (ROR)
    Similar to an ASR but the bits wrap around as they leave the LSB and appear as the MSB, e.g.
        ROR #5
    Note the last bit rotated is also used as the Carry Out.
    [Figure: bits leaving the bottom of Destination re-enter at the top and are copied to the carry flag (CF).]
Rotate Right Extended (RRX)
    This operation uses the CPSR C flag as a 33rd bit.
    Rotates right by 1 bit. Encoded as ROR #0.
    [Figure: rotate right through carry; CF enters the top of Destination and the bottom bit of Destination goes to CF.]
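The five barrel shifter operations can be modelled in a few lines of Python. This is a simplified sketch, not the full architectural definition: it covers immediate shift amounts of 1 to 31 only (plus LSL #0), and each function returns the shifted value together with the carry out. All function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n, carry_in=0):
    """Logical shift left by n (0..31); LSL #0 leaves value and carry unchanged."""
    if n == 0:
        return x, carry_in
    return (x << n) & MASK32, (x >> (32 - n)) & 1

def lsr(x, n):
    """Logical shift right by n (1..31); zeros are shifted in."""
    return x >> n, (x >> (n - 1)) & 1

def asr(x, n):
    """Arithmetic shift right by n (1..31); the sign bit is shifted in."""
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> n) & MASK32, (x >> (n - 1)) & 1

def ror(x, n):
    """Rotate right by n (1..31); the last bit rotated is also the carry out."""
    result = ((x >> n) | (x << (32 - n))) & MASK32
    return result, result >> 31

def rrx(x, carry_in):
    """Rotate right by one bit through the carry flag (33-bit rotation)."""
    return (carry_in << 31) | (x >> 1), x & 1
```

For example, `asr(0xFFFFFFC0, 5)` treats the input as -64 and yields 0xFFFFFFFE (that is, -2), whereas `lsr` on the same input would shift zeros in and produce a large positive number.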

Using the Barrel Shifter: The Second Operand
[Figure: Operand 1 feeds the ALU directly; Operand 2 passes through the Barrel Shifter before reaching the ALU, which produces the Result.]
Register, optionally with shift operation applied.
    Shift value can be either:
        5-bit unsigned integer
        Specified in bottom byte of another register.
Immediate value
    8-bit number
        Can be rotated right through an even number of positions.
        Assembler will calculate rotate for you from constant.

Second Operand: Shifted Register
The amount by which the register is to be shifted is contained in either:
    the immediate 5-bit field in the instruction
        NO OVERHEAD
        Shift is done for free; executes in single cycle.
    the bottom byte of a register (not PC)
        Then takes extra cycle to execute
        ARM doesn't have enough read ports to read 3 registers at once.
        Then same as on other processors where shift is separate instruction.
If no shift is specified then a default shift is applied: LSL #0
    i.e. barrel shifter has no effect on value in register.

Second Operand: Using a Shifted Register
Using a multiplication instruction to multiply by a constant means first loading the constant into a register and then waiting a number of internal cycles for the instruction to complete.
A more optimum solution can often be found by using some combination of MOVs, ADDs, SUBs and RSBs with shifts.
    Multiplications by a constant equal to a ((power of 2) ± 1) can be done in one cycle.
    MOV R2, R0, LSL #2         ; Shift R0 left by 2, write to R2, (R2 = R0 x 4)
    ADD R9, R5, R5, LSL #3     ; R9 = R5 + R5 x 8 or R9 = R5 x 9
    RSB R9, R5, R5, LSL #3     ; R9 = R5 x 8 - R5 or R9 = R5 x 7
    SUB R10, R9, R8, LSR #4    ; R10 = R9 - R8 / 16
    MOV R12, R4, ROR R3        ; R12 = R4 rotated right by value of R3
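The ADD and RSB forms above can be checked numerically with a short Python sketch. The helper names `add_shifted` and `rsb_shifted` are made up for this example; each models a single instruction with an LSL-shifted second operand and 32-bit wraparound.

```python
MASK32 = 0xFFFFFFFF

def add_shifted(rn, rm, shift):
    """Model ADD Rd, Rn, Rm, LSL #shift: Rd = Rn + (Rm << shift)."""
    return (rn + (rm << shift)) & MASK32

def rsb_shifted(rn, rm, shift):
    """Model RSB Rd, Rn, Rm, LSL #shift: Rd = (Rm << shift) - Rn."""
    return ((rm << shift) - rn) & MASK32
```

With Rn and Rm both holding the same value x and a shift of 3, the first gives 9x and the second gives 7x, matching the comments on the slide.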

Second Operand: Immediate Value (1)
There is no single instruction which will load a 32-bit immediate constant into a register without performing a data load from memory.
    All ARM instructions are 32 bits long
    ARM instructions do not use the instruction stream as data.
The data processing instruction format has 12 bits available for operand2
    If used directly this would only give a range of 4096.
Instead it is used to store 8-bit constants, giving a range of 0 - 255.
These 8 bits can then be rotated right through an even number of positions (ie RORs by 0, 2, 4, .. 30).
    This gives a much larger range of constants that can be directly loaded, though some constants will still need to be loaded from memory.

Second Operand: Immediate Value (2)
This gives us:
    0 - 255                         [0 - 0xff]
    256, 260, 264, .., 1020         [0x100 - 0x3fc, step 4, 0x40 - 0xff ror 30]
    1024, 1040, 1056, .., 4080      [0x400 - 0xff0, step 16, 0x40 - 0xff ror 28]
    4096, 4160, 4224, .., 16320     [0x1000 - 0x3fc0, step 64, 0x40 - 0xff ror 26]
These can be loaded using, for example:
    MOV r0, #0x40, 26               ; => MOV r0, #0x1000 (ie 4096)
To make this easier, the assembler will convert to this form for us if simply given the required constant:
    MOV r0, #4096                   ; => MOV r0, #0x1000 (ie 0x40 ror 26)
The bitwise complements can also be formed using MVN:
    MOV r0, #0xFFFFFFFF             ; assembles to MVN r0, #0
If the required constant cannot be generated, an error will be reported.
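A Python sketch of the search an assembler might perform to fit a constant into the 8-bit-plus-rotation form is shown below. The function names and the choice to return the smallest usable rotation are assumptions made for illustration; a real assembler may pick a different but equivalent encoding (the slide's 0x40 ror 26 and this sketch's 0x01 ror 20 both produce 4096).

```python
MASK32 = 0xFFFFFFFF

def rotate_right(x, n):
    """Rotate a 32-bit value right by n bits."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32

def encode_immediate(constant):
    """Return (imm8, rotation) with imm8 ROR rotation == constant, or None."""
    for rotation in range(0, 32, 2):
        # Rotating left by `rotation` undoes a rotate right by `rotation`.
        imm8 = rotate_right(constant, 32 - rotation)
        if imm8 <= 0xFF:
            return imm8, rotation
    return None

def encode_mov_or_mvn(constant):
    """Try MOV first, then MVN with the bitwise complement."""
    encoded = encode_immediate(constant)
    if encoded is not None:
        return ("MOV",) + encoded
    encoded = encode_immediate(~constant & MASK32)
    if encoded is not None:
        return ("MVN",) + encoded
    return None   # constant must come from memory instead
```

Here `encode_mov_or_mvn(0xFFFFFFFF)` selects MVN with an immediate of 0, and a constant such as 0x55555555 returns `None` because no rotation of it or of its complement fits in 8 bits.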

Loading full 32-bit constants
Although the MOV/MVN mechanism will load a large range of constants into a register, sometimes this mechanism will not generate the required constant.
Therefore, the assembler also provides a method which will load ANY 32-bit constant:
    LDR rd, =numeric constant
If the constant can be constructed using either a MOV or MVN then this will be the instruction actually generated.
Otherwise, the assembler will produce an LDR instruction with a PC-relative address to read the constant from a literal pool.
    LDR r0, =0x42          ; generates MOV r0, #0x42
    LDR r0, =0x55555555    ; generates LDR r0, [pc, offset to lit pool]
        :
        :
    DCD 0x55555555
As this mechanism will always generate the best instruction for a given case, it is the recommended way of loading constants.

Multiplication Instructions
The Basic ARM provides two multiplication instructions.
Multiply
    MUL{<cond>}{S} Rd, Rm, Rs         ; Rd = Rm * Rs
Multiply Accumulate (does addition for free)
    MLA{<cond>}{S} Rd, Rm, Rs, Rn     ; Rd = (Rm * Rs) + Rn
Restrictions on use:
    Rd and Rm cannot be the same register
        Can be avoided by swapping Rm and Rs around. This works because multiplication is commutative.
    Cannot use PC.
    These will be picked up by the assembler if overlooked.
Operands can be considered signed or unsigned
    Up to user to interpret correctly.

Multiplication Implementation
The ARM makes use of Booth's Algorithm to perform integer multiplication.
On non-M ARMs this operates on 2 bits of Rs at a time.
    For each pair of bits this takes 1 cycle (plus 1 cycle to start with).
    However when there are no more 1's left in Rs, the multiplication will early-terminate.
Example: Multiply 18 and -1: Rd = Rm * Rs
    Rm = 18, Rs = -1 (1111 1111 1111 1111 1111 1111 1111 1111): 17 cycles
    Rm = -1, Rs = 18 (0000 0000 0000 0000 0000 0000 0001 0010): 4 cycles
Note: Compiler does not use early termination criteria to decide on which order to place operands.

Extended Multiply Instructions
M variants of ARM cores contain extended multiplication hardware. This provides three enhancements:
    An 8-bit Booth's Algorithm is used
        Multiplication is carried out faster (maximum for standard instructions is now 5 cycles).
    Early termination method improved so that now completes multiplication when all remaining bit sets contain
        all zeroes (as with non-M ARMs), or
        all ones.
        Thus the previous example would early-terminate in 2 cycles in both cases.
    64-bit results can now be produced from two 32-bit operands
        Higher accuracy.
        Pair of registers used to store result.
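The cycle counts quoted for the two multipliers can be reproduced with a deliberately simplified Python model. It encodes only what the slides state (one start-up cycle, then one cycle per 2-bit or 8-bit group of Rs until early termination); the function names are illustrative and the model is not a cycle-accurate description of any particular core.

```python
MASK32 = 0xFFFFFFFF

def cycles_non_m(rs):
    """Non-M cores: 1 start-up cycle + 1 cycle per 2-bit group until Rs has no 1s left."""
    rs &= MASK32
    groups = 0
    while rs:
        rs >>= 2
        groups += 1
    return 1 + groups

def cycles_m(rs):
    """M cores: 8 bits per cycle, stopping once the remaining bits are all 0s or all 1s."""
    rs &= MASK32
    for groups in range(1, 5):
        remaining_width = 32 - 8 * groups
        remaining = rs >> (8 * groups)
        all_ones = (1 << remaining_width) - 1
        if remaining == 0 or remaining == all_ones:
            return 1 + groups
    return 5
```

With Rs = -1 the non-M model gives 17 cycles and with Rs = 18 it gives 4, while the M model gives 2 cycles for both, in line with the figures on the slides.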

Multiply-Long & Multiply-Accumulate Long
Instructions are
    MULL which gives RdHi,RdLo := Rm * Rs
    MLAL which gives RdHi,RdLo := (Rm * Rs) + RdHi,RdLo
However the full 64 bits of the result now matter (lower precision multiply instructions simply throw the top 32 bits away)
    Need to specify whether operands are signed or unsigned
Therefore syntax of new instructions are:
    UMULL{<cond>}{S} RdLo, RdHi, Rm, Rs
    UMLAL{<cond>}{S} RdLo, RdHi, Rm, Rs
    SMULL{<cond>}{S} RdLo, RdHi, Rm, Rs
    SMLAL{<cond>}{S} RdLo, RdHi, Rm, Rs
Not generated by the compiler.
Warning: Unpredictable on non-M ARMs.

Load / Store Instructions
The ARM is a Load/Store Architecture:
    Does not support memory-to-memory data processing operations.
    Must move data values into registers before using them.
This might sound inefficient, but in practice it isn't:
    Load data values from memory into registers.
    Process data in registers using a number of data processing instructions which are not slowed down by memory access.
    Store results from registers out to memory.
The ARM has three sets of instructions which interact with main memory. These are:
    Single register data transfer (LDR / STR).
    Block data transfer (LDM / STM).
    Single Data Swap (SWP).

Single register data transfer
The basic load and store instructions are:
    Load and Store Word or Byte
        LDR / STR / LDRB / STRB
ARM Architecture Version 4 also adds support for Halfwords and signed data.
    Load and Store Halfword
        LDRH / STRH
    Load Signed Byte or Halfword: load value and sign extend it to 32 bits.
        LDRSB / LDRSH
All of these instructions can be conditionally executed by inserting the appropriate condition code after STR / LDR.
    e.g. LDREQB
Syntax:
    <LDR|STR>{<cond>}{<size>} Rd, <address>

Load and Store Word or Byte: Base Register
The memory location to be accessed is held in a base register
    STR r0, [r1]    ; Store contents of r0 to location pointed to
                    ; by contents of r1.
    LDR r2, [r1]    ; Load r2 with contents of memory location
                    ; pointed to by contents of r1.
[Figure: r0 = 0x5 is the source register for STR; r1 = 0x200 is the base register; memory location 0x200 holds 0x5; r2 = 0x5 is the destination register for LDR.]

Load/Store Word or Byte: Offsets from the Base Register
As well as accessing the actual location contained in the base register, these instructions can access a location offset from the base register pointer.
This offset can be
    An unsigned 12-bit immediate value (ie 0 - 4095 bytes).
    A register, optionally shifted by an immediate value
This can be either added or subtracted from the base register:
    Prefix the offset value or register with '+' (default) or '-'.
This offset can be applied:
    before the transfer is made: Pre-indexed addressing
        optionally auto-incrementing the base register, by postfixing the instruction with an '!'.
    after the transfer is made: Post-indexed addressing
        causing the base register to be auto-incremented.

Load/Store Word or Byte: Pre-indexed Addressing
Example: STR r0, [r1,#12]
[Figure: base register r1 = 0x200; offset 12 gives address 0x20c; source register r0 = 0x5 is stored at 0x20c.]
To store to location 0x1f4 instead use: STR r0, [r1,#-12]
To auto-increment base pointer to 0x20c use: STR r0, [r1,#12]!
If r2 contains 3, access 0x20c by multiplying this by 4:
    STR r0, [r1,r2,LSL #2]

Load and Store Word or Byte: Post-indexed Addressing
Example: STR r0, [r1],#12
[Figure: original base register r1 = 0x200; source register r0 = 0x5 is stored at 0x200; offset 12 is then added, giving updated base register r1 = 0x20c.]
To auto-increment the base register to location 0x1f4 instead use:
    STR r0, [r1],#-12
If r2 contains 3, auto-increment base register to 0x20c by multiplying this by 4:
    STR r0, [r1],r2,LSL #2
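The difference between pre-indexed, pre-indexed with write-back, and post-indexed stores can be sketched in Python using a dictionary as memory. The function `store_word` and its parameters are invented for this illustration; only immediate offsets are modelled.

```python
MASK32 = 0xFFFFFFFF

def store_word(mem, regs, rd, rn, offset, pre_indexed=True, writeback=False):
    """Model STR rd,[rn,#offset]{!} (pre-indexed) or STR rd,[rn],#offset (post-indexed).

    Returns the address that was written.
    """
    base = regs[rn]
    if pre_indexed:
        addr = (base + offset) & MASK32      # offset applied before the transfer
        if writeback:                        # the '!' suffix updates the base
            regs[rn] = addr
    else:
        addr = base                          # transfer uses the unmodified base
        regs[rn] = (base + offset) & MASK32  # base is always updated afterwards
    mem[addr] = regs[rd]
    return addr
```

Starting from r0 = 5 and r1 = 0x200, the pre-indexed form writes to 0x20c and leaves r1 alone, the `!` form also moves r1 to 0x20c, and the post-indexed form writes to 0x200 before moving r1 to 0x20c.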

Load and Stores with User Mode Privilege
When using post-indexed addressing, there is a further form of Load/Store Word/Byte:
    <LDR|STR>{<cond>}{B}T Rd, <post_indexed_address>
When used in a privileged mode, this does the load/store with user mode privilege.
    Normally used by an exception handler that is emulating a memory access instruction that would normally execute in user mode.

Example Usage of Addressing Modes
Imagine an array, the first element of which is pointed to by the contents of r0.
[Figure: array in memory with elements 0 to 3 at offsets 0, 4, 8, and 12 from r0, the pointer to start of array.]
If we want to access a particular element, then we can use pre-indexed addressing:
    r1 is element we want.
    LDR r2, [r0, r1, LSL #2]
If we want to step through every element of the array, for instance to produce sum of elements in the array, then we can use post-indexed addressing within a loop:
    r1 is address of current element (initially equal to r0).
    LDR r2, [r1], #4
Use a further register to store the address of final element, so that the loop can be correctly terminated.

Offsets for Halfword and Signed Halfword / Byte Access
The Load and Store Halfword and Load Signed Byte or Halfword instructions can make use of pre- and post-indexed addressing in much the same way as the basic load and store instructions.
However the actual offset formats are more constrained:
    The immediate value is limited to 8 bits (rather than 12 bits) giving an offset of 0 - 255 bytes.
    The register form cannot have a shift applied to it.

Effect of endianness
The ARM can be set up to access its data in either little- or big-endian format.
Little-endian:
    Least significant byte of a word is stored in bits 0-7 of an addressed word.
Big-endian:
    Least significant byte of a word is stored in bits 24-31 of an addressed word.
This has no real relevance unless data is stored as words and then accessed in smaller sized quantities (halfwords or bytes).
    Which byte / halfword is accessed will depend on the endianness of the system involved.

YA Endianness Example
r0 = 0x11223344
STR r0, [r1]        ; r1 = 0x100
[Figure: the word stored in memory at 0x100, drawn with bit positions 31-24, 23-16, 15-8, 7-0 for both byte orders (11 22 33 44 and 44 33 22 11).]
LDRB r2, [r1]       ; r1 = 0x100
    Little-endian: r2 = 0x00000044 (r2 = 0x44)
    Big-endian:    r2 = 0x00000011 (r2 = 0x11)
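The same experiment can be reproduced in Python with `int.to_bytes`, which lays a word out in memory in a chosen byte order. The helper names `str_word` and `ldrb` are illustrative; the byte string stands in for memory starting at address 0x100.

```python
def str_word(value, byteorder):
    """Model STR: return the 4 bytes written to memory, lowest address first."""
    return value.to_bytes(4, byteorder)

def ldrb(memory, offset=0):
    """Model LDRB: read one byte and zero extend it."""
    return memory[offset]
```

Reading back the byte at the lowest address gives 0x44 for `"little"` and 0x11 for `"big"`, which matches the two values of r2 in the example.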

Block Data Transfer (1)
The Load and Store Multiple instructions (LDM / STM) allow between 1 and 16 registers to be transferred to or from memory.
The transferred registers can be either:
    Any subset of the current bank of registers (default).
    Any subset of the user mode bank of registers when in a privileged mode (postfix instruction with a '^').
Encoding:  Cond [31:28] | 1 0 0 [27:25] | P [24] | U [23] | S [22] | W [21] | L [20] | Rn [19:16] | Register list [15:0]
    Cond: condition field
    P: Pre/Post indexing bit
        0 = Post; add offset after transfer
        1 = Pre; add offset before transfer
    U: Up/Down bit
        0 = Down; subtract offset from base
        1 = Up; add offset to base
    S: PSR and force user bit
        0 = don't load PSR or force user mode
        1 = load PSR or force user mode
    W: Write-back bit
        0 = no write-back
        1 = write address into base
    L: Load/Store bit
        0 = Store to memory
        1 = Load from memory
    Rn: base register
    Register list: each bit corresponds to a particular register. For example:
        Bit 0 set causes r0 to be transferred.
        Bit 0 unset causes r0 not to be transferred.
        At least one register must be transferred as the list cannot be empty.

Block Data Transfer (2)
Base register used to determine where memory access should occur.
    4 different addressing modes allow increment and decrement inclusive or exclusive of the base register location.
    Base register can be optionally updated following the transfer (by appending it with an '!').
    Lowest register number is always transferred to/from lowest memory location accessed.
These instructions are very efficient for
    Saving and restoring context
        For this useful to view memory as a stack.
    Moving large blocks of data around memory
        For this useful to directly represent functionality of the instructions.

Stacks
A stack is an area of memory which grows as new data is "pushed" onto the "top" of it, and shrinks as data is "popped" off the top.
Two pointers define the current limits of the stack.
    A base pointer
        used to point to the "bottom" of the stack (the first location).
    A stack pointer
        used to point to the current "top" of the stack.
[Figure: after PUSH {1,2,3} the stack holds 1 (at BASE), 2, 3 (at SP); after POP the result of the pop is 3 and the stack holds 1 (at BASE), 2 (at SP).]

Stack Operation
Traditionally, a stack grows down in memory, with the last "pushed" value at the lowest address. The ARM also supports ascending stacks, where the stack structure grows up through memory.
The value of the stack pointer can either:
    Point to the last occupied address (Full stack)
        and so needs pre-decrementing (ie before the push)
    Point to the next unoccupied address (Empty stack)
        and so needs post-decrementing (ie after the push)
The stack type to be used is given by the postfix to the instruction:
    STMFD / LDMFD: Full Descending stack
    STMFA / LDMFA: Full Ascending stack
    STMED / LDMED: Empty Descending stack
    STMEA / LDMEA: Empty Ascending stack
Note: ARM Compiler will always use a Full Descending stack.

Stack Examples
    STMFD sp!, {r0,r1,r3-r5}
    STMED sp!, {r0,r1,r3-r5}
    STMFA sp!, {r0,r1,r3-r5}
    STMEA sp!, {r0,r1,r3-r5}
[Figure: four copies of the memory region 0x3e8 to 0x418, one per instruction, each with Old SP at 0x400 and showing where r0, r1, r3, r4, and r5 are stored (r5 at the highest address, r0 at the lowest) and where SP points afterwards.]
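The addresses written by each of the four store variants can be worked out with a short Python sketch. The function name `stm_addresses` is hypothetical, and the suffix mapping shown (FD = DB, ED = DA, FA = IB, EA = IA) applies to STM only; the matching LDM forms use the opposite direction.

```python
def stm_addresses(suffix, sp, reg_count):
    """Return (addresses written, new sp) for STM<suffix> sp!, {reg_count registers}.

    Addresses are listed lowest first, so the lowest-numbered register
    goes to the first address in the list.
    """
    size = 4 * reg_count
    if suffix in ("EA", "IA"):        # increment after
        start, new_sp = sp, sp + size
    elif suffix in ("FA", "IB"):      # increment before
        start, new_sp = sp + 4, sp + size
    elif suffix in ("ED", "DA"):      # decrement after
        start, new_sp = sp - size + 4, sp - size
    elif suffix in ("FD", "DB"):      # decrement before
        start, new_sp = sp - size, sp - size
    else:
        raise ValueError("unknown STM suffix: " + suffix)
    return [start + 4 * i for i in range(reg_count)], new_sp
```

With an old SP of 0x400 and five registers, the full descending store writes 0x3ec to 0x3fc and leaves SP at 0x3ec, while the empty ascending store writes 0x400 to 0x410 and leaves SP at 0x414.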

Stacks and Subroutines
One use of stacks is to create temporary register workspace for subroutines. Any registers that are needed can be pushed onto the stack at the start of the subroutine and popped off again at the end so as to restore them before return to the caller:
    STMFD sp!, {r0-r12, lr}    ; stack all registers
    ........                   ; and the return address
    ........
    LDMFD sp!, {r0-r12, pc}    ; load all the registers
                               ; and return automatically
See the chapter on the ARM Procedure Call Standard in the SDT Reference Manual for further details of register usage within subroutines.
If the pop instruction also had the 'S' bit set (using '^') then the transfer of the PC when in a privileged mode would also cause the SPSR to be copied into the CPSR (see exception handling module).

Direct Functionality of Block Data Transfer
When LDM / STM are not being used to implement stacks, it is clearer to specify exactly what the instruction does:
  i.e. specify whether to increment or decrement the base pointer, before or after the memory access.

In order to do this, LDM / STM support a further syntax in addition to the stack one:
  STMIA / LDMIA: Increment After
  STMIB / LDMIB: Increment Before
  STMDA / LDMDA: Decrement After
  STMDB / LDMDB: Decrement Before
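
The stack suffixes and the direct suffixes name the same encodings. A minimal lookup sketch, with the hypothetical helper name direct_suffix, shows the correspondence; note that a load uses the opposite pointer update from the matching store.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the direct suffix (IA/IB/DA/DB) equivalent
   to a stack suffix (FD/FA/ED/EA), for a store (STM) or a load (LDM). */
static const char *direct_suffix(const char *stack_suffix, int is_load)
{
    static const struct { const char *stack, *store, *load; } map[] = {
        { "FD", "DB", "IA" },  /* STMFD = STMDB, LDMFD = LDMIA */
        { "FA", "IB", "DA" },  /* STMFA = STMIB, LDMFA = LDMDA */
        { "ED", "DA", "IB" },  /* STMED = STMDA, LDMED = LDMIB */
        { "EA", "IA", "DB" },  /* STMEA = STMIA, LDMEA = LDMDB */
    };
    size_t i;

    for (i = 0; i < sizeof map / sizeof map[0]; i++)
        if (strcmp(stack_suffix, map[i].stack) == 0)
            return is_load ? map[i].load : map[i].store;
    return NULL; /* not a stack suffix */
}
```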


Example: Block Copy
Copy a block of memory, which is an exact multiple of 12 words long, from the location pointed to by r12 to the location pointed to by r13. r14 points to the end of the block to be copied.
        ; r12 points to the start of the source data
        ; r14 points to the end of the source data
        ; r13 points to the start of the destination data
loop    LDMIA   r12!, {r0-r11}  ; load 48 bytes
        STMIA   r13!, {r0-r11}  ; and store them
        CMP     r12, r14        ; check for the end
        BNE     loop            ; and loop until done

This loop transfers 48 bytes in 31 cycles.
Over 50 Mbytes/sec at 33 MHz.

[Figure: source block lying between r12 and r14, destination starting at r13, with an arrow marking increasing memory.]
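
The loop and its throughput figure can be mirrored in C. This is a sketch only: block_copy and throughput are hypothetical names, the local array stands in for r0-r11, and like the assembly it assumes the block length is a non-zero exact multiple of 12 words.

```c
#include <assert.h>
#include <stdint.h>

enum { WORDS_PER_PASS = 12 };   /* r0-r11 */

/* Copies words from src up to (not including) end into dst, 12 per pass.
   Returns the number of passes through the loop. */
static unsigned block_copy(const uint32_t *src, const uint32_t *end, uint32_t *dst)
{
    unsigned passes = 0;
    int i;

    do {
        uint32_t regs[WORDS_PER_PASS];
        for (i = 0; i < WORDS_PER_PASS; i++) regs[i] = *src++;  /* LDMIA r12!, {r0-r11} */
        for (i = 0; i < WORDS_PER_PASS; i++) *dst++ = regs[i];  /* STMIA r13!, {r0-r11} */
        passes++;
    } while (src != end);                                       /* CMP r12, r14 / BNE loop */
    return passes;
}

/* Bytes per second for a loop that moves `bytes` in `cycles` at clock `hz`. */
static double throughput(double bytes, double cycles, double hz)
{
    return bytes * hz / cycles;
}
```

With the slide's numbers, 48 bytes x 33 MHz / 31 cycles is a little over 51 million bytes per second, consistent with the "over 50 Mbytes/sec" claim.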


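Swap and Swap Byte Instructions
Atomic operation of a memory read followed by a memory write which moves byte or word quantities between registers and memory.
Syntax:
  SWP{<cond>}{B} Rd, Rm, [Rn]

[Figure: (1) the memory location addressed by Rn is read into temp; (2) Rm is written to that memory location; (3) temp is copied into Rd.]

To implement an actual swap of contents make Rd = Rm.
The compiler cannot produce this instruction.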
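
The three steps of SWP can be written out in C. This is a plain model, not an atomic operation: on the core the read and write form one indivisible access, which ordinary C assignments do not guarantee. The names swp and try_lock, and the lock convention (0 = free, 1 = taken), are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of SWP Rd, Rm, [Rn]; the return value is what ends up in Rd. */
static uint32_t swp(uint32_t rm, uint32_t *rn)
{
    uint32_t temp = *rn;   /* 1: temp <- memory[Rn] */
    *rn = rm;              /* 2: memory[Rn] <- Rm   */
    return temp;           /* 3: Rd <- temp         */
}

/* Hypothetical use: claim a lock word. Succeeds only if it was free. */
static int try_lock(uint32_t *lock)
{
    return swp(1u, lock) == 0u;
}
```

A lock of this kind is one common reason to want an atomic read-then-write: the old value tells the caller whether it won the lock.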

Software Interrupt (SWI)
  Bits 31-28: Condition
  Bits 27-24: 1 1 1 1 (instruction type: Software Interrupt)
  Bits 23-0:  SWI number (the comment field)

In effect, a SWI is a user-defined instruction.
It causes an exception trap to the SWI hardware vector (thus causing a change to supervisor mode, plus the associated state saving), thus causing the SWI exception handler to be called.
The handler can then examine the comment field of the instruction to decide what operation has been requested.
By making use of the SWI mechanism, an operating system can implement a set of privileged operations which applications running in user mode can request.
See Exception Handling Module for further details.
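
How a handler might pick the fields out of the instruction word can be sketched as below. The function names are hypothetical; the bit positions are those of the 32-bit ARM-state SWI encoding (bits 27 to 24 all ones, comment field in bits 23 to 0).

```c
#include <assert.h>
#include <stdint.h>

/* True if the 32-bit ARM instruction word is a SWI. */
static int is_swi(uint32_t instr)
{
    return ((instr >> 24) & 0xFu) == 0xFu;
}

/* Condition field, bits 31..28 (0xE means "always"). */
static uint32_t swi_condition(uint32_t instr)
{
    return instr >> 28;
}

/* Comment field, bits 23..0: the requested operation number. */
static uint32_t swi_number(uint32_t instr)
{
    return instr & 0x00FFFFFFu;
}
```

For example, the word 0xEF000011 decodes as an unconditional SWI with number 0x11.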

Backup


Assembler: Pseudo-ops
  AREA  -> chunks of data ($data) or code ($code)
  ADR   -> load address into a register
             ADR R0, BUFFER
  ALIGN -> adjust location counter to word boundary, usually after a storage directive
  END   -> no more to assemble


Assembler: Pseudo-ops
  DCD -> defined word value storage area
           BOW DCD 1024, 2055, 9051
  DCB -> defined byte value storage area
           BOB DCB 10, 12, 15
  %   -> zeroed out byte storage area
           BLBYTE % 30


Assembler: Pseudo-ops
  IMPORT -> name of routine to import for use in this routine
              IMPORT _printf ; C print routine
  EXPORT -> name of routine to export for use in other routines
              EXPORT add2 ; add2 routine
  EQU    -> symbol replacement
              loopcnt EQU 5


Assembly Line Format
  label <whitespace> instruction <whitespace> ; comment

  label: created by programmer, alphanumeric
  whitespace: space(s) or tab character(s)
  instruction: op-code mnemonic or pseudo-op with required fields
  comment: preceded by ; ignored by assembler but useful to the programmer for documentation

NOTE: All fields are optional.


Example: C assignments
C:
  x = (a + b) - c;

Assembler:
        ADR r4,a        ; get address for a
        LDR r0,[r4]     ; get value of a
        ADR r4,b        ; get address for b, reusing r4
        LDR r1,[r4]     ; get value of b
        ADD r3,r0,r1    ; compute a+b
        ADR r4,c        ; get address for c
        LDR r2,[r4]     ; get value of c
        SUB r3,r3,r2    ; complete computation of x
        ADR r4,x        ; get address for x
        STR r3,[r4]     ; store value of x

2008 Wayne Wolf, Computers as Components 2nd ed.


Example: C assignment
C:
  y = a*(b+c);

Assembler:
        ADR r4,b        ; get address for b
        LDR r0,[r4]     ; get value of b
        ADR r4,c        ; get address for c
        LDR r1,[r4]     ; get value of c
        ADD r2,r0,r1    ; compute partial result
        ADR r4,a        ; get address for a
        LDR r0,[r4]     ; get value of a
        MUL r2,r2,r0    ; compute final value for y
        ADR r4,y        ; get address for y
        STR r2,[r4]     ; store y


Example: C assignment
C:
  z = (a << 2) | (b & 15);

Assembler:
        ADR r4,a            ; get address for a
        LDR r0,[r4]         ; get value of a
        MOV r0,r0,LSL #2    ; perform shift
        ADR r4,b            ; get address for b
        LDR r1,[r4]         ; get value of b
        AND r1,r1,#15       ; perform AND
        ORR r1,r0,r1        ; perform OR
        ADR r4,z            ; get address for z
        STR r1,[r4]         ; store value for z


Example: if statement
C:
  if (a > b) { x = 5; y = c + d; } else x = c - d;

Assembler:
        ; compute and test condition
        ADR r4,a        ; get address for a
        LDR r0,[r4]     ; get value of a
        ADR r4,b        ; get address for b
        LDR r1,[r4]     ; get value for b
        CMP r0,r1       ; compare a with b
        BLE fblock      ; if a <= b, branch to false block


if statement, cont'd.
        ; true block
        MOV r0,#5       ; generate value for x
        ADR r4,x        ; get address for x
        STR r0,[r4]     ; store x
        ADR r4,c        ; get address for c
        LDR r0,[r4]     ; get value of c
        ADR r4,d        ; get address for d
        LDR r1,[r4]     ; get value of d
        ADD r0,r0,r1    ; compute y
        ADR r4,y        ; get address for y
        STR r0,[r4]     ; store y
        B after         ; branch around false block


if statement, cont'd.
        ; false block
fblock  ADR r4,c        ; get address for c
        LDR r0,[r4]     ; get value of c
        ADR r4,d        ; get address for d
        LDR r1,[r4]     ; get value for d
        SUB r0,r0,r1    ; compute c-d
        ADR r4,x        ; get address for x
        STR r0,[r4]     ; store value of x
after   ...


Example: Conditional instruction implementation
        ; true block, executed when a > b (after CMP r0,r1 with r0 = a, r1 = b)
        MOVGT r0,#5     ; generate value for x
        ADRGT r4,x      ; get address for x
        STRGT r0,[r4]   ; store x
        ADRGT r4,c      ; get address for c
        LDRGT r0,[r4]   ; get value of c
        ADRGT r4,d      ; get address for d
        LDRGT r1,[r4]   ; get value of d
        ADDGT r0,r0,r1  ; compute y
        ADRGT r4,y      ; get address for y
        STRGT r0,[r4]   ; store y

Conditional instruction implementation, cont'd.
        ; false block, executed when a <= b
        ADRLE r4,c      ; get address for c
        LDRLE r0,[r4]   ; get value of c
        ADRLE r4,d      ; get address for d
        LDRLE r1,[r4]   ; get value for d
        SUBLE r0,r0,r1  ; compute c-d
        ADRLE r4,x      ; get address for x
        STRLE r0,[r4]   ; store value of x


Example: switch statement
C:
  switch (test) { case 0: ... break; case 1: ... }

Assembler:
        ADR r2,test             ; get address for test
        LDR r0,[r2]             ; load value for test
        ADR r1,switchtab        ; load address for switch table
        LDR pc,[r1,r0,LSL #2]   ; index switch table and branch to the selected case
switchtab DCD case0
        DCD case1
        ...
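
The same table-driven dispatch can be sketched in C with an array of function pointers. The case bodies and their return values are placeholders, since the slide elides them, and the range check is an addition that the assembly shown does not perform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical case bodies; the slide does not show what the cases do. */
static int case0(void) { return 100; }
static int case1(void) { return 101; }

/* Counterpart of "switchtab DCD case0 / DCD case1". */
static int (*const switchtab[])(void) = { case0, case1 };

/* Indexes the table with test and calls the selected entry.
   Returns -1 for an out-of-range selector. */
static int dispatch(unsigned test)
{
    if (test >= sizeof switchtab / sizeof switchtab[0])
        return -1;
    return switchtab[test]();
}
```

Each table entry is one word, which is why the assembly scales the selector with LSL #2 before indexing.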


Example: FIR filter
C:
  for (i=0, f=0; i<N; i++)
      f = f + c[i]*x[i];

Assembler:
        ; loop initiation code
        MOV r0,#0       ; use r0 for i
        MOV r8,#0       ; use separate index for arrays
        ADR r2,N        ; get address for N
        LDR r1,[r2]     ; get value of N
        MOV r2,#0       ; use r2 for f


FIR filter, cont'd.
        ADR r3,c        ; load r3 with base of c
        ADR r5,x        ; load r5 with base of x
        ; loop body
loop    LDR r4,[r3,r8]  ; get c[i]
        LDR r6,[r5,r8]  ; get x[i]
        MUL r4,r4,r6    ; compute c[i]*x[i]
        ADD r2,r2,r4    ; add into running sum
        ADD r8,r8,#4    ; add one word offset to array index
        ADD r0,r0,#1    ; add 1 to i
        CMP r0,r1       ; exit?
        BLT loop        ; if i < N, continue
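
For comparison, a self-contained C version of the loop is sketched below, with comments tying each variable to the register the assembly uses. The function name fir and the int element type are assumptions. One difference is worth noting: the assembly tests the condition only at the bottom, so its body runs once even when N is 0, whereas the C for loop does not.

```c
#include <assert.h>

/* Hypothetical wrapper around the slide's loop: returns f. */
static int fir(const int *c, const int *x, int n)
{
    int f = 0;                  /* r2: running sum          */
    int i;                      /* r0: loop counter, r1 = N */

    for (i = 0; i < n; i++)
        f = f + c[i] * x[i];    /* MUL r4,r4,r6 then ADD r2,r2,r4 */
    return f;
}
```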


ARM Instruction Set Summary (1/4 to 4/4)
[Figures: four slides of instruction set summary tables.]