ARM System-on-Chip Architecture

By: S.B. Furber
Published by: Addison Wesley Longman (2000)
ISBN: 0-201-67519-6

This is the 2nd edition of ARM System Architecture, including many updates on the first edition: the ARM9 and ARM10 processor families are covered, as are embedded trace, AMULET3, and several new system-on-chip applications such as Bluetooth and GSM telephony.

Contents

The book describes the design and operation of the ARM processor, a 32-bit RISC microprocessor from ARM Ltd. Full details are given of the ARM and Thumb instruction sets and the memory management architecture. There is an illustrated discussion of how the ARM instruction set supports high-level language constructs.

ARM products are described, including the ARM7TDMI, ARM9TDMI and ARM10TDMI cores, the 710T, 720T, 740T, 920T, 940T, 1020E and StrongARM, with some information on the AMULET asynchronous ARM cores. Embedded system design using the ARM is covered, including the debug methodology, hardware system design principles and software issues. Examples include the ARM7100 chip used in the Psion 5, the ARM7500 NC chip, the VLSI Ruby II, VIP and OneC GSM chips, the Ericsson-VLSI Bluetooth chip and the SA-1100 high-performance PDA chip.

The full Table of Contents is given below.

Aims

This book introduces the concepts and methodologies employed in designing a system-on-chip (SoC) based around a microprocessor and in designing the microprocessor core itself. The principles of microprocessor design are made concrete by extensive illustrations based upon the ARM.

The aim of the book is to assist the reader in understanding how SoCs and microprocessors are designed and used, and why a modern processor is designed the way that it is. The reader who wishes to know only the general principles should find that the ARM illustrations add substance to issues which can otherwise appear somewhat ethereal; the reader who specifically wishes to understand design of the ARM should find that the general principles illuminate the rationale for the ARM being as it is.

Other commercial microprocessor architectures are not described in this book. The reader who wishes to make a comparative study of such architectures will find the required information on the ARM here but must look elsewhere for information on other designs.

Audience

The book is intended to be of use to two distinct groups of readers:

  • Professional hardware and software engineers who are tasked with designing an SoC product which incorporates an ARM processor, or who are evaluating the ARM for a product, should find the book helpful in their duties. Although there is considerable overlap with ARM technical publications, this book provides a broader context with more background. It is not a substitute for the manufacturer's data, since much detail has had to be omitted, but it should be useful as an introductory overview and adjunct to that data.
  • Students of computer science, computer engineering and electrical engineering should find the material of value at several stages in their courses. Some chapters are closely based on course material previously used in undergraduate teaching; some other material is drawn from a postgraduate course.

Prerequisite knowledge

This book is not intended to be an introductory text on computer architecture or computer logic design. Readers are assumed to have a level of familiarity with these subjects equivalent to that of a second year undergraduate student in computer science or computer engineering. Some first year material is presented, but this is more by way of a refresher than as a first introduction to this material.

No prior familiarity with the ARM processor is assumed.

Table of Contents

Preface v

1 An Introduction to Processor Design 1

  • 1.1 Processor architecture and organization 2
  • 1.2 Abstraction in hardware design 3
  • 1.3 MU0 - a simple processor 7
  • 1.4 Instruction set design 14
  • 1.5 Processor design trade-offs 19
  • 1.6 The Reduced Instruction Set Computer 24
  • 1.7 Design for low power consumption 28
  • 1.8 Examples and exercises 32

2 The ARM Architecture 35

  • 2.1 The Acorn RISC Machine 36
  • 2.2 Architectural inheritance 37
  • 2.3 The ARM programmer's model 39
  • 2.4 ARM development tools 43
  • 2.5 Example and exercises 47

3 ARM Assembly Language Programming 49

  • 3.1 Data processing instructions 50
  • 3.2 Data transfer instructions 55
  • 3.3 Control flow instructions 63
  • 3.4 Writing simple assembly language programs 69
  • 3.5 Examples and exercises 72

4 ARM Organization and Implementation 74

  • 4.1 3-stage pipeline ARM organization 75
  • 4.2 5-stage pipeline ARM organization 78
  • 4.3 ARM instruction execution 82
  • 4.4 ARM implementation 86
  • 4.5 The ARM coprocessor interface 101
  • 4.6 Examples and exercises 103

5 The ARM Instruction Set 105

  • 5.1 Introduction 106
  • 5.2 Exceptions 108
  • 5.3 Conditional execution 111
  • 5.4 Branch and Branch with Link (B, BL) 113
  • 5.5 Branch, Branch with Link and eXchange instructions (BX, BLX) 115
  • 5.6 Software Interrupt (SWI) 117
  • 5.7 Data processing instructions 119
  • 5.8 Multiply instructions 122
  • 5.9 Count leading zeros (CLZ - architecture v5T only) 124
  • 5.10 Single word and unsigned byte data transfer instructions 125
  • 5.11 Half-word and signed byte data transfer instructions 128
  • 5.12 Multiple register transfer instructions 130
  • 5.13 Swap memory and register instructions (SWP) 132
  • 5.14 Status register to general register transfer instructions 133
  • 5.15 General register to status register transfer instructions 134
  • 5.16 Coprocessor instructions 136
  • 5.17 Coprocessor data operations 137
  • 5.18 Coprocessor data transfers 138
  • 5.19 Coprocessor register transfers 139
  • 5.20 Breakpoint instruction (BRK - architecture v5T only) 141
  • 5.21 Unused instruction space 142
  • 5.22 Memory faults 143
  • 5.23 ARM architecture variants 147
  • 5.24 Example and exercises 149

6 Architectural Support for High-Level Languages 151

  • 6.1 Abstraction in software design 152
  • 6.2 Data types 153
  • 6.3 Floating-point data types 158
  • 6.4 The ARM floating-point architecture 163
  • 6.5 Expressions 168
  • 6.6 Conditional statements 170
  • 6.7 Loops 173
  • 6.8 Functions and procedures 175
  • 6.9 Use of memory 180
  • 6.10 Run-time environment 185
  • 6.11 Examples and exercises 186

7 The Thumb Instruction Set 188

  • 7.1 The Thumb bit in the CPSR 189
  • 7.2 The Thumb programmer's model 190
  • 7.3 Thumb branch instructions 191
  • 7.4 Thumb software interrupt instruction 194
  • 7.5 Thumb data processing instructions 195
  • 7.6 Thumb single register data transfer instructions 198
  • 7.7 Thumb multiple register data transfer instructions 199
  • 7.8 Thumb breakpoint instruction 200
  • 7.9 Thumb implementation 201
  • 7.10 Thumb applications 203
  • 7.11 Example and exercises 204

8 Architectural Support for System Development 207

  • 8.1 The ARM memory interface 208
  • 8.2 The Advanced Microcontroller Bus Architecture (AMBA) 216
  • 8.3 The ARM reference peripheral specification 220
  • 8.4 Hardware system prototyping tools 223
  • 8.5 The ARMulator 225
  • 8.6 The JTAG boundary scan test architecture 226
  • 8.7 The ARM debug architecture 232
  • 8.8 Embedded Trace 237
  • 8.9 Signal processing support 239
  • 8.10 Example and exercises 245

9 ARM Processor Cores 247

  • 9.1 ARM7TDMI 248
  • 9.2 ARM8 256
  • 9.3 ARM9TDMI 260
  • 9.4 ARM10TDMI 263
  • 9.5 Discussion 266
  • 9.6 Examples and exercises 267

10 Memory Hierarchy 269

  • 10.1 Memory size and speed 270
  • 10.2 On-chip memory 271
  • 10.3 Caches 272
  • 10.4 Cache design - an example 279
  • 10.5 Memory management 283
  • 10.6 Examples and exercises 289

11 Architectural Support for Operating Systems 290

  • 11.1 An introduction to operating systems 291
  • 11.2 The ARM system control coprocessor 293
  • 11.3 CP15 protection unit registers 294
  • 11.4 ARM protection unit 297
  • 11.5 CP15 MMU registers 298
  • 11.6 ARM MMU architecture 302
  • 11.7 Synchronization 309
  • 11.8 Context switching 310
  • 11.9 Input/Output 312
  • 11.10 Example and exercises 316

12 ARM CPU cores 317

  • 12.1 The ARM710T, ARM720T and ARM740T 318
  • 12.2 The ARM810 323
  • 12.3 The StrongARM SA-110 327
  • 12.4 The ARM920T and ARM940T 335
  • 12.5 The ARM946E-S and ARM966E-S 339
  • 12.6 The ARM1020E 341
  • 12.7 Discussion 344
  • 12.8 Example and exercises 346

13 Embedded ARM Applications 347

  • 13.1 The VLSI Ruby II Advanced Communication Processor 348
  • 13.2 The VLSI ISDN Subscriber Processor 349
  • 13.3 The OneC VWS22100 GSM chip 352
  • 13.4 The Ericsson-VLSI Bluetooth Baseband Controller 355
  • 13.5 The ARM7500 and ARM7500FE 360
  • 13.6 The ARM7100 364
  • 13.7 The SA-1100 368
  • 13.8 Examples and exercises 371

14 The AMULET Asynchronous ARM Processors 374

  • 14.1 Self-timed design 375
  • 14.2 AMULET1 377
  • 14.3 AMULET2 381
  • 14.4 AMULET2e 384
  • 14.5 AMULET3 387
  • 14.6 The DRACO telecommunications controller 390
  • 14.7 A self-timed future? 396
  • 14.8 Examples and exercises 397

Appendix: Computer Logic 399

Glossary 405

Bibliography 410

Index 413

Figures

Figures and tables are available as gzipped PowerPoint files by FTP. Each file contains all the figures and tables (but not photos) from one chapter, arranged as one figure or table per page, each page being a titled landscape slide.

The figures and tables are made freely available on the understanding that any course which makes use of them will have this book as a recommended text.

Get the figures from here.

Please email S.B. Furber if you use these figures as I am interested in tracking their use.

Errata

The following changes were made for the 2nd print run of the book but then seemed to get forgotten in the 3rd print run:

pages 141 and 201, ARM and Thumb breakpoint instructions:

  • The assembler mnemonic for the ARM and Thumb breakpoint instructions is 'BKPT', not 'BRK' (architecture v5T only).

page 154, under 'Number ranges':

  • '0 to 4 294 976 295...' should read
  • '0 to 4 294 967 295...'.

page 155, under 'Signed integers':

  • '-2 147 488 148 to +2 147 488 147...' should read
  • '-2 147 483 648 to +2 147 483 647...'.
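The corrected ranges on pages 154 and 155 follow directly from the 32-bit word size, and can be checked with a short calculation. The following is a minimal sketch; the helper name grouped is hypothetical and is used only to reproduce the book's digit grouping.

```python
# Ranges representable in a 32-bit word.
BITS = 32

unsigned_max = 2**BITS - 1        # largest unsigned value
signed_min = -(2**(BITS - 1))     # most negative two's complement value
signed_max = 2**(BITS - 1) - 1    # most positive two's complement value


def grouped(n: int) -> str:
    """Format n with a space between groups of three digits."""
    return f"{n:,}".replace(",", " ")


print(grouped(unsigned_max))  # 4 294 967 295
print(grouped(signed_min))    # -2 147 483 648
print(grouped(signed_max))    # 2 147 483 647
```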

page 395, Figure 14.14:

  • The processor core legend should be 'AMULET3', not 'AMULETS'. The overlay is also somewhat misaligned relative to the die plot.

page 397, 3rd paragraph:

  • 'Synopsis' should be 'Synopsys'.

The following changes were made for the 3rd print run of the book (in addition to various typographic corrections), but note that the above changes in the 2nd print run were not carried over to the 3rd print run:

page 125, CLZ example:

  • The result of CLZ r1, r0 should be r1 := 23, not r1 := 8.
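As a rough illustration of what CLZ computes, the sketch below models the instruction for a 32-bit register. The input values are hypothetical and are not taken from the book's example; they are chosen only to show one input whose leading-zero count is 23 and another whose count is 8.

```python
def clz32(x: int) -> int:
    """Count the leading zero bits of a 32-bit value (32 if the value is zero)."""
    x &= 0xFFFFFFFF               # keep only the low 32 bits
    return 32 - x.bit_length()    # bit_length() is the index of the top set bit, plus one


print(clz32(0x00000100))  # 23: the highest set bit is bit 8
print(clz32(0x00FFFFFF))  # 8: the highest set bit is bit 23
print(clz32(0))           # 32
```

Note that a value whose top set bit is bit 8 has 23 leading zeros, so the two numbers in the erratum are easily confused.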

page 244, under 'v5TE code example':

  • The second instruction should subtract #2, not r2:
  • SUBS r4, r4, #2 ; decrement loop counter

page 299, Figure 11.2:

  • The break in the address field is between bits 11 & 12, not between bits 19 & 20 as shown.

The following notes apply to all print runs:

page 32, Example 1.1:

  • C[0] = 1 is incorrect; C[0] = Q[0] corresponds to the figure on page 33 and gives the correct result.

page 39, and probably elsewhere, reference to 'system modes':

  • Be careful not to confuse 'system modes' (as in privileged, non-user modes) with 'system mode' (the specific privileged mode that uses the user-mode registers - see page 108). The book should really reserve the term 'system mode' for the specific mode and use 'privileged modes' as the generic term, but it doesn't do this consistently!
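The distinction can be made concrete with the CPSR mode-field encodings. This is a minimal sketch; the function names is_privileged and is_system_mode are hypothetical and serve only to separate the generic term from the specific mode.

```python
# CPSR mode field (bits 4:0) for the ARM operating modes.
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def is_privileged(mode_bits: int) -> bool:
    """True for every valid mode except User (the generic 'privileged modes')."""
    return mode_bits in MODES and mode_bits != 0b10000


def is_system_mode(mode_bits: int) -> bool:
    """True only for the specific System mode, which uses the user-mode registers."""
    return mode_bits == 0b11111


for bits, name in MODES.items():
    print(f"{name:10s} privileged={is_privileged(bits)} system={is_system_mode(bits)}")
```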

pages 67-73, 194 and 204-5, ARM and Thumb supervisor calls (SWIs):

  • The style of the SWIs given in the book is based on the now obsolete Demon monitor program (which is not supported by ARM software tools after SDT2.11). The Angel SWI convention uses a single SWI (0x123456 in ARM code, 0xAB in Thumb code) and specifies the system call in r0 with other parameters in memory pointed to by r1. The example programs will therefore not work under SDT2.50 or ADS.
  • For examples of Angel-compatible versions of the "Hello World" programs see: Angel code examples.

page 70, block copy code example:

  • As addresses are unsigned integers the BLT on line 11 of the code should be BLO. (This will only matter if TABLE1 includes address 0x80000000.)
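The difference between the two branch conditions can be modelled as a signed versus an unsigned comparison of 32-bit values. The sketch below is illustrative only: the addresses are hypothetical, and the loop test (continue while the pointer is below the end address) is an assumption about the shape of the code rather than a copy of it.

```python
def to_signed32(x: int) -> int:
    """Reinterpret a 32-bit pattern as a two's complement signed integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def signed_less_than(a: int, b: int) -> bool:
    """Condition taken by BLT after CMP a, b (signed comparison)."""
    return to_signed32(a) < to_signed32(b)


def unsigned_lower(a: int, b: int) -> bool:
    """Condition taken by BLO after CMP a, b (unsigned comparison)."""
    return (a & 0xFFFFFFFF) < (b & 0xFFFFFFFF)


# A table well below 0x80000000: both conditions agree.
print(signed_less_than(0x1000, 0x2000), unsigned_lower(0x1000, 0x2000))  # True True

# A table that straddles 0x80000000: only the unsigned test is correct.
pointer, end = 0x7FFFFFFC, 0x80000004
print(signed_less_than(pointer, end), unsigned_lower(pointer, end))      # False True
```

With the signed test the loop would stop early in the second case, because the end address is seen as a large negative number.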

page 89, Figure 4.12:

  • The logic in the figure is incorrect when the function select values in Table 4.1 (on page 90) are applied. Both the 'NB bus' input and the 'ALU bus' output should be inverted.

page 123, Assembler formats:

  • The assembly format for the 64-bit result multiply instructions has RdHi and RdLo in the wrong order.

    The correct assembly format is: <mul>{<cond>}{S} RdLo, RdHi, Rm, Rs
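To show why the operand order matters, the following sketch models an unsigned 64-bit result multiply and returns the two result halves in the corrected order, RdLo first. The function name umull is borrowed from the instruction mnemonic; the model itself is an illustration and not taken from the book.

```python
from typing import Tuple


def umull(rm: int, rs: int) -> Tuple[int, int]:
    """Return (RdLo, RdHi) for the unsigned 64-bit product of two 32-bit values."""
    product = (rm & 0xFFFFFFFF) * (rs & 0xFFFFFFFF)
    rd_lo = product & 0xFFFFFFFF   # least significant 32 bits
    rd_hi = product >> 32          # most significant 32 bits
    return rd_lo, rd_hi


rd_lo, rd_hi = umull(0xFFFFFFFF, 0xFFFFFFFF)
print(f"RdLo = 0x{rd_lo:08X}, RdHi = 0x{rd_hi:08X}")  # RdLo = 0x00000001, RdHi = 0xFFFFFFFE
```

Swapping the two destination registers, as the printed format suggests, would leave the high and low words of the result in each other's registers.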

page 268, Exercise 9.1.2:

  • The units on the 4th line should be MIPS^3/W, not MIP^3/W.

pages 406-407, Glossary:

  • The definitions given for EPLD and FPGA suggest the latter is one-time programmable whereas the former is reprogrammable. Current usage of FPGA is less strict, and now includes reprogrammable devices (such as Xilinx parts), so it is more accurate to view FPGA as covering all field programmable devices including one-time and reprogrammable while EPLD just covers reprogrammable devices.

page 406, Glossary: EPROM

  • The acronym 'EPROM' stands for Erasable Programmable Read Only Memory, not Electrically-Programmable Read Only Memory.