

## **System Development**

The University of Manchester

### Outline:

- system modelling
- on-chip debug
- AMBA
- rapid silicon prototyping
- embedded ARM cores

#### hands-on: system modelling



## **System Development**

### Outline:

- → system modelling
- on-chip debug
- AMBA
- rapid silicon prototyping
- embedded ARM cores

#### hands-on: system modelling



## **System Modelling**

- From prototype environment ...
  - Undefined resources: Core, Memory, Cache
  - C library use of hardware
  - Debug libraries
- ... to final product
  - Standalone embedded application
  - Specific memory layout of the target hardware
  - Initialisation sequence

## System development & debugging


- A common set of debugger front-ends
  - armsd, AXD, RVD
- Same code can be debugged on:
  - Software simulation targets
    - ARMulator
  - Hardware targets
    - RealView ICE, Multi-ICE, RV Trace, MultiTrace, Angel



## **Debugger-target interface**

Figure: the ARM debugger (AXD) connects to the target through RDI.

#### Remote Debug Interface (RDI)



## Software debug: the ARMulator

A software model of an ARM core with:

- support for Thumb instructions
- a programmable memory interface
  - for modelling the target memory system
  - various rapid prototyping tools are supplied
- a coprocessor interface
  - supporting custom coprocessor models
- an operating system interface
  - system calls handled by host or emulation




## The **ARMulator**

- The core of a complete system model
  - clock-cycle accurate
  - inspect registers and memory
  - set breakpoints and watchpoints
- Supports software development
  - concurrently with hardware development
  - higher performance than detailed hardware models

## From ARMulator to on-chip Debug


- It is important to understand the simulator's default behaviour
- The default build needs to be tailored to specific needs:
  - Use of the ADS/RVDS C library
    - semihosted SWI calls
  - Memory map and linker placement rules
  - Reset and initialisation



## **ADS/RVDS C Library**

- Avoiding C library semihosting
  - import `__use_no_semihosting_swi` (in C: `#pragma import(__use_no_semihosting_swi)`)
  - the linker reports any remaining semihosting SWI calls
- Retargeting C library calls
  - example: retargeting the printf() family of functions to print to a hardware UART

```
#include <stdio.h>

extern void sendchar (char *c); /* UART communications */

int fputc (int c, FILE *f)
{ /* redirect a char to the UART */
    char tempch = (char) c; /* sendchar() takes a pointer to the character */
    sendchar (&tempch);
    return c;
}
```
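The `sendchar()` routine itself is target-specific and is not shown on the slide. As a rough sketch only, it might poll a status register and then write the data register. The register layout, the `UART_TX_FULL` bit and the name `uart_sendchar` are all hypothetical, chosen for illustration; a real UART will differ.

```c
#include <stdint.h>

/* Hypothetical UART register block; real hardware will differ. */
typedef struct {
    volatile uint32_t data;    /* write a byte here to transmit it */
    volatile uint32_t status;  /* bit 0 assumed to mean "transmitter full" */
} uart_regs;

#define UART_TX_FULL 0x1u

/* Busy-wait until the transmitter has room, then send one character. */
void uart_sendchar(uart_regs *uart, char *c)
{
    while (uart->status & UART_TX_FULL) {
        /* spin until the UART can accept another byte */
    }
    uart->data = (uint32_t)(unsigned char)*c;
}
```

Passing the register block as a pointer keeps the sketch testable on a host; on a target the pointer would be fixed at the UART's base address from the memory map.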



## Image memory map

- Target hardware usually has several memory devices at different address ranges
- Scatterloading
  - describes the memory location of code and data at load time and at run time
  - `armlink -scatter scatfile.scf file1.o file2.o`





### **Reset and Initialisation**

- Usually need to initialise:
  - vector table
  - stack pointers in IRQ/FIQ modes
  - MMU/MPU
  - other hardware

## Initialisation sequence example






## **System Development**

### Outline:

- system modelling
- → on-chip debug
- AMBA
- rapid silicon prototyping
- embedded ARM cores

#### hands-on: system modelling



## **On-chip debug**

- Debug monitor: Angel
  - runs on target hardware with the application
  - requires target resources (memory, exception vectors, ...)
- Integrated on-chip debug: Multi-ICE / RealView ICE
  - non-intrusive, requires almost no resources
  - instead, uses additional debug hardware within the core
    - ARM processor debug extension signals (main ones: BREAKPT, DBGRQ, DBGACK)
    - EmbeddedICE, Embedded Trace


## EmbeddedICE

- Hardware registers controlled through:
  - JTAG boundary scan
  - Debug coprocessor
- Two possible actions:
  - Halt debug-mode debugging
    - processor halts at debug events
    - unsuitable for real-time systems
  - Monitor debug-mode debugging
    - debug events generate exceptions (aborts)
    - non-intrusive mode, for debugging real-time systems



## JTAG

- Joint Test Action Group
  - looked especially at PCB production test
    - surface mount defeats the bed-of-nails approach
  - on-chip scan path gives access to pins
    - so chip-to-chip paths can be tested
  - other uses are a side benefit:
    - in-circuit testing of the chip core logic
    - chip debug support, e.g. EmbeddedICE
  - Note: not primarily for VLSI production test!


## **JTAG Boundary Scan Organization**




### EmbeddedICE

- ICE functions:
  - breakpoints, watchpoints
    - generate an event at a particular instruction/data access
    - hardware can easily be included on chip
    - N.B. ROM breakpoints require hardware!
  - trace buffer
    - retains interface state before and after trigger
    - Embedded Trace Macrocell now supported
    - uses hardware compression to reduce pin requirement

## **Breakpoints and Watchpoints**

- Breakpoint
  - if this memory address is fetched as an instruction an exception occurs
    - may be inserted as an instruction (BKPT)
    - may be detected in hardware
- Watchpoint
  - if this memory address is accessed by a load or store an exception occurs
    - must be detected in hardware


## **Breakpoints and Watchpoints**

- ARM breakpoint/watchpoint hardware
  - mask and pattern
  - trap if selected bits match desired pattern
  - example: a mask and pattern that break on word addresses 0x00131C00 - 0x00131CFC
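The mask-and-pattern comparison can be sketched in C. This is a minimal software model, not the EmbeddedICE register interface: the function name `wp_matches` is hypothetical, and the sketch assumes the convention that a 1 bit in the mask means "ignore this bit". Under that assumption, a pattern of 0x00131C00 with a mask of 0x000000FF matches exactly the word addresses 0x00131C00 to 0x00131CFC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when addr agrees with pattern on every bit
 * that the mask does not exclude.
 * Assumed convention: a 1 in mask means "don't care" for that bit. */
bool wp_matches(uint32_t addr, uint32_t pattern, uint32_t mask)
{
    /* XOR leaves a 1 wherever addr and pattern differ;
     * clearing the masked bits keeps only the differences that matter. */
    return ((addr ^ pattern) & ~mask) == 0u;
}
```

For instance, `wp_matches(0x00131C40, 0x00131C00, 0xFF)` is true, while `wp_matches(0x00131D00, 0x00131C00, 0xFF)` is false because bit 8 differs and is not masked.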


## EmbeddedICE register read and write structure



- Registers accessed via scan chain



| Address | Width | Function                     |
|---------|-------|------------------------------|
| 00000   | 3     | Debug control                |
| 00001   | 5     | Debug status                 |
| 00100   | 6     | Debug comms control register |
| 00101   | 32    | Debug comms data register    |
| 01000   | 32    | Watchpoint 0 address value   |
| 01001   | 32    | Watchpoint 0 address mask    |
| 01010   | 32    | Watchpoint 0 data value      |
| 01011   | 32    | Watchpoint 0 data mask       |
| 01100   | 9     | Watchpoint 0 control value   |
| 01101   | 8     | Watchpoint 0 control mask    |
| 10000   | 32    | Watchpoint 1 address value   |
| 10001   | 32    | Watchpoint 1 address mask    |
| 10010   | 32    | Watchpoint 1 data value      |
| 10011   | 32    | Watchpoint 1 data mask       |
| 10100   | 9     | Watchpoint 1 control value   |
| 10101   | 8     | Watchpoint 1 control mask    |
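To illustrate how a 5-bit register address and up to 32 bits of data could travel together along the scan chain, the sketch below packs them into one word. The layout is an assumption made for illustration (data in bits 31..0, address in bits 36..32, a read/write flag in bit 37 with 1 meaning write); check the core's reference manual before relying on it. The names `ice_scan_word`, `ICE_WP0_ADDR_VALUE` and `ICE_WP0_ADDR_MASK` are hypothetical; only the two address values come from the table.

```c
#include <stdint.h>

/* Register addresses taken from the table (binary 01000 and 01001). */
enum {
    ICE_WP0_ADDR_VALUE = 0x08,
    ICE_WP0_ADDR_MASK  = 0x09
};

/* Pack one register access into a 38-bit scan word.
 * Assumed layout: bits 31..0 data, bits 36..32 address, bit 37 = 1 for write. */
uint64_t ice_scan_word(uint8_t reg_addr, uint32_t data, int write)
{
    return ((uint64_t)(write ? 1u : 0u) << 37)
         | ((uint64_t)(reg_addr & 0x1Fu) << 32)
         | (uint64_t)data;
}
```

Under these assumptions, writing 0x00131C00 to the watchpoint 0 address value register would be shifted in as `ice_scan_word(ICE_WP0_ADDR_VALUE, 0x00131C00, 1)`.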

## **Embedded Trace**

The **embedded trace macrocell** (ETM) comprises:

- a trace port that outputs processor signals
- filtering/triggering that allows capture of the wanted data
  - triggering allows capture from selected code
  - filtering discards unwanted data, saving storage/bandwidth

These make the processor behaviour *observable*:

- signals are available at the trace port








### **Embedded Trace**

- A trace buffer can be added to store trace signals
  - essential at high speeds!
- Comprises:
  - trace interface
  - JTAG interface
  - AHB bus interface
- Needs:
  - RAM to store traces



## **Debug Unit**

Programmable through CP14 or scan chains

#### Characteristics

- instruction address comparators for triggering breakpoints
- data address comparators for triggering watchpoints
- bidirectional Debug Communication Channel
- ability to disable caches and TLBs
- mode for debugging real-time systems (e.g. servo mechanisms)



## **Debug Unit**

- Halt debug-mode debugging
  - processor halts at debug events (breakpoints, ...)
  - when halted, an external host can examine and modify its state through the DBGTAP (Debug Test Access Port)
  - unsuitable for real-time systems
  - requires external hardware to control the DBGTAP
- Monitor debug-mode debugging
  - debug events generate exceptions
  - handler can program new debug events through CP14



### **CP14 Registers**

| Register (Opcode2:CRm) | Abbreviation     | Name                               |
|------------------------|------------------|------------------------------------|
| 0                      | DIDR             | Debug ID Register                  |
| 1                      | DSCR             | Debug Status and Control Register  |
| 2-4                    | -                | Reserved                           |
| 5                      | DTR              | Data Transfer Register             |
| 6                      | WFAR             | Watchpoint Fault Address Register  |
| 7                      | VCR              | Vector Catch Register              |
| 8-9                    | -                | Reserved                           |
| 10                     | DSCCR            | Debug State Cache Control Register |
| 11                     | DSMCR            | Debug State MMU Control Register   |
| 12-63                  | -                | Reserved                           |
| 64-69                  | BVR<sub>N</sub>  | Breakpoint Value Registers         |
| 70-79                  | -                | Reserved                           |
| 80-85                  | BCR<sub>N</sub>  | Breakpoint Control Registers       |
| 86-95                  | -                | Reserved                           |
| 96-97                  | WVR<sub>N</sub>  | Watchpoint Value Registers         |
| 98-111                 | -                | Reserved                           |
| 112-113                | WCR<sub>N</sub>  | Watchpoint Control Registers       |
| 114-127                | -                | Reserved                           |
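The register numbers in the table are formed from the Opcode2 and CRm fields of the coprocessor instruction. The sketch below shows the arithmetic, assuming a 3-bit Opcode2 in the upper bits and a 4-bit CRm in the lower bits, which is consistent with the 0-127 range in the table. The type and function names (`cp14_sel`, `cp14_split`, `cp14_regnum`) are hypothetical.

```c
#include <stdint.h>

/* The two instruction fields that together select a CP14 debug register. */
typedef struct {
    unsigned opcode2;  /* assumed 3 bits wide */
    unsigned crm;      /* assumed 4 bits wide */
} cp14_sel;

/* Split a table register number (0-127) into its Opcode2 and CRm fields. */
cp14_sel cp14_split(unsigned regnum)
{
    cp14_sel s;
    s.opcode2 = (regnum >> 4) & 0x7u;
    s.crm     = regnum & 0xFu;
    return s;
}

/* Rebuild the table register number from the two fields. */
unsigned cp14_regnum(unsigned opcode2, unsigned crm)
{
    return ((opcode2 & 0x7u) << 4) | (crm & 0xFu);
}
```

With this split, the breakpoint value registers (64-69) correspond to Opcode2 = 4 with CRm = 0 to 5, and the watchpoint control registers (112-113) to Opcode2 = 7 with CRm = 0 to 1.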

## **System Performance Monitoring**

- A small collection of counters, triggered by 'events'
  - e.g. cache miss, TLB miss, dependency stall, branch mispredicted, …
  - configurable
  - can cause interrupts after a preset number of events
- Introduced in ARM11
- Can be used for code profiling
- Accessible via CP15
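One common way to get an interrupt after a preset number of events is to preload a counter so that it overflows on the last event. The sketch below is a software model of that idea only. It assumes a 32-bit counter that signals on wrap-around, and it does not use the real CP15 register interface; `event_counter`, `counter_arm` and `counter_event` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of one event counter (not the CP15 register layout). */
typedef struct {
    uint32_t count;
    bool     irq_pending;  /* set when the counter wraps to zero */
} event_counter;

/* Preload the counter so that it wraps after n_events increments.
 * Note: n_events == 0 would require a full 2^32 events to wrap. */
void counter_arm(event_counter *c, uint32_t n_events)
{
    c->count = 0u - n_events;  /* unsigned arithmetic wraps modulo 2^32 */
    c->irq_pending = false;
}

/* Record one event, flagging an interrupt on overflow. */
void counter_event(event_counter *c)
{
    c->count++;
    if (c->count == 0u) {
        c->irq_pending = true;
    }
}
```

For profiling, the handler would typically sample the interrupted program counter and re-arm the counter, giving a histogram of where the chosen event occurs.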




## **System Development**

### Outline:

- system modelling
- on-chip debug
- → AMBA
- rapid silicon prototyping
- embedded ARM cores

#### hands-on: system modelling


## AMBA

### Advanced Microprocessor Bus Architecture

- a systematic solution to assembling macrocell-based systems
- ARM Ltd's attempt to establish an on-chip bus standard
- AMBA structure:
  - Advanced High-performance Bus (AHB)
    - high-performance, multi-master
  - Advanced Peripheral Bus (APB)
    - interface for low-performance peripherals
  - Advanced eXtensible Interface (AXI) (new)

# A typical AMBA-based System



## **AMBA Test Interface**

VLSI production test is an economically important issue

- macrocell-based designs present problems
  - how can each macrocell be systematically tested?
- AMBA offers a standardised solution
  - based on 32-bit parallel access, via the bus, to test registers




### **AMBA Standards**

| Bus | Master | Performance | pipelined/split<br>transactions | Other                                          |
|-----|--------|-------------|---------------------------------|------------------------------------------------|
| AHB | multi  | high        | yes                             | 32- to 1024-bit data bus                       |
| APB | single | low         | no                              | used to reduce main bus load                   |
| AXI | multi  | high        | yes                             | separate data buses<br>out-of-order completion |

- AXI is intended as a replacement for the AHB bus
  - used for future designs
- Some components already developed:
  - L220 level-2 cache controller
  - PL300 configurable interconnect
  - PL340 SDRAM controller



## **System Development**

### Outline:

- system modelling
- on-chip debug
- AMBA
- → rapid silicon prototyping
- embedded ARM cores

#### hands-on: system modelling



### Excalibur





- ARM-based computer ...
- ... plus LOTS of uncommitted gates



### **Excalibur**





## **System Development**

### Outline:

- system modelling
- on-chip debug
- AMBA
- rapid silicon prototyping
- → embedded ARM cores

#### hands-on: system modelling



## **VLSI OneC GSM chip**



# Typical OneC system configuration





## DRACO

### DECT Radio Communications Controller

- In collaboration with Hagenuk GmbH
- combines ISDN and DECT telecommunications systems
- world's first "commercial" 32-bit asynchronous SoC product
  - ... would have been ...

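| Parameter    | Value             |
|--------------|-------------------|
| Process      | 0.35 µm           |
| Metal layers | 3                 |
| Vdd          | 3.3 V             |
| Transistors  | 825,000           |
| Die area     | 21 mm<sup>2</sup> |
| Clock        | none              |
| MIPS         | 100               |
| Power        | 215 mW            |
| MIPS/W       | 465               |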
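As a quick check, the MIPS/W figure follows directly from the quoted performance and power figures, with the power converted from mW to W:

$$
\text{MIPS/W} = \frac{100\ \text{MIPS}}{0.215\ \text{W}} \approx 465
$$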




### DRACO



## Hands-on: system modelling


- Using the ARMulator
  - to generate address traces
  - to get performance estimates
    - using the memory map facility
  - advanced configuration
    - adding your own system models
- Follow the 'Hands-on' instructions