Configuring a Massively Parallel CMP System for Real Time Neural Applications
Khan, M. M.
Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised boot-up hardware support nor a preconfigured default state that puts it in operating condition. The SpiNNaker massively-parallel computing system has been designed to support large-scale simulations of biologically-inspired neural networks in real time. The system building block is a Chip Multiprocessor (CMP) using low-power embedded processors, with an asynchronous network-on-chip to support high-performance, scalable, and fault-tolerant parallel distributed processing. Where most large CMP systems feature a sideband network to complete the boot process, SpiNNaker has a single homogeneous network interconnect for both application inter-processor communication and system control functions such as boot load and run-time user-system interaction. This network improves fault tolerance and makes it easier to support dynamic run-time reconfiguration. However, it requires a boot process that is transaction-level compatible with the application's communications model. Since SpiNNaker uses event-driven asynchronous communication throughout, the loader operates with purely local control: there is no global synchronisation, state information, or transition sequence. A novel two-stage “unfolding” boot-up process efficiently configures a multi-CMP SpiNNaker into an integrated computing system and loads the application using a high-speed flood-fill technique with support for run-time reconfiguration. SystemC simulation of a multi-CMP SpiNNaker system indicates an error-free CMP configuration time of ~1.3 ms. A high-level simulation of a full-scale system (with 64,000 CMPs) indicates a mean application-loading time of ~20 ms for a 100-Kbyte application, and this loading time is virtually independent of the size of the system.
The configuration process also supports application development through a hardware abstraction layer (HAL) that provides architectural visibility appropriate to the developer's purpose.
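The flood-fill idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the loader described in the thesis: the six-neighbour toroidal grid, the `LINKS` offsets, and the `flood_fill` function are assumptions made purely for illustration, and the simulation is sequential rather than asynchronous. It shows only the local rule, in which each node forwards the first copy of a block it receives and drops duplicates, so no global state or synchronisation is needed to reach every node.

```python
from collections import deque

# Hypothetical link offsets: six neighbours per node on a 2D torus.
# This topology is an assumption for illustration, not taken from the abstract.
LINKS = [(1, 0), (1, 1), (0, 1), (-1, 0), (-1, -1), (0, -1)]


def flood_fill(width, height, origin=(0, 0)):
    """Simulate flooding one block from `origin` across a width x height torus.

    Returns a dict mapping each node to the hop count at which it first
    received the block. Each dict entry stands in for a per-node local
    "already received" flag; no node consults any global state.
    """
    first_seen = {origin: 0}
    pending = deque([origin])
    while pending:
        x, y = pending.popleft()
        hop = first_seen[(x, y)]
        for dx, dy in LINKS:
            neighbour = ((x + dx) % width, (y + dy) % height)
            # Local rule: accept and forward only the first copy; drop duplicates.
            if neighbour not in first_seen:
                first_seen[neighbour] = hop + 1
                pending.append(neighbour)
    return first_seen


if __name__ == "__main__":
    arrival = flood_fill(8, 8)
    print("nodes reached:", len(arrival))
    print("hops to reach the farthest node:", max(arrival.values()))
```

The sketch counts hops only. It does not model link bandwidth, block pipelining, or faults, so it says nothing about the ~20 ms loading time reported above.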
The thesis is available as PDF (4MB).