Papers

2017

The Potential of Dynamic Binary Modification and CPU/FPGA SoCs for Simulation
J Mawer, O Palomar, C Gorgovan, W Toms, A Nisbet, M Luján. FCCM 2017 - The 25th IEEE International Symposium on Field-Programmable Custom Computing Machines.

MaxSim: A Simulation Platform for Managed Applications
A Rodchenko, C Kotselidis, A Nisbet, A Pop, M Luján. ISPASS 2017: IEEE International Symposium on Performance Analysis of Systems and Software

Designing Low-Power, Low-Latency Networks-on-Chip by Optimally Combining Electrical and Optical Links
Sebastian Werner, Javier Navaridas, and Mikel Luján. HPCA 2017: IEEE International Symposium on High Performance Computer Architecture

Heterogeneous Managed Runtime Systems: A Computer Vision Case Study
Christos Kotselidis, James Clarkson, Andrey Rodchenko, Andy Nisbet, John Mawer, and Mikel Luján. Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments

Low Overhead Dynamic Binary Translation on ARM
Amanieu d'Antras, Cosmin Gorgovan, Jim Garside, Mikel Luján. Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation 2017

HyperMAMBO-X64: Using Virtualization to Support High-Performance Transparent Binary Translation
Amanieu d'Antras, Cosmin Gorgovan, Jim Garside, John Goodacre, Mikel Luján. Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments.

Boosting Java Performance using GPGPUs
J Clarkson, C Kotselidis, G Brown, M Luján. ARCS 2017: International Conference on Architecture of Computing Systems

Vectorization of Hybrid Breadth First Search on the Intel Xeon Phi
M Paredes, G Riley, M Luján. Proceedings of the ACM International Conference on Computing Frontiers (available here)

Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications Using HyperMapper,
Luigi Nardi, Bruno Bodin, Sajad Saeedi, Emanuele Vespa, Andrew J. Davison and Paul H. J. Kelly. iWAPT '17: Proceedings of the 2017 IEEE International Workshop on Automatic Performance Tuning, hosted at 31st IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2017 (available here)

Application-oriented Design Space Exploration for SLAM Algorithms,
Sajad Saeedi, Luigi Nardi, Edward Johns, Bruno Bodin, Paul H. J. Kelly and Andrew J. Davison. ICRA '17: Proceedings of IEEE International Conference on Robotics and Automation, 2017 (available here)

Merge or Separate? Multi-job Scheduling for OpenCL Kernels on CPU/GPU Platforms,
Wen, Y. & O'Boyle, M. 5 Feb 2017 Workshop about general purpose processing using GPUs (GPGPU-10): Held in cooperation with 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'17). ACM, p. 22-31 10 p. (available here)

Discovery and exploitation of general reductions: a constraint based approach,
Ginsbach, P. & O'Boyle, M. Feb 2017 CGO 2017 Proceedings of the 2017 International Symposium on Code Generation and Optimization. IEEE, p. 269-280 12 p. (available here)

SimBench: A Portable Benchmarking Methodology for Full-System Simulators,
Wagstaff, H., Bodin, B., Spink, T. & Franke, B. 27 Jan 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) 2017. IEEE, 10 p. (available here)

2016

Diplomat: mapping of multi-kernel applications using a static dataflow abstraction,
Bruno Bodin, Luigi Nardi, Paul H. J. Kelly and Michael O'Boyle. MASCOTS '16: Proceedings of the International Conference on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, 2016 (available here)

Comparative Design Space Exploration of Dense and Semi-Dense SLAM,
M. Zeeshan Zia, Luigi Nardi, Andrew Jack, Emanuele Vespa, Bruno Bodin, Paul H. J. Kelly and Andrew J. Davison. ICRA '16: Proceedings of the IEEE International Conference on Robotics and Automation, 2016 (available here)

Integrating Algorithmic Parameters into Benchmarking and Design Space Exploration in 3D Scene Understanding,
Bruno Bodin, Luigi Nardi, M. Zeeshan Zia, Harry Wagstaff, Govind Sreekar Shenoy, Murali Emani, John Mawer, Christos Kotselidis, Andy Nisbet, Mikel Luján, Björn Franke, Paul H.J. Kelly, Michael O'Boyle. PACT '16: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016 (available here)

MAMBO: A Low-Overhead Dynamic Binary Modification Tool for ARM,
Cosmin Gorgovan, Amanieu D'Antras and Mikel Luján. ACM Transactions on Architecture and Code Optimization. Volume 13, Issue 1, April 2016 (available here)

Optimizing Indirect Branches in Dynamic Binary Translators,
Amanieu D'Antras, Cosmin Gorgovan, Jim Garside and Mikel Luján. ACM Transactions on Architecture and Code Optimization. Volume 13, Issue 1, April 2016 (available here)

Portable and transparent software managed scheduling on accelerators for fair resource sharing,
Christos Margiolas and Michael F. P. O'Boyle. Proceedings of the 2016 International Symposium on Code Generation and Optimization (available here)

Breadth First Search Vectorization on the Intel Xeon Phi,
Mireya Paredes, Graham D. Riley, Mikel Luján. ACM International Conference on Computing Frontiers 2016 (available here)

Towards co-designed optimizations in parallel frameworks: A MapReduce case study,
Colin Barrett, Christos Kotselidis, Mikel Luján. ACM International Conference on Computing Frontiers 2016 (available here)

Compiler-Driven Software Speculation for Thread-Level Parallelism,
Paraskevas Yiapanis, Gavin Brown and Mikel Luján. ACM Transactions on Programming Languages and Systems. Volume 38, Issue 2, January 2016 (available here)

A Survey on Design Approaches to Circumvent Permanent Faults in Networks-on-Chip,
Sebastian Werner, Javier Navaridas and Mikel Luján. ACM Computing Surveys. Volume 48, Issue 4, May 2016 (available here)

Selecting Heterogeneous Cores for Diversity,
Tomusk, E-A., Dubach, C. & O'Boyle, M. Dec 2016 In: ACM Transactions on Architecture and Code Optimization. 13, 4, p. 1-25 25 p. (available here)

Diversity: A Design Goal for Heterogeneous Processors,
Tomusk, E., Dubach, C. & O'Boyle, M. Jul 2016 In: Computer Architecture Letters. 15, 2, p. 81-84 4 p. (available here)

Hardware Accelerated Cross-Architecture Full-System Virtualization,
Spink, T., Wagstaff, H. & Franke, B. Oct 2016 In: ACM Transactions on Architecture and Code Optimization. 13, 4, 25 p., 36 (available here)

2015

Four Metrics to Evaluate Heterogeneous Multicores.
Erik Tomusk, Christophe Dubach, and Michael O'Boyle. ACM Transactions on Architecture and Code Optimization. Volume 12, Issue 4, November 2015 (available here)

Cyclic Power-Gating as an Alternative to Voltage and Frequency Scaling,
Yaman Cakmakci, Will Toms, Javier Navaridas, Mikel Luján. IEEE Computer Architecture Letters, 2015 (pdf)

A Scalable Implementation of Information Theoretic Feature Selection for High Dimensional Data,
Anthony Kleerekoper, Michael Pappas, Adam Pocock, Gavin Brown, Mikel Luján. IEEE International Conference on Big Data 2015.

Analysis of FPGA and Software Approaches to Simulate Unconventional Computer Architectures,
Mohsen Ghasempour, Jonathan Heathcote, Javier Navaridas, Luis A. Plana, Jim Garside and Mikel Luján. International Conference on Reconfigurable Computing and FPGAs 2015.

Effective Barrier Synchronization on Intel Xeon Phi Coprocessor
Andrey Rodchenko, Andy Nisbet, Antoniu Pop, Mikel Luján. Euro-Par 2015: Parallel Processing (available here)

Amon: Advanced Mesh-like Optical NoC,
Sebastian Werner, Javier Navaridas and Mikel Luján. IEEE Symposium on High Performance Interconnects, 2015 (pdf)

Boosting Java Performance using GPGPUs,
James Clarkson, Christos Kotselidis, Gavin Brown, Mikel Luján. CoRR (pdf)

Experiences in Speeding Up Computer Vision Applications on Mobile Computing Platforms,
Luna Backes, Alejandro Rico, Björn Franke. Proceedings of the International Symposium on Systems, Architectures, Modeling, and Simulation (SAMOS XV) (pdf)

Comparative Design Space Exploration of Dense and Semi-Dense SLAM,
M. Z. Zia, L. Nardi, A. Jack, E. Vespa, B. Bodin, P. H. J. Kelly and A. J. Davison. Submitted. (pdf)

Reasoning in complex environments with the SelectScript declarative language,
A. Dietrich, S. Zug, L. Nardi and J. Kaiser. Int. Workshop on Domain-Specific Languages and models for ROBotic systems (DSLRob), Hamburg, Germany, October 2015. (pdf)

Free Rider: A Tool for Retargeting Platform-Specific Intrinsic Functions,
Stanislav Manilov, Björn Franke, Anthony Magrath, Cedric Andrieu. Proceedings of the 2015 Conference on Languages, Compilers, Tools and Theory for Embedded Systems (LCTES'15), Portland, USA, June, 2015. (pdf)

PALMOS: A Transparent, Multi-tasking Acceleration Layer for Parallel Heterogeneous Systems,
Christos Margiolas, Michael F. P. O'Boyle. 2015 Proceedings of the 29th ACM on International Conference on Supercomputing (ICS). New York, NY, USA: ACM, p. 307-318 12 p.

Celebrating diversity: a mixture of experts approach for runtime mapping in dynamic environments,
Murali Krishna Emani, Michael F. P. O'Boyle, 2015 Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). New York, NY, USA: ACM, p. 499-508 10 p.

Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM,
Luigi Nardi, Bruno Bodin, M. Zeeshan Zia, John Mawer, Andy Nisbet, Paul H. J. Kelly, Andrew J. Davison, Mikel Luján, Michael F. P. O'Boyle, Graham Riley, Nigel Topham, Steve Furber. IEEE Intl. Conf. on Robotics and Automation (ICRA), May 2015. (pdf)

2014

Portable and Transparent Host-Device Communication Optimization for GPGPU Environments,
Christos Margiolas, Michael F. P. O'Boyle, International Symposium on Code Generation And Optimisation (CGO) 2014: 55.

A compiler framework for automatically mapping data parallel programs to heterogeneous MPSoCs,
Kiran Chandramohan, Michael F. P. O'Boyle, Proceedings of the 2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES). New York, NY, USA: ACM, p. 9:1-9:10 10 p. 9

Partitioning data-parallel programs for heterogeneous MPSoCs: time and energy design space exploration,
Kiran Chandramohan, Michael F. P. O'Boyle, 2014 Proceedings of the 2014 SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (LCTES). New York, NY, USA: ACM, p. 73-82 10 p.

Automatic optimization of thread-coarsening for graphics processors,
Alberto Magni, Christophe Dubach, Michael F. P. O'Boyle. International Conference on Parallel Architectures and Compilation (PACT) 2014: 455-466. (pdf)

Smart multi-task scheduling for OpenCL programs on CPU/GPU heterogeneous platforms,
Yuan Wen, Zheng Wang, Michael F. P. O'Boyle, High Performance Computing (HiPC) 2014: 1-10.

An empirical evaluation of High-Level Synthesis languages and tools for database acceleration,
Oriol Arcas-Abella, Geoffrey Ndu, Nehir Sonmez, Mohsen Ghasempour, Adria Armejach, Javier Navaridas, Wei Song, John Mawer, Adrián Cristal, Mikel Luján. 24th International Conference on Field Programmable Logic and Applications (FPL) (pdf)

2013

Smart, adaptive mapping of parallelism in the presence of external workload,
Murali Krishna Emani, Zheng Wang, Michael F. P. O'Boyle, 2013 Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). Washington, DC, USA: IEEE Computer Society, p. 1-10 10 p.

A large-scale cross-architecture evaluation of thread-coarsening,
Alberto Magni, Christophe Dubach, Michael F. P. O'Boyle. SuperComputing (SC) 2013: 11:1-11:11. (pdf)

A Novel Technique to Improve Parallel Program Performance Co-executing with Dynamic Workloads,
Murali Krishna Emani, Michael F. P. O'Boyle, High Performance Computing (HiPC) 2013: 1-10.

Workshops:

Multi-space multi-objective design-space exploration for 3D scene understanding using active learning. Imperial College Department of Computing RA Symposium, London, June 11, 2015. Luigi Nardi (pdf)

Project Beehive: A Hardware/Software Co-designed Stack for Runtime and Architectural Research. Christos Kotselidis. ECOOP 2015 Truffle workshop

Project Beehive: A HW/SW Codesigned Stack for Runtime & Architecture Research using Maxine VM. Andy Nisbet (pdf)

Vertically-integrated exploration of algorithmic and implementation design spaces in 3D scene understanding. Invited talk IFIP 2.11, London, UK, November 11, 2015. Luigi Nardi (pdf)

A performance, energy and accuracy aware benchmarking methodology for robot vision. Invited speaker at NVIDIA GTC 2015 San Jose, California, March 17, 2015. Luigi Nardi (pdf) (video)

Change Detection Based Parallelism Mapping: Exploiting Offline Models and Online Adaptation,
Murali Krishna Emani, Michael F. P. O'Boyle, Languages and Compilers for Parallel Computing (LCPC) 2014: 208-223.

Exploiting GPU Hardware Saturation for Fast Compiler Optimization,
Alberto Magni, Christophe Dubach, Michael F. P. O'Boyle. GPGPU@ASPLOS 2014: 99. (pdf)