Research

At its core, research in my group has traditionally been concerned with electronic system-level design (ESL/SLD) of embedded computer systems, where a specific focus has been on system-level design automation methodologies, technologies and tools. Many results continue to flow into the development of the System-on-Chip Design Environment (SCE), which realizes an automated design flow for synthesis of high-level system specifications down to highly heterogeneous multi-processor and multi-core systems-on-chip (MPCSoCs) spanning hardware and software boundaries. A commercial derivative of SCE, called SER (Specify-Explore-Refine), has been developed and deployed for use by suppliers of space-electronic components for the Japan Aerospace Exploration Agency (JAXA). Both SCE and SER are based on the SpecC system-level design language (SLDL), which has been cited as a major reference for the development of SystemC, the leading, industry-standard SLDL today.

Andreas Gerstlauer, Christian Haubelt, Andy D. Pimentel, Todor P. Stefanov, Daniel D. Gajski, Jürgen Teich, "Electronic System-Level Synthesis Methodologies," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 28, no. 10, pp. 1517-1530, October 2009.
D. D. Gajski, S. Abdi, A. Gerstlauer, G. Schirner, Embedded System Design: Modeling, Synthesis, Verification, Springer, September 2009.
Rainer Dömer, Andreas Gerstlauer, Junyu Peng, Dongwan Shin, Lukai Cai, Haobo Yu, Samar Abdi, and Daniel D. Gajski, "System-on-Chip Environment: A SpecC-Based Framework for Heterogeneous MPSoC Design," EURASIP Journal on Embedded Systems (JES), vol. 2008, Article ID 647953, 13 pages, 2008.
A. Gerstlauer, R. Dömer, J. Peng, D. D. Gajski, System Design: A Practical Guide with SpecC, Kluwer, 2001.

More recently, I have become interested in emerging system design challenges at the intersection of embedded, general-purpose, high-performance and distributed computing, where traditional boundaries are blurring. This convergence creates fundamentally new design problems and research opportunities that my group aims to investigate.

More details about recent and on-going research projects in my group are available on my group's webpage.

Internet of Things (IoT) and Edge Computing

In the Internet of Things (IoT), Cyber-Physical Systems (CPS) and edge computing, applications and architectures are characterized by inherently networked and distributed processing of data-intensive tasks on small, resource-constrained embedded devices. In such networks-of-systems (NoS), computation and communication are tightly coupled. This brings new challenges and opportunities for co-design of system devices, networks, and the mapping of applications onto them. We aim to develop novel network-level design and design automation approaches to support such co-design of IoT, edge computing and networked CPS/embedded systems. This includes research into novel application programming models, application partitioning approaches, runtimes and middlewares, mapping tools, as well as fast and accurate NoS/IoT simulators and design space exploration solutions.

Zhuoran Zhao, Kamyar Mirzazad Barijough, Andreas Gerstlauer, "DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Special Issue on Embedded Systems Week (ESWEEK) 2018, International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), vol. 37, no. 11, pp. 2348-2359, October 2018. (preprint)
Zhuoran Zhao, Kamyar Mirzazad Barijough, and Andreas Gerstlauer, "Network-level Design Space Exploration of Resource-constrained Networks-of-Systems," ACM Transactions on Embedded Computing Systems (TECS), vol. 19, no. 4, pp. 22:1–22:26, June 2020. (preprint)
Kamyar Mirzazad Barijough, Zhuoran Zhao, and Andreas Gerstlauer, "Quality/Latency-Aware Real-time Scheduling of Distributed Streaming IoT Applications," ACM Transactions on Embedded Computing Systems (TECS), Special Issue on Embedded Systems Week (ESWEEK), International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), vol. 18, no. 5s, pp. 83:1–83:23, October 2019. (preprint)

Accelerator-Rich, Heterogeneous Multi-Core Architectures

With architectural innovations and technology scaling reaching fundamental limits, energy efficiency is one of the primary design concerns today. It is well-accepted that specialization and heterogeneity can achieve both high performance and low power consumption, but there are fundamental tradeoffs between flexibility and specialization in determining the right mix of cores on a chip. Furthermore, with increasing acceleration, communication between heterogeneous components is rapidly becoming the major bottleneck, where architectural and runtime support for orchestration of data movement and optimized mapping of applications is critical. We study these questions through algorithm/architecture co-design of specialized architectures and accelerators for various domains, as well as novel system architectures and tools for accelerator integration and heterogeneous system design.

Kishore Punniyamurthy and Andreas Gerstlauer, "TAFE: Thread Address Footprint Estimation for Capturing Data/Thread Locality in GPU Systems," Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), virtual conference, October 2020.
Mochamad Asri, Dhairya Malhotra, Jiajun Wang, George Biros, Lizy K. John, and Andreas Gerstlauer, "Hardware Accelerator Integration Tradeoffs for High-Performance Computing: A Case Study of GEMM Acceleration in N-Body Methods," IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 32, no. 8, pp. 2035–2048, February 2021. (preprint)
Ardavan Pedram, Robert van de Geijn, Andreas Gerstlauer, "Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures," IEEE Transactions on Computers (TC), Special Issue on Energy Efficient Computing, vol. 61, no. 12, December 2012.

Machine Learning-Based Power and Performance Prediction

Early power and performance estimation is a key challenge in computer system design today. Traditional simulation-based or purely analytical methods are often too slow or inaccurate. We instead aim to apply advanced machine learning techniques to synthesize models that can accurately predict the power and performance of hardware or software components in a target platform purely from statistics obtained while performing high-level simulations or natively executing code on a host. We study such learning-based approaches for modeling of both software running on CPUs and hardware accelerators. In addition, we investigate approaches for machine learning-based modeling and prediction of workload behavior to aid in runtime optimization of systems.
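
As a minimal sketch of the general idea, the example below fits a model that predicts target power from a statistic collected during a fast host-side run. Everything here is an assumption for illustration: the single "activity" feature, the synthetic training samples, and the use of ordinary least-squares linear regression as a simple stand-in for the more advanced learning techniques mentioned above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training data (made up for this sketch):
# activity = fraction of busy cycles observed on the host,
# power    = reference power in watts measured on the target.
activity = [0.1, 0.2, 0.4, 0.8]
power = [0.7, 0.9, 1.3, 2.1]

slope, intercept = fit_line(activity, power)


def predict_power(act):
    """Predict target power from a host-side activity statistic."""
    return slope * act + intercept


print(f"predicted power at 50% activity: {predict_power(0.5):.2f} W")
```

Once trained against reference measurements, such a model is evaluated using host-side statistics only, which is what makes the prediction fast compared to detailed target simulation.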

Erika S. Alcorta, Pranav Rama, Aswin Ramachandran, and Andreas Gerstlauer, "Phase-Aware CPU Workload Forecasting," Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS), virtual conference, July 2021.
Xinnian Zheng, Lizy K. John, Andreas Gerstlauer, "Accurate Phase-Level Cross-Platform Power and Performance Estimation," Proceedings of the ACM/IEEE Design Automation Conference (DAC), Austin, TX, June 2016. (best paper award)
Dongwook Lee and Andreas Gerstlauer, "Learning-Based, Fine-Grain Power Modeling of System-Level Hardware IPs," ACM Transactions on Design Automation of Electronic Systems (TODAES), vol. 23, no. 3, pp. 30:1–30:25, February 2018.
Wooseok Lee, Youngchun Kim, Jee Ho Ryoo, Dam Sunwoo, Andreas Gerstlauer, Lizy K. John, "PowerTrain: A Learning-based Calibration of McPAT Power Models," Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), Rome, Italy, July 2015.

Source-Level Simulation and Host-Compiled Modeling

Simulations remain one of the primary mechanisms for early validation and exploration of software-intensive systems with complex, dynamic multi-core and multi-processor interactions. With traditional virtual platforms becoming too inaccurate or slow, we are investigating alternative, fast yet accurate source-level and host-compiled simulation approaches. In such models, fast functional source code is back-annotated with statically estimated target metrics and natively compiled and executed on a simulation host. So-called host-compiled models extend pure source-level approaches by wrapping back-annotated code into lightweight models of operating systems and processors that can be further integrated into standard, SystemC-based transaction-level modeling (TLM) backplanes for co-simulation with other system components.
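
The back-annotation idea can be illustrated with a small sketch: functional code runs natively on the host, while each basic block adds a statically estimated cost to a simulated clock. The block names, the cycle values, and the `SimClock` helper are hypothetical and chosen only for this example; they do not reflect any particular tool.

```python
# Hypothetical per-basic-block cycle estimates, standing in for the
# output of a static target analysis (values are made up).
BLOCK_CYCLES = {"init": 4, "loop_body": 7, "exit": 2}


class SimClock:
    """Accumulates estimated target cycles while code runs natively."""

    def __init__(self):
        self.cycles = 0

    def consume(self, block):
        # Advance simulated time by the estimate for this basic block.
        self.cycles += BLOCK_CYCLES[block]


def sum_array(data, clock):
    """Functional code with one back-annotated delay per basic block."""
    clock.consume("init")
    total = 0
    for x in data:
        clock.consume("loop_body")
        total += x
    clock.consume("exit")
    return total


clock = SimClock()
result = sum_array([1, 2, 3, 4], clock)
print(result, clock.cycles)
```

In a host-compiled model, the accumulated delays would additionally be passed to an abstract operating system and processor model rather than summed into a single counter as done here.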

Zhuoran Zhao, Andreas Gerstlauer, Lizy K. John, "Source-Level Performance, Energy, Reliability, Power and Thermal (PERPT) Simulation," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 36, no. 2, pp. 299-312, February 2017.
Oliver Bringmann, Wolfgang Ecker, Andreas Gerstlauer, Ajay Goyal, Daniel Mueller-Gritschneder, Prasanth Sasidharan, Simranjit Singh, "The Next Generation of Virtual Prototyping: Ultra-fast Yet Accurate Simulation of HW/SW Systems," Proceedings of the Design, Automation and Test in Europe (DATE) Conference, Grenoble, France, March 2015.
Parisa Razaghi, Andreas Gerstlauer, "Host-Compiled Multi-Core System Simulation for Early Real-Time Performance Evaluation," ACM Transactions on Embedded Computing Systems (TECS), Special Issue on Virtual Prototyping of Parallel and Embedded Systems (ViPES), vol. 13, no. 5s, November 2014.
Andreas Gerstlauer, Haobo Yu, Daniel D. Gajski, "RTOS Modeling for System-Level Design," in Design, Automation, and Test in Europe: The Most Influential Papers of 10 Years DATE, edited by Rudy Lauwereins and Jan Madsen, Springer, Netherlands, ISBN 978-1-4020-6487-6, March 2008.

Approximate Computing

Approximate computing has emerged as a novel paradigm for achieving significant energy savings by trading off computational precision and accuracy in inherently error-tolerant applications, such as machine learning, recognition, synthesis and signal processing systems. This introduces a new notion of quality into the design process. We are exploring such approaches at various levels. At the hardware level, we have studied fundamentally achievable quality-energy (Q-E) tradeoffs in core arithmetic and logic circuits applicable to a wide variety of applications. The ongoing goal is to fold such insights into formal analysis and synthesis techniques for automatic generation of Q-E optimized hardware and software systems.
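
To make the notion of a quality-energy tradeoff concrete, the sketch below models a deliberately simple approximate adder that ignores the k least-significant bits of its operands. This truncation scheme is an assumption chosen for illustration, not one of the adder designs from the publications listed here, and the count of remaining active bit positions is only a crude, hypothetical proxy for energy.

```python
def approx_add(a, b, k):
    """Add two non-negative integers after zeroing their k lowest bits."""
    mask = ~((1 << k) - 1)
    return (a & mask) + (b & mask)


def max_error(width, k):
    """Worst-case absolute error over all pairs of width-bit operands."""
    vals = range(1 << width)
    return max(abs((a + b) - approx_add(a, b, k)) for a in vals for b in vals)


WIDTH = 6
for k in range(4):
    # Crude energy proxy (assumption): bit positions still being added.
    active_bits = WIDTH - k
    print(f"k={k}: active bits={active_bits}, max error={max_error(WIDTH, k)}")
```

For this scheme, the worst case occurs when all truncated bits of both operands are set, so the maximum error is 2 * (2^k - 1): each additional truncated bit lowers the energy proxy by one position while roughly doubling the error bound.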

Seogoo Lee, Lizy K. John, Andreas Gerstlauer, "High-Level Synthesis of Approximate Hardware under Joint Precision and Voltage Scaling," Proceedings of the Design, Automation and Test in Europe (DATE) Conference, Lausanne, Switzerland, March 2017.
Jin Miao, Andreas Gerstlauer, Michael Orshansky, "Approximate Logic Synthesis under General Error Magnitude and Frequency Constraints," Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, November 2013.
Jin Miao, Ku He, Andreas Gerstlauer, Michael Orshansky, "Modeling and Synthesis of Quality-Energy Optimal Approximate Adders," Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, November 2012.
Ku He, Andreas Gerstlauer, Michael Orshansky, "Circuit-Level Timing-Error Acceptance for Design of Energy-Efficient DCT/IDCT-based Systems," IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 23, no. 6, June 2013.

System Compilation and Synthesis

The key to automation of any design process (synthesis) is a formal design methodology with well-defined, semantically sound models and transformations. At the system level, concurrent models of computation (MoCs) for specification of system behavior are transformed into instances of models of architectures (MoAs), which are then further synthesized into heterogeneous hardware and software. Within this context, we are investigating algorithms and tools for system-level synthesis of widely used parallel programming MoCs onto MPCSoC platforms all the way down to final hardware and software implementations. Overall, this is aimed at establishing a complete system compiler that can automatically and optimally map parallel application models onto heterogeneous multi-processor/multi-core platforms, which include FPGAs and other hardware components.

Jing Lin, Andreas Gerstlauer, Brian L. Evans, "Communication-Aware Heterogeneous Multiprocessor Mapping for Real-Time Streaming Systems," Journal of Signal Processing Systems, vol. 69, no. 3, December 2012.
Jens Gladigau, Andreas Gerstlauer, Christian Haubelt, Martin Streubühr, Jürgen Teich, "Automatic System-Level Synthesis: From Formal Application Models to Generic Bus-Based MPSoCs," Transactions on High-Performance Embedded Architectures and Compilers (Transactions on HiPEAC), vol. 5, no. 4, 2011.
Dongwook Lee, Hyungman Park, Andreas Gerstlauer, "Synthesis of Optimized Hardware Transactors from Abstract Communication Specifications," Proceedings of the IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), Tampere, Finland, October 2012.