Research
At its core, research in my group has traditionally been concerned
with electronic system-level design (ESL/SLD) of embedded computer
systems, where a specific focus has been on system-level design
automation methodologies, technologies and tools. Many results
continue to flow into the development of the System-on-Chip
Design Environment (SCE), which realizes an automated design flow
for synthesis of high-level system specifications down to highly
heterogeneous multi-processor and multi-core systems-on-chip (MPCSoCs)
spanning hardware and software boundaries. A commercial
derivative of SCE, called SER (Specify-Explore-Refine), has been
developed and deployed for use by suppliers of space-electronic
components for the Japanese
Aerospace Exploration Agency (JAXA). Both SCE and SER are based
on the SpecC system-level design language (SLDL), which has been
cited as a major reference for
the development of SystemC, the
leading, industry-standard SLDL today.
- Andreas Gerstlauer, Christian Haubelt, Andy D. Pimentel, Todor P. Stefanov, Daniel D. Gajski, Jürgen Teich,
"Electronic System-Level Synthesis Methodologies,"
IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems (TCAD), vol. 28, no. 10, pp. 1517-1530, October 2009.
- D. D. Gajski, S. Abdi, A. Gerstlauer, G. Schirner,
Embedded System Design: Modeling, Synthesis, Verification,
Springer, September 2009.
- Rainer Dömer, Andreas Gerstlauer, Junyu Peng, Dongwan Shin, Lukai Cai, Haobo Yu, Samar Abdi, and Daniel D. Gajski,
"System-on-Chip Environment: A SpecC-Based Framework for Heterogeneous MPSoC Design,"
EURASIP Journal on Embedded Systems (JES), vol. 2008, Article ID 647953, 13 pages, 2008.
- A. Gerstlauer, R. Dömer, J. Peng, D. D. Gajski,
System Design: A Practical Guide with SpecC,
Kluwer, 2001.
More recently, I have become interested in emerging system design challenges
at the intersection of embedded, general-purpose, high-performance and
distributed computing, where traditional boundaries are blurring.
This convergence creates fundamentally new design problems and
research opportunities that my group aims to investigate.
More details about recent and on-going research projects in my group are available on my group's webpage.
Internet of Things (IoT) and Edge Computing
In the Internet of Things (IoT), Cyber-Physical Systems (CPS) and edge computing,
applications and architectures are characterized by inherently
networked and distributed processing of data-intensive tasks on small,
resource-constrained embedded devices. In such networks-of-systems
(NoS), computation and communication are tightly coupled. This brings
new challenges and opportunities for co-design of system devices,
networks, and the mapping of applications onto them. We aim to
develop novel network-level design and design automation approaches to
support such co-design of IoT, edge computing and networked CPS/embedded
systems. This includes research into novel application programming
models, application partitioning approaches, runtimes and middlewares, mapping tools, as well as fast and accurate
NoS/IoT simulators and design space exploration solutions.
- Zhuoran Zhao, Kamyar Mirzazad Barijough, Andreas Gerstlauer, "DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters,"
IEEE Transactions on Computer-Aided Design (TCAD), Special
Issue on Embedded Systems Week (ESWEEK) 2018, International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), vol. 37, no. 11, pp. 2348-2359, October 2018. (preprint)
- Zhuoran Zhao, Kamyar Mirzazad Barijough, and Andreas Gerstlauer,
"Network-level Design Space Exploration of Resource-constrained Networks-of-Systems,"
ACM Transactions on Embedded Computing Systems (TECS), vol. 19, no. 4, pp. 22:1–22:26,
June 2020.
(preprint)
- Kamyar Mirzazad Barijough, Zhuoran Zhao, and Andreas Gerstlauer,
"Quality/Latency-Aware Real-time Scheduling of Distributed Streaming IoT Applications,"
ACM Transactions on Embedded Computing Systems (TECS), Special Issue on Embedded Systems Week (ESWEEK), International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), vol. 18, no. 5s, pp. 83:1–83:23,
October 2019.
(preprint)
Accelerator-Rich, Heterogeneous Multi-Core Architectures
With architectural innovations and technology scaling reaching
fundamental limits, energy efficiency is one of the primary design
concerns today. It is well-accepted that specialization and
heterogeneity can achieve both high performance and low power
consumption, but there are fundamental tradeoffs between flexibility
and specialization in determining the right mix of cores on a chip.
Furthermore, with increasing acceleration, communication between heterogeneous components is rapidly becoming the major bottleneck, where
architectural and runtime support for orchestration of data movement
and optimized mapping of applications is critical.
We study these questions through algorithm/architecture co-design
of specialized architectures and accelerators for various domains, as well as novel system architectures and tools for accelerator
integration and heterogeneous system design.
- Kishore Punniyamurthy and Andreas Gerstlauer,
"TAFE: Thread Address Footprint Estimation for Capturing Data/Thread Locality in GPU Systems,"
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), virtual conference,
October 2020.
- Mochamad Asri, Dhairya Malhotra, Jiajun Wang, George Biros, Lizy K. John, and Andreas Gerstlauer,
"Hardware Accelerator Integration Tradeoffs for High-Performance Computing: A Case Study of GEMM Acceleration in N-Body Methods,"
IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 32, no. 8, pp. 2035–2048,
February 2021.
(preprint)
- Ardavan Pedram, Robert van de Geijn, Andreas Gerstlauer,
"Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures,"
IEEE Transactions on Computers (TC), Special Issue on Energy Efficient Computing, vol. 61, no. 12, December 2012.
Machine Learning-Based Power and Performance Prediction
Early power and performance estimation is a key challenge in computer
system design today. Traditional simulation-based or purely analytical
methods are often too slow or inaccurate. We instead aim to apply
advanced machine learning techniques to synthesize models that can
accurately predict the power and performance of hardware or software
components in a target platform purely from statistics obtained while
performing high-level simulations or natively executing code on a host.
We study such learning-based approaches for modeling
of both software running on CPUs and hardware accelerators. In addition,
we investigate approaches for machine learning-based modeling and prediction of
workload behavior to aid in runtime optimization of systems.
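As a rough illustration of the idea of learning a target model from host-side statistics, the following minimal sketch fits a one-feature least-squares model. It is not the group's method, which applies more advanced learning techniques over many features. All names (`fit_linear_model`, `predict`, `host_instr_millions`, `target_power_watts`) are hypothetical, and the training data is synthetic placeholder data generated from a fixed formula rather than measured.

```python
def fit_linear_model(features, targets):
    """Ordinary least-squares fit of targets = slope * feature + intercept."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, targets))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, feature):
    """Apply a fitted (slope, intercept) model to a new host-side statistic."""
    slope, intercept = model
    return slope * feature + intercept


# Synthetic placeholder data (NOT measurements): a host-side statistic per
# interval and the corresponding reference power of a hypothetical target.
host_instr_millions = [1.0, 2.0, 3.0, 4.0]
target_power_watts = [0.5 + 0.2 * x for x in host_instr_millions]

model = fit_linear_model(host_instr_millions, target_power_watts)
estimate = predict(model, 5.0)  # prediction for an unseen interval
```

In a realistic setting, the reference values would come from a detailed simulation or measurement of the target, and the fitted model would then replace that slow reference during exploration.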
- Erika S. Alcorta, Pranav Rama, Aswin Ramachandran, and Andreas Gerstlauer,
"Phase-Aware CPU Workload Forecasting,"
Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS), virtual conference,
July 2021.
- Xinnian Zheng, Lizy K. John, Andreas Gerstlauer,
"Accurate Phase-Level Cross-Platform Power and Performance Estimation,"
Proceedings of the ACM/IEEE Design Automation Conference (DAC),
Austin, TX, June 2016. (best paper award)
- Dongwook Lee and Andreas Gerstlauer,
"Learning-Based, Fine-Grain Power Modeling of System-Level Hardware IPs,"
ACM Transactions on Design Automation of Electronic Systems (TODAES), vol. 23, no. 3, pp. 30:1–30:25,
February 2018.
- Wooseok Lee, Youngchun Kim, Jee Ho Ryoo, Dam Sunwoo, Andreas Gerstlauer, Lizy K. John,
"PowerTrain: A Learning-based Calibration of McPAT Power Models,"
Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED),
Rome, Italy, July 2015.
Source-Level Simulation and Host-Compiled Modeling
Simulations remain one of the primary mechanisms for early validation
and exploration of software-intensive systems with complex, dynamic
multi-core and multi-processor interactions. With traditional virtual
platforms becoming too inaccurate or slow, we are investigating
alternative, fast yet accurate source-level and host-compiled
simulation approaches. In such models, fast functional source code is
back-annotated with statically estimated target metrics and natively
compiled and executed on a simulation host. So-called host-compiled
models extend pure source-level approaches by wrapping back-annotated
code into lightweight models of operating systems and processors that
can be further integrated into standard, SystemC-based
transaction-level modeling (TLM) backplanes for co-simulation with
other system components.
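The back-annotation principle described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the group's tooling: the per-block cycle counts in `BLOCK_CYCLES` are hypothetical constants (in practice they would be estimated statically for a specific target), and `TimingModel` and `sum_array` are made-up names.

```python
# Hypothetical per-basic-block cycle estimates. In a real flow these would be
# derived from static analysis of the target binary, not chosen by hand.
BLOCK_CYCLES = {"loop_header": 2, "loop_body": 5, "exit": 1}


class TimingModel:
    """Accumulates estimated target cycles while code runs natively on the host."""

    def __init__(self):
        self.cycles = 0

    def consume(self, block):
        self.cycles += BLOCK_CYCLES[block]


def sum_array(data, timing):
    """Functional code with one timing annotation per executed basic block."""
    total = 0
    for value in data:
        timing.consume("loop_header")  # loop test that succeeds
        total += value
        timing.consume("loop_body")
    timing.consume("loop_header")      # final loop test that fails
    timing.consume("exit")
    return total


timing = TimingModel()
result = sum_array([1, 2, 3, 4], timing)
```

The function still computes its normal result at host speed, while `timing.cycles` carries the accumulated target-time estimate. A host-compiled model would additionally wrap such code in abstract operating system and processor models.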
- Zhuoran Zhao, Andreas Gerstlauer, Lizy K. John,
"Source-Level Performance, Energy, Reliability, Power and Thermal (PERPT) Simulation,"
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 36, no. 2, pp. 299-312, February 2017.
- Oliver Bringmann, Wolfgang Ecker, Andreas Gerstlauer, Ajay Goyal,
Daniel Mueller-Gritschneder, Prasanth Sasidharan, Simranjit Sing,
"The Next Generation of Virtual Prototyping: Ultra-fast Yet Accurate Simulation of HW/SW Systems,"
Proceedings of the Design, Automation and Test in Europe (DATE) Conference,
Grenoble, France, March 2015.
- Parisa Razaghi, Andreas Gerstlauer,
"Host-Compiled Multi-Core System Simulation for Early Real-Time
Performance Evaluation,"
ACM Transactions on Embedded Computing Systems (TECS),
Special Issue on Virtual Prototyping of Parallel and Embedded Systems (ViPES),
vol. 13, no. 5s, November 2014.
- Andreas Gerstlauer, Haobo Yu, Daniel D. Gajski,
"RTOS Modeling for System-Level Design,"
in Design, Automation, and Test in Europe: The Most
Influential Papers of 10 Years DATE, edited by Rudy
Lauwereins and Jan Madsen,
Springer, Netherlands, ISBN 978-1-4020-6487-6, March 2008.
Approximate Computing
Approximate computing has emerged as a novel paradigm for achieving
significant energy savings by trading off computational precision and
accuracy in inherently error-tolerant applications, such as machine
learning, recognition, synthesis and signal processing systems. This
introduces a new notion of quality into the design process. We are
exploring such approaches at various levels. At the hardware level,
we have studied fundamentally achievable quality-energy (Q-E)
tradeoffs in core arithmetic and logic circuits applicable to a wide
variety of applications. The on-going goal is to fold such insights into
formal analysis and synthesis techniques for automatic generation of
Q-E optimized hardware and software systems.
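To make the quality side of such a tradeoff concrete, the sketch below models a simplified lower-part-OR style adder, a well-known approximate adder idea, in which the `k` low bits are combined with a bitwise OR instead of a carry chain. This is an illustrative stand-in and not one of the designs from the publications listed here. The carry from the low part into the high part is dropped entirely, and the function names are hypothetical. No energy model is included; fewer exact adder bits is only a crude proxy for lower energy.

```python
def approx_add(a, b, k):
    """Add exactly on the upper bits and OR the k low bits (no carry between parts)."""
    mask = (1 << k) - 1
    low = (a | b) & mask                   # approximate low part
    high = ((a >> k) + (b >> k)) << k      # exact high part
    return high | low


def mean_abs_error(k, width=8):
    """Average |exact - approximate| over all pairs of width-bit operands."""
    limit = 1 << width
    total = 0
    for a in range(limit):
        for b in range(limit):
            total += abs((a + b) - approx_add(a, b, k))
    return total / (limit * limit)


# Quality loss grows with the number of approximated bits.
errors = {k: mean_abs_error(k) for k in (0, 2, 4)}
```

Sweeping `k` against an energy estimate for each configuration would yield one simple quality-energy curve of the kind discussed above.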
- Seogoo Lee, Lizy K. John, Andreas Gerstlauer,
"High-Level Synthesis of Approximate Hardware under Joint Precision and Voltage Scaling,"
Proceedings of the Design, Automation and Test in Europe (DATE) Conference,
Lausanne, Switzerland, March 2017.
- Jin Miao, Andreas Gerstlauer, Michael Orshansky,
"Approximate Logic Synthesis under General Error Magnitude and Frequency Constraints,"
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD),
San Jose, CA, November 2013.
- Jin Miao, Ku He, Andreas Gerstlauer, Michael Orshansky,
"Modeling and Synthesis of Quality-Energy Optimal Approximate Adders,"
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD),
San Jose, CA, November 2012.
- Ku He, Andreas Gerstlauer, Michael Orshansky,
"Circuit-Level Timing-Error Acceptance for Design of Energy-Efficient DCT/IDCT-based Systems,"
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 23, no. 6, June 2013.
System Compilation and Synthesis
The key to automation of any design process (synthesis) is a formal
design methodology with well-defined, semantically sound models and
transformations. At the system level, concurrent models of computation
(MoCs) for specification of system behavior are transformed into
instances of models of architectures (MoAs), which are then further
synthesized into heterogeneous hardware and software. Within this
context, we are investigating algorithms and tools for system-level
synthesis of widely-used parallel
programming MoCs onto MPCSoC platforms all the way down to
final hardware and software implementations. Overall, this is aimed at
establishing a complete system compiler that can automatically and
optimally map parallel application models onto heterogeneous
multi-processor/multi-core platforms, which include FPGAs and other
hardware components.
- Jing Lin, Andreas Gerstlauer, Brian L. Evans,
"Communication-Aware Heterogeneous Multiprocessor Mapping for Real-Time Streaming Systems,"
Journal of Signal Processing Systems,
vol. 69, no. 3, December 2012.
- Jens Gladigau, Andreas Gerstlauer, Christian Haubelt, Martin Streubühr, Jürgen Teich,
"Automatic System-Level Synthesis: From Formal Application Models to Generic Bus-Based MPSoCs,"
Transactions on High-Performance Embedded Architectures and Compilers (Transactions on HiPEAC), vol. 5, no. 4, 2011.
- Dongwook Lee, Hyungman Park, Andreas Gerstlauer,
"Synthesis of Optimized Hardware Transactors from Abstract Communication Specifications,"
Proceedings of the IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS),
Tampere, Finland, October 2012.