At its core, research in my group has traditionally been concerned with electronic system-level design (ESL/SLD) of embedded computer systems, with a specific focus on system-level design automation methodologies, technologies and tools. Many results continue to flow into the development of the System-on-Chip Design Environment (SCE), which realizes an automated design flow for synthesis of high-level system specifications down to highly heterogeneous multi-processor and multi-core systems-on-chip (MPCSoCs) spanning hardware and software boundaries. A commercial derivative of SCE, called SER (Specify-Explore-Refine), has been developed and deployed for use by suppliers of space-electronics components for the Japanese Aerospace Exploration Agency (JAXA). Both SCE and SER are based on the SpecC system-level design language (SLDL), which has been cited as a major reference for the development of SystemC, the leading, industry-standard SLDL today. We use both languages for research and teaching in my group.

More recently, I have become interested in emerging design challenges at the boundaries of embedded, general-purpose and high-performance computing, where energy efficiency is one of the primary and unifying design drivers. Embedded systems have traditionally been the domain of application-specific design. However, with the costs of full-custom design becoming prohibitive, modern embedded systems demand reuse of programmable or reconfigurable MPCSoC platforms. At the same time, general-purpose systems are expected to integrate tens to hundreds of cores on a single chip, where power concerns drive the need for ever-increasing specialization and heterogeneity. As such, traditional boundaries between embedded and general-purpose computing are blurring. This creates fundamentally new design challenges and research opportunities that we aim to investigate in my group. Existing system-level design tools, such as SCE, can thereby evolve into system compilers that aid both in automated exploration of the architecture design space and in automatic mapping of high-level, parallel application programming models onto such heterogeneous platforms.

Learning-Based Power and Performance Estimation

Early power and performance estimation is a key challenge in computer system design today. Traditional simulation-based or purely analytical methods are often too slow or inaccurate. We instead aim to apply advanced machine learning techniques to synthesize models that can accurately predict the power and performance of hardware or software components in a target platform purely from statistics obtained while simulating or natively executing functional-only code on a vastly different host. We study such learning-based approaches for modeling of both software and white- or black-box hardware IPs, as well as calibration of existing models against post-silicon measurements.
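The core idea can be sketched in a few lines: fit a model that predicts a target-platform metric from statistics gathered on the host. The following is a minimal illustration only, not our actual tool flow; the feature set (ALU operations and memory accesses per run) and all numbers are hypothetical placeholders for real profiling data and measurements.

```python
# Minimal sketch of learning-based power estimation (illustrative only):
# fit a linear model predicting target-platform energy from host-side
# execution statistics. Features and training data are hypothetical.

def fit_linear(features, targets):
    """Ordinary least squares via normal equations (pure Python)."""
    n = len(features[0])
    # Build X^T X and X^T y.
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(n)]
           for i in range(n)]
    xty = [sum(f[i] * t for f, t in zip(features, targets)) for i in range(n)]
    # Solve by Gauss-Jordan elimination.
    for col in range(n):
        pivot = xtx[col][col]
        for j in range(n):
            xtx[col][j] /= pivot
        xty[col] /= pivot
        for row in range(n):
            if row != col:
                factor = xtx[row][col]
                for j in range(n):
                    xtx[row][j] -= factor * xtx[col][j]
                xty[row] -= factor * xty[col]
    return xty  # model coefficients

# Hypothetical training set: (ALU ops, memory accesses, bias term) per run,
# paired with a (here synthetic) measured target energy in millijoules.
runs = [(1200, 300, 1), (800, 500, 1), (1500, 100, 1), (400, 900, 1)]
energy_mj = [4.7, 3.9, 5.2, 3.5]
coeffs = fit_linear(runs, energy_mj)

# Predict the energy of a new, unseen workload from its host-side counters.
new_run = (1000, 400, 1)
prediction = sum(c * x for c, x in zip(coeffs, new_run))
```

In practice we use far richer feature sets and more advanced learning techniques, but the principle is the same: the model is trained once against reference data and then evaluated cheaply during fast functional simulation or native execution.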

Source-Level Simulation and Host-Compiled Modeling

Simulations remain one of the primary mechanisms for early validation and exploration of software-intensive systems with complex, dynamic multi-core and multi-processor interactions. With traditional virtual platforms becoming too inaccurate or slow, we are investigating alternative, fast yet accurate source-level and host-compiled simulation approaches. In such models, fast functional source code is back-annotated with statically estimated target metrics and natively compiled and executed on a simulation host. So-called host-compiled models extend pure source-level approaches by wrapping back-annotated code into lightweight models of operating systems and processors that can be further integrated into standard, SystemC-based transaction-level modeling (TLM) backplanes for co-simulation with other system components.
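The basic back-annotation mechanism can be sketched as follows. This is an illustrative Python sketch, not SCE's actual flow; the per-block cycle counts are hypothetical placeholders for estimates that a static target-timing analysis would produce.

```python
# Sketch of source-level back-annotation (illustrative): functional code is
# instrumented with statically estimated target cycle counts per basic block,
# then compiled and executed natively on the host. The per-block estimates
# below are hypothetical placeholders for a static timing analysis.

class SimClock:
    """Accumulates estimated target cycles during native execution."""
    def __init__(self):
        self.cycles = 0

    def advance(self, estimated_cycles):
        self.cycles += estimated_cycles

def dot_product(a, b, clk):
    clk.advance(5)            # estimated cycles: function prologue
    acc = 0
    for x, y in zip(a, b):
        clk.advance(8)        # estimated cycles: loop body (MAC + branch)
        acc += x * y
    clk.advance(4)            # estimated cycles: epilogue/return
    return acc

clk = SimClock()
result = dot_product([1, 2, 3], [4, 5, 6], clk)
# result is the functional output; clk.cycles is the estimated target time.
```

A host-compiled model would additionally run such annotated code on top of a lightweight OS and processor model, with the accumulated time driving synchronization in a SystemC/TLM co-simulation.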

Approximate Computing

Approximate computing has emerged as a novel paradigm for achieving significant energy savings by trading off computational precision and accuracy in inherently error-tolerant applications, such as machine learning, recognition, synthesis and signal processing systems. This introduces a new notion of quality into the design process. We are exploring such approaches at various levels. At the hardware level, we have studied fundamentally achievable quality-energy (Q-E) tradeoffs in core arithmetic and logic circuits applicable to a wide variety of applications. The ongoing goal is to fold such insights into formal analysis and synthesis techniques for automatic generation of Q-E optimized hardware and software systems.
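As a concrete flavor of such a Q-E tradeoff in arithmetic circuits, consider a lower-part-OR style approximate adder, here modeled behaviorally in Python for illustration: the carry chain of the k least significant bits is replaced by a bitwise OR, which shortens the critical path and saves switching energy at the cost of a bounded output error. The input set is hypothetical.

```python
# Behavioral sketch of a lower-part-OR (LOA)-style approximate adder
# (illustrative): the k low-order bits are combined with a bitwise OR
# instead of a true (carry-propagating) addition.

def approx_add(a, b, k):
    """Add two unsigned ints, approximating the k low-order bits."""
    mask = (1 << k) - 1
    # Low part: bitwise OR instead of true addition (no carries generated).
    low = (a & mask) | (b & mask)
    # High part: exact addition, ignoring any carry-in from the low part.
    high = ((a >> k) + (b >> k)) << k
    return high | low

# Quality metric: relative error over a hypothetical input set. Larger k
# saves more energy (shorter carry chain) but increases worst-case error.
inputs = [(100, 27), (255, 1), (90, 90), (7, 8)]
for k in (0, 2, 4):
    errs = [abs(approx_add(a, b, k) - (a + b)) / (a + b) for a, b in inputs]
```

With k = 0 the adder is exact; sweeping k traces out one point per configuration on a quality-energy curve, which is exactly the kind of tradeoff space our analysis and synthesis techniques aim to explore automatically.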

Accelerator-Rich, Heterogeneous Multi-Core Architectures

With architectural innovations and technology scaling reaching fundamental limits, energy efficiency is one of the primary design concerns today. It is well-accepted that specialization and heterogeneity can achieve both high performance and low power consumption, but there are fundamental tradeoffs between flexibility and specialization in determining the right mix of cores on a chip. We study these questions through algorithm/architecture co-design of specialized architectures and accelerators for various domains, including a novel, extremely energy-efficient Linear Algebra Processor (LAP), as well as novel architectures and tools for accelerator integration and heterogeneous system design.

System Compilation and Synthesis

The key to automation of any design process (synthesis) is a formal design methodology with well-defined, semantically sound models and transformations. At the system level, concurrent models of computation (MoCs) for specification of system behavior are transformed into instances of models of architecture (MoAs), which are then further synthesized into heterogeneous hardware and software. Within this context, we are investigating algorithms and tools for system-level synthesis of widely used parallel programming MoCs onto MPCSoC platforms, all the way down to final hardware and software implementations. Overall, this work aims to establish a complete system compiler that can automatically and optimally map parallel application models onto heterogeneous multi-processor/multi-core platforms, including FPGAs and other hardware components.
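To give a toy flavor of the MoC-to-MoA mapping step, the sketch below assigns tasks of a small dependency graph, in topological order, to whichever processing element yields the earliest finish time. This is a simple list-scheduling heuristic for illustration only, not SCE's actual synthesis algorithm; the task names, execution costs, and processing elements are all hypothetical.

```python
# Toy illustration of mapping a concurrent application model onto a
# heterogeneous platform (not SCE's actual algorithm): each task is greedily
# assigned to the processing element (PE) giving the earliest finish time.

# Hypothetical per-task execution cost on each PE (e.g., CPU vs. accelerator).
cost = {
    "src":    {"cpu": 2, "acc": 4},
    "filter": {"cpu": 9, "acc": 3},
    "sink":   {"cpu": 2, "acc": 5},
}
deps = {"src": [], "filter": ["src"], "sink": ["filter"]}

pe_free = {"cpu": 0, "acc": 0}   # time at which each PE becomes available
finish = {}                      # finish time of each mapped task
mapping = {}

for task in ("src", "filter", "sink"):          # topological order
    ready = max((finish[d] for d in deps[task]), default=0)
    # Choose the PE that completes this task earliest.
    best_pe = min(pe_free,
                  key=lambda pe: max(ready, pe_free[pe]) + cost[task][pe])
    start = max(ready, pe_free[best_pe])
    finish[task] = start + cost[task][best_pe]
    pe_free[best_pe] = finish[task]
    mapping[task] = best_pe

makespan = max(finish.values())
```

A real system compiler must of course solve a far harder problem, jointly optimizing mapping, scheduling, and communication while generating the final hardware and software, but the example conveys the kind of decision such tools automate.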