Saturday, September 16

The Third Annual IEEE Workshop on Workload Characterization



Sunday, September 17

In conjunction with ICCD 2000:
The Second Annual Workshop on Hardware Support for Objects and Microarchitectures for Java



Monday, September 18

9:00-9:15 

Opening Remarks

Craig Chase, ICCD 2000 General Chair, The University of Texas at Austin
Sandip Kundu, ICCD 2000 Program Chair, Intel Corporation


9:15-10:00 

Keynote Address: On the road to a mobile information society

Dirk Friebel
Nokia GmbH - Nokia Research Center
Everyone is becoming increasingly mobile. The wide adoption of technologies such as mobile phones, laptop computers and personal digital assistants is a good example of this. We expect immediate access to information from a wide variety of online sources. These market dynamics are driving us towards a future where users will be able to handle all types of information and services.

The Mobile Information Society:
· Mobility and the Internet are the drivers
· Work is no longer a place, it is wherever you need to be
· Mobile access to information and services solves real business problems and will improve the quality of life
· Solutions which enable mobility are available today
· Nokia has the global presence and the key core competencies in mobility and in the enabling technologies required for mobility
· Nokia will lead the way to the Mobile Information Society "And putting the Internet in everyone's pocket"
 

10:30-12:00 

Session 1.1: New Architectures
Session Chair: Mauricio Breternitz, Motorola

This session is about architectures for new paradigms. The first paper illustrates the performance degradation of secure information processing and its mitigation with microarchitecture techniques. The second describes a novel instruction-set architecture for accelerating both secure and multimedia information processing. The third paper describes an efficient simulator generator for new instruction-set architectures.

1.1.1. Architectural Impact of Secure Socket Layer on Internet Servers

K. Kant and R. Iyer, Intel, USA
P. Mohapatra, Michigan State University, USA
1.1.2. Fast Subword Permutation Instructions Using Omega and Flip Network Stages
Xiao Yang and Ruby B. Lee, Princeton University, USA
1.1.3. Sleipnir - An Instruction Level Simulator Generator
Tor Jeremiassen, Lucent Technologies, USA


Session 1.2: Fault-Simulation and ATPG at Different Design Levels
Session Chair: Nur Touba, The University of Texas at Austin

The first paper of this session proposes a new algorithm for dynamic fault grouping and simulation scheduling for transient fault simulation of analog circuits. In the second paper, a novel approach is presented for computing feedback polynomials and seeds of an LFSR such that a pseudo-exhaustive test is obtained. Finally, in the last paper a functional level test pattern generator combining genetic algorithms and symbolic methods is described.

1.2.1. Analog Transient Concurrent Fault Simulation with Dynamic Fault Grouping

Junwei Hou and Abhijit Chatterjee, Georgia Institute of Technology, USA
1.2.2. Pseudoexhaustive TPG with a Provably Low Number of LFSR Seeds
Dimitri Kagaris and Spyros Tragoudas, Southern Illinois University, USA
1.2.3. An Application of Genetic Algorithms and BDDs to Functional Testing
Fabrizio Ferrandi and Donatella Sciuto, Politecnico di Milano, Italy
Alessandro Fin and Franco Fummi, Università di Verona, Italy


Session 1.3: Advanced Design Techniques
Session Chair: Ken Shepard, Columbia University

In this session, we consider design and analysis techniques for advanced digital integrated circuits. The first paper describes skewed static logic for low power and high performance. The second paper describes an analysis framework for estimating power-supply integrity, while the last paper considers a local clocking mechanism based on a tunable delay line that calibrates itself from a low-frequency global clock.

1.3.1. High-Performance, Low-Power Skewed Static Logic in Very Deep Submicron Technology

Chulwoo Kim, Jaesik Lee, Kwang-Hyun Baek, Eric Martina, and Sung-Mo Kang, University of Illinois at Urbana-Champaign, USA
1.3.2. Estimation of Inductive and Resistive Switching Noise on Power Supply Network in Deep Sub-micron CMOS Circuits
Shiyou Zhao, Kaushik Roy, and Chengkok Koh, Purdue University, USA
1.3.3. Self-Calibrating Clocks for Globally Asynchronous Locally Synchronous Systems
S.W. Moore, G.S. Taylor, P.A. Cunningham, R.D. Mullins, and P. Robinson, University of Cambridge, United Kingdom


12:00-1:00 

Lunch

Sponsored by Intel
 

1:00-3:00 

Session 2.1: Improving CPU Performance
Session Chair: Brian Grayson, Motorola

This session has four papers that show how to optimize CPU performance by improving interactions with the memory system. The techniques presented include a space-efficient value prediction strategy, followed by a dynamic program-flow-driven data placement mechanism and an improved instruction cache. The third paper presents an architectural extension to support high-performance heap data allocation. The final paper presents an analysis of efficient ways to use a software-controlled primary cache system for scientific computations.

2.1.1. Hybridizing and Coalescing Load Value Predictors

Martin Burtscher, University of Colorado at Boulder, USA
Benjamin G. Zorn, Microsoft Corporation, USA
2.1.2. A 2-way Thrashing-Avoidance Cache (TAC): An Efficient Instruction Cache Scheme for Object-Oriented Languages
Yul Chu and M. R. Ito, University of British Columbia, Canada
2.1.3. Architectural Support for Dynamic Memory Management
J. Morris Chang, Witawas Srisa-an and Chia-Tien Dan Lo, Illinois Institute of Technology, USA
2.1.4. SCIMA: Software Controlled Integrated Memory Architecture for High Performance Computing
Masaaki Kondo, Hideki Okawara, and Hiroshi Nakamura, The University of Tokyo, Japan
Taisuke Boku, University of Tsukuba, Japan


Session 2.2: Parasitic Modeling, Analysis and Optimization
Session Chair: Tom Dillinger, Sun Microsystems

This session begins with an exposition of new developments in the area of signal integrity. The first paper uses the idea of timing windows to perform static timing analysis in the presence of crosstalk. Next, a discussion on the design of output drivers under ground bounce considerations is presented. The last two papers are related to parasitic modeling: the third paper in the session builds a simple yet elegant model for spiral inductors, while the last presents a comparative analysis of parallel implementations of several capacitance extraction algorithms.

2.2.1. Worst Delay Estimation in Crosstalk Aware Static Timing Analysis

Tong Xiao and Malgorzata Marek-Sadowska, University of California, Santa Barbara, USA
2.2.2. Analysis and Optimization of Ground Bounce in Digital CMOS Circuits
Payam Heydari and Massoud Pedram, University of Southern California, USA
2.2.3. An Efficient and Accurate Model for RF/Microwave Spiral Inductors Using Microstrip Lines Theory
Nasser Masoumi, S. Safavi-Naeini, and Mohamed I. Elmasry, University of Waterloo, Canada
2.2.4. Comparative Study of Parallel Algorithms for 3-D Capacitance Extraction on Distributed Memory Multiprocessors
Yanhong Yuan and Prithviraj Banerjee, Northwestern University, USA


Session 2.3: Low Power and Arithmetic
Session Chair: Margarida Jacome, The University of Texas at Austin

This session presents new techniques for low power microprocessor design, a high-performance adder design, and a performance comparison of signal processing algorithms on commercial SIMD and VLIW microprocessors.

2.3.1. A Novel Low-Power Microprocessor

Rolf Hakenes and Yiannos Manoli, Institute of Microelectronics, University of Saarland, Germany
2.3.2. A Power Perspective of Value Speculation for Superscalar Microprocessors
Rafael Moreno, Luis Piñuel, Silvia del Pino, and Francisco Tirado, Universidad Complutense de Madrid, Spain
2.3.3. Multilevel Reverse-Carry Adder
Javier D. Bruguera, University of Santiago de Compostela, Spain
Tomas Lang, University of California at Irvine, USA
2.3.4. Evaluating Signal Processing and Multimedia Applications on SIMD, VLIW and Superscalar Architectures
Deependra Talla, Lizy John, Viktor Lapinskii, and Brian Evans, The University of Texas at Austin, USA


3:30-5:00 

Session 3.1: Servers and Parallelism
Session Chair: Ruby Lee, Princeton University

This session presents papers involving an optimized index and data buffering mechanism for database workloads, an analysis of sharing patterns in multiprocessors, and power-performance trade-offs in multithreaded microprocessors.

3.1.1. Unified Index and Data Fine-Granularity Buffering: Approach and Implementation

Qiang Cao and Josep Torrellas, University of Illinois, USA
H. V. Jagadish, University of Michigan, USA
3.1.2. Analysis of Shared Memory Misses and Reference Patterns
Jeffrey B. Rothman, Lyris Technologies, Inc., USA
Alan Jay Smith, University of California, Berkeley, USA
3.1.3. Power-Sensitive Multithreaded Architecture
John S. Seng and Dean M. Tullsen, University of California, San Diego, USA
George Z. N. Cai, Intel, USA


Session 3.2: Circuit Optimization and Analysis
Session Chair: Shervin Hojat, IBM

This session presents papers on circuit optimizations for improved performance. The first paper performs a set of concurrent optimizations to modify a circuit to meet a performance objective, using a Lagrangian multiplier based method. The second paper also performs a set of concurrent optimizations, except that these are more closely related to device fabrication: it considers the possibility of varying channel lengths and oxide thickness to improve performance. The third paper presents a clustering-based approach to selecting a set of buffers for a library. Finally, the session closes with a description of a full-chip power analysis methodology for a state-of-the-art microprocessor.

3.2.1. Delay Constrained Optimization by Simultaneous Fanout Tree Construction, Buffer Insertion and Gate Sizing

I-Min Liu and Adnan Aziz, The University of Texas at Austin, USA
3.2.2. Application-based, Transistor-level Full-chip Power Analysis for 700 MHz PowerPC Microprocessor
Yi-Kan Cheng, Magma Design Automation, USA
David Bearden and Kanti Suryadevara, Motorola, USA
3.2.3. Buffer Library Selection (short paper)
Charles J. Alpert, R. Gopal Gandham, Jose L. Neves, and Stephen T. Quay, IBM, USA
3.2.4. High-Performance Low-Power CMOS Circuits Using Multiple Channel Length and Multiple Oxide Thickness (short paper)
Naran Sirisantana, Liqiong Wei, and Kaushik Roy, Purdue University, USA


Session 3.3: Logic Circuit Families
Session Chair: Shyh-Jye Jou, National Central University

This session considers novel circuit approaches for digital logic. The first paper describes high-performance, low-power logic using current-mode threshold gates, while the second paper considers design techniques based on skewed CMOS. The last paper describes a technique for speeding up static and dynamic logic.

3.3.1. Current-Mode Threshold Logic Gates

S. Bobba and I.N. Hajj, University of Illinois at Urbana-Champaign, USA
3.3.2. Skewed CMOS: Noise-Immune High-Performance Low-Power Static Circuit Family
Dinesh Somasekhar, Intel, USA
Alexandre Solomatnikov and Kaushik Roy, Purdue University, USA
3.3.3. Output Prediction Logic: A High-Performance CMOS Design Technique
Larry McMurchie, Su Kio, Gin Yee, Tyler Thorp, and Carl Sechen, University of Washington, USA


5:00-6:30 

Poster Session



Tuesday, September 19

9:00-10:00 

Keynote Address: The Future of Populist Parallelism

Greg Pfister
Senior Technical Staff Member
IBM Advanced Technology & Architecture, Server Design
This keynote covers the past and the likely future of clusters. It will extrapolate the future from an observable cycle of cluster creation that has recurred several times in the past, combining that with the pieces and the glue that will be available in the future.
 

10:30-12:00 

Session 4.1: Intelligent Memory
Session Chair: Steven Reinhardt, University of Michigan

This session focuses on issues relating to DRAM-based memory systems, buffering and prefetching policies, as well as trade-offs involved in processors-in-memory under multiprogramming workloads.

4.1.1. A Study of Channeled DRAM Memory Architectures

Lars Friebe, University of Hannover, Germany
Yoshikazu Yabe, NEC Corp., Japan
Masato Motomura, NEC Corp., Japan
4.1.2. DRAM-Page Based Prediction and Prefetching
Haifeng Yu and Gershon Kedem, Duke University, USA
4.1.3. Reducing Cost and Tolerating Defects in Page-Based Intelligent Memory
Mark Oskin, Diana Keen, Justin Hensley, Lucian-Vlad Lita, and Frederic T. Chong, University of California, Davis, USA


Session 4.2: Processor Microarchitecture
Session Chair: Steve Furber, The University of Manchester

This session presents new state-of-the-art techniques in cache memory, instruction streaming buffer, and branch decoupled processor design.

4.2.1. A Selective Temporal and Aggressive Spatial Cache System Based on Time Interval

Jung-Hoon Lee, Jang-Soo Lee, and Shin-Dug Kim, Yonsei University, Korea
4.2.2. Design of Instruction Stream Buffer with Trace Support for X86 Processors
Jih-Ching Chiu, I-Huan Huang, and Chung-Ping Chung, National Chiao Tung University, Taiwan
4.2.3. A Trace Based Evaluation of Speculative Branch Decoupling
Anshuman Nadkarni and Akhilesh Tyagi, Iowa State University, USA


Session 4.3: Digital Logic Techniques
Session Chair: Barbara Chappell, Intel

This session considers novel circuit techniques applied to digital ICs. The first paper provides an interesting technique for using the DRAM cell array to do addition. The second paper presents a fixed-width array multiplier and a Booth multiplier for digital signal processing applications, while the last paper considers a dynamic flip-flop design for low power.

4.3.1. An Adder Using Charge Sharing and Its Application in DRAMs

Hak-soo Yu, Songjun Lee, and Jacob Abraham, The University of Texas at Austin, USA
4.3.2. Fixed-Width Multiplier for DSP Applications
Shyh-Jye Jou and Hui-Hsuan Wang, National Central University, Taiwan
4.3.3. Dynamic Flip-Flop with Improved Power
Nikola Nedovic and Vojin G. Oklobdzija, University of California, Davis, USA


12:00-1:00 

Lunch

Sponsored by Texas Instruments
 

1:00-3:00 

Session 5.1: Embedded Processors: Architecture and System-design Issues
Session Chair: Ricardo Gonzales, Tensilica

This session has four papers on special purpose processors covering embedded, mobile, low power and DSP processors.

5.1.1. Mobile Processors

Farinaz Koushanfar and Miodrag Potkonjak, University of California, Los Angeles, USA
Jan Rabaey, University of California, Berkeley, USA
5.1.2. AMULET3: A 100 MIPS Asynchronous Embedded Processor
S.B. Furber, D.A. Edwards, and J.D. Garside, The University of Manchester, UK
5.1.3. Xtensa with User Defined DSP Coprocessor Micro-architectures
Gulbin Ezer, Tensilica, Inc., USA
5.1.4. Predictive Strategies for Low-Power RTOS Scheduling
Pavan Kumar and Mani Srivastava, University of California, Los Angeles, USA


Session 5.2: Floorplanning and Partitioning
Session Chair: Tim Burks, Magma Design Automation

Attend this session to obtain insights into new (and old) approaches to floorplanning and partitioning! The first two papers deal with the problem of floorplanning, one using the idea of B* trees, and another utilizing a hierarchical method based on partitioning. The third paper in the session presents a comparative evaluation of several existing multi-way partitioning algorithms, with detailed experimental results. Rounding off the session is a paper that designs a datapath using concurrent one-dimensional floorplanning.

5.2.1. Rectilinear Block Placement Using B*-Trees

Guang-Ming Wu, Yun-Chih Chang, and Yao-Wen Chang, National Chiao Tung University, Taiwan
5.2.2. Fast Hierarchical Floorplanning With Congestion and Timing Control
Abhishek Ranjan, Kiarash Bazargan, and Majid Sarrafzadeh, Northwestern University, USA
5.2.3. An Evaluation of Move-Based Multi-Way Partitioning Algorithms
E. Yarack, Silicon Graphics, USA
J. Carletta, University of Akron, USA
5.2.4. Assignment-Space Exploration Approach to Concurrent Data-path/Floorplan Synthesis
Koji Ohashi, Mineo Kaneko, and Satoshi Tayu, Japan Advanced Institute of Science and Technology, Japan


Session 5.3: Basic Algorithms in Verification and Test
Session Chair: Yatin Hoskote, Intel

This session reports on work aimed at improving the basic infrastructure for verification and test. The first paper deals with how to improve the performance of a SAT solver for the common case when constraints are dynamically added and removed. The second paper deals with the problem of finding good variable orders for word-level decision diagrams. In the third paper, sensitivity levels of test patterns are defined and used to guide simulation-based ATPG for combinational circuits. Finally, the fourth paper investigates how a previously introduced compaction technique for scan test sets affects the quality of the tests.

5.3.1. On Solving Stack-Based Incremental Satisfiability Problems

Joonyoung Kim, Jesse Whittemore, and Karem Sakallah, University of Michigan, USA
5.3.2. Efficient Dynamic Minimization of Word-Level DDs based on Lower Bound Computation
Wolfgang Guenther and Rolf Drechsler, University of Freiburg, Germany
Stefan Hoereth, Siemens AG, Germany
5.3.3. Sensitivity Levels of Test Patterns and Their Usefulness in Simulation-Based Test Generation
Irith Pomeranz and Sudhakar M. Reddy, University of Iowa, USA
5.3.4. On Test Application Time and Defect Detection Capabilities of Test Sets for Scan Designs
Irith Pomeranz and Sudhakar M. Reddy, University of Iowa, USA


3:30-5:30 

Session 6.1: Special Session
Advancements in DSP Architecture
Session Chair: Jim Bondi, Texas Instruments
Organizer: Nagaraj NS, Texas Instruments

This session has three papers describing various architectural aspects of the latest DSP processor designs from Texas Instruments.

6.1.1. Effective Hardware Based Two Way Loop Cache for High Performance Low Power Processors

Tim Anderson and Sanjive Agarwala, Texas Instruments, USA
6.1.2. A Multi-Level Memory System Architecture for High-Performance DSP Applications
Sanjive Agarwala, Charles Fuoco, Tim Anderson, and Dave Comisky, Texas Instruments, USA
6.1.3. A Scalable High Performance DMA Architecture for High-Performance DSP Applications
Dave Comisky, Sanjive Agarwala, and Charles Fuoco, Texas Instruments, USA


Session 6.2: Advanced Architectural Design and Synthesis
Session Chair: Edward Grochowski, Intel

This session describes new and innovative approaches to architectural design and synthesis. The first paper discusses a compilation methodology for pipeline reconfigurable architectures. Next, a processor design system for complex pipelined designs is presented. The third paper uses a symbolic framework to solve the binding problem for embedded VLIW ASIPs. The final paper describes a system that develops methods to take designs from a software description and interface them with hardware using a C++ framework.

6.2.1. A Fast and Efficient Compiler for Pipeline Reconfigurable Architectures

Srihari Cadambi and Seth Copen Goldstein, Carnegie Mellon University, USA
6.2.2. PEAS-III: An ASIP Design Environment
Makiko Itoh, Shigeaki Higaki, Yoshinori Takeuchi, Akira Kitajima, and Masaharu Imai, Osaka University, Japan
Jun Sato, Tsuruoka National College of Technology, Japan
Akichika Shiomi, Shizuoka University, Japan
6.2.3. Symbolic Binding for VLIW ASIPs (short paper)
Satish Pillai and Margarida Jacome, The University of Texas at Austin, USA
6.2.4. Interfacing Hardware and Software Using C++ Class Libraries (short paper)
Dinesh Ramanathan and Rajesh Gupta, University of California, Irvine, USA
Ray Roth, CynApps Inc., USA


Session 6.3: Application and Case Studies in Test and Verification
Session Chair: Carl Pixley, Motorola

In this session practical aspects of applying modern test and verification techniques are discussed. The session starts with a paper on applying formal property verification in industry to ensure a design satisfies some desired properties. The second paper deals with verifying that the actual detailed implementation has not introduced any new bugs. This is followed by a paper dealing with how to correct/report any such bug. The final paper describes a set of production level procedures used to identify and verify the test structure and behavior of the BIST hardware as implemented for IBM's TestBench test generation system.

6.3.1. Formal Verification of an Industrial System-on-a-chip

Hoon Choi, Myung-Kyoon Yim, Jae-Young Lee, Byeon-Whee Yun, and Yun-Tae Lee, Samsung Electronics, Korea
6.3.2. Equivalence Checking Using a Structural SAT-Solver, BDDs, and Simulation
Viresh Paruthi and Andreas Kuehlmann, IBM, USA
6.3.3. Efficient Design Error Correction of Digital Circuits
Dirk W. Hoffmann and Thomas Kropf, University of Tübingen, Germany
6.3.4. An Automatic Validation Methodology for Logic BIST in High Performance VLSI Design
Michael Cogswell, James Sage, Don Pearl, and Alan Troidl, IBM, USA


5:30-6:30 

Poster Session
 

7:00-9:00 

Banquet

Speaker: TBD



Wednesday, September 20

9:00-10:00 

DSP Tutorial

Bryan Ackland, Lucent Technologies
Where are DSP architectures heading? How can high performance and low power be achieved? How can parallelism be exploited in high-end DSPs?
 

10:30-12:00 

Session 7.1: Logic Optimization
Session Chair: Chin-Long Wey, Michigan State University

This session considers techniques for the optimization of multi-level and two-level logic representations.

7.1.1. Efficient Logic Optimization Using Regular Extraction

Thomas Kutzschebauch, IBM T. J. Watson Research Center, USA
7.1.2. Binary and Multi-valued SPFD-based Wire Removal in PLA Networks
Subarnarekha Sinha, University of California, Berkeley, USA
Sunil P. Khatri, University of Colorado at Boulder, USA
Robert K. Brayton and A. Sangiovanni-Vincentelli, University of California, Berkeley, USA
7.1.3. Minimization of Ordered Pseudo Kronecker Decision Diagram
Per Lindgren, Lulea University of Technology, Sweden
Rolf Drechsler and Bernd Becker, Albert-Ludwigs University, Germany


Session 7.2: High Level Specification and Synthesis
Session Chair: Pranav Ashar, NEC

This session describes recent activity in the area of high level synthesis and design. The first paper addresses the productivity gap between the promise and reality of behavioral synthesis and considers how best it may be integrated within current design flows. The second paper attacks the problem of interfacing IPs operating at different clock frequencies, and the final paper of the session is related to multi-level communication synthesis.

7.2.1. Rethinking Behavioral Synthesis for a Better Integration within Existing Design Flows

Wander Oliveira Cesario and Ahmed Amine Jerraya, TIMA Laboratory, France
Zoltan Sugar and Imed Moussa, Arexsys, France
7.2.2. Synthesis and Optimization of Interface Hardware between IP's Operating at Different Clock Frequencies
Bong-Il Park, In-Cheol Park, and Chong-Min Kyung, KAIST, Korea
Hoon Choi, Samsung Electronics, Korea
7.2.3. Multi-level Communication Synthesis of Heterogeneous Multilanguage Specification
F. Hessel, TIMA & PUCRS, Brazil
P. Coste, G. Nicolescu, P. LeMarrec, N. Zergainoh, and A. Jerraya, TIMA Laboratory, France