
Research Projects 

The mission of the Software Evolution and Analysis Laboratory is to improve developer productivity and software reliability during the evolution of large software systems. Representative papers are available on my publications page.

Analysis and Automation of Systematic Software Modifications

Extension of existing software often requires systematic and pervasive edits—programmers apply similar, but not identical, enhancements, refactorings, and bug fixes to many similar methods. The vision of this research is to produce a novel program analysis, transformation, and validation framework. A novel automatic edit-script generation approach will learn abstract, context-aware program transformations from example edits. A novel edit-script application algorithm automatically identifies code locations that require similar edits, and then transforms each location with a concrete edit that our algorithm customizes to the particular context. Programmers may also apply edit-scripts on-demand by specifying an edit location. Dynamic and static analysis validate edits by testing the transformed code.

This project is sponsored by National Science Foundation CAREER Award CCF-1149391: Analysis and Automation of Systematic Software Modifications.

Keywords: Code Transformation; Refactoring; Static and Dynamic Analysis; Experimental Evaluation

CHIME: Analytical Support for Investigating Software Modifications in Collaborative Development Environment

During collaborative software development, developers need to analyze past and present software modifications made by other programmers in tasks such as peer code reviews, bug investigations, and change impact analysis. The CHIME project addresses the following fundamental questions about software modifications: (1) what is a concise and explicit representation of a program change? (2) how do we automatically extract the differences between two program versions into meaningful high-level representations? (3) how can we significantly improve developer productivity in investigating, searching, and monitoring software modifications made by other developers?

This CHIME project is sponsored by National Science Foundation Grant CCF-1117902: Analytical Support for Investigating Software Modifications in Collaborative Development Environment.

Keywords: Program Differencing; Code Change Analysis; Empirical Studies; Mining Software Archives; Collaborative Software Development

ReARCH: An Empirical Investigation into the Role of Refactoring during Software Evolution

Should we refactor code or just keep adding features? While code decay causes multi-million dollar losses in the form of cancelled projects or operational failures, little help is available to software practitioners for answering this question. It is widely believed that refactoring improves software quality and developer productivity. Yet few empirical studies quantitatively validate the benefits of refactoring. This lack of an empirical basis makes it difficult for software practitioners to justify refactoring investment or to decide when to refactor software. Our goal is to investigate the role of refactoring during software evolution both quantitatively and qualitatively. In particular, we propose to quantify the costs and benefits of refactoring and its relationship to various software metrics. We investigate the rationale, benefits, and challenges of refactoring from the developers' perspective and examine the social and technical issues surrounding refactorings.

Keywords: Refactoring; Empirical Studies; Software Evolution; Modularity; Technical Debt

BRACE: Ensuring Correctness of Cyber Physical Systems

Developing software for cyber-physical systems is challenging because correct execution depends not only on the logical state but also on the physical state. The fact that physical states are transient and difficult to observe further complicates the process. Developers must repeatedly rerun the system and continuously tweak the hardware and software to achieve the desired behavior. This process of manually aligning the logical and physical states of the system is extremely labor-intensive, and it lacks the rigor and repeatability we expect of well-designed systems. BRACE is a framework that aims to alleviate these challenges by enabling joint assertions over both the cyber (logical) and physical properties of the system. The framework provides a middleware assertion library that simplifies the tedious process of examining system states by introducing new forms of assertions tailored to the unique demands of cyber-physical systems.
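To make the idea of a joint assertion concrete, the following is a minimal Python sketch of a predicate that is evaluated over a logical state and a physical state together. It is not BRACE's actual API: the names Snapshot, JointAssertion, check, gripper_closed, and grip_force_n, as well as the 2.0 N threshold, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Snapshot:
    """One observation of the system (hypothetical structure)."""
    logical: Dict[str, object]   # values the software believes, e.g. program variables
    physical: Dict[str, float]   # values measured in the world, e.g. sensor readings


@dataclass
class JointAssertion:
    """A named predicate evaluated over both halves of a snapshot."""
    name: str
    predicate: Callable[[Snapshot], bool]


def check(assertions: List[JointAssertion], snapshot: Snapshot) -> List[str]:
    """Return the names of all assertions that the snapshot violates."""
    return [a.name for a in assertions if not a.predicate(snapshot)]


# Hypothetical rule: if the controller believes the gripper is closed,
# the force sensor should report at least 2.0 N (assumed threshold).
gripper_rule = JointAssertion(
    "gripper_closed_implies_force",
    lambda s: (not s.logical["gripper_closed"]) or s.physical["grip_force_n"] >= 2.0,
)

ok = Snapshot({"gripper_closed": True}, {"grip_force_n": 3.1})
bad = Snapshot({"gripper_closed": True}, {"grip_force_n": 0.4})

print(check([gripper_rule], ok))
print(check([gripper_rule], bad))
```

In this sketch the second snapshot violates the rule because the logical state claims a closed gripper while the measured force says otherwise, which is the kind of logical/physical mismatch that developers would otherwise hunt for by rerunning the system by hand.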

Keywords: Cyber-Physical Systems; Software Correctness; Runtime Verification; Middleware

Code Duplication and Software Forking

It has long been believed that duplicated code fragments indicate poor software quality and that factoring out their commonality improves it; thus, previous studies focused on measuring the percentage of code clones and interpreted a large (or increasing) percentage as an indicator of poor quality. In contrast, we investigated how and why duplicated code is actually created and maintained, using two empirical analyses. First, we used an edit capture and replay approach to gather insights into copy-and-paste programming practices. Second, to extend this type of change-centric analysis to programs without edit logs, we developed a clone genealogy analysis that tracks individual clones over multiple versions. By focusing on how code clones actually evolve, we found that clones are not inherently bad and that we need better support for managing clones.
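As a rough illustration of tracking a clone over multiple versions, the sketch below follows a single code fragment from one version to the next by textual similarity. It is a simplification under stated assumptions, not the analysis used in the project: each version is modeled as a dictionary from a fragment identifier to its text, matching uses Python's difflib.SequenceMatcher, and the 0.6 threshold, the function names, and the sample fragments are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(a: str, b: str) -> float:
    """Textual similarity in [0, 1] between two code fragments."""
    return SequenceMatcher(None, a, b).ratio()


def best_match(fragment: str, candidates: Dict[str, str],
               threshold: float = 0.6) -> Optional[str]:
    """Return the id of the most similar candidate at or above the threshold, else None."""
    best_id, best_score = None, threshold
    for cid, text in candidates.items():
        score = similarity(fragment, text)
        if score >= best_score:
            best_id, best_score = cid, score
    return best_id


def genealogy(versions: List[Dict[str, str]], start_id: str) -> List[Optional[str]]:
    """Follow one fragment through successive versions.

    The result holds one fragment id per version visited; a trailing None
    means the fragment had no sufficiently similar successor.
    """
    lineage: List[Optional[str]] = [start_id]
    current = versions[0][start_id]
    for nxt in versions[1:]:
        match = best_match(current, nxt)
        lineage.append(match)
        if match is None:
            break
        current = nxt[match]
    return lineage


# Hypothetical history: the fragment is modified in v2 and disappears in v3.
v1 = {"A.copy": "for x in items: total += x.price"}
v2 = {"A.copy2": "for x in items: total += x.price * x.qty",
      "B.other": "return None"}
v3 = {"C.unrelated": "print('hello world')"}

print(genealogy([v1, v2, v3], "A.copy"))
```

A fuller genealogy would track whole clone groups rather than single fragments and would classify how the members of a group change relative to one another; the sketch only shows the version-to-version linking step.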

Keywords: Code Duplication; Software Forking; Software Reuse