My research interests include bioinformatics, communications, machine learning and signal processing.
Recent research thrusts include:

Structured lowrank matrix decomposition with applications to bioinformatics.
In many problems (e.g., recommender systems), one seeks to represent a data matrix by the product of two
matrices: one capturing meaningful information and the other specifying how that information is combined
to generate the data. We consider the matrix decomposition that arises in the context of discovery of
genetic variations; there, the observed matrix contains noisy samples of the product of an informative
matrix having discretevalued entries and a matrix with rows that are standard unit basis. We developed
and analyzed structurallyconstrained iterative algorithms for finding this decomposition.
Extensive benchmarking demonstrates superiority of the proposed framework over
stateoftheart techniques.

Supervision and smoothness in clustering. In many applications, we
are given partial or uncertain relations between objects that need to be clustered. We developed the
softconstraint
semisupervised affinity propagation (SCSSAP) algorithm which adds supervision to affinity
propagation (AP) without strictly enforcing instancelevel constraints. This formulation is
particularly advantageous in the presence of noisy labels or noisy constraints since the penalty
parameter of SCSSAP can be tuned to express our confidence in instancelevel constraints. In the
presence of label and constraint noise, SCSSAP outperforms AP as well as existing semisupervised
clustering algorithms.

Basecalling algorithms for nextgeneration DNA sequencing systems.
Novel sequencing technologies will provide significant improvements in many aspects of human condition, ultimately
leading towards the understanding, diagnosis, treatment, and prevention of diseases. Reliable decisionmaking in
such downstream applications is predicated upon accurate basecalling, i.e., precise identification of the order of
nucleotides from noisy sequencing data. We explored tradeoffs between performance and complexity of basecalling,
developing algorithms that have both better accuracy and
are orders of magnitude faster than
stateoftheart methods, and are capable of processing
and generating "soft" (probabilistic) basecall information.
 Sensor and cognitive radio networks. For a rational use of the sensor network resources,
the fusion center typically schedules only a subset of the available sensors for transmission, and the selected sensors
transmit only partial information. Recently, we showed that in a variety of scenarios sensor selection can be cast as
maximization of a submodular function over uniform matroids. It turns out that a computationally efficient greedy sensor
selection algorithm is guaranteed to perform within (11/e) of the optimal solution. We extended these results to the
scenario where the communication between sensors and the fusion center is unreliable.
 Algorithms for solving integer leastsquares problems in communications.
Complexity of the optimal signal processing in communication systems is a main obstacle to delivering the high data rates
promised by theory. In a probabilistic setting, common to communications, it is possible to exploit statistics of the problem
in order to design fast algorithms and analyze their complexity. An example of our work in this area includes the analysis of
the expected complexity of an algorithm for optimal decoding in highdimensional communication systems.
The results reveal that
optimal decoding, hitherto considered infeasible, can in fact often be implemented in practice. Recently, we
extended these results to the sparsityaware setting.
More details can be found in Projects and Publications.
