My research interests include bioinformatics, communications, machine learning and signal processing.
Recent research thrusts include:
More details can be found in Projects and Publications.
Structured low-rank matrix decomposition with applications to bioinformatics.
In many problems (e.g., recommender systems), one seeks to represent a data matrix by the product of two
matrices: one capturing meaningful information and the other specifying how that information is combined
to generate the data. We consider the matrix decomposition that arises in the context of discovery of
genetic variations; there, the observed matrix contains noisy samples of the product of an informative
matrix having discrete-valued entries and a matrix with rows that are standard unit basis. We developed
and analyzed structurally-constrained iterative algorithms for finding this decomposition.
Extensive benchmarking demonstrates superiority of the proposed framework over
Supervision and smoothness in clustering. In many applications, we
are given partial or uncertain relations between objects that need to be clustered. We developed the
semi-supervised affinity propagation (SCSSAP) algorithm which adds supervision to affinity
propagation (AP) without strictly enforcing instance-level constraints. This formulation is
particularly advantageous in the presence of noisy labels or noisy constraints since the penalty
parameter of SCSSAP can be tuned to express our confidence in instance-level constraints. In the
presence of label and constraint noise, SCSSAP outperforms AP as well as existing semi-supervised
Base-calling algorithms for next-generation DNA sequencing systems.
Novel sequencing technologies will provide significant improvements in many aspects of human condition, ultimately
leading towards the understanding, diagnosis, treatment, and prevention of diseases. Reliable decision-making in
such downstream applications is predicated upon accurate base-calling, i.e., precise identification of the order of
nucleotides from noisy sequencing data. We explored trade-offs between performance and complexity of base-calling,
developing algorithms that have both better accuracy and
are orders of magnitude faster than
state-of-the-art methods, and are capable of processing
and generating "soft" (probabilistic) base-call information.
- Sensor and cognitive radio networks. For a rational use of the sensor network resources,
the fusion center typically schedules only a subset of the available sensors for transmission, and the selected sensors
transmit only partial information. Recently, we showed that in a variety of scenarios sensor selection can be cast as
maximization of a submodular function over uniform matroids. It turns out that a computationally efficient greedy sensor
selection algorithm is guaranteed to perform within (1-1/e) of the optimal solution. We extended these results to the
scenario where the communication between sensors and the fusion center is unreliable.
- Algorithms for solving integer least-squares problems in communications.
Complexity of the optimal signal processing in communication systems is a main obstacle to delivering the high data rates
promised by theory. In a probabilistic setting, common to communications, it is possible to exploit statistics of the problem
in order to design fast algorithms and analyze their complexity. An example of our work in this area includes the analysis of
the expected complexity of an algorithm for optimal decoding in high-dimensional communication systems.
The results reveal that
optimal decoding, hitherto considered infeasible, can in fact often be implemented in practice. Recently, we
extended these results to the sparsity-aware setting.