**Instructor:** Constantine Caramanis

e-mail: constantine@utexas.edu

Office: TBA

Office Hours: TBA

**Lectures:**

Time: Wednesday 2:30-4:30 PM

Location: 351 Meyer

This course develops some of the fundamental tools for large-scale data analysis and machine learning. It is a course about data and computation, and the focus falls on large-scale linear algebra: iterative, approximate, and randomized algorithms for basic linear algebra computations and matrix functions. An important step along the way will be to develop some of the basic results from matrix perturbation theory. A fundamental companion to this will be the development of concentration inequalities for matrices. Throughout, these tools will be developed with a view towards applications in machine learning, such as clustering, dimensionality reduction, and mixture models.

Good familiarity with linear algebra (e.g., facility with the SVD, the spectral theorem, matrix norms, etc.) is important, as we will freely use concepts,
tools, and techniques from linear algebra; the same goes for probability and stochastic processes.

The grade for the course will be determined by the final written project. If you decide to take this class, I expect that we will discuss
ideas for the written project before the fifth class. Projects may be expository, research-oriented, or computational in nature.
I will also give out problem sets periodically;
these are not to be turned in, but are extra exercises to help you practice material covered in class, or to fill in gaps left during lecture.

The course will be taught from a collection of sources; some books, but mainly research papers. These will be referenced and linked to (when possible) below.

Here is a brief summary of what we covered in class, and the main references related to the material. I will continue filling this in as we go along.

There are several excellent references. The main references we are using here are: Matrix Computations, by Golub and Van Loan, and also Matrix Perturbation Theory, by Stewart and Sun. An excellent linear algebra background book is Linear Algebra Done Right, by Axler.

The main reference for today's class is Stewart and Sun.

Part 1 : review of invariant subspaces;

Part 2 : the approximation theorem;

Part 3 : the perturbation theorem for general matrices;

Part 4 : two sin-theta theorems for Hermitian matrices.

I have also posted the (hand-written) notes for this class here.

This lecture finishes the main perturbation theory material. In particular, we covered the so-called approximation theorem, which answers the question: how close is an invariant subspace of A to a given approximately invariant subspace? The result relies critically on the separation between the spectra of submatrices, as measured by the smallest singular value of the (non-singular) operator T introduced in Lecture 3. We then showed that this notion of separation is Lipschitz continuous, which immediately yields perturbation theorems, which in essence answer: how close is an invariant subspace of A+E to an invariant subspace of A? Finally, we specialized everything to the setting of Hermitian matrices. Life is easier here because the existence of an invariant subspace is immediate, so one can move directly to bounding the distance between subspaces. We thus developed part of the so-called direct bounds, due to Davis and Kahan. It is these results that we use for spectral clustering. See also Problem Set 2.
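As a quick sanity check of the Davis-Kahan picture, the following sketch (my own illustration, not the course's code) perturbs a symmetric matrix with a known eigengap and verifies that the angle between top eigenvectors obeys a sin-theta bound; the specific constant used is the convenient variant sin(theta) <= 2||E||_2 / gap, where gap is the separation between the top two eigenvalues of A.

```python
import numpy as np

# Davis-Kahan-style check: how far can the top eigenvector of A + E
# drift from the top eigenvector of A?  Here A has eigengap 10 and E
# is a small random symmetric perturbation.
rng = np.random.default_rng(0)
n, gap = 50, 10.0

A = np.diag(np.concatenate(([gap], np.zeros(n - 1))))   # eigengap = 10
G = rng.standard_normal((n, n))
E = 0.05 * (G + G.T)                                    # symmetric perturbation

u = np.linalg.eigh(A)[1][:, -1]          # top eigenvector of A
u_pert = np.linalg.eigh(A + E)[1][:, -1] # top eigenvector of A + E

# sine of the principal angle between the two one-dimensional subspaces
sin_theta = np.sqrt(1.0 - min(1.0, (u @ u_pert) ** 2))
bound = 2.0 * np.linalg.norm(E, 2) / gap
```

This is exactly the mechanism behind spectral clustering: a small perturbation of a block-structured matrix moves the leading invariant subspace only a little, provided the eigengap is large.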

The main reference for today's class is Stewart and Sun.

The Power Method, and Lanczos Algorithm, part 1/2 , Lanczos Algorithm, part 2/2.

These are the most popular iterative techniques for computing the top (and bottom) singular vectors of a matrix. The focus is on understanding convergence rates, as well as the quality of early termination. In the case of Lanczos, we also develop the notion of Krylov subspaces (important in their own right, because of their applications in other areas such as the conjugate gradient method).
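Minimal sketches of the two iterative schemes (assumed implementations for illustration, not the lecture code): the power method converges geometrically at rate |lambda_2/lambda_1| per iteration, while Lanczos restricts A to the Krylov subspace span{q, Aq, ..., A^{k-1}q} and reads Ritz values off a small tridiagonal matrix T.

```python
import numpy as np

def power_method(A, iters=500, seed=0):
    """Top eigenpair of a symmetric matrix by repeated multiplication."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    return v @ A @ v, v               # Rayleigh quotient, eigenvector

def lanczos_top(A, k, seed=0):
    """Top Ritz value from a k-step Lanczos tridiagonalization
    (no reorthogonalization -- fine for small k)."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(max(k - 1, 0))
    q = rng.standard_normal(n)
    Q[:, 0] = q / np.linalg.norm(q)
    for j in range(k):                # three-term recurrence
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w -= alpha[j] * Q[:, j]
        if j > 0:
            w -= beta[j - 1] * Q[:, j - 1]
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return np.linalg.eigvalsh(T)[-1]
```

On a small diagonal test matrix both recover the top eigenvalue; the practical difference appears when the eigengap is small, where Lanczos needs far fewer matrix-vector products.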

I have also posted the (hand-written) notes for these three lectures here.

Matrix Concentration Inequalities, part 1/3,

Matrix Concentration Inequalities, part 2/3 ,

Matrix Concentration Inequalities, part 3/3.
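A small numerical illustration of the phenomenon these lectures quantify (my own sketch, not from the lecture notes): the operator norm of an average of n i.i.d. zero-mean random symmetric matrices shrinks roughly like sqrt(log(d)/n), as predicted by matrix Bernstein-type inequalities, even though any single sample has norm on the order of sqrt(d).

```python
import numpy as np

# Average n i.i.d. random symmetric sign matrices and compare the
# operator norm of the average to that of a single sample.
rng = np.random.default_rng(0)
d, n = 20, 2000

S = np.zeros((d, d))
for _ in range(n):
    G = rng.choice([-1.0, 1.0], size=(d, d))
    X = (G + G.T) / 2            # symmetric, zero mean, bounded entries
    S += X

avg_norm = np.linalg.norm(S / n, 2)     # norm of the empirical mean
one_norm = np.linalg.norm(X, 2)         # norm of a single sample
```

The empirical mean's norm is a small fraction of a single sample's, which is the matrix analogue of the scalar law of large numbers that the concentration inequalities make quantitative.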