This dissertation was presented to the Faculty of the Graduate School of The University of Texas at Austin in partial fulfillment of the requirements for the degree of

Ph.D. in Electrical Engineering


Perceptually Based Methods for Robust Image Hashing 


Vishal Monga, Ph.D.E.E.

The University of Texas at Austin, August 2005


Prof. Brian L. Evans


Dissertation - Defense Slides


Hash functions are frequently called message digest functions. Their purpose is to extract a short binary string from a large digital message. A key feature of conventional cryptographic (and other) hashing algorithms such as message digest 5 (MD5) and secure hash algorithm 1 (SHA-1) is that they are extremely sensitive to the message; i.e., changing even one bit of the input message will change the output dramatically. However, multimedia data such as digital images undergo various manipulations such as compression and enhancement. An image hash function should instead take into account the changes in the visual domain and produce hash values based on the image's visual appearance. Such a function would facilitate comparisons and searches in large image databases. Other applications of a perceptual hash lie in content authentication and watermarking.

This dissertation proposes a unifying framework for multimedia signal hashing. The problem of media hashing is divided into two stages. The first stage extracts media-dependent intermediate features that are robust under incidental modifications while being different for perceptually distinct media with high probability. The second stage performs a media-independent clustering of these features to produce a final hash.

This dissertation focuses on feature extraction from natural images such that the extracted features are largely invariant under perceptually insignificant modifications to the image (i.e. robust). An iterative geometry preserving feature detection algorithm is developed based on an explicit modeling of the human visual system via end-stopped wavelets. For the second stage, I show that the decision version of the feature clustering problem is NP-complete. Then, for any perceptually significant feature extractor, I develop polynomial time clustering algorithms based on a greedy heuristic.

Existing algorithms for image/media hashing exclusively employ either cryptographic or signal processing methods. A pure signal processing approach achieves robustness to perceptually insignificant distortions but compromises security which is desirable in applications for multimedia protection. Likewise pure cryptographic techniques while secure, completely ignore the requirement of being robust to incidental modifications of the media. The primary contribution of this dissertation is a joint signal processing and cryptography approach to building robust as well as secure image hashing algorithms. The ideas proposed in this dissertation can also be applied to other problems in multimedia security, e.g. watermarking and data hiding.


For more information contact: Vishal Monga <>