Perceptual Hashing Methods

Unfortunately, most existing digital-signature-based schemes remain vulnerable to incidental modifications, which severely limits their practical utility in robust content authentication applications. The focus of our research is to address this issue by constructing a content descriptor that we refer to as a perceptual image hash. Such a hash function takes a large digital image as input and outputs a fixed-length binary vector known as the hash value. We require this hash value to be invariant under perceptually insignificant changes to the image, whereas for perceptually distinct inputs the hash values need to be approximately independent and hence different with high probability.
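To make the invariance requirement concrete, the following is a minimal sketch of a toy block-mean hash. It is not the hash constructed in this research: the function names (block_mean_hash, hamming), the 4-by-4 block grid, and the synthetic test images are all assumptions chosen for illustration, and because the sketch uses no secret key it offers robustness only, not security.

```python
from statistics import median

def block_mean_hash(image, blocks=4):
    """Toy hash (illustrative assumption): one bit per block, set when the
    block's mean intensity exceeds the median of all block means."""
    height, width = len(image), len(image[0])
    bh, bw = height // blocks, width // blocks
    means = []
    for by in range(blocks):
        for bx in range(blocks):
            total = sum(
                image[y][x]
                for y in range(by * bh, (by + 1) * bh)
                for x in range(bx * bw, (bx + 1) * bw)
            )
            means.append(total / (bh * bw))
    threshold = median(means)
    return [1 if m > threshold else 0 for m in means]

def hamming(a, b):
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

# Synthetic 16x16 grayscale image: a smooth diagonal gradient.
original = [[8 * x + 4 * y for x in range(16)] for y in range(16)]

# Small pixel-level perturbation (at most 2 gray levels), clipped to 0..255.
noisy = [
    [max(0, min(255, original[y][x] + (x * 7 + y * 3) % 5 - 2)) for x in range(16)]
    for y in range(16)
]

# A second synthetic image with a different layout (mirrored gradient).
different = [[original[y][15 - x] for x in range(16)] for y in range(16)]

h_original = block_mean_hash(original)
print("distance to perturbed image:", hamming(h_original, block_mean_hash(noisy)))
print("distance to different image:", hamming(h_original, block_mean_hash(different)))
```

In this sketch the slightly perturbed image keeps the same hash bits, while the image with a different layout does not. A practical perceptual hash would additionally need to withstand compression, geometric distortion, and deliberate attack.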

Traditional cryptographic hashes such as SHA-1 and MD5 produce uniformly distributed (or perfectly randomized) hash values. They are not applicable in the aforementioned multimedia applications because they are extremely sensitive to the message being hashed; i.e., even a one-bit change in the input (media) will change the output hash dramatically. Conversely, content-based feature extraction methods, developed from a signal processing perspective, are known to be robust but not secure. The ultimate goal of secure media hashing research is to make the job of an adversary computationally infeasible; i.e., it should be nearly impossible for a malicious attacker to find a way, in a reasonable amount of time, to tamper with the image content and defeat the authentication scheme.
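The sensitivity of cryptographic hashes can be observed directly. The sketch below uses Python's standard hashlib module to hash a short message with SHA-1 before and after flipping a single input bit; the message contents and the helper name bit_difference are assumptions made for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = bytearray(b"perceptual image hashing")

# Flip the least significant bit of the first byte: a one-bit change.
flipped = bytearray(message)
flipped[0] ^= 0x01

digest_original = hashlib.sha1(bytes(message)).digest()
digest_flipped = hashlib.sha1(bytes(flipped)).digest()

changed = bit_difference(digest_original, digest_flipped)
print(f"{changed} of {len(digest_original) * 8} digest bits changed")
```

For a well-behaved cryptographic hash, roughly half of the digest bits are expected to change. This is the behavior that makes such hashes unsuitable for authenticating media that undergoes incidental modifications.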
