We have developed a two-stage perceptual hashing framework that is well suited to applications in image database search and authentication. The first stage iteratively applies a feature detector to extract visually significant image features. These features are close in a distance metric for perceptually identical images and separated by a large distance for images that have different visual content. The feature vector forms the intermediate hash. The second stage clusters these features such that perceptually identical images are mapped to the same cluster with very high probability while minimizing the likelihood of collision for perceptually distinct inputs. The feature extraction stage has been implemented in MATLAB, whereas the clustering is implemented in C.
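The two-stage structure described above can be sketched in a few lines of Python. This is only a toy illustration, not the MATLAB/C implementation: the block-mean feature detector, the Hamming distance, the fixed `CENTERS` table, and the function names `intermediate_hash` and `final_hash` are all assumptions made for the example.

```python
def intermediate_hash(image, blocks=2):
    """Stage 1 stand-in (assumed): one bit per block, set when the
    block mean exceeds the mean of the whole image."""
    h, w = len(image), len(image[0])
    total_mean = sum(map(sum, image)) / (h * w)
    bh, bw = h // blocks, w // blocks
    bits = []
    for by in range(blocks):
        for bx in range(blocks):
            block = [image[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            bits.append(1 if sum(block) / len(block) > total_mean else 0)
    return tuple(bits)


def hamming(a, b):
    """Number of positions in which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def final_hash(bits, centers):
    """Stage 2 stand-in (assumed): index of the nearest cluster center."""
    return min(range(len(centers)), key=lambda i: hamming(bits, centers[i]))


# Hypothetical cluster centers; a real system would learn these from data.
CENTERS = [(1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1)]

# A tiny grayscale "image": bright top half, dark bottom half.
original = [[200, 210, 190, 205],
            [195, 205, 200, 198],
            [20, 30, 25, 15],
            [10, 35, 22, 18]]

# Mild brightening: perceptually the same picture.
brightened = [[min(255, p + 10) for p in row] for row in original]

# Transposed content: the bright region is now on the left, not the top.
different = [list(col) for col in zip(*original)]

hash_original = final_hash(intermediate_hash(original), CENTERS)
hash_brightened = final_hash(intermediate_hash(brightened), CENTERS)
hash_different = final_hash(intermediate_hash(different), CENTERS)
```

In this sketch the original and brightened images land in the same cluster, while the transposed image lands in a different one. A real feature detector would have to survive far stronger distortions than a uniform brightness shift.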
Within this framework, we have developed two approaches, each with a deterministic and a randomized version, yielding four algorithms. Randomness is applied to the clustering step to enable secure hashing. Randomization is of great importance in an adversarial setting, where a malicious attacker may try to generate inputs that defeat the hash algorithm.
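One simple way to picture key-dependent clustering is to derive the cluster centers from a secret key, so that an attacker without the key cannot predict where the cluster boundaries fall. The Python sketch below illustrates that idea only and is not necessarily the randomization scheme used in our algorithms. The names `keyed_centers` and `randomized_hash`, the binary intermediate hash, and the Hamming distance are assumptions. Python's `random` module is not cryptographically secure, so a real system would need a keyed cryptographic generator.

```python
import random


def hamming(a, b):
    """Number of positions in which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def keyed_centers(secret_key, num_centers, num_bits):
    """Derive cluster centers from a secret key (assumed scheme).

    random.Random is used only for illustration; it is NOT
    cryptographically secure.
    """
    rng = random.Random(secret_key)
    return [tuple(rng.randint(0, 1) for _ in range(num_bits))
            for _ in range(num_centers)]


def randomized_hash(bits, secret_key, num_centers=8):
    """Map an intermediate hash to the nearest key-dependent center."""
    centers = keyed_centers(secret_key, num_centers, len(bits))
    return min(range(num_centers), key=lambda i: hamming(bits, centers[i]))


# Hypothetical 16-bit intermediate hash of some image.
intermediate = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0)

# The same key always reproduces the same final hash.
first = randomized_hash(intermediate, secret_key=1234)
second = randomized_hash(intermediate, secret_key=1234)
```

Anyone holding the key can recompute the hash exactly, while a different key induces a different partition of the intermediate hash space.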
A significant benefit of our framework is that the second (clustering) step is media-independent. Hence, an appropriate feature detector may be applied in the first step to make the framework applicable to other media data sets such as audio, specific classes of images, and documents. We have applied our framework to natural images.
In our most recent work, we have extended the feature extraction step to handle geometric attacks on the image, such as rotation, translation, and scaling. In particular, we developed two classes of algorithms that