Proc. IEEE Int. Conference on Image Processing,
Sep. 27-30, 2015, Quebec City, Canada.
Paper won a Top 10% Paper Award.
Full-Reference Visual Quality Assessment for Synthetic Images:
A Subjective Study
Debarati Kundu and Brian L. Evans
Embedded Signal Processing Laboratory,
Wireless Networking and Communications Group,
The University of Texas at Austin,
Austin, TX 78712 USA
debarati@utexas.edu - bevans@ece.utexas.edu

Paper Draft - Poster (PDF) - Poster (PowerPoint) - Table I: Correlation Scores - ESPL Synthetic Image Database
Abstract
Measuring visual quality, as perceived by human observers, is
becoming increasingly important in the many applications where
humans are the ultimate consumers of visual information.
For natural images, such as those taken by optical cameras,
significant progress in subjective quality assessment has been made
over the past several decades.
To aid in the benchmarking of objective image quality assessment
(IQA) algorithms, many natural image databases have been annotated
with subjective ratings of the images by human observers.
Similar information, however, is not readily available for synthetic
images commonly found in video games and animated movies.
In this paper, our primary contributions are
- conducting subjective tests on our publicly available ESPL
Synthetic Image Database, and
- evaluating the performance of more than 20 full-reference IQA
algorithms designed for natural images on the synthetic image database.
The ESPL Synthetic Image Database contains 500 distorted images
(20 distorted images for each of the 25 original images) in
1920 x 1080 format.
After collecting 26000 individual human ratings, we compute the
differential mean opinion score (DMOS) for each image to evaluate
IQA algorithm performance.
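For readers who want a concrete picture of this pipeline, the following
minimal Python sketch shows how per-subject difference scores can be
turned into DMOS values and how an IQA algorithm's correlation scores
(as reported in Table I) can then be computed. The function names, the
per-subject z-score normalization, and the rescaling constants are
illustrative assumptions, not the paper's exact processing steps.

    # Sketch of DMOS computation and IQA correlation evaluation.
    # Normalization and rescaling choices below are illustrative assumptions.
    import numpy as np
    from scipy.stats import spearmanr, pearsonr

    def compute_dmos(raw_scores):
        """raw_scores: subject_id -> {'ref': ratings of reference images,
        'dist': ratings of the corresponding distorted images}, aligned by index."""
        z_all = []
        for subj, s in raw_scores.items():
            diff = np.asarray(s['ref'], float) - np.asarray(s['dist'], float)
            z = (diff - diff.mean()) / (diff.std(ddof=1) + 1e-12)  # per-subject z-scores
            z_all.append(z)
        z_all = np.vstack(z_all)
        return 100.0 * (np.mean(z_all, axis=0) + 3.0) / 6.0        # map roughly to [0, 100]

    def evaluate_iqa(dmos, objective_scores):
        """Spearman and Pearson correlations of an IQA algorithm against DMOS."""
        srocc, _ = spearmanr(objective_scores, dmos)
        plcc, _ = pearsonr(objective_scores, dmos)
        return srocc, plcc

In practice, a monotonic logistic mapping is often applied to the
objective scores before computing the Pearson correlation; that step is
omitted here for brevity.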
Questions and Answers
The following is a summary of the questions that arose during
the poster presentation by Debarati Kundu and her answers:
- What method did you use for subjective evaluation?
We used the Single Stimulus Continuous Quality Scale (SSCQS) method.
The sequence of images was randomized for every session and every
subject, and the testing phase was preceded by a short training phase.
Within the testing phase, some images shown near the beginning of a
session were repeated (without informing the observer) at the end, to
give the observer's scores time to stabilize.
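To make the session structure concrete, an ordering like the one
described above could be generated along the following lines. This is
only a sketch: the image identifiers, the number of hidden repeats, and
the helper name build_session are assumptions, not the scripts actually
used in the study.

    # Illustrative sketch: randomized single-stimulus session with hidden repeats
    # of a few early images appended at the end for score stabilization.
    import random

    def build_session(image_ids, n_repeats=5, seed=None):
        rng = random.Random(seed)
        order = list(image_ids)
        rng.shuffle(order)           # fresh random order for each subject and session
        repeats = order[:n_repeats]  # images shown near the start ...
        return order + repeats       # ... are silently shown again at the end

    # Example: one subject's presentation order over 500 distorted images
    session = build_session(["img_%03d" % i for i in range(500)], n_repeats=5)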
- Why do you think the Structural Similarity (SSIM) index is doing
so well on your database?
This is because our images are more lightly distorted than those in
standard natural image databases such as LIVE and TID. Many observers
found a lightly blurred image to be more visually acceptable than the
corresponding pristine image. These inversions of the scores led us to
conjecture that visual difference does not always correspond to visual
annoyance, especially for synthetic images, which undergo a higher
degree of cinematographic processing (for animation sequences).
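As a point of reference, this kind of full-reference SSIM comparison
can be reproduced with off-the-shelf tools. The sketch below uses
scikit-image's structural_similarity on a grayscale image with a light
Gaussian blur as the distortion; the file name is a placeholder, and
this is not necessarily the SSIM implementation benchmarked in the paper.

    # Sketch: SSIM between a pristine image and a lightly blurred version.
    from skimage.io import imread
    from skimage.color import rgb2gray
    from skimage.filters import gaussian
    from skimage.metrics import structural_similarity

    ref = rgb2gray(imread("reference.png"))   # float image in [0, 1]
    dist = gaussian(ref, sigma=1.0)           # light Gaussian blur as the distortion
    score = structural_similarity(ref, dist, data_range=1.0)
    print("SSIM = %.4f" % score)              # stays close to 1 for light distortions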
- What differences did you observe in the statistical properties of
natural and synthetic scenes?
We found that the empirical distributions of pixel values in synthetic
scenes, in both the spatial and transform domains, can be modeled by
generalized Gaussian distributions, with some differences in the shape
and scale parameters relative to natural scenes. The exact degree of
this difference is quantified in our 2014 Asilomar paper.
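One simple way to check this kind of generalized Gaussian behavior is
to fit SciPy's gennorm distribution to mean-removed pixel values and to
transform coefficients. The sketch below is illustrative only; the exact
features used in the 2014 Asilomar paper (transform choice, block
structure, normalization) are not reproduced here, and the file name is
a placeholder.

    # Sketch: fitting generalized Gaussian distributions (GGD) to image statistics.
    import numpy as np
    from scipy.fft import dctn
    from scipy.stats import gennorm
    from skimage.io import imread
    from skimage.color import rgb2gray

    img = rgb2gray(imread("synthetic_scene.png"))

    # Spatial domain: mean-removed pixel values (subsampled to keep the fit fast)
    spatial = (img - img.mean()).ravel()[::50]
    beta_s, loc_s, scale_s = gennorm.fit(spatial)   # beta = shape, scale = spread

    # Transform domain: AC coefficients of the 2-D DCT
    ac = dctn(img, norm="ortho").ravel()[1:][::50]  # drop DC, subsample
    beta_t, loc_t, scale_t = gennorm.fit(ac)

    print("spatial:   shape=%.2f scale=%.3f" % (beta_s, scale_s))
    print("transform: shape=%.2f scale=%.3f" % (beta_t, scale_t))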
COPYRIGHT NOTICE: All the documents on this server
have been submitted by their authors to scholarly journals or conferences
as indicated, for the purpose of non-commercial dissemination of
scientific work.
The manuscripts are put on-line to facilitate this purpose.
These manuscripts are copyrighted by the authors or the journals in which
they were published.
You may copy a manuscript for scholarly, non-commercial purposes, such
as research or instruction, provided that you agree to respect these
copyrights.
Last Updated 10/03/15.