Abstract

In this work we present novel visual summarization methods based on human visual perception, in particular on the Gestalt Laws. These laws describe how people perceive the world around them and how they simplify visual stimuli without loss of meaning. To the best of our knowledge, many computer vision methods developed in the past are limited to purely technical aspects and omit psychological theories such as Gestalt theory. With this work, we want to help close this gap. The most important method developed in this dissertation for summarizing data is the Gestalt Interest Points (GIP) algorithm. The algorithm is fast and highly effective because it extracts very little but well-selected image information and thereby creates very compact semantic summaries of images. The GIP algorithm is the foundation of the Gestalt Regions of Interest (GROI) method. In the domain of makeup-robust face recognition, a CNN trained with GROI images achieves clearly higher accuracy than a CNN trained with raw pixel images. Additionally, our method is more robust against over-fitting than the conventional approach of training a CNN on raw pixel images. The biggest advantage of the GROI method is that it summarizes the semantic content of images more compactly than whole images do. This is particularly important in big data domains such as face recognition.

Reference

Hörhan, M. (2020). Novel visual media summarization methods for retrieval applications [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.84123