Abstract

Multimedia information retrieval is the attempt to make computers see, hear and understand as humans do. Is it possible to give machines perception, so that they understand facial expressions, hummed melodies, stock charts and ECG curves? If so, the computer would become an even more valuable companion in business and private life. Think of the possibilities in, for example, healthcare, home security, online customer support or market analysis. This book explains what is possible in multimedia information retrieval today and what is not. We introduce the basic concepts and explain why the first step is always summarization and the second classification, which essentially means applying human understanding of some context to the summary. We group and discuss the various methods that have been proposed for the summarization of audio, visual and other media information. In classification, we build on today's psychological understanding of human cognition and successfully transfer concepts of human similarity perception to machine classification. We cluster machine learning methods by their approach, model and process. On top of that, we link the state-of-the-art methods of multimedia information retrieval back to human cognition: we propose artificial neural structures for the building blocks of media summarization and classification. The result is a balanced introduction to the field that starts from graduate-level IT knowledge and ends at the current frontiers of multimedia research.

Reference

Eidenberger, H. (2012). Handbook of Multimedia Information Retrieval. atpress. http://hdl.handle.net/20.500.12708/23542