Abstract

In contrast to existing approaches for document analysis and understanding, this paper presents a system that considers a logical role for graphic content in predominantly textual, born-digital PDF documents. This work was inspired by the idea of using structural graphic objects to clarify the logical layout even of complex, mostly graphic documents. Based on visual cognition, geometric features and spatial relations, the proposed statistical method distinguishes illustrative graphic objects from structural graphic objects. We performed an evaluation on two document domains, newspapers and technical manuals, and found the results to be reliable. We propose using logical information about the graphic content as a new step towards domain-independent document understanding systems.

Reference

Gabdulkhakova, A., Hassan, T., & Kropatsch, W. (2013). Logical Layout Recovery: approach for graphic-based features. In W. Kropatsch, F. Torres Garcia, & G. Ramachandran (Eds.), Proceedings of the 18th Computer Vision Winter Workshop 2013 (pp. 47–54). Prip 186/3. http://hdl.handle.net/20.500.12708/54741