Each day is dedicated to one main area of Document Image Processing, namely:

  • Digital Image Fundamentals (Monday),
  • Denoising (Tuesday),
  • Document Image Preprocessing (Wednesday),
  • Recognition (Thursday),
  • Historical Document Images (Friday). 
     

Each morning, two lectures will be given, while the afternoons will feature panel discussions with the program committee members and the invited speakers, student presentations, and evaluation tasks.

 

The evenings will include dinner at a local restaurant and a special event: 

  • on Monday, a local group will perform Greek dances in the central square.
  • on Tuesday, Manos will give a fishing course, followed by a fishing contest.
  • on Wednesday afternoon (16:00) there will be an excursion to Thimena island, where we will swim and have dinner.
  • on Thursday, after dinner, there will be ouzo and tavli.
  • on Friday, the city hall will organize a farewell dinner for us. The student awards will be presented during the closing ceremony.

Image Acquisition and Digitization

Irene Stathi

 

This lecture discusses current and future concepts and techniques for digital imaging in the field of culture. Libraries and archives initiate imaging programs to meet real or perceived needs. The utility of digital images is most likely ensured when the needs of users are clearly defined, the attributes of the documents are known, and the technical infrastructure to support conversion, management, and delivery of content is appropriate to the needs of each project.

Digital images are electronic snapshots taken of a scene or scanned from documents, such as photographs, manuscripts, printed texts, and artwork (images and moving images). Digital image capture must take into consideration the technical processes involved in converting from analog to digital representation, as well as the attributes of the source documents themselves: physical size and presentation, level of detail, tonal range, and presence of color. Documents may also be characterized by the production process used to create them, including manual, machine, photographic and, more recently, electronic means. Finally, the aesthetics of digital images will be examined; a considerable portion of digitized sources are works of art, for which high-quality reproduction is always required.

 

 

 

Image Basics

A. Skodras

 

The basics of digital image processing will be reviewed in the course of this two-hour session. More specifically, the following topics will be addressed: image enhancement in the spatial and frequency domains; image segmentation; color image processing (color models and transformations); and image compression.
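As a small, hedged illustration of spatial-domain enhancement (one of the topics above), the Python sketch below implements global histogram equalization with NumPy; the function name and the assumption of an 8-bit greyscale input are choices made for this example, not part of the lecture material.

    import numpy as np

    def equalize_histogram(img: np.ndarray) -> np.ndarray:
        """Global histogram equalization for an 8-bit greyscale image."""
        # Histogram of the 256 grey levels.
        hist = np.bincount(img.ravel(), minlength=256)
        # Cumulative distribution, rescaled to the full [0, 255] range.
        cdf = hist.cumsum().astype(np.float64)
        cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min()) * 255.0
        # Map every pixel through the equalized intensity transform.
        return cdf[img].astype(np.uint8)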

 

 

 

Denoising Techniques for Document Images

Nikolaos Mitianoudis

 

Document images are acquired by scanning or photographing historical or modern documents, with the main aim of preserving and subsequently extracting and understanding their content. Document images are bound to contain several degradations, i.e. artifacts or noise from various independent factors, and possibly motion or out-of-focus blur. These degradations may be induced during the digitization stage of the document (blur) or may result from the document's ageing and human use over the years (dirt, stains, fingerprints and ink dissipation). It is crucial that these degradations are alleviated to the greatest possible extent in order to improve the further processing of the document image, including page layout analysis, optical character recognition and document image retrieval via text or image queries. In this talk, we will cover all the essential image enhancement, denoising and deconvolution techniques. The main aim will be to demonstrate how these generic techniques are calibrated to enhance the content of document images and remove the aforementioned degradations.
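To make one of these generic techniques concrete, here is a minimal median-filter sketch (NumPy only), well suited to the impulsive "salt-and-pepper" noise common in scanned documents; the function name and the 3x3 default window are illustrative assumptions.

    import numpy as np

    def median_filter(img: np.ndarray, size: int = 3) -> np.ndarray:
        """Median filter a greyscale image with a size x size window."""
        pad = size // 2
        # Reflect-pad the borders so every pixel has a full neighbourhood.
        padded = np.pad(img, pad, mode="reflect")
        out = np.empty_like(img)
        h, w = img.shape
        for y in range(h):
            for x in range(w):
                out[y, x] = np.median(padded[y:y + size, x:x + size])
        return out

In practice one would call an optimized routine such as scipy.ndimage.median_filter; the explicit loop above only exposes the underlying computation.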

 

 

 

 

Document Image Binarisation

Ioannis Pratikakis

 

Document image binarisation is the initial step of most document image analysis and understanding systems: it converts a grey-scale image into a binary image, aiming to distinguish text areas from background areas. Binarisation plays a key role in document processing, since its performance drastically affects the degree of success in subsequent character segmentation and recognition. Although binarisation is considered a solved problem for modern machine-printed documents, it is not an easy task when processing degraded document images. This motivates the intensive contemporary research efforts on the binarisation of historical document images.
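As a hedged sketch of the kind of global binarisation this tutorial builds on, the following NumPy implementation of Otsu's classic threshold selection is one widely used baseline (the function name and the 8-bit input assumption are choices of this example):

    import numpy as np

    def otsu_binarise(img: np.ndarray) -> np.ndarray:
        """Binarise an 8-bit greyscale image with Otsu's global threshold."""
        hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
        prob = hist / hist.sum()
        best_t, best_var = 0, 0.0
        for t in range(1, 256):
            w0, w1 = prob[:t].sum(), prob[t:].sum()
            if w0 == 0.0 or w1 == 0.0:
                continue
            mu0 = (np.arange(t) * prob[:t]).sum() / w0
            mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
            # Between-class variance; Otsu picks the threshold maximizing it.
            var = w0 * w1 * (mu0 - mu1) ** 2
            if var > best_var:
                best_t, best_var = t, var
        # Dark (text) pixels become 0, light (background) pixels become 255.
        return np.where(img < best_t, 0, 255).astype(np.uint8)

Global thresholds like this degrade under uneven illumination and stains, which is precisely why adaptive and more elaborate methods feature prominently in the historical-document literature.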

This tutorial aims to survey the main trends in the document image binarisation process, combining theory with hands-on demonstrations. Last but not least, established evaluation measures for document image binarisation methods will be presented as the means to study algorithmic behaviour by providing qualitative and quantitative performance indications.
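One established pixel-level measure of this kind is the F-measure against a ground-truth binarisation, used for instance in binarisation contests; the sketch below is a straightforward NumPy version, under the assumption that text pixels are 0.

    import numpy as np

    def binarisation_f_measure(result: np.ndarray, truth: np.ndarray) -> float:
        """F-measure between a binarised result and its ground truth.

        Both images are assumed binary, with text pixels equal to 0.
        """
        res_text, gt_text = result == 0, truth == 0
        tp = np.logical_and(res_text, gt_text).sum()   # text found correctly
        fp = np.logical_and(res_text, ~gt_text).sum()  # spurious text pixels
        fn = np.logical_and(~res_text, gt_text).sum()  # missed text pixels
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall == 0.0:
            return 0.0
        return 2 * precision * recall / (precision + recall)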

 

 

Document Image Normalization and Segmentation

Basilis Gatos

 

The document image processing steps of normalization and segmentation will be analyzed in detail, including representative algorithms and applications. Document digitization with either flatbed scanners or camera-based systems results in document images that often suffer from skew, warping and perspective distortions, which deteriorate the performance of current OCR approaches. Moreover, the document orientation is not known a priori, so the text may be rotated by 90 degrees or even be upside down. To this end, a document image normalization step is imperative in order to restore text areas so that they are horizontally aligned, free of distortions and at zero skew angle. This step also includes slant correction for handwritten documents, where the near-vertical strokes often deviate from the vertical direction. Segmentation is also a major step in the document image processing pipeline; during this step, the main document components (text/graphic areas, text lines, words and characters) are automatically extracted.
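As an illustrative sketch of the skew-correction step, the code below estimates page skew by rotating a binarised image over a range of candidate angles and keeping the one whose horizontal projection profile is sharpest; the function name, the angle range and the use of scipy.ndimage.rotate are assumptions of this example, not the lecture's method.

    import numpy as np
    from scipy import ndimage

    def estimate_skew(binary: np.ndarray, max_angle: float = 5.0,
                      step: float = 0.25) -> float:
        """Estimate page skew in degrees (binary image, text pixels = 1).

        Each candidate angle is scored by the variance of the horizontal
        projection profile: a deskewed page shows sharp peaks on text
        lines and deep valleys between them.
        """
        best_angle, best_score = 0.0, -1.0
        for angle in np.arange(-max_angle, max_angle + step, step):
            rotated = ndimage.rotate(binary, angle, reshape=False, order=0)
            profile = rotated.sum(axis=1)   # amount of ink per image row
            score = float(np.var(profile))  # peaky profile -> high variance
            if score > best_score:
                best_angle, best_score = angle, score
        return best_angle

The page is then rotated by the negative of the estimated angle to restore a zero skew angle.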

 

 

System Level Issues in Document Image Analysis 

Daniel Lopresti

Document image analysis techniques are not used in isolation, but rather to solve specific tasks. The nature of the task places demands on the degree of automation that is required, the minimum acceptable accuracy level, and the kinds of errors that can be tolerated. This has implications for performance evaluation and, ultimately, the success of the system. Here we survey system-level issues in the design and application of document image analysis techniques, including both traditional approaches and the powerful CAVIAR (Computer Assisted Visual InterActive Recognition) paradigm espoused by Professor George Nagy.

 

 


Improving OCR by Using Language Resources

Efstathios Stamatatos

 

The application of OCR software to document images is not always very accurate. Depending on the quality of the document image, the fonts used, the accuracy of the OCR software, etc., there may be mistakes such as the replacement of an “m” with the letters “in” or the replacement of a “b” with the digit “6”. This lecture will focus on the available techniques for handling such errors. More specifically, it will show how the OCR output can be improved by using language resources, including dictionaries and large text corpora. Moreover, language modeling techniques will be presented, and it will be demonstrated how they can be used to detect sequences of words that are unlikely to occur in natural-language documents. Such techniques can guide error detection and correction by suggesting the most likely replacement for certain words according to their context.
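As a minimal sketch of dictionary-based correction (the toy word list, frequency counts and function name below are invented for illustration), an out-of-dictionary OCR token can be replaced by the most frequent dictionary word among its closest string matches:

    from difflib import get_close_matches

    # Toy language resources; a real system would load a full dictionary
    # with word frequencies estimated from a large text corpus.
    DICTIONARY = {"mine": 120, "nine": 80, "dine": 15, "in": 5000}

    def correct_token(token: str) -> str:
        """Return the token unchanged if known, else a likely correction."""
        if token.lower() in DICTIONARY:
            return token
        candidates = get_close_matches(token.lower(), DICTIONARY, n=3, cutoff=0.6)
        if not candidates:
            return token          # nothing plausible: leave the token as-is
        # Prefer the most frequent word among the closest matches.
        return max(candidates, key=DICTIONARY.get)

    # "inine" could be an OCR garbling of "mine" ("m" misread as "in").
    print(correct_token("inine"))  # -> "mine"

A full system would also exploit the surrounding words, scoring candidate corrections with an n-gram language model rather than frequency alone.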

 

 

Digitisation of Historical Documents: Challenges and Methods

Apostolos Antonacopoulos

 

The lecture will cover the background issues, challenges and methods in the analysis of historical documents.

The lecture is broadly divided into two parts. The first part starts with an examination of the different motivations and other institutional factors that influence technical decisions. The types of documents typically encountered are discussed next, along with the challenges and possibilities they offer for digitisation and full-text conversion. Focussing on the needs of major content-holding institutions, the remainder of the first part presents in detail the different stages in full-text conversion. For each of the stages (scanning, image enhancement, segmentation, OCR and post-processing), the challenges and possibilities for improvement are examined.

The second part of the lecture comprises a more technical description of the state of the art in the analysis of historical documents. Major past and current initiatives will be mentioned, and individual methods will be described for each stage in the processing, analysis and recognition of historical documents. Finally, as an essential aspect of measuring and making progress, approaches to the performance evaluation of historical document analysis methods will be presented.

 

 

 

Word Spotting

Ergina Kavallieratou

 

Word spotting is an alternative to OCR in cases where OCR cannot be applied, e.g. when the document image is degraded or when training an OCR system for a specific language is difficult. In such cases, instead of recognizing the characters of the text, a specific word or phrase is localized in a document image collection.

In this lecture, we will see what inspired word spotting in the first place and survey the techniques developed over the last decade. Methodologies implemented for word spotting in document images will be presented in relation to the problems they had to address. Finally, the current issues in word spotting will be analyzed, and other appropriate uses of word spotting will be mentioned.
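As a hedged sketch of the simplest query-by-example idea, a query word image can be matched against a page with normalized cross-correlation; OpenCV's matchTemplate is used here purely for illustration, and the similarity threshold is an assumption.

    import cv2
    import numpy as np

    def spot_word(page: np.ndarray, query: np.ndarray,
                  threshold: float = 0.7) -> list:
        """Return top-left corners where the query word matches the page.

        Both inputs are greyscale images; a real word spotter would
        normalise scale, slant and stroke width instead of relying on
        raw template matching.
        """
        # Normalised cross-correlation of the query at every position.
        scores = cv2.matchTemplate(page, query, cv2.TM_CCOEFF_NORMED)
        ys, xs = np.where(scores >= threshold)
        return list(zip(xs.tolist(), ys.tolist()))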

 

 

Recognition of Textual and Graphical Patterns

Josep Llados

Documents contain two main categories of information, namely text and graphics. The recognition of text is addressed by Optical Character Recognition (OCR). OCR is a mature research topic, and commercial software exists that offers high levels of performance on documents with traditional Manhattan layouts (text blocks strictly oriented horizontally or vertically). Graphics recognition deals with the interpretation of graphical parts (symbols, logos, lines in tables or forms, etc.). In this lecture we will review the main techniques used to recognize both sources of information. The lecture will be structured in two blocks. First, basic OCR techniques will be reviewed, covering the traditional shape descriptors and classifiers used in the literature to recognize machine-printed text. In the second part, we will review the main graphics recognition techniques, such as vectorization and symbol recognition. Finally, we will address the problem of documents combining text and graphics, and review the problem of text-graphics separation.
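To give a flavour of text-graphics separation, the sketch below uses the classic heuristic that text characters form small connected components while graphical elements form large ones; the area threshold is an illustrative assumption, and established methods such as the Fletcher-Kasturi algorithm are considerably more careful.

    import numpy as np
    from scipy import ndimage

    def separate_text_graphics(binary: np.ndarray, max_text_area: int = 500):
        """Split a binary image (ink pixels = 1) into text and graphics layers.

        Heuristic: connected components of up to max_text_area pixels are
        treated as text, larger ones as graphics.
        """
        labels, _ = ndimage.label(binary)
        sizes = np.bincount(labels.ravel())  # pixels per component (0 = background)
        is_text = sizes <= max_text_area
        is_text[0] = False                   # the background is neither layer
        text_layer = is_text[labels]
        graphics_layer = binary.astype(bool) & ~text_layer
        return text_layer, graphics_layer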