Evaluation Data For Binarization Algorithms
An ideal evaluation method would decide, for each pixel, whether it ended up with the correct color (black or white) after binarization. This is an easy task for a human observer, but it is very difficult for a computer to perform automatically for all the pixels of several images.
Two sets of 150 document images each were built using image mosaicing and superimposing techniques for blending. In more detail, we used PDF documents as the target images and resized all the noisy images to A4 size. We then blended them using two different techniques: the maximum intensity and the image averaging approaches.
In the maximum intensity technique (max_int), the new image was constructed by picking, for each pixel in the new image, the darker of the corresponding pixels in the two images.
These synthetic images are here.
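The max_int rule described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code used to build the database: the function name blend_darkest and the tiny sample arrays are hypothetical, and it assumes 8-bit grayscale images of identical size in which 0 is black and 255 is white, so that the darker pixel is the one with the smaller value.

```python
import numpy as np


def blend_darkest(document: np.ndarray, noise: np.ndarray) -> np.ndarray:
    """Keep, for each pixel, the darker of the two corresponding pixels."""
    if document.shape != noise.shape:
        raise ValueError("both images must have the same shape")
    # Assumption: 0 is black and 255 is white, so "darker" means "smaller".
    return np.minimum(document, noise)


# Hypothetical 2x2 images: a clean document and a noisy background.
document = np.array([[255, 0], [255, 255]], dtype=np.uint8)
noise = np.array([[200, 120], [90, 255]], dtype=np.uint8)

blended = blend_darkest(document, noise)
print(blended)
```

With this rule the text pixels of the clean document always survive at full darkness, while dark noise pixels show through wherever the document is lighter.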
In the image averaging technique (ave_int), each pixel in the new image is the average of the two corresponding pixels in the original images.
These synthetic images are here.
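The ave_int rule can be sketched in the same way. Again, this is an illustrative sketch under stated assumptions rather than the original implementation: the function name blend_average and the sample arrays are hypothetical, the images are assumed to be 8-bit grayscale arrays of identical size, and rounding down to the nearest integer is an assumption, since the rounding rule is not specified here.

```python
import numpy as np


def blend_average(document: np.ndarray, noise: np.ndarray) -> np.ndarray:
    """Replace each pixel with the mean of the two corresponding pixels."""
    if document.shape != noise.shape:
        raise ValueError("both images must have the same shape")
    # Widen to 16 bits first so that the sum of two 8-bit values cannot overflow.
    total = document.astype(np.uint16) + noise.astype(np.uint16)
    # Assumption: the average is rounded down to the nearest integer.
    return (total // 2).astype(np.uint8)


# Hypothetical 2x2 images: a clean document and a noisy background.
document = np.array([[255, 0], [255, 255]], dtype=np.uint8)
noise = np.array([[200, 120], [90, 255]], dtype=np.uint8)

averaged = blend_average(document, noise)
print(averaged)
```

Unlike the darkest-pixel rule, averaging lightens the text wherever the background is lighter than the text, so the foreground loses contrast.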
You can find more information about the noise in the images, the techniques, and existing results in the paper:
The database has also been used in the contests:
ICFHR 2010 Contest: Quantitative Evaluation of Binarization Algorithms
ICDAR 2011: Specific Document Analysis Algorithm Contributions in End-to-End Applications
For any remarks please contact: kavallieratou at aegean.gr