Abstract
Segmentation of text from badly degraded document images is a very challenging task due to the high inter- and intra-image variation between the document background and the foreground text of different document images. We propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient that is tolerant to text and background variation caused by different types of document degradation. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny's edge map to identify the text stroke edge pixels. The document text is then segmented by a local threshold that is estimated from the intensities of the detected text stroke edge pixels within a local window. The proposed method is simple, robust, and requires minimal parameter tuning. It has been tested on the datasets used in the recent Document Image Binarization Contest (DIBCO) 2009 and 2011 and the handwritten DIBCO (H-DIBCO) 2010, where it achieves accuracies of 93.5%, 87.8%, and 92.03%, respectively, which are higher than or close to those of the best-performing methods reported in the three contests. Experiments on the Bickley diary dataset, which consists of several challenging poor-quality document images, also show the superior performance of the proposed method compared with other techniques.
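The three stages described above (adaptive contrast map, edge pixel detection, local thresholding) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the weighted sum with a fixed `alpha`, the scaling of the gradient by 255, the fixed cut-off `t` used in place of a data-driven binarization of the contrast map, the omission of the Canny edge map, and the decision rule "darker than mean + std/2 of the nearby edge pixels" are all assumptions made for illustration. All function and parameter names are hypothetical.

```python
from statistics import mean, pstdev


def window(img, r, c, radius):
    """Pixel coordinates of the square window around (r, c), clipped at borders."""
    rows, cols = len(img), len(img[0])
    return [(y, x)
            for y in range(max(0, r - radius), min(rows, r + radius + 1))
            for x in range(max(0, c - radius), min(cols, c + radius + 1))]


def adaptive_contrast(img, alpha=0.5, radius=1, eps=1e-6):
    """Blend normalised local contrast with local gradient (assumed form).

    img is a list of rows of 8-bit grey levels; alpha is a hypothetical
    fixed weight between the two terms.
    """
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            vals = [img[y][x] for y, x in window(img, r, c, radius)]
            hi, lo = max(vals), min(vals)
            contrast = (hi - lo) / (hi + lo + eps)  # local image contrast
            gradient = (hi - lo) / 255.0            # local gradient, scaled to [0, 1]
            row.append(alpha * contrast + (1 - alpha) * gradient)
        out.append(row)
    return out


def edge_mask(cmap, t=0.3):
    """Binarize the contrast map with a fixed cut-off t (assumption).

    The method also intersects this mask with Canny's edge map;
    that step is omitted here to keep the sketch dependency-free.
    """
    return [[v > t for v in row] for row in cmap]


def binarize(img, edges, radius=2, min_edges=2):
    """Label a pixel as text (1) when enough edge pixels lie nearby and the
    pixel is darker than mean + std/2 of their intensities (assumed rule)."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            vals = [img[y][x] for y, x in window(img, r, c, radius) if edges[y][x]]
            is_text = (len(vals) >= min_edges
                       and img[r][c] < mean(vals) + pstdev(vals) / 2)
            row.append(1 if is_text else 0)
        out.append(row)
    return out


# Toy input: a dark vertical stroke (grey level 40) on a bright background (200).
image = [[200, 200, 200, 40, 200, 200, 200] for _ in range(5)]
stroke_edges = edge_mask(adaptive_contrast(image))
result = binarize(image, stroke_edges)
```

On a real document the window radius would normally be tied to the estimated stroke width and the cut-off would be chosen from the image itself; both are fixed here only to keep the example short.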