Word Extraction and Character Segmentation from Text Lines of Unconstrained Handwritten Bangla Document Images

Journal of Intelligent Systems 20 (3):227-260 (2011)

Abstract

This paper reports a novel approach to word extraction and character segmentation from handwritten Bangla document images. First, a modified Run Length Smoothing Algorithm (RLSA), called the Spiral Run Length Smearing Algorithm (SRLSA), is applied to extract words from the text lines of unconstrained handwritten Bangla document images. This technique helps overcome some of the drawbacks of the standard horizontal and vertical RLSA techniques. The SRLSA technique was applied to the Bangla handwritten document image database CMATERdb1.1.1, where the success rate of word extraction is found to be 86.01%. The second part of the work presents a solution to the problem of how best to segment word images of handwritten Bangla script into their constituent characters. The technique can also segment words that have discontinuities in the Matra, a prominent feature of Bangla script. It optimizes the trade-off between under-segmentation and over-segmentation because the Matra region and the segmentation points are estimated more precisely. As a result, better word segmentation accuracy is achieved with minimal data loss. A success rate of 92.48% is observed on a dataset of 750 handwritten Bangla words, which is 3.35% higher than that of our earlier techniques.
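
For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the standard horizontal RLSA applied to a binarized text line, followed by a column projection to recover word spans. It is not the SRLSA described in the paper, whose details are not given here. The function names, the 0/1 pixel convention (1 = ink), and the threshold value are assumptions made for illustration only.

```python
def horizontal_rlsa(row, threshold):
    """Standard horizontal RLSA on one pixel row (1 = ink, 0 = background).

    Background runs of length <= threshold that lie between two ink
    pixels are filled with ink; longer runs are left untouched.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen
    for i, px in enumerate(row):
        if px == 1:
            if last_ink is not None:
                gap = i - last_ink - 1
                if 0 < gap <= threshold:
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def runs_of_ink(profile):
    """Return (start, end) column indices of each maximal run of 1s."""
    spans, start = [], None
    for i, px in enumerate(profile):
        if px and start is None:
            start = i
        elif not px and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(profile) - 1))
    return spans


def extract_word_spans(line_image, threshold):
    """Smear every row, then mark a column as ink if any row has ink there.

    Each resulting run of ink columns is treated as one candidate word.
    """
    smeared = [horizontal_rlsa(row, threshold) for row in line_image]
    width = len(line_image[0])
    profile = [1 if any(r[c] for r in smeared) else 0 for c in range(width)]
    return runs_of_ink(profile)


# Hypothetical 2-row text line: two strokes close together, one far away.
line = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
]
print(extract_word_spans(line, threshold=2))  # [(0, 4), (10, 11)]
```

With a threshold of 2, the two-pixel gap inside the first word is bridged while the five-pixel gap between words is preserved. A purely horizontal pass like this one depends on a well-chosen threshold and on roughly horizontal writing, which is the kind of limitation that motivates modified variants such as the one the abstract describes.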
