OFF-LINE HANDWRITTEN BASIC CHARACTER DATABASES FOR MAJOR INDIAN SCRIPTS

 

 

We have developed two image databases of handwritten isolated basic characters, one for Devanagari and the other for Bangla, at the CVPR Unit of the Indian Statistical Institute. Samples were collected by distributing several standard application forms among different population groups in the state of West Bengal, India. Because data collected through such forms are not evenly distributed among the character classes, a specially designed form consisting of a two-dimensional array of rectangular boxes was also used. Subjects were requested to write a single basic character in each box; no other restriction was imposed on the writers. The purpose of the data collection was not disclosed to them, so that the samples would reflect their natural handwriting styles. In approximately 60% of cases, the same subject was asked to write on both types of forms on two different occasions using his or her own writing instrument. If the subject had no writing instrument, one was supplied at random from a set of instruments of different types. The forms were printed on paper of different brands, and the samples were collected over a span of more than two years.

The filled-in forms were scanned at 300 dpi using an HP flatbed scanner and stored as grayscale images at 1 byte per pixel. Software was used to extract the isolated characters from the individual boxes. Since such software is bound to produce some erroneous results, all TIF files of isolated character images were checked manually in thumbnail view, and characters were re-extracted manually (using an image editor) wherever an error in the automatic extraction was detected.
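The box-wise extraction step can be illustrated with a minimal sketch. This is not the software used to build the databases, whose details are not described here; it only assumes a perfectly regular grid of equal-sized boxes with a known origin, and a page held in memory as rows of 8-bit grayscale values (0 = black, 255 = white). All function names, parameters and thresholds are hypothetical.

```python
from typing import List, Tuple

# (top, left, bottom, right) in pixels; bottom and right are exclusive.
Box = Tuple[int, int, int, int]
Image = List[List[int]]  # rows of 8-bit grayscale values, 1 byte per pixel


def mm_to_pixels(mm: float, dpi: int = 300) -> int:
    """Convert a length on paper to pixels at the given scan resolution."""
    return round(mm * dpi / 25.4)  # 25.4 mm per inch


def grid_boxes(origin_y: int, origin_x: int, box_h: int, box_w: int,
               rows: int, cols: int, margin: int = 0) -> List[Box]:
    """List the boxes of a regular grid in row-major order.

    `margin` shrinks each box on every side, e.g. to skip printed borders.
    """
    boxes = []
    for r in range(rows):
        for c in range(cols):
            top = origin_y + r * box_h + margin
            left = origin_x + c * box_w + margin
            boxes.append((top, left,
                          top + box_h - 2 * margin,
                          left + box_w - 2 * margin))
    return boxes


def crop(image: Image, box: Box) -> Image:
    """Cut one box out of the page image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def is_blank(cell: Image, ink_threshold: int = 128,
             min_ink_pixels: int = 1) -> bool:
    """Flag cells with too few dark pixels as candidates for manual review."""
    ink = sum(1 for row in cell for p in row if p < ink_threshold)
    return ink < min_ink_pixels


# Tiny synthetic page: 2 x 3 grid of 4 x 4 boxes, one dark pixel in box 1.
page = [[255] * 12 for _ in range(8)]
page[1][5] = 0
cells = [crop(page, b) for b in grid_boxes(0, 0, 4, 4, rows=2, cols=3)]
flags = [is_blank(cell) for cell in cells]
```

A real form would also need skew correction and detection of the grid origin before cropping, which is one reason the automatic results still required a manual check.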

 

Ref: U. Bhattacharya, M. Shridhar and S. K. Parui, "On Recognition of Handwritten Bangla Characters", Proceedings of the 5th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2006), Madurai, India, 13-16 December 2006, Springer-Verlag.

 

 

NB: These databases of handwritten basic characters will soon be made available free of cost for academic purposes.

 

 

 
