Our database of offline handwritten Bangla basic characters consists of 37,858 sample images. These samples are not evenly distributed over the 50 possible classes. The main reason for the non-uniform distribution is that a major part of the data was collected using several standard forms in which the entries are proper nouns, and several characters of the Bangla alphabet are rarely used in proper nouns. This problem has been partially addressed by asking the subjects to reproduce the Bangla alphabet set on a form consisting of equal rectangles, one for each basic character. The training set consists of 500 samples for each class. The test set consists of the remaining samples, so its per-class counts are not uniform. We do not explicitly provide a validation set.
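The split sizes follow from the figures above, and a validation set can be held out from the training samples if one is needed. The following Python code is only a minimal sketch, not part of the database distribution. It assumes that every one of the 37,858 images belongs to either the training or the test set. The function name `stratified_holdout`, the choice of 50 validation samples per class, and the stand-in integer labels are illustrative assumptions.

```python
import random
from collections import defaultdict

# Figures stated in the description above.
TOTAL_SAMPLES = 37_858
NUM_CLASSES = 50
TRAIN_PER_CLASS = 500

# Sizes implied by the description (assuming train and test are disjoint
# and together cover the whole database).
train_size = NUM_CLASSES * TRAIN_PER_CLASS   # 50 * 500
test_size = TOTAL_SAMPLES - train_size       # remaining samples


def stratified_holdout(labels, val_per_class, seed=0):
    """Split sample indices into (train, val), taking val_per_class
    randomly chosen samples from each class for validation.

    Hypothetical helper: the database itself defines no validation set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        val_idx.extend(indices[:val_per_class])
        train_idx.extend(indices[val_per_class:])
    return sorted(train_idx), sorted(val_idx)


# Stand-in labels: class ids 0..49, 500 training samples each.
# Real labels would come from the training files of the database.
labels = [c for c in range(NUM_CLASSES) for _ in range(TRAIN_PER_CLASS)]
train_idx, val_idx = stratified_holdout(labels, val_per_class=50)

print(train_size, test_size, len(train_idx), len(val_idx))
# prints: 25000 12858 22500 2500
```

Holding out the same number of samples from every class keeps the reduced training set balanced. The test set should be left untouched so that results remain comparable.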
Ref: U. Bhattacharya, M. Sridhar, S. K. Parui, P. K. Sen and B. B. Chaudhuri, "Offline recognition of handwritten Bangla characters - an efficient two-stage approach", communicated to an international journal.