Ref: (i) Chandan Biswas, Partha Sarathi Mukherjee, Koyel Ghosh, Ujjwal Bhattacharya, Swapan K. Parui, A Hybrid Deep Architecture for Robust Recognition of Text Lines of Degraded Printed Documents, Proc. of ICPR, pp. 3174-3179, 2018.
ISIDDI consists of images of 535 pages from 15 old printed Bangla books. The books were collected from three sources: (i) the Indian Statistical Institute library, (ii) the Old Book Market at College Street, Kolkata, India, and (iii) a Public Library of India. From the books obtained from sources (i) and (ii), pages affected by various types of degradation were identified before scanning. We scanned these degraded pages on a flatbed scanner at 300 dpi and stored them as color images in both uncompressed TIF and JPG formats. Similarly, from source (iii) we identified and downloaded several printed Bangla pages showing one or more types of degradation. Because these books were written over a long historical period, both the form of the Bangla language and the fonts of the printed script vary widely.
(ii) A. Chaudhury, P. S. Mukherjee, S. Das, C. Biswas, and U. Bhattacharya, A Deep OCR for Degraded Bangla Documents, ACM Trans. Asian Low-Resour. Lang. Inf. Process., 2022 (available online).
ISIDDI2 consists of 139 scanned pages of severely degraded old documents.
Application form for obtaining "ISI Degraded_Document_Image_Database"