Link to original content: https://doi.org/10.1007/s11042-022-12673-x

A deep learning based system for writer identification in handwritten Arabic historical manuscripts

Multimedia Tools and Applications

Abstract

Determining the writer or transcriber of historical Arabic manuscripts has always been a major challenge for researchers in the humanities. Advances in pattern recognition and machine learning have made it possible to automate the extraction of paleographical features and so address this problem. This paper presents a baseline system for writer identification, tested on a historical Arabic dataset of 11,610 single and double folio images. These images were extracted from a unique collection of 567 historical Arabic manuscripts available at the Balamand Digital Humanities Center. A survey was conducted of the available Arabic datasets and of previously proposed techniques and algorithms. The Balamand dataset presents an important challenge due to the geo-historical identity of the manuscripts and their physical condition. An advanced deep learning system was developed and tested on three different Latin and Arabic datasets (ICDAR19, ICFHR20 and KHATT) before being tested on the Balamand dataset. The system was compared with many other systems and achieved state-of-the-art performance on the new, challenging images, with 95.2% mean Average Precision (mAP) and 98.1% accuracy.
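The two figures reported above, mAP and accuracy, are standard retrieval metrics. The following is a minimal sketch of how they are commonly computed, not the authors' evaluation code. It assumes each query page yields a ranked list of binary relevance flags (1 if the retrieved page is by the same writer, 0 otherwise), that every relevant page appears somewhere in the ranking, and that "accuracy" means top-1 accuracy. The function names are illustrative.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """Average precision for one query.

    `relevance` is the ranked result list as binary flags
    (assumption: 1 = same writer as the query, 0 = different writer).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Normalise by the number of relevant items found in the ranking.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def top1_accuracy(rankings: Sequence[Sequence[int]]) -> float:
    """Fraction of queries whose first retrieved item is relevant."""
    return sum(1 for r in rankings if r and r[0]) / len(rankings)


if __name__ == "__main__":
    # Two toy queries: the first ranks relevant pages at positions 1 and 3,
    # the second ranks its only relevant page at position 2.
    toy = [[1, 0, 1], [0, 1]]
    print(mean_average_precision(toy), top1_accuracy(toy))
```

mAP rewards placing all pages by the same writer near the top of the ranking, while top-1 accuracy only checks the first result, which is one reason the two numbers can differ.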

Notes

  1. Mention the Arabic corpora in the domain of OCR and handwritten recognition.

  2. These manuscripts were digitized by the Saint Joseph of Damascus Manuscript Conservation Center (http://www.balamandmonastery.org.lb/index.php/about-the-center) and the Digital Humanities Centre (http://iohanes.uob-dh.org/?q=en/tags/digital-humanities).

  3. The total number of digitized pages exceeds the number of photos.

  4. “A statement providing information regarding the date, place, agency, or reason for production of the manuscript or other object” [29].

  5. A frame made of cardboard or occasionally of wood on which cords of various thickness could be stretched, corresponding to the text frame lines and guidelines [17].

References

  1. Abdelhaleem A, Droby A, Asi A, Kassis M, Al Asam R, El-Sana J (2017) WAHD: a database for writer identification of Arabic historical documents. In: 2017 1st International workshop on Arabic script analysis and recognition (ASAR), pp 64–68. IEEE

  2. Abdelazeem S, El-Sherif E (2008) Arabic handwritten digit recognition. Int J Doc Anal Recogn (IJDAR) 11:127–141

  3. Asi A, Abdalhaleem A, Fecker D, Märgner V, El-Sana J (2017) On writer identification for Arabic historical manuscripts. Int J Doc Anal Recogn (IJDAR) 20:173–187

  4. Awaida S, Mahmoud S (2011) Writer identification of Arabic handwritten digits. In: First international workshop on frontiers in Arabic handwriting recognition, 2010

  5. Awaida SM, Mahmoud SA (2012) State of the art in off-line writer identification of handwritten text and survey of writer identification of Arabic text. Educ Res Rev 7:445

  6. Bausi A, Borbone PG, Briquel-Chatonnet F, Buzi P, Gippert J, Macé C, Melissakēs Z, Parodi LE, Witakowski W, Sokolinski E (2015) Comparative Oriental manuscript studies: an introduction. COMSt

  7. Chammas M, Makhoul A, Demerjian J (2020) Writer identification for historical handwritten documents using a single feature extraction method. In: 19th IEEE International conference on machine learning and applications (ICMLA 2020)

  8. Chandra K, Kapoor G, Kohli R, Gupta A (2016) Improving software quality using machine learning. In: 2016 international conference on innovation and challenges in cyber security (ICICCS-INBUSH), pp 115–118. IEEE

  9. Chaurasia P, Kohli R, Garg A (2014) Biometrics minutiae detection and feature extraction. LAP LAMBERT Academic Publishing

  10. Chen S, Wang Y, Lin C-T, Ding W, Cao Z (2019) Semi-supervised feature learning for improving writer identification. Inform Sci 482:156–170

  11. Christlein V, Bernecker D, Hönig F, Angelopoulou E (2014) Writer identification and verification using GMM supervectors. IEEE Winter Conference on Applications of Computer Vision

  12. Christlein V, Bernecker D, Hönig F, Maier A, Angelopoulou E (2017) Writer identification using GMM supervectors and Exemplar-SVMs. Pattern Recogn 63:258–267

  13. Christlein V, Gropp M, Fiel S, Maier A (2017) Unsupervised feature learning for writer identification and writer retrieval. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR)

  14. Christlein V, Maier A (2018) Encoding CNN activations for writer recognition. In: 2018 13th IAPR international workshop on document analysis systems (DAS)

  15. Christlein V, Nicolaou A, Seuret M, Stutzmann D, Maier A (2019) ICDAR 2019 competition on image retrieval for historical handwritten documents. arXiv [cs.CV]

  16. Déroche F, Rossi VS (2012) The manuscripts in Arabic characters. Viella

  17. Déroche F et al (2005) Islamic codicology. An Introduction to the Study of Manuscripts in Arabic Script

  18. Djeddi C, Souici-Meslati L (2011) Artificial immune recognition system for Arabic writer identification. In: International symposium on innovations in information and communications technology, pp 159–165. IEEE

  19. Fecker D, Asi A, Pantke W, Märgner V, El-Sana J, Fingscheidt T (2014) Document writer analysis with rejection for historical Arabic manuscripts. In: 2014 14th international conference on frontiers in handwriting recognition, pp 743–748. IEEE

  20. Fecker D, Asi A, Märgner V, El-Sana J, Fingscheidt T (2014) Writer identification for historical Arabic documents. In: 2014 22nd International conference on pattern recognition, pp 3050–3055. IEEE

  21. Fiel S, Sablatnig R (2015) Writer identification and retrieval using a convolutional neural network. Computer Analysis of Images and Patterns, 26–37

  22. Hannad Y, Siddiqi I, Djeddi C, El-Kettani ME-Y (2019) Improving Arabic writer identification using score-level fusion of textural descriptors. IET Biometr 8:221–229

  23. Lai S, Zhu Y, Jin L (2020) Encoding pathlet and SIFT features with bagged VLAD for historical writer identification. IEEE Trans Inform Forens Secur 15:3553–3566

  24. Mahmoud SA, Ahmad I, Al-Khatib WG, Alshayeb M, Parvez MT, Märgner V, Fink GA (2014) KHATT: an open Arabic offline handwritten text database. Pattern Recogn 47:1096–1112

  25. Mahmoud SA, Ahmad I, Alshayeb M, Al-Khatib WG, Parvez MT, Fink GA, Märgner V, El Abed H (2012) KHATT: Arabic offline handwritten text database. In: 2012 International conference on frontiers in handwriting recognition, pp 449–454. IEEE

  26. Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-SVMs for object detection and beyond. In: 2011 International conference on computer vision, vol 2011

  27. Nguyen HT, Nguyen CT, Ino T, Indurkhya B, Nakagawa M (2019) Text-independent writer identification using convolutional neural network. Pattern Recogn Lett 121:104–112

  28. Pechwitz M, Maddouri S, Märgner V, Ellouze N, Amiri H (2002) IFN/ENIT: database of handwritten Arabic words

  29. P5: Guidelines for electronic text encoding and interchange. https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-colophon.html. Accessed December 10th 2021

  30. Rehman A, Naz S, Razzak MI (2019) Writer identification using machine learning approaches: a comprehensive review. Multimed Tools Appl 78:10889–10931

  31. Seuret M, Nicolaou A, Maier A, Christlein V, Stutzmann D (2020) ICFHR 2020 competition on image retrieval for historical handwritten fragments. In: 2020 17th International conference on frontiers in handwriting recognition (ICFHR), pp 216–221. IEEE

  32. Slimane F, Awaida S, Mezghani A, Parvez MT, Kanoun S, Mahmoud SA, Märgner V (2014) ICFHR2014 competition on Arabic writer identification using AHTID/MW and KHATT databases. In: 2014 14th international conference on frontiers in handwriting recognition, pp 797–802. IEEE

  33. The Arabic Manuscripts in the Antiochian Orthodox Monasteries in Lebanon volume 1–2. University of Balamand

Acknowledgements

This research is funded by the EIPHI Graduate School (contract “ANR-17-EURE-0002”). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro RTX 6000 GPU used for this research.

Author information

Corresponding author

Correspondence to Michel Chammas.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chammas, M., Makhoul, A., Demerjian, J. et al. A deep learning based system for writer identification in handwritten Arabic historical manuscripts. Multimed Tools Appl 81, 30769–30784 (2022). https://doi.org/10.1007/s11042-022-12673-x

  • DOI: https://doi.org/10.1007/s11042-022-12673-x
