An End-to-End deep learning system for writer identification in handwritten Arabic manuscripts

Chammas, Michel; Makhoul, Abdallah; Demerjian, Jacques; Dannaoui, Elie

doi:10.1007/s11042-023-17303-8

Published: 06 December 2023

Volume 83, pages 54569–54589, (2024)

Multimedia Tools and Applications

Michel Chammas (ORCID: orcid.org/0000-0003-3261-9417)^1,2, Abdallah Makhoul^3, Jacques Demerjian^4,5 & Elie Dannaoui^1

Abstract

Extracting paleographical features is an important step in establishing the identity of a text in historical manuscripts, and one of the most significant of these features is the identity of the writer or copyist. Many researchers have worked on automated writer identification, and the development of deep learning techniques has led to many proposed approaches. Most previous studies developed multi-step systems, while very few adopted an End-to-End approach, and most rely on a pre-processing step that prepares the data to facilitate recognition. This paper presents an End-to-End deep learning system for writer identification, tested on four datasets: ICDAR19 and ICFHR20 (Latin datasets), and KHATT and Balamand (Arabic datasets). The system is based on the Deep-TEN approach: a customized ResNet-50 network extracts features and local descriptors, and an integrated NetVLAD end-layer computes and encodes the global descriptor. It was compared with our state-of-the-art system, the winner of the ICFHR20 HisFrag competition, and showed interesting performance on all datasets without any pre-processing techniques.
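
To make the encoding step in the abstract more concrete, the following is a minimal sketch of a NetVLAD-style aggregation that turns a set of local descriptors into one global descriptor. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `alpha` value, and the toy inputs are hypothetical, the cluster centers are fixed rather than learned by backpropagation, and the soft assignment is tied to the centers (the distance-based form), whereas a trained NetVLAD layer learns its assignment parameters. In the described system the local descriptors would come from the ResNet-50 backbone; here they are plain lists of floats.

```python
import math
from typing import List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float], eps: float = 1e-12) -> Vector:
    """Scale v to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def soft_assign(x: Sequence[float],
                centers: Sequence[Sequence[float]],
                alpha: float) -> Vector:
    """Softmax over -alpha * squared distance from x to each cluster center."""
    logits = [-alpha * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
              for c in centers]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def vlad_encode(descriptors: Sequence[Sequence[float]],
                centers: Sequence[Sequence[float]],
                alpha: float = 10.0) -> Vector:
    """Aggregate local descriptors into a single K*D global descriptor."""
    k, d = len(centers), len(centers[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        weights = soft_assign(x, centers, alpha)
        for j, (w, c) in enumerate(zip(weights, centers)):
            for i in range(d):
                # accumulate the softly weighted residual x - c_j
                residuals[j][i] += w * (x[i] - c[i])
    # intra-normalize each cluster's residual, flatten, then L2-normalize
    flat = [value for row in residuals for value in l2_normalize(row)]
    return l2_normalize(flat)


# Toy usage: three 2-D local descriptors, two assumed cluster centers.
toy_descriptors = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.8]]
toy_centers = [[0.0, 0.0], [1.0, 1.0]]
global_descriptor = vlad_encode(toy_descriptors, toy_centers)
```

The result has one residual block per cluster, so its length is the number of centers times the descriptor dimension, and it has unit length, which makes page-level descriptors directly comparable by cosine or Euclidean distance.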

Data availability

The Balamand Historical Manuscripts data that support the findings of this study are available from the Digital Humanities Center, University of Balamand, but restrictions apply to their availability: the data were used under licence for the current study and so are not publicly available. The data are, however, available from the authors upon reasonable request and with the permission of the Digital Humanities Center, University of Balamand. The ICDAR19, ICFHR20 and KHATT datasets analysed during the current study are available in the following repositories: ICDAR19 (https://clamm.irht.cnrs.fr/icdar2019-hdrc-ir/), ICFHR20 (https://lme.tf.fau.de/competitions/hisfragir20-icfhr-2020-competition-on-image-retrieval-for-historical-handwritten-fragments/), KHATT (http://khatt.ideas2serve.net/index.php).

Notes

http://pavone.uob-dh.org/
Mention the Arabic corpora in the domain of OCR and handwritten recognition.
These manuscripts were digitized by Saint Joseph of Damascus Manuscript Conservation Center (http://www.balamandmonastery.org.lb/index.php/about-the-center) and the Digital Humanities Centre (http://iohanes.uob-dh.org/?q=en/tags/digital-humanities).
The total number of digitized pages exceeds the number of photos.
“A statement providing information regarding the date, place, agency, or reason for production of the manuscript or other object” [23]
A frame made of cardboard or occasionally of wood on which cords of various thickness could be stretched, corresponding to the text frame lines and guidelines [20].
https://github.com/michelchammas/BalamandArabicHistoricalDataset

References

Group of authors (1991) Arabic manuscripts in the Antiochian Orthodox Monasteries in Lebanon. Balamand University Publications Series: Arabic manuscripts. https://iohanes.uob-dh.org/?q=en/content/al-makhtutat-al-arabiyah-fi-al-adyirah-al-orthodoxiyah-al-intakiyah-fi-lubnan-v-1-arabic
Arandjelovic R, Gronat P, Torii A, Pajdla T, Sivic J (2016) Netvlad: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5297–5307
Asi A, Abdalhaleem A, Fecker D, Märgner V, El-Sana J (2017) On writer identification for arabic historical manuscripts. Int J Doc Anal Recognit (IJDAR) 20:173–187
Bausi A, Borbone PG, Briquel-Chatonnet F, Buzi P, Gippert J, Macé C, Melissakēs Z, Parodi LE, Witakowski W, Sokolinski E (2015) Comparative oriental manuscript studies: an introduction. COMSt
Buduroh M, Pudjiastuti T (2017) Colophon in the hikayat pandawa manuscript. In: Cultural dynamics in a globalized world, Routledge, pp 517–521
Bulacu M, Schomaker L (2007) Text-independent writer identification and verification using textural and allographic features. IEEE Trans Pattern Anal Mach Intell 29:701–717
Chammas M, Dannaoui E (2020) Towards adaptive corpora for digital humanists: new approach to digital scholarly editions
Chammas M, Makhoul A, Demerjian J (2020) Writer identification for historical handwritten documents using a single feature extraction method. In: 19th IEEE international conference on machine learning and applications (ICMLA 2020)
Chammas M, Makhoul A, Demerjian J, Dannaoui E (2022) A deep learning based system for writer identification in handwritten arabic historical manuscripts. Multimed Tools Appl 1–16
Chandra K, Kapoor G, Kohli R, Gupta A (2016) Improving software quality using machine learning. In: 2016 international conference on innovation and challenges in cyber security (ICICCS-INBUSH), pp 115–118
Chaurasia P, Kohli R, Garg A (2014) Biometrics minutiae detection and feature extraction. LAP LAMBERT Academic Publishing
Chen S, Wang Y, Lin C-T, Ding W, Cao Z (2019) Semi-supervised feature learning for improving writer identification. Inf Sci 482:156–170
Christlein V, Bernecker D, Maier A, Angelopoulou E (2015) Offline writer identification using convolutional neural network activation features. In: German conference on pattern recognition, Springer, pp 540–552
Christlein V, Gropp M, Fiel S, Maier A (2017a) Unsupervised feature learning for writer identification and writer retrieval. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR)
Christlein V, Michel V, Bunke H (2017) Handwriting identification using scale-invariant feature transform and universal background model. Pattern Recognit Lett 92:1–8
Christlein V, Nicolaou A, Seuret M, Stutzmann D, Maier A (2019) ICDAR 2019 competition on image retrieval for historical handwritten documents, arXiv [cs.CV]
Cilia ND, De Stefano C, Fontanella F, Marrocco C, Molinara M, Di Freca AS (2020) An end-to-end deep learning system for medieval writer identification. Pattern Recognit Lett 129:137–143
Cilia ND, De Stefano C, Fontanella F, Marrocco C, Molinara M, Di Freca AS (2020) An experimental comparison between deep learning and classical machine learning approaches for writer identification in medieval documents. J Imag 6:89
Colavizza G, Ehrmann M, Bortoluzzi F (2019) Index-driven digitization and indexation of historical archives. Front Digit Humanit 6:4
Déroche F et al (2005) Islamic codicology: an introduction to the study of manuscripts in arabic script
He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37:1904–1916
He S, Schomaker L (2021) GR-RNN: global-context residual recurrent neural networks for writer identification. Pattern Recognit 117:107975
Initiative TE (2022) P5: guidelines for electronic text encoding and interchange. TEI Element Colophon. https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-colophon.html
Jordan S, Seuret M, Král P, Lenc L, Martínek J, Wiermann B, Schwinger T, Maier A, Christlein V (2020) Re-ranking for writer identification and writer retrieval. In: International workshop on document analysis systems, Springer, pp 572–586
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
Larson RR (2010) Introduction to information retrieval
Liang D, Wu M (2020) A multi-patch deep learning system for text-independent writer identification. In: International conference on security, privacy and anonymity in computation, communication and storage, Springer, pp 409–419
Liang D, Wu M, Hu Y (2021) Offline writer identification using convolutional neural network and vlad descriptors. In: International conference on artificial intelligence and security, Springer, pp 253–264
Mahmoud SA, Ahmad I, Al-Khatib WG, Alshayeb M, Parvez MT, Märgner V, Fink GA (2014) KHATT: an open arabic offline handwritten text database. Pattern Recognit 47:1096–1112
Mahmoud SA, Ahmad I, Alshayeb M, Al-Khatib WG, Parvez MT, Fink GA, Märgner V, El Abed H (2012) KHATT: arabic offline handwritten text database. In: 2012 international conference on frontiers in handwriting recognition, pp 449–454
Malisiewicz T, Gupta A, Efros AA (2011) Ensemble of exemplar-SVMs for object detection and beyond. In: 2011 international conference on computer vision
Marinai S, Gori M, Soda G (2005) Artificial neural networks for document analysis and recognition. IEEE Trans Pattern Anal Mach Intell 27:23–35
Ngo TT, Nguyen HT, Nakagawa M (2021) A-vlad: an end-to-end attention-based neural network for writer identification in historical documents. In: International conference on document analysis and recognition, Springer, pp 396–409
Nguyen HT, Nguyen CT, Ino T, Indurkhya B, Nakagawa M (2019) Text-independent writer identification using convolutional neural network. Pattern Recognit Lett 121:104–112
Nicolaou A, Dey S, Christlein V, Maier A, Karatzas D (2018) Non-deterministic behavior of ranking-based metrics when evaluating embeddings. In: International workshop on reproducible research in pattern recognition, Springer, pp 71–82
Rasoulzadeh S, BabaAli B (2020) Writer identification and writer retrieval based on netvlad with re-ranking. arXiv:2012.06186
Rehman A, Naz S, Razzak MI (2019) Writer identification using machine learning approaches: a comprehensive review. Multimed Tools Appl 78:10889–10931
Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 28
Déroche F, Rossi VS (2012) The manuscripts in Arabic characters. Viella
Saleem S, Mohsin Abdulazeez A (2021) Hybrid trainable system for writer identification of arabic handwriting. Comput Mater Contin
Schroff F, Kalenichenko D, Philbin J (2015) Facenet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 815–823
Semma A, Hannad Y, Siddiqi I, Djeddi C, El Youssfi El Kettani M (2021) Writer identification using deep learning with fast keypoints and harris corner detector. Expert Syst Appl 184:115473
Seuret A, Chum O, Christlein V, Michel V, Bunke H (2020) ICDAR 2020 competition on historical document writer identification. Int J Doc Anal Recognit (IJDAR) 23:511–526
Seuret M, Nicolaou A, Maier A, Christlein V, Stutzmann D (2020b) ICFHR 2020 competition on image retrieval for historical handwritten fragments. In: 2020 17th international conference on frontiers in handwriting recognition (ICFHR), pp 216–221
Sun X, Nasrabadi NM, Tran TD (2019) Supervised deep sparse coding networks for image classification. IEEE Trans Image Process 29:405–418
Uhlíř Z (2008) Digitization is not only making images: manuscript studies and digital processing of manuscripts. Knygotyra 51:148–162
Wang Z, Maier A, Christlein V (2021) Towards end-to-end deep learning-based writer identification. INFORMATIK 2020
Xiao F, Kuang R, Ou Z, Xiong B (2019) Deepmen: multi-model ensemble network for b-lymphoblast cell classification. In: ISBI 2019 C-NMC challenge: classification in cancer cell imaging, Springer, pp 83–93
Yang W, Jin L, Liu M (2016) Deepwriterid: An end-to-end online text-independent writer identification system. IEEE Intell Syst 31:45–53
Zhang H, Xue J, Dana K (2017) Deep ten: texture encoding network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 708–717
Zhang X-Y, Xie G-S, Liu C-L, Bengio Y (2016) End-to-end online writer identification with recurrent neural network. IEEE Trans Human-Mach Syst 47:285–292

Acknowledgements

This research is funded by the EIPHI Graduate School (contract “ANR-17-EURE-0002”). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro RTX 6000 GPU used for this research.

Author information

Authors and Affiliations

Digital Humanities Center, University of Balamand, El-Koura, Lebanon
Michel Chammas & Elie Dannaoui
Computer Science Department, Faculty of Arts and Sciences, University of Balamand, El-Koura, Lebanon
Michel Chammas
Femto-ST Institute, UMR CNRS 6174, Université de Bourgogne Franche-Comté, Montbéliard, France
Abdallah Makhoul
LaRRIS, Faculty of Sciences, Lebanese University, Fanar, Lebanon
Jacques Demerjian
Computer Science & IT Department, Faculty of Arts and Sciences, Holy Spirit University of Kaslik (USEK), Jounieh, Lebanon
Jacques Demerjian

Corresponding author

Correspondence to Michel Chammas.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chammas, M., Makhoul, A., Demerjian, J. et al. An End-to-End deep learning system for writer identification in handwritten Arabic manuscripts. Multimed Tools Appl 83, 54569–54589 (2024). https://doi.org/10.1007/s11042-023-17303-8

Received: 03 October 2022
Revised: 11 August 2023
Accepted: 22 September 2023
Published: 06 December 2023
Issue Date: May 2024
DOI: https://doi.org/10.1007/s11042-023-17303-8
