Abstract
Scanned documents are often degraded by content printed on the reverse side of the page. This degradation, known as the show-through effect, occurs because paper is partially transparent, so the back-side image interferes with the front-side image. Removing it matters especially for Optical Character Recognition (OCR) and document digitization, which require clean text as input. In this paper, we first propose a novel and general nonlinear model for canceling show-through. A nonlinear blind source separation algorithm, based on a new recursive and extendible structure, is used for this purpose. However, the results are limited by a blurring effect that arises during scanning from the light transfer function of the paper. To improve the results, we therefore introduce a refined separating architecture that removes the show-through and blurring effects simultaneously.
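To make the idea of a recursive separating structure concrete, the sketch below assumes a linear-quadratic mixing model, which the article title suggests but the abstract does not spell out. The coefficient names (`a12`, `a21`, `b1`, `b2`), the function names, and the choice of treating the coefficients as known are all illustrative assumptions: a blind algorithm would have to estimate them, and the blurring effect is not modeled here.

```python
import numpy as np

def mix(s1, s2, a12, a21, b1, b2):
    """Assumed linear-quadratic mixture of front (s1) and back (s2) images."""
    x1 = s1 + a12 * s2 + b1 * s1 * s2
    x2 = s2 + a21 * s1 + b2 * s1 * s2
    return x1, x2

def recurrent_unmix(x1, x2, a12, a21, b1, b2, n_iter=50):
    """Invert the assumed mixture by fixed-point iteration.

    Each pass feeds the current estimates back into the structure.
    The coefficients are taken as known here (hypothetical, non-blind).
    """
    y1, y2 = x1.copy(), x2.copy()
    for _ in range(n_iter):
        y1_new = x1 - a12 * y2 - b1 * y1 * y2
        y2_new = x2 - a21 * y1 - b2 * y1 * y2
        y1, y2 = y1_new, y2_new
    return y1, y2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s1 = rng.random((8, 8))  # toy front-side image, values in [0, 1)
    s2 = rng.random((8, 8))  # toy back-side image, values in [0, 1)
    coeffs = dict(a12=0.3, a21=0.2, b1=0.1, b2=0.1)
    x1, x2 = mix(s1, s2, **coeffs)
    y1, y2 = recurrent_unmix(x1, x2, **coeffs)
    print("max reconstruction error:", np.abs(y1 - s1).max())
```

The iteration only converges when the interference terms are small enough for the update to be a contraction, as with the toy coefficients above. Larger coefficients may make it diverge.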
Additional information
This work has been partially funded by the Iran Telecom Research Center (ITRC), by the Iran National Science Foundation (INSF), and also by the Center for International Research and Collaboration (ISMO) and the French Embassy in Tehran, Iran, in the framework of a Gundi-Shapour collaboration program.
About this article
Cite this article
Merrikh-Bayat, F., Babaie-Zadeh, M. & Jutten, C. Linear-quadratic blind source separating structure for removing show-through in scanned documents. IJDAR 14, 319–333 (2011). https://doi.org/10.1007/s10032-010-0131-7