Abstract
Automatic transcription of media content not only makes it possible to generate subtitles, but also enables search and retrieval over multimedia streams. One of the most important challenges that transcription systems must deal with is speaker accent variability. In this work we study the importance of accent variability for three broad varieties of Portuguese: African Portuguese, Brazilian Portuguese and European Portuguese. We then propose a multi-variety transcription system that combines variety identification with variety-dependent transcription systems.
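The two-stage design described in the abstract (identify the variety first, then hand the audio to a transcription system built for that variety) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the labels AP/BP/EP, the function `transcribe_multi_variety`, the score-dictionary interface and the fallback to a default variety are assumptions, not the authors' implementation.

```python
from typing import Callable, Dict, Tuple

# Hypothetical labels for African, Brazilian and European Portuguese.
VARIETIES = ("AP", "BP", "EP")

# Assumed interfaces: an identifier returns one score per variety,
# and a transcriber maps audio to text.
Identifier = Callable[[bytes], Dict[str, float]]
Transcriber = Callable[[bytes], str]


def transcribe_multi_variety(
    audio: bytes,
    identify: Identifier,
    transcribers: Dict[str, Transcriber],
    default: str = "EP",
) -> Tuple[str, str]:
    """Route audio to the transcriber of the best-scoring variety.

    Falls back to `default` (an assumption of this sketch) when the
    identifier returns no variety that has a dedicated transcriber.
    """
    scores = identify(audio)
    # Keep only varieties for which a dedicated transcriber exists.
    candidates = {v: s for v, s in scores.items() if v in transcribers}
    variety = max(candidates, key=candidates.get) if candidates else default
    return variety, transcribers[variety](audio)


# Stub components standing in for real identification and recognition models.
def stub_identify(audio: bytes) -> Dict[str, float]:
    return {"AP": 0.1, "BP": 0.7, "EP": 0.2}


stub_transcribers: Dict[str, Transcriber] = {
    v: (lambda audio, v=v: f"<{v} transcript>") for v in VARIETIES
}

result = transcribe_multi_variety(b"", stub_identify, stub_transcribers)
```

With the stub scores above, the Brazilian Portuguese transcriber is selected. A real system would replace the stubs with trained variety identification and recognition models.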
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Abad, A., Meinedo, H., Trancoso, I., Neto, J. (2012). Transcription of Multi-variety Portuguese Media Contents. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_46
DOI: https://doi.org/10.1007/978-3-642-28885-2_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science (R0)