INTERSPEECH 2004: Jeju Island, Korea
- 8th International Conference on Spoken Language Processing, INTERSPEECH-ICSLP 2004, Jeju Island, Korea, October 4-8, 2004. ISCA 2004
Plenary Talks
- Chin-Hui Lee:
From decoding-driven to detection-based paradigms for automatic speech recognition. - Hyun-Bok Lee:
In search of a universal phonetic alphabet - theory and application of an organic visible speech. - Jacqueline Vaissière:
From X-ray or MRI data to sounds through articulatory synthesis: towards an integrated view of the speech communication process.
Speech Recognition - Adaptation
- Sreeram Balakrishnan, Karthik Visweswariah, Vaibhava Goel:
Stochastic gradient adaptation of front-end parameters. 1-4 - Antoine Raux, Rita Singh:
Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributions. 5-8 - Chao Huang, Tao Chen, Eric Chang:
Transformation and combination of hidden Markov models for speaker selection training. 9-12 - Brian Kan-Wing Mak, Roger Wend-Huu Hsiao:
Improving eigenspace-based MLLR adaptation by kernel PCA. 13-16 - Nikos Chatzichrisafis, Vassilios Digalakis, Vassilios Diakoloukas, Costas Harizakis:
Rapid acoustic model development using Gaussian mixture clustering and language adaptation. 17-20 - Karthik Visweswariah, Ramesh A. Gopinath:
Adaptation of front end parameters in a speech recognizer. 21-24 - Diego Giuliani, Matteo Gerosa, Fabio Brugnara:
Speaker normalization through constrained MLLR based transforms. 2893-2896 - Xiangyu Mu, Shuwu Zhang, Bo Xu:
Multi-layer structure MLLR adaptation algorithm with subspace regression classes and tying. 2897-2900 - Georg Stemmer, Stefan Steidl, Christian Hacker, Elmar Nöth:
Adaptation in the pronunciation space for non-native speech recognition. 2901-2904 - Xuechuan Wang, Douglas D. O'Shaughnessy:
Robust ASR model adaptation by feature-based statistical data mapping. 2905-2908 - Zhaobing Han, Shuwu Zhang, Bo Xu:
A novel target-driven generalized JMAP adaptation algorithm. 2909-2912 - Brian Mak, Simon Ka-Lung Ho, James T. Kwok:
Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA. 2913-2916 - Hyung Bae Jeon, Dong Kook Kim:
Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognition. 2917-2920 - Wei Wang, Stephen A. Zahorian:
Vocal tract normalization based on spectral warping. 2921-2924 - Koji Tanaka, Fuji Ren, Shingo Kuroiwa, Satoru Tsuge:
Acoustic model adaptation for coded speech using synthetic speech. 2925-2928 - Motoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito, Yuichi Ohkawa, Shozo Makino:
Speaker adaptation method for CALL system using bilingual speakers' utterances. 2929-2932 - Shinji Watanabe:
Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task. 2933-2936 - Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang:
Speaker clustering of speech utterances using a voice characteristic reference space. 2937-2940 - Young Kuk Kim, Hwa Jeon Song, Hyung Soon Kim:
Performance improvement of connected digit recognition using unsupervised fast speaker adaptation. 2941-2944 - Hyung Soon Kim, Hwa Jeon Song:
Simultaneous estimation of weights of eigenvoices and bias compensation vector for rapid speaker adaptation. 2945-2948 - Matthias Wölfel:
Speaker dependent model order selection of spectral envelopes. 2949-2952 - Enrico Bocchieri, Michael Riley, Murat Saraclar:
Methods for task adaptation of acoustic models with limited transcribed in-domain data. 2953-2956 - Atsushi Fujii, Tetsuya Ishikawa, Katsunobu Itou, Tomoyosi Akiba:
Unsupervised topic adaptation for lecture speech retrieval. 2957-2960 - Haibin Liu, Zhenyang Wu:
Mean and covariance adaptation based on minimum classification error linear regression for continuous density HMMs. 2961-2964 - Goshu Nagino, Makoto Shozakai:
Design of ready-made acoustic model library by two-dimensional visualization of acoustic space. 2965-2968
Spoken Language Identification, Translation and Retrieval I
- Jean-Luc Gauvain, Abdelkhalek Messaoudi, Holger Schwenk:
Language recognition using phone lattices. 25-28 - Mark A. Huckvale:
ACCDIST: a metric for comparing speakers' accents. 29-32 - Michael Levit, Allen L. Gorin, Patrick Haffner, Hiyan Alshawi, Elmar Nöth:
Aspects of named entity processing. 33-36 - Josep Maria Crego, José B. Mariño, Adrià de Gispert:
Finite-state-based and phrase-based statistical machine translation. 37-40 - Tanja Schultz, Szu-Chen Stan Jou, Stephan Vogel, Shirin Saleem:
Using word lattice information for a tighter coupling in speech translation systems. 41-44 - Teruhisa Misu, Tatsuya Kawahara, Kazunori Komatani:
Confirmation strategy for document retrieval systems with spoken dialog interface. 45-48 - Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh:
Multilayer subword units for open-vocabulary spoken document retrieval. 1553-1556 - Yoshiaki Itoh, Kazuyo Tanaka, Shi-wook Lee:
An efficient partial matching algorithm toward speech retrieval by speech. 1557-1560 - Celestin Sedogbo, Sébastien Herry, Bruno Gas, Jean-Luc Zarader:
Language detection by neural discrimination. 1561-1564 - Ricardo de Córdoba, Javier Ferreiros, Valentín Sama, Javier Macías Guarasa, Luis Fernando D'Haro, Fernando Fernández Martínez:
Language identification techniques based on full recognition in an air traffic control task. 1565-1568 - John H. L. Hansen, Umit H. Yapanel, Rongqing Huang, Ayako Ikeno:
Dialect analysis and modeling for automatic classification. 1569-1572 - Emmanuel Ferragne, François Pellegrino:
Rhythm in read British English: interdialect variability. 1573-1576 - Pascale Fung, Yi Liu, Yongsheng Yang, Yihai Shen, Dekai Wu:
A grammar-based Chinese to English speech translation system for portable devices. 1577-1580 - Gökhan Tür:
Cost-sensitive call classification. 1581-1584 - Mikko Kurimo, Ville T. Turunen, Inger Ekman:
An evaluation of a spoken document retrieval baseline system in Finnish. 1585-1588 - Hui Jiang, Pengfei Liu, Imed Zitouni:
Discriminative training of naive Bayes classifiers for natural language call routing. 1589-1592 - Nicolas Moreau, Hyoung-Gook Kim, Thomas Sikora:
Phonetic confusion based document expansion for spoken document retrieval. 1593-1596 - Euisok Chung, Soojong Lim, Yi-Gyu Hwang, Myung-Gil Jang:
Hybrid named entity recognition for question-answering system. 1597-1600 - Jitendra Ajmera, Iain McCowan, Hervé Bourlard:
An online audio indexing system. 1601-1604 - Eric Sanders, Febe de Wet:
Histogram normalisation and the recognition of names and ontology words in the MUMIS project. 1605-1608 - Rui Amaral, Isabel Trancoso:
Improving the topic indexation and segmentation modules of a media watch system. 1609-1612 - Melissa Barkat-Defradas, Rym Hamdi, Emmanuel Ferragne, François Pellegrino:
Speech timing and rhythmic structure in Arabic dialects: a comparison of two approaches. 1613-1616 - Hsin-Min Wang, Shih-Sian Cheng:
METRIC-SEQDAC: a hybrid approach for audio segmentation. 1617-1620 - Jen-Wei Kuo, Yao-Min Huang, Berlin Chen, Hsin-Min Wang:
Statistical Chinese spoken document retrieval using latent topical information. 1621-1624 - Masahiko Matsushita, Hiromitsu Nishizaki, Seiichi Nakagawa, Takehito Utsuro:
Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval task. 1625-1628 - Ruiqiang Zhang, Gen-ichiro Kikui, Hirofumi Yamamoto, Frank K. Soong, Taro Watanabe, Eiichiro Sumita, Wai Kit Lo:
Improved spoken language translation using n-best speech recognition hypotheses. 1629-1632 - Kakeung Wong, Man-Hung Siu:
Automatic language identification using discrete hidden Markov model. 1633-1636 - Bowen Zhou, Daniel Déchelotte, Yuqing Gao:
Two-way speech-to-speech translation on handheld devices. 1637-1640 - Hervé Blanchon:
HLT modules scalability within the NESPOLE! project. 1641-1644
Linguistics, Phonology, and Phonetics
- Midam Kim:
Correlation between VOT and F0 in the perception of Korean stops and affricates. 49-52 - Aude Noiray, Lucie Ménard, Marie-Agnès Cathiard, Christian Abry, Christophe Savariaux:
The development of anticipatory labial coarticulation in French: a pioneering study. 53-56 - Melvyn John Hunt:
Speech recognition, syllabification and statistical phonetics. 57-60 - Jilei Tian:
Data-driven approaches for automatic detection of syllable boundaries. 61-64 - Anne Cutler, Dennis Norris, Núria Sebastián-Gallés:
Phonemic repertoire and similarity within the vocabulary. 65-68 - Sameer Maskey, Alan W. Black, Laura Tomokiyo:
Bootstrapping phonetic lexicons for new languages. 69-72 - Mirjam Broersma, K. Marieke Kolkman:
Lexical representation of non-native phonemes. 1241-1244 - Jong-Pyo Lee, Tae-Yeoub Jang:
A comparative study on the production of inter-stress intervals of English speech by English native speakers and Korean speakers. 1245-1248 - Emi Zuiki Murano, Mihoko Teshigawara:
Articulatory correlates of voice qualities of good guys and bad guys in Japanese anime: an MRI study. 1249-1252 - Sorin Dusan:
Effects of phonetic contexts on the duration of phonetic segments in fluent read speech. 1253-1256 - Qiang Fang:
A study on nasal coda loss in continuous speech. 1257-1260 - Hua-Li Jian:
An improved pair-wise variability index for comparing the timing characteristics of speech. 1261-1264 - Hua-Li Jian:
An acoustic study of speech rhythm in Taiwan English. 1265-1268 - Sung-A. Kim:
Language specific phonetic rules: evidence from domain-initial strengthening. 1269-1272 - Hansang Park:
Spectral characteristics of the release bursts in Korean alveolar stops. 1273-1276 - Rob van Son, Olga Bolotova, Louis C. W. Pols, Mietta Lennes:
Frequency effects on vowel reduction in three typologically different languages (Dutch, Finnish, Russian). 1277-1280 - Julia Abresch, Stefan Breuer:
Assessment of non-native phones in anglicisms by German listeners. 1281-1284 - Sunhee Kim:
Phonology of exceptions for Korean grapheme-to-phoneme conversion. 1285-1288 - Shigeyoshi Kitazawa, Shinya Kiriyama:
Acoustic and prosodic analysis of Japanese vowel-vowel hiatus with laryngeal effect. 1289-1292 - Kimiko Tsukada:
A cross-linguistic acoustic comparison of unreleased word-final stops: Korean and Thai. 1293-1296 - Taehong Cho, Elizabeth K. Johnson:
Acoustic correlates of phrase-internal lexical boundaries in Dutch. 1297-1300 - Taehong Cho, James M. McQueen:
Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and English. 1301-1304 - Svetlana Kaminskaia, François Poiré:
Comparing intonation of two varieties of French using normalized F0 values. 1305-1308 - Mira Oh, Kee-Ho Kim:
Phonetic realization of the suffix-suppressed accentual phrase in Korean. 1309-1312 - H. Timothy Bunnell, James B. Polikoff, Jane McNicholas:
Spectral moment vs. Bark cepstral analysis of children's word-initial voiceless stops. 1313-1316 - Nobuaki Minematsu:
Pronunciation assessment based upon the compatibility between a learner's pronunciation structure and the target language's lexical structure. 1317-1320 - Kenji Yoshida:
Spread of high tone in Akita Japanese. 1321-1324
Biomedical Applications of Speech Analysis
- Juan Ignacio Godino-Llorente, María Victoria Rodellar Biarge, Pedro Gómez Vilda, Francisco Díaz Pérez, Agustín Álvarez Marquina, Rafael Martínez-Olalla:
Biomechanical parameter fingerprint in the mucosal wave power spectral density. 73-76 - Cheolwoo Jo, Soo-Geon Wang, Byung-Gon Yang, Hyung-Soon Kim, Tao Li:
Classification of pathological voice including severely noisy cases. 77-80 - Qiang Fu, Peter Murphy:
A robust glottal source model estimation technique. 81-84 - Hiroki Mori, Yasunori Kobayashi, Hideki Kasuya, Hajime Hirose, Noriko Kobayashi:
F0 and formant frequency distribution of dysarthric speech - a comparative study. 85-88 - Hideki Kawahara, Yumi Hirachi, Masanori Morise, Hideki Banno:
Procedure "senza vibrato": a key component for morphing singing. 89-92 - Claudia Manfredi, Giorgio Peretti, Laura Magnoni, Fabrizio Dori, Ernesto Iadanza:
Thyroplastic medialisation in unilateral vocal fold paralysis: assessing voice quality recovering. 93-96 - Gernot Kubin, Martin Hagmüller:
Voice enhancement of male speakers with laryngeal neoplasm. 541-544 - Jong Min Choi, Myung-Whun Sung, Kwang Suk Park, Jeong-Hun Hah:
A comparison of the perturbation analysis between PRAAT and computerized speech lab. 545-548
Robust Speech Recognition on AURORA
- Ji Ming, Baochun Hou:
Evaluation of universal compensation on Aurora 2 and 3 and beyond. 97-100 - Hugo Van hamme:
PROSPECT features and their application to missing data techniques for robust speech recognition. 101-104 - Hugo Van hamme, Patrick Wambacq, Veronique Stouten:
Accounting for the uncertainty of speech estimates in the context of model-based feature enhancement. 105-108 - Hans-Günter Hirsch, Harald Finster:
Applying the Aurora feature extraction schemes to a phoneme based recognition task. 109-112 - Zhipeng Zhang, Tomoyuki Ohya, Sadaoki Furui:
Evaluation of tree-structured piecewise linear transformation-based noise adaptation on AURORA2 database. 113-116 - Tor André Myrvoll, Satoshi Nakamura:
Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm. 117-120 - Akira Sasou, Kazuyo Tanaka, Satoshi Nakamura, Futoshi Asano:
HMM-based feature compensation method: an evaluation using the AURORA2. 121-124 - Xuechuan Wang, Douglas D. O'Shaughnessy:
Noise adaptation for robust AURORA 2 noisy digit recognition using statistical data mapping. 125-128 - Benjamin J. Shannon, Kuldip K. Paliwal:
MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition. 129-132 - Muhammad Ghulam, Takashi Fukuda, Junsei Horikawa, Tsuneo Nitta:
A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR. 133-136 - José C. Segura, Ángel de la Torre, Javier Ramírez, Antonio J. Rubio, M. Carmen Benítez:
Including uncertainty of speech observations in robust speech recognition. 137-140 - Takeshi Yamada, Jiro Okada, Nobuhiko Kitawaki:
Integration of n-best recognition results obtained by multiple noise reduction algorithms. 141-144 - Panji Setiawan, Sorel Stan, Tim Fingscheidt:
Revisiting some model-based and data-driven denoising algorithms in Aurora 2 context. 145-148 - Guo-Hong Ding, Bo Xu:
Exploring high-performance speech recognition in noisy environments using high-order Taylor series expansion. 149-152 - Wing-Hei Au, Man-Hung Siu:
A robust training algorithm based on neighborhood information. 153-156 - Siu Wa Lee, Pak-Chung Ching:
In-phase feature induction: an effective compensation technique for robust speech recognition. 157-160 - Jeff Siu-Kei Au-Yeung, Man-Hung Siu:
Improved performance of Aurora 4 using HTK and unsupervised MLLR adaptation. 161-164 - Shang-nien Tsai, Lin-Shan Lee:
A new feature extraction front-end for robust speech recognition using progressive histogram equalization and multi-eigenvector temporal filtering. 165-168
Spoken / Multimodal Dialogue System
- Christian Fügen, Hartwig Holzapfel, Alex Waibel:
Tight coupling of speech recognition and dialog management - dialog-context dependent grammar weighting for speech recognition. 169-172 - Akinobu Lee, Keisuke Nakamura, Ryuichi Nisimura, Hiroshi Saruwatari, Kiyohiro Shikano:
Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs. 173-176 - Hironori Oshikawa, Norihide Kitaoka, Seiichi Nakagawa:
Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionary. 177-180 - Imed Zitouni, Minkyu Lee, Hui Jiang:
Constrained minimization technique for topic identification using discriminative training and support vector machines. 181-184 - Jason D. Williams, Steve J. Young:
Characterizing task-oriented dialog using a simulated ASR channel. 185-188 - Takashi Konashi, Motoyuki Suzuki, Akinori Ito, Shozo Makino:
A spoken dialog system based on automatic grammar generation and template-based weighting for autonomous mobile robots. 189-192 - Akinori Ito, Takanobu Oba, Takashi Konashi, Motoyuki Suzuki, Shozo Makino:
Noise adaptive spoken dialog system based on selection of multiple dialog strategies. 193-196 - Mikko Hartikainen, Markku Turunen, Jaakko Hakulinen, Esa-Pekka Salonen, J. Adam Funk:
Flexible dialogue management using distributed and dynamic dialogue control. 197-200 - Keith Houck:
Contextual revision in information seeking conversation systems. 201-204 - Ian M. O'Neill, Philip Hanna, Xingkun Liu, Michael F. McTear:
Cross domain dialogue modelling: an object-based approach. 205-208 - Hirohiko Sagawa, Teruko Mitamura, Eric Nyberg:
A comparison of confirmation styles for error handling in a speech dialog system. 209-212 - Fan Yang, Peter A. Heeman:
Using computer simulation to compare two models of mixed-initiative. 213-216 - Fan Yang, Peter A. Heeman, Kristy Hollingshead:
Towards understanding mixed-initiative in task-oriented dialogues. 217-220 - Peter Wolf, Joseph Woelfel, Jan C. van Gemert, Bhiksha Raj, David Wong:
Spokenquery: an alternate approach to choosing items with speech. 221-224 - Shona Douglas, Deepak Agarwal, Tirso Alonso, Robert M. Bell, Mazin G. Rahim, Deborah F. Swayne, Chris Volinsky:
Mining customer care dialogs for "daily news". 225-228 - Jens Edlund, Gabriel Skantze, Rolf Carlson:
Higgins - a spoken dialogue system for investigating error handling techniques. 229-232 - Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Hua Cheng, Hauke Schmidt, Harry Bratt, Rohit Mishra, Stanley Peters, Sandra Upson, Elizabeth Shriberg, Carsten Bergmann, Lin Zhao:
A conversational dialogue system for cognitively overloaded users. 233-236 - Gerhard Hanrieder, Stefan W. Hamerich:
Modeling generic dialog applications for embedded systems. 237-240 - Matthew N. Stuttle, Jason D. Williams, Steve J. Young:
A framework for dialogue data collection with a simulated ASR channel. 241-244 - Shimei Pan:
A multi-layer conversation management approach for information seeking applications. 245-248 - Thomas K. Harris, Roni Rosenfeld:
A universal speech interface for appliances. 249-252 - Keita Hayashi, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi:
Speech understanding, dialogue management and response generation in corpus-based spoken dialogue system. 253-256 - Fernando Fernández Martínez, Valentín Sama, Luis Fernando D'Haro, Rubén San Segundo, Ricardo de Córdoba, Juan Manuel Montero:
Implementation of dialog applications in an open-source VoiceXML platform. 257-260 - Chun Wai Lau, Bin Ma, Helen Mei-Ling Meng, Yiu Sang Moon, Yeung Yam:
Fuzzy logic decision fusion in a multimodal biometric system. 261-264 - Peter Poller, Norbert Reithinger:
A state model for the realization of visual perceptive feedback in SmartKom. 265-268 - Akemi Iida, Yoshito Ueno, Ryohei Matsuura, Kiyoaki Aikawa:
A vector-based method for efficiently representing multivariate environmental information. 269-272 - Ioannis Toptsis, Shuyin Li, Britta Wrede, Gernot A. Fink:
A multi-modal dialog system for a mobile robot. 273-276 - Niels Ole Bernsen, Laila Dybkjær:
Structured interview-based evaluation of spoken multimodal conversation with H. C. Andersen. 277-280
Speech Recognition - Search
- Miroslav Novak, Vladimír Bergl:
Memory efficient decoding graph compilation with wide cross-word acoustic context. 281-284 - Dongbin Zhang, Limin Du:
Dynamic beam pruning strategy using adaptive control. 285-288 - Takaaki Hori, Chiori Hori, Yasuhiro Minami:
Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition. 289-292 - Peng Yu, Frank Torsten Bernd Seide:
A hybrid word / phoneme-based approach for improved vocabulary-independent search in spontaneous speech. 293-296 - Lubos Smídl, Ludek Müller:
Keyword spotting for highly inflectional languages. 297-300 - Frédéric Tendeau:
Optimizing an engine network that allows dynamic masking. 301-304
Spoken Dialogue and Systems
- Katsutoshi Ohtsuki, Nobuaki Hiroshima, Yoshihiko Hayashi, Katsuji Bessho, Shoichi Matsunaga:
Topic structure extraction for meeting indexing. 305-308 - Sophie Rosset, Lori Lamel:
Automatic detection of dialog acts based on multilevel information. 309-312 - Gina-Anne Levow:
Identifying local corrections in human-computer dialogue. 313-316 - Peter Reichl, Florian Hammer:
Hot discussion or frosty dialogue? towards a temperature metric for conversational interactivity. 317-320 - Stephanie Seneff, Chao Wang, I. Lee Hetherington, Grace Chung:
A dynamic vocabulary spoken dialogue interface. 321-324 - Matthias Denecke, Kohji Dohsaka, Mikio Nakano:
Learning dialogue policies using state aggregation in reinforcement learning. 325-328
Speech Perception
- Keren B. Shatzman:
Segmenting ambiguous phrases using phoneme duration. 329-332 - Shuichi Sakamoto, Yôiti Suzuki, Shigeaki Amano, Tadahisa Kondo, Naoki Iwaoka:
A compensation method for word-familiarity difference with SNR control in intelligibility test. 333-336 - Takashi Otake, Yoko Sakamoto, Yasuyuki Konomi:
Phoneme-based word activation in spoken-word recognition: evidence from Japanese school children. 337-340 - Belynda Brahimi, Philippe Boula de Mareüil, Cédric Gendrot:
Role of segmental and suprasegmental cues in the perception of Maghrebian-accented French. 341-344 - Hiroaki Kato, Yoshinori Sagisaka, Minoru Tsuzaki, Makiko Muto:
Effect of speaking rate on the acceptability of change in segment duration. 345-348 - Kiyoko Yoneyama:
A cross-linguistic study of diphthongs in spoken word processing in Japanese and English. 349-352
Multi-Lingual Speech-to-Speech Translation
- Alex Waibel:
Speech translation: past, present and future. 353-356 - Gen-ichiro Kikui, Toshiyuki Takezawa, Seiichi Yamamoto:
Multilingual corpora for speech-to-speech translation research. 357-360 - Hermann Ney:
Statistical machine translation and its challenges. 361-364 - John Lee, Stephanie Seneff:
Translingual grammar induction. 365-368 - Youngjik Lee, Jun Park, Seung-Shin Oh:
Usability considerations of speech-to-speech translation system. 369-372 - Gianni Lazzari, Alex Waibel, Chengqing Zong:
Worldwide ongoing activities on multilingual speech to speech translation. 373-376
Speech Recognition - Large Vocabulary
- Dominique Fohr, Odile Mella, Christophe Cerisara, Irina Illina:
The automatic news transcription system: ANTS, some real time experiments. 377-380 - Bhuvana Ramabhadran, Olivier Siohan, Geoffrey Zweig:
Use of metadata to improve recognition of spontaneous speech and named entities. 381-384 - Janne Pylkkönen, Mikko Kurimo:
Duration modeling techniques for continuous speech recognition. 385-388 - Tanel Alumäe:
Large vocabulary continuous speech recognition for Estonian using morpheme classes. 389-392 - Zhaobing Han, Shuwu Zhang, Bo Xu:
Combining agglomerative and tree-based state clustering for high accuracy acoustic modeling. 393-396 - William S.-Y. Wang, Gang Peng:
Parallel tone score association method for tone language speech recognition. 397-400 - Jing Zheng, Horacio Franco, Andreas Stolcke:
Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition. 401-404 - L. Sarada Ghadiyaram, Hemalatha Nagarajan, Nagarajan Thangavelu, Hema A. Murthy:
Automatic transcription of continuous speech using unsupervised and incremental training. 405-408 - Jan Nouza, Dana Nejedlová, Jindrich Zdánský, Jan Kolorenc:
Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programs. 409-412 - Olivier Siohan, Bhuvana Ramabhadran, Geoffrey Zweig:
Speech recognition error analysis on the English MALACH corpus. 413-416 - Rong Zhang, Alexander I. Rudnicky:
A frame level boosting training scheme for acoustic modeling. 417-420 - Rong Zhang, Alexander I. Rudnicky:
Optimizing boosting with discriminative criteria. 421-424 - Xianghua Xu, Qiang Guo, Jie Zhu:
Restructuring HMM states for speaker adaptation in Mandarin speech recognition. 425-428 - Mike Matton, Mathias De Wachter, Dirk Van Compernolle, Ronald Cools:
A discriminative locally weighted distance measure for speaker independent template based speech recognition. 429-432 - Yohei Itaya, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura:
Deterministic annealing EM algorithm in parameter estimation for acoustic model. 433-436 - Frantisek Grézl, Martin Karafiát, Jan Cernocký:
TRAP based features for LVCSR of meeting data. 437-440 - Frank K. Soong, Wai Kit Lo, Satoshi Nakamura:
Optimal acoustic and language model weights for minimizing word verification errors. 441-444 - Atsushi Sako, Yasuo Ariki:
Structuring of baseball live games based on speech recognition using task dependent knowledge. 445-448 - Zhengyu Zhou, Helen M. Meng:
A two-level schema for detecting recognition errors. 449-452 - In-Jeong Choi, Nam-Hoon Kim, Su Youn Yoon:
Large vocabulary continuous speech recognition based on cross-morpheme phonetic information. 453-456 - Changxue Ma:
Automatic phonetic base form generation based on maximum context tree. 457-460 - Gustavo Hernández Ábrego, Lex Olorenshaw, Raquel Tato, Thomas Schaaf:
Dictionary refinements based on phonetic consensus and non-uniform pronunciation reduction. 1697-1700 - Abdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain:
Transcription of Arabic broadcast news. 1701-1704 - Takahiro Shinozaki, Sadaoki Furui:
Spontaneous speech recognition using a massively parallel decoder. 1705-1708 - Tanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen:
Issues in meeting transcription - the ISL meeting transcription system. 1709-1712 - Katsutoshi Ohtsuki, Nobuaki Hiroshima, Shoichi Matsunaga, Yoshihiko Hayashi:
Multi-pass ASR using vocabulary expansion. 1713-1716 - Vlasios Doumpiotis, William Byrne:
Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognition. 1717-1720 - Izhak Shafran, William Byrne:
Task-specific minimum Bayes-risk decoding using learned edit distance. 1945-1948 - Rong Zhang, Alexander I. Rudnicky:
Apply n-best list re-ranking to acoustic model combinations of boosting training. 1949-1952 - Do Yeong Kim, Srinivasan Umesh, Mark J. F. Gales, Thomas Hain, Philip C. Woodland:
Using VTLN for broadcast news transcription. 1953-1956 - Andreas Stolcke, Chuck Wooters, Ivan Bulyko, Martin Graciarena, Scott Otterson, Barbara Peskin, Mari Ostendorf, David Gelbart, Nikki Mirghafori, Tuomo W. Pirinen:
From switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition system. 1957-1960 - Anand Venkataraman, Andreas Stolcke, Wen Wang, Dimitra Vergyri, Jing Zheng, Venkata Ramana Rao Gadde:
An efficient repair procedure for quick transcriptions. 1961-1964 - Yao Qian, Tan Lee, Frank K. Soong:
Tone information as a confidence measure for improving Cantonese LVCSR. 1965-1968
Speech Science
- Danielle Duez:
Temporal variables in parkinsonian speech. 461-464 - Olov Engwall:
Speaker adaptation of a three-dimensional tongue model. 465-468 - Nicole Cooper, Anne Cutler:
Perception of non-native phonemes in noise. 469-472 - Hideki Kawahara, Hideki Banno, Toshio Irino, Jiang Jin:
Intelligibility of degraded speech from smeared STRAIGHT spectrum. 473-476 - Young-Ik Kim, Rhee Man Kil:
Sound source localization based on zero-crossing peak-amplitude coding. 477-480 - Sachiyo Kajikawa, Laurel Fais, Shigeaki Amano, Janet F. Werker:
Adult and infant sensitivity to phonotactic features in spoken Japanese. 481-484 - Phil D. Green, James Carmichael:
Revisiting dysarthria assessment intelligibility metrics. 485-488 - Valter Ciocca, Tara L. Whitehill, Joan K.-Y. Ma:
The effect of intonation on perception of Cantonese lexical tones. 489-492 - Toshiko Isei-Jaakkola:
Maximum short quantity in Japanese and Finnish in two perception tests with F0 and dB variants. 493-496 - Paavo Alku, Matti Airas, Brad H. Story:
Evaluation of an inverse filtering technique using physical modeling of voice production. 497-500 - Hui-ju Hsu, Janice Fon:
Positional and phonotactic effects on the realization of Taiwan Mandarin tone 2. 501-504 - Karl Schnell, Arild Lacroix:
Speech production based on lossy tube models: unit concatenation and sound transitions. 505-508 - Qin Yan, Saeed Vaseghi, Dimitrios Rentzos, Ching-Hsiang Ho:
Modelling and ranking of differences across formants of British, Australian and American accents. 509-512 - Tatsuya Kitamura, Satoru Fujita, Kiyoshi Honda, Hironori Nishimoto:
An experimental method for measuring transfer functions of acoustic tubes. 513-516 - Takuya Tsuji, Tokihiko Kaburagi, Kohei Wakamiya, Jiji Kim:
Estimation of the vocal tract spectrum from articulatory movements using phoneme-dependent neural networks. 517-520 - Kunitoshi Motoki, Hiroki Matsuzaki:
Computation of the acoustic characteristics of vocal-tract models with geometrical perturbation. 521-524 - P. Vijayalakshmi, M. Ramasubba Reddy:
Analysis of hypernasality by synthesis. 525-528 - Abdellah Kacha, Francis Grenez, Frédéric Bettens, Jean Schoentgen:
Adaptive long-term predictive analysis of disordered speech. 529-532 - Slobodan Jovicic, Sandra Antesevic, Zoran Saric:
Phoneme restoration in degraded speech communication. 533-536 - Maria Marinaki, Constantine Kotropoulos, Ioannis Pitas, Nikolaos Maglaveras:
Automatic detection of vocal fold paralysis and edema. 537-540
Novel Features in ASR
- Yasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri:
A theoretical analysis of speech recognition based on feature trajectory models. 549-552 - Zhijian Ou, Zuoying Wang:
Discriminative combination of multiple linear predictions for speech recognition. 553-556 - Davood Gharavian, Seyed Mohammad Ahadi:
Use of formants in stressed and unstressed continuous speech recognition. 557-560 - Konstantin Markov, Satoshi Nakamura, Jianwu Dang:
Integration of articulatory dynamic parameters in HMM/BN based speech recognition system. 561-564 - Leigh David Alsteris, Kuldip K. Paliwal:
ASR on speech reconstructed from short-time fourier phase spectra. 565-568
Spoken and Natural Language Understanding
- Robert Lieb, Tibor Fábián, Günther Ruske, Matthias Thomae:
Estimation of semantic confidences on lattice hierarchies. 569-572 - Fumiyo Fukumoto, Yoshimi Suzuki:
Learning subject drift for topic tracking. 573-576 - Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, Mary P. Harper, Yang Liu:
The ICSI-SRI-UW metadata extraction system. 577-580 - Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang:
Automatic detection of contrast for speech understanding. 581-584 - Nick Jui-Chang Wang, Jia-Lin Shen, Ching-Ho Tsai:
Integrating layer concept information into n-gram modeling for spoken language understanding. 585-588 - Junyan Chen, Ji Wu, Zuoying Wang:
A robust understanding model for spoken dialogues. 589-592 - Chai Wutiwiwatchai, Sadaoki Furui:
Belief-based nonlinear rescoring in Thai speech understanding. 2129-2132 - Toshihiko Itoh, Atsuhiko Kai, Yukihiro Itoh, Tatsuhiro Konishi:
An understanding strategy based on plausibility score in recognition history using CSR confidence measure. 2133-2136 - Sangkeun Jung, Minwoo Jeong, Gary Geunbae Lee:
Speech recognition error correction using maximum entropy language model. 2137-2140 - Xiang Li, Juan M. Huerta:
Discriminative training of compound-word based multinomial classifiers for speech routing. 2141-2144 - Jihyun Eun, Changki Lee, Gary Geunbae Lee:
An information extraction approach for spoken language understanding. 2145-2148 - David Horowitz, Partha Lal, Pierce Gerard Buckley:
A maximum entropy shallow functional parser for spoken language understanding. 2149-2152 - Qiang Huang, Stephen J. Cox:
Mixture language models for call routing. 2153-2156 - Chung-Hsien Wu, Jui-Feng Yeh, Ming-Jun Chen:
Speech act identification using an ontology-based partial pattern tree. 2157-2160 - Ye-Yi Wang, Yun-Cheng Ju:
Creating speech recognition grammars from regular expressions for alphanumeric concepts. 2161-2164 - Isabel Trancoso, Paulo Araújo, Céu Viana, Nuno J. Mamede:
Poetry assistant. 2165-2168 - Tasuku Kitade, Tatsuya Kawahara, Hiroaki Nanjo:
Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers. 2169-2172 - Tomohiro Ohno, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki:
Robust dependency parsing of spontaneous Japanese speech and its evaluation. 2173-2176 - Wolfgang Minker, Dirk Bühler, Christiane Beuschel:
Strategies for optimizing a stochastic spoken natural language parser. 2177-2180 - Tzu-Lun Lee, Ya-Fang He, Yun-Ju Huang, Shu-Chuan Tseng, Robert Eklund:
Prolongation in spontaneous Mandarin. 2181-2184 - Yuki Irie, Shigeki Matsubara, Nobuo Kawaguchi, Yukiko Yamaguchi, Yasuyoshi Inagaki:
Speech intention understanding based on decision tree learning. 2185-2188 - Satanjeev Banerjee, Alexander I. Rudnicky:
Using simple speech-based features to detect the state of a meeting and the roles of the meeting participants. 2189-2192 - Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Zhigang Deng, Sungbok Lee, Shrikanth S. Narayanan, Carlos Busso:
An acoustic study of emotions expressed in speech. 2193-2196 - Tatsuya Kawahara, Ian Richard Lane, Tomoko Matsui, Satoshi Nakamura:
Topic classification and verification modeling for out-of-domain utterance detection. 2197-2200 - So-Young Park, Yong-Jae Kwak, Joon-Ho Lim, Hae-Chang Rim, Soo-Hong Kim:
Partially lexicalized parsing model utilizing rich features. 2201-2204 - Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi:
Clustering similar nouns for selecting related news articles. 2205-2208 - Leonardo Badino:
Chinese text word-segmentation considering semantic links among sentences. 2209-2212 - Do-Gil Lee, Hae-Chang Rim:
Syllable-based probabilistic morphological analysis model of Korean. 2213-2216
Speaker Segmentation and Clustering
- Fabio Valente, Christian Wellekens:
Scoring unknown speaker clustering: VB vs. BIC. 593-596 - Qin Jin, Tanja Schultz:
Speaker segmentation and clustering in meetings. 597-600 - Lori Lamel, Jean-Luc Gauvain, Leonardo Canseco-Rodriguez:
Speaker diarization from speech transcripts. 601-604 - Xavier Anguera Miró, Javier Hernando Pericas:
Evolutive speaker segmentation using a repository system. 605-608 - Hagai Aronowitz, David Burshtein, Amihood Amir:
Speaker indexing in audio archives using test utterance Gaussian mixture modeling. 609-612 - Antoine Raux:
Automated lexical adaptation and speaker clustering based on pronunciation habits for non-native speech recognition. 613-616
Speech Processing in a Packet Network Environment
- Kuldip K. Paliwal, Stephen So:
Scalable distributed speech recognition using multi-frame GMM-based block quantization. 617-620 - Naveen Srinivasamurthy, Kyu Jeong Han, Shrikanth S. Narayanan:
Robust speech recognition over packet networks: an overview. 621-624 - Thomas Eriksson, Samuel Kim, Hong-Goo Kang, Chungyong Lee:
Theory for speaker recognition over IP. 625-628 - Wu Chou, Feng Liu:
Voice portal services in packet network and VoIP environment. 629-632 - Peter Kabal, Colm Elliott:
Synchronization of speaker selection for centralized tandem free VoIP conferencing. 633-636 - Akitoshi Kataoka, Yusuke Hiwasaki, Toru Morinaga, Jotaro Ikedo:
Measuring the perceived importance of time- and frequency-divided speech blocks for transmitting over packet networks. 637-640 - Moo Young Kim, W. Bastiaan Kleijn:
Comparison of transmitter-based packet-loss recovery techniques for voice transmission. 641-644
Acoustic Modeling
- Denis Jouvet, Ronaldo O. Messina:
Context dependent "long units" for speech recognition. 645-648 - Shinichi Yoshizawa, Kiyohiro Shikano:
Rapid EM training based on model-integration. 649-652 - Dominique Fohr, Odile Mella, Irina Illina, Christophe Cerisara:
Experiments on the accuracy of phone models and liaison processing in a French broadcast news transcription system. 653-656 - Jorge F. Silva, Shrikanth S. Narayanan:
A statistical discrimination measure for hidden Markov models based on divergence. 657-660 - Jan Stadermann, Gerhard Rigoll:
A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition. 661-664 - Dirk Knoblauch:
Data driven number-of-states selection in HMM topologies. 665-668 - Youngkyu Cho, Sung-a Kim, Dongsuk Yook:
Hybrid model using subspace distribution clustering hidden Markov models and semi-continuous hidden Markov models for embedded speech recognizers. 669-672 - Peder A. Olsen, Karthik Visweswariah:
Fast clustering of Gaussians and the virtue of representing Gaussians in exponential model format. 673-676 - Karen Livescu, James R. Glass:
Feature-based pronunciation modeling with trainable asynchrony probabilities. 677-680 - Hong-Kwang Jeff Kuo, Yuqing Gao:
Maximum entropy direct model as a unified model for acoustic modeling in speech recognition. 681-684 - Yu Zhu, Tan Lee:
Explicit duration modeling for Cantonese connected-digit recognition. 685-688 - Arthur Chan, Mosur Ravishankar, Alexander I. Rudnicky, Jahanzeb Sherwani:
Four-layer categorization scheme of fast GMM computation techniques in large vocabulary continuous speech recognition systems. 689-692 - Junho Park, Hanseok Ko:
Compact acoustic model for embedded implementation. 693-696 - Takatoshi Jitsuhiro, Satoshi Nakamura:
Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach. 697-700 - Panu Somervuo:
Comparison of ML, MAP, and VB based acoustic models in large vocabulary speech recognition. 701-704 - Wolfgang Macherey, Ralf Schlüter, Hermann Ney:
Discriminative training with tied covariance matrices. 705-708 - Frank Diehl, Asunción Moreno:
Acoustic phonetic modeling using local codebook features. 709-712 - Gue Jun Jung, Su-Hyun Kim, Yung-Hwan Oh:
An efficient codebook design in SDCHMM for mobile communication environments. 713-716 - Makoto Shozakai, Goshu Nagino:
Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models. 717-720 - Myoung-Wan Koo, Ho-Hyun Jeon, Sang-Hong Lee:
Context dependent phoneme duration modeling with tree-based state tying. 721-724 - John Scott Bridle:
Towards better understanding of the model implied by the use of dynamic features in HMMs. 725-728
Prosody Modeling and Generation
- Jianfeng Li, Guoping Hu, Ren-Hua Wang:
Chinese prosody phrase break prediction based on maximum entropy model. 729-732 - Krothapalli Sreenivasa Rao, Bayya Yegnanarayana:
Intonation modeling for Indian languages. 733-736 - Yu Zheng, Gary Geunbae Lee, Byeongchang Kim:
Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework. 737 - Ian Read, Stephen Cox:
Using part-of-speech for predicting phrase breaks. 741-744 - David Escudero Mancebo, Valentín Cardeñoso-Payo:
A proposal to quantitatively select the right intonation unit in data-driven intonation modeling. 745-748 - Jinfu Ni, Hisashi Kawai, Keikichi Hirose:
Formulating contextual tonal variations in Mandarin. 749-752 - Salma Mouline, Olivier Boëffard, Paul C. Bagshaw:
Automatic adaptation of the momel F0 stylisation algorithm to new corpora. 753-756 - Pablo Daniel Agüero, Klaus Wimmer, Antonio Bonafonte:
Joint extraction and prediction of Fujisaki's intonation model parameters. 757-760 - Panagiotis Zervas, Nikos Fakotakis, George K. Kokkinakis, Georgios Kouroupetroglou, Gerasimos Xydas:
Evaluation of corpus based tone prediction in mismatched environments for Greek TTS synthesis. 761-764 - Ziyu Xiong, Juanwen Chen:
The duration of pitch transition phase and its relative factors. 765-768 - Yu Hu, Ren-Hua Wang, Lu Sun:
Polynomial regression model for duration prediction in Mandarin. 769-772 - Michelle Tooher, John G. McKenna:
Prediction of the glottal LF parameters using regression trees. 773-776 - Volker Dellwo, Bianca Aschenberner, Petra Wagner, Jana Dancovicova, Ingmar Steiner:
Bonntempo-corpus and bonntempo-tools: a database for the study of speech rhythm and rate. 777-780 - Wentao Gu, Keikichi Hirose, Hiroya Fujisaki:
Analysis of F0 contours of Cantonese utterances based on the command-response model. 781-784 - Marion Dohen, Hélène Loevenbruck:
Pre-focal rephrasing, focal enhancement and postfocal deaccentuation in French. 785-788 - Sridhar Krishna Nemala, Partha Pratim Talukdar, Kalika Bali, A. G. Ramakrishnan:
Duration modeling for Hindi text-to-speech synthesis system. 789-792 - Nemala Sridhar Krishna, Hema A. Murthy:
A new prosodic phrasing model for Indian language Telugu. 793-796 - Oliver Jokisch, Michael Hofmann:
Evolutionary optimization of an adaptive prosody model. 797-800 - Gerasimos Xydas, Georgios Kouroupetroglou:
An intonation model for embedded devices based on natural F0 samples. 801-804 - Katerina Vesela, Nino Peterek, Eva Hajicová:
Prosodic characteristics of Czech contrastive topic. 805-808
Multi-Sensor ASR
- Martin Graciarena, Federico Cesari, Horacio Franco, Gregory K. Myers, Cregg Cowan, Victor Abrash:
Combination of standard and throat microphones for robust speech recognition in highly noisy environments. 809-812 - Cenk Demiroglu, David V. Anderson:
Noise robust digit recognition using a glottal radar sensor for voicing detection. 813-816 - Dominik Raub, John W. McDonough, Matthias Wölfel:
A cepstral domain maximum likelihood beamformer for speech recognition. 817-820 - Naoya Mochiki, Tetsunori Kobayashi, Toshiyuki Sekiya, Tetsuji Ogawa:
Recognition of three simultaneous utterance of speech by four-line directivity microphone mounted on head of robot. 821-824 - Shigeki Sagayama, Okajima Takashi, Yutaka Kamamoto, Takuya Nishimoto:
Complex spectrum circle centroid for microphone-array-based noisy speech recognition. 825-828 - Larry P. Heck, Mark Z. Mao:
Automatic speech recognition of co-channel speech: integrated speaker and speech recognition approach. 829-832
Multi-Lingual Speech Processing
- José B. Mariño, Asunción Moreno, Albino Nogueiras:
A first experience on multilingual acoustic modeling of the languages spoken in Morocco. 833-836 - Mónica Caballero, Asunción Moreno, Albino Nogueiras:
Data driven multidialectal phone set for Spanish dialects. 837-840 - Daniela Oria, Akos Vetek:
Multilingual e-mail text processing for speech synthesis. 841-844 - Harald Romsdorfer, Beat Pfister:
Multi-context rules for phonological processing in polyglot TTS synthesis. 845-848 - Leonardo Badino, Claudia Barolo, Silvia Quazza:
A general approach to TTS reading of mixed-language texts. 849-852 - Panayiotis G. Georgiou, Shrikanth S. Narayanan, Hooman Shirani Mehr:
Context dependent statistical augmentation of Persian transcripts. 853-856
Speech Enhancement
- Cenk Demiroglu, David V. Anderson:
A soft decision MMSE amplitude estimator as a noise preprocessor to speech coders using a glottal sensor. 857-860 - Rongqiang Hu, David V. Anderson:
Single acoustic-channel speech enhancement based on glottal correlation using non-acoustic sensor. 861-864 - Xianxian Zhang, John H. L. Hansen, Kathryn Hoberg Arehart, Jessica Rossi-Katz:
In-vehicle based speech processing for hearing impaired subjects. 865-868 - Sriram Srinivasan, W. Bastiaan Kleijn:
Speech enhancement using adaptive time-domain segmentation. 869-872 - Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi, Parham Zolfaghari:
Harmonicity based monaural speech dereverberation with time warping and F0 adaptive window. 873-876 - Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
Dereverberation of speech signals based on linear prediction. 877-880
Speech and Affect
- Nick Campbell:
Perception of affect in speech - towards an automatic processing of paralinguistic information in spoken conversation. 881-884 - Noël Chateau, Valérie Maffiolo, Christophe Blouin:
Analysis of emotional speech in voice mail messages: the influence of speakers' gender. 885-888 - Chul Min Lee, Serdar Yildirim, Murtaza Bulut, Abe Kazemzadeh, Carlos Busso, Zhigang Deng, Sungbok Lee, Shrikanth S. Narayanan:
Emotion recognition based on phoneme classes. 889-892 - Peter Robinson, Tal Sobol Shikler:
Visualizing dynamic features of expressions in speech. 893-896 - Aijun Li, Haibo Wang:
Friendly speech analysis and perception in standard Chinese. 897-900 - Ailbhe Ní Chasaide, Christer Gobl:
Decomposing linguistic and affective components of phonatory quality. 901-904 - Dan-Ning Jiang, Lian-Hong Cai:
Classifying emotion in Chinese speech by decomposing prosodic features. 1325-1328 - Chen Yu, Paul M. Aoki, Allison Woodruff:
Detecting user engagement in everyday conversations. 1329-1332 - Takashi X. Fujisawa, Norman D. Cook:
Identifying emotion in speech prosody using acoustical cues of harmony. 1333-1336 - Jianhua Tao:
Context based emotion detection from text input. 1337-1340 - Atsushi Iwai, Yoshikazu Yano, Shigeru Okuma:
Complex emotion recognition system for a specific user using SOM based on prosodic features. 1341-1344 - Hoon-Young Cho, Kaisheng Yao, Te-Won Lee:
Emotion verification for emotion detection and unknown emotion rejection. 1345-1348 - Keikichi Hirose:
Improvement in corpus-based generation of F0 contours using generation process model for emotional speech synthesis. 1349-1352
Speech Features
- Rajesh Mahanand Hegde, Hema A. Murthy, Venkata Ramana Rao Gadde:
Continuous speech recognition using joint features derived from the modified group delay function and MFCC. 905-908 - Hua Yu:
Phase-space representation of speech. 909-912 - Hema A. Murthy, Rajesh Mahanand Hegde, Venkata Ramana Rao Gadde:
The modified group delay feature: a new spectral representation of speech. 913-916 - Oh-Wook Kwon, Te-Won Lee:
ICA-based feature extraction for phoneme recognition. 917-920 - Qifeng Zhu, Barry Y. Chen, Nelson Morgan, Andreas Stolcke:
On using MLP features in LVCSR. 921-924 - Barry Y. Chen, Qifeng Zhu, Nelson Morgan:
Learning long-term temporal features in LVCSR using neural networks. 925-928 - T. V. Sreenivas, G. V. Kiran, A. G. Krishna:
Neural "spike rate spectrum" as a noise robust, speaker invariant feature for automatic speech recognition. 929-932 - Yoshihisa Nakatoh, Makoto Nishizaki, Shinichi Yoshizawa, Maki Yamada:
An adaptive MEL-LPC analysis for speech recognition. 933-936 - Kentaro Ishizuka, Noboru Miyazaki, Tomohiro Nakatani, Yasuhiro Minami:
Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition. 937-940 - Carlos Toshinori Ishi:
A new acoustic measure for aspiration noise detection. 941-944 - Kris Demuynck, Oscar Garcia, Dirk Van Compernolle:
Synthesizing speech from speech recognition parameters. 945-948 - Marios Athineos, Hynek Hermansky, Daniel P. W. Ellis:
LP-TRAP: linear predictive temporal patterns. 949-952 - Xiang Li, Richard M. Stern:
Parallel feature generation based on maximizing normalized acoustic likelihood. 953-956 - Kun-Ching Wang:
An adaptive band-partitioning spectral entropy based speech detection in realistic noisy environments. 957-960 - Javier Ramírez, José C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio:
Improved voice activity detection combining noise reduction and subband divergence measures. 961-964 - Kiyoung Park, Changkyu Choi, Jeongsu Kim:
Voice activity detection using global soft decision with mixture of Gaussian model. 965-968 - Thomas Kemp, Climent Nadeu, Yin Hay Lam, Josep Maria Sola i Caros:
Environmental robust features for speech detection. 969-972 - Kornel Laskowski, Qin Jin, Tanja Schultz:
Crosscorrelation-based multispeaker speech activity detection. 973-976 - Shang-nien Tsai:
Improved robustness of time-frequency principal components (TFPC) by synergy of methods in different domains. 977-980 - Li Deng, Dong Yu, Alex Acero:
A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech. 981-984 - Gernot Kubin, Tuan Van Pham:
DWT-based classification of acoustic-phonetic classes and phonetic units. 985-988 - Yong-Choon Cho, Seungjin Choi:
Learning nonnegative features of spectro-temporal sounds for classification. 989-992
Language Modeling, Multimodal & Multilingual Speech Processing
- Sungyup Chung, Keikichi Hirose, Nobuaki Minematsu:
N-gram language modeling of Japanese using bunsetsu boundaries. 993-996 - Langzhou Chen, Lori Lamel, Jean-Luc Gauvain, Gilles Adda:
Dynamic language modeling for broadcast news. 997-1000 - Ren-Yuan Lyu, Dau-Cheng Lyu, Min-Siong Liang, Min-Hong Wang, Yuang-Chin Chiang, Chun-Nan Hsu:
A unified framework for large vocabulary speech recognition of mutually unintelligible Chinese "regionalects". 1001-1004 - Ielka van der Sluis, Emiel Krahmer:
The influence of target size and distance on the production of speech and gesture in multimodal referring expressions. 1005-1008 - Anurag Kumar Gupta, Tasos Anastasakos:
Dynamic time windows for multimodal input fusion. 1009-1012 - Raymond H. Lee, Anurag Kumar Gupta:
MICot: a tool for multimodal input data collection. 1013-1016 - Chakib Tadj, Hicham Djenidi, Madjid Haouani, Amar Ramdane-Cherif, Nicole Lévy:
Simulating multimodal applications. 1017-1020 - Jakob Schou Pedersen, Paul Dalsgaard, Børge Lindberg:
A multimodal communication aid for global aphasia patients. 1021-1024 - Hirofumi Yamamoto, Gen-ichiro Kikui, Yoshinori Sagisaka:
Mis-recognized utterance detection using hierarchical language model. 1025-1028 - Marko Moberg, Kimmo Pärssinen, Juha Iso-Sipilä:
Cross-lingual phoneme mapping for multilingual synthesis systems. 1029-1032 - Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno, Tsuyoshi Tasaki, Takeshi Yamaguchi:
Robot motion control using listener's back-channels and head gesture information. 1033-1036 - Sakriani Sakti, Arry Akhmad Arman, Satoshi Nakamura, Paulus Hutagaol:
Indonesian speech recognition for hearing and speaking impaired people. 1037-1040 - Mohsen A. Rashwan:
A two phase Arabic language model for speech recognition and other language applications. 1041-1044 - Yuya Akita, Tatsuya Kawahara:
Language model adaptation based on PLSA of topics and speakers. 1045-1048 - Hans J. G. A. Dolfing, Pierce Gerard Buckley, David Horowitz:
Unified language modeling using finite-state transducers with first applications. 1049-1052 - Katsunobu Itou, Atsushi Fujii, Tomoyosi Akiba:
Effects of language modeling on speech-driven question answering. 1053-1056 - Abhinav Sethy, Shrikanth S. Narayanan, Bhuvana Ramabhadran:
Measuring convergence in language model estimation using relative entropy. 1057-1060
Detection and Classification in ASR
- Rongqing Huang, John H. L. Hansen:
High-level feature weighted GMM network for audio stream classification. 1061-1064 - Jindrich Zdánský, Petr David, Jan Nouza:
An improved preprocessor for the automatic transcription of broadcast news audio stream. 1065-1068