Shinji Watanabe 0001
Person information
- affiliation: Carnegie Mellon University, Pittsburgh, PA, USA
- affiliation (former): Johns Hopkins University, Baltimore, MD, USA
- affiliation (2012 - 2017): Mitsubishi Electric Research Laboratories, Cambridge, MA, USA
- affiliation (2001 - 2011): NTT Communication Science Laboratories, Kyoto, Japan
- affiliation (PhD 2006): Waseda University, Tokyo, Japan
Other persons with the same name
- Shinji Watanabe 0002 — Kanagawa University, Department of Electrical Engineering, Yokohama, Japan
- Shinji Watanabe 0003 — Osaka Prefecture University, School of Knowledge and Information Systems, Sakai, Japan
- Shinji Watanabe 0004 — Renesas Electronics Corporation, Kawasaki, Japan
- Shinji Watanabe 0005 — Nintendo Co., Ltd., Kyoto, Japan
- Shinji Watanabe 0006 — Gifu National College of Technology, Motosu-gun, Gifu-ken, Japan
- Shinji Watanabe 0007 — University of Miyazaki, Miyazaki, Japan
2020 – today
- 2024
- [j60]Rohit Prabhavalkar, Takaaki Hori, Tara N. Sainath, Ralf Schlüter, Shinji Watanabe:
End-to-End Speech Recognition: A Survey. IEEE ACM Trans. Audio Speech Lang. Process. 32: 325-351 (2024) - [j59]Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari:
Text-Inductive Graphone-Based Language Adaptation for Low-Resource Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1829-1844 (2024) - [j58]Shih-Lun Wu, Chris Donahue, Shinji Watanabe, Nicholas J. Bryan:
Music ControlNet: Multiple Time-Varying Controls for Music Generation. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2692-2703 (2024) - [j57]Shu-Wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee:
A Large-Scale Evaluation of Speech Foundation Models. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2884-2899 (2024) - [c417]Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Yuexian Zou, Zhou Zhao, Shinji Watanabe:
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. AAAI 2024: 23802-23804 - [c416]Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R. Mortensen, Lori S. Levin:
Wav2Gloss: Generating Interlinear Glossed Text from Speech. ACL (1) 2024: 568-582 - [c415]Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification. ACL (1) 2024: 10192-10209 - [c414]Siddhant Arora, Ankita Pasad, Chung-Ming Chien, Jionghao Han, Roshan S. Sharma, Jee-weon Jung, Hira Dhamyal, William Chen, Suwon Shon, Hung-yi Lee, Karen Livescu, Shinji Watanabe:
On the Evaluation of Speech Foundation Models for Spoken Language Understanding. ACL (Findings) 2024: 11923-11938 - [c413]Yichen Lu, Jiaqi Song, Chao-Han Huck Yang, Shinji Watanabe:
FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model. EMNLP (Industry Track) 2024: 440-451 - [c412]William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, Jinchuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe:
Towards Robust Speech Representation Learning for Thousands of Languages. EMNLP 2024: 10205-10224 - [c411]Hang Chen, Shilong Wu, Chenxi Wang, Jun Du, Chin-Hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Jingdong Chen, Odette Scharenborg, Zhong-Qiu Wang, Bao-Cai Yin, Jia Pan:
Summary on the Multimodal Information-Based Speech Processing (MISP) 2023 Challenge. ICASSP Workshops 2024: 123-124 - [c410]Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-Weon Jung, François G. Germain, Jonathan Le Roux, Shinji Watanabe:
Improving Audio Captioning Models with Fine-Grained Audio Features, Text Embedding Supervision, and LLM Mix-Up Augmentation. ICASSP 2024: 316-320 - [c409]Younglo Lee, Shukjae Choi, Byeong-Yeol Kim, Zhong-Qiu Wang, Shinji Watanabe:
Boosting Unknown-Number Speaker Separation with Transformer Decoder-Based Attractor. ICASSP 2024: 446-450 - [c408]Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe:
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation. ICASSP Workshops 2024: 570-574 - [c407]Kwanghee Choi, Jee-Weon Jung, Shinji Watanabe:
Understanding Probe Behaviors Through Variational Bounds of Mutual Information. ICASSP 2024: 5655-5659 - [c406]Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, Yong Man Ro:
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-Training and Multi-Modal Tokens. ICASSP 2024: 7970-7974 - [c405]Salvador Medina, Sarah L. Taylor, Carsten Stoll, Gareth Edwards, Alex Hauptmann, Shinji Watanabe, Iain A. Matthews:
PhISANet: Phonetically Informed Speech Animation Network. ICASSP 2024: 8225-8229 - [c404]Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, Yong Man Ro:
Visual Speech Recognition for Languages with Limited Labeled Data Using Automatic Labels from Whisper. ICASSP 2024: 10471-10475 - [c403]Hayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Hiroaki Ogawa, Siddhant Arora, Shinji Watanabe:
Phoneme-Aware Encoding for Prefix-Tree-Based Contextual ASR. ICASSP 2024: 10641-10645 - [c402]Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe:
Contextualized Automatic Speech Recognition With Attention-Based Bias Phrase Boosted Beam Search. ICASSP 2024: 10896-10900 - [c401]Suwon Shon, Kwangyoun Kim, Prashant Sridhar, Yi-Te Hsu, Shinji Watanabe, Karen Livescu:
Generative Context-Aware Fine-Tuning of Self-Supervised Speech Models. ICASSP 2024: 11156-11160 - [c400]Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-Weon Jung, Yichen Lu, Soumi Maiti, Roshan S. Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, Hsiu-Hsuan Wang:
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study. ICASSP 2024: 11481-11485 - [c399]Siddhant Arora, George Saon, Shinji Watanabe, Brian Kingsbury:
Semi-Autoregressive Streaming ASR with Label Context. ICASSP 2024: 11681-11685 - [c398]Takashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, Shinji Watanabe:
Hubertopic: Enhancing Semantic Representation of Hubert Through Self-Supervision Utilizing Topic Model. ICASSP 2024: 11741-11745 - [c397]Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe, Daniel Povey, Sanjeev Khudanpur:
Less Peaky and More Accurate CTC Forced Alignment by Label Priors. ICASSP 2024: 11831-11835 - [c396]Samuele Cornell, Jee-Weon Jung, Shinji Watanabe, Stefano Squartini:
One Model to Rule Them All ? Towards End-to-End Joint Speaker Diarization and Speech Recognition. ICASSP 2024: 11856-11860 - [c395]Brian Yan, Xuankai Chang, Antonios Anastasopoulos, Yuya Fujita, Shinji Watanabe:
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing. ICASSP 2024: 11941-11945 - [c394]Amir Hussein, Brian Yan, Antonios Anastasopoulos, Shinji Watanabe, Sanjeev Khudanpur:
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization. ICASSP 2024: 11971-11975 - [c393]Amir Hussein, Dorsa Zeinali, Ondrej Klejch, Matthew Wiesner, Brian Yan, Shammur Absar Chowdhury, Ahmed Ali, Shinji Watanabe, Sanjeev Khudanpur:
Speech Collage: Code-Switched Audio Generation by Collaging Monolingual Corpora. ICASSP 2024: 12006-12010 - [c392]Jee-Weon Jung, Roshan S. Sharma, William Chen, Bhiksha Raj, Shinji Watanabe:
AugSumm: Towards Generalizable Speech Summarization Using Synthetic Labels from Large Language Models. ICASSP 2024: 12071-12075 - [c391]Chien-Yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-Yi Lee:
Dynamic-Superb: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark For Speech. ICASSP 2024: 12136-12140 - [c390]Doyeop Kwak, Jaemin Jung, Kihyun Nam, Youngjoon Jang, Jee-Weon Jung, Shinji Watanabe, Joon Son Chung:
VoxMM: Rich Transcription of Conversations in the Wild. ICASSP 2024: 12551-12555 - [c389]William Chen, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe:
Train Long and Test Long: Leveraging Full Document Contexts in Speech Processing. ICASSP 2024: 13066-13070 - [c388]Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-Weon Jung, Xuankai Chang, Shinji Watanabe:
VoxtLM: Unified Decoder-Only Models for Consolidating Speech Recognition, Synthesis and Speech, Text Continuation Tasks. ICASSP 2024: 13326-13330 - [c387]Zhong-Qiu Wang, Anurag Kumar, Shinji Watanabe:
Cross-Talk Reduction. IJCAI 2024: 5171-5180 - [c386]Yuning Wu, Jiatong Shi, Yifeng Yu, Yuxun Tang, Tao Qian, Yueqian Lin, Jionghao Han, Xinyi Bai, Shinji Watanabe, Qin Jin:
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm. ACM Multimedia 2024: 11279-11281 - [c385]Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan S. Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Karen Livescu, Shinji Watanabe:
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions. NAACL-HLT 2024: 2754-2774 - [i329]Jee-weon Jung, Roshan S. Sharma, William Chen, Bhiksha Raj, Shinji Watanabe:
AugSumm: towards generalizable speech summarization using synthetic labels from large language model. CoRR abs/2401.06806 (2024) - [i328]Jiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar, Shinji Watanabe:
Improving ASR Contextual Biasing with Guided Attention. CoRR abs/2401.08835 (2024) - [i327]Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe:
Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search. CoRR abs/2401.10449 (2024) - [i326]Younglo Lee, Shukjae Choi, Byeong-Yeol Kim, Zhong-Qiu Wang, Shinji Watanabe:
Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor. CoRR abs/2401.12473 (2024) - [i325]Wangyou Zhang, Jee-weon Jung, Shinji Watanabe, Yanmin Qian:
Improving Design of Input Condition Invariant Speech Enhancement. CoRR abs/2401.14271 (2024) - [i324]Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe:
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer. CoRR abs/2401.16658 (2024) - [i323]Takaaki Saeki, Soumi Maiti, Shinnosuke Takamichi, Shinji Watanabe, Hiroshi Saruwatari:
SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics. CoRR abs/2401.16812 (2024) - [i322]Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe:
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models. CoRR abs/2401.17230 (2024) - [i321]Jiatong Shi, Yueqian Lin, Xinyi Bai, Keyi Zhang, Yuning Wu, Yuxun Tang, Yifeng Yu, Qin Jin, Shinji Watanabe:
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2. CoRR abs/2401.17619 (2024) - [i320]Yihan Wu, Soumi Maiti, Yifan Peng, Wangyou Zhang, Chenda Li, Yuyue Wang, Xihua Wang, Shinji Watanabe, Ruihua Song:
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition. CoRR abs/2401.18045 (2024) - [i319]Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung, Skyler Seto, Tatiana Likhomanenko, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe, Barry-John Theobald:
Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features? CoRR abs/2402.00340 (2024) - [i318]Muqiao Yang, Xiang Li, Umberto Cappellazzo, Shinji Watanabe, Bhiksha Raj:
Evaluating and Improving Continual Learning in Spoken Language Understanding. CoRR abs/2402.10427 (2024) - [i317]Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification. CoRR abs/2402.12654 (2024) - [i316]Minsu Kim, Jee-weon Jung, Hyeongseop Rha, Soumi Maiti, Siddhant Arora, Xuankai Chang, Shinji Watanabe, Yong Man Ro:
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages. CoRR abs/2402.16021 (2024) - [i315]Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel R. Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R. Mortensen, Lori S. Levin:
Wav2Gloss: Generating Interlinear Glossed Text from Speech. CoRR abs/2403.13169 (2024) - [i314]Shu-Wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee:
A Large-Scale Evaluation of Speech Foundation Models. CoRR abs/2404.09385 (2024) - [i313]Yui Sudo, Yosuke Fukumoto, Muhammad Shakeel, Yifan Peng, Shinji Watanabe:
Contextualized Automatic Speech Recognition with Dynamic Vocabulary. CoRR abs/2405.13344 (2024) - [i312]Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe:
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation. CoRR abs/2405.13514 (2024) - [i311]Zhong-Qiu Wang, Anurag Kumar, Shinji Watanabe:
Cross-Talk Reduction. CoRR abs/2405.20402 (2024) - [i310]Xinjian Li, Shinnosuke Takamichi, Takaaki Saeki, William Chen, Sayaka Shiota, Shinji Watanabe:
YODAS: Youtube-Oriented Dataset for Audio and Speech. CoRR abs/2406.00899 (2024) - [i309]Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe, Daniel Povey, Sanjeev Khudanpur:
Less Peaky and More Accurate CTC Forced Alignment by Label Priors. CoRR abs/2406.02560 (2024) - [i308]Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Brian Yan, Jiatong Shi, Yifan Peng, Shinji Watanabe:
4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders. CoRR abs/2406.02950 (2024) - [i307]Wangyou Zhang, Kohei Saijo, Jee-weon Jung, Chenda Li, Shinji Watanabe, Yanmin Qian:
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement. CoRR abs/2406.04269 (2024) - [i306]Wangyou Zhang, Robin Scheibler, Kohei Saijo, Samuele Cornell, Chenda Li, Zhaoheng Ni, Anurag Kumar, Jan Pirklbauer, Marvin Sach, Shinji Watanabe, Tim Fingscheidt, Yanmin Qian:
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement. CoRR abs/2406.04660 (2024) - [i305]Jee-weon Jung, Xin Wang, Nicholas W. D. Evans, Shinji Watanabe, Hye-jin Shim, Hemlata Tak, Sidhhant Arora, Junichi Yamagishi, Joon Son Chung:
To what extent can ASV systems naturally defend against spoofing attacks? CoRR abs/2406.05339 (2024) - [i304]Julius Richter, Yi-Chiao Wu, Steven Krenn, Simon Welker, Bunlong Lay, Shinji Watanabe, Alexander Richard, Timo Gerkmann:
EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation. CoRR abs/2406.06185 (2024) - [i303]Xuankai Chang, Jiatong Shi, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin Jin:
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units. CoRR abs/2406.07725 (2024) - [i302]Yoshiaki Bando, Tomohiko Nakamura, Shinji Watanabe:
Neural Blind Source Separation and Diarization for Distant Speech Recognition. CoRR abs/2406.08396 (2024) - [i301]Kwanghee Choi, Ankita Pasad, Tomohiko Nakamura, Satoru Fukayama, Karen Livescu, Shinji Watanabe:
Self-Supervised Speech Representations are More Phonetic than Semantic. CoRR abs/2406.08619 (2024) - [i300]Jiatong Shi, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, Jinchuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets. CoRR abs/2406.08641 (2024) - [i299]Yifeng Yu, Jiatong Shi, Yuning Wu, Shinji Watanabe:
VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation. CoRR abs/2406.08761 (2024) - [i298]Jinchuan Tian, Yifan Peng, William Chen, Kwanghee Choi, Karen Livescu, Shinji Watanabe:
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models. CoRR abs/2406.09282 (2024) - [i297]Suwon Shon, Kwangyoun Kim, Yi-Te Hsu, Prashant Sridhar, Shinji Watanabe, Karen Livescu:
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding. CoRR abs/2406.09345 (2024) - [i296]Jiatong Shi, Xutai Ma, Hirofumi Inaguma, Anna Y. Sun, Shinji Watanabe:
MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model. CoRR abs/2406.09869 (2024) - [i295]Siddhant Arora, Ankita Pasad, Chung-Ming Chien, Jionghao Han, Roshan S. Sharma, Jee-weon Jung, Hira Dhamyal, William Chen, Suwon Shon, Hung-yi Lee, Karen Livescu, Shinji Watanabe:
On the Evaluation of Speech Foundation Models for Spoken Language Understanding. CoRR abs/2406.10083 (2024) - [i294]Hayato Futami, Siddhant Arora, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model. CoRR abs/2406.12317 (2024) - [i293]Yosuke Kashiwagi, Hayato Futami, Emiru Tsunoo, Siddhant Arora, Shinji Watanabe:
Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting. CoRR abs/2406.12611 (2024) - [i292]Chenda Li, Samuele Cornell, Shinji Watanabe, Yanmin Qian:
Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement. CoRR abs/2406.13471 (2024) - [i291]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe:
Decoder-only Architecture for Streaming End-to-end Speech Recognition. CoRR abs/2406.16107 (2024) - [i290]Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe:
Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss. CoRR abs/2406.16120 (2024) - [i289]Hye-jin Shim, Md. Sahidullah, Jee-weon Jung, Shinji Watanabe, Tomi Kinnunen:
Beyond Silence: Bias Analysis through Loss and Asymmetric Approach in Audio Anti-Spoofing. CoRR abs/2406.17246 (2024) - [i288]William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, Jinchuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe:
Towards Robust Speech Representation Learning for Thousands of Languages. CoRR abs/2407.00837 (2024) - [i287]Darshan Prabhu, Yifan Peng, Preethi Jyothi, Shinji Watanabe:
Multi-Convformer: Extending Conformer with Multiple Convolution Kernels. CoRR abs/2407.03718 (2024) - [i286]Samuele Cornell, Taejin Park, Steve Huang, Christoph Böddeker, Xuankai Chang, Matthew Maciejewski, Matthew Wiesner, Paola García, Shinji Watanabe:
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization. CoRR abs/2407.16447 (2024) - [i285]Yichen Lu, Jiaqi Song, Xuankai Chang, Hengwei Bian, Soumi Maiti, Shinji Watanabe:
SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data. CoRR abs/2408.00624 (2024) - [i284]Xi Xu, Siqi Ouyang, Brian Yan, Patrick Fernandes, William Chen, Lei Li, Graham Neubig, Shinji Watanabe:
CMU's IWSLT 2024 Simultaneous Speech Translation System. CoRR abs/2408.07452 (2024) - [i283]Samuele Cornell, Jordan Darefsky, Zhiyao Duan, Shinji Watanabe:
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition. CoRR abs/2408.09215 (2024) - [i282]Yuning Wu, Jiatong Shi, Yifeng Yu, Yuxun Tang, Tao Qian, Yueqian Lin, Jionghao Han, Xinyi Bai, Shinji Watanabe, Qin Jin:
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm. CoRR abs/2409.07226 (2024) - [i281]Jee-weon Jung, Wangyou Zhang, Soumi Maiti, Yihan Wu, Xin Wang, Ji-Hoon Kim, Yuta Matsunaga, Seyun Um, Jinchuan Tian, Hye-jin Shim, Nicholas W. D. Evans, Joon Son Chung, Shinnosuke Takamichi, Shinji Watanabe:
Text-To-Speech Synthesis In The Wild. CoRR abs/2409.08711 (2024) - [i280]Masao Someki, Kwanghee Choi, Siddhant Arora, William Chen, Samuele Cornell, Jionghao Han, Yifan Peng, Jiatong Shi, Vaibhav Srivastav, Shinji Watanabe:
ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration. CoRR abs/2409.09506 (2024) - [i279]Chao-Han Huck Yang, Taejin Park, Yuan Gong, Yuanchao Li, Zhehuai Chen, Yen-Ting Lin, Chen Chen, Yuchen Hu, Kunal Dhawan, Piotr Zelasko, Chao Zhang, Yun-Nung Chen, Yu Tsao, Jagadeesh Balam, Boris Ginsburg, Sabato Marco Siniscalchi, Eng Siong Chng, Peter Bell, Catherine Lai, Shinji Watanabe, Andreas Stolcke:
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition. CoRR abs/2409.09785 (2024) - [i278]Li-Wei Chen, Takuya Higuchi, He Bai, Ahmed Hussen Abdelaziz, Alexander Rudnicky, Shinji Watanabe, Tatiana Likhomanenko, Barry-John Theobald, Zakaria Aldeneh:
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models. CoRR abs/2409.10788 (2024) - [i277]Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung, Li-Wei Chen, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe, Tatiana Likhomanenko, Barry-John Theobald:
Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels. CoRR abs/2409.10791 (2024) - [i276]Yao-Fei Cheng, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Wen Shen Teo, Siddhant Arora, Shinji Watanabe:
Task Arithmetic for Language Expansion in Speech Translation. CoRR abs/2409.11274 (2024) - [i275]Yihan Wu, Yifan Peng, Yichen Lu, Xuankai Chang, Ruihua Song, Shinji Watanabe:
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts. CoRR abs/2409.12370 (2024) - [i274]Jinchuan Tian, Chunlei Zhang, Jiatong Shi, Hao Zhang, Jianwei Yu, Shinji Watanabe, Dong Yu:
Preference Alignment Improves Language Model-Based TTS. CoRR abs/2409.12403 (2024) - [i273]Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Jiawei Du, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James R. Glass, Shinji Watanabe, Hung-yi Lee:
Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models. CoRR abs/2409.14085 (2024) - [i272]Yosuke Kashiwagi, Hayato Futami, Emiru Tsunoo, Siddhant Arora, Shinji Watanabe:
Hypothesis Clustering and Merging: Novel MultiTalker Speech Recognition with Speaker Tokens. CoRR abs/2409.15732 (2024) - [i271]Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe:
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech. CoRR abs/2409.15897 (2024) - [i270]Jee-weon Jung, Yihan Wu, Xin Wang, Ji-Hoon Kim, Soumi Maiti, Yuta Matsunaga, Hye-jin Shim, Jinchuan Tian, Nicholas W. D. Evans, Joon Son Chung, Wangyou Zhang, Seyun Um, Shinnosuke Takamichi, Shinji Watanabe:
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild. CoRR abs/2409.17285 (2024) - [i269]Brian Yan, Vineel Pratap, Shinji Watanabe, Michael Auli:
Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking. CoRR abs/2409.18428 (2024) - [i268]Yichen Lu, Jiaqi Song, Chao-Han Huck Yang, Shinji Watanabe:
FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model. CoRR abs/2410.03007 (2024) - [i267]Yifan Peng, Krishna C. Puvvada, Zhehuai Chen, Piotr Zelasko, He Huang, Kunal Dhawan, Ke Hu, Shinji Watanabe, Jagadeesh Balam, Boris Ginsburg:
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning. CoRR abs/2410.17485 (2024)
- 2023
- [j56]Matthew Maciejewski, Jing Shi, Shinji Watanabe, Sanjeev Khudanpur:
A dilemma of ground truth in noisy speech separation and an approach to lessen the impact of imperfect training data. Comput. Speech Lang. 77: 101410 (2023) - [j55]Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe:
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing. J. Open Source Softw. 8(91): 5403 (2023) - [j54]Zhong-Qiu Wang, Gordon Wichern, Shinji Watanabe, Jonathan Le Roux:
STFT-Domain Neural Speech Enhancement With Very Low Algorithmic Latency. IEEE ACM Trans. Audio Speech Lang. Process. 31: 397-410 (2023) - [j53]Shota Horiguchi, Shinji Watanabe, Paola García, Yuki Takashima, Yohei Kawaguchi:
Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors. IEEE ACM Trans. Audio Speech Lang. Process. 31: 706-720 (2023) - [j52]Yen-Ju Lu, Chia-Yu Chang, Cheng Yu, Ching-Feng Liu, Jeih-weih Hung, Shinji Watanabe, Yu Tsao:
Improving Speech Enhancement Performance by Leveraging Contextual Broad Phonetic Class Information. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2738-2750 (2023) - [j51]Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed:
LegoNN: Building Modular Encoder-Decoder Models. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3112-3126 (2023) - [j50]Zhong-Qiu Wang, Samuele Cornell, Shukjae Choi, Younglo Lee, Byeong-Yeol Kim, Shinji Watanabe:
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3221-3236 (2023) - [c384]Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky:
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech. AAAI 2023: 12644-12652 - [c383]Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polak, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe:
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit. ACL (demo) 2023: 400-411 - [c382]Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan S. Sharma, Wei-Lun Wu, Hung-yi Lee, Karen Livescu, Shinji Watanabe:
SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks. ACL (1) 2023: 8906-8937 - [c381]Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino:
UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units. ACL (1) 2023: 15655-15680 - [c380]Tuan Vu Ho, Shota Horiguchi, Shinji Watanabe, Paola García, Takashi Sumiyoshi:
Synthetic Data Augmentation for ASR with Domain Filtering. APSIPA ASC 2023: 1760-1765 - [c379]William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, Shinji Watanabe:
Joint Prediction and Denoising for Large-Scale Multilingual Self-Supervised Learning. ASRU 2023: 1-8 - [c378]Yuya Fujita, Shinji Watanabe, Xuankai Chang, Takashi Maekaku:
LV-CTC: Non-Autoregressive ASR With CTC and Latent Variable Models. ASRU 2023: 1-6 - [c377]Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao:
TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for PyTorch. ASRU 2023: 1-9 - [c376]Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Kohei Matsuura, Takanori Ashihara, William Chen, Shinji Watanabe:
Summarize While Translating: Universal Model With Parallel Decoding for Summarization and Translation. ASRU 2023: 1-8 - [c375]Xinjian Li, Shinnosuke Takamichi, Takaaki Saeki, William Chen, Sayaka Shiota, Shinji Watanabe:
Yodas: Youtube-Oriented Dataset for Audio and Speech. ASRU 2023: 1-8 - [c374]Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-Weon Jung, Soumi Maiti, Shinji Watanabe:
Reproducing Whisper-Style Training Using An Open-Source Toolkit And Publicly Available Data. ASRU 2023: 1-8 - [c373]Kohei Saijo, Wangyou Zhang, Zhong-Qiu Wang, Shinji Watanabe, Tetsunori Kobayashi, Tetsuji Ogawa:
A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, And Extraction. ASRU 2023: 1-6 - [c372]Roshan S. Sharma, William Chen, Takatomo Kano, Ruchira Sharma, Siddhant Arora, Shinji Watanabe, Atsunori Ogawa, Marc Delcroix, Rita Singh, Bhiksha Raj:
Espnet-Summ: Introducing a Novel Large Dataset, Toolkit, and a Cross-Corpora Evaluation of Speech Summarization Systems. ASRU 2023: 1-8 - [c371]Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Shinji Watanabe:
Findings of the 2023 ML-Superb Challenge: Pre-Training And Evaluation Over More Languages And Beyond. ASRU 2023: 1-8 - [c370]Yusuke Shinohara, Shinji Watanabe:
Domain Adaptation by Data Distribution Matching Via Submodularity For Speech Recognition. ASRU 2023: 1-7 - [c369]Masao Someki, Nicholas Eng, Yosuke Higuchi, Shinji Watanabe:
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference. ASRU 2023: 1-8 - [c368]Wangyou Zhang, Kohei Saijo, Zhong-Qiu Wang, Shinji Watanabe, Yanmin Qian:
Toward Universal Speech Enhancement For Diverse Input Conditions. ASRU 2023: 1-6 - [c367]Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W. Black, Shinji Watanabe:
CTC Alignments Improve Autoregressive Translation. EACL 2023: 1615-1631 - [c366]Siddhant Arora, Hayato Futami, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History. ICASSP 2023: 1-5 - [c365]Siddhant Arora, Hayato Futami, Shih-Lun Wu, Jessica Huynh, Yifan Peng, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
A Study on the Integration of Pipeline and E2E SLU Systems for Spoken Semantic Parsing Toward Stop Quality Challenge. ICASSP 2023: 1-2 - [c364]Dan Berrebbi, Brian Yan, Shinji Watanabe:
Avoid Overthinking in Self-Supervised Models for Speech Recognition. ICASSP 2023: 1-5 - [c363]Hang Chen, Shilong Wu, Yusheng Dai, Zhe Wang, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Bao-Cai Yin, Jia Pan, Jianqing Gao, Cong Liu:
Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge. ICASSP 2023: 1-2 - [c362]Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky:
A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Units. ICASSP 2023: 1-5 - [c361]William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe:
Improving Massively Multilingual ASR with Auxiliary CTC Objectives. ICASSP 2023: 1-5 - [c360]Samuele Cornell, Zhong-Qiu Wang, Yoshiki Masuyama, Shinji Watanabe, Manuel Pariente, Nobutaka Ono, Stefano Squartini:
Multi-Channel Speaker Extraction with Adversarial Training: The Wavlab Submission to The Clarity ICASSP 2023 Grand Challenge. ICASSP 2023: 1-2 - [c359]Hayato Futami, Jessica Huynh, Siddhant Arora, Shih-Lun Wu, Yosuke Kashiwagi, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe:
The Pipeline System of ASR and NLU with MLM-based data Augmentation Toward Stop Low-Resource Challenge. ICASSP 2023: 1-2 - [c358]Hayato Futami, Emiru Tsunoo, Kentaro Shibata, Yosuke Kashiwagi, Takao Okuda, Siddhant Arora, Shinji Watanabe:
Streaming Joint Speech Recognition and Disfluency Detection. ICASSP 2023: 1-5 - [c357]Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola García, Hung-Yi Lee, Shinji Watanabe, Sanjeev Khudanpur:
Euro: Espnet Unsupervised ASR Open-Source Toolkit. ICASSP 2023: 1-5 - [c356]Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe:
Intermpl: Momentum Pseudo-Labeling With Intermediate CTC Loss. ICASSP 2023: 1-5 - [c355]Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe:
BECTRA: Transducer-Based End-To-End ASR with Bert-Enhanced Encoder. ICASSP 2023: 1-5 - [c354]Junwei Huang, Karthik Ganesan, Soumi Maiti, Young Min Kim, Xuankai Chang, Paul Liang, Shinji Watanabe:
FindAdaptNet: Find and Insert Adapters by Learned Layer Importance. ICASSP 2023: 1-5 - [c353]Jee-Weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesung Huh, Andrew Brown, Youngki Kwon, Shinji Watanabe, Joon Son Chung:
In Search of Strong Embedding Extractors for Speaker Diarisation. ICASSP 2023: 1-5 - [c352]Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Roshan S. Sharma, Kohei Matsuura, Shinji Watanabe:
Speech Summarization of Long Spoken Document: Improving Memory Efficiency of Speech/Text Encoders. ICASSP 2023: 1-5 - [c351]Yosuke Kashiwagi, Siddhant Arora, Hayato Futami, Jessica Huynh, Shih-Lun Wu, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe:
E-Branchformer-Based E2E SLU Toward Stop on-Device Challenge. ICASSP 2023: 1-2 - [c350]Jiachen Lian, Alan W. Black, Yijing Lu, Louis Goldstein, Shinji Watanabe, Gopala Krishna Anumanchipalli:
Articulatory Representation Learning via Joint Factor Analysis and Neural Matrix Factorization. ICASSP 2023: 1-5 - [c349]Takashi Maekaku, Yuya Fujita, Xuankai Chang, Shinji Watanabe:
Fully Unsupervised Topic Clustering of Unlabelled Spoken Audio Using Self-Supervised Representation Learning and Topic Model. ICASSP 2023: 1-5 - [c348]Soumi Maiti, Yifan Peng, Takaaki Saeki, Shinji Watanabe:
Speechlmscore: Evaluating Speech Generation Using Speech Language Model. ICASSP 2023: 1-5 - [c347]Motoi Omachi, Brian Yan, Siddharth Dalmia, Yuya Fujita, Shinji Watanabe:
Align, Write, Re-Order: Explainable End-to-End Speech Translation via Operation Sequence Generation. ICASSP 2023: 1-5 - [c346]Yifan Peng, Kwangyoun Kim, Felix Wu, Prashant Sridhar, Shinji Watanabe:
Structured Pruning of Self-Supervised Pre-Trained Models for Speech Recognition and Understanding. ICASSP 2023: 1-5 - [c345]Yifan Peng, Jaesong Lee, Shinji Watanabe:
I3D: Transformer Architectures with Input-Dependent Dynamic Depth for Speech Recognition. ICASSP 2023: 1-5 - [c344]Jiatong Shi, Chan-Jan Hsu, Ho-Lam Chung, Dongji Gao, Paola García, Shinji Watanabe, Ann Lee, Hung-Yi Lee:
Bridging Speech and Textual Pre-Trained Models With Unsupervised ASR. ICASSP 2023: 1-5 - [c343]Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe:
Enhancing Speech-To-Speech Translation with Multiple TTS Targets. ICASSP 2023: 1-5 - [c342]Suwon Shon, Felix Wu, Kwangyoun Kim, Prashant Sridhar, Karen Livescu, Shinji Watanabe:
Context-Aware Fine-Tuning of Self-Supervised Speech Models. ICASSP 2023: 1-5 - [c341]Zhong-Qiu Wang, Samuele Cornell, Shukjae Choi, Younglo Lee, Byeong-Yeol Kim, Shinji Watanabe:
TF-GRIDNET: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation. ICASSP 2023: 1-5 - [c340]Zhong-Qiu Wang, Samuele Cornell, Shukjae Choi, Younglo Lee, Byeong-Yeol Kim, Shinji Watanabe:
Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling. ICASSP 2023: 1-5 - [c339]Zhe Wang, Shilong Wu, Hang Chen, Mao-Kui He, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Baocai Yin, Jia Pan, Jianqing Gao, Cong Liu:
The Multimodal Information Based Speech Processing (Misp) 2022 Challenge: Audio-Visual Diarization And Recognition. ICASSP 2023: 1-5 - [c338]Peter Wu, Li-Wei Chen, Cheol Jun Cho, Shinji Watanabe, Louis Goldstein, Alan W. Black, Gopala Krishna Anumanchipalli:
Speaker-Independent Acoustic-to-Articulatory Speech Inversion. ICASSP 2023: 1-5 - [c337]Felix Wu, Kwangyoun Kim, Shinji Watanabe, Kyu Jeong Han, Ryan McDonald, Kilian Q. Weinberger, Yoav Artzi:
Wav2Seq: Pre-Training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages. ICASSP 2023: 1-5 - [c336]Hainan Xu, Fei Jia, Somshubra Majumdar, Shinji Watanabe, Boris Ginsburg:
Multi-Blank Transducers for Speech Recognition. ICASSP 2023: 1-5 - [c335]Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe:
Towards Zero-Shot Code-Switched Speech Recognition. ICASSP 2023: 1-5 - [c334]Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
Paaploss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement. ICASSP 2023: 1-5 - [c333]Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement. ICASSP 2023: 1-5 - [c332]Jinchuan Tian, Brian Yan, Jianwei Yu, Chao Weng, Dong Yu, Shinji Watanabe:
Bayes Risk CTC: Controllable CTC Alignment in Sequence-to-Sequence Tasks. ICLR 2023 - [c331]Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg:
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations. ICML 2023: 38462-38484 - [c330]Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari:
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining. IJCAI 2023: 5179-5187 - [c329]Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models. INTERSPEECH 2023: 62-66 - [c328]Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath:
Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization. INTERSPEECH 2023: 396-400 - [c327]Yosuke Kashiwagi, Siddhant Arora, Hayato Futami, Jessica Huynh, Shih-Lun Wu, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe:
Tensor decomposition for minimization of E2E SLU model toward on-device processing. INTERSPEECH 2023: 710-714 - [c326]Siddhant Arora, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
Integrating Pretrained ASR and LM to Perform Sequence Generation for Spoken Language Understanding. INTERSPEECH 2023: 720-724 - [c325]Jiatong Shi, Dan Berrebbi, William Chen, En-Pei Hu, Wei-Ping Huang, Ho-Lam Chung, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark. INTERSPEECH 2023: 884-888 - [c324]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe:
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition. INTERSPEECH 2023: 1369-1373 - [c323]Xuankai Chang, Brian Yan, Yuya Fujita, Takashi Maekaku, Shinji Watanabe:
Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning. INTERSPEECH 2023: 1399-1403 - [c322]Roshan Sharma, Siddhant Arora, Kenneth Zheng, Shinji Watanabe, Rita Singh, Bhiksha Raj:
BASS: Block-wise Adaptation for Speech Summarization. INTERSPEECH 2023: 1454-1458 - [c321]Jiyang Tang, William Chen, Xuankai Chang, Shinji Watanabe, Brian MacWhinney:
A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning. INTERSPEECH 2023: 1528-1532 - [c320]Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe:
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks. INTERSPEECH 2023: 2208-2212 - [c319]Jiatong Shi, Yun Tang, Hirofumi Inaguma, Hongyu Gong, Juan Pino, Shinji Watanabe:
Exploration on HuBERT with Multiple Resolution. INTERSPEECH 2023: 3287-3291 - [c318]Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe:
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders. INTERSPEECH 2023: 3312-3316 - [c317]Peter Polák, Brian Yan, Shinji Watanabe, Alex Waibel, Ondrej Bojar:
Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff. INTERSPEECH 2023: 3979-3983 - [c316]William Chen, Xuankai Chang, Yifan Peng, Zhaoheng Ni, Soumi Maiti, Shinji Watanabe:
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute. INTERSPEECH 2023: 4404-4408 - [c315]Yui Sudo, Muhammad Shakeel, Yifan Peng, Shinji Watanabe:
Time-synchronous one-pass Beam Search for Parallel Online and Offline Transducers with Dynamic Block Training. INTERSPEECH 2023: 4479-4483 - [c314]Jinchuan Tian, Jianwei Yu, Hangting Chen, Brian Yan, Chao Weng, Dong Yu, Shinji Watanabe:
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction. INTERSPEECH 2023: 4968-4972 - [c313]Peter Wu, Tingle Li, Yijing Lu, Yubin Zhang, Jiachen Lian, Alan W. Black, Louis Goldstein, Shinji Watanabe, Gopala Krishna Anumanchipalli:
Deep Speech Synthesis from MRI-Based Articulatory Representations. INTERSPEECH 2023: 5132-5136 - [c312]Sweta Agrawal, Antonios Anastasopoulos, Luisa Bentivogli, Ondrej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, Mingda Chen, William Chen, Khalid Choukri, Alexandra Chronopoulou, Anna Currey, Thierry Declerck, Qianqian Dong, Kevin Duh, Yannick Estève, Marcello Federico, Souhir Gahbiche, Barry Haddow, Benjamin Hsu, Phu Mon Htut, Hirofumi Inaguma, Dávid Javorský, John Judge, Yasumasa Kano, Tom Ko, Rishu Kumar, Pengwei Li, Xutai Ma, Prashant Mathur, Evgeny Matusov, Paul McNamee, John P. McCrae, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Ha Nguyen, Jan Niehues, Xing Niu, Atul Kr. Ojha, John E. Ortega, Proyag Pal, Juan Pino, Lonneke van der Plas, Peter Polák, Elijah Rippeth, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Yun Tang, Brian Thompson, Kevin Tran, Marco Turchi, Alex Waibel, Mingxuan Wang, Shinji Watanabe, Rodolfo Zevallos:
Findings of the IWSLT 2023 Evaluation Campaign. IWSLT@ACL 2023: 1-61 - [c311]Brian Yan, Jiatong Shi, Soumi Maiti, William Chen, Xinjian Li, Yifan Peng, Siddhant Arora, Shinji Watanabe:
CMU's IWSLT 2023 Simultaneous Speech Translation System. IWSLT@ACL 2023: 235-240 - [c310]Zhong-Qiu Wang, Shinji Watanabe:
UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures. NeurIPS 2023 - [c309]Taiqi He, Lindia Tjuatja, Nathaniel R. Robinson, Shinji Watanabe, David R. Mortensen, Graham Neubig, Lori S. Levin:
SigMoreFun Submission to the SIGMORPHON Shared Task on Interlinear Glossing. SIGMORPHON 2023: 209-216 - [c308]Georgios Karakasidis, Nathaniel R. Robinson, Yaroslav Getman, Atieno Ogayo, Ragheb Al-Ghezi, Ananya Ayasi, Shinji Watanabe, David R. Mortensen, Mikko Kurimo:
Multilingual TTS Accent Impressions for Accented ASR. TSD 2023: 317-327 - [c307]Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe:
Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation. WASPAA 2023: 1-5 - [d1]Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe:
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing (espnet-v.202310). Zenodo, 2023 - [i266]Massa Baali, Tomoki Hayashi, Hamdy Mubarak, Soumi Maiti, Shinji Watanabe, Wassim El-Hajj, Ahmed Ali:
Unsupervised Data Selection for TTS: Using Arabic Broadcast News as a Case Study. CoRR abs/2301.09099 (2023) - [i265]Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari:
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining. CoRR abs/2301.12596 (2023) - [i264]Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky:
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech. CoRR abs/2302.04215 (2023) - [i263]Peter Wu, Li-Wei Chen, Cheol Jun Cho, Shinji Watanabe, Louis Goldstein, Alan W. Black, Gopala Krishna Anumanchipalli:
Speaker-Independent Acoustic-to-Articulatory Speech Inversion. CoRR abs/2302.06774 (2023) - [i262]Samuele Cornell, Zhong-Qiu Wang, Yoshiki Masuyama, Shinji Watanabe, Manuel Pariente, Nobutaka Ono:
Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge. CoRR abs/2302.07928 (2023) - [i261]Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement. CoRR abs/2302.08088 (2023) - [i260]Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement. CoRR abs/2302.08095 (2023) - [i259]William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe:
Improving Massively Multilingual ASR With Auxiliary CTC Objectives. CoRR abs/2302.12829 (2023) - [i258]Yifan Peng, Kwangyoun Kim, Felix Wu, Prashant Sridhar, Shinji Watanabe:
Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding. CoRR abs/2302.14132 (2023) - [i257]Rohit Prabhavalkar, Takaaki Hori, Tara N. Sainath, Ralf Schlüter, Shinji Watanabe:
End-to-End Speech Recognition: A Survey. CoRR abs/2303.03329 (2023) - [i256]Zhe Wang, Shilong Wu, Hang Chen, Mao-Kui He, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Baocai Yin, Jia Pan, Jianqing Gao, Cong Liu:
The Multimodal Information based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition. CoRR abs/2303.06326 (2023) - [i255]Yifan Peng, Jaesong Lee, Shinji Watanabe:
I3D: Transformer architectures with input-dependent dynamic depth for speech recognition. CoRR abs/2303.07624 (2023) - [i254]Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe:
Enhancing Speech-to-Speech Translation with Multiple TTS Targets. CoRR abs/2304.04618 (2023) - [i253]Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg:
Efficient Sequence Transduction by Jointly Predicting Tokens and Durations. CoRR abs/2304.06795 (2023) - [i252]Zhong-Qiu Wang, Samuele Cornell, Shukjae Choi, Younglo Lee, Byeong-Yeol Kim, Shinji Watanabe:
Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling. CoRR abs/2304.08707 (2023) - [i251]Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, Shinji Watanabe:
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. CoRR abs/2304.12995 (2023) - [i250]Siddhant Arora, Hayato Futami, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History. CoRR abs/2305.00926 (2023) - [i249]Hayato Futami, Jessica Huynh, Siddhant Arora, Shih-Lun Wu, Yosuke Kashiwagi, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe:
The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge. CoRR abs/2305.01194 (2023) - [i248]Siddhant Arora, Hayato Futami, Shih-Lun Wu, Jessica Huynh, Yifan Peng, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge. CoRR abs/2305.01620 (2023) - [i247]Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, Chen-An Li, Tsu-Yuan Hsu, Shinji Watanabe, Hung-Yi Lee:
Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation. CoRR abs/2305.07455 (2023) - [i246]Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei-Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark. CoRR abs/2305.10615 (2023) - [i245]Yifan Peng, Kwangyoun Kim, Felix Wu, Brian Yan, Siddhant Arora, William Chen, Jiyang Tang, Suwon Shon, Prashant Sridhar, Shinji Watanabe:
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks. CoRR abs/2305.11073 (2023) - [i244]Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath:
Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization. CoRR abs/2305.11095 (2023) - [i243]Jiyang Tang, William Chen, Xuankai Chang, Shinji Watanabe, Brian MacWhinney:
A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning. CoRR abs/2305.13331 (2023) - [i242]Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models. CoRR abs/2305.17651 (2023) - [i241]Xuankai Chang, Brian Yan, Yuya Fujita, Takashi Maekaku, Shinji Watanabe:
Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning. CoRR abs/2305.18108 (2023) - [i240]Zhong-Qiu Wang, Shinji Watanabe:
UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures. CoRR abs/2305.20054 (2023) - [i239]Jiatong Shi, Yun Tang, Hirofumi Inaguma, Hongyu Gong, Juan Pino, Shinji Watanabe:
Exploration on HuBERT with Multiple Resolutions. CoRR abs/2306.01084 (2023) - [i238]William Chen, Xuankai Chang, Yifan Peng, Zhaoheng Ni, Soumi Maiti, Shinji Watanabe:
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute. CoRR abs/2306.06672 (2023) - [i237]Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola García, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur:
The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios. CoRR abs/2306.13734 (2023) - [i236]Roshan S. Sharma, Kenneth Zheng, Siddhant Arora, Shinji Watanabe, Rita Singh, Bhiksha Raj:
BASS: Block-wise Adaptation for Speech Summarization. CoRR abs/2307.08217 (2023) - [i235]Siddhant Arora, Hayato Futami, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
Integrating Pretrained ASR and LM to Perform Sequence Generation for Spoken Language Understanding. CoRR abs/2307.11005 (2023) - [i234]Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe:
Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation. CoRR abs/2307.12231 (2023) - [i233]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe:
Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition. CoRR abs/2307.12767 (2023) - [i232]Jinchuan Tian, Jianwei Yu, Hangting Chen, Brian Yan, Chao Weng, Dong Yu, Shinji Watanabe:
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction. CoRR abs/2308.10107 (2023) - [i231]Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-weon Jung, Xuankai Chang, Shinji Watanabe:
Voxtlm: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks. CoRR abs/2309.07937 (2023) - [i230]Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, Jianqing Gao:
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction. CoRR abs/2309.08348 (2023) - [i229]Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, Yong Man Ro:
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens. CoRR abs/2309.08531 (2023) - [i228]Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, Yong Man Ro:
Visual Speech Recognition for Low-resource Languages with Automatic Labels From Whisper Model. CoRR abs/2309.08535 (2023) - [i227]Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe:
Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation. CoRR abs/2309.08876 (2023) - [i226]Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-yi Lee:
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech. CoRR abs/2309.09510 (2023) - [i225]Yuan Tseng, Layne Berry, Yi-Ting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Po-Yao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Shinji Watanabe, Abdelrahman Mohamed, Chi-Luen Feng, Hung-yi Lee:
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models. CoRR abs/2309.10787 (2023) - [i224]Siddhant Arora, George Saon, Shinji Watanabe, Brian Kingsbury:
Semi-Autoregressive Streaming ASR With Label Context. CoRR abs/2309.10926 (2023) - [i223]Peter Polák, Brian Yan, Shinji Watanabe, Alex Waibel, Ondrej Bojar:
Incremental Blockwise Beam Search for Simultaneous Speech Translation with Controllable Quality-Latency Tradeoff. CoRR abs/2309.11379 (2023) - [i222]Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe:
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data. CoRR abs/2309.13876 (2023) - [i221]Masao Someki, Nicholas Eng, Yosuke Higuchi, Shinji Watanabe:
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference. CoRR abs/2309.14922 (2023) - [i220]William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, Shinji Watanabe:
Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning. CoRR abs/2309.15317 (2023) - [i219]Amir Hussein, Dorsa Zeinali, Ondrej Klejch, Matthew Wiesner, Brian Yan, Shammur Absar Chowdhury, Ahmed M. Ali, Shinji Watanabe, Sanjeev Khudanpur:
Speech collage: code-switched audio generation by collaging monolingual corpora. CoRR abs/2309.15674 (2023) - [i218]Amir Hussein, Brian Yan, Antonios Anastasopoulos, Shinji Watanabe, Sanjeev Khudanpur:
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization. CoRR abs/2309.15686 (2023) - [i217]Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-Weon Jung, Yichen Lu, Soumi Maiti, Roshan S. Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, Hsiu-Hsuan Wang:
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study. CoRR abs/2309.15800 (2023) - [i216]Brian Yan, Xuankai Chang, Antonios Anastasopoulos, Yuya Fujita, Shinji Watanabe:
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing. CoRR abs/2309.15826 (2023) - [i215]Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François G. Germain, Jonathan Le Roux, Shinji Watanabe:
Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation. CoRR abs/2309.17352 (2023) - [i214]Wangyou Zhang, Kohei Saijo, Zhong-Qiu Wang, Shinji Watanabe, Yanmin Qian:
Toward Universal Speech Enhancement for Diverse Input Conditions. CoRR abs/2309.17384 (2023) - [i213]Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng:
UniAudio: An Audio Foundation Model Toward Universal Audio Generation. CoRR abs/2310.00704 (2023) - [i212]Samuele Cornell, Jee-weon Jung, Shinji Watanabe, Stefano Squartini:
One model to rule them all? Towards End-to-End Joint Speaker Diarization and Speech Recognition. CoRR abs/2310.01688 (2023) - [i211]Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan S. Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe:
UniverSLU: Universal Spoken Language Understanding for Diverse Classification and Sequence Generation Tasks with a Single Network. CoRR abs/2310.02973 (2023) - [i210]Tejes Srivastava, Jiatong Shi, William Chen, Shinji Watanabe:
EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Multilingual and Low Resource Scenarios. CoRR abs/2310.03938 (2023) - [i209]Takashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, Shinji Watanabe:
HuBERTopic: Enhancing Semantic Representation of HuBERT through Self-supervision Utilizing Topic Model. CoRR abs/2310.03975 (2023) - [i208]Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chung, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe:
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond. CoRR abs/2310.05513 (2023) - [i207]Kohei Saijo, Wangyou Zhang, Zhong-Qiu Wang, Shinji Watanabe, Tetsunori Kobayashi, Tetsuji Ogawa:
A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction. CoRR abs/2310.08277 (2023) - [i206]Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis:
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch. CoRR abs/2310.17864 (2023) - [i205]Shih-Lun Wu, Chris Donahue, Shinji Watanabe, Nicholas J. Bryan:
Music ControlNet: Multiple Time-varying Controls for Music Generation. CoRR abs/2311.07069 (2023) - [i204]Hayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Hiroaki Ogawa, Siddhant Arora, Shinji Watanabe:
Phoneme-aware Encoding for Prefix-tree-based Contextual ASR. CoRR abs/2312.09582 (2023) - [i203]Suwon Shon, Kwangyoun Kim, Prashant Sridhar, Yi-Te Hsu, Shinji Watanabe, Karen Livescu:
Generative Context-aware Fine-tuning of Self-supervised Speech Models. CoRR abs/2312.09895 (2023) - [i202]Kwanghee Choi, Jee-weon Jung, Shinji Watanabe:
Understanding Probe Behaviors through Variational Bounds of Mutual Information. CoRR abs/2312.10019 (2023) - 2022
- [j49]Amir Hussein, Shinji Watanabe, Ahmed Ali:
Arabic speech recognition by end-to-end, modular systems and human. Comput. Speech Lang. 71: 101272 (2022) - [j48]Zili Huang, Marc Delcroix, Leibny Paola García-Perera, Shinji Watanabe, Desh Raj, Sanjeev Khudanpur:
Joint speaker diarization and speech recognition based on region proposal networks. Comput. Speech Lang. 72: 101316 (2022) - [j47]Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu Jeong Han, Shinji Watanabe, Shrikanth Narayanan:
A review of speaker diarization: Recent advances with deep learning. Comput. Speech Lang. 72: 101317 (2022) - [j46]Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
An investigation of neural uncertainty estimation for target speaker extraction equipped RNN transducer. Comput. Speech Lang. 73: 101327 (2022) - [j45]Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition. Comput. Speech Lang. 75: 101360 (2022) - [j44]Jing Shi, Xuankai Chang, Shinji Watanabe, Bo Xu:
Train from scratch: Single-stage joint training of speech separation and recognition. Comput. Speech Lang. 76: 101387 (2022) - [j43]Hung-Yi Lee, Shinji Watanabe, Karen Livescu, Abdelrahman Mohamed, Tara N. Sainath:
Editorial of Special Issue on Self-Supervised Learning for Speech and Audio Processing. IEEE J. Sel. Top. Signal Process. 16(6): 1174-1178 (2022) - [j42]Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe:
Self-Supervised Speech Representation Learning: A Review. IEEE J. Sel. Top. Signal Process. 16(6): 1179-1210 (2022) - [j41]Zhong-Qiu Wang, Shinji Watanabe:
Improving Frame-Online Neural Speech Enhancement With Overlapped-Frame Prediction. IEEE Signal Process. Lett. 29: 1422-1426 (2022) - [j40]Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola García:
Encoder-Decoder Based Attractors for End-to-End Neural Diarization. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1493-1507 (2022) - [j39]Wangyou Zhang, Xuankai Chang, Christoph Böddeker, Tomohiro Nakatani, Shinji Watanabe, Yanmin Qian:
End-to-End Dereverberation, Beamforming, and Speech Recognition in a Cocktail Party. IEEE ACM Trans. Audio Speech Lang. Process. 30: 3173-3188 (2022) - [c306]Xinjian Li, Florian Metze, David R. Mortensen, Shinji Watanabe, Alan W. Black:
Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble. ACL (Findings) 2022: 2106-2115 - [c305]Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-Wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Lai, Jiatong Shi, Xuankai Chang, Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities. ACL (1) 2022: 8479-8492 - [c304]Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W. Black, Shinji Watanabe:
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models. EMNLP (Findings) 2022: 5419-5429 - [c303]Yosuke Higuchi, Brian Yan, Siddhant Arora, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe:
BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model. EMNLP (Findings) 2022: 5486-5503 - [c302]Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe:
Integrating Multiple ASR Systems into NLP Backend with Attention Fusion. ICASSP 2022: 6237-6241 - [c301]Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu:
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. ICASSP 2022: 6412-6416 - [c300]Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, Tomoki Toda:
S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations. ICASSP 2022: 6552-6556 - [c299]Motoi Omachi, Yuya Fujita, Shinji Watanabe, Tianzi Wang:
Non-Autoregressive End-To-End Automatic Speech Recognition Incorporating Downstream Natural Language Processing. ICASSP 2022: 6772-6776 - [c298]Zili Huang, Shinji Watanabe, Shu-Wen Yang, Paola García, Sanjeev Khudanpur:
Investigating Self-Supervised Learning for Speech Enhancement and Separation. ICASSP 2022: 6837-6841 - [c297]Yao-Yuan Yang, Moto Hira, Zhaoheng Ni, Artyom Astafurov, Caroline Chen, Christian Puhrsch, David Pollack, Dmitriy Genzel, Donny Greenberg, Edward Z. Yang, Jason Lian, Jeff Hwang, Ji Chen, Peter Goldsborough, Sean Narenthiran, Shinji Watanabe, Soumith Chintala, Vincent Quenneville-Bélair:
Torchaudio: Building Blocks for Audio and Speech Processing. ICASSP 2022: 6982-6986 - [c296]Takashi Maekaku, Xuankai Chang, Yuya Fujita, Shinji Watanabe:
An Exploration of Hubert with Large Number of Cluster Units and Model Assessment Using Bayesian Information Criterion. ICASSP 2022: 7107-7111 - [c295]Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W. Black, Shinji Watanabe:
ESPnet-SLU: Advancing Spoken Language Understanding Through ESPnet. ICASSP 2022: 7167-7171 - [c294]Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Sequence Transduction with Graph-Based Supervision. ICASSP 2022: 7212-7216 - [c293]Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR. ICASSP 2022: 7322-7326 - [c292]Shota Horiguchi, Yuki Takashima, Paola García, Shinji Watanabe, Yohei Kawaguchi:
Multi-Channel End-To-End Neural Diarization with Distributed Microphones. ICASSP 2022: 7332-7336 - [c291]Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao:
Conditional Diffusion Probabilistic Model for Speech Enhancement. ICASSP 2022: 7402-7406 - [c290]Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Jeong Han, Shinji Watanabe:
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition. ICASSP 2022: 7872-7876 - [c289]Chaitanya Narisetty, Emiru Tsunoo, Xuankai Chang, Yosuke Kashiwagi, Michael Hentschel, Shinji Watanabe:
Joint Speech Recognition and Audio Captioning. ICASSP 2022: 7892-7896 - [c288]Emiru Tsunoo, Chaitanya Narisetty, Michael Hentschel, Yosuke Kashiwagi, Shinji Watanabe:
Run-and-Back Stitch Search: Novel Block Synchronous Decoding For Streaming Encoder-Decoder ASR. ICASSP 2022: 8287-8291 - [c287]Keqi Deng, Zehui Yang, Shinji Watanabe, Yosuke Higuchi, Gaofeng Cheng, Pengyuan Zhang:
Improving Non-Autoregressive End-to-End Speech Recognition with Pre-Trained Acoustic and Language Models. ICASSP 2022: 8522-8526 - [c286]Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang, Shinji Watanabe:
Towards Low-Distortion Multi-Channel Speech Enhancement: The ESPNET-Se Submission to the L3DAS22 Challenge. ICASSP 2022: 9201-9205 - [c285]Hang Chen, Hengshun Zhou, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Bao-Cai Yin, Jia Pan, Jianqing Gao, Cong Liu:
The First Multimodal Information Based Speech Processing (Misp) Challenge: Data, Tasks, Baselines And Results. ICASSP 2022: 9266-9270 - [c284]Yifan Peng, Siddharth Dalmia, Ian R. Lane, Shinji Watanabe:
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding. ICML 2022: 17627-17643 - [c283]Yooncheol Ju, Ilhwan Kim, Hongsun Yang, Ji-Hoon Kim, Byeongyeol Kim, Soumi Maiti, Shinji Watanabe:
TriniTTS: Pitch-controllable End-to-end TTS without External Aligner. INTERSPEECH 2022: 16-20 - [c282]Peter Wu, Shinji Watanabe, Louis Goldstein, Alan W. Black, Gopala Krishna Anumanchipalli:
Deep Speech Synthesis from Articulatory Representations. INTERSPEECH 2022: 779-783 - [c281]Takashi Maekaku, Yuya Fujita, Yifan Peng, Shinji Watanabe:
Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR. INTERSPEECH 2022: 1071-1075 - [c280]Hengshun Zhou, Jun Du, Gongzhen Zou, Zhaoxu Nian, Chin-Hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Odette Scharenborg, Jingdong Chen, Shifu Xiong, Jianqing Gao:
Audio-Visual Wake Word Spotting in MISP2021 Challenge: Dataset Release and Deep Analysis. INTERSPEECH 2022: 1111-1115 - [c279]Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. INTERSPEECH 2022: 1656-1660 - [c278]Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora:
Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation. INTERSPEECH 2022: 1746-1750 - [c277]Hang Chen, Jun Du, Yusheng Dai, Chin-Hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Odette Scharenborg, Jingdong Chen, Baocai Yin, Jia Pan:
Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis. INTERSPEECH 2022: 1766-1770 - [c276]Yusuke Shinohara, Shinji Watanabe:
Minimum latency training of sequence transducers for streaming end-to-end speech recognition. INTERSPEECH 2022: 2098-2102 - [c275]Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Leibny Paola García-Perera, Yohei Kawaguchi:
Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models. INTERSPEECH 2022: 2218-2222 - [c274]Muqiao Yang, Ian R. Lane, Shinji Watanabe:
Online Continual Learning of End-to-End Speech Recognition Models. INTERSPEECH 2022: 2668-2672 - [c273]Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
Improving Speech Enhancement through Fine-Grained Speech Characteristics. INTERSPEECH 2022: 2953-2957 - [c272]Siddhant Arora, Siddharth Dalmia, Xuankai Chang, Brian Yan, Alan W. Black, Shinji Watanabe:
Two-Pass Low Latency End-to-End Spoken Language Understanding. INTERSPEECH 2022: 3478-3482 - [c271]Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel López-Francisco, Jonathan D. Amith, Shinji Watanabe:
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation. INTERSPEECH 2022: 3533-3537 - [c270]Nathaniel Romney Robinson, Perez Ogayo, Swetha R. Gangu, David R. Mortensen, Shinji Watanabe:
When Is TTS Augmentation Through a Pivot Language Useful? INTERSPEECH 2022: 3538-3542 - [c269]Xuankai Chang, Takashi Maekaku, Yuya Fujita, Shinji Watanabe:
End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation. INTERSPEECH 2022: 3819-3823 - [c268]Emiru Tsunoo, Yosuke Kashiwagi, Chaitanya Prasad Narisetty, Shinji Watanabe:
Residual Language Model for End-to-end Speech Recognition. INTERSPEECH 2022: 3899-3903 - [c267]Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin:
SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy. INTERSPEECH 2022: 4272-4276 - [c266]Jiatong Shi, Shuai Guo, Tao Qian, Tomoki Hayashi, Yuning Wu, Fangzheng Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin:
Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis. INTERSPEECH 2022: 4277-4281 - [c265]Jaesong Lee, Lukas Lee, Shinji Watanabe:
Memory-Efficient Training of RNN-Transducer with Sampled Softmax. INTERSPEECH 2022: 4441-4445 - [c264]Yui Sudo, Muhammad Shakeel, Kazuhiro Nakadai, Jiatong Shi, Shinji Watanabe:
Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection. INTERSPEECH 2022: 4641-4645 - [c263]Xinjian Li, Florian Metze, David R. Mortensen, Alan W. Black, Shinji Watanabe:
ASR2K: Speech Recognition for Around 2000 Languages without Audio. INTERSPEECH 2022: 4885-4889 - [c262]Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee, Lukas Lee, Shinji Watanabe, Yusuke Kida:
Better Intermediates Improve CTC Inference. INTERSPEECH 2022: 4965-4969 - [c261]Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe:
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding. INTERSPEECH 2022: 5458-5462 - [c260]Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondrej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vera Kloudová, Surafel Melaku Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Miguel Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe:
Findings of the IWSLT 2022 Evaluation Campaign. IWSLT@ACL 2022: 98-157 - [c259]Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe:
CMU's IWSLT 2022 Dialect Speech Translation System. IWSLT@ACL 2022: 298-307 - [c258]Xinjian Li, Florian Metze, David R. Mortensen, Alan W. Black, Shinji Watanabe:
Phone Inventories and Recognition for Every Language. LREC 2022: 1061-1067 - [c257]Kwangyoun Kim, Felix Wu, Yifan Peng, Jing Pan, Prashant Sridhar, Kyu Jeong Han, Shinji Watanabe:
E-Branchformer: Branchformer with Enhanced Merging for Speech Recognition. SLT 2022: 84-91 - [c256]Yoshiki Masuyama, Xuankai Chang, Samuele Cornell, Shinji Watanabe, Nobutaka Ono:
End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation. SLT 2022: 260-265 - [c255]Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, Shinji Watanabe:
A Study on the Integration of Pre-Trained SSL, ASR, LM and SLU Models for Spoken Language Understanding. SLT 2022: 406-413 - [c254]Soumi Maiti, Yushi Ueda, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu:
EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers. SLT 2022: 480-487 - [c253]Robin Scheibler, Wangyou Zhang, Xuankai Chang, Shinji Watanabe, Yanmin Qian:
End-to-End Multi-Speaker ASR with Independent Vector Analysis. SLT 2022: 496-501 - [c252]