Zhengqi Wen
2020 – today
- 2024
- [j19] Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Chu Yuan Zhang:
Emotion selectable end-to-end text-based speech editing. Artif. Intell. 329: 104076 (2024)
- [j18] Guofeng Yi, Cunhang Fan, Kang Zhu, Zhao Lv, Shan Liang, Zhengqi Wen, Guanxiong Pei, Taihao Li, Jianhua Tao:
VLP2MSA: Expanding vision-language pre-training to multimodal sentiment analysis. Knowl. Based Syst. 283: 111136 (2024)
- [j17] Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv:
Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2453-2466 (2024)
- [c86] Kang Zhu, Cunhang Fan, Jianhua Tao, Jun Xue, Heng Xie, Xuefei Liu, Yongwei Li, Zhengqi Wen, Zhao Lv:
Dual-View Multimodal Interaction in Multimodal Sentiment Analysis. ICME 2024: 1-6
- [i52] Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun:
The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio. CoRR abs/2405.04880 (2024)
- [i51] Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Yuankun Xie, Yukun Liu, Xiaopeng Wang, Xuefei Liu, Yongwei Li, Jianhua Tao, Yi Lu, Xin Qi, Shuchen Shi:
Generalized Fake Audio Detection via Deep Stable Learning. CoRR abs/2406.03237 (2024)
- [i50] Yuankun Xie, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Xiaopeng Wang, Haonan Cheng, Long Ye, Jianhua Tao:
Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy. CoRR abs/2406.03240 (2024)
- [i49] Xiaopeng Wang, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Yuankun Xie, Yukun Liu, Jianhua Tao, Xuefei Liu, Yongwei Li, Xin Qi, Yi Lu, Shuchen Shi:
Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection. CoRR abs/2406.03247 (2024)
- [i48] Shuchen Shi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Yi Lu, Xin Qi, Xuefei Liu, Yukun Liu, Yongwei Li, Zhiyong Wang, Xiaopeng Wang:
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation. CoRR abs/2406.04683 (2024)
- [i47] Junzuo Zhou, Jiangyan Yi, Tao Wang, Jianhua Tao, Ye Bai, Chu Yuan Zhang, Yong Ren, Zhengqi Wen:
TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking. CoRR abs/2406.04840 (2024)
- [i46] Yi Lu, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Zhiyong Wang, Xin Qi, Xuefei Liu, Yongwei Li, Yukun Liu, Xiaopeng Wang, Shuchen Shi:
Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio. CoRR abs/2406.08112 (2024)
- [i45] Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li:
MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation. CoRR abs/2406.10591 (2024)
- [i44] Ruihan Jin, Ruibo Fu, Zhengqi Wen, Shuai Zhang, Yukun Liu, Jianhua Tao:
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models. CoRR abs/2407.02042 (2024)
- [i43] Ruibo Fu, Xin Qi, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Zhiyong Wang, Yi Lu, Xiaopeng Wang, Shuchen Shi, Yukun Liu, Xuefei Liu, Shuai Zhang:
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation. CoRR abs/2407.05421 (2024)
- [i42] Ruibo Fu, Rui Liu, Chunyu Qiang, Yingming Gao, Yi Lu, Shuchen Shi, Tao Wang, Ya Li, Zhengqi Wen, Chen Zhang, Hui Bu, Yukun Liu, Xin Qi, Guanjun Li:
ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024. CoRR abs/2407.12038 (2024)
- [i41] Cong Cai, Shan Liang, Xuefei Liu, Kang Zhu, Zhengqi Wen, Jianhua Tao, Heng Xie, Jizhou Cui, Yiming Ma, Zhenhua Cheng, Hanzhe Xu, Ruibo Fu, Bin Liu, Yongwei Li:
MDPE: A Multimodal Deception Dataset with Personality and Emotional Characteristics. CoRR abs/2407.12274 (2024)
- [i40] Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, Chen Zhang, Hao Che, Longbiao Wang, Jiangwu Dang, Jianhua Tao:
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing. CoRR abs/2408.05758 (2024)
- [i39] Yuankun Xie, Xiaopeng Wang, Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Haonan Cheng, Long Ye:
Temporal Variability and Multi-Viewed Self-Supervised Representations to Tackle the ASVspoof5 Deepfake Challenge. CoRR abs/2408.06922 (2024)
- [i38] Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Yukun Liu, Guanjun Li, Xin Qi, Yi Lu, Xuefei Liu, Yongwei Li:
A Noval Feature via Color Quantisation for Fake Audio Detection. CoRR abs/2408.10849 (2024)
- [i37] Xin Qi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Shuchen Shi, Yi Lu, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Guanjun Li, Xuefei Liu, Yongwei Li:
EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech. CoRR abs/2408.10852 (2024)
- [i36] Yuankun Xie, Chenxu Xiong, Xiaopeng Wang, Zhiyong Wang, Yi Lu, Xin Qi, Ruibo Fu, Yukun Liu, Zhengqi Wen, Jianhua Tao, Guanjun Li, Long Ye:
Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio? CoRR abs/2408.10853 (2024)
- [i35] Moyang Liu, Yukun Liu, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Xuefei Liu, Guanjun Li:
Exploring the Role of Audio in Multimodal Misinformation Detection. CoRR abs/2408.12558 (2024)
- [i34] Chenxu Xiong, Ruibo Fu, Shuchen Shi, Zhengqi Wen, Jianhua Tao, Tao Wang, Chenxing Li, Chunyu Qiang, Yuankun Xie, Xin Qi, Guanjun Li, Zizheng Yang:
Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation. CoRR abs/2409.09381 (2024)
- [i33] Xin Qi, Ruibo Fu, Zhengqi Wen, Tao Wang, Chunyu Qiang, Jianhua Tao, Chenxing Li, Yi Lu, Shuchen Shi, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Xuefei Liu, Guanjun Li:
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech. CoRR abs/2409.11835 (2024)
- [i32] Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Xiaopeng Wang, Yuankun Xie, Xin Qi, Shuchen Shi, Yi Lu, Yukun Liu, Chenxing Li, Xuefei Liu, Guanjun Li:
Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0. CoRR abs/2409.11909 (2024)
- 2023
- [c85] Junjie Chen, Yongwei Li, Ziping Zhao, Xuefei Liu, Zhengqi Wen, Jianhua Tao:
Hybrid Multi-Task Learning for End-To-End Multimodal Emotion Recognition. APSIPA ASC 2023: 1966-1971
- [c84] Yi Lu, Ruibo Fu, Xin Qi, Zhengqi Wen, Jianhua Tao, Jiangyan Yi, Tao Wang, Yong Ren, Chuyuan Zhang, Chenyu Yang, Wenling Shi:
The VIBVG Speech Synthesis System for Blizzard Challenge 2023. Blizzard Challenge 2023
- [c83] Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li:
ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130
- [c82] Jun Xue, Cunhang Fan, Jiangyan Yi, Chenglong Wang, Zhengqi Wen, Dan Zhang, Zhao Lv:
Learning From Yourself: A Self-Distillation Method For Fake Speech Detection. ICASSP 2023: 1-5
- [c81] Heng Xie, Jizhou Cui, Yuhang Cao, Junjie Chen, Jianhua Tao, Cunhang Fan, Xuefei Liu, Zhengqi Wen, Heng Lu, Yuguang Yang, Zhao Lv, Yongwei Li:
Multimodal Cross-Lingual Features and Weight Fusion for Cross-Cultural Humor Detection. MuSe@ACM Multimedia 2023: 51-57
- [i31] Haogeng Liu, Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Jianhua Tao:
UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion. CoRR abs/2301.03801 (2023)
- [i30] Jun Xue, Cunhang Fan, Jiangyan Yi, Chenglong Wang, Zhengqi Wen, Dan Zhang, Zhao Lv:
Learning From Yourself: A Self-Distillation Method for Fake Speech Detection. CoRR abs/2303.01211 (2023)
- [i29] Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li:
ADD 2023: the Second Audio Deepfake Detection Challenge. CoRR abs/2305.13774 (2023)
- [i28] Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv:
Learning to Behave Like Clean Speech: Dual-Branch Knowledge Distillation for Noise-Robust Fake Audio Detection. CoRR abs/2310.08869 (2023)
- 2022
- [j16] Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
Hybrid Autoregressive and Non-Autoregressive Transformer Models for Speech Recognition. IEEE Signal Process. Lett. 29: 762-766 (2022)
- [j15] Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen:
NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation. IEEE ACM Trans. Audio Speech Lang. Process. 30: 865-878 (2022)
- [j14] Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen:
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2241-2254 (2022)
- [c80] Tao Wang, Jiangyan Yi, Liqun Deng, Ruibo Fu, Jianhua Tao, Zhengqi Wen:
Context-Aware Mask Prediction Network for End-to-End Text-Based Speech Editing. ICASSP 2022: 6082-6086
- [c79] Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li:
ADD 2022: the first Audio Deep Synthesis Detection Challenge. ICASSP 2022: 9216-9220
- [c78] Jun Xue, Cunhang Fan, Zhao Lv, Jianhua Tao, Jiangyan Yi, Chengshi Zheng, Zhengqi Wen, Minmin Yuan, Shegang Shao:
Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features. DDAM@MM 2022: 19-26
- [c77] Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Jianhua Tao:
Singing-Tacotron: Global Duration Control Attention and Dynamic Filter for End-to-end Singing Voice Synthesis. DDAM@MM 2022: 53-59
- [i27] Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen:
Singing-Tacotron: Global duration control attention and dynamic filter for End-to-end singing voice synthesis. CoRR abs/2202.07907 (2022)
- [i26] Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu:
ADD 2022: the First Audio Deep Synthesis Detection Challenge. CoRR abs/2202.08433 (2022)
- [i25] Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen:
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing. CoRR abs/2202.09950 (2022)
- [i24] Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen:
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation. CoRR abs/2203.02678 (2022)
- [i23] Jun Xue, Cunhang Fan, Zhao Lv, Jianhua Tao, Jiangyan Yi, Chengshi Zheng, Zhengqi Wen, Minmin Yuan, Shegang Shao:
Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features. CoRR abs/2208.01214 (2022)
- [i22] Chunyu Qiang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Jiangyan Yi, Tao Wang, Shiming Wang:
Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS. CoRR abs/2210.11429 (2022)
- [i21] Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Chu Yuan Zhang:
Emotion Selectable End-to-End Text-based Speech Editing. CoRR abs/2212.10191 (2022)
- 2021
- [j13] Cunhang Fan, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Bin Liu, Zhengqi Wen:
Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 29: 198-209 (2021)
- [j12] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Zhengkun Tian, Shuai Zhang:
Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1340-1351 (2021)
- [j11] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Fast End-to-End Speech Recognition Via Non-Autoregressive Models and Cross-Modal Knowledge Transferring From BERT. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1897-1911 (2021)
- [c76] Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
One In A Hundred: Selecting the Best Predicted Sequence from Numerous Candidates for Speech Recognition. APSIPA ASC 2021: 454-459
- [c75] Bo Liu, Wei Zhang, Zhengqi Wen, Zhen Wei, Zhuo Wang, Shujuan Li, Yuanyuan Hu, Wenlan Li, Lisha Guo:
Simulation Analysis on Seismic Capacity of 220kV GIS Switch Bay Mobile Load Transfer Equipment. EEET 2021: 1-8
- [c74] Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Zhengqi Wen:
Decoupling Pronunciation and Language for End-to-End Code-Switching Automatic Speech Recognition. ICASSP 2021: 6249-6253
- [c73] Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang, Chunyu Qiang:
Bi-Level Style and Prosody Decoupling Modeling for Personalized End-to-End Speech Synthesis. ICASSP 2021: 6568-6572
- [c72] Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Chunyu Qiang, Shiming Wang:
Prosody and Voice Factorization for Few-Shot Speaker Adaptation in the Challenge M2voc 2021. ICASSP 2021: 8603-8607
- [c71] Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Xuefei Liu, Zhengqi Wen:
End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition. Interspeech 2021: 266-270
- [c70] Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization. Interspeech 2021: 4034-4038
- [c69] Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Leichao Song:
Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning. ISCSLP 2021: 1-5
- [c68] Zheng Lian, Rongxiu Zhong, Zhengqi Wen, Bin Liu, Jianhua Tao:
Towards Fine-Grained Prosody Control for Voice Conversion. ISCSLP 2021: 1-5
- [c67] Chunyu Qiang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Jiangyan Yi, Tao Wang, Shiming Wang:
Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS. ISCSLP 2021: 1-5
- [c66] Xuefei Liu, Jianhua Tao, Yurong Han, Chenglong Wang, Xueying Zheng, Zhengqi Wen:
Which Phonemes Will Distinguish the Different Regions Within the Same Dialect? O-COCOSDA 2021: 152-157
- [i20] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Fast End-to-End Speech Recognition via a Non-Autoregressive Model and Cross-Modal Knowledge Transferring from BERT. CoRR abs/2102.07594 (2021)
- [i19] Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen, Xuefei Liu:
TSNAT: Two-Step Non-Autoregressvie Transformer Models for Speech Recognition. CoRR abs/2104.01522 (2021)
- [i18] Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization. CoRR abs/2104.02882 (2021)
- 2020
- [j10] Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen, Xuefei Liu:
End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1303-1314 (2020)
- [j9] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Cunhang Fan:
A Public Chinese Dataset for Language Model Adaptation. J. Signal Process. Syst. 92(8): 839-851 (2020)
- [c65] Tao Wang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Chunyu Qiang:
The NLPR Speech Synthesis entry for Blizzard Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020
- [c64] Zheng Lian, Jianhua Tao, Zhengqi Wen, Rongxiu Zhong:
CASIA Voice Conversion System for the Voice Conversion Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020
- [c63] Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang:
Focusing on Attention: Prosody Transfer and Adaptative Optimization Strategy for Multi-Speaker End-to-End Speech Synthesis. ICASSP 2020: 6709-6713
- [c62] Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
Synchronous Transformers for end-to-end Speech Recognition. ICASSP 2020: 7884-7888
- [c61] Tao Wang, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Rongxiu Zhong:
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation. INTERSPEECH 2020: 796-800
- [c60] Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Chunyu Qiang, Tao Wang:
Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis. INTERSPEECH 2020: 2937-2941
- [c59] Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen:
Gated Recurrent Fusion of Spatial and Spectral Features for Multi-Channel Speech Separation with Deep Embedding Representations. INTERSPEECH 2020: 3321-3325
- [c58] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition. INTERSPEECH 2020: 3381-3385
- [c57] Tao Wang, Xuefei Liu, Jianhua Tao, Jiangyan Yi, Ruibo Fu, Zhengqi Wen:
Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding. INTERSPEECH 2020: 3984-3988
- [c56] Tao Wang, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Chunyu Qiang:
Bi-Level Speaker Supervision for One-Shot Speech Synthesis. INTERSPEECH 2020: 3989-3993
- [c55] Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen:
Joint Training for Simultaneous Speech Denoising and Dereverberation with Deep Embedding Representations. INTERSPEECH 2020: 4536-4540
- [c54] Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang, Chunyu Qiang:
Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis. INTERSPEECH 2020: 4701-4705
- [c53] Zheng Lian, Zhengqi Wen, Xinyong Zhou, Songbai Pu, Shengkai Zhang, Jianhua Tao:
ARVC: An Auto-Regressive Voice Conversion System Without Parallel Training Data. INTERSPEECH 2020: 4706-4710
- [c52] Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen:
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition. INTERSPEECH 2020: 5026-5030
- [i17] Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen:
Spatial and spectral deep attention fusion for multi-channel speech separation using deep embedding features. CoRR abs/2002.01626 (2020)
- [i16] Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen, Xuefei Liu:
Deep Attention Fusion Feature for Speech Separation with End-to-End Post-filter Method. CoRR abs/2003.07544 (2020)
- [i15] Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen:
Simultaneous Denoising and Dereverberation Using Deep Embedding Features. CoRR abs/2004.02420 (2020)
- [i14] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition. CoRR abs/2005.04862 (2020)
- [i13] Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen:
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition. CoRR abs/2005.07903 (2020)
- [i12] Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Zhengqi Wen:
Decoupling Pronunciation and Language for End-to-end Code-switching Automatic Speech Recognition. CoRR abs/2010.14798 (2020)
- [i11] Cunhang Fan, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Bin Liu, Zhengqi Wen:
Gated Recurrent Fusion with Joint Training Framework for Robust End-to-End Speech Recognition. CoRR abs/2011.04249 (2020)
- [i10] Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Leichao Song:
Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning. CoRR abs/2011.05591 (2020)
2010 – 2019
- 2019
- [j8] Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ye Bai:
Language-Adversarial Transfer Learning for Low-Resource Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 27(3): 621-630 (2019)
- [j7] Yibin Zheng, Jianhua Tao, Zhengqi Wen, Jiangyan Yi:
Forward-Backward Decoding Sequence for Regularizing End-to-End TTS. IEEE ACM Trans. Audio Speech Lang. Process. 27(12): 2067-2079 (2019)
- [c51] Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Ye Bai:
Noise Prior Knowledge Learning for Speech Enhancement via Gated Convolutional Generative Adversarial Network. APSIPA 2019: 662-666
- [c50] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Bin Liu:
Voice Activity Detection Based on Time-Delay Neural Networks. APSIPA 2019: 1173-1178
- [c49] Jianhua Tao, Ruibo Fu, Zhengqi Wen:
The NLPR Speech Synthesis entry for Blizzard Challenge 2019. Blizzard Challenge 2019
- [c48] Ruibo Fu, Jianhua Tao, Zhengqi Wen, Yibin Zheng:
Phoneme Dependent Speaker Embedding and Model Factorization for Multi-speaker Speech Synthesis and Adaptation. ICASSP 2019: 6930-6934
- [c47] Yibin Zheng, Xi Wang, Lei He, Shifeng Pan, Frank K. Soong, Zhengqi Wen, Jianhua Tao:
Forward-Backward Decoding for Regularizing End-to-End TTS. INTERSPEECH 2019: 1283-1287
- [c46] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Zhengkun Tian, Chenghao Zhao, Cunhang Fan:
A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting. INTERSPEECH 2019: 2190-2194
- [c45] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen:
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition. INTERSPEECH 2019: 3795-3799
- [c44] Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengqi Wen:
Self-Attention Transducers for End-to-End Speech Recognition. INTERSPEECH 2019: 4395-4399
- [c43] Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen:
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features. INTERSPEECH 2019: 4599-4603
- [i9] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen:
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition. CoRR abs/1907.06017 (2019)
- [i8] Yibin Zheng, Xi Wang, Lei He, Shifeng Pan, Frank K. Soong, Zhengqi Wen, Jianhua Tao:
Forward-Backward Decoding for Regularizing End-to-End TTS. CoRR abs/1907.09006 (2019)
- [i7] Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen:
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features. CoRR abs/1907.09884 (2019)
- [i6] Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengqi Wen:
Self-Attention Transducers for End-to-End Speech Recognition. CoRR abs/1909.13037 (2019)
- [i5] Zheng Lian, Jianhua Tao, Zhengqi Wen, Bin Liu, Yibin Zheng, Rongxiu Zhong:
Towards Fine-Grained Prosody Control for Voice Conversion. CoRR abs/1910.11269 (2019)
- [i4] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Integrating Whole Context to Sequence-to-sequence Speech Recognition. CoRR abs/1912.01777 (2019)
- [i3] Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
Synchronous Transformers for End-to-End Speech Recognition. CoRR abs/1912.02958 (2019)
- 2018
- [j6] Jiangyan Yi, Zhengqi Wen, Jianhua Tao, Hao Ni, Bin Liu:
CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition. J. Signal Process. Syst. 90(7): 985-997 (2018)
- [j5] Zhengqi Wen, Kehuang Li, Zhen Huang, Chin-Hui Lee, Jianhua Tao:
Improving Deep Neural Network Based Speech Synthesis through Contextual Feature Parametrization and Multi-Task Learning. J. Signal Process. Syst. 90(7): 1025-1037 (2018)
- [j4] Yibin Zheng, Ya Li, Zhengqi Wen, Bin Liu, Jianhua Tao:
Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin. J. Signal Process. Syst. 90(7): 1039-1052 (2018)
- [c42] Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ye Bai:
Adversarial Multilingual Training for Low-Resource Speech Recognition. ICASSP 2018: 4899-4903
- [c41] Yibin Zheng, Jianhua Tao, Zhengqi Wen, Ya Li:
BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End. INTERSPEECH 2018: 47-51
- [c40] Ruibo Fu, Jianhua Tao, Yibin Zheng, Zhengqi Wen:
Transfer Learning Based Progressive Neural Networks for Acoustic Modeling in Statistical Parametric Speech Synthesis. INTERSPEECH 2018: 907-911
- [c39] Yibin Zheng, Jianhua Tao, Zhengqi Wen, Ruibo Fu:
On the Application and Compression of Deep Time Delay Neural Network for Embedded Statistical Parametric Speech Synthesis. INTERSPEECH 2018: 922-926
- [c38] Ruibo Fu, Jianhua Tao, Yibin Zheng, Zhengqi Wen:
Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer. INTERSPEECH 2018: 2514-2518
- [c37] Cunhang Fan, Bin Liu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Ye Bai:
Utterance-level Permutation Invariant Training with Discriminative Learning for Single Channel Speech Separation. ISCSLP 2018: 26-30
- [c36] Ye Bai, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Cunhang Fan:
CLMAD: A Chinese Language Model Adaptation Dataset. ISCSLP 2018: 275-279
- [i2] Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Bin Liu:
Distilling Knowledge Using Parallel Data for Far-field Speech Recognition. CoRR abs/1802.06941 (2018)
- 2017
- [c35] Jianhua Tao, Ruibo Fu, Yibin Zheng, Zhengqi Wen, Ya Li, Biu Liu:
The NLPR Speech Synthesis entry for Blizzard Challenge 2017. Blizzard Challenge 2017
- [c34] Yibin Zheng, Jianhua Tao, Zhengqi Wen, Ya Li, Bin Liu:
Investigating Efficient Feature Representation Methods and Training Objective for BLSTM-Based Phone Duration Prediction. INTERSPEECH 2017: 784-788
- [c33] Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ya Li:
Distilling Knowledge from an Ensemble of Models for Punctuation Prediction. INTERSPEECH 2017: 2779-2783
- [c32] Jian Huang, Ya Li, Jianhua Tao, Zheng Lian, Zhengqi Wen, Minghao Yang, Jiangyan Yi:
Continuous Multimodal Emotion Prediction Based on Long Short Term Memory Recurrent Neural Network. AVEC@ACM Multimedia 2017: 11-18
- 2016
- [j3] Bin Liu, Jianhua Tao, Zhengqi Wen, Fuyuan Mo:
Speech Enhancement Based on Analysis-Synthesis Framework with Improved Parameter Domain Enhancement. J. Signal Process. Syst. 82(2): 141-150 (2016)
- [j2] Hao Che, Ya Li, Jianhua Tao, Zhengqi Wen:
Investigating Effect of Rich Syntactic Features on Mandarin Prosodic Boundaries Prediction. J. Signal Process. Syst. 82(2): 263-271 (2016)
- [c31] Zhengqi Wen, Kehuang Li, Jianhua Tao, Chin-Hui Lee:
Deep neural network based voice conversion with a large synthesized parallel corpus. APSIPA 2016: 1-5
- [c30] Jiangyan Yi, Hao Ni, Zhengqi Wen, Jianhua Tao:
Improving BLSTM RNN based Mandarin speech recognition using accent dependent bottleneck features. APSIPA 2016: 1-5
- [c29] Jianhua Tao, Yibin Zheng, Zhengqi Wen, Ya Li, Biu Liu:
BLSTM Guided Unit Selection Synthesis System for Blizzard Challenge 2016. Blizzard Challenge 2016
- [c28] Hao Ni, Jiangyan Yi, Zhengqi Wen, Jianhua Tao:
Recurrent Neural Network Based Language Model Adaptation for Accent Mandarin Speech. CCPR (2) 2016: 607-617
- [c27] Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li, Zhengqi Wen:
Long short term memory recurrent neural network based encoding method for emotion recognition in video. ICASSP 2016: 2752-2756
- [c26] Zhengqi Wen, Ya Li, Jianhua Tao:
The Parameterized Phoneme Identity Feature as a Continuous Real-Valued Vector for Neural Network Based Speech Synthesis. INTERSPEECH 2016: 2248-2252
- [c25] Yibin Zheng, Ya Li, Zhengqi Wen, Xingguang Ding, Jianhua Tao:
Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach. INTERSPEECH 2016: 3201-3205
- [c24] Ye Bai, Jiangyan Yi, Hao Ni, Zhengqi Wen, Bin Liu, Ya Li, Jianhua Tao:
End-to-end keywords spotting based on connectionist temporal classification for Mandarin. ISCSLP 2016: 1-5
- [c23] Hao Ni, Jiangyan Yi, Zhengqi Wen, Bin Liu, Jianhua Tao:
Improving accented Mandarin speech recognition by using recurrent neural network based language model adaptation. ISCSLP 2016: 1-5
- [c22] Zhengqi Wen, Kehuang Li, Zhen Huang, Jianhua Tao, Chin-Hui Lee:
Learning auxiliary categorical information for speech synthesis based on deep and recurrent neural networks. ISCSLP 2016: 1-5
- [c21] Jiangyan Yi, Hao Ni, Zhengqi Wen, Bin Liu, Jianhua Tao:
CTC regularized model adaptation for improving LSTM RNN based multi-accent Mandarin speech recognition. ISCSLP 2016: 1-5
- [c20] Yibin Zheng, Ya Li, Zhengqi Wen, Bin Liu, Jianhua Tao:
Text-based sentential stress prediction using continuous lexical embedding for Mandarin speech synthesis. ISCSLP 2016: 1-5
- [c19] Yibin Zheng, Ya Li, Zhengqi Wen, Bin Liu, Jianhua Tao:
Investigating deep neural network adaptation for generating exclamatory and interrogative speech in Mandarin. ISCSLP 2016: 1-5
- [i1] Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li, Zhengqi Wen:
Audio Visual Emotion Recognition with Temporal Alignment and Perception Attention. CoRR abs/1603.08321 (2016)
- 2015
- [c18] Yang Wang, Minghao Yang, Zhengqi Wen, Jianhua Tao:
Combining extreme learning machine and decision tree for duration prediction in HMM based speech synthesis. INTERSPEECH 2015: 2197-2201
- [c17] Bin Liu, Jianhua Tao, Zhengqi Wen, Ya Li, Danish Bukhari:
A novel method of artificial bandwidth extension using deep architecture. INTERSPEECH 2015: 2598-2602
- [c16] Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li, Zhengqi Wen:
Long Short Term Memory Recurrent Neural Network based Multimodal Dimensional Emotion Recognition. AVEC@ACM Multimedia 2015: 65-72
- 2014
- [j1] Zhengqi Wen, Jianhua Tao, Shifeng Pan, Yang Wang:
Pitch-Scaled Spectrum Based Excitation Model for HMM-based Speech Synthesis. J. Signal Process. Syst. 74(3): 423-435 (2014)
- [c15] Ran Zhang, Jianhua Tao, Ya Li, Zhengqi Wen:
A novel hybrid mandarin speech synthesis system using different base units for model training and concatenation. ICASSP 2014: 295-299
- [c14] Ran Zhang, Zhengqi Wen, Jianhua Tao, Ya Li, Bing Liu, Xiaoyan Lou:
A hierarchical viterbi algorithm for Mandarin hybrid speech synthesis system. INTERSPEECH 2014: 795-799
- [c13] Xin Xu, Ya Li, Xiaoying Xu, Zhengqi Wen, Hao Che, Shanfeng Liu, Jianhua Tao:
Survey on discriminative feature selection for speech emotion recognition. ISCSLP 2014: 345-349
- [c12] Hao Che, Zhengqi Wen, Ya Li, Jianhua Tao:
Investigating effect of rich syntactic features on Mandarin prosodic phrase boundaries prediction. ISCSLP 2014: 501-505
- [c11] Shanfeng Liu, Zhengqi Wen, Ya Li, Jianhua Tao, Bin Liu:
Context features based pre-selection and weight prediction in concatenation speech synthesis system. ISCSLP 2014: 506-510
- [c10] Bin Liu, Jianhua Tao, Fuyuan Mo, Ya Li, Zhengqi Wen, Shanfeng Liu:
Efficient voice activity detection algorithm based on sub-band temporal envelope and sub-band long-term signal variability. ISCSLP 2014: 531-535
- [c9] Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li, Zhengqi Wen:
Multi-scale Temporal Modeling for Dimensional Emotion Recognition in Video. AVEC@MM 2014: 11-18
- 2013
- [c8] Ran Zhang, Jianhua Tao, Ya Li, Zhengqi Wen:
A novel unit selection method for concatenation speech system using similarity measure. O-COCOSDA/CASLRE 2013: 1-5
- 2012
- [c7] Zhengqi Wen, Hideki Kawahara, Jianhua Tao:
Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis. INTERSPEECH 2012: 374-377
- [c6] Zhengqi Wen, Jianhua Tao:
Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis. INTERSPEECH 2012: 1428-1431
- [c5] Zhengqi Wen, Jianhua Tao, Hao Che:
Statistical modification based post-filtering technique for HMM-based speech synthesis. ISCSLP 2012: 146-149
- 2011
- [c4] Zhengqi Wen, Jianhua Tao:
Inverse Filtering Based Harmonic Plus Noise Excitation Model for HMM-Based Speech Synthesis. INTERSPEECH 2011: 1805-1808
- [c3] Zhengqi Wen, Jianhua Tao:
An excitation model based on inverse filtering for speech analysis and synthesis. MLSP 2011: 1-5
- 2010
- [c2] Jianhua Tao, Shifeng Pan, Ya Li, Zhengqi Wen, Yang Wang:
The WISTON Text to Speech System for Blizzard Challenge 2010. Blizzard Challenge 2010
2000 – 2009
- 2009
- [c1] Jianhua Tao, Ya Li, Shifeng Pan, Meng Zhang, Hongjun Sun, Zhengqi Wen:
The WISTON Text-to-Speech System for Blizzard Challenge 2009. Blizzard Challenge 2009
last updated on 2024-10-22 20:15 CEST by the dblp team
all metadata released as open data under CC0 1.0 license