iBet uBet web content aggregator. Adding the entire web to your favor.

Link to original content: https://dblp.uni-trier.de/pid/254/2051.rss
dblp: Keyu An
https://dblp.org/pid/254/2051.html
dblp person page RSS feed. Feed date: Sat, 30 Nov 2024 01:11:11 +0100. Language: en-US. Updated daily. Released under the CC0 1.0 license. Contact: dblp@dagstuhl.de (dblp team). Category: Computers/Computer_Science/Publications/Bibliographies.

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs. CoRR abs/2407.04051 (2024)
  Link: https://doi.org/10.48550/arXiv.2407.04051
  dblp: https://dblp.org/rec/journals/corr/abs-2407-04051
  Date: Mon, 01 Jan 2024 00:00:00 +0100
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition. CoRR abs/2409.17746 (2024)
  Link: https://doi.org/10.48550/arXiv.2409.17746
  dblp: https://dblp.org/rec/journals/corr/abs-2409-17746
  Date: Mon, 01 Jan 2024 00:00:00 +0100
Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study. CoRR abs/2409.17750 (2024)
  Link: https://doi.org/10.48550/arXiv.2409.17750
  dblp: https://dblp.org/rec/journals/corr/abs-2409-17750
  Date: Mon, 01 Jan 2024 00:00:00 +0100
Analysis of Omni-Channel Evolution Game Strategy for E-Commerce Enterprises in the Context of Online and Offline Integration. Syst. 11(7): 321 (2023)
  Link: https://doi.org/10.3390/systems11070321
  dblp: https://dblp.org/rec/journals/systems/ChengXA23
  Date: Sat, 01 Jul 2023 01:00:00 +0200
BAT: Boundary aware transducer for memory-efficient and low-latency ASR. INTERSPEECH 2023: 4963-4967
  Link: https://doi.org/10.21437/Interspeech.2023-770
  dblp: https://dblp.org/rec/conf/interspeech/AnSZ23
  Date: Sun, 01 Jan 2023 00:00:00 +0100
Exploring RWKV for Memory Efficient and Low Latency Streaming ASR. CoRR abs/2309.14758 (2023)
  Link: https://doi.org/10.48550/arXiv.2309.14758
  dblp: https://dblp.org/rec/journals/corr/abs-2309-14758
  Date: Sun, 01 Jan 2023 00:00:00 +0100
Advancing VAD Systems Based on Multi-Task Learning with Improved Model Structures. CoRR abs/2312.14860 (2023)
  Link: https://doi.org/10.48550/arXiv.2312.14860
  dblp: https://dblp.org/rec/journals/corr/abs-2312-14860
  Date: Sun, 01 Jan 2023 00:00:00 +0100
Dynamic Research on Three-Player Evolutionary Game in Waste Product Recycling Supply Chain System. Syst. 10(5): 185 (2022)
  Link: https://doi.org/10.3390/systems10050185
  dblp: https://dblp.org/rec/journals/systems/XieAC22
  Date: Sat, 01 Jan 2022 00:00:00 +0100
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR. INTERSPEECH 2022: 2103-2107
  Link: https://doi.org/10.21437/Interspeech.2022-11214
  dblp: https://dblp.org/rec/conf/interspeech/AnZOXDW22
  Date: Sat, 01 Jan 2022 00:00:00 +0100
An Empirical Study of Language Model Integration for Transducer based Speech Recognition. INTERSPEECH 2022: 3904-3908
  Link: https://doi.org/10.21437/Interspeech.2022-10576
  dblp: https://dblp.org/rec/conf/interspeech/ZhengAOHDW22
  Date: Sat, 01 Jan 2022 00:00:00 +0100
Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study. ISCSLP 2022: 180-184
  Link: https://doi.org/10.1109/ISCSLP57327.2022.10038153
  dblp: https://dblp.org/rec/conf/iscslp/AnXO22
  Date: Sat, 01 Jan 2022 00:00:00 +0100
Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study. CoRR abs/2203.16757 (2022)
  Link: https://doi.org/10.48550/arXiv.2203.16757
  dblp: https://dblp.org/rec/journals/corr/abs-2203-16757
  Date: Sat, 01 Jan 2022 00:00:00 +0100
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR. CoRR abs/2203.16758 (2022)
  Link: https://doi.org/10.48550/arXiv.2203.16758
  dblp: https://dblp.org/rec/journals/corr/abs-2203-16758
  Date: Sat, 01 Jan 2022 00:00:00 +0100
An Empirical Study of Language Model Integration for Transducer based Speech Recognition. CoRR abs/2203.16776 (2022)
  Link: https://doi.org/10.48550/arXiv.2203.16776
  dblp: https://dblp.org/rec/journals/corr/abs-2203-16776
  Date: Sat, 01 Jan 2022 00:00:00 +0100
Multilingual and Crosslingual Speech Recognition Using Phonological-Vector Based Phone Embeddings. ASRU 2021: 1034-1041
  Link: https://doi.org/10.1109/ASRU51503.2021.9687966
  dblp: https://dblp.org/rec/conf/asru/ZhuAZO21
  Date: Fri, 01 Jan 2021 00:00:00 +0100
Deformable TDNN with Adaptive Receptive Fields for Speech Recognition. Interspeech 2021: 2067-2071
  Link: https://doi.org/10.21437/Interspeech.2021-387
  dblp: https://dblp.org/rec/conf/interspeech/AnZO21
  Date: Fri, 01 Jan 2021 00:00:00 +0100
Efficient Neural Architecture Search for End-to-End Speech Recognition Via Straight-Through Gradients. SLT 2021: 60-67
  Link: https://doi.org/10.1109/SLT48900.2021.9383527
  dblp: https://dblp.org/rec/conf/slt/ZhengAO21
  Date: Fri, 01 Jan 2021 00:00:00 +0100
The SLT 2021 Children Speech Recognition Challenge: Open Datasets, Rules and Baselines. SLT 2021: 1117-1123
  Link: https://doi.org/10.1109/SLT48900.2021.9383608
  dblp: https://dblp.org/rec/conf/slt/YuYWAXOLLM21
  Date: Fri, 01 Jan 2021 00:00:00 +0100
Deformable TDNN with adaptive receptive fields for speech recognition. CoRR abs/2104.14791 (2021)
  Link: https://arxiv.org/abs/2104.14791
  dblp: https://dblp.org/rec/journals/corr/abs-2104-14791
  Date: Fri, 01 Jan 2021 00:00:00 +0100
Multilingual and crosslingual speech recognition using phonological-vector based phone embeddings. CoRR abs/2107.05038 (2021)
  Link: https://arxiv.org/abs/2107.05038
  dblp: https://dblp.org/rec/journals/corr/abs-2107-05038
  Date: Fri, 01 Jan 2021 00:00:00 +0100
Sequential Deformation for Accurate Scene Text Detection. ECCV (29) 2020: 108-124
  Link: https://doi.org/10.1007/978-3-030-58526-6_7
  dblp: https://dblp.org/rec/conf/eccv/XiaoPYAYM20
  Date: Wed, 01 Jan 2020 00:00:00 +0100
CAT: A CTC-CRF Based ASR Toolkit Bridging the Hybrid and the End-to-End Approaches Towards Data Efficiency and Low Latency. INTERSPEECH 2020: 566-570
  Link: https://doi.org/10.21437/Interspeech.2020-2732
  dblp: https://dblp.org/rec/conf/interspeech/AnXO20
  Date: Wed, 01 Jan 2020 00:00:00 +0100
CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency. CoRR abs/2005.13326 (2020)
  Link: https://arxiv.org/abs/2005.13326
  dblp: https://dblp.org/rec/journals/corr/abs-2005-13326
  Date: Wed, 01 Jan 2020 00:00:00 +0100
Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients. CoRR abs/2011.05649 (2020)
  Link: https://arxiv.org/abs/2011.05649
  dblp: https://dblp.org/rec/journals/corr/abs-2011-05649
  Date: Wed, 01 Jan 2020 00:00:00 +0100
The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines. CoRR abs/2011.06724 (2020)
  Link: https://arxiv.org/abs/2011.06724
  dblp: https://dblp.org/rec/journals/corr/abs-2011-06724
  Date: Wed, 01 Jan 2020 00:00:00 +0100
CAT: CRF-based ASR Toolkit. CoRR abs/1911.08747 (2019)
  Link: http://arxiv.org/abs/1911.08747
  dblp: https://dblp.org/rec/journals/corr/abs-1911-08747
  Date: Tue, 01 Jan 2019 00:00:00 +0100