Takuya Higuchi
2020 – today
- 2024
- [c28] Takuya Higuchi, Avamarie Brueggeman, Masood Delfarah, Stephen Shum: Multichannel Voice Trigger Detection Based on Transform-Average-Concatenate. ICASSP Workshops 2024: 510-514
- [i12] Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe: ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models. CoRR abs/2401.17230 (2024)
- [i11] Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung, Skyler Seto, Tatiana Likhomanenko, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe, Barry-John Theobald: Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features? CoRR abs/2402.00340 (2024)
- [i10] Krishna Subramani, Paris Smaragdis, Takuya Higuchi, Mehrez Souden: Rethinking Non-Negative Matrix Factorization with Implicit Neural Representations. CoRR abs/2404.04439 (2024)
- [i9] Zakaria Aldeneh, Vimal Thilak, Takuya Higuchi, Barry-John Theobald, Tatiana Likhomanenko: Towards Automatic Assessment of Self-Supervised Speech Models using Rank. CoRR abs/2409.10787 (2024)
- [i8] Li-Wei Chen, Takuya Higuchi, He Bai, Ahmed Hussen Abdelaziz, Alexander Rudnicky, Shinji Watanabe, Tatiana Likhomanenko, Barry-John Theobald, Zakaria Aldeneh: Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models. CoRR abs/2409.10788 (2024)
- [i7] Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung, Li-Wei Chen, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe, Tatiana Likhomanenko, Barry-John Theobald: Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels. CoRR abs/2409.10791 (2024)
- 2023
- [i6] Takuya Higuchi, Avamarie Brueggeman, Masood Delfarah, Stephen Shum: Multichannel Voice Trigger Detection Based on Transform-average-concatenate. CoRR abs/2309.16036 (2023)
- [i5] Avamarie Brueggeman, Takuya Higuchi, Masood Delfarah, Stephen Shum, Vineet Garg: Does Single-channel Speech Enhancement Improve Keyword Spotting Accuracy? A Case Study. CoRR abs/2309.16060 (2023)
- 2022
- [c27] Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed H. Tewfik: Improving Voice Trigger Detection with Metric Learning. INTERSPEECH 2022: 1896-1900
- [i4] Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed H. Tewfik: Improving Voice Trigger Detection with Metric Learning. CoRR abs/2204.02455 (2022)
- 2021
- [c26] Takuya Higuchi, Anmol Gupta, Chandra Dhir: Multi-Task Learning with Cross Attention for Keyword Spotting. ASRU 2021: 571-578
- [c25] Takuya Higuchi, Shreyas Saxena, Mehrez Souden, Tien Dung Tran, Masood Delfarah, Chandra Dhir: Dynamic Curriculum Learning via Data Parameters for Noise Robust Keyword Spotting. ICASSP 2021: 6848-6852
- [i3] Takuya Higuchi, Shreyas Saxena, Mehrez Souden, Tien Dung Tran, Masood Delfarah, Chandra Dhir: Dynamic curriculum learning via data parameters for noise robust keyword spotting. CoRR abs/2102.09666 (2021)
- [i2] Takuya Higuchi, Anmol Gupta, Chandra Dhir: Multi-task Learning with Cross Attention for Keyword Spotting. CoRR abs/2107.07634 (2021)
- 2020
- [c24] Takuya Higuchi, Mohammad Ghasemzadeh, Kisun You, Chandra Dhir: Stacked 1D Convolutional Networks for End-to-End Small Footprint Voice Trigger Detection. INTERSPEECH 2020: 2592-2596
- [i1] Takuya Higuchi, Mohammad Ghasemzadeh, Kisun You, Chandra Dhir: Stacked 1D convolutional networks for end-to-end small footprint voice trigger detection. CoRR abs/2008.03405 (2020)
2010 – 2019
- 2018
- [j2] Hirokazu Kameoka, Takuya Higuchi, Mikihiro Tanaka, Li Li: Nonnegative Matrix Factorization With Basis Clustering Using Cepstral Distance Regularization. IEEE ACM Trans. Audio Speech Lang. Process. 26(6): 1025-1036 (2018)
- [c23] Takuya Higuchi, Keisuke Kinoshita, Nobutaka Ito, Shigeki Karita, Tomohiro Nakatani: Frame-by-Frame Closed-Form Update for Mask-Based Adaptive MVDR Beamforming. ICASSP 2018: 531-535
- [c22] Lukas Drude, Takuya Higuchi, Keisuke Kinoshita, Tomohiro Nakatani, Reinhold Haeb-Umbach: Dual Frequency- and Block-Permutation Alignment for Deep Learning Based Block-Online Blind Source Separation. ICASSP 2018: 691-695
- [c21] Katerina Zmolíková, Marc Delcroix, Keisuke Kinoshita, Takuya Higuchi, Tomohiro Nakatani, Jan Cernocký: Optimization of Speaker-Aware Multichannel Speech Extraction with ASR Criterion. ICASSP 2018: 6702-6706
- 2017
- [j1] Takuya Higuchi, Nobutaka Ito, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Tomohiro Nakatani: Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR. IEEE ACM Trans. Audio Speech Lang. Process. 25(4): 780-793 (2017)
- [c20] Katerina Zmolíková, Marc Delcroix, Keisuke Kinoshita, Takuya Higuchi, Atsunori Ogawa, Tomohiro Nakatani: Learning speaker representation for neural network based multichannel speaker extraction. ASRU 2017: 8-15
- [c19] Takuya Higuchi, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani: Adversarial training for data-driven speech enhancement without parallel corpus. ASRU 2017: 40-47
- [c18] Shoko Araki, Nobutaka Ito, Marc Delcroix, Atsunori Ogawa, Keisuke Kinoshita, Takuya Higuchi, Takuya Yoshioka, Dung T. Tran, Shigeki Karita, Tomohiro Nakatani: Online meeting recognition in noisy environments with time-frequency mask based MVDR beamforming. HSCMA 2017: 16-20
- [c17] Keisuke Kinoshita, Marc Delcroix, Atsunori Ogawa, Takuya Higuchi, Tomohiro Nakatani: Deep mixture density network for statistical model-based feature enhancement. ICASSP 2017: 251-255
- [c16] Tomohiro Nakatani, Nobutaka Ito, Takuya Higuchi, Shoko Araki, Keisuke Kinoshita: Integrating DNN-based and spatial clustering-based mask estimation for robust MVDR beamforming. ICASSP 2017: 286-290
- [c15] Takuya Higuchi, Takuya Yoshioka, Keisuke Kinoshita, Tomohiro Nakatani: Unsupervised utterance-wise beamformer estimation with speech recognition-level criterion. ICASSP 2017: 5170-5174
- [c14] Takuya Higuchi, Keisuke Kinoshita, Marc Delcroix, Katerina Zmolíková, Tomohiro Nakatani: Deep Clustering-Based Beamforming for Separation with Unknown Number of Sources. INTERSPEECH 2017: 1183-1187
- [c13] Katerina Zmolíková, Marc Delcroix, Keisuke Kinoshita, Takuya Higuchi, Atsunori Ogawa, Tomohiro Nakatani: Speaker-Aware Neural Network Based Beamformer for Speaker Extraction in Speech Mixtures. INTERSPEECH 2017: 2655-2659
- [p1] Marc Delcroix, Takuya Yoshioka, Nobutaka Ito, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto, Takuya Higuchi, Shoko Araki, Tomohiro Nakatani: Multichannel Speech Enhancement Approaches to DNN-Based Far-Field Speech Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 21-49
- 2016
- [c12] Shoko Araki, Masahiro Okada, Takuya Higuchi, Atsunori Ogawa, Tomohiro Nakatani: Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition. ICASSP 2016: 385-389
- [c11] Takuya Higuchi, Nobutaka Ito, Takuya Yoshioka, Tomohiro Nakatani: Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise. ICASSP 2016: 5210-5214
- [c10] Li Li, Hirokazu Kameoka, Takuya Higuchi, Hiroshi Saruwatari: Semi-Supervised Joint Enhancement of Spectral and Cepstral Sequences of Noisy Speech. INTERSPEECH 2016: 3753-3757
- [c9] Takuya Higuchi, Takuya Yoshioka, Tomohiro Nakatani: Optimization of Speech Enhancement Front-End with Speech Recognition-Level Criterion. INTERSPEECH 2016: 3808-3812
- [c8] Takuya Higuchi, Takuya Yoshioka, Tomohiro Nakatani: Sparseness-based multichannel nonnegative matrix factorization for blind source separation. IWAENC 2016: 1-5
- 2015
- [c7] Takuya Yoshioka, Nobutaka Ito, Marc Delcroix, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto, Chengzhu Yu, Wojciech J. Fabian, Miquel Espi, Takuya Higuchi, Shoko Araki, Tomohiro Nakatani: The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices. ASRU 2015: 436-443
- [c6] Takuya Higuchi, Hirokazu Kameoka: Unified approach for audio source separation with multichannel factorial HMM and DOA mixture model. EUSIPCO 2015: 2043-2047
- 2014
- [c5] Takuya Higuchi, Hirokazu Kameoka: Unified approach for underdetermined BSS, VAD, dereverberation and DOA estimation with multichannel factorial HMM. GlobalSIP 2014: 562-566
- [c4] Takuya Higuchi, Norihiro Takamune, Tomohiko Nakamura, Hirokazu Kameoka: Underdetermined blind separation and tracking of moving sources based on DOA-HMM. ICASSP 2014: 3191-3195
- [c3] Takuya Higuchi, Hirofumi Takeda, Tomohiko Nakamura, Hirokazu Kameoka: A unified approach for underdetermined blind signal separation and source activity detection by multichannel factorial hidden Markov models. INTERSPEECH 2014: 850-854
- [c2] Takuya Higuchi, Hirokazu Kameoka: Joint audio source separation and dereverberation based on multichannel factorial hidden Markov model. MLSP 2014: 1-6
2000 – 2009
- 2001
- [c1] Mamoru Mitsuishi, Shin'ichi Warisawa, Taishi Tsuda, Takuya Higuchi, Norihiro Koizumi, Hiroyuki Hashizume, Kazuo Fujiwara: Remote Ultrasound Diagnostic System. ICRA 2001: 1567-1574
last updated on 2024-10-24 21:32 CEST by the dblp team
all metadata released as open data under CC0 1.0 license