Bhiksha Raj
Bhiksha Ramakrishnan
Person information
- affiliation: Carnegie Mellon University, Pittsburgh, USA
2020 – today
- 2024
- [j39]Francisco Teixeira, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Privacy-Oriented Manipulation of Speaker Representations. IEEE Access 12: 82949-82971 (2024) - [j38]Fan Yang, Muqiao Yang, Xiang Li, Yuxuan Wu, Zhiyuan Zhao, Bhiksha Raj, Rita Singh:
A closer look at reinforcement learning-based automatic speech recognition. Comput. Speech Lang. 87: 101641 (2024) - [c259]Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, Bhiksha Raj:
Continual Contrastive Spoken Language Understanding. ACL (Findings) 2024: 3727-3741 - [c258]Roshan Sharma, Suwon Shon, Mark Lindsey, Hira Dhamyal, Bhiksha Raj:
Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization? ACL (1) 2024: 14779-14797 - [c257]Yizhou Zhao, Tuanfeng Yang Wang, Bhiksha Raj, Min Xu, Jimei Yang, Chun-Hao Paul Huang:
Synergistic Global-Space Camera and Human Reconstruction from Videos. CVPR 2024: 1216-1226 - [c256]Xiang Li, Jinglu Wang, Xiaohao Xu, Xiulian Peng, Rita Singh, Yan Lu, Bhiksha Raj:
QDFormer: Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition. CVPR 2024: 3402-3413 - [c255]Xiang Li, Kai Qiu, Jinglu Wang, Xiaohao Xu, Rita Singh, Kashu Yamazaki, Hao Chen, Xiaonan Huang, Bhiksha Raj:
R2-Bench: Benchmarking the Robustness of Referring Perception Models Under Perturbations. ECCV (9) 2024: 211-230 - [c254]Soham Deshmukh, Benjamin Elizalde, Dimitra Emmanouilidou, Bhiksha Raj, Rita Singh, Huaming Wang:
Training Audio Captioning Models without Audio. ICASSP 2024: 371-375 - [c253]Muhammad A. Shah, Bhiksha Raj:
Fixed Inter-Neuron Covariability Induces Adversarial Robustness. ICASSP 2024: 7005-7009 - [c252]Muqiao Yang, Umberto Cappellazzo, Xiang Li, Bhiksha Raj:
Improving Continual Learning of Acoustic Scene Classification via Mutual Information Optimization. ICASSP 2024: 7105-7109 - [c251]Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu:
uSee: Unified Speech Enhancement And Editing with Conditional Diffusion Models. ICASSP 2024: 7125-7129 - [c250]Ankit Shah, Fuyu Tang, Zelin Ye, Rita Singh, Bhiksha Raj:
Importance of Negative Sampling in Weak Label Learning. ICASSP 2024: 7530-7534 - [c249]Hira Dhamyal, Benjamin Elizalde, Soham Deshmukh, Huaming Wang, Bhiksha Raj, Rita Singh:
Prompting Audios Using Acoustic Properties for Emotion Representation. ICASSP 2024: 11936-11940 - [c248]Jee-Weon Jung, Roshan S. Sharma, William Chen, Bhiksha Raj, Shinji Watanabe:
AugSumm: Towards Generalizable Speech Summarization Using Synthetic Labels from Large Language Models. ICASSP 2024: 12071-12075 - [c247]Chien-Yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-Yi Lee:
Dynamic-Superb: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark For Speech. ICASSP 2024: 12136-12140 - [c246]Hao Chen, Jindong Wang, Ankit Shah, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj:
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks. ICLR 2024 - [c245]Hao Chen, Jindong Wang, Lei Feng, Xiang Li, Yidong Wang, Xing Xie, Masashi Sugiyama, Rita Singh, Bhiksha Raj:
A General Framework for Learning from Weak Supervision. ICML 2024 - [c244]Xiang Li, Yinpeng Chen, Chung-Ching Lin, Hao Chen, Kai Hu, Rita Singh, Bhiksha Raj, Lijuan Wang, Zicheng Liu:
Completing Visual Objects via Bridging Generation and Segmentation. ICML 2024 - [c243]Roshan Sharma, Ruchira Sharma, Hira Dhamyal, Rita Singh, Bhiksha Raj:
R-BASS : Relevance-aided Block-wise Adaptation for Speech Summarization. NAACL-HLT (Findings) 2024: 848-857 - [c242]Zhaorun Chen, Zhuokai Zhao, Zhihong Zhu, Ruiqi Zhang, Xiang Li, Bhiksha Raj, Huaxiu Yao:
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition. NAACL-HLT 2024: 1346-1362 - [i159]Jee-weon Jung, Roshan S. Sharma, William Chen, Bhiksha Raj, Shinji Watanabe:
AugSumm: towards generalizable speech summarization using synthetic labels from large language model. CoRR abs/2401.06806 (2024) - [i158]Soham Deshmukh, Dareen Alharthi, Benjamin Elizalde, Hannes Gamper, Mahmoud Al Ismail, Rita Singh, Bhiksha Raj, Huaming Wang:
PAM: Prompting Audio-Language Models for Audio Quality Assessment. CoRR abs/2402.00282 (2024) - [i157]Hao Chen, Bhiksha Raj, Xing Xie, Jindong Wang:
On Catastrophic Inheritance of Large Foundation Models. CoRR abs/2402.01909 (2024) - [i156]Hao Chen, Jindong Wang, Lei Feng, Xiang Li, Yidong Wang, Xing Xie, Masashi Sugiyama, Rita Singh, Bhiksha Raj:
A General Framework for Learning from Weak Supervision. CoRR abs/2402.01922 (2024) - [i155]Xiaohao Xu, Tianyi Zhang, Sibo Wang, Xiang Li, Yongqi Chen, Ye Li, Bhiksha Raj, Matthew Johnson-Roberson, Xiaonan Huang:
Customizable Perturbation Synthesis for Robust SLAM Benchmarking. CoRR abs/2402.08125 (2024) - [i154]Soham Deshmukh, Rita Singh, Bhiksha Raj:
Domain Adaptation for Contrastive Audio-Language Models. CoRR abs/2402.09585 (2024) - [i153]Muqiao Yang, Xiang Li, Umberto Cappellazzo, Shinji Watanabe, Bhiksha Raj:
Evaluating and Improving Continual Learning in Spoken Language Understanding. CoRR abs/2402.10427 (2024) - [i152]Zhaorun Chen, Zhuokai Zhao, Zhihong Zhu, Ruiqi Zhang, Xiang Li, Bhiksha Raj, Huaxiu Yao:
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition. CoRR abs/2402.11452 (2024) - [i151]Xiang Li, Kai Qiu, Jinglu Wang, Xiaohao Xu, Rita Singh, Kashu Yamazaki, Hao Chen, Xiaonan Huang, Bhiksha Raj:
R2-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations. CoRR abs/2403.04924 (2024) - [i150]Hao Chen, Jindong Wang, Zihan Wang, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj:
Learning with Noisy Foundation Models. CoRR abs/2403.06869 (2024) - [i149]Francisco Teixeira, Karla Pizzi, Raphaël Olivier, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Improving Membership Inference in ASR Model Auditing with Perturbed Loss Features. CoRR abs/2405.01207 (2024) - [i148]Yizhou Zhao, Tuanfeng Y. Wang, Bhiksha Raj, Min Xu, Jimei Yang, Chun-Hao Paul Huang:
Synergistic Global-space Camera and Human Reconstruction from Videos. CoRR abs/2405.14855 (2024) - [i147]Hao Chen, Yujin Han, Diganta Misra, Xiang Li, Kai Hu, Difan Zou, Masashi Sugiyama, Jindong Wang, Bhiksha Raj:
Slight Corruption in Pre-training Data Makes Better Diffusion Models. CoRR abs/2405.20494 (2024) - [i146]Thanh-Dat Truong, Utsav Prabhu, Dongyi Wang, Bhiksha Raj, Susan Gauch, Jeyamkondan Subbiah, Khoa Luu:
EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding. CoRR abs/2406.01429 (2024) - [i145]Thanh-Dat Truong, Xin Li, Bhiksha Raj, Jackson David Cothren, Khoa Luu:
ED-SAM: An Efficient Diffusion Sampling Approach to Domain Generalization in Vision-Language Foundation Models. CoRR abs/2406.01432 (2024) - [i144]Xiang Li, Kai Qiu, Hao Chen, Jason Kuen, Zhe Lin, Rita Singh, Bhiksha Raj:
ControlVAR: Exploring Controllable Visual Autoregressive Modeling. CoRR abs/2406.09750 (2024) - [i143]Xiaohao Xu, Tianyi Zhang, Sibo Wang, Xiang Li, Yongqi Chen, Ye Li, Bhiksha Raj, Matthew Johnson-Roberson, Xiaonan Huang:
From Perfect to Noisy World Simulation: Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking. CoRR abs/2406.16850 (2024) - [i142]Yuxuan Wu, Ziyu Wang, Bhiksha Raj, Gus Xia:
Emergent Interpretable Symbols and Content-Style Disentanglement via Variance-Invariance Constraints. CoRR abs/2407.03824 (2024) - [i141]Hazim T. Bukhari, Soham Deshmukh, Hira Dhamyal, Bhiksha Raj, Rita Singh:
SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios. CoRR abs/2407.15300 (2024) - [i140]Soham Deshmukh, Shuo Han, Hazim T. Bukhari, Benjamin Elizalde, Hannes Gamper, Rita Singh, Bhiksha Raj:
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding. CoRR abs/2407.18062 (2024) - [i139]Roshan S. Sharma, Suwon Shon, Mark Lindsey, Hira Dhamyal, Rita Singh, Bhiksha Raj:
Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization? CoRR abs/2408.07277 (2024) - [i138]Kai Qiu, Xiang Li, Hao Chen, Jie Sun, Jinglu Wang, Zhe Lin, Marios Savvides, Bhiksha Raj:
Efficient Autoregressive Audio Modeling via Next-Scale Prediction. CoRR abs/2408.09027 (2024) - [i137]Massa Baali, Abdulhamid Aldoobi, Hira Dhamyal, Rita Singh, Bhiksha Raj:
PDAF: A Phonetic Debiasing Attention Framework For Speaker Verification. CoRR abs/2409.05799 (2024) - [i136]Kuang Yuan, Shuo Han, Swarun Kumar, Bhiksha Raj:
DeWinder: Single-Channel Wind Noise Reduction using Ultrasound Sensing. CoRR abs/2409.06137 (2024) - [i135]Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe:
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech. CoRR abs/2409.15897 (2024) - [i134]Muhammad A. Shah, Bhiksha Raj:
Revisiting Acoustic Features for Robust ASR. CoRR abs/2409.16399 (2024) - [i133]Xiang Li, Kai Qiu, Hao Chen, Jason Kuen, Jiuxiang Gu, Bhiksha Raj, Zhe Lin:
ImageFolder: Autoregressive Image Generation with Folded Tokens. CoRR abs/2410.01756 (2024) - [i132]Ksheeraja Raghavan, Samiran Gode, Ankit Shah, Surabhi Raghavan, Wolfram Burgard, Bhiksha Raj, Rita Singh:
Did You Hear That? Introducing AADG: A Framework for Generating Benchmark Data in Audio Anomaly Detection. CoRR abs/2410.03904 (2024) - [i131]Ibrahim Aldarmaki, Thamar Solorio, Bhiksha Raj, Hanan Aldarmaki:
RelUNet: Relative Channel Fusion U-Net for Multichannel Speech Enhancement. CoRR abs/2410.05019 (2024) - [i130]Satvik Dixit, Massa Baali, Rita Singh, Bhiksha Raj:
Improving Speaker Representations Using Contrastive Losses on Multi-scale Features. CoRR abs/2410.05037 (2024) - [i129]Abdul Waheed, Hanin Atwany, Bhiksha Raj, Rita Singh:
What Do Speech Foundation Models Not Learn About Speech? CoRR abs/2410.12948 (2024) - [i128]Hao Chen, Abdul Waheed, Xiang Li, Yidong Wang, Jindong Wang, Bhiksha Raj, Marah I Abdin:
On the Diversity of Synthetic Data and its Impact on Training Large Language Models. CoRR abs/2410.15226 (2024) - [i127]Ravi Teja N. V. S. Chappa, Page Daniel Dobbs, Bhiksha Raj, Khoa Luu:
FLAASH: Flow-Attention Adaptive Semantic Hierarchical Fusion for Multi-Modal Tobacco Content Analysis. CoRR abs/2410.19896 (2024)
- 2023
- [j37]Samiran Gode, Supreeth Bare, Bhiksha Raj, Hyungon Yoo:
Understanding political polarization using language models: A dataset and method. AI Mag. 44(3): 248-254 (2023) - [j36]Viet-Khoa Vo-Ho, Sang Truong, Kashu Yamazaki, Bhiksha Raj, Minh-Triet Tran, Ngan Le:
AOE-Net: Entities Interactions Modeling with Adaptive Attention Mechanism for Temporal Action Proposals Generation. Int. J. Comput. Vis. 131(1): 302-323 (2023) - [j35]Weiyang Liu, Yandong Wen, Bhiksha Raj, Rita Singh, Adrian Weller:
SphereFace Revived: Unifying Hyperspherical Face Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 45(2): 2458-2474 (2023) - [c241]Xiang Li, Haoyuan Cao, Shijie Zhao, Junlin Li, Li Zhang, Bhiksha Raj:
Panoramic Video Salient Object Detection with Ambisonic Audio Guidance. AAAI 2023: 1424-1432 - [c240]Kashu Yamazaki, Khoa Vo, Quang Sang Truong, Bhiksha Raj, Ngan Le:
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning. AAAI 2023: 3081-3090 - [c239]Roshan S. Sharma, William Chen, Takatomo Kano, Ruchira Sharma, Siddhant Arora, Shinji Watanabe, Atsunori Ogawa, Marc Delcroix, Rita Singh, Bhiksha Raj:
Espnet-Summ: Introducing a Novel Large Dataset, Toolkit, and a Cross-Corpora Evaluation of Speech Summarization Systems. ASRU 2023: 1-8 - [c238]Thanh-Dat Truong, Ngan Le, Bhiksha Raj, Jackson David Cothren, Khoa Luu:
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding. CVPR 2023: 19988-19997 - [c237]Xiang Li, Jinglu Wang, Xiaohao Xu, Muqiao Yang, Fan Yang, Yizhou Zhao, Rita Singh, Bhiksha Raj:
Towards Noise-Tolerant Speech-Referring Video Object Segmentation: Bridging Speech and Text. EMNLP 2023: 2283-2296 - [c236]Yutian Chen, Hao Kang, Vivian Zhai, Liangze Li, Rita Singh, Bhiksha Raj:
Token Prediction as Implicit Classification to Identify LLM-Generated Text. EMNLP 2023: 13112-13120 - [c235]Ankit Shah, Larry Tang, Po Hao Chou, Yi Yu Zheng, Ziqian Ge, Bhiksha Raj:
An Approach to Ontological Learning from Weak Labels. ICASSP 2023: 1-5 - [c234]Francisco Teixeira, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Privacy-Preserving Automatic Speaker Diarization. ICASSP 2023: 1-5 - [c233]Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
Paaploss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement. ICASSP 2023: 1-5 - [c232]Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement. ICASSP 2023: 1-5 - [c231]Yandong Wen, Weiyang Liu, Yao Feng, Bhiksha Raj, Rita Singh, Adrian Weller, Michael J. Black, Bernhard Schölkopf:
Pairwise Similarity Learning is SimPLE. ICCV 2023: 5285-5295 - [c230]Xiang Li, Jinglu Wang, Xiaohao Xu, Xiao Li, Bhiksha Raj, Yan Lu:
Robust Referring Video Object Segmentation with Cyclic Structural Consensus. ICCV 2023: 22179-22188 - [c229]Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Bhiksha Raj, Marios Savvides:
SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning. ICLR 2023 - [c228]Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie:
FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning. ICLR 2023 - [c227]Raphaël Olivier, Bhiksha Raj:
How Many Perturbations Break This Model? Evaluating Robustness Beyond Adversarial Accuracy. ICML 2023: 26583-26598 - [c226]Roshan Sharma, Siddhant Arora, Kenneth Zheng, Shinji Watanabe, Rita Singh, Bhiksha Raj:
BASS: Block-wise Adaptation for Speech Summarization. INTERSPEECH 2023: 1454-1458 - [c225]Liao Qu, Xianwei Zou, Xiang Li, Yandong Wen, Rita Singh, Bhiksha Raj:
The Hidden Dance of Phonemes and Visage: Unveiling the Enigmatic Link between Phonemes and Facial Features. INTERSPEECH 2023: 2578-2582 - [c224]Raphaël Olivier, Bhiksha Raj:
There is more than one kind of robustness: Fooling Whisper with adversarial examples. INTERSPEECH 2023: 4394-4398 - [c223]Xiang Li, Yandong Wen, Muqiao Yang, Jinglu Wang, Rita Singh, Bhiksha Raj:
Rethinking Voice-Face Correlation: A Geometry View. ACM Multimedia 2023: 2458-2467 - [c222]Xiang Li, Chung-Ching Lin, Yinpeng Chen, Zicheng Liu, Jinglu Wang, Rita Singh, Bhiksha Raj:
PaintSeg: Painting Pixels for Training-free Segmentation. NeurIPS 2023 - [c221]Shentong Mo, Bhiksha Raj:
Weakly-Supervised Audio-Visual Segmentation. NeurIPS 2023 - [c220]Muhammad Shah, Aqsa Kashaf, Bhiksha Raj:
Training on Foveated Images Improves Robustness to Adversarial Attacks. NeurIPS 2023 - [c219]Thanh-Dat Truong, Hoang-Quan Nguyen, Bhiksha Raj, Khoa Luu:
Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments. NeurIPS 2023 - [i126]Samiran Gode, Supreeth Bare, Bhiksha Raj, Hyungon Yoo:
Understanding Political Polarisation using Language Models: A dataset and method. CoRR abs/2301.00891 (2023) - [i125]Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Bhiksha Raj, Marios Savvides:
SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning. CoRR abs/2301.10921 (2023) - [i124]Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement. CoRR abs/2302.08088 (2023) - [i123]Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement. CoRR abs/2302.08095 (2023) - [i122]Laurie M. Heller, Benjamin Elizalde, Bhiksha Raj, Soham Deshmukh:
Synergy between human and machine approaches to sound/scene recognition and processing: An overview of ICASSP special session. CoRR abs/2302.09719 (2023) - [i121]Ankit Shah, Shuyi Chen, Kejun Zhou, Yue Chen, Bhiksha Raj:
Approach to Learning Generalized Audio Representation Through Batch Embedding Covariance Regularization and Constant-Q Transforms. CoRR abs/2303.03591 (2023) - [i120]Joseph Konan, Ojas Bhargave, Shikhar Agnihotri, Hojeong Lee, Ankit Shah, Shuo Han, Yunyang Zeng, Amanda Shu, Haohui Liu, Xuankai Chang, Hamza Khalid, Minseon Gwak, Kawon Lee, Minjeong Kim, Bhiksha Raj:
Improving Perceptual Quality, Intelligibility, and Acoustics on VoIP Platforms. CoRR abs/2303.09048 (2023) - [i119]Thanh-Dat Truong, Ngan Le, Bhiksha Raj, Jackson David Cothren, Khoa Luu:
FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding. CoRR abs/2304.02135 (2023) - [i118]Yutian Chen, Hao Kang, Vivian Zhai, Liangze Li, Rita Singh, Bhiksha Raj:
GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content. CoRR abs/2305.07969 (2023) - [i117]Hao Chen, Ankit Shah, Jindong Wang, Ran Tao, Yidong Wang, Xing Xie, Masashi Sugiyama, Rita Singh, Bhiksha Raj:
Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations. CoRR abs/2305.12715 (2023) - [i116]Thanh-Dat Truong, Hoang-Quan Nguyen, Bhiksha Raj, Khoa Luu:
Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments. CoRR abs/2305.15700 (2023) - [i115]Xiang Li, Chung-Ching Lin, Yinpeng Chen, Zicheng Liu, Jinglu Wang, Bhiksha Raj:
PaintSeg: Training-free Segmentation via Painting. CoRR abs/2305.19406 (2023) - [i114]Pha A. Nguyen, Kha Gia Quach, John Gauch, Samee U. Khan, Bhiksha Raj, Khoa Luu:
UTOPIA: Unconstrained Tracking Objects without Preliminary Examination via Cross-Domain Adaptation. CoRR abs/2306.09613 (2023) - [i113]Roshan S. Sharma, Kenneth Zheng, Siddhant Arora, Shinji Watanabe, Rita Singh, Bhiksha Raj:
BASS: Block-wise Adaptation for Speech Summarization. CoRR abs/2307.08217 (2023) - [i112]Xiang Li, Yandong Wen, Muqiao Yang, Jinglu Wang, Rita Singh, Bhiksha Raj:
Rethinking Voice-Face Correlation: A Geometry View. CoRR abs/2307.13948 (2023) - [i111]Liao Qu, Xianwei Zou, Xiang Li, Yandong Wen, Rita Singh, Bhiksha Raj:
The Hidden Dance of Phonemes and Visage: Unveiling the Enigmatic Link between Phonemes and Facial Features. CoRR abs/2307.13953 (2023) - [i110]Muhammad A. Shah, Bhiksha Raj:
Training on Foveated Images Improves Robustness to Adversarial Attacks. CoRR abs/2308.00854 (2023) - [i109]Muhammad Ahmed Shah, Bhiksha Raj:
Fixed Inter-Neuron Covariability Induces Adversarial Robustness. CoRR abs/2308.03956 (2023) - [i108]Soham Deshmukh, Benjamin Elizalde, Dimitra Emmanouilidou, Bhiksha Raj, Rita Singh, Huaming Wang:
Training Audio Captioning Models without Audio. CoRR abs/2309.07372 (2023) - [i107]Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-yi Lee:
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech. CoRR abs/2309.09510 (2023) - [i106]Ankit Shah, Fuyu Tang, Zelin Ye, Rita Singh, Bhiksha Raj:
Importance of negative sampling in weak label learning. CoRR abs/2309.13227 (2023) - [i105]Hao Chen, Jindong Wang, Ankit Shah, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj:
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks. CoRR abs/2309.17002 (2023) - [i104]Xiang Li, Jinglu Wang, Xiaohao Xu, Xiulian Peng, Rita Singh, Yan Lu, Bhiksha Raj:
Rethinking Audiovisual Segmentation with Semantic Quantization and Decomposition. CoRR abs/2310.00132 (2023) - [i103]Dareen Alharthi, Roshan Sharma, Hira Dhamyal, Soumi Maiti, Bhiksha Raj, Rita Singh:
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech. CoRR abs/2310.00706 (2023) - [i102]Xiang Li, Yinpeng Chen, Chung-Ching Lin, Rita Singh, Bhiksha Raj, Zicheng Liu:
Completing Visual Objects via Bridging Generation and Segmentation. CoRR abs/2310.00808 (2023) - [i101]Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu:
uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models. CoRR abs/2310.00900 (2023) - [i100]Hira Dhamyal, Benjamin Elizalde, Soham Deshmukh, Huaming Wang, Bhiksha Raj, Rita Singh:
Prompting Audios Using Acoustic Properties For Emotion Representation. CoRR abs/2310.02298 (2023) - [i99]Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, Bhiksha Raj:
Continual Contrastive Spoken Language Understanding. CoRR abs/2310.02699 (2023) - [i98]Muhammad Ahmed Shah, Roshan Sharma, Hira Dhamyal, Raphaël Olivier, Ankit Shah, Joseph Konan, Dareen Alharthi, Hazim T. Bukhari, Massa Baali, Soham Deshmukh, Michael Kuhlmann, Bhiksha Raj, Rita Singh:
LoFT: Local Proxy Fine-tuning For Improving Transferability Of Adversarial Attacks Against Large Language Model. CoRR abs/2310.04445 (2023) - [i97]Joseph Konan, Ojas Bhargave, Shikhar Agnihotri, Shuo Han, Yunyang Zeng, Ankit Shah, Bhiksha Raj:
Psychoacoustic Challenges Of Speech Enhancement On VoIP Platforms. CoRR abs/2310.07161 (2023) - [i96]Yandong Wen, Weiyang Liu, Yao Feng, Bhiksha Raj, Rita Singh, Adrian Weller, Michael J. Black, Bernhard Schölkopf:
Pairwise Similarity Learning is SimPLE. CoRR abs/2310.09449 (2023) - [i95]Yutian Chen, Hao Kang, Vivian Zhai, Liangze Li, Rita Singh, Bhiksha Raj:
Token Prediction as Implicit Classification to Identify LLM-Generated Text. CoRR abs/2311.08723 (2023) - [i94]Shentong Mo, Bhiksha Raj:
Weakly-Supervised Audio-Visual Segmentation. CoRR abs/2311.15080 (2023) - [i93]Thanh-Dat Truong, Utsav Prabhu, Bhiksha Raj, Jackson David Cothren, Khoa Luu:
FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding in Open World. CoRR abs/2311.15965 (2023)
- 2022
- [c218]Roshan Sharma, Bhiksha Raj:
Cross-utterance context for multimodal video transcription. IEEECONF 2022: 1321-1325 - [c217]Yandong Wen, Weiyang Liu, Adrian Weller, Bhiksha Raj, Rita Singh:
SphereFace2: Binary Classification is All You Need for Deep Face Recognition. ICLR 2022 - [c216]Hira Dhamyal, Bhiksha Raj, Rita Singh:
Positional Encoding for Capturing Modality Specific Cadence for Emotion Detection. INTERSPEECH 2022: 166-170 - [c215]Francisco Teixeira, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Towards End-to-End Private Automatic Speaker Recognition. INTERSPEECH 2022: 2798-2802 - [c214]Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
Improving Speech Enhancement through Fine-Grained Speech Characteristics. INTERSPEECH 2022: 2953-2957 - [c213]Raphaël Olivier, Bhiksha Raj:
Recent improvements of ASR models in the face of adversarial attacks. INTERSPEECH 2022: 4113-4117 - [c212]Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, Renjie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yufeng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang:
USB: A Unified Semi-supervised Learning Benchmark for Classification. NeurIPS 2022 - [i92]Larry Tang, Po Hao Chou, Yi Yu Zheng, Ziqian Ge, Ankit Shah, Bhiksha Raj:
Ontological Learning from Weak Labels. CoRR abs/2203.02483 (2022) - [i91]Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W. Schuller, Christian J. Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse H. Engel, Justin Salamon, Philippe Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk:
HEAR 2021: Holistic Evaluation of Audio Representations. CoRR abs/2203.03022 (2022) - [i90]Shentong Mo, Jingfei Xia, Xiaoqing Tan, Bhiksha Raj:
Point3D: tracking actions as moving points with 3D CNNs. CoRR abs/2203.10584 (2022) - [i89]Raphaël Olivier, Bhiksha Raj:
Recent improvements of ASR models in the face of adversarial attacks. CoRR abs/2203.16536 (2022) - [i88]Ankit Shah, Hira Dhamyal, Yang Gao, Rita Singh, Bhiksha Raj:
On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice. CoRR abs/2204.04802 (2022) - [i87]Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele:
FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning. CoRR abs/2205.07246 (2022) - [i86]Chonghan Chen, Qi Jiang, Chih-Hao Wang, Noel Chen, Haohan Wang, Xiang Li, Bhiksha Raj:
Bear the Query in Mind: Visual Grounding with Query-conditioned Convolution. CoRR abs/2206.09114 (2022) - [i85]Francisco Teixeira, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Towards End-to-End Private Automatic Speaker Recognition. CoRR abs/2206.11750 (2022) - [i84]Roshan Sharma, Tyler Vuong, Mark Lindsey, Hira Dhamyal, Rita Singh, Bhiksha Raj:
Self-supervision and Learnable STRFs for Age, Emotion, and Country Prediction. CoRR abs/2206.12568 (2022) - [i83]Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
Improving Speech Enhancement through Fine-Grained Speech Characteristics. CoRR abs/2207.00237 (2022) - [i82]Xiang Li, Jinglu Wang, Xiaohao Xu, Xiao Li, Yan Lu, Bhiksha Raj:
R^2VOS: Robust Referring Video Object Segmentation via Relational Multimodal Cycle Consistency. CoRR abs/2207.01203 (2022) - [i81]Raphaël Olivier, Bhiksha Raj:
Not all broken defenses are equal: The dead angles of adversarial accuracy. CoRR abs/2207.04129 (2022) - [i80]Xiang Li, Jinglu Wang, Xiaohao Xu, Bhiksha Raj, Yan Lu:
Online Video Instance Segmentation via Robust Context Fusion. CoRR abs/2207.05580 (2022) - [i79]Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, Renjie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yufeng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang:
USB: A Unified Semi-supervised Learning Benchmark. CoRR abs/2208.07204 (2022) - [i78]Raphaël Olivier, Hadi Abdullah, Bhiksha Raj:
Watch What You Pretrain For: Targeted, Transferable Adversarial Examples on Self-Supervised Speech Recognition models. CoRR abs/2209.13523 (2022) - [i77]Khoa Vo, Sang Truong, Kashu Yamazaki, Bhiksha Raj, Minh-Triet Tran, Ngan Le:
AOE-Net: Entities Interactions Modeling with Adaptive Attention Mechanism for Temporal Action Proposals Generation. CoRR abs/2210.02578 (2022) - [i76]Francisco Teixeira, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Privacy-preserving Automatic Speaker Diarization. CoRR abs/2210.14995 (2022) - [i75]Roshan Sharma, Hira Dhamyal, Bhiksha Raj, Rita Singh:
Unifying the Discrete and Continuous Emotion labels for Speech Emotion Recognition. CoRR abs/2210.16642 (2022) - [i74]Roshan Sharma, Bhiksha Raj:
XNOR-FORMER: Learning Accurate Approximations in Long Speech Transformers. CoRR abs/2210.16643 (2022) - [i73]Raphaël Olivier, Bhiksha Raj:
There is more than one kind of robustness: Fooling Whisper with adversarial examples. CoRR abs/2210.17316 (2022) - [i72]Hira Dhamyal, Benjamin Elizalde, Soham Deshmukh, Huaming Wang, Bhiksha Raj, Rita Singh:
Describing emotions with acoustic property prompts for speech emotion recognition. CoRR abs/2211.07737 (2022) - [i71]Xiang Li, Haoyuan Cao, Shijie Zhao, Junlin Li, Li Zhang, Bhiksha Raj:
Panoramic Video Salient Object Detection with Ambisonic Audio Guidance. CoRR abs/2211.14419 (2022) - [i70]Kashu Yamazaki, Khoa Vo, Sang Truong, Bhiksha Raj, Ngan Le:
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning. CoRR abs/2211.15103 (2022)
- 2021
- [j34]Wenbo Liu, Ming Li, Xiaobing Zou, Bhiksha Raj:
Discriminative Dictionary Learning for Autism Spectrum Disorder Identification. Frontiers Comput. Neurosci. 15: 662401 (2021) - [c211]Shentong Mo, Jingfei Xia, Xiaoqing Tan, Bhiksha Raj:
Point3D: tracking actions as moving points with 3D CNNs. BMVC 2021: 259 - [c210]Raphaël Olivier, Bhiksha Raj:
Sequential Randomized Smoothing for Adversarially Robust Speech Recognition. EMNLP (1) 2021: 6372-6386 - [c209]Raphaël Olivier, Bhiksha Raj, Muhammad Shah:
High-Frequency Adversarial Defense for Speech and Audio. ICASSP 2021: 2995-2999 - [c208]Muhammad A. Shah, Raphaël Olivier, Bhiksha Raj:
Towards Adversarial Robustness Via Compact Feature Representations. ICASSP 2021: 3845-3849 - [c207]Ali Shahin Shamsabadi, Francisco Sepúlveda Teixeira, Alberto Abad, Bhiksha Raj, Andrea Cavallaro, Isabel Trancoso:
FoolHD: Fooling Speaker Identification by Highly Imperceptible Adversarial Disturbances. ICASSP 2021: 6159-6163 - [c206]Maria Joana Correia, Francisco Teixeira, Catarina Botelho, Isabel Trancoso, Bhiksha Raj:
The in-the-Wild Speech Medical Corpus. ICASSP 2021: 6973-6977 - [c205]Thanh-Dat Truong, Chi Nhan Duong, The De Vu, Hoang Anh Pham, Bhiksha Raj, Ngan Le, Khoa Luu:
The Right to Talk: An Audio-Visual Transformer Approach. ICCV 2021: 1085-1094 - [c204]Kai Hu, Jie Shao, Yuan Liu, Bhiksha Raj, Marios Savvides, Zhiqiang Shen:
Contrast and Order Representations for Video Self-supervised Learning. ICCV 2021: 7919-7929 - [c203]Yandong Wen, Weiyang Liu, Bhiksha Raj, Rita Singh:
Self-Supervised 3D Face Reconstruction via Conditional Estimation. ICCV 2021: 13269-13278 - [c202]Soham Deshmukh, Bhiksha Raj, Rita Singh:
Improving Weakly Supervised Sound Event Detection with Self-Supervised Auxiliary Tasks. Interspeech 2021: 596-600 - [c201]Jiachen Lian, Aiswarya Vinod Kumar, Hira Dhamyal, Bhiksha Raj, Rita Singh:
Masked Proxy Loss for Text-Independent Speaker Verification. Interspeech 2021: 4638-4642 - [c200]Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W. Schuller, Christian J. Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse H. Engel, Justin Salamon, Philippe Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk:
HEAR: Holistic Evaluation of Audio Representations. NeurIPS (Competition and Demos) 2021: 125-145 - [c199]Yang Gao, Jiachen Lian, Bhiksha Raj, Rita Singh:
Detection and Evaluation of Human and Machine Generated Speech in Spoofing Attacks on Automatic Speaker Verification Systems. SLT 2021: 544-551 - [c198]Benjamin Elizalde, Radu Revutchi, Samarjit Das, Bhiksha Raj, Ian R. Lane, Laurie M. Heller:
Identifying Actions for Sound Event Classification. WASPAA 2021: 26-30 - [i69]Bronya Roni Chernyak, Bhiksha Raj, Tamir Hazan, Joseph Keshet:
Constant Random Perturbations Provide Adversarial Robustness with Minimal Effect on Accuracy. CoRR abs/2103.08265 (2021) - [i68]Anxiang Zhang, Ankit Shah, Bhiksha Raj:
Training image classifiers using Semi-Weak Label Data. CoRR abs/2103.10608 (2021) - [i67]Benjamin Elizalde, Radu Revutchi, Samarjit Das, Bhiksha Raj, Ian R. Lane, Laurie M. Heller:
Identifying Actions for Sound Event Classification. CoRR abs/2104.12693 (2021) - [i66]Soham Deshmukh, Bhiksha Raj, Rita Singh:
Improving weakly supervised sound event detection with self-supervised auxiliary tasks. CoRR abs/2106.06858 (2021) - [i65]Hao Liang, Lulan Yu, Guikang Xu, Bhiksha Raj, Rita Singh:
Controlled AutoEncoders to Generate Faces from Voices. CoRR abs/2107.07988 (2021) - [i64]Yandong Wen, Weiyang Liu, Adrian Weller, Bhiksha Raj, Rita Singh:
SphereFace2: Binary Classification is All You Need for Deep Face Recognition. CoRR abs/2108.01513 (2021) - [i63]Thanh-Dat Truong, Chi Nhan Duong, The De Vu, Hoang Anh Pham, Bhiksha Raj, Ngan Le, Khoa Luu:
The Right to Talk: An Audio-Visual Transformer Approach. CoRR abs/2108.03256 (2021) - [i62]Weiyang Liu, Yandong Wen, Bhiksha Raj, Rita Singh, Adrian Weller:
SphereFace Revived: Unifying Hyperspherical Face Recognition. CoRR abs/2109.05565 (2021) - [i61]Yandong Wen, Weiyang Liu, Bhiksha Raj, Rita Singh:
Self-Supervised 3D Face Reconstruction via Conditional Estimation. CoRR abs/2110.04800 (2021) - [i60]Raphaël Olivier, Bhiksha Raj:
Sequential Randomized Smoothing for Adversarially Robust Speech Recognition. CoRR abs/2112.03000 (2021)
- 2020
- [c197]Muhammad Ahmed Shah, Bhiksha Raj:
Deriving Compact Feature Representations Via Annealed Contraction. ICASSP 2020: 2068-2072 - [c196]Rowland Chen, Roger B. Dannenberg, Bhiksha Raj, Rita Singh:
Artificial Creative Intelligence: Breaking the Imitation Barrier. ICCC 2020: 319-325 - [c195]Wenbo Zhao, Yang Gao, Shahan Ali Memon, Bhiksha Raj, Rita Singh:
Hierarchical Routing Mixture of Experts. ICPR 2020: 7900-7906 - [c194]Muhammad Ahmed Shah, Raphaël Olivier, Bhiksha Raj:
Exploiting Non-Linear Redundancy for Neural Model Compression. ICPR 2020: 9928-9935 - [c193]Muhammad A. Shah, Raphaël Olivier, Bhiksha Raj:
Optimal Strategies For Comparing Covariates To Solve Matching Problems. ICPR 2020: 10622-10628 - [c192]Hira Dhamyal, Shahan Ali Memon, Bhiksha Raj, Rita Singh:
The Phonetic Bases of Vocal Expressed Emotion: Natural versus Acted. INTERSPEECH 2020: 3451-3455 - [c191]Felix Kreuk, Yossi Adi, Bhiksha Raj, Rita Singh, Joseph Keshet:
Hide and Speak: Towards Deep Neural Networks for Speech Steganography. INTERSPEECH 2020: 4656-4660 - [c190]Hao Liang, Lulan Yu, Guikang Xu, Bhiksha Raj, Rita Singh:
Controlled AutoEncoders to Generate Faces from Voices. ISVC (1) 2020: 476-487 - [c189]Maria Joana Correia, Isabel Trancoso, Bhiksha Raj:
Automatic In-the-wild Dataset Annotation with Deep Generalized Multiple Instance Learning. LREC 2020: 3542-3550 - [c188]Muhammad Ahmed Shah, Khaled A. Harras, Bhiksha Raj:
Sherlock: A Crowd-sourced System For Automatic Tagging Of Indoor Floor Plans. MASS 2020: 594-602 - [c187]Jie Shao, Kai Hu, Changhu Wang, Xiangyang Xue, Bhiksha Raj:
Is normalization indispensable for training deep neural network? NeurIPS 2020 - [i59]Yuichiro Koyama, Tyler Vuong, Stefan Uhlich, Bhiksha Raj:
Exploring the Best Loss Function for DNN-Based Low-latency Speech Enhancement with Temporal Convolutional Networks. CoRR abs/2005.11611 (2020) - [i58]Yuichiro Koyama, Oluwafemi Azeez, Bhiksha Raj:
Efficient Integration of Multi-channel Information for Speaker-independent Speech Separation. CoRR abs/2005.11612 (2020) - [i57]Yuichiro Koyama, Bhiksha Raj:
Exploring Optimal DNN Architecture for End-to-End Beamformers Based on Time-frequency References. CoRR abs/2005.12683 (2020) - [i56]Muhammad Ahmed Shah, Raphaël Olivier, Bhiksha Raj:
Exploiting Non-Linear Redundancy for Neural Model Compression. CoRR abs/2005.14070 (2020) - [i55]Soham Deshmukh, Bhiksha Raj, Rita Singh:
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection. CoRR abs/2008.07085 (2020) - [i54]Yang Gao, Jiachen Lian, Bhiksha Raj, Rita Singh:
Detection and Evaluation of human and machine generated speech in spoofing attacks on automatic speaker verification systems. CoRR abs/2011.03689 (2020) - [i53]Jiachen Lian, Aiswarya Vinod Kumar, Hira Dhamyal, Bhiksha Raj, Rita Singh:
Mask Proxy Loss for Text-Independent Speaker Recognition. CoRR abs/2011.04491 (2020) - [i52]Ali Shahin Shamsabadi, Francisco Sepúlveda Teixeira, Alberto Abad, Bhiksha Raj, Andrea Cavallaro, Isabel Trancoso:
FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances. CoRR abs/2011.08483 (2020)
2010 – 2019
- 2019
- [j33]Annamaria Mesaros, Aleksandr Diment, Benjamin Elizalde, Toni Heittola, Emmanuel Vincent, Bhiksha Raj, Tuomas Virtanen:
Sound Event Detection in the DCASE 2017 Challenge. IEEE ACM Trans. Audio Speech Lang. Process. 27(6): 992-1006 (2019) - [c186]M. Joana Correia, Isabel Trancoso, Bhiksha Raj:
In-the-Wild End-to-End Detection of Speech Affecting Diseases. ASRU 2019: 734-741 - [c185]Hira Dhamyal, Tianyan Zhou, Bhiksha Raj, Rita Singh:
Optimizing Neural Network Embeddings Using a Pair-Wise Loss for Text-Independent Speaker Verification. ASRU 2019: 742-748 - [c184]Abelino Jiménez, Bhiksha Raj:
Time Signal Classification Using Random Convolutional Features. ICASSP 2019: 3592-3596 - [c183]Benjamin Elizalde, Shuayb Zarar, Bhiksha Raj:
Cross Modal Audio Search and Retrieval with Joint Embeddings Based on Text and Audio. ICASSP 2019: 4095-4099 - [c182]Daanish Ali Khan, Saquib Razak, Bhiksha Raj, Rita Singh:
Human Behaviour Recognition Using Wifi Channel State Information. ICASSP 2019: 7625-7629 - [c181]Yandong Wen, Mahmoud Al Ismail, Weiyang Liu, Bhiksha Raj, Rita Singh:
Disjoint Mapping Network for Cross-modal Matching of Voices and Faces. ICLR (Poster) 2019 - [c180]Anurag Kumar, Ankit Shah, Alexander G. Hauptmann, Bhiksha Raj:
Learning Sound Events from Webly Labeled Data. IJCAI 2019: 2772-2778 - [c179]Shahan Ali Memon, Wenbo Zhao, Bhiksha Raj, Rita Singh:
Neural Regression Trees. IJCNN 2019: 1-8 - [c178]Yandong Wen, Bhiksha Raj, Rita Singh:
Face Reconstruction from Voice using Generative Adversarial Networks. NeurIPS 2019: 5266-5275 - [i51]Felix Kreuk, Yossi Adi, Bhiksha Raj, Rita Singh, Joseph Keshet:
Hide and Speak: Deep Neural Networks for Speech Steganography. CoRR abs/1902.03083 (2019) - [i50]Wenbo Zhao, Yang Gao, Shahan Ali Memon, Bhiksha Raj, Rita Singh:
Hierarchical Routing Mixture of Experts. CoRR abs/1903.07756 (2019) - [i49]Chirag Nagpal, Rohan Sangave, Amit Chahar, Parth Shah, Artur Dubrawski, Bhiksha Raj:
Nonlinear Semi-Parametric Models for Survival Analysis. CoRR abs/1905.05865 (2019) - [i48]Yandong Wen, Rita Singh, Bhiksha Raj:
Reconstructing faces from voices. CoRR abs/1905.10604 (2019) - [i47]Daanish Ali Khan, Linhong Li, Ninghao Sha, Zhuoran Liu, Abelino Jimenez, Bhiksha Raj, Rita Singh:
Non-Determinism in Neural Networks for Adversarial Robustness. CoRR abs/1905.10906 (2019) - [i46]Shahan Ali Memon, Hira Dhamyal, Oren Wright, Daniel Justice, Vijaykumar Palat, William Boler, Yandong Wen, Bhiksha Raj, Rita Singh:
Detecting gender differences in perception of emotion in crowdsourced data. CoRR abs/1910.11386 (2019) - [i45]Yuichiro Koyama, Bhiksha Raj:
W-Net BF: DNN-based Beamformer Using Joint Training Approach. CoRR abs/1910.14262 (2019) - [i44]Hira Dhamyal, Shahan Ali Memon, Bhiksha Raj, Rita Singh:
The phonetic bases of vocal expressed emotion: natural versus acted. CoRR abs/1911.05733 (2019)
- 2018
- [j32]Sebastian Säger, Benjamin Elizalde, Damian Borth, Christian Schulze, Bhiksha Raj, Ian R. Lane:
AudioPairBank: towards a large-scale tag-pair-based audio content analysis. EURASIP J. Audio Speech Music. Process. 2018: 12 (2018) - [c177]Abelino Jimenez, Benjamin Elizalde, Bhiksha Raj:
Acoustic Scene Classification Using Discrete Random Hashing for Laplacian Kernel Machines. ICASSP 2018: 146-150 - [c176]Yang Gao, Rita Singh, Bhiksha Raj:
Voice Impersonation Using Generative Adversarial Networks. ICASSP 2018: 2506-2510 - [c175]Rohan Badlani, Ankit Shah, Benjamin Elizalde, Anurag Kumar, Bhiksha Raj:
Framework for Evaluation of Sound Event Detection in Web Videos. ICASSP 2018: 3096-3100 - [c174]Pranay Manocha, Rohan Badlani, Anurag Kumar, Ankit Shah, Benjamin Elizalde, Bhiksha Raj:
Content-Based Representations of Audio Using Siamese Neural Networks. ICASSP 2018: 3136-3140 - [c173]Yandong Wen, Tianyan Zhou, Rita Singh, Bhiksha Raj:
A Corrective Learning Approach for Text-Independent Speaker Verification. ICASSP 2018: 4894-4898 - [c172]Sabit Hassan, Shaden Shaar, Bhiksha Raj, Saquib Razak:
Interactive Evaluation of Classifiers Under Limited Resources. ICMLA 2018: 173-180 - [c171]M. Joana Correia, Bhiksha Raj, Isabel Trancoso, Francisco Teixeira:
Mining Multimodal Repositories for Speech Affecting Diseases. INTERSPEECH 2018: 2963-2967 - [c170]Anurag Kumar, Bhiksha Raj:
Classifier Risk Estimation Under Limited Labeling Resources. PAKDD (1) 2018: 3-15 - [c169]Isabel Trancoso, Maria Joana Correia, Francisco Teixeira, Bhiksha Raj, Alberto Abad:
Analysing Speech for Clinical Applications. SLSP 2018: 3-6 - [c168]M. Joana Correia, Bhiksha Raj, Isabel Trancoso:
Querying Depression Vlogs. SLT 2018: 987-993 - [c167]Isabel Trancoso, Maria Joana Correia, Francisco Teixeira, Bhiksha Raj, Alberto Abad:
Speech Analytics for Medical Applications. TSD 2018: 26-37 - [p7]Gerald Friedland, Paris Smaragdis, Josh H. McDermott, Bhiksha Raj:
Audition for multimedia computing. Frontiers of Multimedia Research 2018: 31-50 - [i43]Abelino Jimenez, Benjamin Elizalde, Bhiksha Raj:
DCASE 2017 Task 1: Acoustic Scene Classification Using Shift-Invariant Kernels and Random Features. CoRR abs/1801.02690 (2018) - [i42]Benjamin Elizalde, Rohan Badlani, Ankit Shah, Anurag Kumar, Bhiksha Raj:
NELS - Never-Ending Learner of Sounds. CoRR abs/1801.05544 (2018) - [i41]Yang Gao, Rita Singh, Bhiksha Raj:
Voice Impersonation using Generative Adversarial Networks. CoRR abs/1802.06840 (2018) - [i40]Ankit Shah, Anurag Kumar, Alexander G. Hauptmann, Bhiksha Raj:
A Closer Look at Weak Label Learning for Audio Events. CoRR abs/1804.09288 (2018) - [i39]Yandong Wen, Mahmoud Al Ismail, Bhiksha Raj, Rita Singh:
Optimal Strategies for Matching and Retrieval Problems by Comparing Covariates. CoRR abs/1807.04834 (2018) - [i38]Yandong Wen, Mahmoud Al Ismail, Weiyang Liu, Bhiksha Raj, Rita Singh:
Disjoint Mapping Network for Cross-modal Matching of Voices and Faces. CoRR abs/1807.04836 (2018) - [i37]Shahan Ali Memon, Wenbo Zhao, Bhiksha Raj, Rita Singh:
Neural Regression Trees. CoRR abs/1810.00974 (2018) - [i36]Anurag Kumar, Ankit Shah, Alexander G. Hauptmann, Bhiksha Raj:
Learning Sound Events From Webly Labeled Data. CoRR abs/1811.09967 (2018)
- 2017
- [c166]Nia Peters, Bhiksha Raj, Griffin D. Romigh:
Topic and Prosodic Modeling for Interruption Management in Multi-User Multitasking Communication Interactions. AAAI Fall Symposia 2017: 45-53 - [c165]Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj, Le Song:
SphereFace: Deep Hypersphere Embedding for Face Recognition. CVPR 2017: 6738-6746 - [c164]Abelino Jimenez, Benjamin Elizalde, Bhiksha Raj:
DCASE 2017 Task 1: Acoustic Scene Classification Using Shift-Invariant Kernels and Random Features. DCASE 2017: 55-58 - [c163]Annamaria Mesaros, Toni Heittola, Aleksandr Diment, Benjamin Elizalde, Ankit Shah, Emmanuel Vincent, Bhiksha Raj, Tuomas Virtanen:
DCASE2017 Challenge Setup: Tasks, Datasets and Baseline System. DCASE 2017: 85-92 - [c162]Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, Rohan Badlani, Anurag Kumar, Bhiksha Raj, Ian R. Lane:
An approach for self-training audio event detectors using web data. EUSIPCO 2017: 1863-1867 - [c161]Keiichi Osako, Yuki Mitsufuji, Rita Singh, Bhiksha Raj:
Supervised monaural source separation based on autoencoders. ICASSP 2017: 11-15 - [c160]Anurag Kumar, Bhiksha Raj, Ndapandula Nakashole:
Discovering sound concepts and acoustic relations in text. ICASSP 2017: 631-635 - [c159]Abelino Jimenez, Bhiksha Raj:
Privacy preserving Distance computation using somewhat-trusted third parties. ICASSP 2017: 6399-6403 - [c158]Bhiksha Raj, Anurag Kumar:
Audio event and scene recognition: A unified approach using strongly and weakly labeled data. IJCNN 2017: 3475-3482 - [c157]Janek Ebbers, Jahn Heymann, Lukas Drude, Thomas Glarner, Reinhold Haeb-Umbach, Bhiksha Raj:
Hidden Markov Model Variational Autoencoder for Acoustic Unit Discovery. INTERSPEECH 2017: 488-492 - [c156]Anurag Kumar, Benjamin Elizalde, Bhiksha Raj:
Audio Content Based Geotagging in Multimedia. INTERSPEECH 2017: 1874-1878 - [c155]Muhammad Ahmed Shah, Bhiksha Raj, Khaled A. Harras:
Inferring room semantics using acoustic monitoring. MLSP 2017: 1-6 - [c154]Abelino Jimenez, Bhiksha Raj:
A two factor transformation for speaker verification through ℓ1 comparison. WIFS 2017: 1-6 - [p6]Keisuke Kinoshita, Marc Delcroix, Sharon Gannot, Emanuël A. P. Habets, Reinhold Haeb-Umbach, Walter Kellermann, Volker Leutnant, Roland Maas, Tomohiro Nakatani, Bhiksha Raj, Armin Sehr, Takuya Yoshioka:
The REVERB Challenge: A Benchmark Task for Reverberation-Robust ASR Techniques. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 345-354 - [i35]Nikolas Wolfe, Aditya Sharma, Lukas Drude, Bhiksha Raj:
The Incredible Shrinking Neural Network: New Perspectives on Learning Representations Through The Lens of Pruning. CoRR abs/1701.04465 (2017) - [i34]Haohan Wang, Bhiksha Raj, Eric P. Xing:
On the Origin of Deep Learning. CoRR abs/1702.07800 (2017) - [i33]Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj, Le Song:
SphereFace: Deep Hypersphere Embedding for Face Recognition. CoRR abs/1704.08063 (2017) - [i32]Anurag Kumar, Bhiksha Raj:
Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data. CoRR abs/1707.02530 (2017) - [i31]Anders Øland, Aayush Bansal, Roger B. Dannenberg, Bhiksha Raj:
Be Careful What You Backpropagate: A Case For Linear Output Activations & Gradient Boosting. CoRR abs/1707.04199 (2017) - [i30]Muhammad Ahmed Shah, Bhiksha Raj, Khaled A. Harras:
Inferring Room Semantics Using Acoustic Monitoring. CoRR abs/1710.08684 (2017) - [i29]Pranay Manocha, Rohan Badlani, Anurag Kumar, Ankit Shah, Benjamin Elizalde, Bhiksha Raj:
Content-based Representations of audio using Siamese neural networks. CoRR abs/1710.10974 (2017) - [i28]Rohan Badlani, Ankit Shah, Benjamin Elizalde, Anurag Kumar, Bhiksha Raj:
Framework for evaluation of sound event detection in web videos. CoRR abs/1711.00804 (2017)
- 2016
- [j31]Keisuke Kinoshita, Marc Delcroix, Sharon Gannot, Emanuël A. P. Habets, Reinhold Haeb-Umbach, Walter Kellermann, Volker Leutnant, Roland Maas, Tomohiro Nakatani, Bhiksha Raj, Armin Sehr, Takuya Yoshioka:
A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research. EURASIP J. Adv. Signal Process. 2016: 7 (2016) - [j30]Sohail Bahmani, Petros T. Boufounos, Bhiksha Raj:
Learning Model-Based Sparsity via Projected Gradient Descent. IEEE Trans. Inf. Theory 62(4): 2092-2099 (2016) - [j29]Afsaneh Asaei, Mohammad Javad Taghizadeh, Saeid Haghighatshoar, Bhiksha Raj, Hervé Bourlard, Volkan Cevher:
Binary Sparse Coding of Convolutive Mixtures for Sound Localization and Separation via Spatialization. IEEE Trans. Signal Process. 64(3): 567-579 (2016) - [c153]Zhen-Zhong Lan, Shoou-I Yu, Dezhong Yao, Ming Lin, Bhiksha Raj, Alexander G. Hauptmann:
The Best of Both Worlds: Combining Data-Independent and Data-Driven Approaches for Action Recognition. CVPR Workshops 2016: 1196-1205 - [c152]Benjamin Elizalde, Anurag Kumar, Ankit Shah, Rohan Badlani, Emmanuel Vincent, Bhiksha Raj, Ian R. Lane:
Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording. DCASE 2016: 20-24 - [c151]João Miranda, Ramón Fernandez Astudillo, Ângela Costa, André Silva, Hugo Silva, João Graça, Bhiksha Raj:
Crowdsourced Video Subtitling with Adaptation Based on User-Corrected Lattices. IberSPEECH 2016: 138-147 - [c150]Maria Joana Correia, Isabel Trancoso, Bhiksha Raj:
Detecting Psychological Distress in Adults Through Transcriptions of Clinical Interviews. IberSPEECH 2016: 162-171 - [c149]Rita Singh, Joseph Keshet, Deniz Gençaga, Bhiksha Raj:
The relationship of voice onset time and Voice Offset Time to physical age. ICASSP 2016: 5390-5394 - [c148]Anurag Kumar, Bhiksha Raj:
Weakly supervised scalable audio content analysis. ICME 2016: 1-6 - [c147]Agha Ali Raza, Rajat Kulshreshtha, Spandana Gella, Sean Blagsvedt, Maya Chandrasekaran, Bhiksha Raj, Roni Rosenfeld:
Viral Spread via Entertainment and Voice-Messaging Among Telephone Users in India. ICTD 2016: 1 - [c146]Lukas Drude, Bhiksha Raj, Reinhold Haeb-Umbach:
On the Appropriateness of Complex-Valued Neural Networks for Speech Enhancement. INTERSPEECH 2016: 1745-1749 - [c145]Rita Singh, Deniz Gençaga, Bhiksha Raj:
Formant manipulations in voice disguise by mimicry. IWBF 2016: 1-6 - [c144]Rita Singh, Bhiksha Raj, James Baker:
Short-term analysis for estimating physical parameters of speakers. IWBF 2016: 1-6 - [c143]Rita Singh, Bhiksha Raj, Deniz Gençaga:
Forensic anthropometry from voice: An articulatory-phonetic approach. MIPRO 2016: 1375-1380 - [c142]Anurag Kumar, Bhiksha Raj:
Audio Event Detection using Weakly Labeled Data. ACM Multimedia 2016: 1038-1047 - [c141]Maria Joana Correia, Isabel Trancoso, Bhiksha Raj:
Adaptation of SVM for MIL for inferring the polarity of movies and movie reviews. SLT 2016: 258-264 - [i27]Suyoun Kim, Bhiksha Raj, Ian R. Lane:
Environmental Noise Embeddings for Robust Speech Recognition. CoRR abs/1601.02553 (2016) - [i26]Rahul Radhakrishnan Iyer, Sanjeel Parekh, Vikas Mohandoss, Anush Ramsurat, Bhiksha Raj, Rita Singh:
Content-based Video Indexing and Retrieval Using Corr-LDA. CoRR abs/1602.08581 (2016) - [i25]Anurag Kumar, Bhiksha Raj:
Audio Event Detection using Weakly Labeled Data. CoRR abs/1605.02401 (2016) - [i24]Anurag Kumar, Benjamin Elizalde, Bhiksha Raj:
Audio Content based Geotagging in Multimedia. CoRR abs/1606.02816 (2016) - [i23]Anurag Kumar, Bhiksha Raj:
Weakly Supervised Scalable Audio Content Analysis. CoRR abs/1606.03664 (2016) - [i22]Anurag Kumar, Bhiksha Raj:
Classifier Risk Estimation under Limited Labeling Resources. CoRR abs/1607.02665 (2016) - [i21]Sebastian Säger, Damian Borth, Benjamin Elizalde, Christian Schulze, Bhiksha Raj, Ian R. Lane, Andreas Dengel:
AudioSentibank: Large-scale Semantic Ontology of Acoustic Concepts for Audio Content Analysis. CoRR abs/1607.03766 (2016) - [i20]Anurag Kumar, Bhiksha Raj:
Features and Kernels for Audio Event Recognition. CoRR abs/1607.05765 (2016) - [i19]Benjamin Elizalde, Anurag Kumar, Ankit Shah, Rohan Badlani, Emmanuel Vincent, Bhiksha Raj, Ian R. Lane:
Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording. CoRR abs/1607.06706 (2016) - [i18]Abelino Jimenez, Bhiksha Raj:
Privacy Preserving Distance Computation using Somewhat-trusted Third Parties. CoRR abs/1609.05178 (2016) - [i17]Ankit Shah, Rohan Badlani, Anurag Kumar, Benjamin Elizalde, Bhiksha Raj:
An Approach for Self-Training Audio Event Detectors Using Web Data. CoRR abs/1609.06026 (2016) - [i16]Anurag Kumar, Bhiksha Raj, Ndapandula Nakashole:
Discovering Sound Concepts and Acoustic Relations In Text. CoRR abs/1609.07384 (2016) - [i15]Anurag Kumar, Bhiksha Raj:
Audio Event and Scene Recognition: A Unified Approach using Strongly and Weakly Labeled Data. CoRR abs/1611.04871 (2016)
- 2015
- [j28]Tuomas Virtanen, Jort Florent Gemmeke, Bhiksha Raj, Paris Smaragdis:
Compositional Models for Audio Processing: Uncovering the structure of sound mixtures. IEEE Signal Process. Mag. 32(2): 125-144 (2015) - [c140]Wenbo Liu, Li Yi, Zhiding Yu, Xiaobing Zou, Bhiksha Raj, Ming Li:
Efficient autism spectrum disorder prediction with eye movement: A machine learning framework. ACII 2015: 649-655 - [c139]Zhen-Zhong Lan, Ming Lin, Xuanchong Li, Alexander G. Hauptmann, Bhiksha Raj:
Beyond Gaussian Pyramid: Multi-skip Feature Stacking for action recognition. CVPR 2015: 204-212 - [c138]José Portelo, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Privacy-preserving Query-by-Example Speech Search. ICASSP 2015: 1797-1801 - [c137]Anurag Kumar, Bhiksha Raj:
A novel ranking method for multiple classifier systems. ICASSP 2015: 1931-1935 - [c136]Anders Øland, Bhiksha Raj:
Reducing communication overhead in distributed learning by an order of magnitude (almost). ICASSP 2015: 2219-2223 - [c135]Wenbo Liu, Zhiding Yu, Bhiksha Raj, Ming Li:
Locality constrained transitive distance clustering on speech data. INTERSPEECH 2015: 2917-2921 - [c134]Nikolas Wolfe, Juneki Hong, Agha Ali Raza, Bhiksha Raj, Roni Rosenfeld:
Rapid development of public health education systems in low-literacy multilingual environments: combating ebola through voice messaging. SLaTE 2015: 131-136 - [c133]Shoou-I Yu, Lu Jiang, Zhongwen Xu, Zhenzhong Lan, Shicheng Xu, Xiaojun Chang, Xuanchong Li, Zexi Mao, Chuang Gan, Yajie Miao, Xingzhong Du, Yang Cai, Lara J. Martin, Nikolas Wolfe, Anurag Kumar, Huan Li, Ming Lin, Zhigang Ma, Yi Yang, Deyu Meng, Shiguang Shan, Pinar Duygulu Sahin, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Teruko Mitamura, Richard M. Stern, Alexander G. Hauptmann:
CMU Informedia@TRECVID 2015: MED/SIN/LNK/SED. TRECVID 2015 - [c132]Keiichi Osako, Rita Singh, Bhiksha Raj:
Complex recurrent neural networks for denoising speech signals. WASPAA 2015: 1-5 - [c131]Abelino Jimenez, Bhiksha Raj, José Portelo, Isabel Trancoso:
Secure Modular Hashing. WIFS 2015: 1-6 - [i14]Anurag Kumar, Bhiksha Raj:
Unsupervised Fusion Weight Learning in Multiple Classifier Systems. CoRR abs/1502.01823 (2015) - [i13]Soham De, Indradyumna Roy, Tarunima Prabhakar, Kriti Suneja, Sourish Chaudhuri, Rita Singh, Bhiksha Raj:
Plagiarism Detection in Polyphonic Music using Monaural Signal Separation. CoRR abs/1503.00022 (2015) - [i12]Luís Marujo, José Portelo, Wang Ling, David Martins de Matos, João Paulo Neto, Anatole Gershman, Jaime G. Carbonell, Isabel Trancoso, Bhiksha Raj:
Privacy-Preserving Multi-Document Summarization. CoRR abs/1508.01420 (2015) - [i11]Haohan Wang, Bhiksha Raj:
A Survey: Time Travel in Deep Learning Space: An Introduction to Deep Learning Models and How Deep Learning Models Evolved from the Initial Ideas. CoRR abs/1510.04781 (2015) - [i10]Zhen-Zhong Lan, Shoou-I Yu, Ming Lin, Bhiksha Raj, Alexander G. Hauptmann:
Handcrafted Local Features are Convolutional Neural Networks. CoRR abs/1511.05045 (2015)
- 2014
- [c130]Anurag Kumar, Rita Singh, Bhiksha Raj:
Detecting sound objects in audio recordings. EUSIPCO 2014: 905-909 - [c129]José Portelo, Bhiksha Raj, Alberto Abad, Isabel Trancoso:
Privacy-preserving speaker verification using garbled GMMs. EUSIPCO 2014: 2070-2074 - [c128]Tuomas Virtanen, Bhiksha Raj, Jort F. Gemmeke, Hugo Van hamme:
Active-set newton algorithm for non-negative sparse coding of audio. ICASSP 2014: 3092-3096 - [c127]Jahn Heymann, Oliver Walter, Reinhold Haeb-Umbach, Bhiksha Raj:
Iterative Bayesian word segmentation for unsupervised vocabulary discovery from phoneme lattices. ICASSP 2014: 4057-4061 - [c126]Amir R. Moghimi, Bhiksha Raj, Richard M. Stern:
Post-masking: a hybrid approach to array processing for speech recognition. INTERSPEECH 2014: 2425-2429 - [c125]José Portelo, Bhiksha Raj, Alberto Abad, Isabel Trancoso:
Privacy-preserving speaker verification using secure binary embeddings. MIPRO 2014: 1268-1272 - [c124]Luís Marujo, José Portelo, David Martins de Matos, João Paulo Neto, Anatole Gershman, Jaime G. Carbonell, Isabel Trancoso, Bhiksha Raj:
Privacy-Preserving Important Passage Retrieval. PIR@SIGIR 2014: 7-12 - [c123]Shoou-I Yu, Lu Jiang, Zhongwen Xu, Zhenzhong Lan, Shicheng Xu, Xiaojun Chang, Xuanchong Li, Zexi Mao, Chuang Gan, Yajie Miao, Xingzhong Du, Yang Cai, Lara J. Martin, Nikolas Wolfe, Anurag Kumar, Huan Li, Ming Lin, Zhigang Ma, Yi Yang, Deyu Meng, Shiguang Shan, Pinar Duygulu Sahin, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Teruko Mitamura, Richard M. Stern, Alexander G. Hauptmann, Anil Armagan, Yicheng Zhao:
Informedia @ TRECVID 2014. TRECVID 2014 - [i9]Luís Marujo, José Portelo, David Martins de Matos, João Paulo Neto, Anatole Gershman, Jaime G. Carbonell, Isabel Trancoso, Bhiksha Raj:
Privacy-Preserving Important Passage Retrieval. CoRR abs/1407.5416 (2014) - [i8]Zhen-Zhong Lan, Ming Lin, Xuanchong Li, Alexander G. Hauptmann, Bhiksha Raj:
Beyond Gaussian Pyramid: Multi-skip Feature Stacking for Action Recognition. CoRR abs/1411.6660 (2014) - [i7]I-Ting Liu, Bhiksha Ramakrishnan:
Bach in 2014: Music Composition with Recurrent Neural Network. CoRR abs/1412.3191 (2014)
- 2013
- [j27]Gahgene Gweon, Mahaveer Jain, John W. McDonough, Bhiksha Raj, Carolyn P. Rosé:
Measuring prevalence of other-oriented transactive contributions using an automated measure of speech style accommodation. Int. J. Comput. Support. Collab. Learn. 8(2): 245-265 (2013) - [j26]Sohail Bahmani, Bhiksha Raj, Petros T. Boufounos:
Greedy sparsity-constrained optimization. J. Mach. Learn. Res. 14(1): 807-841 (2013) - [j25]Manas A. Pathak, Bhiksha Raj, Shantanu Rane, Paris Smaragdis:
Privacy-Preserving Speech Processing: Cryptographic and String-Matching Frameworks Show Promise. IEEE Signal Process. Mag. 30(2): 62-74 (2013) - [j24]Manas A. Pathak, Bhiksha Raj:
Privacy-Preserving Speaker Verification and Identification Using Gaussian Mixture Models. IEEE Trans. Speech Audio Process. 21(2): 397-406 (2013) - [j23]Tuomas Virtanen, Jort Florent Gemmeke, Bhiksha Raj:
Active-Set Newton Algorithm for Overcomplete Non-Negative Representations of Audio. IEEE Trans. Speech Audio Process. 21(11): 2277-2289 (2013) - [c122]Oliver Walter, Timo Korthals, Reinhold Haeb-Umbach, Bhiksha Raj:
A hierarchical system for word discovery exploiting DTW-based initialization. ASRU 2013: 386-391 - [c121]Jahn Heymann, Oliver Walter, Reinhold Haeb-Umbach, Bhiksha Raj:
Unsupervised word segmentation from noisy input. ASRU 2013: 458-463 - [c120]Anurag Kumar, Rajesh M. Hegde, Rita Singh, Bhiksha Raj:
Event detection in short duration audio using Gaussian Mixture Model and Random Forest Classifier. EUSIPCO 2013: 1-5 - [c119]José Portelo, Bhiksha Raj, Petros Boufounos, Isabel Trancoso, Alberto Abad:
Speaker verification using Secure Binary Embeddings. EUSIPCO 2013: 1-5 - [c118]Sourish Chaudhuri, Bhiksha Raj:
Unsupervised hierarchical structure induction for deeper semantic analysis of audio. ICASSP 2013: 833-837 - [c117]John W. McDonough, Ken'ichi Kumatani, Takayuki Arakawa, Kazumasa Yamamoto, Bhiksha Raj:
Speaker tracking with spherical microphone arrays. ICASSP 2013: 3981-3985 - [c116]Leibny Paola García-Perera, Bhiksha Raj, Juan Arturo Nolazco-Flores:
Optimization of the DET curve in speaker verification under noisy conditions. ICASSP 2013: 7765-7769 - [c115]Shubhranshu Barnwal, Rohit Barnwal, Rajesh M. Hegde, Rita Singh, Bhiksha Raj:
Doppler based speed estimation of vehicles using passive sensor. ICME Workshops 2013: 1-4 - [c114]Pranay Dighe, Parul Agarwal, Harish Karnick, Siddartha Thota, Bhiksha Raj:
Scale independent raga identification using chromagram patterns and swara based features. ICME Workshops 2013: 1-4 - [c113]Leibny Paola García-Perera, Bhiksha Raj, Juan Arturo Nolazco-Flores:
Ensemble approach in speaker verification. INTERSPEECH 2013: 2455-2459 - [c112]José Portelo, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Secure binary embeddings of front-end factor analysis for privacy preserving speaker verification. INTERSPEECH 2013: 2494-2498 - [c111]Benjamin Lambert, Bhiksha Raj, Rita Singh:
Discriminatively trained dependency language modeling for conversational speech recognition. INTERSPEECH 2013: 3414-3418 - [c110]Parul Agarwal, Harish Karnick, Bhiksha Raj:
A Comparative Study Of Indian And Western Music Forms. ISMIR 2013: 29-34 - [c109]Pranay Dighe, Harish Karnick, Bhiksha Raj:
Swara Histogram Based Structural Analysis And Identification Of Indian Classical Ragas. ISMIR 2013: 35-40 - [c108]Zhenzhong Lan, Lu Jiang, Shoou-I Yu, Chenqiang Gao, Shourabh Rawat, Yang Cai, Shicheng Xu, Haoquan Shen, Xuanchong Li, Yipei Wang, Waito Sze, Yan Yan, Zhigang Ma, Nicolas Ballas, Deyu Meng, Wei Tong, Yi Yang, Susanne Burger, Florian Metze, Rita Singh, Bhiksha Raj, Richard M. Stern, Teruko Mitamura, Eric Nyberg, Alexander G. Hauptmann:
Informedia@TRECVID 2013. TRECVID 2013 - [i6]Sohail Bahmani, Petros T. Boufounos, Bhiksha Raj:
Robust 1-bit Compressive Sensing via Gradient Support Pursuit. CoRR abs/1304.6627 (2013)
- 2012
- [j22]Paris Smaragdis, Bhiksha Raj:
The Markov selection model for concurrent speech recognition. Neurocomputing 80: 64-72 (2012) - [j21]Bhiksha Raj, Kaustubh Kalgaonkar, Chris Harrison, Paul H. Dietz:
Ultrasonic Doppler Sensing in HCI. IEEE Pervasive Comput. 11(2): 24-29 (2012) - [j20]Ken'ichi Kumatani, John W. McDonough, Bhiksha Raj:
Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-Field Sensors. IEEE Signal Process. Mag. 29(6): 127-140 (2012) - [j19]Yu-Hsiang Bosco Chiu, Bhiksha Raj, Richard M. Stern:
Learning-Based Auditory Encoding for Robust Speech Recognition. IEEE Trans. Speech Audio Process. 20(3): 900-914 (2012) - [j18]Manas A. Pathak, Bhiksha Raj:
Large Margin Gaussian Mixture Models with Differential Privacy. IEEE Trans. Dependable Secur. Comput. 9(4): 463-469 (2012) - [c107]Ken'ichi Kumatani, Takayuki Arakawa, Kazumasa Yamamoto, John W. McDonough, Bhiksha Raj, Rita Singh, Ivan Tashev:
Microphone array processing for distant speech recognition: Towards real-world deployment. APSIPA 2012: 1-10 - [c106]John W. McDonough, Ken'ichi Kumatani, Bhiksha Raj:
Microphone array processing for distant speech recognition: Spherical arrays. APSIPA 2012: 1-10 - [c105]Mahaveer Jain, John W. McDonough, Gahgene Gweon, Bhiksha Raj, Carolyn Penstein Rosé:
An Unsupervised Dynamic Bayesian Network Approach to Measuring Speech Style Accommodation. EACL 2012: 787-797 - [c104]Anurag Kumar, Pranay Dighe, Rita Singh, Sourish Chaudhuri, Bhiksha Raj:
Audio event detection from acoustic unit occurrence patterns. ICASSP 2012: 489-492 - [c103]José Portelo, Bhiksha Raj, Isabel Trancoso:
Attacking a privacy preserving music matching algorithm. ICASSP 2012: 1821-1824 - [c102]Manas A. Pathak, Bhiksha Raj:
Privacy-preserving speaker verification as password matching. ICASSP 2012: 1849-1852 - [c101]Shubhranshu Barnwal, Kamal Sahni, Rita Singh, Bhiksha Raj:
Spectrographic seam patterns for discriminative word spotting. ICASSP 2012: 4725-4728 - [c100]Gahgene Gweon, Mahaveer Jain, John W. McDonough, Bhiksha Raj, Carolyn P. Rosé:
Predicting Idea Co-Construction in Speech Data using Insights from Sociolinguistics. ICLS 2012 - [c99]Afsaneh Asaei, Bhiksha Raj, Hervé Bourlard, Volkan Cevher:
Structured sparse coding for microphone array location calibration. SAPA@INTERSPEECH 2012: 74-79 - [c98]Kamal Sahni, Pranay Dighe, Rita Singh, Bhiksha Raj:
Language identification using spectro-temporal patch features. SAPA@INTERSPEECH 2012: 110-113 - [c97]Ken'ichi Kumatani, Bhiksha Raj, Rita Singh, John W. McDonough:
Microphone Array Post-filter based on Spatially-Correlated Noise Measurements for Distant Speech Recognition. INTERSPEECH 2012: 298-301 - [c96]Sourish Chaudhuri, Rita Singh, Bhiksha Raj:
Exploiting Temporal Sequence Structure for Semantic Analysis of Multimedia. INTERSPEECH 2012: 1728-1731 - [c95]Soham De, Indradyumna Roy, Tarunima Prabhakar, Kriti Suneja, Sourish Chaudhuri, Rita Singh, Bhiksha Raj:
Plagiarism Detection in Polyphonic Music using Monaural Signal Separation. INTERSPEECH 2012: 1744-1747 - [c94]Manas A. Pathak, José Portelo, Bhiksha Raj, Isabel Trancoso:
Privacy-Preserving Speaker Authentication. ISC 2012: 1-22 - [c93]Sourish Chaudhuri, Bhiksha Raj:
Unsupervised Structure Discovery for Semantic Analysis of Audio. NIPS 2012: 1187-1195 - [c92]