Xuankai Chang | Publications

2021

SUPERB: Speech processing Universal PERformance Benchmark

Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y Lin, Andy T Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, and others

arXiv preprint arXiv:2105.01051 2021
Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings

Xuankai Chang, Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, and Takuya Yoshioka

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021
Recent developments on ESPnet toolkit boosted by Conformer

Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, and others

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021
Highland Puebla Nahuatl Speech Translation Corpus for Endangered Language Documentation

Jiatong Shi, Jonathan D Amith, Xuankai Chang, Siddharth Dalmia, Brian Yan, and Shinji Watanabe

In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas 2021
ESPnet-SE: end-to-end speech enhancement and separation toolkit designed for ASR integration

Chenda Li, Jing Shi, Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Naoyuki Kamo, Moto Hira, Tomoki Hayashi, Christoph Boeddeker, Zhuo Chen, and others

In IEEE Spoken Language Technology Workshop (SLT), 2021
Investigation of end-to-end speaker-attributed ASR for continuous multi-talker recordings

Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, and Takuya Yoshioka

In IEEE Spoken Language Technology Workshop (SLT), 2021

2020

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans

Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, and others

arXiv preprint arXiv:2012.13006 2020
Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals

Jing Shi, Xuankai Chang, Pengcheng Guo, Shinji Watanabe, Yusuke Fujita, Jiaming Xu, Bo Xu, and Lei Xie

Advances in Neural Information Processing Systems 2020
End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming

Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Shinji Watanabe, and Yanmin Qian

Proc. Interspeech 2020
End-to-end multi-speaker speech recognition with transformer

Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, and Shinji Watanabe

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings

Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, and others

arXiv preprint arXiv:2004.09249 2020
Improving end-to-end single-channel multi-talker speech recognition

Wangyou Zhang, Xuankai Chang, Yanmin Qian, and Shinji Watanabe

IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020
End-to-End ASR with Adaptive Span Self-Attention

Xuankai Chang, Aswin Shanmugam Subramanian, Pengcheng Guo, Shinji Watanabe, Yuya Fujita, and Motoi Omachi

Proc. Interspeech 2020

2019

MIMO-Speech: End-to-end multi-channel multi-speaker speech recognition

Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, and Shinji Watanabe

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019
End-to-end monaural multi-speaker ASR system without pretraining

Xuankai Chang, Yanmin Qian, Kai Yu, and Shinji Watanabe

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019
Knowledge Distillation for End-to-End Monaural Multi-Talker ASR System

Wangyou Zhang, Xuankai Chang, and Yanmin Qian

Conference of the International Speech Communication Association (InterSpeech), 2019

2018

Single-channel multi-talker speech recognition with permutation invariant training

Yanmin Qian, Xuankai Chang, and Dong Yu

Speech Communication 2018
Monaural Multi-Talker Speech Recognition with Attention Mechanism and Gated Convolutional Networks

Xuankai Chang, Yanmin Qian, and Dong Yu

Conference of the International Speech Communication Association (InterSpeech), 2018
Adaptive permutation invariant training with auxiliary information for monaural multi-talker speech recognition

Xuankai Chang, Yanmin Qian, and Dong Yu

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018
Past review, current progress, and challenges ahead on the cocktail party problem

Yan-min Qian, Chao Weng, Xuan-kai Chang, Shuai Wang, and Dong Yu

Frontiers of Information Technology & Electronic Engineering 2018

2017

Recognizing Multi-Talker Speech with Permutation Invariant Training

Dong Yu, Xuankai Chang, and Yanmin Qian

Conference of the International Speech Communication Association (InterSpeech), 2017

2016

Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC

Yimeng Zhuang, Xuankai Chang, Yanmin Qian, and Kai Yu

Conference of the International Speech Communication Association (InterSpeech), 2016