Title | Authors | Venue | Cited by | Year |
Diffsound: Discrete diffusion model for text-to-sound generation | D Yang, J Yu, H Wang, W Wang, C Weng, Y Zou, D Yu | IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 1720-1733, 2023 | 284 | 2023 |
Gigaspeech: An evolving, multi-domain asr corpus with 10,000 hours of transcribed audio | G Chen, S Chai, G Wang, J Du, WQ Zhang, C Weng, D Su, D Povey, ... | arXiv preprint arXiv:2106.06909, 2021 | 223 | 2021 |
Videocrafter1: Open diffusion models for high-quality video generation | H Chen, M Xia, Y He, Y Zhang, X Cun, S Yang, J Xing, Y Liu, Q Chen, ... | arXiv preprint arXiv:2310.19512, 2023 | 188 | 2023 |
Replay and synthetic speech detection with res2net architecture | X Li, N Li, C Weng, X Liu, D Su, D Yu, H Meng | ICASSP 2021-2021 IEEE international conference on acoustics, speech and …, 2021 | 177 | 2021 |
Recurrent deep neural networks for robust speech recognition | C Weng, D Yu, S Watanabe, BHF Juang | 2014 IEEE International Conference on Acoustics, Speech and Signal …, 2014 | 162 | 2014 |
Videocrafter2: Overcoming data limitations for high-quality video diffusion models | H Chen, Y Zhang, X Cun, M Xia, X Wang, C Weng, Y Shan | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 135 | 2024 |
Deep neural networks for single-channel multi-talker speech recognition | C Weng, D Yu, ML Seltzer, J Droppo | IEEE/ACM Transactions on Audio, Speech, and Language Processing 23 (10 …, 2015 | 117 | 2015 |
DurIAN: Duration Informed Attention Network for Speech Synthesis. | C Yu, H Lu, N Hu, M Yu, C Weng, K Xu, P Liu, D Tuo, S Kang, G Lei, D Su, ... | Interspeech, 2027-2031, 2020 | 110 | 2020 |
Component fusion: Learning replaceable language model component for end-to-end speech recognition system | C Shan, C Weng, G Wang, D Su, M Luo, D Yu, L Xie | ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 106 | 2019 |
Durian: Duration informed attention network for multimodal synthesis | C Yu, H Lu, N Hu, M Yu, C Weng, K Xu, P Liu, D Tuo, S Kang, G Lei, D Su, ... | arXiv preprint arXiv:1909.01700, 2019 | 105 | 2019 |
Past review, current progress, and challenges ahead on the cocktail party problem | Y Qian, C Weng, X Chang, S Wang, D Yu | Frontiers of Information Technology & Electronic Engineering 19, 40-63, 2018 | 103 | 2018 |
Hifi-codec: Group-residual vector quantization for high fidelity audio codec | D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou | arXiv preprint arXiv:2305.02765, 2023 | 97 | 2023 |
Investigating end-to-end speech recognition for mandarin-english code-switching | C Shan, C Weng, G Wang, D Su, M Luo, D Yu, L Xie | ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 91 | 2019 |
Self-supervised text-independent speaker verification using prototypical momentum contrastive learning | W Xia, C Zhang, C Weng, M Yu, D Yu | ICASSP 2021-2021 IEEE international conference on acoustics, speech and …, 2021 | 86 | 2021 |
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition | AS Subramanian, C Weng, S Watanabe, M Yu, D Yu | Computer Speech & Language 75, 101360, 2022 | 78 | 2022 |
Instructtts: Modelling expressive tts in discrete latent space with natural language style prompt | D Yang, S Liu, R Huang, C Weng, H Meng | IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 75 | 2024 |
Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition. | C Weng, J Cui, G Wang, J Wang, C Yu, D Su, D Yu | Interspeech, 761-765, 2018 | 63 | 2018 |
Mixed speech recognition | D Yu, C Weng, ML Seltzer, J Droppo | US Patent 9,390,712, 2016 | 61 | 2016 |
Simple attention module based speaker verification with iterative noisy label detection | X Qin, N Li, C Weng, D Su, M Li | ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 59 | 2022 |
Pitchnet: Unsupervised singing voice conversion with pitch adversarial network | C Deng, C Yu, H Lu, C Weng, D Yu | ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 53 | 2020 |