Speech emotion recognition using capsule networks X Wu, S Liu, Y Cao, X Li, J Yu, D Dai, X Ma, S Hu, Z Wu, X Liu, H Meng ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 137 | 2019 |
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling S Liu, Y Cao, D Wang, X Wu, X Liu, H Meng IEEE/ACM Transactions on Audio Speech and Language Processing, 2020 | 100 | 2020 |
HiFi-Codec: Group-residual vector quantization for high fidelity audio codec D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou arXiv preprint arXiv:2305.02765, 2023 | 97 | 2023 |
UniAudio: An audio foundation model toward universal audio generation D Yang, J Tian, X Tan, R Huang, S Liu, X Chang, J Shi, S Zhao, J Bian, ... arXiv preprint arXiv:2310.00704, 2023 | 81 | 2023 |
InstructTTS: Modelling expressive TTS in discrete latent space with natural language style prompt D Yang, S Liu, R Huang, C Weng, H Meng IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 75 | 2024 |
Adversarial attacks on spoofing countermeasures of automatic speaker verification S Liu, H Wu, H Lee, H Meng 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 74 | 2019 |
Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance. S Liu, J Zhong, L Sun, X Wu, X Liu, H Meng Interspeech, 496-500, 2018 | 68 | 2018 |
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs S Liu, D Su, D Yu ICML 2022 Workshop on Machine Learning for Audio Synthesis, 2022 | 65 | 2022 |
Defense against adversarial attacks on spoofing countermeasures of ASV H Wu, S Liu, H Meng, H Lee ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 65 | 2020 |
DiffSVC: A diffusion probabilistic model for singing voice conversion S Liu, Y Cao, D Su, H Meng 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021 | 55 | 2021 |
The singing voice conversion challenge 2023 WC Huang, LP Violeta, S Liu, J Shi, T Toda 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 52 | 2023 |
End-to-end code-switched TTS with mix of monolingual recordings Y Cao, X Wu, S Liu, J Yu, X Li, Z Wu, X Liu, H Meng ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 48 | 2019 |
FastSVC: Fast cross-domain singing voice conversion with feature-wise linear modulation S Liu, Y Cao, N Hu, D Su, H Meng 2021 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2021 | 46 | 2021 |
End-to-end accent conversion without using native utterances S Liu, D Wang, Y Cao, L Sun, X Wu, S Kang, Z Wu, X Liu, D Su, D Yu, ... ICASSP 2020, 2020 | 46 | 2020 |
End-to-end voice conversion via cross-modal knowledge distillation for dysarthric speech reconstruction D Wang, J Yu, X Wu, S Liu, L Sun, X Liu, H Meng ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 44 | 2020 |
VARA-TTS: Non-autoregressive text-to-speech synthesis based on very deep VAE with residual attention P Liu, Y Cao, S Liu, N Hu, G Li, C Weng, D Su arXiv preprint arXiv:2102.06431, 2021 | 36 | 2021 |
Speech emotion recognition using sequential capsule networks X Wu, Y Cao, H Lu, S Liu, D Wang, Z Wu, X Liu, H Meng IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3280-3291, 2021 | 29 | 2021 |
Code-switched speech synthesis using bilingual phonetic posteriorgram with only monolingual corpora Y Cao, S Liu, X Wu, S Kang, P Liu, Z Wu, X Liu, D Su, D Yu, H Meng ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 26 | 2020 |
Transferring source style in non-parallel voice conversion S Liu, Y Cao, S Kang, N Hu, X Liu, D Su, D Yu, H Meng INTERSPEECH 2020, 2020 | 25 | 2020 |
ASR-GLUE: A new multi-task benchmark for ASR-robust natural language understanding L Feng, J Yu, D Cai, S Liu, H Zheng, Y Wang arXiv preprint arXiv:2108.13048, 2021 | 18 | 2021 |