Jian Cong
ByteDance
Verified email at mail.nwpu.edu.cn
Title
Cited by
Year
NaturalSpeech: End-to-end text-to-speech synthesis with human-level quality
X Tan, J Chen, H Liu, J Cong, C Zhang, Y Liu, X Wang, Y Leng, Y Yi, L He, ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
189 citations, 2024
VISinger: Variational inference with adversarial learning for end-to-end singing voice synthesis
Y Zhang, J Cong, H Xue, L Xie, P Zhu, M Bi
ICASSP 2022, 2022
81 citations, 2022
Controllable Context-aware Conversational Speech Synthesis
J Cong, S Yang, N Hu, G Li, L Xie, D Su
INTERSPEECH 2021, 2021
35 citations, 2021
Data efficient voice cloning from noisy samples with domain adversarial training
J Cong, S Yang, L Xie, G Yu, G Wan
INTERSPEECH 2020, 2020
34 citations, 2020
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models
P Anastassiou, J Chen, J Chen, Y Chen, Z Chen, Z Chen, J Cong, L Deng, ...
arXiv preprint arXiv:2406.02430, 2024
31 citations, 2024
Glow-WaveGAN: Learning speech representations from GAN-based variational auto-encoder for high fidelity flow-based speech synthesis
J Cong, S Yang, L Xie, D Su
INTERSPEECH 2021, 2021
30 citations, 2021
Glow-WaveGAN 2: high-quality zero-shot text-to-speech synthesis and any-to-any voice conversion
Y Lei, S Yang, J Cong, L Xie, D Su
INTERSPEECH 2022, 2022
17 citations, 2022
DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP
K Song, Y Zhang, Y Lei, J Cong, H Li, L Xie, G He, J Bai
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
15 citations, 2023
DiCLET-TTS: Diffusion model based cross-lingual emotion transfer for text-to-speech—A study between English and Mandarin
T Li, C Hu, J Cong, X Zhu, J Li, Q Tian, Y Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
8 citations, 2023
AdaVITS: Tiny VITS for low computing resource speaker adaptation
K Song, H Xue, X Wang, J Cong, Y Zhang, L Xie, B Yang, X Zhang, D Su
2022 13th International Symposium on Chinese Spoken Language Processing …, 2022
6 citations, 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
K Song, J Cong, X Wang, Y Zhang, L Xie, N Jiang, H Wu
2022 13th International Symposium on Chinese Spoken Language Processing …, 2022
3 citations, 2022
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
T Li, Z Wang, X Zhu, J Cong, Q Tian, Y Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
2024
Language Model Can Listen While Speaking
Z Ma, Y Song, C Du, J Cong, Z Chen, Y Wang, Y Wang, X Chen
arXiv preprint arXiv:2408.02622, 2024
2024
Articles 1–13