Large-scale contrastive language-audio pretraining with feature fusion and keyword-to-caption augmentation. Y Wu, K Chen, T Zhang, Y Hui, T Berg-Kirkpatrick, S Dubnov. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 366 | 2023 |