A Training-free Sub-quadratic Cost Transformer Model Serving Framework With Hierarchically Pruned Attention. H Lee, G Park, Y Lee, J Suh, J Kim, W Jeong, B Kim, H Lee, M Jeon, ... arXiv:2406.09827 [cs.CL], https://arxiv.org/abs/2406.09827, 2024 | 2* | 2024 |