SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

Bibliographic Details
Main Authors: Wang, Hanrui (Author), Zhang, Zhekai (Author), Han, Song (Author)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE), 2022-07-12T14:08:01Z.