Poster
Rethinking Spiking Self-Attention Mechanism: Implementing α-XNOR Similarity Calculation in Spiking Transformers
Yichen Xiao · Shuai Wang · Dehao Zhang · Wenjie Wei · Yimeng Shan · Xiaoli Liu · Yulin Jiang · Malu Zhang
Transformers significantly raise the performance ceiling across various tasks, spurring research into integrating them into spiking neural networks. However, a notable performance gap remains between existing spiking Transformers and their artificial neural network counterparts. Here, we first analyze the reason for this gap and identify that the dot product is ineffective at calculating similarity between spiking Queries (Q) and Keys (K). To address this challenge, we introduce an innovative α-XNOR similarity calculation method tailored for spike trains. α-XNOR similarity redefines the correlation of non-spike pairs as a specific value α, effectively overcoming the limitations of dot-product similarity caused by numerous non-spiking events. Additionally, considering the sparse nature of spike trains, where spikes carry more information than non-spikes, the α-XNOR similarity correspondingly highlights the greater importance of spikes over non-spikes. Extensive experiments demonstrate that our α-XNOR similarity significantly improves performance across different spiking Transformer architectures on various static and neuromorphic datasets. This is the first attempt to develop a spiking self-attention paradigm tailored to the binary characteristics of spike trains.
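The abstract does not spell out the exact formulation, but a minimal sketch of the idea, assuming the per-pair rule that a spike–spike pair (1, 1) scores 1, a non-spike pair (0, 0) scores α, and a mismatched pair scores 0, might look like the following. The function name alpha_xnor_similarity, the tensor shapes, and the default α are illustrative assumptions, not the authors' implementation.

```python
import torch

def alpha_xnor_similarity(Q, K, alpha=0.5):
    """Hedged sketch of an alpha-XNOR similarity between binary spike tensors.

    Q, K: binary spike tensors of shape (batch, tokens, dim) with values in {0, 1}.
    A spike-spike pair (1, 1) contributes 1, a non-spike pair (0, 0) contributes
    alpha, and a mismatched pair contributes 0, so that shared spikes are weighted
    more heavily than shared silence when alpha < 1.
    """
    # Spike-spike agreements: the standard dot product over the feature dimension.
    spike_agree = Q @ K.transpose(-2, -1)
    # Non-spike agreements, down-weighted by alpha instead of being ignored.
    nonspike_agree = (1.0 - Q) @ (1.0 - K).transpose(-2, -1)
    return spike_agree + alpha * nonspike_agree

# Toy usage with random binary spike trains (illustrative shapes only).
Q = (torch.rand(2, 4, 8) > 0.7).float()
K = (torch.rand(2, 4, 8) > 0.7).float()
S = alpha_xnor_similarity(Q, K, alpha=0.25)  # (2, 4, 4) similarity map
```

With α < 1, agreement on silence contributes less than agreement on spiking, matching the abstract's point that spikes carry more information than non-spikes; whether α is a fixed hyperparameter or learned is not specified in the abstract.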