Methodology
The Spiking Decision Transformer (SNN-DT) bridges the gap between the high-level planning capabilities of Transformers and the energy efficiency of Spiking Neural Networks (SNNs).
Architecture Overview
The SNN-DT architecture replaces the standard Multi-Head Attention (MHA) mechanism with a Spiking Self-Attention (SSA) block. Key components include:
Leaky Integrate-and-Fire (LIF) Neurons: We employ leaky integrate-and-fire dynamics; the non-leaky (perfect-integrator) case is recovered when the decay factor \(\beta = 1\). The membrane potential \(u_t\) evolves as:
\[ u_t = \beta u_{t-1} + W x_t \]
\[ s_t = \Theta(u_t - V_{th}) \]
where \(\beta\) is the decay factor, \(V_{th}\) is the firing threshold, and \(s_t\) is the binary spike output.
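As a concrete reference, here is a minimal PyTorch sketch of this update. The values of `beta` and `v_th`, and the subtraction-based reset after a spike, are illustrative assumptions rather than the model's actual settings:

```python
import torch

def lif_step(u_prev, x, W, beta=0.9, v_th=1.0):
    """One LIF update implementing the two equations above.

    u_prev: membrane potential u_{t-1}, shape (batch, hidden)
    x:      input x_t, shape (batch, features)
    W:      weight matrix, shape (features, hidden)
    """
    u = beta * u_prev + x @ W      # u_t = beta * u_{t-1} + W x_t
    s = (u >= v_th).float()        # s_t = Theta(u_t - V_th)
    u = u - s * v_th               # soft reset after a spike (assumed, not in the equations)
    return u, s
```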
Phase-Coding: To preserve temporal information without resorting to high-precision floating-point values, we use phase-shifted, spike-based positional encodings.
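One plausible reading of such an encoding, sketched here, assigns each token position a phase offset and thresholds a phase-shifted oscillation into a sparse spike pattern; the channel count, period, and threshold below are hypothetical choices, not the paper's exact scheme:

```python
import torch

def phase_spike_encoding(seq_len, n_steps, n_phases=8, period=None):
    """Hypothetical phase-coded positional spikes.

    Each position gets a phase offset per channel; a spike is emitted when the
    phase-shifted oscillation exceeds a threshold. Returns a binary tensor of
    shape (seq_len, n_steps, n_phases).
    """
    period = period or n_steps
    t = torch.arange(n_steps).float()    # simulation time steps
    p = torch.arange(seq_len).float()    # token positions
    k = torch.arange(n_phases).float()   # phase channels
    # phase offset grows with position and channel index
    phase = 2 * torch.pi * (p[:, None, None] / seq_len + k[None, None, :] / n_phases)
    wave = torch.sin(2 * torch.pi * t[None, :, None] / period + phase)
    return (wave > 0.95).float()         # threshold -> sparse binary spikes
```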
Dendritic Routing: A lightweight routing module that dynamically sparsifies the attention matrix, ensuring that only relevant tokens trigger synaptic operations.
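A routing module of this kind could be realized, for example, as a small learned gate that scores query–key relevance and keeps only the top-\(k\) entries per query; the gate design and \(k\) below are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DendriticRouter(nn.Module):
    """Illustrative routing gate: keeps the top-k attention entries per query."""

    def __init__(self, dim, k=4):
        super().__init__()
        self.score = nn.Linear(dim, dim, bias=False)  # lightweight relevance projection
        self.k = k

    def forward(self, q, kx):
        # q, kx: (batch, seq, dim) spiking query/key features
        logits = self.score(q) @ kx.transpose(-2, -1)     # (batch, seq, seq) relevance
        topk = logits.topk(self.k, dim=-1).indices
        mask = torch.zeros_like(logits).scatter_(-1, topk, 1.0)
        return mask                                       # 1 = route, 0 = skip synaptic op
```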
Local Plasticity
We incorporate biologically inspired three-factor plasticity rules to enable online adaptation. In their standard form, a synaptic weight changes only when a local eligibility trace coincides with a global modulatory signal:
\[ \Delta w_{ij} = \eta \, M_t \, e_{ij}(t) \]
where \(e_{ij}(t)\) is an eligibility trace accumulated from pre- and postsynaptic activity, \(M_t\) is the modulatory third factor (e.g., a reward or error signal), and \(\eta\) is a learning rate. This allows the network to adapt to distribution shifts in the environment without full backpropagation through time (BPTT).
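A minimal sketch of one such update, assuming a multiplicative eligibility trace with exponential decay (the decay and learning rate are illustrative values):

```python
import torch

def three_factor_update(w, e_trace, pre_s, post_s, mod, eta=1e-3, decay=0.95):
    """Illustrative three-factor plasticity step.

    w:       weights, shape (n_pre, n_post)
    e_trace: per-synapse eligibility trace, shape (n_pre, n_post)
    pre_s:   binary presynaptic spike vector, shape (n_pre,)
    post_s:  binary postsynaptic spike vector, shape (n_post,)
    mod:     scalar third factor (e.g., a reward prediction error)
    """
    # eligibility accumulates coincident pre/post activity and decays over time
    e_trace = decay * e_trace + pre_s[:, None] * post_s[None, :]
    # weights change only when the modulatory signal gates the trace
    w = w + eta * mod * e_trace
    return w, e_trace
```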
Training
The model is trained end-to-end using surrogate gradients. Since the spiking function \(\Theta(\cdot)\) is non-differentiable, we approximate its derivative during the backward pass using a fast sigmoid function:
\[ \frac{\partial s_t}{\partial u_t} \approx \frac{1}{\left(1 + k \, |u_t - V_{th}|\right)^2} \]
where \(k\) controls the steepness of the approximation.
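In PyTorch this is typically realized as a custom autograd function; the sketch below pairs a Heaviside forward pass with the fast-sigmoid backward pass (the steepness \(k = 25\) is an assumed value):

```python
import torch

class FastSigmoidSpike(torch.autograd.Function):
    """Heaviside forward pass with a fast-sigmoid surrogate backward pass."""

    k = 25.0  # steepness of the surrogate (illustrative)

    @staticmethod
    def forward(ctx, u, v_th=1.0):
        ctx.save_for_backward(u)
        ctx.v_th = v_th
        return (u >= v_th).float()           # s_t = Theta(u_t - V_th)

    @staticmethod
    def backward(ctx, grad_out):
        (u,) = ctx.saved_tensors
        x = u - ctx.v_th
        surrogate = 1.0 / (1.0 + FastSigmoidSpike.k * x.abs()) ** 2
        return grad_out * surrogate, None    # no gradient w.r.t. v_th
```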
This enables us to optimize the decision control objective, a mean-squared error over predicted actions:
\[ \mathcal{L} = \frac{1}{T} \sum_{t=1}^{T} \left\| \hat{a}_t - a_t \right\|_2^2 \]
where \(\hat{a}_t\) is the predicted action and \(a_t\) is the ground-truth action from the expert demonstration.
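Putting the pieces together, a single optimization step might look like the following; the model signature and argument names are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, states, actions, returns_to_go, timesteps):
    """One illustrative optimization step.

    `model` is assumed to map (states, actions, returns-to-go, timesteps)
    trajectories to predicted actions, as in a Decision Transformer.
    """
    pred_actions = model(states, actions, returns_to_go, timesteps)
    loss = F.mse_loss(pred_actions, actions)   # L = mean_t ||a_hat_t - a_t||^2
    optimizer.zero_grad()
    loss.backward()                            # surrogate gradients flow through spikes
    optimizer.step()
    return loss.item()
```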