Abstract
Decision Transformers (DTs) have emerged as a powerful offline reinforcement learning architecture, demonstrating substantial success across a variety of complex tasks. Despite this strong performance, the interpretability and adaptability of these models remain underexplored, particularly when applied to continuous control and multi-discrete action environments. This thesis provides a comprehensive study of Decision Transformers by addressing these two critical aspects: interpretability in continuous control environments and adaptability through novel tokenization strategies in multi-discrete action spaces.
In the first part, we perform a detailed interpretability analysis of Decision Transformers trained on
MuJoCo environments, focusing on continuous control tasks known for their complexity and realism.
We employ a range of interpretability techniques: positional encoding (PE) analysis, return-to-go (RTG) examination, embedding and hidden-state analyses, attention visualization, and perturbation
studies. Our results reveal several key insights. Positional encoding significantly affects performance only in environments demanding precise temporal coordination (e.g., Hopper and Walker2d), whereas in simpler, velocity-based tasks (HalfCheetah) performance remains stable without explicit positional encoding. RTG analysis indicates that the model’s ability to achieve a specified return is closely tied to the distribution of returns in the training data, highlighting its adaptability. Embedding analysis uncovers the hierarchical abstractions and structured representations learned by the models, and attention
visualization reveals the nuanced role attention mechanisms play in guiding behavior. Perturbation studies further underscore the robustness of Decision Transformers and identify the action dimensions (joints) most critical to successful task execution.
The second part of the thesis addresses the performance limitations of Decision Transformers in environments with multi-discrete action spaces, specifically image-based domains such as ViZDoom.
Here, we introduce Multi-State Action Tokenisation (M-SAT), a method that tokenizes multi-discrete actions at the individual action level and incorporates auxiliary state information. This tokenization strategy significantly improves agent performance and the interpretability of attention layers, providing clearer insight into agent decisions in dynamic environments with complex, multi-discrete action choices. Crucially, we demonstrate that M-SAT achieves superior performance in challenging scenarios such as Deadly Corridor, My Way Home, and Death Match without positional encoding, occasionally even benefiting from its removal. This granular action tokenization enables detailed interpretability of individual actions and yields marked improvements in sample efficiency and adaptability over baseline Decision Transformers.
Together, these two studies form a unified narrative emphasizing the necessity and benefits of interpretability and novel tokenization strategies in Decision Transformers. The resulting insights support the development of more interpretable, adaptable, and reliable transformer-based decision-making systems capable of excelling across diverse and complex environments. This combined analysis deepens our understanding of the internal mechanisms driving Decision Transformer performance, laying the groundwork for further advances in reinforcement learning architectures.