> Dismantling Transformers

Interactive exploration of the Transformer architecture

> Page 1: ITV
Interactive Transformer Visualizer

Explore the core "Attention Is All You Need" architecture with mathematically accurate visualizations.

Self-Attention Heatmaps
Positional Encoding
Q, K, V Matrices
Encoder/Decoder Stack
> Deep Dive →
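The core operation this visualizer animates is scaled dot-product attention from "Attention Is All You Need": each query is scored against every key, the scores are softmax-normalized into the weights shown in the heatmaps, and those weights mix the value vectors. A minimal pure-Python sketch (the function names and toy vectors are illustrative, not the site's implementation):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    out, weights = [], []
    for q in Q:
        # Score this query against every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        w = softmax(scores)          # one row of the attention heatmap
        weights.append(w)
        # Output = attention-weighted sum of the value vectors.
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out, weights

# Toy 2-token example: each row is one token's query/key/value vector.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out, weights = attention(Q, K, V)
```

Each row of `weights` sums to 1 and corresponds to one row of a self-attention heatmap; here the first token attends most strongly to itself.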
> Page 2: ITT
Interactive Text Tokenizer

Compare tokenization across BERT, GPT, and T5 with live embedding visualization.

WordPiece (BERT)
BPE (GPT)
SentencePiece (T5)
Embedding Pipeline
> Tokenize Text →
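All three tokenizers build subword vocabularies by iteratively merging frequent symbol pairs; BPE (GPT's scheme) is the simplest to sketch. A minimal training loop on a toy corpus, assuming words are pre-split into space-separated symbols (illustrative code, not any library's API):

```python
from collections import Counter

def most_frequent_pair(words):
    # Count adjacent symbol pairs across the corpus, weighted by word frequency.
    pairs = Counter()
    for word, freq in words.items():
        syms = word.split()
        for a, b in zip(syms, syms[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    # Replace every occurrence of the pair with its merged symbol.
    a, b = pair
    return {word.replace(f"{a} {b}", a + b): freq
            for word, freq in words.items()}

# Toy corpus: space-separated symbols with word frequencies.
words = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
merges = []
for _ in range(3):
    pair = most_frequent_pair(words)
    merges.append(pair)
    words = merge_pair(words, pair)
```

After three merges the corpus contains the learned subword `est`; WordPiece and SentencePiece differ mainly in the pair-scoring criterion and in how raw text is pre-segmented.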
> Page 3: IPC
Interactive Path Comparator

Compare BERT, GPT, and T5 with side-by-side architecture and attention pattern visualizations.

BERT: Encoder-Only
GPT: Decoder-Only
T5: Encoder-Decoder
Pre-training Tasks
> Compare Models →
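The attention-pattern differences between these architectures come down to masking: BERT's encoder lets every position see every other, while GPT's decoder restricts each position to itself and earlier positions; T5 uses the bidirectional mask in its encoder and the causal mask in its decoder. A sketch of the two masks as boolean matrices (illustrative, not the site's code):

```python
def bidirectional_mask(n):
    # BERT-style encoder self-attention: every position attends everywhere.
    return [[True] * n for _ in range(n)]

def causal_mask(n):
    # GPT-style decoder self-attention: position i attends only to j <= i,
    # so no token can look at future tokens.
    return [[j <= i for j in range(n)] for i in range(n)]

enc = bidirectional_mask(3)
dec = causal_mask(3)
```

Rendered as a heatmap, the bidirectional mask is a full square while the causal mask is lower-triangular, which is exactly the visual signature the comparator highlights.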
> Page 4: LDI
LLM Deconstructor & Integrator

Interactive PyTorch-style code walkthrough with synchronized math and explanations (CDX feature).

BPE Tokenization
Embedding Layers
Attention Pipeline
Model-Specific Masking
> Code Walkthrough →
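The embedding step of such a walkthrough combines a token-embedding lookup with the sinusoidal positional encoding from "Attention Is All You Need": PE[pos, 2i] = sin(pos / 10000^(2i/d_model)) and PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model)), added elementwise to the token embeddings. A minimal pure-Python sketch (function names and the toy embedding table are illustrative):

```python
import math

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding: even dims get sin, odd dims get cos,
    # with wavelengths increasing geometrically across dimensions.
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

def embed(token_ids, table, pe):
    # Token-embedding lookup plus positional encoding, summed elementwise.
    return [[e + p for e, p in zip(table[t], pe[pos])]
            for pos, t in enumerate(token_ids)]

pe = positional_encoding(seq_len=4, d_model=8)
table = [[0.0] * 8, [1.0] * 8]          # toy 2-token embedding table
x = embed([0, 1], table, pe)            # input to the attention pipeline
```

Because the encoding is deterministic in the position, the model can attend by relative offset without any learned position parameters; learned positional embeddings (as in BERT and GPT) simply replace `pe` with a trainable table.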
Recommended learning path: Architecture → Tokenization → Model Comparison → Implementation