Executive Context
Why This Matters in 2024+
Historical context, connections to modern LLMs, and stakeholder communication.
Why Sequences Matter
The Limitations of Vanilla Neural Networks
Variable-length sequences, the five input/output architectures (one-to-one through many-to-many), and Turing completeness.
RNN Architecture
Building Memory into Networks
Core equations, hidden state updates, and the "optimization over programs" insight.
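A minimal NumPy sketch of the recurrence this section builds up to: one tanh over the previous hidden state plus the current input. The weight names and sizes (W_hh, W_xh, W_hy, hidden size 4) are illustrative, not the course's exact code.

```python
import numpy as np

rng = np.random.default_rng(0)
W_hh = rng.normal(scale=0.1, size=(4, 4))  # hidden-to-hidden weights
W_xh = rng.normal(scale=0.1, size=(4, 3))  # input-to-hidden weights
W_hy = rng.normal(scale=0.1, size=(2, 4))  # hidden-to-output weights
b_h = np.zeros(4)

def rnn_step(h_prev, x):
    """One recurrence: fold the new input into the running hidden state."""
    h = np.tanh(W_hh @ h_prev + W_xh @ x + b_h)
    y = W_hy @ h                             # output is read off the hidden state
    return h, y

h = np.zeros(4)                              # initial memory
for x in rng.normal(size=(5, 3)):            # a toy five-step input sequence
    h, y = rnn_step(h, x)
```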
Vanishing Gradients & LSTMs
The Problem and Its Solution
Repeated gradient multiplication through time, the LSTM cell state, and the forget/input/output gates.
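A sketch of a single LSTM step matching the gates listed above. The stacked weight layout (forget, input, output, candidate rows in one matrix) is an assumption made for compactness, not a fixed API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W: (4*H, H+X), b: (4*H,), rows stacked as
    [forget; input; output; candidate] (an illustrative convention)."""
    H = h_prev.size
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0*H:1*H])   # forget gate: how much old cell state survives
    i = sigmoid(z[1*H:2*H])   # input gate: how much new content is written
    o = sigmoid(z[2*H:3*H])   # output gate: how much cell state is exposed
    g = np.tanh(z[3*H:4*H])   # candidate cell contents
    c = f * c_prev + i * g    # additive update: the path gradients flow along
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
H, X = 4, 3
W, b = rng.normal(scale=0.1, size=(4*H, H + X)), np.zeros(4*H)
h, c = lstm_step(rng.normal(size=X), np.zeros(H), np.zeros(H), W, b)
```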
Character-Level Modeling
Next-Character Prediction
One-hot encoding, cross-entropy loss, and temperature sampling.
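An illustrative pass over the three ingredients named here, on a toy four-character vocabulary; every name below is hypothetical.

```python
import numpy as np

vocab = list("helo")                          # toy character vocabulary
char_to_ix = {ch: i for i, ch in enumerate(vocab)}

def one_hot(ch):
    v = np.zeros(len(vocab))
    v[char_to_ix[ch]] = 1.0
    return v

def cross_entropy(logits, target_ix):
    """Loss = -log p(correct next character) under a softmax over logits."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return -np.log(p[target_ix])

def sample_char(logits, temperature=1.0, rng=np.random.default_rng()):
    """Temperature sampling: T < 1 sharpens the distribution, T > 1 flattens it."""
    z = logits / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return vocab[rng.choice(len(vocab), p=p)]

logits = np.array([2.0, 1.0, 0.2, 0.2])       # unnormalized next-char scores
print(sample_char(logits, temperature=0.5))   # low T is close to greedy decoding
```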
Experiments
What Can RNNs Learn?
Shakespeare, Wikipedia, LaTeX, and Linux kernel source, plus neuron-level visualization.
Beyond Text
Vision, Speech, and Translation
CNN+RNN pipelines for image captioning, encoder-decoder translation, and multimodal applications.
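A compact PyTorch sketch of the encoder-decoder idea, assuming GRU cells and placeholder sizes; in image captioning the encoder would be a CNN instead.

```python
import torch
import torch.nn as nn

enc = nn.GRU(input_size=8, hidden_size=16, batch_first=True)
dec = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

src = torch.randn(1, 7, 8)     # source sequence (words, or CNN image features)
_, h = enc(src)                # h: the fixed-size summary ("thought vector")
tgt = torch.randn(1, 5, 8)     # target-side inputs (teacher forcing)
out, _ = dec(tgt, h)           # decoder unrolls from the encoder's summary
```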
Attention Mechanisms
The Most Important Innovation
Soft vs. hard attention, Neural Turing Machines, and the bridge to Transformers.
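A sketch of the dot-product flavor of soft attention: a differentiable weighted read over every position, which is what makes it trainable end to end (hard attention picks a single position and needs sampling-based training). Shapes and names are illustrative.

```python
import numpy as np

def soft_attention(query, keys, values):
    """query: (d,), keys/values: (n, d); all sizes illustrative."""
    scores = keys @ query / np.sqrt(query.size)  # query-key similarity per position
    w = np.exp(scores - scores.max())
    w /= w.sum()                                 # softmax: the attention weights
    return w @ values                            # blended context vector

rng = np.random.default_rng(0)
ctx = soft_attention(rng.normal(size=8),
                     rng.normal(size=(5, 8)),
                     rng.normal(size=(5, 8)))
```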
Limitations & Path Forward
When to Use (and Not Use) RNNs
RNN limitations, the Transformer revolution, and build-vs-buy decisions.
Implementation Deep Dive
From NumPy to PyTorch to Hugging Face
Three implementation tracks of increasing complexity.
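For orientation, the core of the middle track in a few lines, using PyTorch's built-in nn.RNN; sizes are placeholders, and a real character model would add an embedding and a training loop.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=3, hidden_size=4, batch_first=True)
head = nn.Linear(4, 2)            # hidden state -> output logits

x = torch.randn(1, 5, 3)          # (batch, time, features)
out, h_n = rnn(x)                 # out: (1, 5, 4); h_n: final hidden state
logits = head(out)                # per-timestep predictions
```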
Capstone Project
Train Your Own Model
Three difficulty levels with gamified milestones and achievements.