Sequence Models · Intermediate

Hidden Markov Models

When the real state is hidden from view

Take your time with this one. The interactive parts are here to help you test the idea, not rush through it.

30 min · Explore at your own pace

Before We Begin

What we are learning today

Inferring the invisible. We observe clues and guess the hidden causes behind them, piecing together the story step by step.

How this lesson fits

Time matters. Language, music, weather—they all happen in a sequence. These models learn to remember the past to predict the future.

The big question

How can a model use the past to make sense of what comes next in a sequence?

  • Explain why sequence order changes meaning
  • Compare probabilistic and neural approaches to sequences
  • Track memory and hidden state across time

Why You Should Care

Many important states are hidden. Learning to infer them from evidence is a core AI superpower.

Where this is used today

  • Speech recognition (older systems like Siri)
  • Gesture recognition
  • Predicting gene sequences

Think of it like this

Like diagnosing a cold. You don’t see the virus, but you see the sneezes and sniffles and infer what’s happening.

Easy mistake to make

“Hidden” isn’t random or unknowable. It just isn’t directly observed—we infer it with probability.

By the end, you should be able to:

  • Distinguish hidden states from visible observations
  • Explain transitions and emissions in a sequence model
  • Describe Viterbi as a method for finding the most likely hidden path

Think about this first

Name something you can’t see directly but can guess from clues. How confident can you get?

Words we will keep using

hidden state · observation · transition · emission · Viterbi

Hidden Markov Models

Think of this as the "Sherlock Holmes" model. You never see the crime (hidden state), only the clues left behind (observations). The HMM is a mathematical tool for working backwards from the clues to the likely truth.

  • Hidden states: the truth. The actual weather, or someone's true health. We never see this directly.
  • Observations: the evidence. An umbrella, a cough, or a credit card transaction.
  • Parameters: the rules. How likely is rain? How likely is an umbrella if it rains?

The Weather / Activity HMM

Transition Matrix A (State → State)

From \ To   Sunny   Rainy
Sunny       0.7     0.3
Rainy       0.4     0.6

Emission Matrix B (State → Observation)

State \ Obs   🚶 Walk   🛍️ Shop   🧹 Clean
Sunny         0.6       0.3       0.1
Rainy         0.1       0.4       0.5

Observation sequence: 🚶 Walk, 🚶 Walk, 🛍️ Shop

Sequence probability P(O|λ): 0.057816 (summing over every possible hidden weather sequence, with the initial distribution Sunny 0.6 / Rainy 0.4 implied by the t=0 values in the table below)
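That probability can be reproduced with the forward algorithm. The sketch below is not the lesson's own implementation; in particular, the initial distribution π = (Sunny 0.6, Rainy 0.4) is an assumption, inferred from the t=0 Viterbi value 0.36 = π(Sunny) × 0.6.

```python
# Forward algorithm: sum the probability of ALL hidden paths that
# could have produced the observations.
states = ["Sunny", "Rainy"]
pi = {"Sunny": 0.6, "Rainy": 0.4}            # assumed initial distribution
A = {"Sunny": {"Sunny": 0.7, "Rainy": 0.3},  # transition matrix A
     "Rainy": {"Sunny": 0.4, "Rainy": 0.6}}
B = {"Sunny": {"Walk": 0.6, "Shop": 0.3, "Clean": 0.1},  # emission matrix B
     "Rainy": {"Walk": 0.1, "Shop": 0.4, "Clean": 0.5}}

def forward(obs):
    """Return P(O | lambda), the likelihood of the observation sequence."""
    # alpha[s] = P(o_0..o_t, state at time t is s)
    alpha = {s: pi[s] * B[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[p] * A[p][s] for p in states) * B[s][o]
                 for s in states}
    return sum(alpha.values())

print(round(forward(["Walk", "Walk", "Shop"]), 6))  # 0.057816
```

Note that the sum over paths is done incrementally: at each step only one number per state is kept, so the cost grows linearly with sequence length rather than exponentially.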

Viterbi Algorithm — Most Likely State Sequence

The Viterbi algorithm asks: "What is the single most likely story that explains these clues?" Instead of scoring every possible combination of hidden states, it keeps only the best path into each state at each time step, so the work grows linearly with the length of the sequence.

Trellis diagram: columns = time steps, rows = hidden states. Highlighted nodes/edges = Viterbi decoded path. Numbers inside nodes = Viterbi probability.

Step   Observation   P(Sunny)   P(Rainy)   Most Likely State
t=0    🚶 Walk       0.36000    0.04000    Sunny
t=1    🚶 Walk       0.15120    0.01080    Sunny
t=2    🛍️ Shop       0.03175    0.01814    Sunny

Decoded weather: Sunny → Sunny → Sunny
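The decoded path can be checked with a short Viterbi sketch. As before, this is a minimal illustration, and the initial distribution π = (Sunny 0.6, Rainy 0.4) is an assumption read off the t=0 column of the table.

```python
# Viterbi algorithm: keep only the single BEST path into each state
# at each step, then trace the winners backwards.
states = ["Sunny", "Rainy"]
pi = {"Sunny": 0.6, "Rainy": 0.4}            # assumed initial distribution
A = {"Sunny": {"Sunny": 0.7, "Rainy": 0.3},
     "Rainy": {"Sunny": 0.4, "Rainy": 0.6}}
B = {"Sunny": {"Walk": 0.6, "Shop": 0.3, "Clean": 0.1},
     "Rainy": {"Walk": 0.1, "Shop": 0.4, "Clean": 0.5}}

def viterbi(obs):
    """Return (most likely hidden path, probability of that path)."""
    v = {s: pi[s] * B[s][obs[0]] for s in states}  # best-path prob ending in s
    backptr = []                                   # best predecessor per step
    for o in obs[1:]:
        step, new_v = {}, {}
        for s in states:
            prev = max(states, key=lambda p: v[p] * A[p][s])
            step[s] = prev
            new_v[s] = v[prev] * A[prev][s] * B[s][o]
        backptr.append(step)
        v = new_v
    last = max(states, key=lambda s: v[s])
    path = [last]
    for step in reversed(backptr):   # trace back from the best final state
        path.append(step[path[-1]])
    path.reverse()
    return path, v[last]

path, p = viterbi(["Walk", "Walk", "Shop"])
print(path, round(p, 5))  # ['Sunny', 'Sunny', 'Sunny'] 0.03175
```

The intermediate values match the table above (0.1512 and 0.0108 at t=1, 0.03175 and 0.01814 at t=2); the only structural difference from the forward algorithm is `max` in place of `sum`, plus the back-pointers needed to recover the path.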

Three Classic HMM Problems

  • Evaluation (Forward): how likely is this observed sequence under the current model?
  • Decoding (Viterbi): what hidden sequence is the best explanation for what we saw?
  • Learning (Baum-Welch): how should the probabilities be adjusted so the model matches the data better?
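To make the learning step concrete, here is a sketch of a single Baum-Welch (EM) re-estimation pass for the transition matrix, using the same toy model and observation sequence; the initial distribution is again an assumed Sunny 0.6 / Rainy 0.4. A real implementation repeats this until the likelihood stops improving and also re-estimates the emission matrix and the initial distribution.

```python
# One Baum-Welch re-estimation pass for the transition matrix A.
states = ["Sunny", "Rainy"]
pi = {"Sunny": 0.6, "Rainy": 0.4}            # assumed initial distribution
A = {"Sunny": {"Sunny": 0.7, "Rainy": 0.3},
     "Rainy": {"Sunny": 0.4, "Rainy": 0.6}}
B = {"Sunny": {"Walk": 0.6, "Shop": 0.3, "Clean": 0.1},
     "Rainy": {"Walk": 0.1, "Shop": 0.4, "Clean": 0.5}}
obs = ["Walk", "Walk", "Shop"]
T = len(obs)

# Forward pass: alpha[t][s] = P(o_0..o_t, state_t = s)
alpha = [{s: pi[s] * B[s][obs[0]] for s in states}]
for t in range(1, T):
    alpha.append({s: sum(alpha[t-1][p] * A[p][s] for p in states) * B[s][obs[t]]
                  for s in states})

# Backward pass: beta[t][s] = P(o_{t+1}..o_{T-1} | state_t = s)
beta = [None] * T
beta[T-1] = {s: 1.0 for s in states}
for t in range(T - 2, -1, -1):
    beta[t] = {s: sum(A[s][n] * B[n][obs[t+1]] * beta[t+1][n] for n in states)
               for s in states}

likelihood = sum(alpha[T-1][s] for s in states)   # P(O | lambda) = 0.057816

# E-step. gamma[t][s]: P(state s at t | O); xi[t][s][n]: P(s at t, n at t+1 | O)
gamma = [{s: alpha[t][s] * beta[t][s] / likelihood for s in states}
         for t in range(T)]
xi = [{s: {n: alpha[t][s] * A[s][n] * B[n][obs[t+1]] * beta[t+1][n] / likelihood
           for n in states} for s in states} for t in range(T - 1)]

# M-step: expected transitions s -> n divided by expected visits to s
new_A = {s: {n: sum(xi[t][s][n] for t in range(T - 1)) /
                sum(gamma[t][s] for t in range(T - 1))
             for n in states} for s in states}
print(new_A)  # each row still sums to 1, but nudged toward the data
```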

Applications: speech recognition, gesture recognition, biological sequence analysis, and any situation where an invisible process leaves visible traces behind.