
Federated Learning

Training together without sharing raw data

Take your time with this one. The interactive parts are here to help you test the idea, not rush through it.

25 min · Explore at your own pace

Before We Begin

What we are learning today

Learning without spying. The model visits your device, learns locally, and sends back updates—not your raw data.

How this lesson fits

Here we peek over the horizon: agents that learn by doing, and systems that train together without spilling secrets.

The big question

How can AI learn from its own experience and still respect privacy and real-world limits?

  • Interpret reward-driven learning and long-term payoff
  • Explain the exploration vs. exploitation balance
  • Describe privacy-aware training across many devices

Why You Should Care

Real systems juggle privacy, bandwidth, and fairness. Federated learning is one way to balance those needs.

Where this is used today

  • Predictive text on smartphones (Gboard)
  • Medical research across hospitals
  • Smart home devices learning locally

Think of it like this

Like a potluck dinner: everyone brings a dish (updates), but keeps the family recipe (raw data) at home.

Easy mistake to make

Federated learning boosts privacy but isn’t a perfect shield. Security and fairness still need care.

By the end, you should be able to say:

  • Explain the core idea of local training plus global averaging
  • Describe why privacy concerns motivate federated learning
  • Identify practical challenges such as unequal data and device quality

Think about this first

Why might a hospital or phone user refuse to upload raw data to one server? What worries them?

Words we will keep using

federated · local update · aggregation · privacy · client

Federated Learning

Imagine a hospital wants to train an AI to spot diseases, but it can't share patient records because of privacy laws. Federated Learning is the solution: bring the model to the data, not the data to the model.

Privacy First: Your data never leaves your device. Only the math (model updates) gets shared.
Teamwork: Thousands of devices work together to build one smart brain.
Messy Data: Everyone's data looks different, which makes training tricky but robust.

FedAvg Algorithm

FedAvg works like a potluck dinner. Everyone cooks a dish at home (trains on local data), brings it to the party, and mixes it all together into one giant feast (the global model).

  1. The server sends the current shared model to selected clients
  2. Each client trains on its own private data for a short time
  3. The clients send back updated model weights
  4. The server averages those updates into a new global model
w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} w_t^k

where nₖ is the number of samples at client k and n = Σnₖ.
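The weighted-averaging step above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full training loop; `fedavg` is our own helper name, and the client weights and sample counts are made up for the example.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Server-side FedAvg step: w_{t+1} = sum_k (n_k / n) * w_k."""
    n = sum(client_sizes)
    stacked = np.stack(client_weights)       # shape (K, num_params)
    coeffs = np.array(client_sizes) / n      # n_k / n for each client k
    return coeffs @ stacked                  # weighted sum over the K clients

# Three clients with different amounts of local data:
# the client with 20 samples counts twice as much as the others.
weights = [np.array([1.0, 0.0]),
           np.array([0.0, 1.0]),
           np.array([2.0, 2.0])]
sizes = [10, 10, 20]
global_w = fedavg(weights, sizes)  # → array([1.25, 1.25])
```

Note that clients never exchange data here, only weight vectors; the server sees updates, not the samples that produced them.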

Interactive Federated Training

The interactive demo lets you select which clients participate each round, then shows the global model's weights (w₁ through w₄ and the bias b) and a training loss curve updating round by round.

Privacy Enhancements

Differential Privacy: Adding a little bit of noise so that even the model updates can't be traced back to a specific person.
Secure Aggregation: Mixing the updates in a lockbox so the server sees the total, but not who contributed what.
Homomorphic Encryption: Doing math on encrypted data without unlocking it first. Yes, that's possible.
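To make the differential-privacy idea concrete, here is a small DP-SGD-style sketch: clip each client update's L2 norm, then add Gaussian noise before sending it. The function name `privatize_update` and the values of `clip_norm` and `noise_std` are illustrative; a real deployment would calibrate the noise to a target (ε, δ) privacy budget.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.1, rng=None):
    """Clip an update's L2 norm to clip_norm, then add Gaussian noise.

    Clipping bounds any one client's influence; the noise hides
    individual contributions in the aggregate. Illustrative only —
    noise_std here is not calibrated to a formal privacy budget.
    """
    rng = rng or np.random.default_rng(0)
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

raw = np.array([3.0, 4.0])        # L2 norm 5.0, so it gets scaled down to norm 1.0
noisy = privatize_update(raw)     # clipped to [0.6, 0.8], plus noise
```

With `noise_std=0.0` the function reduces to pure norm clipping, which is a handy way to sanity-check the scaling.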
Real-world deployments: keyboard prediction, voice assistants, and multi-hospital medical models are good examples because they all involve useful learning plus sensitive personal data.