🔭 Dimensionality Reduction
Keeping the important information
Take your time with this one. The interactive parts are here to help you test the idea, not rush through it.
Pause and experiment as you go.
Before We Begin
What we are learning today
The art of smart simplification. When datasets have hundreds of columns, dimensionality reduction keeps the essence while trimming the clutter.
How this lesson fits
Here’s where the magic shows up: instead of hand-picking which columns matter, we let the data itself reveal which directions carry the most information. Think of it as letting the data do the summarizing.
The big question
How can a machine squeeze hundreds of features down to a handful of numbers without losing the patterns that matter?
Why You Should Care
More columns aren’t always better. Cleaner, lower-dimensional views can make patterns easier to see and models easier to train.
Where this is used today
- ✓ Visualizing high-dimensional data (t-SNE, UMAP)
- ✓ Compressing images/video
- ✓ Preprocessing for other ML models
Think of it like this
Like casting a shadow of a 3D object. You lose some depth, but from the right angle, the important shape remains.
Easy mistake to make
Dimensionality reduction isn’t random column deletion. It’s a careful mathematical compression that preserves structure.
By the end, you should be able to:
- Explain the curse of dimensionality in plain language
- Describe PCA as finding the directions of greatest variation
- Connect lower-dimensional views to visualization and compression
Think about this first
If you had to summarize a student with only two numbers, which would you choose to keep the most useful story?
Why We Shrink the Number of Features
Imagine taking a photo of a 3D statue. The photo is 2D, but if you pick the right angle, you can still recognize the shape. Dimensionality reduction is the art of finding that perfect angle—simplifying the data without destroying the meaning.
PCA — Principal Component Analysis
PCA asks a very practical question: if I had to redraw this dataset using fewer axes, which new directions would keep the most useful information? The first principal component follows the strongest spread in the data, the second follows the next strongest spread, and so on.
Left: rotating 3D view. Right: PCA projection to 2D (always same orientation).
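In practice you rarely code PCA by hand; a minimal sketch with scikit-learn (assumed installed) shows the idea. The dataset here is made up for illustration: five measured features that are really driven by just two hidden directions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# 200 samples, 5 features, secretly generated from 2 hidden directions
latent = rng.normal(size=(200, 2))       # the "real" low-dimensional story
mixing = rng.normal(size=(2, 5))         # how it spreads into 5 columns
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))  # plus a little noise

# Ask PCA to redraw the data with only 2 axes
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                        # (200, 2)
print(pca.explained_variance_ratio_)     # how much spread each axis keeps
```

Because only two directions generated the data, the two principal components recover almost all of the variation; `explained_variance_ratio_` is the standard way to check how much information a projection keeps.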
What PCA is trying to do:
- Shift the data so the cloud is centered around the origin
- Measure which features tend to vary together using the covariance matrix
- Find the directions where the data spreads out the most
- Project the data onto the top directions you want to keep
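The four steps above can be sketched directly in NumPy; the covariance matrix's eigenvectors are the directions of greatest spread, and its eigenvalues measure that spread. The toy numbers are made up for illustration.

```python
import numpy as np

# Toy dataset: 6 samples (rows), 3 features (columns)
X = np.array([
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.9],
    [2.2, 2.9, 0.7],
    [1.9, 2.2, 0.9],
    [3.1, 3.0, 0.4],
    [2.3, 2.7, 0.8],
])

# 1. Shift the data so the cloud is centered around the origin
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix: which features tend to vary together
cov = np.cov(X_centered, rowvar=False)

# 3. Directions of greatest spread = eigenvectors of the covariance matrix;
#    eigh returns them in ascending order, so flip to strongest-first
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 4. Project onto the top k directions (here k = 2)
k = 2
X_reduced = X_centered @ eigenvectors[:, :k]

print(X_reduced.shape)  # (6, 2)
```

Each row of `X_reduced` is the same sample described with two numbers instead of three: the "shadow" cast from the angle that preserves the most shape.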