Attention Is a Third of What You Need: QKV, Dot Products, and the Other Two-Thirds

TL;DR: Attention is the mechanism that lets every token look at every other token and decide what matters. It works through three learned matrices – Query, Key, and Value – that produce similarity ...

Jul 10, 2024 Machine Learning, Deep Learning

Transformers: More Than Meets the Eye

TL;DR: RNNs processed language one word at a time. LSTMs added memory gates but stayed sequential. The Transformer threw all of that out — no recurrence, no convolutions — and replaced it with a si...

Jun 13, 2024 Machine Learning, Deep Learning

SIREN: Teaching Networks to Think in Sine Waves

TL;DR: ReLU networks struggle to represent fine detail in continuous signals because their first derivatives are piecewise constant and their second derivatives are zero almost everywhere. SIREN replaces ReLU with sine activati...

May 15, 2024 Machine Learning, Deep Learning

Semantics and Pragmatics: Where Does Meaning Live?

The standard introductory distinction: semantics is what a sentence means; pragmatics is what a speaker means by saying it. “I’m cold” means there is a person and they are experiencing low tempera...

Apr 1, 2024 Writing, Linguistics

The JEE Aspirant Experience: Power, Bare Life, and Hegemony

What does it mean to be a JEE dropper? You are 18 or 19 years old. You are not a student — no institution claims you. You are not working. You are preparing for an exam that will, in theory, determ...

Apr 1, 2024 Writing, Philosophy

Nothing is True; Nothing is Permitted: A Cultural Analysis of Assassin's Creed: India

Assassin’s Creed’s central philosophical claim — nothing is true, everything is permitted — is one of the more genuinely interesting premises in mainstream gaming. It’s a denial of natural social o...

Apr 1, 2024 Writing, Cultural Studies

GAN-generated faces — none of these people exist

GANs: From a Thought Experiment to Photorealistic Faces

TL;DR: VAEs could generate images, but they were blurry. GANs fixed that by replacing the reconstruction loss with something smarter — a second neural network whose entire job is to call out fakes....

Mar 15, 2024 Machine Learning, Deep Learning

Adversarial perturbation — imperceptible to humans, catastrophic to models

When Neural Networks Lie: Adversarial Examples and the Art of Fooling AI

TL;DR: In 2014, Goodfellow et al. showed that neural networks — no matter how accurate — can be fooled by adding tiny, invisible perturbations to their inputs. A panda becomes a gibbon. A stop sign...

Feb 10, 2024 Machine Learning, Deep Learning

Birth of Tragedy: Appreciation and Criticism

Reading Nietzsche requires a kind of unculturing: actively unlearning what you think art is before his framework can land. My default assumption, before this book, was that “art” meant Shakespear...

Feb 1, 2024 Writing, Philosophy

From Autoencoders to VAEs: Learning to Generate, Not Just Compress

TL;DR: Hinton’s autoencoder taught networks to compress data into a small latent vector and reconstruct it back. It worked beautifully — but the latent space was a mess you couldn’t generate from. ...

Jan 24, 2024 Machine Learning, Deep Learning