
The Physics of Language Models
And What We Actually Know About Knowledge


TL;DR: Yoshua Bengio’s 2003 paper “A Neural Probabilistic Language Model” is the genesis of modern NLP. Before this paper, language models were statistical counting machines. After it, they became ...
Review of all the books I read in 2024.
Talk Trois Log
Review of all the books I read in 2023.
TL;DR: Before normalization, training deep networks was like trying to stack cards in a hurricane—one small change would topple everything. BatchNorm (2015) changed the game by stabilizing training...

TL;DR: Dropout started as a simple trick to prevent overfitting—randomly turn off neurons during training. But it evolved into something profound: a gateway to understanding uncertainty in deep lea...

TL;DR: Every time you train a neural network, you’re solving an optimization problem in a space with millions of dimensions. This is the story of how we went from basic SGD taking baby steps to Ada...
Review of all the books I read in 2022.
Detroit: Become Human is a 2018 third-person game, built around three "deviant" androids...