The Good, The Bad, and Hamiltonian Monte Carlo

Matías Castillo-Aguilar — Wed, 15 May 2024 03:00:00 GMT

Introduction

Hey there, fellow science enthusiasts and stats geeks! Welcome back to the wild world of Markov Chain Monte Carlo (MCMC) algorithms. This is part two of my series on the powerhouse behind Bayesian Inference. If you missed the first post, no worries! Just hop on over here and catch up before we dive deeper into the MCMC madness. Today, we’re exploring the notorious Hamiltonian Monte Carlo (HMC), a special kind of MCMC algorithm that taps into the dynamics of Hamiltonian mechanics.

Stats Meets Physics?

Hold up, did you say Hamiltonian mechanics? What in the world do mechanics and physics have to do with Bayesian stats? I get it, it sounds like a mashup of your wildest nightmares. But trust me, this algorithm sometimes feels like running a physics simulation in a statistical playground. Remember our chat from the last post? In Bayesian stats, we’re all about estimating the shape of a parameter space, aka the posterior distribution.

A Particle Rolling Through Stats Land

Picture this: You drop a tiny particle down a cliff, and it rolls naturally along the landscape’s curves and slopes. Easy, right? Now, swap out the real-world terrain for a funky high-dimensional probability function. That same little particle? It’s cruising through this wild statistical landscape like a boss, all thanks to the rules of Hamiltonian mechanics.

About the animation

The previous animation illustrates the Hamiltonian dynamics of a particle traveling through a two-dimensional parameter space. The code for this animation is borrowed from Chi Feng’s GitHub. You can find the original repository with the corresponding code here: https://github.com/chi-feng/mcmc-demo
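To make the "landscape" idea a bit more concrete, here is a minimal sketch of how a probability function can be turned into terrain for the particle. In HMC the height of the terrain is usually taken to be the negative log of the density, so high-probability regions become valleys. The target below (a standard normal for a single parameter `q`) and the names `log_density`, `U`, and `grad_U` are illustrative assumptions, not something taken from the animation.

```r
# Hypothetical target: a standard normal density for one parameter q
log_density <- function(q) dnorm(q, mean = 0, sd = 1, log = TRUE)

# Potential energy: the negative log density, so likely values sit in a valley
U <- function(q) -log_density(q)

# Slope of the landscape; for a standard normal this is simply q
grad_U <- function(q) q

# The further q is from the mode at 0, the higher the terrain
heights <- U(c(0, 1, 3))
print(heights)
```

The slope `grad_U` is what pushes the particle around: it points uphill, so the particle gets pulled back toward the valley floor.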

Hamiltonian Mechanics: A Child’s Play?

Let’s break down Hamiltonian dynamics in terms of position and momentum with a fun scenario: Imagine you’re on a swing. When you hit the highest point, you slow down, right? Your momentum’s almost zero. But here’s the kicker: You know you’re about to pick up speed on the way down, gaining momentum in the opposite direction. That moment when you’re at the top, almost motionless? That’s when you’re losing kinetic energy and gaining potential energy, thanks to gravity getting ready to pull you back down.

Swing Animation

So, in this analogy, when your kinetic energy (think swing momentum) goes up, your potential energy (like being at the bottom of the swing) goes down. And vice versa! When your kinetic energy drops (like when you’re climbing back up), your potential energy shoots up, waiting for gravity to do its thing.

This energy dance is captured by the Hamiltonian ($H$), which sums up the total energy in the system. It’s the sum of kinetic energy ($K$) and potential energy ($U$):

$$H(q, p) = K(p) + U(q)$$

At its core, Hamiltonian Monte Carlo (HMC) borrows from Hamiltonian dynamics, a fancy term for the rules that govern how physical systems evolve in phase space. In Hamiltonian mechanics, a system’s all about its position ($q$) and momentum ($p$), and their dance is choreographed by Hamilton’s equations. Brace yourself, things are about to get a little mathy:

$$\frac{dq}{dt} = \frac{\partial H}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}$$

Wrapping Our Heads Around the Math

Okay, I know Hamiltonian dynamics can be a real brain-buster. Trust me, it took me a hot minute to wrap my head around it. But hey, I’ve got an analogy that might just make it click. Let’s revisit our swing scenario and picture a kid on a swing. The swing’s angle from the vertical ($q$) tells us where the kid is, and momentum ($p$) is how fast the swing’s moving.

Now, let’s break down those equations:

$$\frac{dq}{dt} = \frac{\partial H}{\partial p}$$

This one’s like peeking into the future to see how the angle ($q$) changes over time. And guess what? It’s all about momentum ($p$). The faster the swing’s going, the quicker it swings back and forth. Simple as that!

Next up:

$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}$$

Now, this beauty tells us how momentum ($p$) changes over time. It’s all about the energy game here, specifically how the swing’s position ($q$) affects its momentum. When the swing’s at the highest point, gravity’s pulling hardest, ready to send the kid back down.
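For the swing, both of Hamilton’s equations can be written out explicitly if we treat it as an idealized pendulum. This is only a sketch under assumptions that are not in the post: the mass $m$, the rope length $l$, and the gravitational acceleration $g$ are introduced here for illustration, and $p$ is read as angular momentum.

```latex
\begin{aligned}
H(q, p) &= \frac{p^2}{2 m l^2} - m g l \cos q \\
\frac{dq}{dt} &= \frac{\partial H}{\partial p} = \frac{p}{m l^2} \\
\frac{dp}{dt} &= -\frac{\partial H}{\partial q} = -m g l \sin q
\end{aligned}
```

The first equation says the angle changes in proportion to the momentum. The second says the pull on the momentum grows with $\sin q$, so it is zero at the bottom of the arc and gets larger as the swing climbs (up to the horizontal).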

So, picture this:

The kid swings forward, so the angle ($q$) goes up, carried by the momentum ($p$), until bam: top of the swing.
At the top, the swing’s momentarily still, but gravity’s pulling to send the kid flying back down. By this point, the swing has built up its potential energy.
Zoom! Back down it goes, picking up speed in the opposite direction, and so the potential energy is transferred into kinetic energy.

All the while, the Hamiltonian ($H$) is keeping tabs on the swing’s total energy, whether it’s zooming at the bottom (high kinetic energy $K$, as a function of momentum $p$) or pausing at the top (high potential energy $U$, as a function of position $q$).
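The energy bookkeeping described above can be checked numerically. Below is a minimal sketch that treats the swing as an idealized pendulum and steps it forward with leapfrog updates, recording $K$, $U$, and their sum at each step. The constants `m`, `l`, `g`, the starting angle, and the step size are all assumptions picked for illustration.

```r
# Idealized pendulum standing in for the swing (all constants are assumptions)
m <- 1      # mass
l <- 1      # rope length
g <- 9.81   # gravitational acceleration

U <- function(q) -m * g * l * cos(q)       # potential energy, depends on angle q
K <- function(p) p^2 / (2 * m * l^2)       # kinetic energy, depends on momentum p
dU_dq <- function(q) m * g * l * sin(q)    # slope of the potential energy

q <- pi / 4   # start 45 degrees from the vertical
p <- 0        # released from rest
dt <- 0.01
n_steps <- 500

energy <- data.frame(K = numeric(n_steps), U = numeric(n_steps))

for (i in seq_len(n_steps)) {
  energy$K[i] <- K(p)
  energy$U[i] <- U(q)

  # Leapfrog: half step for momentum, full step for angle, half step for momentum
  p <- p - 0.5 * dt * dU_dq(q)
  q <- q + dt * p / (m * l^2)
  p <- p - 0.5 * dt * dU_dq(q)
}

# Total energy at every step
energy$H <- energy$K + energy$U

print(range(energy$K))  # kinetic energy swings between zero and its peak
print(range(energy$H))  # total energy stays within a narrow band
```

Plotting `energy$K` and `energy$U` against the step index should show the two curves mirroring each other, while `energy$H` stays close to a flat line.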

This dance between kinetic and potential energy is what we care about in Hamiltonian mechanics. It is also what we mean when we refer to the phase space, which is nothing more than the relationship between position and momentum.

Visualizing Hamilton’s Equations

Okay, I know we’re diving into some physics territory here in a stats blog, but trust me, understanding these concepts is key to unlocking what HMC’s all about. So, let’s take a little side trip and get a feel for Hamilton’s equations with a different example. Check out the gif below. See that weight on a spring? It’s doing this cool back-and-forth dance thanks to the tug-of-war between the spring pulling up and gravity pulling down.

Simple harmonic oscillator. In this example, we would expect the potential energy to be greatest at the bottom or top positions, primarily because it is in these positions that the force exerted by the spring is greatest, which in turn affects the kinetic energy of the mass attached at the bottom of the spring.

Now, let’s get a little hands-on with some code. We’re gonna simulate a simple harmonic oscillator (you know, like that weight on a spring) and watch how it moves through phase space.

# Define the potential energy function (U) and its derivative (dU/dq)
U <- function(q) {
  k <- 1  # Spring constant
  return(0.5 * k * q^2)
}


dU_dq <- function(q) {
  k <- 1  # Spring constant
  return(k * q)
}

# Kinetic energy (K) used for later
K <- function(p, m) {
  return(p^2 / (2 * m))
}

# Introduce a damping coefficient
b <- 0.1  # Damping coefficient

# Set up initial conditions
q <- -3.0  # Initial position
p <- 0.0   # Initial momentum
m <- 1.0   # Mass

# Time parameters
t_max <- 20
dt <- 0.1
num_steps <- ceiling(t_max / dt)  # Ensure num_steps is an integer

# Initialize arrays to store position and momentum values over time
q_values <- numeric(num_steps)
p_values <- numeric(num_steps)

# Perform time integration using leapfrog-style half steps
# (with b > 0 the damping term drains energy, so H is not conserved)
for (i in 1:num_steps) {
  # Store the current values
  q_values[i] <- q
  p_values[i] <- p
  
  # Half step update for momentum with damping
  p_half_step <- p - 0.5 * dt * (dU_dq(q) + b * p / m)
  
  # Full step update for position using the momentum from the half step
  q <- q + dt * (p_half_step / m)
  
  # Another half step update for momentum with damping using the new position
  p <- p_half_step - 0.5 * dt * (dU_dq(q) + b * p_half_step / m)
}
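One way to see what the damping coefficient does is to track the total energy $H = K + U$ along the trajectory, with and without damping. The sketch below wraps the same update rule in a helper function. The name `simulate_oscillator` and its arguments are made up for this illustration, and the defaults mirror the values used above.

```r
# Hypothetical helper: runs the oscillator and returns total energy per step
simulate_oscillator <- function(b, q = -3, p = 0, m = 1, k = 1,
                                dt = 0.1, t_max = 20) {
  num_steps <- ceiling(t_max / dt)
  H <- numeric(num_steps)

  for (i in seq_len(num_steps)) {
    # Total energy: kinetic plus potential
    H[i] <- p^2 / (2 * m) + 0.5 * k * q^2

    # Same half step / full step / half step updates as above
    p_half <- p - 0.5 * dt * (k * q + b * p / m)
    q <- q + dt * p_half / m
    p <- p_half - 0.5 * dt * (k * q + b * p_half / m)
  }

  H
}

H_undamped <- simulate_oscillator(b = 0)
H_damped <- simulate_oscillator(b = 0.1)

print(range(H_undamped))  # stays close to the starting energy
print(range(H_damped))    # drifts downward as damping drains energy
```

With `b = 0` the energy should hover around its starting value, which is the behavior HMC relies on. With `b = 0.1` the particle spirals inward in phase space as it loses energy.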


Markov Chain Monte What?

Matías Castillo-Aguilar — Thu, 25 Apr 2024 03:00:00 GMT

Introduction

Alright, folks, let’s dive into the wild world of statistics and data science! Picture this: you’re knee-deep in data, trying to make sense of the chaos. But here’s the kicker: sometimes the chaos is just too darn complex. With tons of variables flying around, getting a grip on uncertainty can feel like trying to catch smoke with your bare hands.

Keep in mind that the kind of problem we’re dealing with isn’t solely about the number of dimensions. It’s mostly about trying to estimate something that we can’t see in full beforehand. For instance, consider the following banana distribution (shown below). How could we map this simple two-dimensional surface without computing it all at once?
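To give a feel for the question, here is a minimal sketch of a banana-shaped density in R. The exact form used for the figure isn’t given here, so the function below is an assumption: `x` is standard normal, and `y` is normal around a parabola in `x`. The name `banana_log_density` and the `curvature` parameter are made up for illustration.

```r
# Hypothetical banana-shaped log density (not necessarily the one in the figure)
banana_log_density <- function(x, y, curvature = 0.5) {
  # x is standard normal; y follows a narrow normal centered on a parabola in x
  dnorm(x, mean = 0, sd = 1, log = TRUE) +
    dnorm(y, mean = curvature * x^2, sd = 0.5, log = TRUE)
}

# Evaluating single points is cheap, which is all a sampler ever needs to do
at_mode <- banana_log_density(0, 0)
on_ridge <- banana_log_density(2, 2)    # sits on the curved ridge
off_ridge <- banana_log_density(2, -2)  # same x, but far from the ridge

# "Computing it all at once" means filling a whole grid
xs <- seq(-3, 3, length.out = 50)
ys <- seq(-2, 6, length.out = 50)
grid <- outer(xs, ys, banana_log_density)
print(dim(grid))
```

A 50 by 50 grid is fine in two dimensions, but the number of grid cells multiplies with every extra parameter, which is why evaluating one point at a time becomes the practical option.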


Welcome to Bayesically Speaking

Matías Castillo-Aguilar — Sat, 10 Jun 2023 03:00:00 GMT

Photo from Jon Tyson at Unsplash.

Hello stranger

First of all, welcome to the first post of “Bayesically Speaking” (which, in case you haven’t noticed, is a play on words between “Basically Speaking” and the (hopefully) well-known Bayes’ theorem). Although the website is offline at the time of writing this article, I find myself following the advice of all those people who encouraged me to trust my instinct and dare to do what I have always wanted: to transmit the thrill of using science as a tool to know and understand the reality that surrounds us and that we perceive in a limited way through our senses.

For years, my interests have revolved around understanding the world through the lens of statistics, particularly as a tool to better understand and quantify the relationships between the moving parts that make up many health outcomes. Another aspect that I find fascinating is how certain variables can go unnoticed when viewed separately, but when viewed together can have radically different behaviors.
