Posts

From SSA to Machine Code

In the previous post, we explored how the compiler transforms IR into SSA—a representation where every variable is assigned exactly once. We saw how the compiler builds SSA using Values and Blocks, then runs 30+ optimization passes. We watched the lowering pass convert generic operations into architecture-specific instructions like AMD64ADDQ and ARM64ADD.

Now we’re at the final stretch. The compiler has optimized SSA with architecture-specific operations. All that’s left is to turn those operations into actual machine code bytes.
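To make "machine code bytes" concrete, here is a minimal sketch of how a register-register AMD64ADDQ could be encoded. This is my own simplified illustration, not the compiler's actual obj package: the helper name and register constants are invented, and it only handles the eight low registers (real encoding also needs REX.R/REX.B bits for R8–R15).

```go
package main

import "fmt"

// encodeADDQ emits the bytes for a 64-bit register-register add,
// "ADDQ src, dst" in Go assembler syntax (dst += src):
// a REX.W prefix, the 0x01 opcode (ADD r/m64, r64), and a ModRM byte.
func encodeADDQ(src, dst byte) []byte {
	rexW := byte(0x48)                 // REX.W: 64-bit operand size
	opcode := byte(0x01)               // ADD r/m64, r64
	modrm := 0xC0 | src<<3 | dst       // mod=11 (register direct), reg=src, rm=dst
	return []byte{rexW, opcode, modrm}
}

func main() {
	const rax, rbx = 0, 3
	fmt.Printf("% x\n", encodeADDQ(rax, rbx)) // ADDQ AX, BX → 48 01 c3
}
```

Three bytes is all that remains of the SSA value by the end of the pipeline.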

The Execution Engine

In the previous article, we explored how PostgreSQL’s planner chooses the optimal execution strategy. The planner produces an abstract plan tree—nodes like “Sequential Scan,” “Hash Join,” “Sort”—that describe what to do. Now the execution engine needs to actually do the work: read pages from disk, follow indexes, join tables, and produce results.

The executor uses the Volcano execution model—a beautifully simple pattern where every operation implements the same interface: “give me the next row.” A sort operation doesn’t care whether its input comes from a table scan or a join—it just asks for rows and sorts whatever it gets. This uniform approach allows arbitrarily complex queries to be built from simple, composable pieces.

The SSA Phase

In the previous post, we explored the IR—the compiler’s working format where devirtualization, inlining, and escape analysis happen. The IR optimizes your code at a high level, making smart decisions about which functions to inline and where values should live—on the heap or stack.

But the IR still looks a lot like your source code. It has variables that can be assigned multiple times, complex control flow with loops and conditionals, and operations that map closely to Go syntax.
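For a concrete taste of the difference, consider a variable that is reassigned—something the IR permits but SSA, by definition, does not. The x1/x2 names in the comments are just the conventional way renamed SSA values are written:

```go
package main

import "fmt"

func main() {
	// In the IR, x is a single variable assigned twice, just like the source.
	x := 1    // SSA would create a fresh value here: x1 = 1
	x = x + 2 // ...and another here: x2 = x1 + 2 — each assigned exactly once
	fmt.Println(x)
}
```

Rewriting every reassignment into a fresh value is what makes SSA so convenient for the optimizer: each value has exactly one definition to reason about.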

The Query Planner

In the previous article, we explored how PostgreSQL’s rewriter transforms queries—expanding views, applying security policies, and executing custom rules. By the end of that phase, your query has been fully expanded and secured, ready for execution.

But here’s the million-dollar question: How should PostgreSQL actually execute your query?

Let me show you why this matters. Take this simple query:

SELECT c.first_name, c.last_name, r.rental_date
FROM customer c
JOIN rental r ON c.customer_id = r.customer_id
WHERE c.active = 1;

PostgreSQL could execute this in dozens of different ways.

The IR

In the previous posts, we’ve explored how the Go compiler processes your code: the scanner breaks it into tokens, the parser builds an Abstract Syntax Tree, the type checker validates everything, and the Unified IR format serializes the type-checked AST into a compact binary representation.

Now we’re at a critical transformation point. The compiler takes that Unified IR—whether it was just serialized from your code or loaded from a cached archive file—and deserializes it directly into IR nodes. This is where your source code truly becomes the compiler’s working format.

Query Rewriting

In the previous article, we explored how PostgreSQL transforms SQL text into a validated Query tree through parsing and semantic analysis. By the end of that journey, PostgreSQL knows that your tables exist, your columns are valid, your types match up, and your query makes sense.

But before the planner can figure out how to execute your query, there’s one more critical transformation step: query rewriting.

Let me show you why this matters, even for a simple query.

The Unified IR Format

In the previous post, we explored how the Go compiler’s type checker analyzes your code. We saw how it resolves identifiers, checks type compatibility, and ensures your program is semantically correct.

Now that we have a fully type-checked AST, the next logical step would be to generate the compiler’s Intermediate Representation (IR)—the form it uses for optimization and code generation. But here’s something interesting: the Go compiler doesn’t immediately transform the AST into IR. Instead, it takes what might seem like a detour—it serializes the type-checked AST into a binary format, then deserializes it back into IR nodes.

Parsing and Analysis

In the previous article, we explored how PostgreSQL establishes connections and communicates using its wire protocol. Once your connection is established and the backend process is ready, you can finally send queries. But when PostgreSQL receives your SQL, it’s just a string of text—the database can’t execute text directly.

Let me show you what happens when PostgreSQL receives this query:

SELECT name FROM users WHERE id = 42;

PostgreSQL doesn’t see this as a command yet. It sees characters: S, E, L, E, C, T, and so on. The journey from this raw text to something PostgreSQL can execute involves two major transformations: parsing (understanding structure) and semantic analysis (adding meaning).
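A toy scanner makes the first transformation concrete. This Go sketch is entirely my own—PostgreSQL's real scanner is generated by flex and handles quoting, comments, multi-character operators, and much more—but it shows characters becoming tokens:

```go
package main

import "fmt"

// tokenize splits a query into tokens: it only understands spaces and
// the single-character tokens '=' and ';'. A deliberately tiny sketch.
func tokenize(sql string) []string {
	var toks []string
	cur := ""
	flush := func() {
		if cur != "" {
			toks = append(toks, cur)
			cur = ""
		}
	}
	for _, c := range sql {
		switch c {
		case ' ':
			flush() // whitespace ends the current token
		case '=', ';':
			flush()
			toks = append(toks, string(c)) // operators are tokens themselves
		default:
			cur += string(c)
		}
	}
	flush()
	return toks
}

func main() {
	fmt.Println(tokenize("SELECT name FROM users WHERE id = 42;"))
}
```

Only after this step can the parser start asking what the tokens *mean* in relation to each other.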

The Type Checker

In the previous posts, we explored the scanner—which converts source code into tokens—and the parser—which takes those tokens and builds an Abstract Syntax Tree.

In future posts, I’ll cover the Intermediate Representation (IR)—how the compiler transforms the AST into an intermediate lower-level form. But before we can get there, we need to talk about two crucial intermediate steps: type checking (this post) and the Unified IR (which I’ll cover in a separate post soon).

Connections and Communication

In the previous article, we explored the complete journey a SQL query takes through PostgreSQL—from parsing to execution. But before any of that can happen, your application needs to establish a connection with the database.

This might seem like a simple handshake, but there’s actually a sophisticated process happening behind the scenes—involving process management, authentication, and a binary protocol for efficient communication.

Understanding how PostgreSQL handles connections helps explain why connection pooling matters, how to troubleshoot connectivity issues, and why PostgreSQL’s architecture differs from thread-based databases. Let’s trace what happens when your application connects to PostgreSQL.

The Parser

In the previous blog post, we explored the scanner—the component that converts your source code from a stream of characters into a stream of tokens.

Now we’re ready for the next step: the parser.

Here’s the challenge the parser solves: right now, we have a flat list of tokens with no relationships between them. The scanner gave us package, main, {, fmt, ., Println… but it has no idea that Println belongs to the fmt package, or that the entire fmt.Println("Hello world") statement lives inside the main function.
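A sketch shows the difference: after parsing, those same tokens hang off nested nodes instead of sitting in a flat list. The struct names here are purely illustrative—the real compiler's syntax package has far richer node types:

```go
package main

import "fmt"

// CallExpr records that Println belongs to fmt — a relationship the
// flat token stream could not express.
type CallExpr struct {
	Fun  string // e.g. "fmt.Println"
	Args []string
}

// FuncDecl nests the call *inside* the function that contains it.
type FuncDecl struct {
	Name string
	Body []CallExpr
}

func main() {
	decl := FuncDecl{
		Name: "main",
		Body: []CallExpr{{Fun: "fmt.Println", Args: []string{`"Hello world"`}}},
	}
	fmt.Printf("func %s contains a call to %s\n", decl.Name, decl.Body[0].Fun)
}
```

That containment—statements inside functions, arguments inside calls—is exactly the structure the parser's tree adds.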

Overview

Ever wonder what happens when you type SELECT * FROM users WHERE id = 42; and hit Enter? That simple query triggers a fascinating journey through PostgreSQL’s internals—a complex series of operations involving multiple processes, sophisticated memory management, and decades of optimization research.

This is the first article in a series where we’ll explore PostgreSQL’s query execution in depth. In this overview, I’ll walk you through the complete journey from SQL text to results, giving you the roadmap. Then, in subsequent articles, we’ll dive deep into each component—the parser, analyzer, rewriter, planner, and executor—exploring the details of how each one works.

The Scanner

This is part of a series where I’ll walk you through the entire Go compiler, covering each phase from source code to executable. If you’ve ever wondered what happens when you run go build, you’re in the right place.

Note: This article is based on Go 1.25.3. The compiler internals may change in future versions, but the core concepts will likely remain the same.

I’m going to use the simplest example possible to guide us through the process—a classic “hello world” program:

Welcome to Internals for Interns

Welcome! I’m thrilled to finally launch this project—something I’ve been thinking about for almost a decade.

What This Is All About

For over 10 years, I’ve been giving talks at conferences about how things work under the hood. I started in the Python community, exploring topics like the object model, garbage collection, and the CPython interpreter internals. Later, I expanded into other communities—Go, PostgreSQL, and beyond—always with the same goal: making complex internals approachable.