In the previous post, we explored how the compiler transforms IR into SSA—a representation where every variable is assigned exactly once. We saw how the compiler builds SSA using Values and Blocks, then runs 30+ optimization passes. We watched the lowering pass convert generic operations into architecture-specific instructions like AMD64ADDQ and ARM64ADD.
Now we’re at the final stretch. The compiler has optimized SSA with architecture-specific operations. All that’s left is to turn those operations into actual machine code bytes.
In the previous article, we explored how PostgreSQL’s planner chooses the optimal execution strategy. The planner produces an abstract plan tree—nodes like “Sequential Scan,” “Hash Join,” “Sort”—that describe what to do. Now the execution engine needs to actually do the work: read pages from disk, follow indexes, join tables, and produce results.
The executor uses the Volcano execution model—a beautifully simple pattern where every operation implements the same interface: “give me the next row.” A sort operation doesn’t care whether its input comes from a table scan or a join—it just asks for rows and sorts whatever it gets. This uniform approach allows arbitrarily complex queries to be built from simple, composable pieces.
In the previous post, we explored the IR—the compiler’s working format where devirtualization, inlining, and escape analysis happen. The IR optimizes your code at a high level, making smart decisions about which functions to inline and where values should live—on the heap or stack.
But the IR still looks a lot like your source code. It has variables that can be assigned multiple times, complex control flow with loops and conditionals, and operations that map closely to Go syntax.
In the previous article, we explored how PostgreSQL’s rewriter transforms queries—expanding views, applying security policies, and executing custom rules. By the end of that phase, your query has been fully expanded and secured, ready for execution.
But here’s the million-dollar question: How should PostgreSQL actually execute your query?
Let me show you why this matters. Take this simple query:
In the previous posts, we’ve explored how the Go compiler processes your code: the scanner breaks it into tokens, the parser builds an Abstract Syntax Tree, the type checker validates everything, and the Unified IR format serializes the type-checked AST into a compact binary representation.
Now we’re at a critical transformation point. The compiler takes that Unified IR—whether it was just serialized from your code or loaded from a cached archive file—and deserializes it directly into IR nodes. This is where your source code truly becomes the compiler’s working format.
In the previous article, we explored how PostgreSQL transforms SQL text into a validated Query tree through parsing and semantic analysis. By the end of that journey, PostgreSQL knows that your tables exist, your columns are valid, your types match up, and your query makes sense.
But before the planner can figure out how to execute your query, there’s one more critical transformation step: query rewriting.
Let me show you why this matters. When you write this simple query:
In the previous post, we explored how the Go compiler’s type checker analyzes your code. We saw how it resolves identifiers, checks type compatibility, and ensures your program is semantically correct.
Now that we have a fully type-checked AST, the next logical step would be to generate the compiler’s Intermediate Representation (IR)—the form it uses for optimization and code generation. But here’s something interesting: the Go compiler doesn’t immediately transform the AST into IR. Instead, it takes what might seem like a detour—it serializes the type-checked AST into a binary format, then deserializes it back into IR nodes.
In the previous article, we explored how PostgreSQL establishes connections and communicates using its wire protocol. Once your connection is established and the backend process is ready, you can finally send queries. But when PostgreSQL receives your SQL, it’s just a string of text—the database can’t execute text directly.
Let me show you what happens when PostgreSQL receives this query:
SELECT name FROM users WHERE id = 42;
PostgreSQL doesn’t see this as a command yet. It sees characters: S, E, L, E, C, T, and so on. The journey from this raw text to something PostgreSQL can execute involves two major transformations: parsing (understanding structure) and semantic analysis (adding meaning).
In the previous posts, we explored the scanner—which converts source code into tokens—and the parser—which takes those tokens and builds an Abstract Syntax Tree.
In future posts, I’ll cover the Intermediate Representation (IR)—how the compiler transforms the AST into an intermediate lower-level form. But before we can get there, we need to talk about two crucial intermediate steps: type checking (this post) and the Unified IR (which I’ll cover in a separate post soon).
In the previous article, we explored the complete journey a SQL query takes through PostgreSQL—from parsing to execution. But before any of that can happen, your application needs to establish a connection with the database.
This might seem like a simple handshake, but there’s actually a sophisticated process happening behind the scenes—involving process management, authentication, and a binary protocol for efficient communication.
Understanding how PostgreSQL handles connections helps explain why connection pooling matters, how to troubleshoot connectivity issues, and why PostgreSQL’s architecture differs from thread-based databases. Let’s trace what happens when your application connects to PostgreSQL.
In the previous blog post, we explored the scanner—the component that converts your source code from a stream of characters into a stream of tokens.
Now we’re ready for the next step: the parser.
Here’s the challenge the parser solves: right now, we have a flat list of tokens with no relationships between them. The scanner gave us package, main, {, fmt, ., Println… but it has no idea that Println belongs to the fmt package, or that the entire fmt.Println("Hello world") statement lives inside the main function.
Ever wonder what happens when you type SELECT * FROM users WHERE id = 42; and hit Enter? That simple query triggers a fascinating journey through PostgreSQL’s internals—a complex series of operations involving multiple processes, sophisticated memory management, and decades of optimization research.
This is the first article in a series where we’ll explore PostgreSQL’s query execution in depth. In this overview, I’ll walk you through the complete journey from SQL text to results, giving you the roadmap. Then, in subsequent articles, we’ll dive deep into each component—the parser, analyzer, rewriter, planner, and executor—exploring the details of how each one works.
This is part of a series where I’ll walk you through the entire Go compiler, covering each phase from source code to executable. If you’ve ever wondered what happens when you run go build, you’re in the right place.
Note: This article is based on Go 1.25.3. The compiler internals may change in future versions, but the core concepts will likely remain the same.
I’m going to use the simplest example possible to guide us through the process—a classic “hello world” program:
Welcome! I’m thrilled to finally launch this project—something I’ve been thinking about for almost a decade.
What This Is All About
For over 10 years, I’ve been giving talks at conferences about how things work under the hood. I started in the Python community, exploring topics like the object model, garbage collection, and the CPython interpreter internals. Later, I expanded into other communities—Go, PostgreSQL, and beyond—always with the same goal: making complex internals approachable.