Formal Semantics

The formal semantics of a programming language is a precise, rigorous, and unambiguous mathematical definition of the behaviors of all programs in that language. Every possible action a program can take—state change, computation, or output—is fully and completely described in terms of mathematical axioms, leaving no room for misinterpretation or ambiguity.

Any language, be it natural or programming, has two key aspects:

Syntax that defines the legitimate structures of the language
Semantics that defines the meaning of the above legitimate structures

You probably encountered the concepts of syntax and semantics before when you studied natural language in grade school. You learned syntax as grammar and semantics as reading. Programming languages are no different. Syntax defines what constitutes a well-formed program, while semantics describes what that program does when executed.

Defining Formal Semantics

Even though languages are different, oftentimes, they are able to express the same ideas. So, why do we need the K framework? Aren’t all languages essentially the same?

The short answer: Not exactly.

The reality is that it’s sometimes easier to say a thing in one language, than in the other. For instance, let’s say you have two sentences in different languages, but with the same meaning:

“It's raining heavily outside” (English) - 8 syllables
“下大雨" (Xià dàyǔ)” (Chinese) - 3 syllables

Now, programming languages work similarly—but with one key difference. While human language semantics are typically static (a word or sentence describes a fixed snapshot in time), programming language semantics describe dynamic processes (a program executes and changes state over time).

Human language semantics: "The cat is on the mat." (Describes a state)
Programming language semantics: x = x + 1 (Describes a process)

When this statement x = x + 1 is executed, the current value stored in variable x is retrieved, 1 is added to it, and the result is then stored back into x, updating its value.

It's important to recognize that while we often use human language to explain what a piece of code does (as we did above for x = x + 1), this is still a description.

Entire books, sometimes thousands of pages long, are written in English (or other natural languages) to describe the syntax and semantics of a programming language. These are called the “specifications” of a language.

The crucial difference with the K framework is that it moves beyond these natural language descriptions. Instead of relying on potentially ambiguous human language, K uses a formal, mathematical approach to define the semantics of a programming language.

This means that the behavior of every construct, like the assignment x = x + 1, is defined through mathematical axioms and rules, leaving no room for ambiguity.

Embracing Formal Semantics

We are firm believers in correctness and thus in formal semantics. Programming language must have formal semantics. Period.

Why?

Because without a formal semantics, by definition, programs have no meaning that is precise, rigorous, and unambiguous. In that sense, we don't even know what the program is supposed to do, not to mention whether it is correct.

If you really care about correctness, embrace formal semantics. They are your friends, and not enemies. As John Reynolds famously put it,

Semanticists should be obstetricians, not coroners of programming languages.

PreviousOverview NextLLVM-K Backend

Last updated 3 months ago

Was this helpful?