Checking Invariants

In Chapter 1 we introduced the distinction between the static and dynamic views of a program. The compiler checks the static view: it reads your source code, analyses your types, and flags inconsistencies before the program runs. But a program that passes the type checker can still produce the wrong results. Types tell you what kind of value a function returns; they do not tell you whether that value is correct.

The properties a correct program must maintain beyond its types are called invariants. This chapter is about working with them: what an invariant is, how to identify the invariants in a problem, how to record them in a function's documentation so they can be detected later, and how to test whether the invariant holds.

In this course we will mainly focus on what are known as unit tests, as they test individual units of a program, usually at the function level.

What Is an Invariant?

An invariant is a property that must hold for a value or a computation to be meaningful. Typically, and in this class, invariants focus on properties that the type system cannot express or enforce.

We have already met an invariant. In the previous chapter, the Song type carried this comment:

typescript

type Song = {
  title: string;
  artist: string;
  durationSeconds: number; // must be positive
};

The comment is hinting at work the type cannot do: number includes -30, but real songs cannot have negative durations. Precisely, the invariant is: durationSeconds must be positive. Note that the type checker will accept an object even though it violates this invariant:

typescript

// passes the type checker; violates the invariant
const broken: Song = {
  title: "Song A",
  artist: "Artist 1",
  durationSeconds: -30
};

This object has the right shape, so the static type check passes. But its meaning is wrong, i.e. it violates the invariant. In this case, any code that trusts the invariant can behave incorrectly. Imagine a function summing the durations in a playlist: a negative duration would decrease a value one would expect to be monotonically increasing. When an invariant fails, a value can no longer be trusted by the operations built on it, even though the code may type check.

Invariants are everywhere once you look for them: durations are positive, percentage scores sit between 0 and 100, counts are whole numbers. None of these facts appear in the types number, number, number. They are constraints that exist in the space between what the type allows and what the problem requires.

Identifying Invariants

At the function level, invariants attach in two places: to a function's inputs and to its output. For the rest of this chapter we will work with a single running example:

As the campus library, I want late fees computed from how many days late a book is returned, with a short grace period and a capped maximum, so that patrons are charged fairly and predictably.

Concretely, the library's late-fee policy is: a book returned up to 2 days late incurs no fee; after that grace period, the fee is $0.50 for each additional day; and the total fee never exceeds $10.

A function computing the fee will have this signature:

typescript

lateFee(daysLate: number): number

A precondition is an invariant that must be true of the arguments when the function is called. The parameter type admits any number: -4, 3.7, 40000. But daysLate is a count of days, so the function is only meaningful when daysLate is a whole number and at least 0. That restriction is the precondition on daysLate.

A postcondition is an invariant about what the function guarantees about its result, assuming the precondition held. The return type says only number, but the policy promises more: the fee is never negative, and it never exceeds $10. Each of those guarantees is a postcondition.

To identify these in your own functions, you need to examine the gap between the type you have included in a signature and the type's meaning:

Identifying preconditions: For each parameter, ask: of all the values this type allows, which are actually meaningful? Any restriction you state is a precondition. Look for ranges, wholeness, non-empty strings, and relationships between parameters (for example, min <= max).
Identifying postconditions: For the result, ask: what can the caller rely on beyond the return type? Any guarantee you state is a postcondition.

A useful invariant statement has three qualities. It is precise: terms must be backed by definitions; words like "valid" or "sensible" without qualification are not valuable. It is testable: you can programmatically validate whether the invariant is true. And it is operational: it is strong enough that an implementation can actually rely on it.

For example, consider the invariant stated as: daysLate is reasonable. This is not precise, testable, or operational: it cannot be checked or relied upon.

In contrast, the invariant daysLate is a whole number and daysLate >= 0 can be turned directly into tests.

Documenting Invariants

Unlike types, which the compiler's type-checker checks, the compiler does not know about, nor check the invariants that restrict the values in your code. The only way a caller, a test author, or a future maintainer can detect invariants later is if they are written down where the function lives. That is, in its documentation.

We record invariants in the function's doc comment, alongside its purpose. Doc comments precede function declarations, and are formatted within /** <text comments> */. Details relevant to the @param elements passed to a function and the @return value are also included. For lateFee, the full documented function is:

typescript

/**
 * Computes the fee (in dollars) for a library book returned
 * daysLate days after its due date.
 *
 * The first 2 days are a grace period: no fee is charged.
 * After the grace period, the fee is $0.50 for each additional
 * day. The total fee never exceeds $10.
 *
 * Precondition: daysLate is a whole number and daysLate >= 0.
 *
 * @param {number} daysLate the number of days past the due date
 * @returns {number} the fee in dollars, between 0 and 10
 */
function lateFee(daysLate: number): number

Function Doc Comments

In TypeScript, // comments out the rest of a line. Anything between /* and */ is also a comment, and these comments can span multile lines.

For function doc comments in this course, we'll use syntax that's consistent with JSDoc:

typescript

/**
 * Here you put a summary of the function foo
 *
 * Precondition: list any preconditions
 * Postcondition: list any precondition
 * 
 * @param {typeofParam1} param1Name a description of param1Name's purpose
 * @param {typeofParam2} param2Name a description of param2Name's purpose
 * @returns {typeofReturn} describe what the return value expresses
 */
 function foo(param1Name: typeofParam1, param2Name: typeofParam2): typeofReturn

The Precondition: line restricts daysLate to the meaningful subset of number, and the clause "the total fee never exceeds $10" is a postcondition on the result.

Together, a function's documented preconditions and postconditions are often called its contract: the caller promises the preconditions, and the function promises the postconditions in return. Writing the contract down is not bureaucracy; it is what makes the invariants detectable. The doc comment is where a test author will look to decide what to check, and as we will see below, every clause of a well-written contract becomes a test.

Invariants

You wrote invariants in CPSC 110 too; they lived in your data definitions and signatures. A signature using Natural instead of Number was a precondition (whole and non-negative): the daysLate precondition above is exactly Natural. Likewise, an interval data definition like:

racket

; Fee is Number[0, 10]
; interp. a late fee in dollars

was an invariant statement: the type is Number, and the meaningful subset is 0 to 10. TypeScript's types are checked, but they cannot express intervals, so these statements move into the function's doc comment instead.

Checking Invariants With Tests

Tests are commonly kept separate from the code they validate. In all of the code we look at in this course, in line with common best practice, production code is stored in the src/ directory and all tests are stored in the test/ directory. The test/ directory can contain any number of test files, often in 1:1 correspondence with the files being tested in src/.

Within each test file is a number of individual test cases. Each test case has a name and a body. The name describes what the test is checking; the body contains one or more assertions. The checkExpect call we have been using in this course is an example of an assertion.

In the contract above, the late fee grace period is two days long. A concrete test case that checks this, by ensuring that lateFee(2) returns 0, looks like:

typescript

test("no fee at the grace period boundary", () => {
    checkExpect(lateFee(2), 0);
});

Assertions are the core of any test case: they validate that a dynamic behaviour emits the expected output for a given input. The checkExpect assertion takes two arguments: an expression to evaluate, and the expected result. If the two values are equal, the test passes silently. If they differ, the framework reports what was expected and what was actually produced, pointing you to the failing test by name.

Tests vs check-expect

ISL used check-expect as a standalone expression at the top level of a file. TypeScript's test wrapper is a small change in form: it names the test and groups related checks together. The underlying idea is the same: write down what you expect and let the framework compare.

racket

(check-expect (late-fee 2) 0)

Running Tests

test, checkExpect, and checkError are provided by the course toolkit; each test file imports them at the top of the file with:

typescript

import { test, checkExpect, checkError } from "@ubccpsc/210-toolkit";

To run the tests, you can either open the testing feature within your IDE (we will demo this in class), or open the terminal view within your IDE (also an in-class demo) and execute pnpm test. The terminal is a text-based interface where you type commands to direct your computer to perform tasks for you, where the input and output are textual.

When executed by either your IDE or your terminal command, the test framework executes every test case it can find in the test/ directory. Passing test cases are printed in green; failing test cases are printed in red, along with what was expected and what was actually returned.

The Testing Process

So far we have treated tests as something you write for code that already exists. When you are learning, it is strongly recommended that you write the tests first. Writing tests first forces you to think about the expected behaviours of the code under test, that is the code your test case is validating, before you spend time implementing it.

Having a precise set of input/output pairs is extremely helpful when you are implementing the code. Before writing the implementation you can execute your tests to confirm they fail; once the implementation has been correctly created, the tests should pass. Confirming that a test fails first is what makes its eventual pass a meaningful signal. A test that passes even when you haven't implemented the function is meaningless.

For lateFee we are already in a position to do this. We have not written a line of the implementation, but the contract we documented above gives us everything we need: each clause from the function documentation becomes a test.

typescript

test("no fee during the grace period", () => {
    checkExpect(lateFee(0), 0);
    checkExpect(lateFee(2), 0);
});

test("fee accrues for each day after the grace period", () => {
    checkExpect(lateFee(3), 0.50);
    checkExpect(lateFee(12), 5.00);
});

test("fee never exceeds the maximum", () => {
    checkExpect(lateFee(30), 10.00);
});

The precondition also guides us towards situations that may not result in a valid output. Since the precondition says daysLate >= 0, what happens if we pass -5 is undefined: the caller has broken their half of the bargain, and the function promises nothing in return. We'll return to what should happen when a precondition is violated anyway, and how to test for such erroneous behaviours, at the end of this chapter.

To run these tests, lateFee must at least exist; otherwise the compiler will refuse to execute the program at all. So we begin with a stub: a function with the right signature that returns a clearly wrong value.

typescript

function lateFee(daysLate: number): number {
    return -1;  // stub
}

We chose -1 deliberately. A fee is never negative, so every test is guaranteed to fail against the stub. (Had the stub returned 0, the grace-period test would have passed before we wrote any real code.) Running the suite now shows three failing tests. This step is important: a test that cannot fail checks nothing, and we have just confirmed that all of ours can fail when they are expected to. Running these tests results in:

✗ no fee during the grace period
      Expected: 0
      Received: -1
✗ fee accrues for each day after the grace period
      Expected: 0.5
      Received: -1
✗ fee never exceeds the maximum
      Expected: 10
      Received: -1

Now we implement the function. The precondition and postconditions give us an idea of which conditions to put in our if statement.

typescript

function lateFee(daysLate: number): number {
    if (daysLate <= 2) {
        return 0;
    }
    return 0.5 * (daysLate - 2);
}

And if we run the tests again:

✓ no fee during the grace period
✓ fee accrues for each day after the grace period
✗ fee never exceeds the maximum
      Expected: 10
      Received: 14

Two tests pass, but the third fails. The failure report tells us exactly where to look: lateFee(30) produced 14. Re-reading the specification reveals the problem: our implementation handles the grace period and the per-day charge, but we forgot the maximum entirely. The fix adds the missing behaviour:

typescript

function lateFee(daysLate: number): number {
    if (daysLate <= 2) {
        return 0;
    }
    const fee = 0.5 * (daysLate - 2);
    if (fee > 10) {
        return 10;
    }
    return fee;
}

All three tests now pass:

✓ no fee during the grace period
✓ fee accrues for each day after the grace period
✓ fee never exceeds the maximum

Ntice what did not change: the tests. They were correct all along, because they were written from the specification, and so the requirement our implementation forgot had nowhere to hide. If we had written our tests after the implementation, by reading our own code and checking that it does what it appears to do, we would probably not have thought to test the maximum: the first prototype of lateFee contained no hint that a maximum should exist. Tests written first keep the specification in charge; tests written after tend to mirror the code, mistakes included.

Deriving Tests from the Specification

We wrote the lateFee suite by instinct: read the specification, turn each clause into a test. That instinct served us well, but instinct alone does not tell you when a suite is complete enough. Two systematic techniques, equivalence class partitioning and boundary value analysis, turn that instinct into a method.

Equivalence Class Partitioning

The most direct way to choose test inputs is to divide the input space into equivalence classes: groups of inputs that the specification says should be handled the same way. You then choose at least one representative from each class.

The lateFee specification divides its input into three classes:

Class	Inputs	Behaviour
Grace period	0 to 2	Fee is 0
Accruing	3 to 21	Fee grows by $0.50 per day
Capped	22 and up	Fee is exactly $10

Note where the table begins: at 0, with no negative inputs anywhere. We got starting at 0 directly from daysLate >= 0 in the precondition. The invariant we wrote in the doc comment defines the input space the suite must cover; without it, we would not know whether lateFee(-5) was a missing class or a meaningless input.

Look back at the suite we wrote: it contains a representative from each class: lateFee(0) and lateFee(2) for the grace period, lateFee(3) and lateFee(12) for accrual, lateFee(30) for the cap. This suite caught our missing-maximum fault because it had a representative from the capped class, and that is precisely the class the implementation forgot.

Within a class, one representative is as informative as another. lateFee(12) and lateFee(15) both exercise the accruing class; testing both adds almost no confidence beyond testing one. Counting tests is therefore a poor measure of a suite: a suite of lateFee(5), lateFee(8), and lateFee(15) has three checks but covers only one class, and would have passed our buggy, cap-free implementation without complaint. What matters is covering the classes, not accumulating checks.

Boundary Value Analysis

Equivalence class partitioning identifies the regions to test. Boundary value analysis identifies where within those regions to look most carefully: at the edges, where one class meets the next.

Bugs cluster at boundaries, because boundaries are implemented with comparisons, and comparisons are easy to get wrong by one. lateFee has two boundaries: between days 2 and 3 (grace ends, accrual begins) and between days 21 and 22 (accrual reaches the maximum). A boundary-focused test checks the last input on each side:

typescript

test("fee changes exactly at the class boundaries", () => {
    checkExpect(lateFee(2), 0);      // last free day
    checkExpect(lateFee(3), 0.50);   // first charged day
    checkExpect(lateFee(21), 9.50);  // last accruing day
    checkExpect(lateFee(22), 10.00); // first day at the maximum
});

Consider a near-miss implementation in which the grace check was written daysLate <= 3 instead of daysLate <= 2.

typescript

// Near-miss implementation example
function lateFee(daysLate: number): number {
    if (daysLate <= 3) { // bug here
        return 0;
    }
    const fee = 0.5 * (daysLate - 2);
    if (fee > 10) {
        return 10;
    }
    return fee;
}

This fault is visible at exactly one input: lateFee(3) returns 0 instead of 0.50. Every other value in the entire domain, including a mid-class representative like lateFee(12), behaves correctly.

Our original suite does catch this fault, but only by luck: we happened to choose the boundary value 3 as a representative of the accruing class. Had we chosen 4 and 12 instead, every test we wrote would have passed.

This example is the essence of boundary value analysis: off-by-one faults are often invisible everywhere except at a single input value, so those values must be in the suite by design rather than by chance.

The whole input space, drawn as a line: three equivalence classes, separated by the two boundaries the suite must pin down.

svgbob Diagram — Figure 03.01: The three equivalence classes for daysLate.

Expected and Unexpected Errors

Not all failures are alike. Think about a bank account: an account whose balance is negative is in a state the system should never allow, so if one is ever observed, the program itself is broken. But a customer trying to withdraw more than their balance is not unusual at all; it is a normal interaction the design must anticipate.

The first is an unexpected error: an invariant has been violated, and no further computation on that data can be trusted. The second is an expected error: an unsuccessful but entirely foreseeable outcome that belongs in the function's contract. The two kinds are handled differently, and tested differently.

To see both kinds in one place, we extend the library example. The library allows each book loan to be renewed at most twice:

typescript

type Loan = {
  title: string;
  // invariant: a whole number, 0 <= renewalsRemaining <= 2
  renewalsRemaining: number;
};

Trying to renew a loan that has no renewals remaining is an expected error: it will happen at the front desk every day, and the contract should say exactly what the caller gets.

How do we encode an expected error? We could encode the result as a null value: but null is not descriptive, and null is an overloaded concept in many languages. We could also model this expected error by returning -1 (since that would never be a valid lateFee value anyways). But, this would open clients to simple unexpected math errors summing lateFee calls, if they were not careful and did not check if the return value is -1 in their code.

So that we can be clear about the failure, and rely on the typechecker to check whether errors are correctly handled, we introduce a result type:

typescript

type Result<T, E> =
  | { ok: true, value: T }
  | { ok: false, error: E };

Result is generic over two type parameters: T is the type of a successful value, and E is the type of the error. This is the same tagged-union idea from the previous chapter, with ok as the discriminator: a caller checks ok to learn whether it received a value or an error. We will express expected errors with an error result type.

By contrast: a Loan whose renewalsRemaining is -1, is an unexpected error. No sequence of correct operations can produce it, so if renewalsRemaining is -1, something else has already gone wrong.

We detect unexpected errors and signal them to our program using the assert function. These calls look like assert(<condition>, <description>). When an assert fails, the program is immediately terminated with the provided description.

assert

Calling the assert function:

typescript

assert(<condition>, <description>)

evaluates <condition>. If <condition> is true, the program continues to execute as if assert was not there. If <condition> is false, the program will terminate and print out <description>.

A tricky thing about asserts: the <condition> describes what should hold, while <description>, which is only printed out when <condition> fails, usually describes what did not hold.

assert is not part of the TypeScript language itself. In this course we use the standard assert function provided by the Node runtime:

typescript

import assert from "node:assert/strict";

Many TypeScript/JavaScript frameworks provide their own assert-like functions as well, and any of them serves the same role. By the end of Part 2, you'll know how to implement assert yourself, so you'll be able to use this concept in whatever code you write.

The presence of assertions in the implementation like this can make the code much easier to write and debug, because your implementation can trust that the invariants are valid for the remainder of the function. This helps reduce defensive checks you might otherwise need to make in your code. assert also communicates to other developers that these are checks for valid input, rather than checks part of the core program logic.

typescript

/**
 * Renews a loan, consuming one renewal.
 *
 * Precondition: loan satisfies the Loan invariant.
 * Postcondition: if any renewals remain, returns ok: true with a new
 * Loan with one fewer renewal remaining; otherwise returns ok: false
 * with an explanatory error.
 *
 * @param {Loan} loan the loan to renew
 * @returns {Result<Loan, string>} the renewed Loan on success, or an
 * error explaining why the loan could not be renewed
 */
function renew(loan: Loan): Result<Loan, string> {
    assert(loan.renewalsRemaining >= 0, "Loan invariant violated: negative renewals");
    assert(loan.renewalsRemaining <= 2, "Loan invariant violated: too many renewals");

    if (loan.renewalsRemaining === 0) {
        // expected: running out of renewals is a normal outcome
        return { ok: false, error: "No further loan renewals available" };
    }
    return {
        ok: true,
        value: {
            title: loan.title,
            renewalsRemaining: loan.renewalsRemaining - 1
        }
    };
}

Note where the assert calls live: unlike checkExpect, which sits in test/ and probes chosen inputs from the outside, assert sits in the source code in src/ and is evaluated on every execution of the function, whoever the caller is. Halting may seem drastic, but it is the right response to an impossible state.

At the start of this chapter, we saw that operations built on a value whose invariant has failed quietly produce nonsense. For instance, lateFee(5.5) returns a value, even though late fees are only defined as whole numbers. An assertion stops the program at the first sign of corruption, before the nonsense can spread or be written somewhere permanent. This is the behaviour we deferred earlier in the chapter: when a caller violates a precondition, an assert is how the function refuses to continue.

To write tests for errors, we must be clear about the difference between unexpected and expected errors. The expected error is a documented outcome: the postcondition names the exact value the caller receives (an ok: false result carrying the reason), so we test it with checkExpect, the same way we test every other clause of the contract:

typescript

test("renewal succeeds while renewals remain", () => {
    const fresh: Loan = { title: "Clean Code", renewalsRemaining: 2 };
    checkExpect(renew(fresh), { ok: true, value: { title: "Clean Code", renewalsRemaining: 1 } });
});

test("renewal is refused when no renewals remain", () => {
    const exhausted: Loan = { title: "Clean Code", renewalsRemaining: 0 };
    checkExpect(renew(exhausted), { ok: false, error: "No further loan renewals available" });
});

The unexpected error has no value to compare against, because the correct behaviour is to not produce a value at all. For this we use checkError, which runs the function it is given and passes only if an error occurs; if the call completes normally, the test fails. An optional second argument states the error message we expect the failure to carry:

typescript

test("renew halts on a loan that violates the non-negative invariant", () => {
    const corrupted: Loan = { title: "Clean Code", renewalsRemaining: -1 };
    checkError(() => renew(corrupted), "Loan invariant violated: negative renewals");
});

checkError

typescript

checkError(() => <expression>, <message>);

the () => <expression> is an anonymous function. It serves as a wrapper, the same one test itself uses. This is because checkError takes a function as its first argument. It is convenient to write this simply with an anonymous function:

typescript

checkError(() => functionUnderTest(arg1, arg2), <message>);

checkError then executes its first function argument and checks whether an error occurs during execution. If it does, checkError passes.

For example, in the above,

typescript

checkError(() => renew(corrupted), "Loan invariant violated: negative renewals");

checkError calls renew(corrupted) and checks whether and error occurs.

By contrast, if we were to try and write something like:

typescript

checkExpect(renew(corrupted), "Loan invariant violated: negative renewals")

renew(corrupted) would execute, and fail, before checkExpect is called, halting the entire execution of the test suite.

In short:

Expected errors should be tested analogously to how a user would interact with a function, which means we should use checkExpect: a refused renewal is not a malfunction but a specified result, and the contract tells you exactly what value to expect.

Unexpected errors though are almost always the result of programming errors, which means validating them with checkError is more appropriate, since you're ensuring the program is refusing to process erroneous requests.

As a rule of thumb, if the specification describes the outcome, check the outcome; if the outcome should be impossible, check that the program halts.

Triangulating Quality: Type Checking, Testing, and Assertions

Types give us structure, but tests give us confidence. The type checker and the test suite operate at different times. The type checker works statically on the source code, ruling out whole categories of invalid calls before the program runs. Tests work dynamically, verifying specific behaviours by actually executing the function. They are complementary approaches: a program that passes every type check can still return the wrong value for a given input. But, a program that passes all its tests may still fail on an input the test suite did not evaluate. The combination is what gives confidence: types narrow the space of programs that can even be written, and tests verify that the program you wrote does what you intended.

Documented invariants are the bridge between the two. The preconditions and postconditions in a function's doc comment record exactly the part of the specification the compiler cannot see, and they are exactly what the tests should check. Assertions add a third layer of protection: where types check structure before the program runs and tests probe chosen inputs from the outside, assertions watch the invariants from inside the implementation, on every execution.

An invariant that is written down can be turned into a test suite and into assertions; an invariant that lives only in someone's head cannot be checked by anything.

Checking Invariants ​

What Is an Invariant? ​

Identifying Invariants ​

Documenting Invariants ​

Checking Invariants With Tests ​

The Testing Process ​

Deriving Tests from the Specification ​

Equivalence Class Partitioning ​

Boundary Value Analysis ​

Expected and Unexpected Errors ​

Triangulating Quality: Type Checking, Testing, and Assertions ​

Checking Invariants

What Is an Invariant?

Identifying Invariants

Documenting Invariants

Checking Invariants With Tests

The Testing Process

Deriving Tests from the Specification

Equivalence Class Partitioning

Boundary Value Analysis

Expected and Unexpected Errors

Triangulating Quality: Type Checking, Testing, and Assertions