Mutation and Side Effects

In Chapter 5, you may have noticed that our arrays never actually changed. Every map and filter produced a new array, every reduce produced a new value, and day itself came through every example unchanged. That is also true of every program in this course so far, and of every program you wrote in CPSC 110: values were created, used, and combined into new values, but an existing value was never modified.

This chapter introduces the ability to change existing values, called mutation. The syntax that enables mutation is short (most of it is a single = sign) but this small syntax change has huge consequences for how you think about software.

Mutation introduces the dimension of time into our programs: the answer to "what does this variable hold?" stops being something we can read directly from the source code and becomes a feature of a particular instant in time during the program's execution. No longer can you reason about functions by replacing variable names with values, as you did in mathematics courses. This requires a shift in how we view our programs: we will need to read the static text and simulate the effect of time on the code's behaviour. We'll step through this thought process over the course of this chapter.

Reassignment

So far, every variable we have declared has been using the const keyword. const guarantees that the value associated with the name will remain unchanged for the duration of a program. TypeScript provides a second way to declare a variable: let, which allows variables to be reassigned zero or more times as the program runs.

typescript

const absZero: number = -273;
absZero = -273.15;              // error: absZero is defined const and cannot be reassigned
let temperature: number = -4;           // temperature holds -4 after this line executes
temperature = 1;                // temperature now holds 1 after this line executes
temperature = temperature + 2;  // temperature now holds 3 after this line executes

Declaring Variables with let

The statement

typescript

let x: T = e

declares a variable x of type T and initializes it to the value that expression e evaluates to. Variables declared with let can be reassigned to different values later. But as with const, you cannot use let to declare the same variable multiple times.

A habit to build: declare everything with const. When a value turns out to need changing, the compiler will tell you so (it refuses to compile a reassignment to a const), and at that moment you make a deliberate decision to change that one declaration to let. A const tells every reader "this value is settled"; the fewer lets a program contains, the less state there is to trace.

Reassignment with =

The statement

typescript

x = <expression>

evaluates <expression>, then assigns that value to the variable named x. For this to pass the compiler, x must have been previously declared by a let statement.

Every declaration you have written in this course uses = to store a first value into a brand-new name, an operation called assignment. The last two lines, where = stores a value into a name that already exists, is different as the old value temperature used to hold is replaced. This is reassignment, and it is only permitted for variables declared with let.

A reassignment is performed in two steps: first the right-hand side is evaluated, using the values the variables hold right now; then the result is stored into the name on the left, replacing whatever it held. This makes the = operator very different than the equals sign of mathematics. The last line above, temperature = temperature + 2;, makes no sense as a math equation (no number equals itself plus two). But, as an instruction it is perfectly clear: take the value temperature currently holds (1), add 2, and store the result (3) back into temperature

Reassignment is a statement, like if and return from the first chapter: it produces no value, it performs an action. And because each reassignment replaces a value, the order of statements now matters in a way it never did before:

typescript

let x: number = 1;
x = x * 2;
x = x + 3;
// x holds 5; if the two reassignments were swapped, x would hold 8

The state of a program is the value every variable holds at a particular instant during execution.

Before mutation, a program had no state worth describing: a name meant one value, forever. With mutation, understanding a program means tracing its state over time: running the program in your head, statement by statement, the way we traced temperature above. When a program with mutation surprises you, the cause is almost always a difference between the state you thought the program was in and the state it was actually in.

State Gives Loops a Memory

The arrays chapter promised that loops have a second strength we were not ready for: values that change as the loop runs. Here is a problem that needs it.

As a weather forecaster, I want to find the longest unbroken stretch of below-freezing hours in a day, so that I can report the severity of overnight cold snaps.

No single map, filter, or find computes this, because the answer depends on runs of consecutive elements: the computation has to remember how long the current cold streak is, and reset that memory every time the temperature rises above freezing.

Here's a solution that uses mutation to introduce state that keeps track of the current freezing streak, and the longest freezing streak seen:

typescript

/**
 * Computes the length of the longest run of consecutive
 * below-freezing readings in a day.
 *
 * @param {Reading[]} day the readings to examine, in hour order
 * @returns {number} the length of the longest freezing streak
 */
function longestFreezingStreak(day: Reading[]): number {
    let current: number = 0; // consecutive freezing readings ending here
    let longest: number = 0; // best streak seen so far

    for (const reading of day) {
        if (reading.tempCelsius < 0) {
            current = current + 1;
            if (current > longest) {
                longest = current;
            }
        } else {
            current = 0; // the streak is broken
        }
    }
    return longest;
}

typescript

test("longest freezing streak spans the early morning", () => {
    checkExpect(longestFreezingStreak(day), 2);
});

The counters current and longest are the loop's state: values that survive from one element to the next and change as the loop runs. To see the state evolve, trace the loop over our day of readings, whose temperatures are -4, -1, 3, 8, 2, -2:

Reading	Freezing?	`current` after	`longest` after
Hour 6, -4°	Yes	1	1
Hour 9, -1°	Yes	2	2
Hour 12, 3°	No	0	2
Hour 15, 8°	No	0	2
Hour 18, 2°	No	0	2
Hour 21, -2°	Yes	1	2

The morning streak of two readings is recorded in longest, survives the warm afternoon, and is not beaten by the single freezing reading in the evening. A trace table like this one is a standard tool for understanding stateful code, when programs are small enough. Writing one out by hand is a reliable way to debug: it makes the program's state visible.

Changing Objects and Arrays

Reassigning a variable is one kind of mutation. The second kind is changing the contents of an object or array, and it needs no let at all:

typescript

const reading: Reading = { hour: 6, tempCelsius: -4 };

reading.tempCelsius = -3;                  // allowed: the object's contents changed
reading = { hour: 6, tempCelsius: -3 };    // compile error: reading is a const

That this is allowed can feel surprising. const froze the variable (the name reading will refer to this object forever) but it says nothing about the object itself. In TypeScript, a raw object value's properties remain assignable. The distinction between a name and the thing it refers to is the subject of the next section; for now, notice that the two lines above really do different things, and the compiler treats them differently.

Arrays are objects, and they mutate the same ways. Elements can be replaced through their index, and the classic array mutations are the pair that grow and shrink the array itself: push adds an element to the end, and pop removes the last element and returns it.

typescript

day.push({ hour: 22, tempCelsius: -3 });   // adds a seventh reading to the end
const removed: Reading = day.pop();                 // removes that reading; day has six again
day[0] = { hour: 5, tempCelsius: -6 };     // replaces the first element entirely
day[1].tempCelsius = -2;                   // reaches into the second element and changes it

The complexity mutation brings is justified by what it models: in the real world things change, and to model them effectively, programs need to be able to change too. Our weather station does not receive its day of readings all at once: a new reading arrives every hour, and push is exactly how the day grows. Rebuilding the entire array to add one element would say something false about the problem, and at scale it is also wasteful: updating one reading in a year of data by copying the other thousands of readings does more measurable work than updating in place.

Copies and References

To understand what a variable holds, we unfortunately need to deal with the fact that programming languages sometimes make design decisions for the sake of performance. While we do not like to think about performance too much while we are making our first initial systems, performance is the reason this section, and the confusion it imparts, exists.

To predict which changes are visible where, we need a precise picture of what a variable actually holds. There are two cases. In the first case, a variable holding a primitive value (a number, string, or boolean) holds the value itself. Assigning it to another variable copies the value, and from then on the two variables are entirely independent:

typescript

let a: number = 5;
let b: number = a;          // b receives its own copy of 5
b = 6;
checkExpect(a, 5);  // a is unaffected

In the second case, a variable holding an object or array does not hold the object itself. It holds a reference: a value that says where the object is. (You can think of a reference as holding the address of the actual object value.)

Storing references is the performance decision we referenced at the top of this section. In a simpler world, assigning an object would copy it exactly the way assigning a number does, and every variable would be independent of every other. The language declines to do this because of what copying costs.

A primitive has a small, fixed size, so copying one is essentially free. An object has no size limit: a single Reading is small, but an array holding a year of readings, or an object whose properties are themselves objects, can occupy enormous amounts of memory, and the language cannot know at a given = sign whether the copy would be cheap or extremely expensive.

So, objects are never copied on assignment. What is copied instead is the reference, which stays the same small size no matter how large the object it leads to. This efficiency has a consequence: assigning an object to another variable copies the reference, not the object, so both variables now refer to the same object:

typescript

const r: Reading = { hour: 6, tempCelsius: -4 };
const t: Reading = { hour: 6, tempCelsius: -4 };
const s: Reading = r;                     // s receives r's reference
s.tempCelsius = 0;
checkExpect(r.tempCelsius, 0);   // change visible to r and s

graphviz Diagram — Figure 06.01: variable names r and s reference the same object.

One way to think about this is in terms of boxes. A variable is a labelled box. For a primitive, the box contains the value. For an object, the box contains an arrow pointing to the object, which lives elsewhere. const s = r copies the arrow. There is still exactly one Reading; it simply has two arrows pointing at it, and a change made through either arrow is visible through both. Two variables referring to the same object are called aliases. Aliasing is the single most common source of mutation surprises: code changes an object through one name, and the change appears under another name somewhere else entirely.

References explain the const surprise from the previous section: const locks the box, not the object the arrow points to. The arrow cannot be redirected, but the object at the end of it remains as mutable as ever.

Reference Equality vs Value Equality

=== (strict equality, from Chapter 2) means different things for primitives and objects, and the difference is exactly the visibility distinction from this section.

For primitives, === compares values. Two numbers that happen to be equal are ===, whether or not they were declared together:

typescript

let x: number = 5;
let y: number = 5;
checkExpect(x === y, true);   // equal values

For objects, === compares identity: it asks whether two variables refer to the same object in memory, not whether their contents match (value).

typescript

const r: Reading = { hour: 6, tempCelsius: -4 };
const s: Reading = r;                              // s refers to r's object
const t: Reading = { hour: 6, tempCelsius: -4 };   // a separate object with equal contents

checkExpect(r === s, true);    // the same object
checkExpect(r === t, false);   // different objects, even though their contents are identical

This is the visibility rule restated as a comparison. Because r and s are the same object, a mutation through one is seen through the other; because t is a different object, it is untouched:

typescript

s.tempCelsius = 0; // mutate s
checkExpect(r.tempCelsius === 0, true);    // r sees the change made through s
checkExpect(t.tempCelsius === -4, true);   // t, a separate object, does not

So r === t being false is not a technicality. It is the runtime telling you that r and t are independent, and that changing one will never change the other.

What a Function Can and Cannot Change

(TODO: need to think about whether this is parameter/arguments. I think parameter is correct but I don't trust my 6:20pm brain.)

The copy-versus-reference distinction matters because calling a function performs assignment as we just studied: each argument is assigned to its parameter, copying boxes. Everything about what a function can change in its caller follows from that one fact.

There are three cases. We will walk through them slowly, because mutation through function parameter passing is where mutation most often defies expectations. Note that these rules differ in different programming languages---when encountering a new language, you should learn how exactly it passes parameters.

1. Passing a primitive: the function gets a copy. The parameter is a new box holding a copy of the value, so nothing the function does to it can affect the caller:

typescript

function bump(n: number): void {
    n = n + 1;          // changes only the function's own copy
}

let hour: number = 6;
bump(hour);
checkExpect(hour, 6);   // unchanged: bump accomplished nothing observable

This behaviour is called pass-by-value: the function receives the value, not the variable. bump compiles and runs without complaint, and does nothing at all.

2. Passing an object: the function gets a copy of the reference. The parameter is a new box, but it holds a copy of the arrow, and the arrow points at the caller's object. Mutation through the parameter changes the one object both arrows share, and the caller sees it:

typescript

/**
 * Corrects a reading from a sensor that is known to measure
 * offset degrees away from the true temperature.
 * Modifies the given reading in place.
 */
function calibrate(reading: Reading, offset: number): void {
    reading.tempCelsius = reading.tempCelsius + offset;
}

const morning: Reading = { hour: 6, tempCelsius: -4 };
calibrate(morning, 1);                  // the sensor reads one degree low
checkExpect(morning.tempCelsius, -3);   // the caller's object changed

This behaviour is commonly called pass-by-reference: the function is operating on the caller's object, not a private copy. The change calibrate makes is externally visible, and it outlives the call.

3. Reassigning an object parameter: still invisible. One tricky aspect of pass-by-reference comes up when we reassign a method's parameters. The only parameter holds a copy of the arrow. Re-assigning that arrow points the function's own box somewhere new, but leaves the original arrow unchanged. It is worth ensuring you fully understand this example, as many languages have these kinds of tricky semantics:

typescript

function reset(reading: Reading): void {
    reading = { hour: reading.hour, tempCelsius: 0 };  // redirects the local arrow only
}

const evening: Reading = { hour: 21, tempCelsius: -2 };
reset(evening);
checkExpect(evening.tempCelsius, -2);   // unchanged

Compare calibrate and reset carefully: one writes reading.tempCelsius = ..., the other writes reading = .... Mutating through a reference (reading.tempCelsius) changes the shared object and is visible to the caller. Reassigning the reference itself (reading) merely rebinds the function's local name and is invisible. The dot is the difference.

The three cases, summarised:

Argument passed	The parameter receives	Reassigning the parameter	Mutating the object it refers to
`number`, `string`, `boolean`	a copy of the value	invisible to the caller	n/a (primitives have no parts to change)
object or array	a copy of the reference	invisible to the caller	visible to the caller

The same distinction, drawn out:

ditaa Diagram — Figure 06.02: A primitive argument is copied; an object argument shares one object through a copied reference.

Scope: Where Names Live

Mutation makes it newly important to know exactly where each variable exists, because every variable that can change is something a reader must keep track of. Where a variable exists is called its scope.

The rule in TypeScript is block scope: a variable exists from its declaration to the end of the block (the { ... }) that encloses it, and outside that block the name simply is not there. The compiler enforces this statically:

typescript

function describe(reading: Reading): string {
    if (reading.tempCelsius < 0) {
        const label: string = "freezing";
        return label;        // fine: label is in scope here
    }
    return label;            // compile error: Cannot find name 'label'
}

Blocks nest, and the rule works one way: an inner block can use names declared in the blocks that enclose it, but never the reverse. longestFreezingStreak relies on this. Its loop body reads and reassigns current and longest, which are declared outside the loop in the function's own block; that is what lets their values survive from one iteration to the next. To see why their position matters, consider what happens if current is declared inside the loop body instead:

typescript

for (const reading of day) {
    let current: number = 0;             // a brand-new current for every element
    if (reading.tempCelsius < 0) {
        current = current + 1;   // always computes 0 + 1
    }
}                                // ...and current is gone again

Each pass through a loop body is a fresh copy of the block: this current is created holding 0, exists for one iteration, and is discarded at the closing brace, along with its value. A variable declared inside a block cannot remember anything across runs of that block.

Choosing where to declare a variable is thus important: it is choosing how long the state lives. The design rule of thumb is to declare each variable in the smallest block that still spans every use of it.

So, values do not escape their blocks. There are three exceptions, and we have already met all three:

The value is returned. longest the name dies when longestFreezingStreak ends, but its final value escapes through return into the caller's hands.
The value is assigned to a variable declared outside. The loop body's assignments to current and longest outlive each iteration precisely because those variables live in the enclosing block.
The block mutates an object that is visible outside. This one is the easiest to miss. Consider calibrating an entire day:

typescript

function calibrateDay(day: Reading[], offset: number): void {
    for (const reading of day) {
        reading.tempCelsius = reading.tempCelsius + offset;
    }
}

Every name in sight here is short-lived: reading is re-created each iteration, and the day parameter vanishes when the function returns. Yet every change survives, because the objects those names pointed at belong to the caller's array, which is still in scope outside the function.

Scope governs names; it does not govern objects. In TypeScript, an object lives as long as anything, anywhere, still refers to it. Mutations made to it through a short-lived name are permanent all the same. Block structure determines whether a variable name still exists; the arrows determine whether a change persists.

Side Effects

We now have a name for what calibrate and calibrateDay do. A side effect is any observable change a function makes besides returning a value: mutating an object its caller can see, reassigning a variable outside its own scope, or interacting with the world outside the program entirely (writing a file, printing output, sending a network request). A function with no side effects, one that only computes a value from its inputs, is called pure. Every function in this course before today was pure.

Side effects change what we must do as readers, as documenters, and as testers of code:

Reading. A pure function can be understood from its signature: Reading[] in, number out. A signature like calibrateDay's (void out!) says nothing about what the function is for; its entire purpose is the effect. You must read the body, or trust the documentation.
Documenting. Because the signature is silent, the documentation has to say what the function changes: notice the line Modifies the given reading in place in calibrate's comment. A mutating function whose documentation does not mention the mutation is a trap for every caller who reasonably assumes their arguments come back untouched.
Testing. A pure function is tested by checking its return value. A mutating function is tested by checking state: call it, then assert on the object afterwards.

typescript

test("calibrateDay shifts every reading by the offset", () => {
    const readings: Reading[] = [
        { hour: 6, tempCelsius: -4 },
        { hour: 9, tempCelsius: -1 }
    ];
    calibrateDay(readings, 1);
    checkExpect(readings[0].tempCelsius, -3);
    checkExpect(readings[1].tempCelsius, 0);
});

(TODO: I'm a bit confused by this paragraph... doesn't the closure thing save us from these nasty side effects of mutation? Again, too late in the day to be totally sure.)

There is one more consequence, and it is the one this part of the course has been building toward. The invariants chapters established a comfortable discipline: validate a value when it is constructed, and rely on the invariant afterwards. Mutation breaks the "afterwards". reading.hour = 99 is a perfectly legal statement that violates the Reading invariant long after construction, and aliasing means any part of the program holding a reference can do it, at any time, from anywhere. Under mutation, an invariant is no longer established once; it must be preserved by every operation that touches the data, forever. Keeping that promise when references can travel anywhere in the program requires controlling who is allowed to mutate at all, and that problem (restricting mutation to a trusted set of operations) is precisely the focus of part 2 of the course.

Until then, the working guidance falls back on a discipline-based approach:

Declare every variable with const. When a value turns out to genuinely need changing, change that one declaration to let, deliberately. Every let is a value your reader must trace through time.
Prefer the non-mutating operations (map, filter) when they fit; reach for mutation when the problem is genuinely about change, as the arriving readings and the streak counters were.
Keep mutable state in the smallest scope that works: a counter local to one function is easy to reason about; a mutable value visible to the whole program can be changed by the whole program.
Clearly document mutation when it happens: in a function's name, its documentation, and its tests.

Mutating the World

Mutation is worth the extra mental burden it induces: real programs model a changing world. Programs must now be understood by tracing values through time, and every reader needs to be able to answer a new set of questions precisely: is this a copy or a reference? Does this change escape this block, this function, this module? Who else holds an arrow to this object? The box-and-arrow model and the scope rules in this chapter are the tools for answering these questions.

Side effects add new complexity we have not encountered yet: effects that reach outside of specific functions, to other parts of the program, to files, databases, networks, and users. But since the point of programs is to do useful work for people, side effects are a fundamental part of real software systems. Side effects require careful thought and design to use effectively without making a program too hard to understand or brittle to evolve.

Exercise: Moving a Robot

Here we practise in-place mutation, references and aliasing on a small moving object.

As a game engine, I want a robot's position updated as it moves, so that the rest of the game can read its current location.

A robot is just a position:

typescript

type Robot = { x: number; y: number };

Write step(robot: Robot, dx: number, dy: number): void that moves the robot by adding dx to its x and dy to its y, changing the robot in place. Create a robot at { x: 0, y: 0 }, call step(robot, 1, 2), and use checkExpect to confirm the caller's robot now has an x of 1 and a y of 2.
Write teleport(robot: Robot, x: number, y: number): void that instead reassigns the parameter, with robot = { x: x, y: y }. Predict what the caller's robot looks like after teleport(robot, 9, 9), then confirm it with checkExpect. Why does step change the caller's robot while teleport does not?
Give a robot a second name with const other = robot. Call step(other, 3, 0), and use checkExpect to show that robot sees the move, because other is an alias for the same object. Then build a separate robot twin with the same coordinates, and use === to confirm that robot === other is true but robot === twin is false.
Write walk(robot: Robot, steps: number[]): void that uses a for of loop to apply each number in steps as an eastward move (one step(robot, s, 0) per element), so the position carries forward from one iteration to the next. Check the robot's final x against the sum of steps.

Mutation and Side Effects ​

Reassignment ​

State Gives Loops a Memory ​

Changing Objects and Arrays ​

Copies and References ​

What a Function Can and Cannot Change ​

Scope: Where Names Live ​

Side Effects ​

Mutating the World ​

Mutation and Side Effects

Reassignment

State Gives Loops a Memory

Changing Objects and Arrays

Copies and References

What a Function Can and Cannot Change

Scope: Where Names Live

Side Effects

Mutating the World