### 18` `State and Equality

In State, Change, and Testing, we introduced the notion of mutable
data. We also saw the impact it has on testing. Underlying testing is
some notion of equality: when we write a test in Pyret using
`is`

, we are implicitly making a statement about equality between
the two sides. Here we will examine equality in the presence of state
in more detail.

#### 18.1` `Boxes: A Canonical Mutable Structure

box consumes a value and creates a mutable box containing that value.

unbox-now consumes a box and returns the value contained in the box.

set-box-now consumes a box, a new value, and changes the box to contain the value. All subsequent unbox-nows of that box will now return the new value—

unless it is mutated again.

```
data Box<T>:
| box(ref v :: T)
end
fun unbox-now<T>(b :: Box<T>) -> T:
b!v
end
fun set-box-now<T>(b :: Box<T>, new-v :: T) -> Box<T>:
b!{v: new-v}
end
```

`b!v`

to extract the current value, and use
the naming convention of `-now`

to make clear these are stateful
operations, so the value now may not be the same as the value later.#### 18.2` `Mutation and Types

In terms of types, whenever we replace the value in a box, we want it to be type-consistent with what was previously there. Otherwise it would be very difficult to program against a box, because the type of its content would keep changing.

```
check:
n1 = box(1)
n2 = box(2)
set-box-now(n1, 3)
set-box-now(n2, 4)
unbox-now(n1) is 3
unbox-now(n2) is 4
end
```

`set-box-now(n1, "hi")`

, because that
would violate the type of `n1`

, which is `Box<Number>`

. We
could make this explicit by writing
`n1 :: Box<Number> = box(1)`

`n1`

being a box
of numbers does not preclude us from having a box of strings:
`n3 :: Box<String> = box("hello")`

This is a general rule we want to follow with mutable data: the new
value must be the same type as the old value. This gives programs a
consistent interface to program against. For instance, above, we know
that we can always perform numeric operations against the value
extracted from `n1`

—

#### 18.3` `Mutation and Equality

We’ve already seen [Re-Examining Equality] that equality is subtle. It’s about to become much subtler with the introduction of mutation!

b1 = box(7) |

b2 = box(7) |

b3 = b1 |

`b1`

and `b3`

are referring to the same
box, while `b2`

is referring to a different one. We can see this
from a memory diagram:Directory

`b1`

→ 1001`b2`

→ 1002`b3`

→ 1001

Heap

1001:

`box(7)`

1002:

`box(7)`

```
check:
b1 is-not%(identical) b2
b1 is%(identical) b3
b2 is-not%(identical) b3
end
```

`b1`

and `b3`

are aliases for the same box,
but neither is an alias to the box referred to by `b2`

. Since
`identical`

is transitive, it follows from the first two
tests that the third test must also pass (and thankfully, Pyret
confirms this for us!).Now, you might wonder why we have used `identical`

and not
`equal-always`

[Notations for Equality], i.e., plain old
`is`

.

Do Now!

Let’s try that:`check: b1 is b3 b1 is b2 end`

What do you see?

`b1 is b3`

,
passes. However, the second, `b1 is b2`

, fails! And the name
suggests why: the two are not guaranteed to always be
equal. That is, suppose we were to modify the box referred to by
`b1`

:
`set-box-now(b1, 8)`

`b1`

and `b3`

are aliases, their values change in lock-step
(more accurately, there is only one value—```
check:
unbox-now(b1) is-not unbox-now(b2)
unbox-now(b1) is unbox-now(b3)
end
```

#### 18.4` `Another Equality Predicate

Suppose we return to the state where we have defined the three boxes
[<three-boxes>] but not mutated `b1`

. That is, when
printed, all three boxes have the same value, `box(7)`

. We
have seen that `b1`

and `b3`

are both
`equal-always`

and `identical`

to each other. However, we
have also seen that `b1`

and `b2`

are neither of
those. This is somewhat frustrating, because there is clearly some
sense in which they are “equal”: at the moment, they contain the
same value, even if later on one of them might not.

`equal-now`

:
```
check:
b1 is%(equal-now) b2
b2 is%(equal-now) b3
end
```

`-now`

in the name reminds us that these values are equal at
the moment, but may not be equal later. Sure enough, if we add
`set-box-now(b1, 8)`

`equal-now`

tests fail: now,
they are no longer equal!`==`

for `equal-always`

and `<=>`

for
`identical`

. Similarly, `equal-now`

has the binary operator
`=~`

. You should view that as `=`

with hand-waving `~`

:
it’s equal for now, but don’t expect it to remain so. That is, we can
rewrite the above tests as:
```
check:
equal-now(b1, b2) is true
(b2 =~ b3) is true
end
```

`b1`

, `b2`

, or `b3`

has had its content modified.#### 18.5` `A Hierarchy of Equality

As you might guess, the equality operators have a hierarchy of implication. That is, if one operator is true of two expressions, the other necessarily is, but not vice versa.

Do Now!

Can you work out this hierarchy of implication?

Observe that if two expressions are `identical`

, then they are
aliases, i.e., they are referring to one and the same
value. Therefore, the values produced by those expressions must be
`equal-always`

. If they are always equal, then clearly at any
given moment, they must also be `equal-now`

.

Even if two expressions are not `identical`

, they may be
`equal-always`

. This would never be true of mutable data (because
there is the possibility of a future mutation), but it can be true of
immutable data that have the same structure and contents. In that
case, if they are always equal, then again they must be
`equal-now`

.

However, the converses are not true.

If two data are `equal-now`

, they may not be
`equal-always`

: if they are mutable, a future mutation may change
the equality, as we have seen above. Similarly, two data may be
`equal-always`

but not be `identical`

, because they reside
at different heap addresses and are therefore truly different data.

In most languages, it is common to have two equality operators,
corresponding to `identical`

(known as reference equality)
and `equal-now`

(known as structural equality). Pyret is
rare in having a third operator, `equal-always`

. For
most programs, this is in fact the most useful equality operator: it
is not overly bothered with details of aliasing, which can be
difficult to predict; at the same time it makes decisions that stand
the test of time, thereby forming a useful basis for various
optimizations (which may not even be conscious of their temporal
assumptions). This is why `is`

in testing uses
`equal-always`

by default, and forces users to explicitly pick a
different primitive if they want it.

#### 18.6` `Space and Time Complexity

`identical`

always takes constant time. Indeed, some programs use
`identical`

precisely because they want constant-time
equality, carefully structuring their program so that values that
should be considered equal are aliases to the same value. Of course,
maintaining this programming discipline is tricky.

`equal-always`

and `equal-now`

both must traverse at least
the immutable part of data. Therefore, they take time proportional to
the smaller datum (because if the two data are of different size, they
must not be equal anyway, so there is no need to visit the extra
data). The difference is that `equal-always`

reduces to
`identical`

at references, thereby performing less computation
than `equal-now`

would.

#### 18.7` `What it Means to be Identical

```
hold-b1-value = unbox-now(b1)
set-box-now(b1, hold-b1-value + 1)
```

```
b1-id-b2 = unbox-now(b1) == unbox-now(b2)
b1-id-b3 = unbox-now(b1) == unbox-now(b3)
```

`set-box-now(b1, hold-b1-value)`

`b1-id-b2`

would be `false`

but `b1-id-b3`

would be `true`

. And notice that this would always be true when
the two expressions are identical, but not otherwise.Thus, at the end there has been no change, but by making the change we
can check which values are and aren’t aliases of others. In other
words, thisrepresents the essence of `identical`

.

In practice, `identical`

does not behave this way: it would be
too disruptive. It is also not the most efficient implementation
possible, when Pyret can simply check the memory addresses being the
same. Nevertheless, it does demonstrate the basic idea behind
`identical`

: two values are `identical`

precisely when, when
you make changes to one, you see the changes manifest on the “other”
(i.e., there is really only one value, but with potentially multiple
names for it).

#### 18.8` `Comparing Functions

We haven’t actually provided the full truth about equality because we
haven’t discussed functions. Defining equality for functions—

Because of this, most languages have tended to use approximations for
function equality, most commonly reference equality. This is, however,
a very weak approximation: even if the exact same function text in the
same environment is allocated as two different closures, these would
not be reference-equal. At least when this is done as part of the
definition of `identical`

, it makes sense; if other operators do
this, however, they are actively lying, which is something the
equality operators do not usually do.

There is one other approach we can take: simply disallow function
comparison. This is what Pyret does: all three equality operators
above will result in an error if you try to compare two
functions. (You can compare against just one function, however, and
you will get the answer `false`

.) This ensures that the
language’s comparison operators are never trusted falsely.

Pyret did have the choice of allowing reference equality for
functions inside `identical`

and erroring only in the other two
cases. Had it done so, however, it would have violated the chain of
implication above [A Hierarchy of Equality]. The present design
is arguably more elegant. Programmers who do want to use reference
equality on functions can simply embed the functions inside a mutable
structure like boxes.

There is one problem with erroring when comparing two functions: a
completely generic procedure that compares two arbitrary values may
error if both of the values given are functions. Because this can
cause unpredictable program failure, Pyret offers a three-valued
version of each of the above three operators (`identical3`

,
`equal-always3`

and `equal-now3`

), all of which return
`EqualityResult`

values that correspond to truth, falsity, and
ignorance (returned in the case when both arguments are
functions). Programmers can use this in place of the Boolean-valued
comparison operators if they are uncertain about the types of the
parameters.