#### 11.2 Understanding Equality

##### 11.2.1 Equality of Data

Now that we have the ability to mutate data, it’s worth asking what it means for two pieces of data to be equal. We’ll motivate this through a concrete example. Following the naming convention of Data Mutation and the Directory, we will write every name only once, using the upper-case name from Python, but everything we write will equally be true for Pyret.

First, consider these three statements:

a1 = Account(8603, 500)
a2 = Account(8603, 500)
a3 = Account(8603, 250)

Do Now!

Which of the above Accounts do you consider “equal”?

The third Account has a different balance than the first two, so it can’t be considered equal to either of the first two. The first two have the same contents, so arguably they can be considered equal.

Now, let’s consider the directory and heap that would result from running these three statements:

Directory

• a1 → 1120
• a2 → 1121
• a3 → 1122

Heap

• 1120: Account(8603, 500)
• 1121: Account(8603, 500)
• 1122: Account(8603, 250)

From the perspective of the heap, each account ends up at its own address. Those different addresses are one way in which a1 and a2 are not the same: they have the same contents, but not the same address. Is that relevant? To explore this, let’s associate another name (a4) with the same address as a2, then change the balance in a2. For now we will show just the Python version:

a1 = Account(8603, 500)
a2 = Account(8603, 500)
a3 = Account(8603, 250)
a4 = a2
# checkpoint 1
a2.balance = 800
# checkpoint 2

What does memory look like before and after checkpoint 1? Before the checkpoint:

Directory

• a1 → 1130
• a2 → 1131
• a3 → 1132
• a4 → 1131

Heap

• 1130: Account(8603, 500)
• 1131: Account(8603, 500)
• 1132: Account(8603, 250)

a1 and a2 refer to two different Accounts with the same contents. After checkpoint 1, their contents differ because we changed the balance field of a2:

Directory

• a1 → 1130
• a2 → 1131
• a3 → 1132
• a4 → 1131

Heap

• 1130: Account(8603, 500)
• 1131: Account(8603, 800)
• 1132: Account(8603, 250)

In contrast, a2 and a4 are aliases for the same Account. Therefore, their values change in lockstep: asking to display the value of either one would now show an account with a balance of 800.
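The program above can be run as written once Account is defined. The following is a minimal sketch that assumes Account is a Python dataclass; this section does not show the definition, so the field name acct_id is hypothetical, while balance is the field mutated in the text.

```python
from dataclasses import dataclass

# Assumed definition: the section does not show the class itself.
# `acct_id` is a hypothetical field name; `balance` comes from the text.
@dataclass
class Account:
    acct_id: int
    balance: int

a1 = Account(8603, 500)
a2 = Account(8603, 500)
a3 = Account(8603, 250)
a4 = a2                  # a4 is a second name for the Account that a2 names

# checkpoint 1: a1, a2, and a4 all show a balance of 500
balances_before = (a1.balance, a2.balance, a4.balance)

a2.balance = 800         # mutate the single Account shared by a2 and a4

# checkpoint 2: a2 and a4 change together; a1 is untouched
balances_after = (a1.balance, a2.balance, a4.balance)

print(balances_before)   # (500, 500, 500)
print(balances_after)    # (500, 800, 800)
```

The assignment a4 = a2 copies no data. It only adds a second directory entry for the same heap address, which is why the update is visible through both names.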

Do Now!

What do you think now? Are the first two accounts equal?

##### 11.2.2 Different Equality Operations

This sequence of examples suggests two possible notions of equality:

1. Whether two values have the same contents. This is formally called structural equality; you can think of it as a “print equality”, namely, when displayed, do the two values look the same.

2. Whether two values live at the same address, i.e., there is actually only one value in memory. This is formally called reference equality. Usually, we would refer to the two values by different names (so there is the possibility that they are different), and reference equality checks whether the names are aliases. Observe that a given value always prints the same way, so any two names that have reference equality also have structural equality, but not vice versa.
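As a rough illustration of the two notions, the sketch below compares printed forms to approximate “print equality” and compares object identities to approximate “same address”. It assumes the same hypothetical dataclass definition of Account (the field name acct_id is not from the text). Python’s built-in id returns an identity number that is unique among live objects; treating it as a heap address is only an analogy to the directory and heap diagrams.

```python
from dataclasses import dataclass

# Assumed definition; `acct_id` is a hypothetical field name.
@dataclass
class Account:
    acct_id: int
    balance: int

a1 = Account(8603, 500)
a2 = Account(8603, 500)
a4 = a2

# Notion 1, "print equality": do the two values display the same way?
prints_same_a1_a2 = repr(a1) == repr(a2)

# Notion 2, "same address": do the two names refer to one object?
same_object_a1_a2 = id(a1) == id(a2)
same_object_a2_a4 = id(a2) == id(a4)

print(prints_same_a1_a2)   # True
print(same_object_a1_a2)   # False
print(same_object_a2_a4)   # True
```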

Which notion of equality is “correct”? It turns out that they are valuable in different contexts. For this reason, programming languages generally provide multiple equality operations, letting the programmer indicate which kind of equality they mean in their context.

Unfortunately, the names of equality operations, and their exact meaning, vary across languages. Therefore, we will examine each of Pyret and Python separately.

##### 11.2.2.1 Equality in Python

The == operator that you learned in Pyret, and that we carried into Python, checks for structural equality, independent of addresses. At checkpoint 1:

 a1 == a2

True

 a2 == a4

True

However, note that the first of these results no longer holds at checkpoint 2:

 a1 == a2

False

 a2 == a4

True

If we instead want to check for aliasing, we use an operation called is (not to be confused with Pyret’s is, which is used for writing tests):

 a1 is a2

False

 a2 is a4

True

This explains why a2 == a4 was true both before and after the mutation, but a1 == a2 was no longer true after it. The latter seems to violate a very basic meaning of “equality”; the problem here is caused by the introduction of mutation.

As we go forward, you’ll get more practice with when to use each kind of equality. The == operator is more accepting, so it is usually the right default. If you actually need to know whether two expressions evaluate to the same address, you should instead use is.
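The results in this subsection can be reproduced with the sketch below, which assumes Account is a dataclass (the field name acct_id is hypothetical). That assumption matters: a dataclass generates an == that compares fields, whereas a class that defines no equality method falls back to comparing identities. PlainAccount is a hypothetical class added only to show that contrast.

```python
from dataclasses import dataclass

@dataclass                # generates an __eq__ that compares the fields
class Account:
    acct_id: int          # hypothetical field name
    balance: int

class PlainAccount:       # hypothetical contrast: no __eq__ is defined
    def __init__(self, acct_id, balance):
        self.acct_id = acct_id
        self.balance = balance

a1 = Account(8603, 500)
a2 = Account(8603, 500)
a4 = a2

# checkpoint 1
at_checkpoint_1 = (a1 == a2, a2 == a4, a1 is a2, a2 is a4)

a2.balance = 800

# checkpoint 2
at_checkpoint_2 = (a1 == a2, a2 == a4, a1 is a2, a2 is a4)

print(at_checkpoint_1)    # (True, True, False, True)
print(at_checkpoint_2)    # (False, True, False, True)

p1 = PlainAccount(8603, 500)
p2 = PlainAccount(8603, 500)
plain_equal = (p1 == p2)
print(plain_equal)        # False: without __eq__, == compares identity
```

Only the a1 == a2 entry differs between the two checkpoints. The two is results do not move, because the mutation changes an object’s contents, not which object each name refers to.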

##### 11.2.2.2 Equality in Pyret

Equality in Pyret is somewhat more detailed, because the language wants you to think harder about what is happening in your programs.

Recall that we are using the datatype in Example: Bank Accounts and have written the following definitions:

a1 = account(8603, 500)
a2 = account(8603, 500)
a3 = account(8603, 250)
a4 = a2
# checkpoint 1
a2!{balance: 800}
# checkpoint 2


In Python, we saw that a1 == a2 before the mutation. However, in Pyret, this produces false! Why?

The reason is that structural equality is actually complicated; there are two different questions we could be asking:

1. Are these two values structurally equal right now?

2. Will these two values be structurally equal always?

Pyret makes a distinction between these two.

By default, Pyret tends towards safer programming practices. Therefore, the standard (structural) equality predicate, ==, will only return true if the two values will always be equal. Thus:

 a2 == a4

true

Because the two values are actually aliases, no matter how one changes, the “other” will always change in the same way. Therefore, they will always “print the same”. We can confirm that they are aliases by using Pyret’s reference equality operator, <=>:

 a1 <=> a2

false

 a2 <=> a4

true

In contrast, that guarantee does not apply to a1 and a2; indeed, at checkpoint 2, we see that they are no longer equal. Hence:

 a1 == a2

false

However, there is a time when a1 and a2 do print the same, namely before checkpoint 1. Therefore, Pyret provides another equality operator that checks whether values are equal at the moment, =~. If we ask this before checkpoint 1, we get:

 a1 =~ a2

true

But if we ask the same question at checkpoint 2, we get:

 a1 =~ a2

false

These operators and their funny symbols may be hard to remember, but Pyret also gives them useful (if longer) names, and they can be used as ordinary functions:

| Symbol | Function | Type | Meaning |
| --- | --- | --- | --- |
| == | equal-always | Structural | If it returns true, they will always be equal, irrespective of any future mutations. |
| =~ | equal-now | Structural | If it returns true, they are currently equal, but that may change after future mutations. |
| <=> | identical | Reference | Returns true if the two arguments are aliases, false otherwise. |

Thus, before checkpoint 1:

 equal-now(a1, a2)

true

 equal-now(a2, a4)

true

 equal-always(a1, a2)

false

 equal-always(a2, a4)

true

 identical(a1, a2)

false

 identical(a2, a4)

true

After checkpoint 2, we no longer need to check any of the equal-always or identical relationships again, because by definition they cannot change. But we should check equal-now again. Sure enough:

 equal-now(a1, a2)

false

 equal-now(a2, a4)

true

Therefore, in Pyret, the == operator is the same as equal-always. When data contain mutable fields, it produces false for any two values that are not aliases, because even if the values are structurally equal now, it’s possible that a future mutation will change that. This is to remind you to be careful in the presence of mutation. In situations where we really care only about equality at that instant, we can use =~, i.e., equal-now.

The examples above might suggest that only aliased values are equal-always. This is not true! If our data are immutable (which is the default in the language), then if two values are structurally equal now, they must remain structurally equal forever. For such data, equal-always will return true even when they are not aliases. This is a reminder that we get stronger guarantees about immutable data.
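Python has no equal-always, but the underlying idea can be sketched by analogy with a frozen dataclass, whose fields cannot be reassigned through ordinary assignment. FrozenAccount and its field name acct_id are hypothetical and appear only for illustration; this is an analogy to Pyret’s immutable data, not a description of how Pyret works.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)   # ordinary field assignment is rejected
class FrozenAccount:      # hypothetical immutable variant of Account
    acct_id: int
    balance: int

f1 = FrozenAccount(8603, 500)
f2 = FrozenAccount(8603, 500)

equal_contents = (f1 == f2)   # structurally equal
aliases = (f1 is f2)          # two separate objects, not aliases

try:
    f1.balance = 800          # attempt the mutation from the running example
    mutated = True
except FrozenInstanceError:
    mutated = False           # the assignment was refused

print(equal_contents)  # True
print(aliases)         # False
print(mutated)         # False
```

Because neither value can be changed this way, the equality observed now cannot later be broken by a statement like the one at checkpoint 2, even though the two values are not aliases.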

It is worth noting that up to this point we have used equal-always, in the form of both == and Pyret’s is in testing, without really bothering to understand very much about how it works, and yet have always gotten predictable answers. This suggests that there is something natural about working with immutable data. In contrast, with mutable data, something has to give. Pyret made a conscious design choice to reflect this in the distinction between equal-always and equal-now. Python made a different choice, which results in “equality” having a perhaps surprising meaning. (Python has no notion of equal-always. It has only equal-now, Pyret’s =~, which Python writes as ==, and identical, Pyret’s <=>, which Python writes as is.)