### 10` `Processing Lists

We have already seen [From Tables to Lists] several examples of
list-processing functions. They have been especially useful for
advanced processing of tables. However, lists arise frequently in
programs, and they do so naturally because so many things in our
lives—

some list functions are generic and operate on any kind of list: e.g., the length of a list is the same irrespective of what kind of values it contains;

some are specific at least to the type of data: e.g., the sum assumes that all the values are numbers (though they may be ages or prices or other information represented by numbers); and

some are somewhere in-between: e.g., a maximum function applies to any list of comparable values, such as numbers or strings.

This seems like a great variety, and we might worry about how we can handle this many different kinds of functions. Fortunately, and perhaps surprisingly, there is one standard way in which we can think about writing all these functions! Understanding and internalizing this process is the goal of this chapter.

#### 10.1` `Making Lists and Taking Them Apart

So far we’ve seen one way to make a list: by writing
`[list: …]`

. While useful, writing lists this way actually
hides their true nature. Every list actually has two parts: a
first element and the rest of the list. The rest of the
list is itself a list, so it too has two parts…and so on.

`[list: 1, 2, 3]`

. Its first element is `1`

, and
the rest of it is `[list: 2, 3]`

. For this second list, the first element
is `2`

and the rest is `[list: 3]`

.
Do Now!

Take apart this third list.

`3`

and the rest is
`[list: ]`

, i.e., the empty list. In Pyret, we have another way
of writing the empty list: `empty`

.Lists are an instance of structured data: data with component parts and a well-defined format for the shape of the parts. Lists are formatted by the first element and the rest of the elements. Tables are somewhat structured: they are formatted by rows and columns, but the column names aren’t consistent across all tables. Structured data is valuable in programming because a predictable format (the structure) lets us write programs based on that structure. What do we mean by that?

`first`

and
`rest`

. We use an accessor by writing an expression, followed by
a dot (`.`

), followed by the accessor’s name. As we saw with
tables, the dot means "dig into". Thus:
```
l1 = [list: 1, 2, 3]
e1 = l1.first
l2 = l1.rest
e2 = l2.first
l3 = l2.rest
e3 = l3.first
l4 = l3.rest
check:
e1 is 1
e2 is 2
e3 is 3
l2 is [list: 2, 3]
l3 is [list: 3]
l4 is empty
end
```

Do Now!

What are the accessors for tables?

Accessors give a way to take data apart based on their structure
(there is another way that we will see shortly). Is there a way to
also build data based on its structure? So far, we have been
building lists using the `[list: ...]`

form, but that doesn’t
emphasize the structural constraint that the `rest`

is itself a
list. A structured operator for building lists would clearly show both
a `first`

element and a `rest`

that is itself a
list. Operators for building structured data are called
constructors.

The constructor for lists is called `link`

. It takes two
arguments: a `first`

element, and the list to build on (the
`rest`

part). Here’s an example of using `link`

to create a
three-element list.

`link(1, link(2, link(3, empty)))`

The `link`

form creates the same underlying list datum as our
previous `[list: ...]`

operation, as confirmed by the following
check:

```
check:
[list: 1, 2, 3] is link(1, link(2, link(3, empty)))
end
```

Do Now!

Look at these two forms of writing lists: what differences do you notice?

Do Now!

Use the

`link`

form to write a four-element list of fruits containing`"lychee"`

,`"dates"`

,`"mango"`

, and`"durian"`

.

After doing this exercise, you might wonder why anyone would use the
`link`

form: it’s more verbose, and makes the individual elements
harder to discern. This form is not very convenient to
humans. But it will prove very valuable to programs!

`link`

form highlights that we really have
two different structures of lists.
Some lists are empty. All other lists are non-empty
lists, meaning they have at least one `link`

. There may be
more interesting structure to some lists (as we will see later), but all lists have this much
in common. Specifically, a list is either
empty (written

`empty`

or`[list: ]`

), ornon-empty (written

`link(…, …)`

or`[list: ]`

with at least one value inside the brackets), where the rest is also a list (and hence may in turn be empty or non-empty, …).

Lists can be empty or non-empty

Non-empty lists have a first element and a rest of the list

#### 10.2` `Some Example Exercises

To illustrate our thinking, let’s work through a few concrete examples
of list-processing functions. All of these will consume lists; some
will even produce them. Some will transform their inputs (like
`map`

), some will select from their inputs (like `filter`

),
and some will aggregate their inputs. Since some of these functions already exist in
Pyret, we’ll name them with the prefix `my-`

to avoid
errors.Be sure to use the `my-`

name consistently,
including inside the body of the function. As we will see, there is a
standard strategy that we can use to approach writing all of these
functions: having you learn this strategy is the goal of this chapter.

#### 10.3` `Structural Problems with Scalar Answers

Let’s write out examples for a few of the functions described
above. We’ll approach writing examples in a very specific, stylized
way. First of all, we should always construct at least two examples:
one with `empty`

and the other with at least one `link`

, so
that we’ve covered the two very broad kinds of lists. Then, we should
have more examples specific to the kind of list stated in the
problem. Finally, we should have even more examples to illustrate how
we think about solving the problem.

##### 10.3.1` ``my-len`

: Examples

We have’t precisely defined what it means to be “the length” of a
list. We confront this right away when trying to write an
example. What is the length of the list `empty`

?

Do Now!

What do you think?

`0`

and `1`

. The latter, `1`

,
certainly looks reasonable. However, if you write the list as
`[list: ]`

, now it doesn’t look so right: this is clearly (as the
name `empty`

also suggests) an empty list, and an empty
list has zero elements in it. Therefore, it’s conventional to
declare that
`my-len(empty) is 0`

`[list: 7]`

? Well, it’s clearly got one
element (`7`

) in it, so
`my-len([list: 7]) is 1`

`[list: 7, 8, 9]`

, we would say
`my-len([list: 7, 8, 9]) is 3`

`[list: 7, 8, 9]`

. Its first element is `7`

and
the rest of it is `[list: 8, 9]`

. Well, `7`

is a number, not
a list; but `[list: 8, 9]`

certainly is a list, so we can ask for
its length. What is `my-len([list: 8, 9])`

? It has two
elements, so
`my-len([list: 8, 9]) is 2`

`8`

while its rest is
`[list: 9]`

. What is its length?
Note that we asked a very similar question before, for the length of
the list `[list: 7]`

. But `[list: 7]`

is not a
sub-list of `[list: 7, 8, 9]`

, which we started with,
whereas `[list: 9]`

is. And using the same reasoning as before,
we can say
`my-len([list: 9]) is 1`

`0`

.`empty`

in its
other form, here’s what we get:
```
my-len([list: 7, 8, 9]) is 3
my-len([list: 8, 9]) is 2
my-len([list: 9]) is 1
my-len([list: ]) is 0
```

```
my-len([list: 7, 8, 9]) is 1 + 2
my-len([list: 8, 9]) is 1 + 1
my-len([list: 9]) is 1 + 0
my-len([list: ]) is 0
```

`2`

, `1`

, and `0`

on the right sides of
each `+`

operation come from? Those are the lengths of the
`rest`

component of the input list. In the previous example
block, we wrote those lengths as explicit examples. Let’s substitute
the numbers `2`

, `1`

, and `0`

with the `my-len`

expressions that produce them:
```
my-len([list: 7, 8, 9]) is 1 + my-len([list: 8, 9])
my-len([list: 8, 9]) is 1 + my-len([list: 9])
my-len([list: 9]) is 1 + my-len([list: ])
my-len([list: ]) is 0
```

`0`

. For a non-empty list, it’s the sum of `1`

(the first element’s “contribution” to the list’s length) to the
length of the rest of the list. In other words, we can use the result
of computing `my-len`

on the rest of the list to compute the
answer for the entire list.Do Now!

Each of our examples in this section has written a different check on the expression`my-len([list: 7, 8, 9])`

. Here are those examples presented together, along with one last one that explicitly uses the`rest`

operation:`my-len([list: 7, 8, 9]) is 3 my-len([list: 7, 8, 9]) is 1 + 2 my-len([list: 7, 8, 9]) is 1 + my-len([list: 8, 9]) my-len([list: 7, 8, 9]) is 1 + my-len([list: 7, 8, 9].rest)`

Check that you agree with each of these assertions. Also check whether you understand how the right-hand side of each`is`

expression derives from the right-hand-side just above it. The goal of this exercise is to make sure that you believe that the last check (which we will turn into code) is equivalent to the first (which we wrote down when understanding the problem).

##### 10.3.2` ``my-sum`

: Examples

Let’s repeat this process of developing examples on a second function,
this time one that computes the sum of the elements in a list of numbers.
What is the
sum of the list `[list: 7, 8, 9]`

? Just adding up the numbers by hand,
the result should be `24`

. Let’s see how that works out through
the examples.

```
my-sum([list: 7, 8, 9]) is 7 + 8 + 9
my-sum([list: 8, 9]) is 8 + 9
my-sum([list: 9]) is 9
```

```
my-sum([list: 7, 8, 9]) is 7 + my-sum([list: 8, 9])
my-sum([list: 8, 9]) is 8 + my-sum([list: 9])
my-sum([list: 9]) is 9 + my-sum([list: ])
```

`0`

:Zero is called the additive identity: a
fancy way of saying, adding zero to any number N gives you
N. Therefore, it makes sense that it would be the length of the
empty list, because the empty list has no items to contribute to a
sum. Can you figure out what the multiplicative identity is?
`my-sum(empty) is 0`

Observe, again, how we can use the result of computing `my-sum`

of the rest of the list to compute its result for the whole list.

##### 10.3.3` `From Examples to Code

Having developed these examples, we now want to use them to develop a program that can compute the length or the sum of any list, not just the specific ones we used in these examples. As we have done up in earlier chapters, we will leverage patterns in the examples to figure out how to define the general-purpose function.

`my-len`

, this time
making the `rest`

explicit on the right-hand sides of `is`

:
```
my-len([list: 7, 8, 9]) is 1 + my-len([list: 7, 8, 9].rest)
my-len([list: 8, 9]) is 1 + my-len([list: 8, 9].rest)
my-len([list: 9]) is 1 + my-len([list: 9].rest)
my-len([list: ]) is 0
```

`[list: ]`

(`empty`

) case. So let’s separate this into two sets of examples:```
my-len([list: 7, 8, 9]) is 1 + my-len([list: 7, 8, 9].rest)
my-len([list: 8, 9]) is 1 + my-len([list: 8, 9].rest)
my-len([list: 9]) is 1 + my-len([list: 9].rest)
my-len([list: ]) is 0
```

`someList`

), we compute its length via the
expression:`1 + my-len(someList.rest)`

`my-len`

program needs to determine whether
its input list is empty or non-empty, using this expression with
`.rest`

in the non-empty case. How do we indicate different code
based on the structure of the list?`cases`

which is used to distinguish
different forms within a structured datatype. When working with lists,
the general shape of a `cases`

expression is:
```
cases (List) e:
| empty => …
| link(f, r) => … f … r …
end
```

`e`

is an expression whose value needs to be a list; it could be a variable bound to a list, or some complex expression that evaluates to a list.`f`

and`r`

are names given to the first and rest of the list. You can choose any names you like, though in Pyret, it’s conventional to use`f`

and`r`

.Occasionally using different names can help students recall that they can choose how to label the`first`

and`rest`

components. This can be particularly useful for`first`

, which has a problem-specific meaning (such as`price`

in a list of prices, and so on).

`=>`

is an expression.Here’s how `cases`

works in this instance. Pyret first evaluates
`e`

. It then checks that the resulting value truly is a list;
otherwise it halts with an error. If it is a list, Pyret examines what
kind of list it is. If it’s an empty list, it runs the
expression after the `=>`

in the `empty`

clause. Otherwise,
the list is not empty, which means it has a first and rest; Pyret
binds `f`

and `r`

to the two parts, respectively, and then
evaluates the expression after the `=>`

in the `link`

clause.

Exercise

Try using a non-list—

e.g., a number— in the `e`

position and see what happens!

`cases`

to define `my-len`

:
```
fun my-len(l):
cases (List) l:
| empty => 0
| link(f, r) => 1 + my-len(r)
end
end
```

`my-len`

produces `0`

; when it is not empty, we add one to the length of
the rest of the list (here, `r`

).Note that while our most recent collection of `my-len`

examples
explicitly said `.rest`

, when using `cases`

we instead use
just the name `r`

, which Pyret has already defined (under the
hood) to be `l.rest`

.

`my-sum`

:
```
fun my-sum(l):
cases (List) l:
| empty => 0
| link(f, r) => f + my-sum(r)
end
end
```

Strategy: Developing Functions Over Lists

Leverage the structure of lists and the power of concrete examples to develop list-processing functions.

Pick a concrete list with (at least) three elements. Write a sequence of examples for each of the entire list and each suffix of the list (including the empty list).

Rewrite each example to express its expected answer in terms of the

`first`

and`rest`

data of its input list. You don’t have to use the`first`

and`rest`

operators in the new answers, but you should see the`first`

and`rest`

values represented explicitly in the answer.Look for a pattern across the answers in the examples. Use these to develop the code: write a

`cases`

expression, filling in the right side of each`=>`

based on your examples.This strategy applies to structured data in general, leveraging components of each datum rather than specifically`first`

and`rest`

as presented so far.

#### 10.4` `Structural Problems that Transform Lists

Now that we have a systematic way to develop functions that take lists as input, let’s apply that same strategy to functions that produce a list as the answer.

##### 10.4.1` ``my-doubles`

: Examples and Code

As always, we’ll begin with some examples. Given a list of numbers, we want a list that doubles each number (in the order of the original list). Here’s a reasonable example with three numbers:

`my-doubles([list: 3, 5, 2]) is [list: 6, 10, 4]`

As before, let’s write out the answers for each suffix of our example
list as well, including for the `empty`

list:

```
my-doubles([list: 5, 2]) is [list: 10, 4]
my-doubles([list: 2]) is [list: 4]
my-doubles([list: ]) is [list: ]
```

Now, we rewrite the answer expressions to include the concrete
`first`

and `rest`

data for each example. Let’s start with
just the `first`

data, and just on the first example:

```
my-doubles([list: 3, 5, 2]) is [list: 3 * 2, 10, 4]
my-doubles([list: 5, 2]) is [list: 10, 4]
my-doubles([list: 2]) is [list: 4]
my-doubles([list: ]) is [list: ]
```

Next, let’s include the `rest`

data (`[list: 5, 2]`

) in the
first example. The current answer in the first example is

`[list: 3 * 2, 10, 4]`

and that `[list: 10, 4]`

is the result of using the function on
`[list: 5, 2]`

. We might therefore be tempted to replace the
right side of the first example with:

`[list: 3 * 2, my-doubles([list: 5, 2])]`

Do Now!

What value would this expression produce? You might want to try this example that doesn’t use`my-doubles`

directly:`[list: 3 * 2, [list: 10, 4]]`

Oops! We want a single (flat) list, not a list-within-a-list. This
feels like it is on the right track in terms of reworking the answer
to use the `first`

and `rest`

values, but we’re clearly not
quite there yet.

Do Now!

What value does the following expression produce?`link(3 * 2, [list: 10, 4])`

Notice the difference between the two expressions in these last two
exercises: the latter used `link`

to put the value involving
`first`

into the conversion of the `rest`

, while the former
tried to do this with `list:`

.

Do Now!

How many elements are in the lists that result from each of the following expressions?

`[list: 25, 16, 32] [list: 25, [list: 16, 32]] link(25, [list: 16, 32])`

Do Now!

Summarize the difference between how

`link`

and`list:`

combine an element and a list. Try additional examples at the interactions prompt if needed to explore these ideas.

The takeaway here is that we use `link`

to insert an
element into an existing list, whereas we use `list:`

to make a
new list that contains the old list as an element. Going back
to our examples, then, we include `rest`

in the first example by
writing it as follows:

```
my-doubles([list: 3, 5, 2]) is link(3 * 2, [list: 10, 4])
my-doubles([list: 5, 2]) is [list: 10, 4]
my-doubles([list: 2]) is [list: 4]
my-doubles([list: ]) is [list: ]
```

which we then convert to

```
my-doubles([list: 3, 5, 2]) is link(3 * 2, my-doubles([list: 5, 2])
my-doubles([list: 5, 2]) is [list: 10, 4]
my-doubles([list: 2]) is [list: 4]
my-doubles([list: ]) is [list: ]
```

Applying this idea across the examples, we get:

```
my-doubles([list: 3, 5, 2]) is link(3 * 2, my-doubles([list: 5, 2])
my-doubles([list: 5, 2]) is link(5 * 2, my-doubles([list: 2])
my-doubles([list: 2]) is link(2 * 2, my-doubles([list: ])
my-doubles([list: ]) is [list: ]
```

Now that we have examples that explicitly use the `first`

and
`rest`

elements, we can produce to write the `my-doubles`

function:

```
fun my-doubles(l):
cases (List) l:
| empty => empty
| link(f, r) =>
link(f * 2, my-doubles(r))
end
end
```

##### 10.4.2` ``my-str-len`

: Examples and Code

`my-doubles`

, the input and output lists have the same type of
element. Functions can also produce lists whose contents have a
different type from the input list. Let’s work through an example.
Given a list of strings, we
want the lengths of each string (in the same order as in the input list). Thus, here’s a
reasonable example:
`my-str-len([list: "hi", "there", "mateys"]) is [list: 2, 5, 6]`

```
my-str-len([list: "there", "mateys"]) is [list: 5, 6]
my-str-len([list: "mateys"]) is [list: 6]
```

```
my-str-len([list: "hi", "there", "mateys"]) is link(2, [list: 5, 6])
my-str-len([list: "there", "mateys"]) is link(5, [list: 6])
my-str-len([list: "mateys"]) is link(6, [list: ])
```

`empty`

:
`my-str-len(empty) is empty`

The next step is to rework the answers in the examples to make the
`first`

and `rest`

parts explicit. Hopefully by now you are
starting to detect a pattern: The result on the rest of the list
appears explicitly as another example. Therefore, we’ll start by
getting the `rest`

value of each example input into the answer:

```
my-str-len([list: "hi", "there", "mateys"]) is link(2, my-str-len([list: "there", "mateys"]))
my-str-len([list: "there", "mateys"]) is link(5, my-str-len([list: "mateys"]))
my-str-len([list: "mateys"]) is link(6, my-str-len([list: ]))
my-str-len(list: ]) is [list: ]
```

All that remains now is to figure out how to work the `first`

values into the outputs. In the context of this problem, this means we
need to convert `"hi"`

into `2`

, `"there"`

into
`5`

, and so on. From the problem statement, we know that `2`

and `5`

are meant to be the lengths (character counts) of the
corresponding strings. The operation that determines the length of a
string is called `string-length`

. Thus, our examples appear as:

```
my-str-len([list: "hi", "there", "mateys"]) is link(string-length("hi"), my-str-len([list: "there", "mateys"]))
my-str-len([list: "there", "mateys"]) is link(string-length("there"), my-str-len([list: "mateys"]))
my-str-len([list: "mateys"]) is link(string-length("mateys"), my-str-len([list: ]))
my-str-len(list: ]) is [list: ]
```

From here, we write a function that captures the pattern developed across our examples:

```
fun my-str-len(l):
cases (List) l:
| empty => empty
| link(f, r) =>
link(string-length(f), my-str-len(r))
end
end
```

#### 10.5` `Structural Problems that Select from Lists

In the previous section, we saw functions that transform list elements (by doubling numbers or counting characters). The type of the output list may or may not be the same as the type of the input list. Other functions that produce lists instead select elements: every element in the output list was in the input list, but some input-list elements might not appear in the output list. This section adapts our method of deriving functions from examples to accommodate selection of elements.

##### 10.5.1` ``my-pos-nums`

: Examples and Code

As our first example, we will select the positive numbers from a list that contains both positive and non-positive numbers.

Do Now!

Construct the sequence of examples that we obtain from the input

`[list: 1, -2, 3, -4]`

.

```
my-pos-nums([list: 1, -2, 3, -4]) is [list: 1, 3]
my-pos-nums([list: -2, 3, -4]) is [list: 3]
my-pos-nums([list: 3, -4]) is [list: 3]
my-pos-nums([list: -4]) is [list: ]
my-pos-nums([list: ]) is [list: ]
```

```
my-pos-nums([list: 1, -2, 3, -4]) is link(1, [list: 3])
my-pos-nums([list: -2, 3, -4]) is [list: 3]
my-pos-nums([list: 3, -4]) is link(3, [list: ])
my-pos-nums([list: -4]) is [list: ]
my-pos-nums([list: ]) is [list: ]
```

```
my-pos-nums([list: 1, -2, 3, -4]) is link(1, my-pos-nums([list: -2, 3, -4]))
my-pos-nums([list: -2, 3, -4]) is my-pos-nums([list: 3, -4])
my-pos-nums([list: 3, -4]) is link(3, my-pos-nums([list: -4]))
my-pos-nums([list: -4]) is my-pos-nums([list: ])
my-pos-nums([list: ]) is [list: ]
```

`link`

, while others simply process the `rest`

of the
list. Whenever we need different shapes of outputs across a set of
examples, we will need an `if`

expression in our code to
distinguish the conditions that yield each shape.```
my-pos-nums([list: 1, -2, 3, -4]) is link(1, my-pos-nums([list: -2, 3, -4]))
my-pos-nums([list: 3, -4]) is link(3, my-pos-nums([list: -4]))
my-pos-nums([list: -2, 3, -4]) is my-pos-nums([list: 3, -4])
my-pos-nums([list: -4]) is my-pos-nums([list: ])
```

`link`

have a
positive number in the `first`

position, while the ones that
don’t simply process the `rest`

of the list. That indicates that
our `if`

expression needs to ask whether the `first`

element
in the list is positive. This yields the following program:
```
fun my-pos-nums(l):
cases (List) l:
| empty => empty
| link(f, r) =>
if f > 0:
link(f, my-pos-nums(r))
else:
my-pos-nums(r)
end
end
end
```

Do Now!

Is our set of examples comprehensive?

Not really. There are many examples we haven’t considered, such
as lists that end with positive numbers and lists with `0`

.

Exercise

Work through these examples and see how they affect the program!

##### 10.5.2` ``my-alternating`

:
Examples and Code

Do Now!

Work out the results for

`my-alternating`

starting from the list`[list: 1, 2, 3, 4, 5, 6]`

.

```
check:
my-alternating([list: 1, 2, 3, 4, 5, 6]) is [list: 1, 3, 5]
my-alternating([list: 2, 3, 4, 5, 6]) is [list: 2, 4, 6]
my-alternating([list: 3, 4, 5, 6]) is [list: 3, 5]
my-alternating([list: 4, 5, 6]) is [list: 4, 6]
end
```

What would something like this look like in code? Before we try to write the function, let’s rewrite the first example in terms of the third:

```
my-alternating([list: 1, 2, 3, 4, 5, 6]) is [list: 1, 3, 5]
my-alternating([list: 3, 4, 5, 6]) is [list: 3, 5]
my-alternating([list: 1, 2, 3, 4, 5, 6]) is link(1, my-alternating([list: 3, 4, 5, 6]))
```

Note that in the rewritten version, we are dropping two
elements from the list before using `my-alternating`

again, not
just one. We will have to figure out how to handle that in our code.

Let’s start with our usual function pattern with a `cases`

expression:

```
fun my-alternating(l):
cases (List) l:
| empty => [list:]
| link(f, r) => link(f, … r …)
end
end
```

`my-alternating`

on `r`

,
because `r`

excludes only one item from the list, not two as this
problem requires. We have to break down `r`

as well, in order to
get to the `rest`

of the `rest`

of the original list. To do
this, we use another `cases`

expression, nested within the first
`cases`

expression:
```
fun my-alternating(l):
cases (List) l:
| empty => [list:]
| link(f, r) =>
cases (List) r: # note: deconstructing r, not l
| empty => ??? # note the ???
| link(fr, rr) =>
# fr = first of rest, rr = rest of rest
link(f, my-alternating(rr))
end
end
end
```

`empty`

case of the inner `cases`

expression (marked by `???`

in the code).A common temptation at this point is to replace the `???`

with
`[list:]`

. After all, haven’t we always returned `[list:]`

in
the `empty`

cases?

Do Now!

Replace`???`

with`[list:]`

and test the program on our original examples:`my-alternating([list: 1, 2, 3, 4, 5, 6]) is [list: 1, 3, 5] my-alternating([list: 2, 3, 4, 5, 6]) is [list: 2, 4, 6] my-alternating([list: 3, 4, 5, 6]) is [list: 3, 5] my-alternating([list: 4, 5, 6]) is [list: 4, 6]`

What do you observe?

Oops! We’ve written a program that appears to work on lists with an
even number of elements, but not on lists with an odd number of
elements. How did that happen? The only part of this code that we
guessed at was how to fill in the `empty`

case of the inner
`cases`

, so the issue must be there. Rather than focus on the
code, however, focus on the examples. We need a simple example
that would land on that part of the code. We get to that spot when the
list `l`

is not empty, but `r`

(the rest of `l`

) is
empty. In other words, we need an example with only one element.

Do Now!

Finish the following example:`my-alternating([list: 5]) is ???`

`my-alternating([list: 5]) is [list: 5]`

Do Now!

Use this example to update the result of

`my-alternating`

when`r`

is`empty`

in our code.

`my-alternating`

is as follows:
```
fun my-alternating(l):
cases (List) l:
| empty => empty
| link(f, r) =>
cases (List) r: # note: deconstructing r, not l
| empty => # the list has an odd number of elements
[list: f]
| link(fr, rr) =>
# fr = first of rest, rr = rest of rest
link(f, my-alternating(rr))
end
end
end
```

Don’t skip the small examples: the result of a list-processing function on the

`empty`

case won’t always be`empty`

.If a problem asks you to work with multiple elements from the front of a list, you can nest

`cases`

expressions to access later elements.

#### 10.6` `Structural Problems Over Relaxed Domains

##### 10.6.1` ``my-max`

: Examples

`[list: 1, 2, 3]`

a
good example? Well, there’s nothing wrong with it, but we should also
consider lists where the maximum at the beginning rather than at the
end; the maximum might be in the middle; the maximum might be
repeated; the maximum might be negative; and so on. While not
comprehensive, here is a small but interesting set of examples:
```
my-max([list: 1, 2, 3]) is 3
my-max([list: 3, 2, 1]) is 3
my-max([list: 2, 3, 1]) is 3
my-max([list: 2, 3, 1, 3, 2]) is 3
my-max([list: 2, 1, 4, 3, 2]) is 4
my-max([list: -2, -1, -3]) is -1
```

`my-max(empty)`

?
Do Now!

Could we define

`my-max(empty)`

to be`0`

? Returning`0`

for the empty list has worked well twice already!

`num-max`

already defined in Pyret, that compares two numbers:
```
num-max(1, 2) is 2
num-max(-1, -2) is -1
```

Exercise

Suppose

`num-max`

were not already built in. Can you define it? You will find what you learned about Booleans handy. Remember to write some tests!

`my-max`

at work:
```
my-max([list: 1, 2, 3]) is 3
my-max([list: 2, 3]) is 3
my-max([list: 3]) is 3
```

`empty`

.```
my-max([list: 3, 2, 1]) is 3
my-max([list: 2, 1]) is 2
my-max([list: 1]) is 1
```

```
my-max([list: 2, 1, 4, 3, 2]) is 4
my-max([list: 1, 4, 3, 2]) is 4
my-max([list: 4, 3, 2]) is 4
my-max([list: 3, 2]) is 3
my-max([list: 2]) is 2
```

```
my-max([list: 2, 1, 4, 3, 2]) is num-max(2, 4)
my-max([list: 1, 4, 3, 2]) is num-max(1, 4)
my-max([list: 4, 3, 2]) is num-max(4, 3)
my-max([list: 3, 2]) is num-max(3, 2)
my-max([list: 2]) is …
```

`my-max([list: 2]) is num-max(2, …)`

`num-max`

. Therefore, leaving out that dodgy case, we’re left
with
```
my-max([list: 2, 1, 4, 3, 2]) is num-max(2, my-max([list: 1, 4, 3, 2]))
my-max([list: 1, 4, 3, 2]) is num-max(1, my-max([list: 4, 3, 2]))
my-max([list: 4, 3, 2]) is num-max(4, my-max([list: 3, 2]))
my-max([list: 3, 2]) is num-max(3, my-max([list: 2]))
```

`empty`

case. The real
problem is that we don’t have a maximum for the empty list: for any
number we might provide, there is always a number bigger than it
(assuming our computer is large enough) that could have been the
answer instead. In short, it’s nonsensical to ask for the maximum (or
minimum) of the empty list: the concept of “maximum” is only defined
on non-empty lists! That is, when asked for the maximum of an empty
list, we should signal an error:
`my-max(empty) raises ""`

##### 10.6.2` ``my-max`

: From Examples to Code

```
fun my-max(l):
cases (List) l:
| empty => raise("not defined for empty lists")
| link(f, r) => num-max(f, my-max(r))
end
end
```

Do Now!

What’s wrong with this?

`[list: 2]`

. This turns into
`num-max(2, my-max([list: ]))`

`my-max`

, is check whether the rest of the list is
empty. If it is, we do not want to call `my-max`

at all. That is:
```
fun my-max(l):
cases (List) l:
| empty => raise("not defined for empty lists")
| link(f, r) =>
cases (List) r:
| empty => …
| …
end
end
end
```

`l`

is empty, our examples above tell us
that the maximum is the first element in the list. Therefore, we can
fill this in:
```
fun my-max(l):
cases (List) l:
| empty => raise("not defined for empty lists")
| link(f, r) =>
cases (List) r:
| empty => f
| …
end
end
end
```

`my-max`

. If the list
is not empty, however, our examples above tell us that `my-max`

will give us the maximum of the rest of the list, and we just need to
compare this answer with the first element (`f`

):
```
fun my-max(l):
cases (List) l:
| empty => raise("not defined for empty lists")
| link(f, r) =>
cases (List) r:
| empty => f
| else => num-max(f, my-max(r))
end
end
end
```

#### 10.7` `More Structural Problems with Scalar Answers

##### 10.7.1` ``my-avg`

: Examples

Let’s now try to compute the average of a list of numbers. Let’s start
with the example list `[list: 1, 2, 3, 4]`

and work out more
examples from it. The average of numbers in this list is clearly
`(1 + 2 + 3 + 4)/4`

, or `10/4`

.

`[list: 2, 3, 4]`

, and the rest of that is `[list: 3, 4]`

,
and so on. The resulting averages are:
```
my-avg([list: 1, 2, 3, 4]) is 10/4
my-avg([list: 2, 3, 4]) is 9/3
my-avg([list: 3, 4]) is 7/2
my-avg([list: 4]) is 4/1
```

The average of the remainder of the list is

`9/3`

, i.e.,`3`

.The first number in the list is

`1`

.

`10/4`

? If it’s not clear to you, don’t worry: with just those
two pieces of information, it’s impossible!`1`

, and the average of the rest of the list
is `2`

. Here are two very different lists that fit this
description:
```
[list: 1, 2] # the rest has one element with sum 2
[list: 1, 4, 0] # the rest has two elements with sum 4
```

`3/2`

, while the average
of the entire second list is `5/3`

, and the two are not the same.That is, to compute the average of a whole list, it’s not even useful to
know the average of the rest of the list. Rather, we need to
know the sum and the length of the rest of the
list. With these two, we can add the first to the sum, and `1`

to
the length, and compute the new average.

`average`

function that
returns all this information. Instead, it will be a lot simpler to
simply decompose the task into two smaller tasks. After all, we
have already seen how to compute the length and how to compute the
sum. The average, therefore, can just use these existing functions:
```
fun my-avg(l):
my-sum(l) / my-len(l)
end
```

Do Now!

What should be the average of the empty list? Does the above code produce what you would expect?

Just as we argued earlier about the maximum [Structural Problems Over Relaxed Domains], the average of the empty list isn’t a well-defined concept. Therefore, it would be appropriate to signal an error. The implementation above does this, but poorly: it reports an error on division. A better programming practice would be to catch this situation and report the error right away, rather than hoping some other function will report the error.

Exercise

Alter

`my-avg`

above to signal an error when given the empty list.

Therefore, we see that the process we’ve used—`10/4`

, `9/3`

, and `7/2`

, which correspond to the
sum of the numbers divided by the length. Thus, writing the answers in
this form (as opposed, for instance, to writing the second of those as
`3`

) already reveals a structure for a solution.

#### 10.8` `Structural Problems with Accumulators

##### 10.8.1` ``my-running-sum`

: First Attempt

One more time, we’ll begin with an example.

Do Now!

Work out the results for

`my-running-sum`

starting from the list`[list: 1, 2, 3, 4, 5]`

.

```
check:
my-running-sum([list: 1, 2, 3, 4, 5]) is [list: 1, 3, 6, 10, 15]
my-running-sum([list: 2, 3, 4, 5]) is [list: 2, 5, 9, 14]
my-running-sum([list: 3, 4, 5]) is [list: 3, 7, 12]
end
```

```
my-running-sum([list: 1, 2, 3, 4, 5]) is [list: 1, 3, 6, 10, 15]
my-running-sum([list: 2, 3, 4, 5]) is [list: 2, 5, 9, 14]
my-running-sum([list: 3, 4, 5]) is [list: 3, 7, 12]
```

`link`

ing the first element to the front. In principle, we can
compute this solution directly, but for
now that may be more work than finding a simpler way to answer it.)##### 10.8.2` ``my-running-sum`

: Examples and Code

Recall how we began in `my-running-sum`

: First Attempt. Our
examples [<running-sum-egs-1>] showed the following
problem. When we process the rest of the list, we have forgotten
everything about what preceded it. That is, when processing the list
starting at `2`

we forget that we’ve seen a `1`

earlier;
when starting from `3`

, we forget that we’ve seen both `1`

and `2`

earlier; and so on. In other words, we keep
forgetting the past. We need some way of avoiding that.

`my-rs`

. It
will consume a list of numbers and produce a list of numbers, but in
addition it will also take the sum of numbers preceding the
current list.
Do Now!

What should the initial sum be?

`0`

. The type of `my-rs`

is
`my-rs :: Number, List<Number> -> List<Number>`

`my-rs`

instead. The examples use the `+`

operator to append two lists into one (the elements of the first list
followed by the elements of the second):
```
my-rs( 0, [list: 1, 2, 3, 4, 5]) is [list: 0 + 1] + my-rs( 0 + 1, [list: 2, 3, 4, 5])
my-rs( 1, [list: 2, 3, 4, 5]) is [list: 1 + 2] + my-rs( 1 + 2, [list: 3, 4, 5])
my-rs( 3, [list: 3, 4, 5]) is [list: 3 + 3] + my-rs( 3 + 3, [list: 4, 5])
my-rs( 6, [list: 4, 5]) is [list: 6 + 4] + my-rs( 6 + 4, [list: 5])
my-rs(10, [list: 5]) is [list: 10 + 5] + my-rs(10 + 5, [list: ])
my-rs(15, [list: ]) is empty
```

`my-rs`

translates into the following code:
```
fun my-rs(acc, l):
cases (List) l:
| empty => empty
| link(f, r) =>
new-sum = acc + f
link(new-sum, my-rs(new-sum, r))
end
end
```

`my-running-sum`

:
```
fun my-running-sum(l):
my-rs(0, l)
end
```

Observe that we do not change `my-running-sum`

itself to take
extra arguments. The correctness of our code depends on the initial
value of `acc`

being 0. If we added a parameter for `acc`

,
any code that calls `my-running-sum`

could supply an unexpected
value, which would distort the result. In addition, since the value is
fixed, adding the parameter would amount to shifting additional (and
needless) work onto others who use our code.

##### 10.8.3` ``my-alternating`

: Examples and Code

Recall our examples in `my-alternating`

:
Examples and Code. There, we
noticed that the code built on every-other example. We might have
chosen our examples differently, so that from one example to the next
we skipped two elements rather than one.
Here we will see another way to think about the same problem.

`my-alternating`

to traverse the list essentially two elements at a time. Another option is to traverse it just one
element at a time, but keeping track of whether we’re at an odd
or even element—`Boolean`

to do it. Let’s define a new function for this purpose:
`my-alt :: List<Any>, Boolean -> List<Any>`

`true`

we `link`

it to the answer;
otherwise we ignore it. As we process the rest of the list, however,
we have to remember to update the accumulator: if we kept an element
we don’t wish to keep the next one, and vice versa.
```
fun my-alt(l, keep):
cases (List) l:
| empty => empty
| link(f, r) =>
if keep:
link(f, my-alt(r, false))
else:
my-alt(r, true)
end
end
end
```

`true`

:
```
fun my-alternating(l):
my-alt(l, true)
end
```

Exercise

Define

`my-max`

using an accumulator. What does the accumulator represent? Do you encounter any difficulty?

#### 10.9` `Dealing with Multiple Answers

Our discussion above has assumed there is only one answer for a given input. This is often true, but it also depends on how the problem is worded and how we choose to generate examples. We will study this in some detail now.

##### 10.9.1` ``uniq`

: Problem Setup

Consider the task of writing `uniq`

:uniq is the
name of a Unix utility with similar behavior; hence the spelling of
the name. given a list of values, it produces a collection of the
same elements while avoiding any duplicates (hence `uniq`

, short
for “unique”).

Consider the following input: `[list: 1, 2, 1, 3, 1, 2, 4, 1]`

.

Do Now!

What is the sequence of examples this input generates? It’s really important you stop and try to do this by hand. As we will see there are multiple solutions, and it’s useful for you to consider what you generate. Even if you can’t generate a sequence, trying to do so will better prepare you for what you read next.

How did you obtain your example? If you just “thought about it for a moment and wrote something down”, you may or may not have gotten something you can turn into a program. Programs can only proceed systematically; they can’t “think”. So, hopefully you took a well-defined path to computing the answer.

##### 10.9.2` ``uniq`

: Examples

It turns out there are several possible answers, because we have (intentionally) left the problem unspecified. Suppose there are two instances of a value in the list; which one do we keep, the first or the second? On the one hand, since the two instances must be equivalent it doesn’t matter, but it does for writing concrete examples and deriving a solution.

```
examples:
uniq([list: 1, 2, 1, 3, 1, 2, 4, 1]) is [list: 3, 2, 4, 1]
uniq([list: 2, 1, 3, 1, 2, 4, 1]) is [list: 3, 2, 4, 1]
uniq([list: 1, 3, 1, 2, 4, 1]) is [list: 3, 2, 4, 1]
uniq([list: 3, 1, 2, 4, 1]) is [list: 3, 2, 4, 1]
uniq([list: 1, 2, 4, 1]) is [list: 2, 4, 1]
uniq([list: 2, 4, 1]) is [list: 2, 4, 1]
uniq([list: 4, 1]) is [list: 4, 1]
uniq([list: 1]) is [list: 1]
uniq([list: ]) is [list: ]
end
```

`uniq([list: 1, 2, 1, 3, 1, 2, 4, 1]) is [list: 1, 2, 3, 4]`

`uniq([list: 1, 2, 1, 3, 1, 2, 4, 1]) is [list: 4, 3, 2, 1]`

##### 10.9.3` ``uniq`

: Code

What is the systematic approach that gets us to this answer?
When given a non-empty list, we split it into its first element and
the rest of the list. Suppose we have the answer to `uniq`

applied to the rest of the list. Now we can ask: is the first element
in the rest of the list? If it is, then we can ignore it, since it is
certain to be in the `uniq`

of the rest of the list. If, however,
it is not in the rest of the list, it’s critical that we `link`

it to the answer.

This translates into the following program. For the empty list, we return the empty list. If the list is non-empty, we check whether the first is in the rest of the list. If it is not, we include it; otherwise we can ignore it for now.

```
fun uniq-rec(l :: List<Any>) -> List<Any>:
cases (List) l:
| empty => empty
| link(f, r) =>
if r.member(f):
uniq-rec(r)
else:
link(f, uniq-rec(r))
end
end
end
```

`uniq-rec`

instead of `uniq`

to
differentiate it from other versions of `uniq`

.Exercise

Note that we’re using

`.member`

to check whether an element is a member of the list. Write a function`member`

that consumes an element and a list, and tells us whether the element is a member of the list.

Exercise

Uniqueness checking has many practical applications. For example, one might have a list of names of people who have registered to vote in an election. To keep the voting fair, with only one vote allowed per person, we should remove duplicate names from the list.

Propose a set of examples for a function

`rem-duplicate-voters`

that takes a list of voter names and returns a list in which duplicate registrations have been removed. In developing your examples, consider real-world scenarios that you can imagine arising when identifying duplicate names. Can you identify cases in which two names might appear to be the same person, but not be? Cases in which two names might appear different but be referring to the same person?What might you need to change about our current

`uniq-rec`

function to handle a situation like removing duplicate voters?

Responsible Computing: Context Matters When Comparing Values

The data de-duplication context in the above exercise reminds us that different contexts may call for different notions of when two data values are the same. Sometimes, we want exact matching to determine that two strings are equal. Sometimes, we need methods that normalize data, either in simple ways like capitalization or subtler ways based on middle initials. Sometimes, we need more information (like street addresses in addition to names) in order to determine whether two items in a list should be considered “the same”.

It is easy to write programs that encode assumptions about our data that might not apply in practice. This is again a situation that can be helped by thinking about the concrete examples on which your code needs to work in context.

##### 10.9.4` ``uniq`

: Reducing Computation

```
fun uniq-rec2(l :: List<Any>) -> List<Any>:
cases (List) l:
| empty => empty
| link(f, r) =>
ur = uniq-rec2(r)
if r.member(f):
ur
else:
link(f, ur)
end
end
end
```

`uniq-rec(r)`

to before the
conditional, we have actually changed the program’s behavior in a
subtle way (by creating what are known as tail calls, which you
can learn about in a programming-languages course).You might think, because we replaced two function calls with one, that
we’ve reduced the amount of computation the program does. It does not!
The two function calls are both in the two branches of the same
conditional; therefore, for any given list element, only one or the
other call to `uniq`

happens. In fact, in both cases, there was
one call to `uniq`

before, and there is one now. So we have
reduced the number of calls in the source program, but not the number
that take place when the program runs. In that sense, the name of this
section was intentionally misleading!

`uniq-rec2`

. We currently check
whether `f`

is a member of `r`

, which is the list of
all the remaining elements. In our example, this means that in
the very second turn, we check whether `2`

is a member of the list
`[list: 1, 3, 1, 2, 4, 1]`

. This is a list of six elements,
including three copies of `1`

. We compare `2`

against
two copies of `1`

. However, we gain nothing from the
second comparison. Put differently, we can think of `uniq(r)`

as
a “summary” of the rest of the list that is exactly as good as
`r`

itself for checking membership, with the advantage that it
might be significantly shorter. This, of course, is exactly what
`ur`

represents. Therefore, we can encode this intuition as
follows:
```
fun uniq-rec3(l :: List<Any>) -> List<Any>:
cases (List) l:
| empty => empty
| link(f, r) =>
ur = uniq-rec3(r)
if ur.member(f):
ur
else:
link(f, ur)
end
end
end
```

`ur`

rather than in `r`

.Exercise

Later [Predicting Growth] we will study how to formally study how long a program takes to run. By the measure introduced in that section, does the change we just made make any difference? Be careful with your answer: it depends on how we count “the length” of the list.

Observe that if the list never contained duplicates in the first
place, then it wouldn’t matter which list we check membership in—`uniq`

in the first place! We will return to the issue of
lists and duplicate elements in Sets Appeal.

##### 10.9.5` ``uniq`

: Example and Code Variations

Start with the entire given list and with the empty answer (so far).

For each list element, check whether it’s already in the answer so far. If it is, ignore it, otherwise extend the answer with it.

When there are no more elements in the list, the answer so far is the answer for the whole list.

##### 10.9.6` ``uniq`

: Why Produce a List?

If you go back to the original statement of the `uniq`

problem
[`uniq`

: Problem Setup], you’ll notice it said nothing about what order the
output should have; in fact, it didn’t even say the output needs to be
a list (and hence have an order). In that case, we should think about
whether a list even makes sense for this problem. In fact, if we don’t
care about order and don’t want duplicates (by definition of
`uniq`

), then there is a much simpler solution, which is to
produce a set. Pyret already has sets built in, and converting
the list to a set automatically takes care of duplicates. This is of
course cheating from the perspective of learning how to write
`uniq`

, but it is worth remembering that sometimes the right data
structure to produce isn’t necessarily the same as the one we were
given. Also, later [Sets Appeal], we will see how to build sets
for ourselves (at which point, `uniq`

will look familiar, since
it is at the heart of set-ness).

#### 10.10` `Monomorphic Lists and Polymorphic Types

```
my-len :: List<Any> -> Number
my-max :: List<Any> -> Any
```

`my-max`

. The contract suggests that any kind of element can be
in the input list, but in fact that isn’t true: the input
`[list: 1, "two", 3]`

is not valid, because we can’t compare
`1`

with `"two"`

or `"two"`

with `3`

.Exercise

What happens if we run

`1 > "two"`

or`"two" > 3`

?

Rather, what we mean is a list where all the elements are of the
same kind,Technically, elements that are also comparable.
and the contract has not captured that. Furthermore, we don’t mean
that `my-max`

might return any old type: if we supply it with a
list of numbers, we will not get a string as the maximum element!
Rather, it will only return the kind of element that is in the
provided list.

In short, we mean that all elements of the list are of the same type,
but they can be of any type. We call the former monomorphic:
“mono” meaning one, and “morphic” meaning shape, i.e., all values
have one type. But the function `my-max`

itself can operate over
many of these kinds of lists, so we call it polymorphic
(“poly” meaning many).

`my-max`

will
return an element of that type. We write this syntactically as
follows:
`fun my-max<T>(l :: List<T>) -> T: … end`

`<T>`

says that `T`

is a type variable
parameter that will be used in the rest of the function (both the
header and the body).`my-len`

. Its header now
becomes:
`fun my-len<T>(l :: List<T>) -> Number: … end`

`my-len`

did not actually “care” that whether all the
values were of the same type or not: it never looks at the individual
elements, much less at pairs of them. However, as a convention
we demand that lists always be monomorphic. This is important because
it enables us to process the elements of the list uniformly: if we
know how to process elements of type `T`

, then we will know how
to process a `List<T>`

. If the list elements can be of truly any
old type, we can’t know how to process its elements.