### 11Introduction to Structured Data

Earlier we had our first look at types. Until now, we have only seen the types that Pyret provides us, which is an interesting but nevertheless quite limited set. Most programs we write will contain many more kinds of data.

#### 11.1Understanding the Kinds of Compound Data

##### 11.1.1A First Peek at Structured Data

There are times when a datum has many attributes, or parts. We need to keep them all together, and sometimes take them apart. For instance:
• An iTunes entry contains a bunch of information about a single song: not only its name but also its singer, its length, its genre, and so on.

• Your GMail application contains a bunch of information about a single message: its sender, the subject line, the conversation it’s part of, the body, and quite a bit more.

In examples like this, we see the need for structured data: a single datum has structure, i.e., it actually consists of many pieces. The number of pieces is fixed, but may be of different kinds (some might be numbers, some strings, some images, and different types may be mixed together in that one datum). Some might even be other structured data: for instance, a date usually has at least three parts, the day, month, and year. The parts of a structured datum are called its fields.

##### 11.1.2A First Peek at Conditional Data

Then there are times when we want to represent different kinds of data under a single, collective umbrella. Here are a few examples:
• A traffic light can be in different states: red, yellow, or green.Yes, in some countries there are different or more colors and color-combinations. Collectively, they represent one thing: a new type called a traffic light state.

• A zoo consists of many kinds of animals. Collectively, they represent one thing: a new type called an animal. Some condition determines which particular kind of animal a zookeeper might be dealing with.

• A social network consists of different kinds of pages. Some pages represent individual humans, some places, some organizations, some might stand for activities, and so on. Collectively, they represent a new type: a social media page.

• A notification application may report many kinds of events. Some are for email messages (which have many fields, as we’ve discussed), some are for reminders (which might have a timestamp and a note), some for instant messages (similar to an email message, but without a subject), some might even be for the arrival of a package by physical mail (with a timestamp, shipper, tracking number, and delivery note). Collectively, these all represent a new type: a notification.

We call these “conditional” data because they represent an “or”: a traffic light is red or green or yellow; a social medium’s page is for a person or location or organization; and so on. Sometimes we care exactly which kind of thing we’re looking at: a driver behaves differently on different colors, and a zookeeper feeds each animal differently. At other times, we might not care: if we’re just counting how many animals are in the zoo, or how many pages are on a social network, or how many unread notifications we have, their details don’t matter. Therefore, there are times when we ignore the conditional and treat the datum as a member of the collective, and other times when we do care about the conditional and do different things depending on the individual datum. We will make all this concrete as we start to write programs.

#### 11.2Defining and Creating Structured and Conditional Data

We have used the word “data” above, but that’s actually been a bit of a lie. As we said earlier, data are how we represent information in the computer. What we’ve been discussing above is really different kinds of information, not exactly how they are represented. But to write programs, we must wrestle concretely with representations. That’s what we will do now, i.e., actually show data representations of all this information.

##### 11.2.1Defining and Creating Structured Data

Let’s start with defining structured data, such as an iTunes song record. Here’s a simplified version of the information such an app might store:
• The song’s name, which is a String.

• The song’s singer, which is also a String.

• The song’s year, which is a Number.

Let’s now introduce the syntax by which we can teach this to Pyret:

data ITunesSong: song(name, singer, year) end

This tells Pyret to introduce a new type of data, in this case called ITunesSongWe follow a convention that types always begin with a capital letter.. The way we actually make one of these data is by calling song with three parameters; for instance:It’s worth noting that music managers that are capable of making distinctions between, say, Dance, Electronica, and Electronic/Dance, classify two of these three songs by a single genre: “World”.
<structured-examples> ::=

song("La Vie en Rose", "Édith Piaf", 1945)
song("Stressed Out", "twenty one pilots", 2015)
song("Waqt Ne Kiya Kya Haseen Sitam", "Geeta Dutt", 1959)

Always follow a data definition with a few concrete instances of the data! This makes sure you actually do know how to make data of that form. Indeed, it’s not essential but a good habit to give names to the data we’ve defined, so that we can use them later:

lver = song("La Vie en Rose", "Édith Piaf", 1945)
so = song("Stressed Out", "twenty one pilots", 2015)
wnkkhs = song("Waqt Ne Kiya Kya Haseen Sitam", "Geeta Dutt", 1959)

In terms of the directory, structured data are no different from simple data. Each of the three definitions above creates an entry in the directory, as follows:

Directory

• lver

song("La Vie en Rose", "Édith Piaf", 1945)

• so

song("Stressed Out", "twenty one pilots", 2015)

• wnkkhs

song("Waqt Ne Kiya Kya Haseen Sitam","Geeta Dutt", 1959)

##### 11.2.2Annotations for Structured Data

Recall that in [Type Annotations] we discussed annotating our functions. Well, we can annotate our data, too! In particular, we can annotate both the definition of data and their creation. For the former, consider this data definition, which makes the annotation information we’d recorded informally in text a formal part of the program:

data ITunesSong: song(name :: String, singer :: String, year :: Number) end

Similarly, we can annotate the variables bound to examples of the data. But what should we write here?

lver :: ___ = song("La Vie en Rose", "Édith Piaf", 1945)

Recall that annotations takes names of types, and the new type we’ve created is called ITunesSong. Therefore, we should write

lver :: ITunesSong = song("La Vie en Rose", "Édith Piaf", 1945)

Do Now!

What happens if we instead write this?

lver :: String = song("La Vie en Rose", "Édith Piaf", 1945)

What error do we get? How about if instead we write these?

lver :: song = song("La Vie en Rose", "Édith Piaf", 1945)
lver :: 1 = song("La Vie en Rose", "Édith Piaf", 1945)

Make sure you familiarize yourself with the error messages that you get.

##### 11.2.3Defining and Creating Conditional Data

The data construct in Pyret also lets us create conditional data, with a slightly different syntax. For instance, say we want to define the colors of a traffic light:

data TLColor:
| Red
| Yellow
| Green
end

Conventionally, the names of the options begin in lower-case, but if they have no additional structure, we often capitalize the initial to make them look different from ordinary variables: i.e., Red rather than red. Each | (pronounced “stick”) introduces another option. You would make instances of traffic light colors as

Red
Green
Yellow

A more interesting and common example is when each condition has some structure to it; for instance:

data Animal:
| boa(name :: String, length :: Number)
| armadillo(name :: String, liveness :: Boolean)
end

“In Texas, there ain’t nothin’ in the middle of the road except yellow stripes and a dead armadillo.”—Jim Hightower We can make examples of them as you would expect:

b1 = boa("Ayisha", 10)
b2 = boa("Bonito", 8)
a1 = armadillo("Glypto", true)

We call the different conditions variants.

Do Now!

How would you annotate the three variable bindings?

Notice that the distinction between boas and armadillos is lost in the annotation.

b1 :: Animal = boa("Ayisha", 10)
b2 :: Animal = boa("Bonito", 8)
a1 :: Animal = armadillo("Glypto", true)

When defining a conditional datum the first stick is actually optional, but adding it makes the variants line up nicely. This helps us realize that our first example

data ITunesSong: song(name, singer, year) end

is really just the same as

data ITunesSong:
| song(name, singer, year)
end

i.e., a conditional type with just one condition, where that one condition is structured.

#### 11.3Programming with Structured and Conditional Data

So far we’ve learned how to create structured and conditional data, but not yet how to take them apart or write any expressions that involve them. As you might expect, we need to figure out how to
• take apart the fields of a structured datum, and

• tell apart the variants of a conditional datum.

As we’ll see, Pyret also gives us a convenient way to do both together.

##### 11.3.1Extracting Fields from Structured Data

Let’s write a function that tells us how old a song is. First, let’s think about what the function consumes (an ITunesSong) and produces (a Number). This gives us a rough skeleton for the function:
<song-age> ::=

fun song-age(s :: ITunesSong) -> Number:
<song-age-body>
end

We know that the form of the body must be roughly:
<song-age-body> ::=

2016 - <get the song year>

We can get the song year by using Pyret’s field access, which is a . followed by a field’s name—in this case, yearfollowing the variable that holds the structured datum. Thus, we get the year field of s (the parameter to song-age) with

s.year

So the entire function body is:

fun song-age(s :: ITunesSong) -> Number:
2016 - s.year
end

It would be good to also record some examples (<structured-examples>), giving us a comprehensive definition of the function:

fun song-age(s :: ITunesSong) -> Number:
2016 - s.year
where:
song-age(lver) is 71
song-age(so) is 1
song-age(wnkkhs) is 57
end

##### 11.3.2Telling Apart Variants of Conditional Data

Now let’s see how we tell apart variants. For this, we again use cases, as we saw for lists. We create one branch for each of the variants. Thus, if we wanted to compute advice for a driver based on a traffic light’s state, we might write:

fun advice(c :: TLColor) -> String:
cases (TLColor) c:
| Red => "wait!"
| Green => "go!"
end
end

Do Now!

What happens if you leave out the =>?

Do Now!

What if you leave out a variant? Leave out the Red variant, then try both advice(Yellow) and advice(Red).

##### 11.3.3Processing Fields of Variants

In this example, the variants had no fields. But if the variant has fields, Pyret expects you to list names of variables for those fields, and will then automatically bind those variables—so you don’t need to use the .-notation to get the field values.

To illustrate this, assume we want to get the name of any animal:
<animal-name> ::=

fun animal-name(a :: Animal) -> String:
<animal-name-body>
end

Because an Animal is conditionally defined, we know that we are likely to want a cases to pull it apart; furthermore, we should give names to each of the fields:Note that the names of the variables do not have to match the names of fields. Conventionally, we give longer, descriptive names to the field definitions and short names to the corresponding variables.

<animal-name-body> ::=

cases (Animal) a:
| boa(n, l) => ...
end

In both cases, we want to return the field n, giving us the complete function:

fun animal-name(a :: Animal) -> String:
cases (Animal) a:
| boa(n, l) => n
end
where:
animal-name(b1) is "Ayisha"
animal-name(b2) is "Bonito"
animal-name(a1) is "Glypto"
end

Let’s look at how Pyret would evaluate a function call like

animal-name(boa("Bonito", 8))

The argument boa("Bonito", 8) is a value. In the same way as we substitute simple data types like strings and numbers for parameters when we evaluate a function, we do the same thing here. After substituting, we are left with the following expression to evaluate:

cases (Animal) boa("Bonito", 8):
| boa(n, l) => n
end
Next, Pyret determines which case matches the data (the first one, for boa, in this case). It then substitutes the field names with the corresponding components of the datum result expression for the matched case. In this case, we will substitute uses of n with "Bonito" and uses of l with 8. In this program, the entire result expression is a use of n, so the result of the program in this case is "Bonito".