23 Staging

23.1 Problem Definition

Earlier, we saw a detailed development of binary trees representing ancestry [Creating a Datatype for Ancestor Trees]. In what follows we don’t need a lot of detail, so we will give ourselves a simplified version of essentially the same data definition:

data ABT:
  | unknown
  | person(name :: String, bm :: ABT, bf :: ABT)
end

We can then write functions over such as this:

fun abt-size(p :: ABT):
  doc: "Compute the number of known people in the ancestor tree"
  cases (ABT) p:
    | unknown => 0
    | person(n, p1, p2) => 1 + abt-size(p1) + abt-size(p2)
  end
end

Now let’s think about a slightly different function: how-many-named, which tells us how many people in a family have a particular name. Not only can more than one person have the same name, in some cultures it’s not uncommon to use the same name across generations, either in successive generations or skipping one.

Do Now!
What is the contract for how-many-named? The contract for this function will be crucial, so make sure you do this step!

Here is one meaningful contract:

how-many-named :: ABT, String -> Number

It takes a tree in which to search, a name to search for, and returns a count.

Do Now!
Define how-many-named.

23.2 Initial Solution

Presumably you ended up with something like this:

fun how-many-named(p :: ABT, looking-for :: String):
  cases (ABT) p:
    | unknown => 0
    | person(n, p1, p2) =>
      if n == looking-for:
        1 + how-many-named(p1, looking-for) + how-many-named(p2, looking-for)
      else:
        how-many-named(p1, looking-for) + how-many-named(p2, looking-for)
      end
  end
end

Let’s say you have defined this person:

p =
  person("A",
    person("B",
      person("A", unknown, unknown),
      person("C",
        person("A", unknown, unknown),
        person("B", unknown, unknown))),
    person("C", unknown, unknown))

With that, we can write a test like

check:
  how-many-named(p, "A") is 3
end

23.3 Refactoring

Now let’s apply some transformations, sometimes called code refactorings, to this function.

First, notice the repeated expression. What the whole conditional is essentially saying is that we want to know how much this person is contributing to the overall count; the rest of the count stays the same regardless. We can therefore write this more concisely (and, if we know how to read such code, more meaningfully) as the following instead:

fun how-many-named(p :: ABT, looking-for :: String):
  cases (ABT) p:
    | unknown => 0
    | person(n, p1, p2) =>
      (if n == looking-for: 1 else: 0 end)
      +
      how-many-named(p1, looking-for) + how-many-named(p2, looking-for)
  end
end

If you have prior programming experience, this may look a bit odd to you, but if is in fact an expression, which has a value; in this case the value is either 0 or 1. This value can then be used in an addition.

Now let’s look at this code even more closely. Notice something interesting. We keep passing two parameters to how-many-named; however, only one of those parameters (p) is actually changing. The name we are looking for does not change, as we would expect: we are looking for the same name in the entire tree. How can we reflect this in the code?

First, we’ll do something that looks a little useless, but it’s also an innocent change, so it shouldn’t irk us too much: we’ll change the order of the arguments.

fun how-many-named(looking-for :: String, p :: ABT):
  cases (ABT) p:
    | unknown => 0
    | person(n, p1, p2) =>
      (if n == looking-for: 1 else: 0 end)
      +
      how-many-named(looking-for, p1) + how-many-named(looking-for, p2)
  end
end

What we have now done is put the “constant” argument first, and the “varying” argument second. That is, our contract has changed from

how-many-named :: ABT, String -> Number

how-many-named :: String, ABT -> Number

so the way we call it also has to change: how-many-named("A", p) instead. Observe that the function calls inside how-many-named have also been altered in the same way.

23.4 Separating Parameters

This sets us up for the next stage. The parameters of functions are meant to indicate what might vary in a function. Because the name we’re looking for is a constant once we initially have it, we’d like the actual search function to take only one argument: where in the tree we’re searching. That is, we want its contract to be (ABT -> Number). To achieve that, we need another function that will take the String part.

We achieve this by breaking up the parameter list:

fun how-many-named(looking-for :: String) -> (ABT -> Number):
  lam(p :: ABT) -> Number:
    cases (ABT) p:
      | unknown => 0
      | person(n, p1, p2) =>
        (if n == looking-for: 1 else: 0 end)
        +
        how-many-named(looking-for, p1) + how-many-named(looking-for, p2)
    end
  end
end

Thus, the contract has become

how-many-named :: String -> (ABT -> Number)

where how-many-named consumes a name and returns a function that will consume the actual tree to check.

However, this function body is not okay: the Pyret type-checker will give us type errors. That’s because how-many-named takes one parameter, not two, as in the two recursive calls.

How do we fix this? Remember, the whole point of this change is we don’t want to change the name, only the tree. That means we want to recur on the inner function. We currently can’t do this because it doesn’t have a name! So we have to give it a name:

fun how-many-named(looking-for :: String) -> (ABT -> Number):
  fun search-in(p :: ABT) -> Number:
    cases (ABT) p:
      | unknown => 0
      | person(n, p1, p2) =>
        (if n == looking-for: 1 else: 0 end)
        +
        search-in(p1) + search-in(p2)
    end
  end
end

This now lets us recur on just the part that should vary, leaving the name we’re looking for unchanged (and hence, fixed for the duration of the search). However, the above body has a syntax error, because how-many-named does not seem to return any kind of value.

What should it return? Once we provide the function with a name, we should get back a function that searches for that name in a tree. But we already have exactly such a function: search-in. Therefore, how-many-named should just … return search-in.

fun how-many-named(looking-for :: String) -> (ABT -> Number):
  fun search-in(p :: ABT) -> Number:
    cases (ABT) p:
      | unknown => 0
      | person(n, p1, p2) =>
        (if n == looking-for: 1 else: 0 end)
        +
        search-in(p1) + search-in(p2)
    end
  end
  search-in
end

And we’re set! We would call this function as

how-many-named("A")(p) is 3

The outer function application returns a function value (bound to search-in), is then applied to the particular tree.

23.5 Context

The transformation we just applied is generally called currying, in honor of Haskell Curry, who was one of the early people to describe it, though it was earlier discovered by Moses Schönfinkel and even earlier by Gottlob Frege. The particular use of currying here, where we move more “static” arguments earlier and more “dynamic” ones later, and split on the static-dynamic divide, is called staging. It’s a very useful programming technique, and furthermore, one that enables some compilers to produce more time-efficient programs.

Even more subtly but importantly, the staged computation tells a different story than the unstaged one, and we can read this off just from the contract:

how-many-named :: (String, ABT -> Number)
how-many-named :: (String -> (ABT -> Number))

The first one says the string could co-vary with the person. The second one rules out that interpretation.

Do Now!
Is the former useful? When might we have the name also changing?

Imagine a slightly different problem: we want to know how often a child has the same name as a parent. Then, as we traverse the tree, as the name of the person (potentially) keeps changing, the name we’re looking for in the parent also changes.

Exercise
Write this function.

In contrast, the staged type rules out that interpretation and that behavior. In that way, it sends a signal to the reader about how the computation might behave just from the type. In the same way, the unstaged type can be read as giving the reader a hint that the behavior could depend on both parameters changing, therefore accommodating a much broader range of behaviors (e.g., checking for parent-child or grandparent-child name reuse).

There’s another very nice example of staging here: A Little Calculus.

Finally, it’s worth knowing that some languages, like Haskell and OCaml, do this transformation automatically. In fact, they don’t even have multiple-parameter functions: what look like multiple arguments are actually a sequence of staged functions. This can, in extremis, lead to a very elegant and powerful programming style. Pyret chose to not do this because, while this is a powerful tool in the hands of advanced programmers, for less experienced programmers, finding out about a mismatch in the number of parameters and arguments is very useful.

contents ← prev up next →

I	Introduction
II	Introduction to Programming
III	From Pyret to Python
IV	Programming With State
V	Algorithm Analysis
VI	Data Structures with Analysis
VII	Advanced Topics
VIII	Interactive Programs
IX	Appendices

18	State and Equality
19	Recursion and Cycles from Mutation
20	Detecting Cycles
21	Avoiding Recomputation by Remembering Answers
22	Partial Domains
23	Staging
24	Factoring Numbers
25	Deconstructing Loops

23.1	Problem Definition
23.2	Initial Solution
23.3	Refactoring
23.4	Separating Parameters
23.5	Context