11.3 Arrays and Lists in Memory

In Understanding Equality, we drew memory diagrams to show how values appear in the heap. At that time, we looked only at structured data. Now we will look at aggregate data: lists in Python and arrays in Pyret.

Both Python lists and Pyret arrays are stored in memory with a starting value that indicates how many values there are, followed by the actual elements in subsequent addresses. For instance, suppose we write the following value:

Python

sl = ['bread', 'coffee', 'eggs']

Pyret

sl = [array: 'bread', 'coffee', 'eggs']

Here’s what memory might look like:

Directory

  • sl

      1001

Heap

  • 1001: (length:3)

  • 1002: "bread"

  • 1003: "coffee"

  • 1004: "eggs"

If these terms mean anything to you: Python’s default list implementation is array-based, not a linked list. So at 1001 there’s an indication of how many entries follow, and the next memory locations have those values. There are no directory entries for the individual elements, but they can be reached by referring to sl followed by an offset: for instance,

Python

Pyret

sl[2]

sl.get-now(2)

Internally, this turns into: “obtain the address of sl, then add 1 and the offset 2 to it”. This produces the address 1001 + 1 + 2 = 1004. Looking up the value stored at address 1004, in both languages, produces 'eggs'.You may find it odd that the offsets begin at 0. While it is indeed confusing—the “first” value is at offset 0, the “third” value at offset 2, and so on—this is a convention both Python and Pyret chose to be consistent with most other programming languages.

Similarly, suppose we try to change the shopping list’s content to replace the second value with 'tea'. Recall that the second value is at offset 1:

Python

Pyret

sl[1] = 'tea'

sl.set-now(1, 'tea')

Again, we obtain the address of sl, which is 1001; add 1 and the offset (1) to it; this gives us the address 1003. Now, we modify the value at 1003 to be the new value:

Directory

  • sl

      1001

Heap

  • 1001: (length:3)

  • 1002: "bread"

  • 1003: "tea"

  • 1004: "eggs"

If we now ask the language for the list as a whole, we see the change: in Python,

sl

['bread', 'tea', 'eggs']

and in Pyret,

sl

[array: "bread", "tea", "eggs"]

Observe that what we have learned about aliasing applies here, too. Suppose Shaunae and Jonella share a shopping list, where sl is Shaunae’s and Jonella writes:

Python

Pyret

jl = sl

jl = sl

Now jl and sl are aliases for the same data in the heap:

Directory

  • sl

      1001

  • jl

      1001

Heap

  • 1001: (length:3)

  • 1002: "bread"

  • 1003: "tea"

  • 1004: "eggs"

Thus, modifying the list jl has the same impact as modifying it via sl:

Python

Pyret

jl[0] = 'butter'

jl.set-now(0, 'butter')

means the memory looks like

Directory

  • sl

      1001

  • jl

      1001

Heap

  • 1001: (length:3)

  • 1002: "butter"

  • 1003: "tea"

  • 1004: "eggs"

Thus, if we ask for the value of jl, we will see in Python:

sl

['butter', 'tea', 'eggs']

and in Pyret:

sl

[array: "butter", "tea", "eggs"]

even though we did not make a modification through the name sl.