On this page:
A Data-Centric Introduction to Computing
Sunday, August 28th, 2022 11:51:31am

A Data-Centric Introduction to Computing

Kathi Fisler, Shriram Krishnamurthi, Benjamin S. Lerner, Joe Gibbs Politz

    I Introduction

      1 Overview

        1.1 What This Book is About

        1.2 The Values That Drive This Book

        1.3 Our Perspective on Data

        1.4 What Makes This Book Unique

        1.5 Who This Book is For

        1.6 The Structure of This Book

        1.7 Organization of the Material

        1.8 Our Programming Language Choice

        1.9 Sending Feedback, Errors, and Comments

      2 Acknowledgments

    II Foundations

      3 Getting Started

        3.1 Motivating Example: Flags

        3.2 Numbers

        3.3 Expressions

        3.4 Terminology

        3.5 Strings

        3.6 Images

          3.6.1 Combining Images

          3.6.2 Making a Flag

        3.7 Stepping Back: Types, Errors, and Documentation

          3.7.1 Types and Contracts

          3.7.2 Format and Notation Errors

          3.7.3 Finding Other Functions: Documentation

      4 Naming Values

        4.1 The Definitions Pane

        4.2 Naming Values

          4.2.1 Names Versus Strings

          4.2.2 Expressions versus Statements

        4.3 The Program Directory

          4.3.1 Understanding the Run Button

        4.4 Using Names to Streamline Building Images

      5 From Repeated Expressions to Functions

        5.1 Example: Similar Flags

        5.2 Defining Functions

          5.2.1 How Functions Evaluate

          5.2.2 Type Annotations

          5.2.3 Documentation

        5.3 Functions Practice: Moon Weight

        5.4 Documenting Functions with Examples

        5.5 Functions Practice: Cost of pens

        5.6 Recap: Defining Functions

      6 Conditionals and Booleans

        6.1 Motivating Example: Shipping Costs

        6.2 Conditionals: Computations with Decisions

        6.3 Booleans

          6.3.1 Other Boolean Operations

          6.3.2 Combining Booleans

        6.4 Asking Multiple Questions

        6.5 Evaluating by Reducing Expressions

        6.6 Composing Functions

          6.6.1 How Function Compositions Evaluate

          6.6.2 Function Composition and the Directory

        6.7 Nested Conditionals

        6.8 Recap: Booleans and Conditionals

      7 Introduction to Tabular Data

        7.1 Creating Tabular Data

        7.2 Extracting Rows and Cell Values

        7.3 Functions over Rows

        7.4 Processing Rows

          7.4.1 Finding Rows

          7.4.2 Ordering Rows

          7.4.3 Adding New Columns

          7.4.4 Calculating New Column Values

        7.5 Examples for Table-Producing Functions

      8 Processing Tables

        8.1 Cleaning Data Tables

          8.1.1 Loading Data Tables

          8.1.2 Dealing with Missing Entries

          8.1.3 Normalizing Data

          8.1.4 Normalization, Systematically

            8.1.4.1 Using Programs to Detect Data Errors

        8.2 Task Plans

        8.3 Preparing Data Tables

          8.3.1 Creating bins

          8.3.2 Splitting Columns

        8.4 Managing and Naming Data Tables

        8.5 Visualizations and Plots

        8.6 Summary: Managing a Data Analysis

      9 From Tables to Lists

        9.1 Basic Statistical Questions

        9.2 Extracting a Column from a Table

        9.3 Understanding Lists

          9.3.1 Lists as Anonymous Data

          9.3.2 Creating Literal Lists

        9.4 Operating on Lists

          9.4.1 Built-In Operations on Lists of Numbers

          9.4.2 Built-In Operations on Lists in General

          9.4.3 An Aside on Naming Conventions

          9.4.4 Getting Elements By Position

          9.4.5 Transforming Lists

          9.4.6 Recap: Summary of List Operations

        9.5 Lambda: Anonymous Functions

        9.6 Combining Lists and Tables

      10 Processing Lists

        10.1 Making Lists and Taking Them Apart

        10.2 Some Example Exercises

        10.3 Structural Problems with Scalar Answers

          10.3.1 my-len: Examples

          10.3.2 my-sum: Examples

          10.3.3 From Examples to Code

        10.4 Structural Problems that Transform Lists

          10.4.1 my-doubles: Examples and Code

          10.4.2 my-str-len: Examples and Code

        10.5 Structural Problems that Select from Lists

          10.5.1 my-pos-nums: Examples and Code

          10.5.2 my-alternating: Examples and Code

        10.6 Structural Problems Over Relaxed Domains

          10.6.1 my-max: Examples

          10.6.2 my-max: From Examples to Code

        10.7 More Structural Problems with Scalar Answers

          10.7.1 my-avg: Examples

        10.8 Structural Problems with Accumulators

          10.8.1 my-running-sum: First Attempt

          10.8.2 my-running-sum: Examples and Code

          10.8.3 my-alternating: Examples and Code

        10.9 Dealing with Multiple Answers

          10.9.1 uniq: Problem Setup

          10.9.2 uniq: Examples

          10.9.3 uniq: Code

          10.9.4 uniq: Reducing Computation

          10.9.5 uniq: Example and Code Variations

          10.9.6 uniq: Why Produce a List?

        10.10 Monomorphic Lists and Polymorphic Types

      11 Introduction to Structured Data

        11.1 Understanding the Kinds of Compound Data

          11.1.1 A First Peek at Structured Data

          11.1.2 A First Peek at Conditional Data

        11.2 Defining and Creating Structured and Conditional Data

          11.2.1 Defining and Creating Structured Data

          11.2.2 Annotations for Structured Data

          11.2.3 Defining and Creating Conditional Data

        11.3 Programming with Structured and Conditional Data

          11.3.1 Extracting Fields from Structured Data

          11.3.2 Telling Apart Variants of Conditional Data

          11.3.3 Processing Fields of Variants

      12 Collections of Structured Data

        12.1 Lists as Collective Data

        12.2 Sets as Collective Data

          12.2.1 Picking Elements from Sets

          12.2.2 Computing with Sets

        12.3 Combining Structured and Collective Data

        12.4 Data Design Problem: Representing Quizzes

      13 Recursive Data

        13.1 Functions to Process Recursive Data

        13.2 A Template for Processing Recursive Data

      14 Trees

        14.1 Data Design Problem – Ancestry Data

          14.1.1 Computing Genetic Parents from an Ancestry Table

          14.1.2 Computing Grandparents from an Ancestry Table

          14.1.3 Creating a Datatype for Ancestor Trees

        14.2 Programs to Process Ancestor Trees

        14.3 Summarizing How to Approach Tree Problems

        14.4 Study Questions

      15 Functions as Data

        15.1 A Little Calculus

        15.2 A Helpful Shorthand for Anonymous Functions

        15.3 Streams From Functions

        15.4 Combining Forces: Streams of Derivatives

      16 Interactive Games as Reactive Systems

        16.1 About Reactive Animations

        16.2 Preliminaries

        16.3 Version: Airplane Moving Across the Screen

          16.3.1 Updating the World State

          16.3.2 Displaying the World State

          16.3.3 Observing Time (and Combining the Pieces)

        16.4 Version: Wrapping Around

        16.5 Version: Descending

          16.5.1 Moving the Airplane

          16.5.2 Drawing the Scene

          16.5.3 Finishing Touches

        16.6 Version: Responding to Keystrokes

        16.7 Version: Landing

        16.8 Version: A Fixed Balloon

        16.9 Version: Keep Your Eye on the Tank

        16.10 Version: The Balloon Moves, Too

        16.11 Version: One, Two, ..., Ninety-Nine Luftballons!

      17 Examples, Testing, and Program Checking

        17.1 From Examples to Tests

        17.2 More Refined Comparisons

        17.3 When Tests Fail

        17.4 Oracles for Testing

    III Foundations: Bonus Materials

      18 Partial Domains

        18.1 A Non-Solution

        18.2 Exceptions

        18.3 The Option Type

        18.4 Total Domains, Dynamically

        18.5 Total Domains, Statically

        18.6 Summary

        18.7 A Note on Notation

      19 Staging

      20 Orderability and Hashing

        20.1 Ordering and Equality

        20.2 Remembering Where We Came From

        20.3 Hashing in Practice

      21 Queues from Lists

        21.1 Using a Wrapper Datatype

        21.2 Combining Answers

        21.3 Using a Picker

        21.4 Using Tuples

        21.5 A Picker Method

    IV Algorithms

      22 Predicting Growth

        22.1 A Little (True) Story

        22.2 The Analytical Idea

        22.3 A Cost Model for Pyret Running Time

        22.4 The Size of the Input

        22.5 The Tabular Method for Singly-Structurally-Recursive Functions

        22.6 Creating Recurrences

        22.7 A Notation for Functions

        22.8 Comparing Functions

        22.9 Combining Big-Oh Without Woe

        22.10 Solving Recurrences

      23 Sets Appeal

        23.1 Representing Sets by Lists

          23.1.1 Representation Choices

          23.1.2 Time Complexity

          23.1.3 Choosing Between Representations

          23.1.4 Other Operations

        23.2 Making Sets Grow on Trees

          23.2.1 Converting Values to Ordered Values

          23.2.2 Using Binary Trees

          23.2.3 A Fine Balance: Tree Surgery

            23.2.3.1 Left-Left Case

            23.2.3.2 Left-Right Case

            23.2.3.3 Any Other Cases?

      24 Halloween Analysis

        24.1 A First Example

        24.2 The New Form of Analysis

        24.3 An Example: Queues from Lists

          24.3.1 List Representations

          24.3.2 A First Analysis

          24.3.3 More Liberal Sequences of Operations

          24.3.4 A Second Analysis

          24.3.5 Amortization Versus Individual Operations

        24.4 Reading More

      25 Sharing and Equality

        25.1 Re-Examining Equality

        25.2 The Cost of Evaluating References

        25.3 Notations for Equality

        25.4 On the Internet, Nobody Knows You’re a DAG

        25.5 It’s Always Been a DAG

        25.6 From Acyclicity to Cycles

      26 Graphs

        26.1 Understanding Graphs

        26.2 Representations

          26.2.1 Links by Name

          26.2.2 Links by Indices

          26.2.3 A List of Edges

          26.2.4 Abstracting Representations

        26.3 Measuring Complexity for Graphs

        26.4 Reachability

          26.4.1 Simple Recursion

          26.4.2 Cleaning up the Loop

          26.4.3 Traversal with Memory

          26.4.4 A Better Interface

        26.5 Depth- and Breadth-First Traversals

        26.6 Graphs With Weighted Edges

        26.7 Shortest (or Lightest) Paths

        26.8 Moravian Spanning Trees

          26.8.1 The Problem

          26.8.2 A Greedy Solution

          26.8.3 Another Greedy Solution

          26.8.4 A Third Solution

          26.8.5 Checking Component Connectedness

    V Algorithms: Bonus Materials

      27 The Size of a DAG

        27.1 Stage 1

        27.2 Stage 2

        27.3 Stage 3

        27.4 Stage 4

        27.5 Stage 5

        27.6 What We’ve Learned

        27.7 More on Value Printing: An Aside from Racket

    VI From Pyret to Python

      28 From Pyret to Python

        28.1 Expressions, Functions, and Types

        28.2 Returning Values from Functions

        28.3 Examples and Test Cases

        28.4 An Aside on Numbers

        28.5 Conditionals

        28.6 Creating and Processing Lists

          28.6.1 Filters, Maps, and Friends

        28.7 Data with Components

          28.7.1 Accessing Fields within Dataclasses

        28.8 Traversing Lists

          28.8.1 Introducing For Loops

            28.8.1.1 An Aside on Order of Processing List Elements

          28.8.2 Using For Loops in Functions that Produce Lists

          28.8.3 Summary: The List-Processing Template for Python

    VII Programming with State

      29 Modifying Structured Data

        29.1 Modifying Fields of Structured Data

        29.2 Modifications to Shared Data

        29.3 Understanding Memory

        29.4 Variables and Equality

        29.5 Basic Data in Memory

      30 Modifying Variables

        30.1 Modifying Variables in Memory

        30.2 Modifying Variables Associated with Lists

        30.3 Writing Functions that Modify Variables

          30.3.1 The global annotation

        30.4 Testing Functions that Modify global Variables

          30.4.1 The Internal Structure of a Test Function

          30.4.2 Takeaways on Testing Modifications

      31 Revisiting Lists and Variables

        31.1 Sharing List Updates

          31.1.1 Operations that Mutate Lists

        31.2 Lists in Memory

        31.3 Practice: Data for Shared Bank Accounts

        31.4 Circular References

          31.4.1 Testing Circular Data

          31.4.2 Revisiting Variables: A Function to Create Accounts for New Customers

        31.5 The Many Roles of Variables

        31.6 Managing All Accounts

      32 Hashtables and Dictionaries

        32.1 Searching by Criteria Other than Keys

        32.2 Dictionaries with More Complex Values

        32.3 Using Structured Data as Keys

    VIII Advanced Topics

      33 Algorithms That Exploit State

        33.1 Disjoint Sets Redux

          33.1.1 Optimizations

          33.1.2 Analysis

        33.2 Set Membership by Hashing Redux

          33.2.1 Improving Access Time

          33.2.2 Better Hashing

          33.2.3 Bloom Filters

        33.3 Avoiding Recomputation by Remembering Answers

          33.3.1 An Interesting Numeric Sequence

            33.3.1.1 Using State to Remember Past Answers

            33.3.1.2 From a Tree of Computation to a DAG

            33.3.1.3 The Complexity of Numbers

            33.3.1.4 Abstracting Memoization

          33.3.2 Edit-Distance for Spelling Correction

          33.3.3 Nature as a Fat-Fingered Typist

          33.3.4 Dynamic Programming

            33.3.4.1 Catalan Numbers with Dynamic Programming

            33.3.4.2 Levenshtein Distance and Dynamic Programming

          33.3.5 Contrasting Memoization and Dynamic Programming

    IX Appendices

      34 Pyret for Racketeers and Schemers

        34.1 Numbers, Strings, and Booleans

        34.2 Infix Expressions

        34.3 Function Definition and Application

        34.4 Tests

        34.5 Variable Names

        34.6 Data Definitions

        34.7 Conditionals

        34.8 Lists

        34.9 First-Class Functions

        34.10 Annotations

        34.11 What Else?

      35 Pyret vs. Python

      36 Comparing This Book to HtDP

      37 Release Notes

      38 Glossary