On this page:
A Data-Centric Introduction to Computing
Tuesday, February 21st, 2023 4:43:42pm

A Data-Centric Introduction to Computing

Kathi Fisler, Shriram Krishnamurthi, Benjamin S. Lerner, Joe Gibbs Politz

    I Introduction

      1 Overview

        1.1 What This Book is About

        1.2 The Values That Drive This Book

        1.3 Our Perspective on Data

        1.4 What Makes This Book Unique

        1.5 Who This Book is For

        1.6 The Structure of This Book

        1.7 Organization of the Material

        1.8 Our Programming Language Choice

        1.9 Sending Us Feedback, Errors, and Comments

        1.10 Staying Up-To-Date

      2 Acknowledgments

    II Introduction to Programming

      3 Basic Data

        3.1 Getting Started

          3.1.1 Motivating Example: Flags

          3.1.2 Numbers

          3.1.3 Expressions

          3.1.4 Terminology

          3.1.5 Strings

          3.1.6 Images

   Combining Images

   Making a Flag

          3.1.7 Stepping Back: Types, Errors, and Documentation

   Types and Contracts

   Format and Notation Errors

   Finding Other Functions: Documentation

        3.2 Naming Values

          3.2.1 The Definitions Pane

          3.2.2 Naming Values

   Names Versus Strings

   Expressions versus Statements

          3.2.3 The Program Directory

   Understanding the Run Button

          3.2.4 Using Names to Streamline Building Images

        3.3 From Repeated Expressions to Functions

          3.3.1 Example: Similar Flags

          3.3.2 Defining Functions

   How Functions Evaluate

   Type Annotations


          3.3.3 Functions Practice: Moon Weight

          3.3.4 Documenting Functions with Examples

          3.3.5 Functions Practice: Cost of pens

          3.3.6 Recap: Defining Functions

        3.4 Conditionals and Booleans

          3.4.1 Motivating Example: Shipping Costs

          3.4.2 Conditionals: Computations with Decisions

          3.4.3 Booleans

   Other Boolean Operations

   Combining Booleans

          3.4.4 Asking Multiple Questions

          3.4.5 Evaluating by Reducing Expressions

          3.4.6 Composing Functions

   How Function Compositions Evaluate

   Function Composition and the Directory

          3.4.7 Nested Conditionals

          3.4.8 Recap: Booleans and Conditionals

      4 Tabular Data

        4.1 Introduction to Tabular Data

          4.1.1 Creating Tabular Data

          4.1.2 Extracting Rows and Cell Values

          4.1.3 Functions over Rows

          4.1.4 Processing Rows

   Finding Rows

   Ordering Rows

   Adding New Columns

   Calculating New Column Values

          4.1.5 Examples for Table-Producing Functions

        4.2 Processing Tables

          4.2.1 Cleaning Data Tables

   Loading Data Tables

   Dealing with Missing Entries

   Normalizing Data

   Normalization, Systematically

     Using Programs to Detect Data Errors

          4.2.2 Task Plans

          4.2.3 Preparing Data Tables

   Creating bins

   Splitting Columns

          4.2.4 Managing and Naming Data Tables

          4.2.5 Visualizations and Plots

          4.2.6 Summary: Managing a Data Analysis

      5 Lists

        5.1 From Tables to Lists

          5.1.1 Basic Statistical Questions

          5.1.2 Extracting a Column from a Table

          5.1.3 Understanding Lists

   Lists as Anonymous Data

   Creating Literal Lists

          5.1.4 Operating on Lists

   Built-In Operations on Lists of Numbers

   Built-In Operations on Lists in General

   An Aside on Naming Conventions

   Getting Elements By Position

   Transforming Lists

   Recap: Summary of List Operations

          5.1.5 Lambda: Anonymous Functions

          5.1.6 Combining Lists and Tables

        5.2 Processing Lists

          5.2.1 Making Lists and Taking Them Apart

          5.2.2 Some Example Exercises

          5.2.3 Structural Problems with Scalar Answers

   my-len: Examples

   my-sum: Examples

   From Examples to Code

          5.2.4 Structural Problems that Transform Lists

   my-doubles: Examples and Code

   my-str-len: Examples and Code

          5.2.5 Structural Problems that Select from Lists

   my-pos-nums: Examples and Code

   my-alternating: Examples and Code

          5.2.6 Structural Problems Over Relaxed Domains

   my-max: Examples

   my-max: From Examples to Code

          5.2.7 More Structural Problems with Scalar Answers

   my-avg: Examples

          5.2.8 Structural Problems with Accumulators

   my-running-sum: First Attempt

   my-running-sum: Examples and Code

   my-alternating: Examples and Code

          5.2.9 Dealing with Multiple Answers

   uniq: Problem Setup

   uniq: Examples

   uniq: Code

   uniq: Reducing Computation

   uniq: Example and Code Variations

   uniq: Why Produce a List?

          5.2.10 Monomorphic Lists and Polymorphic Types

        5.3 Recursive Data

          5.3.1 Functions to Process Recursive Data

          5.3.2 A Template for Processing Recursive Data

      6 Structured Data

        6.1 Introduction to Structured Data

          6.1.1 Understanding the Kinds of Compound Data

   A First Peek at Structured Data

   A First Peek at Conditional Data

          6.1.2 Defining and Creating Structured and Conditional Data

   Defining and Creating Structured Data

   Annotations for Structured Data

   Defining and Creating Conditional Data

          6.1.3 Programming with Structured and Conditional Data

   Extracting Fields from Structured Data

   Telling Apart Variants of Conditional Data

   Processing Fields of Variants

        6.2 Collections of Structured Data

          6.2.1 Lists as Collective Data

          6.2.2 Sets as Collective Data

   Picking Elements from Sets

   Computing with Sets

          6.2.3 Combining Structured and Collective Data

          6.2.4 Data Design Problem: Representing Quizzes

      7 Trees

        7.1 Trees

          7.1.1 Data Design Problem – Ancestry Data

   Computing Genetic Parents from an Ancestry Table

   Computing Grandparents from an Ancestry Table

   Creating a Datatype for Ancestor Trees

          7.1.2 Programs to Process Ancestor Trees

          7.1.3 Summarizing How to Approach Tree Problems

          7.1.4 Study Questions

      8 Foundations: Bonus Materials

        8.1 Functions as Data

          8.1.1 A Little Calculus

          8.1.2 A Helpful Shorthand for Anonymous Functions

          8.1.3 Streams From Functions

          8.1.4 Combining Forces: Streams of Derivatives

        8.2 Queues from Lists

          8.2.1 Using a Wrapper Datatype

          8.2.2 Combining Answers

          8.2.3 Using a Picker

          8.2.4 Using Tuples

          8.2.5 A Picker Method

        8.3 Examples, Testing, and Program Checking

          8.3.1 From Examples to Tests

          8.3.2 More Refined Comparisons

          8.3.3 When Tests Fail

          8.3.4 Oracles for Testing

    III From Pyret to Python

      9 From Pyret to Python

        9.1 From Pyret to Python

          9.1.1 Expressions, Functions, and Types

          9.1.2 Returning Values from Functions

          9.1.3 Examples and Test Cases

          9.1.4 An Aside on Numbers

          9.1.5 Conditionals

          9.1.6 Creating and Processing Lists

   Filters, Maps, and Friends

          9.1.7 Data with Components

   Accessing Fields within Dataclasses

          9.1.8 Traversing Lists

   Introducing For Loops

     An Aside on Order of Processing List Elements

   Using For Loops in Functions that Produce Lists

   Summary: The List-Processing Template for Python

        9.2 Dictionaries

          9.2.1 Creating and Using a Dictionary

          9.2.2 Searching Through the Values in a Dictionary

          9.2.3 Dictionaries with More Complex Values

          9.2.4 Dictionaries versus Dataclasses


        9.3 Arrays

          9.3.1 Two Memory Layouts for Ordered Items

          9.3.2 Iterating Partly through an Ordered Datum

      10 Tables in Python via Pandas

        10.1 Introduction to Pandas

          10.1.1 Pandas Table Basics

   Core Datatypes: DataFrame and Series

   Creating and Loading DataFrames

   Using Labels and Indices to Access Cells

          10.1.2 Filtering Rows

          10.1.3 Cleaning and Normalizing Data

   Clearing out unknown values

   Repairing Values and Column Types

          10.1.4 Computing New Columns

          10.1.5 Aggregating and Grouping Columns

          10.1.6 Wide Versus Tall Data

            Converting Between Wide and Tall Data

          10.1.7 Plotting Data

          10.1.8 Takeaways

        10.2 Reshaping Tables

          10.2.1 Binning Rows

          10.2.2 Wide versus Tall Datasets

    IV Programming With State

      11 Programming with State (in Both Pyret and Python)

        11.1 State, Change, and Testing

          11.1.1 Example: Bank Accounts

          11.1.2 Testing Functions that Mutate Data

          11.1.3 Aliasing

          11.1.4 Data Mutation and the Directory

   Introducing the Heap

   Basic Data and the Heap

        11.2 Understanding Equality

          11.2.1 Equality of Data

          11.2.2 Different Equality Operations

   Equality in Python

   Equality in Pyret

        11.3 Arrays and Lists in Memory

        11.4 Cyclic Data

          11.4.1 Creating Cyclic Data

          11.4.2 Testing Cyclic Data

          11.4.3 Cycles in Practice

      12 More Programming with State: Python

        12.1 Modifying Variables

          12.1.1 Modifying Variables in Memory

          12.1.2 Variable Updates and Aliasing

          12.1.3 Updating Variables versus Updating Data Fields

          12.1.4 Updating Parameters in Function Calls

          12.1.5 Updating Top-Level Variables within Function Calls

          12.1.6 The Many Roles of Variables

        12.2 Mutable Lists

    V Algorithm Analysis

      13 Predicting Growth

        13.1 A Little (True) Story

        13.2 The Analytical Idea

        13.3 A Cost Model for Pyret Running Time

        13.4 The Size of the Input

        13.5 The Tabular Method for Singly-Structurally-Recursive Functions

        13.6 Creating Recurrences

        13.7 A Notation for Functions

        13.8 Comparing Functions

        13.9 Combining Big-Oh Without Woe

        13.10 Solving Recurrences

      14 Halloween Analysis

        14.1 A First Example

        14.2 The New Form of Analysis

        14.3 An Example: Queues from Lists

          14.3.1 List Representations

          14.3.2 A First Analysis

          14.3.3 More Liberal Sequences of Operations

          14.3.4 A Second Analysis

          14.3.5 Amortization Versus Individual Operations

        14.4 Reading More

    VI Data Structures with Analysis

      15 Directed Acyclic Graphs

        15.1 Sharing and Equality

          15.1.1 Re-Examining Equality

          15.1.2 The Cost of Evaluating References

          15.1.3 Notations for Equality

          15.1.4 On the Internet, Nobody Knows You’re a DAG

          15.1.5 It’s Always Been a DAG

          15.1.6 From Acyclicity to Cycles

        15.2 The Size of a DAG

          15.2.1 Stage 1

          15.2.2 Stage 2

          15.2.3 Stage 3

          15.2.4 Stage 4

          15.2.5 Stage 5

          15.2.6 What We’ve Learned

          15.2.7 More on Value Printing: An Aside from Racket

      16 Graphs

        16.1 Introducing Graphs

          16.1.1 Understanding Graphs

          16.1.2 Representations

   Links by Name

   Links by Indices

   A List of Edges

   Abstracting Representations

          16.1.3 Measuring Complexity for Graphs

        16.2 Basic Graph Traversals

          16.2.1 Reachability

   Simple Recursion

   Cleaning up the Loop

   Traversal with Memory

   A Better Interface

          16.2.2 Depth- and Breadth-First Traversals

        16.3 Graphs With Weighted Edges

        16.4 Shortest (or Lightest) Paths

        16.5 Moravian Spanning Trees

          16.5.1 The Problem

          16.5.2 A Greedy Solution

          16.5.3 Another Greedy Solution

          16.5.4 A Third Solution

          16.5.5 Checking Component Connectedness

      17 Several Variations on Sets

        17.1 Representing Sets as Lists

          17.1.1 Representation Choices

          17.1.2 Time Complexity

          17.1.3 Choosing Between Representations

          17.1.4 Other Operations

        17.2 Making Sets Grow on Trees

          17.2.1 Using Binary Trees

          17.2.2 Checking the Complexity

          17.2.3 A Fine Balance: Tree Surgery

   Left-Left Case

   Left-Right Case

   Any Other Cases?

        17.3 Union-Find

          17.3.1 Implementing with State

          17.3.2 Optimizations

          17.3.3 Analysis

        17.4 Hashes, Sets, and Key-Values

          17.4.1 A Hash Function for Strings

          17.4.2 Sets from Hashing

          17.4.3 Arrays

          17.4.4 Sets from Hashing and Arrays

          17.4.5 Collisions

          17.4.6 Resolving Collisions

          17.4.7 Complexity

          17.4.8 Bloom Filters

          17.4.9 Generalizing from Sets to Key-Values

        17.5 Equality, Ordering, and Hashing

          17.5.1 Converting Values to Ordered Values

          17.5.2 Hashing in Practice

          17.5.3 Equality and Ordering

        17.6 Sets as a Case Study

          17.6.1 Nature of the Data

          17.6.2 Nature of the Operations

          17.6.3 Nature of the Guarantee

    VII Advanced Topics

      18 State and Equality

        18.1 Boxes: A Canonical Mutable Structure

        18.2 Mutation and Types

        18.3 Mutation and Equality

        18.4 Another Equality Predicate

        18.5 A Hierarchy of Equality

        18.6 Space and Time Complexity

        18.7 What it Means to be Identical

        18.8 Comparing Functions

      19 Recursion and Cycles from Mutation

        19.1 Partial Definitions

        19.2 Recursive Functions

        19.3 Premature Evaluation

        19.4 Cyclic Lists Versus Streams

      20 Detecting Cycles

        20.1 A Running Example

        20.2 Types

        20.3 A First Checker

        20.4 Complexity

        20.5 A Fabulous Improvement

        20.6 Testing

      21 Avoiding Recomputation by Remembering Answers

        21.1 An Interesting Numeric Sequence

          21.1.1 Using State to Remember Past Answers

          21.1.2 From a Tree of Computation to a DAG

          21.1.3 The Complexity of Numbers

          21.1.4 Abstracting Memoization

        21.2 Edit-Distance for Spelling Correction

        21.3 Nature as a Fat-Fingered Typist

        21.4 Dynamic Programming

          21.4.1 Catalan Numbers with Dynamic Programming

          21.4.2 Levenshtein Distance and Dynamic Programming

        21.5 Contrasting Memoization and Dynamic Programming

      22 Partial Domains

        22.1 A Non-Solution

        22.2 Exceptions

        22.3 The Option Type

        22.4 Total Domains, Dynamically

        22.5 Total Domains, Statically

        22.6 Summary

        22.7 A Note on Notation

      23 Staging

        23.1 Problem Definition

        23.2 Initial Solution

        23.3 Refactoring

        23.4 Separating Parameters

        23.5 Context

      24 Factoring Numbers

      25 Deconstructing Loops

        25.1 Setup: Two Functions

        25.2 Abstracting a Loop

        25.3 Is It Really a Loop?

        25.4 Re-Examining for

        25.5 Rewriting Pollard-Rho

        25.6 Nested Loops

        25.7 Loops, Values, and Customization

    VIII Interactive Programs

      26 Interactive Games as Reactive Systems

        26.1 About Reactive Animations

        26.2 Preliminaries

        26.3 Version: Airplane Moving Across the Screen

          26.3.1 Updating the World State

          26.3.2 Displaying the World State

          26.3.3 Observing Time (and Combining the Pieces)

        26.4 Version: Wrapping Around

        26.5 Version: Descending

          26.5.1 Moving the Airplane

          26.5.2 Drawing the Scene

          26.5.3 Finishing Touches

        26.6 Version: Responding to Keystrokes

        26.7 Version: Landing

        26.8 Version: A Fixed Balloon

        26.9 Version: Keep Your Eye on the Tank

        26.10 Version: The Balloon Moves, Too

        26.11 Version: One, Two, ..., Ninety-Nine Luftballons!

    IX Appendices

      27 Appendices

        27.1 Pyret for Racketeers and Schemers

          27.1.1 Numbers, Strings, and Booleans

          27.1.2 Infix Expressions

          27.1.3 Function Definition and Application

          27.1.4 Tests

          27.1.5 Variable Names

          27.1.6 Data Definitions

          27.1.7 Conditionals

          27.1.8 Lists

          27.1.9 First-Class Functions

          27.1.10 Annotations

          27.1.11 What Else?

        27.2 Pyret vs. Python

        27.3 Comparing This Book to HtDP

        27.4 Release Notes

        27.5 Glossary