I Introduction |
1 Overview |
1.1 What This Book is About |
1.2 The Values That Drive This Book |
1.3 Our Perspective on Data |
1.4 What Makes This Book Unique |
1.5 Who This Book is For |
1.6 The Structure of This Book |
1.7 Organization of the Material |
1.8 Our Programming Language Choice |
1.9 Sending Us Feedback, Errors, and Comments |
1.10 Staying Up-To-Date |
2 Acknowledgments |
|
II Introduction to Programming |
3 Basic Data |
3.1 Getting Started |
3.1.1 Motivating Example: Flags |
3.1.2 Numbers |
3.1.3 Expressions |
3.1.4 Terminology |
3.1.5 Strings |
3.1.6 Images |
3.1.6.1 Combining Images |
3.1.6.2 Making a Flag |
3.1.7 Stepping Back: Types, Errors, and Documentation |
3.1.7.1 Types and Contracts |
3.1.7.2 Format and Notation Errors |
3.1.7.3 Finding Other
Functions: Documentation |
3.2 Naming Values |
3.2.1 The Definitions Pane |
3.2.2 Naming Values |
3.2.2.1 Names Versus Strings |
3.2.2.2 Expressions versus
Statements |
3.2.3 The Program Directory |
3.2.3.1 Understanding the Run Button |
3.2.4 Using Names to Streamline Building Images |
3.3 From Repeated Expressions to Functions |
3.3.1 Example: Similar Flags |
3.3.2 Defining Functions |
3.3.2.1 How Functions Evaluate |
3.3.2.2 Type Annotations |
3.3.2.3 Documentation |
3.3.3 Functions Practice: Moon Weight |
3.3.4 Documenting Functions with Examples |
3.3.5 Functions Practice: Cost of pens |
3.3.6 Recap: Defining Functions |
3.4 Conditionals and Booleans |
3.4.1 Motivating Example: Shipping Costs |
3.4.2 Conditionals: Computations with Decisions |
3.4.3 Booleans |
3.4.3.1 Other Boolean Operations |
3.4.3.2 Combining Booleans |
3.4.4 Asking Multiple Questions |
3.4.5 Evaluating by Reducing Expressions |
3.4.6 Composing Functions |
3.4.6.1 How Function Compositions Evaluate |
3.4.6.2 Function Composition and the Directory |
3.4.7 Nested Conditionals |
3.4.8 Recap: Booleans and Conditionals |
4 Tabular Data |
4.1 Introduction to Tabular Data |
4.1.1 Creating Tabular Data |
4.1.2 Extracting Rows and Cell Values |
4.1.3 Functions over Rows |
4.1.4 Processing Rows |
4.1.4.1 Finding Rows |
4.1.4.2 Ordering Rows |
4.1.4.3 Adding New Columns |
4.1.4.4 Calculating New Column Values |
4.1.5 Examples for Table-Producing Functions |
4.2 Processing Tables |
4.2.1 Cleaning Data Tables |
4.2.1.1 Loading Data Tables |
4.2.1.2 Dealing with Missing Entries |
4.2.1.3 Normalizing Data |
4.2.1.4 Normalization, Systematically |
4.2.1.5 Using Programs to Detect Data Errors |
4.2.2 Task Plans |
4.2.3 Preparing Data Tables |
4.2.3.1 Creating bins |
4.2.3.2 Splitting Columns |
4.2.4 Managing and Naming Data Tables |
4.2.5 Visualizations and Plots |
4.2.6 Summary: Managing a Data Analysis |
5 Lists |
5.1 From Tables to Lists |
5.1.1 Basic Statistical Questions |
5.1.2 Extracting a Column from a Table |
5.1.3 Understanding Lists |
5.1.3.1 Lists as Anonymous Data |
5.1.3.2 Creating Literal Lists |
5.1.4 Operating on Lists |
5.1.4.1 Built-In Operations on Lists of Numbers |
5.1.4.2 Built-In Operations on Lists in General |
5.1.4.3 An Aside on Naming Conventions |
5.1.4.4 Getting Elements By Position |
5.1.4.5 Transforming Lists |
5.1.4.6 Recap: Summary of List Operations |
5.1.5 Lambda: Anonymous Functions |
5.1.6 Combining Lists and Tables |
5.2 Processing Lists |
5.2.1 Making Lists and Taking Them Apart |
5.2.2 Some Example Exercises |
5.2.3 Structural Problems with Scalar Answers |
5.2.3.1 my-len : Examples |
5.2.3.2 my-sum : Examples |
5.2.3.3 From Examples to Code |
5.2.4 Structural Problems that Transform Lists |
5.2.4.1 my-doubles : Examples and Code |
5.2.4.2 my-str-len : Examples and Code |
5.2.5 Structural Problems that Select from Lists |
5.2.5.1 my-pos-nums : Examples and Code |
5.2.5.2 my-alternating :
Examples and Code |
5.2.6 Structural Problems Over Relaxed Domains |
5.2.6.1 my-max : Examples |
5.2.6.2 my-max : From Examples to Code |
5.2.7 More Structural Problems with Scalar Answers |
5.2.7.1 my-avg : Examples |
5.2.8 Structural Problems with Accumulators |
5.2.8.1 my-running-sum : First Attempt |
5.2.8.2 my-running-sum : Examples and Code |
5.2.8.3 my-alternating : Examples and Code |
5.2.9 Dealing with Multiple Answers |
5.2.9.1 uniq : Problem Setup |
5.2.9.2 uniq : Examples |
5.2.9.3 uniq : Code |
5.2.9.4 uniq : Reducing Computation |
5.2.9.5 uniq : Example and Code Variations |
5.2.9.6 uniq : Why Produce a List? |
5.2.10 Monomorphic Lists and Polymorphic Types |
5.3 Recursive Data |
5.3.1 Functions to Process Recursive Data |
5.3.2 A Template for Processing Recursive Data |
5.3.3 The Design Recipe |
6 Structured Data |
6.1 Introduction to Structured Data |
6.1.1 Understanding the Kinds of Compound Data |
6.1.1.1 A First Peek at Structured Data |
6.1.1.2 A First Peek at Conditional Data |
6.1.2 Defining and Creating Structured and Conditional Data |
6.1.2.1 Defining and Creating Structured Data |
6.1.2.2 Annotations for Structured Data |
6.1.2.3 Defining and Creating Conditional Data |
6.1.3 Programming with Structured and Conditional Data |
6.1.3.1 Extracting Fields from Structured Data |
6.1.3.2 Telling Apart Variants of Conditional Data |
6.1.3.3 Processing Fields of Variants |
6.2 Collections of Structured Data |
6.2.1 Lists as Collective Data |
6.2.2 Sets as Collective Data |
6.2.2.1 Picking Elements from Sets |
6.2.2.2 Computing with Sets |
6.2.3 Combining Structured and Collective Data |
6.2.4 Data Design Problem: Representing Quizzes |
7 Trees |
7.1 Trees |
7.1.1 Data Design Problem – Ancestry Data |
7.1.1.1 Computing Genetic Parents from an Ancestry Table |
7.1.1.2 Computing Grandparents from an Ancestry Table |
7.1.1.3 Creating a Datatype for Ancestor Trees |
7.1.2 Programs to Process Ancestor Trees |
7.1.3 Summarizing How to Approach Tree Problems |
7.1.4 Study Questions |
8 Foundations: Bonus Materials |
8.1 Functions as Data |
8.1.1 A Little Calculus |
8.1.2 A Helpful Shorthand for Anonymous Functions |
8.1.3 Streams From Functions |
8.1.4 Combining Forces: Streams of Derivatives |
8.2 Queues from Lists |
8.2.1 Using a Wrapper Datatype |
8.2.2 Combining Answers |
8.2.3 Using a Picker |
8.2.4 Using Tuples |
8.2.5 A Picker Method |
8.3 Examples, Testing, and Program Checking |
8.3.1 From Examples to Tests |
8.3.2 More Refined Comparisons |
8.3.3 When Tests Fail |
8.3.4 Oracles for Testing |
|
III From Pyret to Python |
9 From Pyret to Python |
9.1 From Pyret to Python |
9.1.1 Expressions, Functions, and Types |
9.1.2 Returning Values from Functions |
9.1.3 Examples and Test Cases |
9.1.4 An Aside on Numbers |
9.1.5 Conditionals |
9.1.6 Creating and Processing
Lists |
9.1.6.1 Filters, Maps, and Friends |
9.1.7 Data with Components |
9.1.7.1 Accessing Fields within Dataclasses |
9.1.8 Traversing Lists |
9.1.8.1 Introducing For Loops |
9.1.8.2 An Aside on Order of Processing List Elements |
9.1.8.3 Using For Loops in Functions that Produce Lists |
9.1.8.4 Summary: The List-Processing Template for Python |
9.2 Dictionaries |
9.2.1 Creating and Using a Dictionary |
9.2.2 Searching Through the Values in a Dictionary |
9.2.3 Dictionaries with More Complex Values |
9.2.4 Dictionaries versus Dataclasses |
Summary |
9.3 Arrays |
9.3.1 Two Memory Layouts for Ordered Items |
9.3.2 Iterating Partly through an Ordered Datum |
10 Tables in Python via Pandas |
10.1 Introduction to Pandas |
10.1.1 Pandas Table Basics |
10.1.1.1 Core Datatypes: DataFrame and Series |
10.1.1.2 Creating and Loading DataFrames |
10.1.1.3 Using Labels and Indices to Access Cells |
10.1.2 Filtering Rows |
10.1.3 Cleaning and Normalizing Data |
10.1.3.1 Clearing out unknown values |
10.1.3.2 Repairing Values and Column Types |
10.1.4 Computing New Columns |
10.1.5 Aggregating and Grouping Columns |
10.1.6 Wide Versus Tall Data |
Converting Between Wide and Tall Data |
10.1.7 Plotting Data |
10.1.8 Takeaways |
10.2 Reshaping Tables |
10.2.1 Binning Rows |
10.2.2 Wide versus Tall Datasets |
|
IV Programming With State |
11 Programming with State (in Both Pyret and Python) |
11.1 State, Change, and Testing |
11.1.1 Example: Bank Accounts |
11.1.2 Testing Functions that Mutate Data |
11.1.3 Aliasing |
11.1.4 Data Mutation and the Directory |
11.1.4.1 Introducing the Heap |
11.1.4.2 Basic Data and the Heap |
11.2 Understanding Equality |
11.2.1 Equality of Data |
11.2.2 Different Equality Operations |
11.2.2.1 Equality in Python |
11.2.2.2 Equality in Pyret |
11.3 Arrays and Lists in Memory |
11.4 Cyclic Data |
11.4.1 Creating Cyclic Data |
11.4.2 Testing Cyclic Data |
11.4.3 Cycles in Practice |
12 More Programming with State: Python |
12.1 Modifying Variables |
12.1.1 Modifying Variables in Memory |
12.1.2 Variable Updates and Aliasing |
12.1.3 Updating Variables versus Updating Data Fields |
12.1.4 Updating Parameters in Function Calls |
12.1.5 Updating Top-Level Variables within Function Calls |
12.1.6 The Many Roles of Variables |
12.2 Mutable Lists |
|
V Algorithm Analysis |
13 Predicting Growth |
13.1 A Little (True) Story |
13.2 The Analytical Idea |
13.3 A Cost Model for Pyret Running Time |
13.4 The Size of the Input |
13.5 The Tabular Method for Singly-Structurally-Recursive Functions |
13.6 Creating Recurrences |
13.7 A Notation for Functions |
13.8 Comparing Functions |
13.9 Combining Big-Oh Without Woe |
13.10 Solving Recurrences |
14 Halloween Analysis |
14.1 A First Example |
14.2 The New Form of Analysis |
14.3 An Example: Queues from Lists |
14.3.1 List Representations |
14.3.2 A First Analysis |
14.3.3 More Liberal Sequences of Operations |
14.3.4 A Second Analysis |
14.3.5 Amortization Versus Individual Operations |
14.4 Reading More |
|
VI Data Structures with Analysis |
15 Directed Acyclic Graphs |
15.1 Sharing and Equality |
15.1.1 Re-Examining Equality |
15.1.2 The Cost of Evaluating References |
15.1.3 Notations for Equality |
15.1.4 On the Internet, Nobody Knows You’re a DAG |
15.1.5 It’s Always Been a DAG |
15.1.6 From Acyclicity to Cycles |
15.2 The Size of a DAG |
15.2.1 Stage 1 |
15.2.2 Stage 2 |
15.2.3 Stage 3 |
15.2.4 Stage 4 |
15.2.5 Stage 5 |
15.2.6 What We’ve Learned |
15.2.7 More on Value Printing: An Aside from Racket |
16 Graphs |
16.1 Introducing Graphs |
16.1.1 Understanding Graphs |
16.1.2 Representations |
16.1.2.1 Links by Name |
16.1.2.2 Links by Indices |
16.1.2.3 A List of Edges |
16.1.2.4 Abstracting Representations |
16.1.3 Measuring Complexity for Graphs |
16.2 Basic Graph Traversals |
16.2.1 Reachability |
16.2.1.1 Simple Recursion |
16.2.1.2 Cleaning up the Loop |
16.2.1.3 Traversal with Memory |
16.2.1.4 A Better Interface |
16.2.2 Depth- and Breadth-First Traversals |
16.3 Graphs With Weighted Edges |
16.4 Shortest (or Lightest) Paths |
16.5 Moravian Spanning Trees |
16.5.1 The Problem |
16.5.2 A Greedy Solution |
16.5.3 Another Greedy Solution |
16.5.4 A Third Solution |
16.5.5 Checking Component Connectedness |
17 Several Variations on Sets |
17.1 Representing Sets as Lists |
17.1.1 Representation Choices |
17.1.2 Time Complexity |
17.1.3 Choosing Between Representations |
17.1.4 Other Operations |
17.2 Making Sets Grow on Trees |
17.2.1 Using Binary Trees |
17.2.2 Checking the Complexity |
17.2.3 A Fine Balance: Tree Surgery |
17.2.3.1 Left-Left Case |
17.2.3.2 Left-Right Case |
17.2.3.3 Any Other Cases? |
17.3 Union-Find |
17.3.1 Implementing with State |
17.3.2 Optimizations |
17.3.3 Analysis |
17.4 Hashes, Sets, and Key-Values |
17.4.1 A Hash Function for Strings |
17.4.2 Sets from Hashing |
17.4.3 Arrays |
17.4.4 Sets from Hashing and Arrays |
17.4.5 Collisions |
17.4.6 Resolving Collisions |
17.4.7 Complexity |
17.4.8 Bloom Filters |
17.4.9 Generalizing from Sets to Key-Values |
17.5 Equality, Ordering, and Hashing |
17.5.1 Converting Values to Ordered Values |
17.5.2 Hashing in Practice |
17.5.3 Equality and Ordering |
17.6 Sets as a Case Study |
17.6.1 Nature of the Data |
17.6.2 Nature of the Operations |
17.6.3 Nature of the Guarantee |
|
VII Advanced Topics |
18 State and Equality |
18.1 Boxes: A Canonical Mutable Structure |
18.2 Mutation and Types |
18.3 Mutation and Equality |
18.4 Another Equality Predicate |
18.5 A Hierarchy of Equality |
18.6 Space and Time Complexity |
18.7 What it Means to be Identical |
18.8 Comparing Functions |
19 Recursion and Cycles from Mutation |
19.1 Partial Definitions |
19.2 Recursive Functions |
19.3 Premature Evaluation |
19.4 Cyclic Lists Versus Streams |
20 Detecting Cycles |
20.1 A Running Example |
20.2 Types |
20.3 A First Checker |
20.4 Complexity |
20.5 A Fabulous Improvement |
20.6 Testing |
21 Avoiding Recomputation by Remembering Answers |
21.1 An Interesting Numeric Sequence |
21.1.1 Using State to Remember Past Answers |
21.1.2 From a Tree of Computation to a DAG |
21.1.3 The Complexity of Numbers |
21.1.4 Abstracting Memoization |
21.2 Edit-Distance for Spelling Correction |
21.3 Nature as a Fat-Fingered Typist |
21.4 Dynamic Programming |
21.4.1 Catalan Numbers with Dynamic Programming |
21.4.2 Levenshtein Distance and Dynamic Programming |
21.5 Contrasting Memoization and Dynamic Programming |
22 Partial Domains |
22.1 A Non-Solution |
22.2 Exceptions |
22.3 The Option Type |
22.4 Total Domains, Dynamically |
22.5 Total Domains, Statically |
22.6 Summary |
22.7 A Note on Notation |
23 Staging |
23.1 Problem Definition |
23.2 Initial Solution |
23.3 Refactoring |
23.4 Separating Parameters |
23.5 Context |
24 Factoring Numbers |
25 Deconstructing Loops |
25.1 Setup: Two Functions |
25.2 Abstracting a Loop |
25.3 Is It Really a Loop? |
25.4 Re-Examining for |
25.5 Rewriting Pollard-Rho |
25.6 Nested Loops |
25.7 Loops, Values, and Customization |
|
VIII Interactive Programs |
26 Interactive Games as Reactive Systems |
26.1 About Reactive Animations |
26.2 Preliminaries |
26.3 Version: Airplane Moving Across the Screen |
26.3.1 Updating the World State |
26.3.2 Displaying the World State |
26.3.3 Observing Time (and Combining the Pieces) |
26.4 Version: Wrapping Around |
26.5 Version: Descending |
26.5.1 Moving the Airplane |
26.5.2 Drawing the Scene |
26.5.3 Finishing Touches |
26.6 Version: Responding to Keystrokes |
26.7 Version: Landing |
26.8 Version: A Fixed Balloon |
26.9 Version: Keep Your Eye on the Tank |
26.10 Version: The Balloon Moves, Too |
26.11 Version: One, Two, ..., Ninety-Nine Luftballons! |
|
IX Appendices |
27 Pyret for Racketeers and Schemers |
27.1 Numbers, Strings, and Booleans |
27.2 Infix Expressions |
27.3 Function Definition and Application |
27.4 Tests |
27.5 Variable Names |
27.6 Data Definitions |
27.7 Conditionals |
27.8 Lists |
27.9 First-Class Functions |
27.10 Annotations |
27.11 What Else? |
28 Pyret vs. Python |
29 Comparing This Book to HtDP |
30 Release Notes |
30.1 Version 2024-09-03 |
30.2 Version 2023-02-21 |
30.3 Version 2022-08-28 |
30.4 Version 2022-01-25 |
30.5 Version 2021-08-21 |
31 Glossary |