Doughnut Run: Time Adjustments & Performance
Module
Please note that these materials have not yet completed the required pedagogical and industry peer-reviews to become a published module on the SCORE Network. However, instructors are still welcome to use these materials if they are so inclined.
Introduction
Fun runs often include nontraditional race elements such as bonus challenges, costumes, or food stations. In a doughnut run, participants may stop during the race to eat doughnuts, and their official race time may be adjusted based on how many doughnuts they consume. This creates an interesting data analysis problem: how can we estimate doughnuts eaten from race timing data, and what relationship (if any) exists between doughnuts and performance?
In this worksheet, students analyze race data from a doughnut run using three datasets: adjusted race results, unadjusted race results, and a bonus threshold table. The main idea is to compare each runner’s adjusted and unadjusted times, compute the time difference, and use that difference as a proxy for how many doughnuts were eaten.
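The comparison described above can be sketched in base R. This is a minimal illustration, not the worksheet's solution: the data frames, the column names (`race_number`, `time_sec_adj`, `time_sec_unadj`), and every value are made up, and it assumes times are already in seconds and that the adjustment subtracts time from the unadjusted result.

```r
# Toy data standing in for the two results files (all values are made up).
adjusted <- data.frame(
  race_number  = c(101, 102, 103),
  time_sec_adj = c(1500, 1710, 1800)
)
unadjusted <- data.frame(
  race_number    = c(101, 102, 103),
  time_sec_unadj = c(1620, 1710, 2040)
)

# Join the two tables on the runner ID, keeping runners found in both.
runs <- merge(adjusted, unadjusted, by = "race_number")

# Time credited back to each runner (assumes adjusted <= unadjusted).
# Larger differences suggest more doughnuts were eaten.
runs$time_diff <- runs$time_sec_unadj - runs$time_sec_adj
print(runs)
```

The same join can be written with dplyr; `merge()` is used here only so the sketch runs without extra packages.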
Throughout the activity, students will practice core data analysis skills including cleaning inconsistent time strings, converting time values into seconds, joining datasets by a common identifier, constructing a derived variable from thresholds, and exploring relationships with visualizations and a simple quadratic model.
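The last skill in that list, a simple quadratic model, might look like the sketch below. The data are synthetic and exactly quadratic so the fitted coefficients are easy to check. The column names `donuts` and `time_sec`, and the choice of finish time as the response, are assumptions made for illustration.

```r
# Synthetic runners: doughnut counts and finish times in seconds (made up).
runners <- data.frame(donuts = c(0, 1, 2, 3, 4, 5))
runners$time_sec <- 1500 + 30 * runners$donuts + 5 * runners$donuts^2

# I() protects the square so lm() treats it as arithmetic, not formula syntax.
fit <- lm(time_sec ~ donuts + I(donuts^2), data = runners)

# Intercept, linear, and quadratic coefficients.
print(round(coef(fit), 2))
```

With real race data the fit will not be exact, and the sign and size of the squared term indicate whether the relationship bends upward or downward.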
This activity is especially useful for introducing students to realistic “messy data” workflows, where variables must be interpreted carefully and transformations must be justified before analysis begins.
Data
You will work with three datasets:
- doughnut2015.csv: adjusted race results
- doughnut2015unadj.csv: unadjusted race results
- doughnuttime.csv: the “bonus time thresholds” for each doughnut
Download Data:
[doughnut2015.csv] [doughnut2015unadj.csv] [doughnuttime.csv]
The doughnut2015.csv dataset contains the race results after the doughnut adjustment has been applied. Each row represents one runner. Students will create a time_sec variable (race time in seconds) during the cleaning process.
Variable Descriptions: doughnut2015.csv (Adjusted Results)
| Variable | Description |
|---|---|
| Position | Overall finishing position |
| Race Number | Unique runner ID |
| Name | Runner name |
| Time | Race time |
| TimeAdj | Adjusted race time |
| Category | Runner’s age category |
| Cat Pos | Position within category |
| Gender | Runner gender |
| Gen Pos | Position within gender |
The doughnut2015unadj.csv dataset contains the race results without the doughnut adjustment. Students will also create time_sec for this dataset during the cleaning process.
Variable Descriptions: doughnut2015unadj.csv (Unadjusted Results)
| Variable | Description |
|---|---|
| Position | Overall finishing position |
| Race Number | Unique runner ID |
| Name | Runner name |
| Time | Race time |
| Category | Runner’s age category |
| Cat Pos | Position within category |
| Gender | Runner gender |
| Gen Pos | Position within gender |
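
Creating time_sec means turning time strings into a number of seconds. The helper below is a minimal sketch in base R. The function name `to_seconds` is hypothetical, and it assumes the strings look like "MM:SS" or "H:MM:SS", possibly with stray whitespace. The actual files may contain other formats that need their own handling.

```r
# Convert "MM:SS" or "H:MM:SS" strings to seconds (assumed formats).
to_seconds <- function(x) {
  pieces <- strsplit(trimws(x), ":", fixed = TRUE)
  vapply(pieces, function(parts) {
    nums <- as.numeric(parts)
    # Weight the pieces from the right: seconds, minutes, then hours.
    sum(nums * 60^(rev(seq_along(nums)) - 1))
  }, numeric(1))
}

times <- c("25:31", "1:02:45", " 00:59:05 ")
print(to_seconds(times))
```

The lubridate package listed under Additional Reading offers parsers for the same job. The base R version is shown only so the sketch has no package dependencies.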
The doughnuttime.csv dataset gives the bonus time threshold associated with each number of doughnuts.
Variable Descriptions: doughnuttime.csv
| Variable | Description |
|---|---|
| Donut | Number of doughnuts |
| Bonus | Time threshold |
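
One way to turn a time difference into an estimated doughnut count is a threshold lookup. The sketch below rests on assumptions the table does not confirm: that Bonus is in seconds, that it increases with Donut, and that a runner is credited with the largest doughnut count whose threshold does not exceed their time difference. The threshold values and the `time_diff` vector are made up.

```r
# Hypothetical threshold table; the real values come from doughnuttime.csv.
thresholds <- data.frame(
  Donut = 0:4,
  Bonus = c(0, 60, 120, 180, 240)  # made-up thresholds in seconds
)

# Made-up time differences (unadjusted minus adjusted), in seconds.
time_diff <- c(0, 120, 185, 240)

# findInterval() returns, for each difference, the position of the largest
# threshold that is <= that difference. Bonus must be sorted ascending, and
# differences are assumed to be non-negative.
idx <- findInterval(time_diff, thresholds$Bonus)
donuts_est <- thresholds$Donut[idx]
print(donuts_est)
```

If the real thresholds work differently, for example as a per-doughnut bonus rather than a cumulative one, the lookup rule would need to change accordingly.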
Data Sources
- https://truetimeracing.com/event/doughnut-run-2015/
- https://github.com/iramler/stat450-spr2026-score/blob/main/donuts/donut_times.jpg
Materials
Additional Reading
Introduction to Data Wrangling with dplyr https://dplyr.tidyverse.org/
Data Visualization with ggplot2 https://ggplot2.tidyverse.org/
Working with Dates and Times in R https://lubridate.tidyverse.org/