Doughnuts and 5ks

Webscraping
Data Visualization
Data Wrangling
Joining / Merging
Exploring competitive eating while running.
Authors

Max Finley, St. Lawrence University

Ivan Ramler, St. Lawrence University

Module

Please note that these materials have not yet completed the pedagogical and industry peer reviews required to become a published module on the SCORE Network. However, instructors are still welcome to use these materials if they are so inclined.

Introduction

The Doughnut Run 5k is a very special athletic contest that has been held every year since 2015. The event is put on by the local college triathlon team in Ames, Iowa. Athletes must complete a 5k while eating as many doughnuts as they can to earn the largest possible time bonus. Here are the time bonuses:

Note that some athletes who participated in this event didn't eat any doughnuts. They have been excluded from the tests and plots so that the analysis stays focused on doughnuts and the 5k.

In this worksheet, you will create linear models and new data sets to answer questions.

This could be suitable as an out-of-class activity or as one that spans more than one class period.

By the end of this activity, you will be able to:

  1. Filter data to meet requirements.

  2. Create and plot a regression model.

  3. Explore relationships between variables and communicate findings using visualizations.

Technology Requirement:

This activity requires the use of R, with familiarity in Quarto documents and several tidyverse packages, including dplyr and readr.

Data

The X5KRunUnadj1 dataset contains 177 rows and 11 columns. Each row represents a runner from the 2015 Doughnut 5k.

Data:

Variable Descriptions

Variable         Description
---------------  -----------------------------------------------------------
Position         The place in which the athlete finished
Race Number      The number on the racer's bib
Time             Finish time, adjusted for doughnuts eaten
time_sec_adj     Adjusted time, in total seconds
time_sec_unadj   Seconds taken to complete the 5k, without adjustment
time_sec_diff    Difference between adjusted and unadjusted time, in seconds

Data Sources

Materials

This data set has been made palatable so you don’t have to do the gritty work.

Using un-adjusted time, find the fastest runner who ate more than 10 doughnuts. Then find the slowest runner who ate more than 10, the fastest runner who ate fewer than 3, and the slowest runner who ate fewer than 3.
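
As a starting point, here is a minimal sketch of the filter-then-pick pattern on a tiny made-up data frame. The values are invented for illustration and are not real race results. The column name `donuts` (number of doughnuts eaten) and the lowercase `race_number` are assumptions, so check the actual column names in the data set before adapting this.

```r
library(dplyr)

# Toy data, invented for illustration only; these are NOT real race results.
# `donuts` and `race_number` are hypothetical column names.
runners <- data.frame(
  race_number    = c(101, 102, 103, 104, 105, 106),
  donuts         = c(12, 11, 15, 1, 2, 2),
  time_sec_unadj = c(1500, 1320, 1710, 1105, 1250, 1400)
)

# Keep runners who ate more than 10 doughnuts, then take the row with the
# smallest un-adjusted time (fastest) and the row with the largest (slowest).
fastest_over_10 <- runners %>%
  filter(donuts > 10) %>%
  slice_min(time_sec_unadj, n = 1)

slowest_over_10 <- runners %>%
  filter(donuts > 10) %>%
  slice_max(time_sec_unadj, n = 1)

# Same pattern for runners who ate fewer than 3 doughnuts.
fastest_under_3 <- runners %>%
  filter(donuts < 3) %>%
  slice_min(time_sec_unadj, n = 1)

slowest_under_3 <- runners %>%
  filter(donuts < 3) %>%
  slice_max(time_sec_unadj, n = 1)

fastest_over_10
slowest_under_3
```

`slice_min()` and `slice_max()` return every tied row by default, so a result with more than one row means that several runners share the same time.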

Create a linear model that uses the un-adjusted time to predict the adjusted time, then create a second linear model that uses the number of doughnuts eaten to predict the adjusted time. Interpret the slope for both models.
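
The two models can be fit with base R's `lm()`. The sketch below uses a small made-up data frame, so the coefficients it prints say nothing about the real race. The column name `donuts` is an assumption, while `time_sec_adj` and `time_sec_unadj` follow the variable table above.

```r
# Toy data, invented for illustration only; these are NOT real race results.
# `donuts` is a hypothetical column name for the number of doughnuts eaten.
runners <- data.frame(
  donuts         = c(1, 2, 4, 6, 8, 12),
  time_sec_unadj = c(1200, 1260, 1350, 1500, 1580, 1800),
  time_sec_adj   = c(1170, 1200, 1230, 1320, 1340, 1440)
)

# Model 1: un-adjusted time as the predictor of adjusted time.
fit_unadj <- lm(time_sec_adj ~ time_sec_unadj, data = runners)

# Model 2: doughnuts eaten as the predictor of adjusted time.
fit_donuts <- lm(time_sec_adj ~ donuts, data = runners)

# In each model the first coefficient is the intercept and the second
# is the slope for the predictor.
coef(fit_unadj)
coef(fit_donuts)
```

When interpreting, keep the units in mind: the slope of the first model is the estimated change in adjusted time (in seconds) for each additional second of un-adjusted time, and the slope of the second is the estimated change in adjusted time (in seconds) for each additional doughnut eaten. `summary(fit_unadj)` gives standard errors and R-squared if you want more detail.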

Your turn: create a visualization of this data you think would be interesting or useful:
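
If you are unsure where to begin, one possible starting point is a scatterplot with a fitted regression line, sketched here with ggplot2 on made-up values (not real race results). The column name `donuts` is an assumption, and the choice of axes is only one option among many.

```r
library(ggplot2)

# Toy data, invented for illustration only; these are NOT real race results.
# `donuts` is a hypothetical column name for the number of doughnuts eaten.
runners <- data.frame(
  donuts       = c(1, 2, 4, 6, 8, 12),
  time_sec_adj = c(1170, 1200, 1230, 1320, 1340, 1440)
)

# Scatterplot of adjusted time against doughnuts eaten, with a
# least-squares line drawn by geom_smooth(method = "lm").
p <- ggplot(runners, aes(x = donuts, y = time_sec_adj)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    x = "Doughnuts eaten",
    y = "Adjusted time (seconds)",
    title = "Adjusted time versus doughnuts eaten (toy data)"
  )

print(p)
```

Swapping the `x` aesthetic for `time_sec_unadj` would give the matching plot for the other model.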

This is an introductory linear regression module that acts as an assignment rather than a walk-through.