Women’s Alpine Skiing Giant Slalom - Paired Data

Paired Data

Confidence Intervals

Hypothesis Testing

Statistics from two runs at a Women’s World Cup Alpine Skiing Giant Slalom Race at Mont-Tremblant

Authors

Affiliation

Emelia Agostinelli

St. Lawrence University

Robin Lock

St. Lawrence University

Published

May 15, 2024

Module

Please note that these material have not yet completed the required pedagogical and industry peer-reviews to become a published module on the SCORE Network. However, instructors are still welcome to use these materials if they are so inclined.

Introduction

In World Cup Giant Slalom (GS), there are two runs. Only the thirty fastest racers from the first run take a second run. If a racer is disqualified (DSQ) or did not finish (DNF) their first run, they do not take a second run. The order for the first run is determined by taking all racers and ordering them by their World Cup points, from highest to lowest. From that, the top 30 racers are put into three groups. The best seven racers are randomly assigning them a bib 1-7. The next eight best competitors are randomly assigned a bib 8-15. The next best 15 racers are randomly assigned a bib 16-30. The remaining racers go in descending order of points. For the second run, competitors race in reverse order of their results on the first run, so the 30th fastest racer on the first run goes 1st on the second run and so on. This data set includes data from only the top thirty finishers as any racers who placed higher than 30th do not take a second run.

In this worksheet, we will be able to explore paired data, as we have information on 2 runs by the same competitor (unless they DSQ or DNF on the second run). Additionally, we will investigate difference in means between the 2 runs to determine if racers are on average faster or slower on a specific run.

Preliminary script for video

Learning Objectives

By the end of this activity, you will be able to:

Gather summary statistics
Find a confidence interval for difference in mean run time
Perform a test for difference in means
Interpret findings from a paired t-test
Relate findings from CIs and t-tests

Methods

Familiarity with basic statistical analysis techniques, such as measures of central tendency (mean, median) and measures of dispersion (standard deviation, range), can provide insights into the overall characteristics and variability in difference in mean run times
Understanding of confidence intervals and their interpretations
Knowledge of performing and interpreting a t-test for difference in means with paired data

Data

Each of the 30 rows in this data set represents a racer, and there are 11 variables. There are a few missing values in the Run2_Time, Run2_Rank, Total_Time, and Rank_Diff columns due to racers not finishing their first run.

Download data: Tremblant1.csv

Variable Descriptions

Variable	Description
Name	Name of the racer
Nat	Country the racer represents
Run1_Order	Start order for the first run
Run1_Time	Time for run 1 in seconds
Run1_Rank	Rankings of run 1
Run2_Order	Start order for the second run
Run2_Time	Time for run 2 in seconds
Run2_Rank	Rankings of run 2
Total_Time	Combined time of runs 1 and 2
Final_Rank	Final results
Rank_Diff	Difference between run 1 rank and final rank

Data Source

The data was scraped from the following link: FIS World Cup Women’s GS at Tremblant

Materials

Class Worksheet

Class Worksheet Key

Authors

Fill in later