Women’s 100m Olympic Swimming: Comparing Strokes and Eras

Boxplots
Confidence Intervals
Two-Sample Tests
This module explores results from women’s 100m Olympic swimming events between 1964 and 2024, comparing different strokes and examining how times have changed across eras.
Authors

Brendan Karadenes (St. Lawrence University)
Ivan Ramler (St. Lawrence University)
Robin Lock (St. Lawrence University)

Published

September 25, 2025

Welcome Video

For those interested in how data analytics is used in swimming, please check out this video about how University of Virginia mathematics professor Dr. Ken Ono helps the UVA swim team.

Introduction

In Olympic swimming, athletes compete using different strokes, each with unique styles and challenges. Freestyle, swum with alternating arm strokes and a flutter kick, is usually the fastest. Butterfly is powerful but demanding: both arms move symmetrically, accompanied by the dolphin kick. Backstroke is swum on the back, with the arms reaching alternately above the head and entering the water directly in line with the shoulders.

This module analyzes results from the women’s 100m events (finalists) from 1964-2024 (where complete results are available) as recorded by Olympics.com. Over time, performances reflect advances in training and technology, rule changes, and the growth of women’s participation. You will compare strokes and quantify changes across eras using boxplots and two-sample confidence intervals. Note that there is a fourth event, the 100m breaststroke, but it is excluded from this module because complete finalist times were not available on Olympics.com prior to 2020.

This activity is suitable as an in-class example or quiz lasting roughly 20 to 30 minutes.

By the end of the activity, you will be able to:

  1. Analyze distributions using boxplots.
  2. Compare and contrast distributions in the same or different groups.
  3. Assess differences between two samples using confidence intervals.

For this activity, students will primarily use basic concepts of boxplots and two-sample t confidence intervals to analyze data. The activity can easily be adapted to use other inference methods such as ANOVA or bootstrap intervals.
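
For reference, a common textbook form of the two-sample t confidence interval for a difference in means is sketched below. The notation is ours rather than the worksheets': \bar{x}_i, s_i, and n_i denote the mean, standard deviation, and size of sample i, and t^* is the critical value for the chosen confidence level.

```latex
\[
(\bar{x}_1 - \bar{x}_2) \;\pm\; t^{*} \sqrt{\frac{s_1^{2}}{n_1} + \frac{s_2^{2}}{n_2}}
\]
```

The degrees of freedom used to find t^* depend on the convention: hand calculations often use the smaller of n_1 - 1 and n_2 - 1, while software typically uses Welch's approximation, so the two approaches can give slightly different intervals.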

The provided low-tech worksheet requires StatKey or similar software to obtain t-scores, as well as access to a calculator.
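
The calculator step on the low-tech worksheet can be mirrored in a few lines of Python. This is a minimal sketch and not part of the provided materials: the function names are hypothetical, the interval uses the unpooled standard error, and `t_star` is a critical value that has already been looked up (for example in StatKey). The numbers in the example are made up for illustration and are not Olympic results.

```python
import math


def two_sample_t_interval(mean1, sd1, n1, mean2, sd2, n2, t_star):
    """Return (lower, upper) for mean1 - mean2 using an unpooled standard error.

    t_star is the t critical value the user has looked up separately.
    """
    diff = mean1 - mean2
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = t_star * standard_error
    return diff - margin, diff + margin


def conservative_df(n1, n2):
    """Degrees of freedom often used by hand: the smaller sample size minus 1."""
    return min(n1, n2) - 1


def welch_df(sd1, n1, sd2, n2):
    """Welch's approximate degrees of freedom, as statistical software often reports."""
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Illustrative summary statistics only (NOT taken from the Olympic data).
# With n1 = 16 and n2 = 9 the conservative df is 8, and the 95% critical
# value for 8 degrees of freedom is 2.306.
lower, upper = two_sample_t_interval(60.0, 2.0, 16, 58.0, 1.5, 9, t_star=2.306)
print(f"df = {conservative_df(16, 9)}, interval = ({lower:.2f}, {upper:.2f})")
```

Passing the critical value in as an argument keeps the sketch free of extra libraries and matches the worksheet's workflow of looking up the t-score separately.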

Since the data are provided, instructors are encouraged to adapt the worksheets so students can make calculations and build graphics using their preferred software. The tech worksheet is designed to facilitate this approach.

Data

The data set contains 218 rows and 10 columns. Each row represents a female swimmer who competed in a 100m Olympic event during the period 1964 to 2024. The data cover the top 8 finishers in each event; however, some results are missing because records are incomplete.

Download data: olympic_swimming_women.csv

Variable Descriptions

| Variable | Description |
| --- | --- |
| Location | host city of the Olympics in which the swimmer competed |
| Year | year in which the swimmer competed |
| dist_m | distance of the race, in meters |
| Stroke | Backstroke, Butterfly, or Freestyle |
| Gender | gender of the swimmer (Female only - see below) |
| Team | three-letter code of the country the swimmer is affiliated with |
| Athlete | first and last name of the swimmer |
| Time | time, in seconds, in which the swimmer completed the race |
| Rank | finishing place of the swimmer in the event (out of eight) |
| Era | time period in which the Olympian swam, either “early” (1924-1996) or “recent” (2000-2024) |
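
As a starting point for the tech worksheet, the sketch below computes the five-number summary behind a boxplot for each group in the data. It is an illustration only, not the module's own solution: the function names are hypothetical, the file is assumed to have been downloaded to the working directory, missing times are assumed to appear as blank cells or "NA", and the quartiles follow Python's "inclusive" convention, which can differ slightly from the values other software reports.

```python
import csv
import statistics
from collections import defaultdict


def five_number_summary(values):
    """Return min, Q1, median, Q3, and max for a list of at least two numbers."""
    ordered = sorted(values)
    # method="inclusive" is one of several quartile conventions.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median, "q3": q3, "max": ordered[-1]}


def summaries_by_group(rows, group_col="Stroke", value_col="Time"):
    """Group rows (dicts, e.g. from csv.DictReader) and summarize value_col."""
    groups = defaultdict(list)
    for row in rows:
        raw = (row.get(value_col) or "").strip()
        if raw in ("", "NA"):  # assumption: missing times are blank or "NA"
            continue
        groups[row[group_col]].append(float(raw))
    return {name: five_number_summary(times) for name, times in groups.items()}


def load_rows(path="olympic_swimming_women.csv"):
    """Read the downloaded CSV into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

With the file in place, `summaries_by_group(load_rows())` summarizes times by stroke, and passing `group_col="Era"` compares the early and recent eras instead.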

Data Source

Adapted from

and

A related version of this data set, which also includes the men’s 100m events, is available on the SCORE Data Repository: olympic_swimming

That data set contains 606 rows and 10 columns and covers the period 1924 to 2020.

Materials

We provide an editable Microsoft Word handout along with its solutions.

  • The low-tech class handout is based on the provided summary statistics and graphics. It should require only a calculator and a way to look up a t-score, and it can be modified to fit the instructor’s needs.

  • The tech-required class handout assumes students will analyze the data themselves to obtain the necessary output.

  • Sample solutions are based on Minitab output.

This module gives students practice interpreting boxplots and constructing confidence intervals in a real-world setting. By analyzing Olympic swimming results, they can see how statistical methods reveal meaningful differences between strokes and across eras. The activity highlights how changes in training, technology, and participation result in improvements in swimming.