Detecting Judge Bias in Competitive Diving Using Statistical Analysis

Chi-square Test for Associations
Type I Error
Type II Error
Preliminary diving results of the women's 16-18 1m springboard event from the 2022 FINA World Junior Championships.
Authors
Affiliation

St. Lawrence University

Ivan Ramler

St. Lawrence University

Robin Lock

St. Lawrence University

Published

May 18, 2026

Introduction to Diving

If you are unfamiliar with diving, please watch this video:

Module

Please note that these materials have not yet completed the required pedagogical and industry peer reviews to become a published module on the SCORE Network. However, instructors are still welcome to use these materials if they are so inclined.

Introduction

The World Aquatics (FINA) Junior World Diving Championships is an elite diving meet where top divers from around the world, ages 16 to 18, compete. Each diver completes 9 dives, all of which contribute to their overall score. Each judge independently scores the execution of the dive on a scale of 0 to 10 in half-point increments. A score of 10 represents perfect execution, while 0 represents failed execution. To reduce the influence of unusually high or unusually low scores, the two lowest and two highest judges’ scores are discarded. The remaining scores are averaged and then multiplied by the degree of difficulty to determine the diver’s score for that dive.
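The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than official scoring software: the function name `dive_score`, the seven-judge panel, the example scores, and the degree of difficulty are all assumptions made up for the example, and the calculation follows the averaging rule exactly as stated in this module.

```python
def dive_score(judge_scores, difficulty, n_drop=2):
    """Score one dive: drop the n_drop lowest and n_drop highest judge scores,
    average the remaining scores, and multiply by the degree of difficulty."""
    if len(judge_scores) <= 2 * n_drop:
        raise ValueError("need more than 2 * n_drop judge scores")
    ordered = sorted(judge_scores)
    kept = ordered[n_drop:len(ordered) - n_drop]
    return sum(kept) / len(kept) * difficulty


# Hypothetical panel of seven judges and a hypothetical degree of difficulty.
scores = [6.5, 7.0, 7.0, 7.5, 6.0, 8.0, 7.0]
print(dive_score(scores, difficulty=2.0))  # kept scores are 7.0, 7.0, 7.0 -> 14.0
```

Here the 6.0 and 6.5 are dropped as the two lowest scores and the 7.5 and 8.0 as the two highest, so only the three middle scores affect the result.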

In this activity, students analyze results from the preliminary round of the women’s 16-18 1m springboard event at the 2022 FINA World Junior Championships. Each judge’s score is classified into one of three categories: Okay, Too Low, or Too High. A score is classified as Okay if it was kept, Too Low if it was one of the discarded low scores, and Too High if it was one of the discarded high scores.
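The classification into Okay, Too Low, and Too High can be illustrated with a short Python sketch. The function name `classify_scores`, the seven hypothetical scores, and especially the tie-breaking rule are assumptions for illustration; the source does not say how tied scores were assigned to categories in the actual data set.

```python
def classify_scores(judge_scores, n_drop=2):
    """Label each judge's score on one dive as 'Too Low', 'Too High', or 'Okay'.

    Assumption for this sketch: ties are broken by panel position, so among
    equal scores the judge listed first is treated as the lower one.
    """
    n = len(judge_scores)
    if n <= 2 * n_drop:
        raise ValueError("need more than 2 * n_drop judge scores")
    # Judge positions ordered from lowest to highest score (sorted() is stable).
    order = sorted(range(n), key=lambda i: judge_scores[i])
    labels = ["Okay"] * n
    for i in order[:n_drop]:
        labels[i] = "Too Low"
    for i in order[n - n_drop:]:
        labels[i] = "Too High"
    return labels


# Hypothetical scores from a seven-judge panel on a single dive.
scores = [6.5, 7.0, 7.0, 7.5, 6.0, 8.0, 7.0]
for judge, (score, label) in enumerate(zip(scores, classify_scores(scores)), start=1):
    print(f"Judge {judge}: {score} -> {label}")
```

With these made-up scores, Judges 1 and 5 are labeled Too Low, Judges 4 and 6 are labeled Too High, and the other three are labeled Okay.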

We are interested in whether any judge appears to score differently from the others in a way that could suggest possible judging bias. Statistically, we begin by asking whether there is evidence of an association between judge and score result. If the two are associated, we can then examine which judge or judges contributed most to that association and discuss what that pattern might imply.
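To make the analysis concrete, the following Python sketch computes a chi-square test for association and the contribution of each cell to the test statistic. It is an illustration under stated assumptions, not the worksheet's method: the function name `chi_square_association`, the three judges, and the counts are hypothetical and are not the championship data. The p-value uses a closed-form chi-square tail that is exact only for even degrees of freedom, which is always the case when one variable has exactly three categories.

```python
import math


def chi_square_association(table):
    """Chi-square test for association on a two-way table of counts.

    Returns (statistic, degrees of freedom, p-value, per-cell contributions).
    This sketch only handles even degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Each cell contributes (observed - expected)^2 / expected,
    # where expected = row total * column total / grand total.
    contributions = [
        [(obs - r * c / total) ** 2 / (r * c / total) for obs, c in zip(row, col_totals)]
        for row, r in zip(table, row_totals)
    ]
    statistic = sum(sum(row) for row in contributions)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    if df % 2:
        raise ValueError("this sketch only handles even degrees of freedom")
    # Upper-tail probability of a chi-square variable with even df.
    half = statistic / 2
    p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))
    return statistic, df, p_value, contributions


# Hypothetical counts; columns are Too Low, Okay, Too High.
judges = ["Judge A", "Judge B", "Judge C"]
observed = [[10, 30, 20], [20, 30, 10], [15, 30, 15]]
statistic, df, p_value, contributions = chi_square_association(observed)
print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value:.3f}")
for judge, row in zip(judges, contributions):
    print(judge, [round(x, 2) for x in row])
```

With these made-up counts the statistic is about 6.67 on 4 degrees of freedom, with a p-value of about 0.15. Hypothetical Judges A and B account for the entire statistic, while Judge C matches the expected counts exactly. In practice, statistical software reports the same quantities directly.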

This could serve as an in-class activity and should take roughly 30-45 minutes to complete. Questions can also be pulled from this activity to serve as shorter examples.

By the end of this activity, students should be able to:

  1. Identify the variables in a two-way table and describe the categories of a categorical response variable.

  2. Conduct and interpret a chi-square test for association between two categorical variables.

  3. Interpret a p-value and statistical decision in the context of possible judging bias.

  4. Define Type I and Type II errors in context.

  5. Discuss the real-world implications of Type I and Type II errors.
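For the objectives on Type I errors, a small simulation can make the idea concrete. The sketch below generates panels in which every judge has the same chance of each result, so the null hypothesis is true, and records how often a chi-square test at the 5% level still flags an association. Everything here is an assumption for illustration: the function names, the number of judges and scores, the 2:3:2 weights, and the simplification that results are drawn independently (a real panel always has exactly two low and two high scores per dive).

```python
import math
import random


def chi_square_p_value(table):
    """P-value of the chi-square test for association (even degrees of freedom only)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = sum(
        (obs - r * c / total) ** 2 / (r * c / total)
        for row, r in zip(table, row_totals)
        for obs, c in zip(row, col_totals)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    if df % 2:
        raise ValueError("this sketch only handles even degrees of freedom")
    half = statistic / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


def simulate_type_i_rate(n_sims=1000, n_judges=7, n_scores=100, alpha=0.05, seed=1):
    """Share of simulated unbiased panels for which the test rejects at level alpha."""
    rng = random.Random(seed)
    results = ["Too Low", "Okay", "Too High"]
    weights = [2, 3, 2]  # identical for every judge, so no judge is biased
    rejections = 0
    for _ in range(n_sims):
        table = []
        for _ in range(n_judges):
            draws = rng.choices(results, weights=weights, k=n_scores)
            table.append([draws.count(r) for r in results])
        if chi_square_p_value(table) < alpha:
            rejections += 1
    return rejections / n_sims


rate = simulate_type_i_rate()
print(f"Rejected a true null in {rate:.1%} of simulated panels")
```

The rejection rate should land near the chosen significance level: even when no judge is biased, roughly one panel in twenty would be flagged at the 5% level, which is what a Type I error means in this context.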

This activity is designed as a reinforcement module. Students are expected to have been introduced to:

  1. Two-way tables for summarizing the relationship between two categorical variables.

  2. The basic structure of a hypothesis test, including null and alternative hypotheses, test statistics, p-values, and statistical conclusions.

  3. Chi-square tests for association or independence.

  4. Type I and Type II errors in hypothesis testing.

Technology Requirement: The handout assumes that students will have access to statistical software capable of recreating a two-way table and obtaining chi-square test output from the raw data.

Data

The data set contains 360 rows and 15 columns. Each row represents a completed dive by a diver in the preliminary round of the women's (ages 16-18) 1m springboard event at the 2022 FINA World Junior Championships. Each diver completed 9 dives, so there are 9 rows per diver.

Variable          Description
LastName          Last name of the athlete
Country           Athlete’s home country
Age               Athlete’s age
TotalPoints       Total points scored at the meet
DiveNum           Order of the athlete’s dive, 1-9
DiveName          Name of the dive executed
Difficulty        Degree of difficulty of the dive
Points            Points scored on the dive
Judge             Judge identification
Points_Awarded    Score awarded by the judge
Judge_Result      Whether the judge’s score was too low, too high, or okay

The data set can be accessed here:

FemaleFINADivingChampionships.csv

Note that the activity only uses two of the available variables: Judge and Judge_Result. The additional variables are provided for extensions and modifications.
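As a rough sketch of how the two-way table could be recreated from the raw data, the Python below tallies Judge against Judge_Result using only the standard library. The function name `two_way_table` and the in-memory sample are hypothetical, and the sketch assumes each row carries one judge's result under columns named exactly as in the variable list; the layout and value labels in the actual file may differ.

```python
import csv
import io
from collections import Counter


def two_way_table(rows, row_var="Judge", col_var="Judge_Result"):
    """Count how often each (row_var, col_var) pair occurs in an iterable of dict rows."""
    counts = Counter((row[row_var], row[col_var]) for row in rows)
    row_levels = sorted({r for r, _ in counts})
    col_levels = sorted({c for _, c in counts})
    # A Counter returns 0 for pairs that never occur, so empty cells are filled in.
    return {r: {c: counts[(r, c)] for c in col_levels} for r in row_levels}


# Made-up sample standing in for the real file; judge labels and values are hypothetical.
sample = io.StringIO(
    "Judge,Judge_Result\n"
    "J1,Okay\n"
    "J1,Too High\n"
    "J2,Too Low\n"
    "J2,Okay\n"
    "J2,Okay\n"
)
table = two_way_table(csv.DictReader(sample))
for judge, result_counts in table.items():
    print(judge, result_counts)
```

To run the same tally on the real data, open FemaleFINADivingChampionships.csv and pass `csv.DictReader(f)` to `two_way_table` in place of the sample.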

Data Source

World Aquatics. (2022). FINA World Junior Diving Championships 2022. https://www.worldaquatics.com/competitions/2951/fina-world-junior-diving-championships-2022/results?event=6d65f6db-1e71-4bca-b1c3-7facf12f500f&unit=preliminary

Materials

Class Worksheet

Class Worksheet Key

In this module, you practiced conducting a chi-square test for association and learned how statistical decisions relate to Type I and Type II errors. These errors can have real-world implications that are important to consider when working with statistics.