Olympic Rowing Medals Between 1900 and 2022 - Data Wrangling
Introduction
This activity organizes Olympic rowing data from 1900 to 2022 so that each country's total medals and points can be analyzed.
The Summer Olympic Games are an international athletic event held every four years and hosted in different countries around the world. Rowing was on the program for the 1896 Olympics, but the event was cancelled due to weather. It has been in every Summer Olympics since 1900. Rowing races in the Olympic context are regatta style, meaning that multiple boats race head to head in separate lanes. Since 1912, the standard distance for Olympic regattas has been 2000 m; before then, distances varied. The first boat to cross the finish line is awarded a gold medal, the second a silver medal, and the third a bronze medal.
Over its time as an Olympic sport, rowing has had 25 different events. These events are separated by gender and vary by the number of rowers in the boat (1, 2, 4, 6, 8, 17), the rigging (inrigged or outrigged), whether the rowers scull or sweep, and whether or not the boat is coxed. In an inrigged shell, the riggers (where the oar is attached to the boat) are on the inside of the boat; in an outrigged shell, they are on the outside. In sculling, each rower has an oar on each side; in sweeping, each rower has a single oar on one side. The coxswain steers the boat and guides the rowers. Some events use coxed boats, while others do not.
The original data set (athlete_events.csv) is not limited to rowing; it covers every Olympic sport since 1896, so the data must first be filtered down to rowing. Moreover, the data set records a medal for each athlete in the boat. For Olympic scoring, however, a medal should be counted once for the entire boat. It is important to make sure this is the case when arranging the data.
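The two adjustments described above (keeping only rowing, and counting one medal per boat rather than per athlete) can be sketched as follows. This is a minimal illustration rather than the activity's solution: the sample rows are made up, only a few of the columns are included, and it assumes that non-medalists are marked with `NA` or an empty `Medal` value and that a boat is uniquely identified by the combination of `Games`, `Event`, `NOC`, and `Medal`.

```python
import csv
import io

# Hypothetical sample rows using the data set's column names. The athletes
# and results are invented for illustration only.
SAMPLE_CSV = """Name,NOC,Games,Sport,Event,Medal
Rower A,GBR,2000 Summer Olympics,Rowing,Rowing Men's Coxless Pairs,Gold
Rower B,GBR,2000 Summer Olympics,Rowing,Rowing Men's Coxless Pairs,Gold
Rower C,FRA,2000 Summer Olympics,Rowing,Rowing Men's Coxless Pairs,Silver
Rower D,FRA,2000 Summer Olympics,Rowing,Rowing Men's Coxless Pairs,Silver
Rower E,USA,2000 Summer Olympics,Rowing,Rowing Men's Single Sculls,NA
Swimmer F,USA,2000 Summer Olympics,Swimming,Swimming Men's 100 metres Freestyle,Gold
"""


def boat_medals(rows):
    """Return one (Games, Event, NOC, Medal) tuple per medal-winning rowing boat."""
    seen = set()
    boats = []
    for row in rows:
        if row["Sport"] != "Rowing":
            continue  # keep rowing only
        if row["Medal"] in ("", "NA"):
            continue  # assumption: these values mean "no medal"
        # Assumption: athletes sharing all four values rowed in the same boat.
        key = (row["Games"], row["Event"], row["NOC"], row["Medal"])
        if key not in seen:
            seen.add(key)
            boats.append(key)
    return boats


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
medals = boat_medals(rows)
for games, event, noc, medal in medals:
    print(games, event, noc, medal)
```

In this sample, the four medal-winning rowers collapse to two boat-level medals, and the swimmer and the non-medalist are dropped.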
Looking at the total medals and total points for each nation shows which nations dominate Olympic rowing. The overall distribution of medals across all countries also shows just how lopsided medaling can be in rowing at the Olympic level. This effect may be partly attributable to how much funding nations put toward their rowing teams.
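As a rough sketch of how total medals and total points per nation could be tallied once the data has one medal per boat: the text does not define the points scheme, so the weights below (3 for gold, 2 for silver, 1 for bronze) are an assumption, and the boat-level medals are made-up examples.

```python
from collections import Counter

# Assumed weights: the points scheme is not defined in the text.
POINTS = {"Gold": 3, "Silver": 2, "Bronze": 1}

# Hypothetical boat-level medals as (NOC, Medal), one entry per boat.
boat_level_medals = [
    ("GBR", "Gold"),
    ("GBR", "Bronze"),
    ("FRA", "Silver"),
    ("USA", "Gold"),
    ("GBR", "Gold"),
]


def tally(medals):
    """Return (total medals per NOC, total points per NOC) as Counters."""
    totals = Counter()
    points = Counter()
    for noc, medal in medals:
        totals[noc] += 1
        points[noc] += POINTS[medal]
    return totals, points


totals, points = tally(boat_level_medals)
# List nations from most to fewest medals, with their points alongside.
for noc, count in totals.most_common():
    print(noc, count, points[noc])
```

Sorting by medals or by points can give different rankings, which is one reason to look at both quantities.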
Data
The data set contains 271,106 athletes from 230 countries competing in 66 sports at 35 different Olympics. Each row represents an individual athlete competing in an event at an Olympics. There are 15 variables in the data set.
- Download data:
Variable Descriptions
Variable | Description |
---|---|
ID | The ID number assigned to the athlete. |
Name | The name of the athlete in first name last name order. |
Sex | The sex of the athlete. |
Age | The age of the athlete at the time of competing. |
Height | The height of the athlete at the time of competing. Measured in cm. |
Weight | The weight of the athlete at the time of competing. Measured in kg. |
Team | The team the athlete is a member of. In some cases this differs from the NOC. |
NOC | The nation the athlete is representing. Usually the best variable to choose when analyzing nations. |
Games | The Olympic Games the case is from, e.g. “1992 Summer Olympics”. |
Year | The year the athlete was competing. |
Season | The season the athlete was competing, “Summer” or “Winter”. |
City | The city that hosted the Olympics for the case. |
Sport | The sport the athlete competed in. |
Event | The event the athlete competed in. |
Medal | If the athlete medaled, which medal they won. |
Data Source
Materials
We provide editable qmd files along with their solutions.