Dakar Rally Analysis
Module
Please note that these materials have not yet completed the required pedagogical and industry peer reviews to become a published module on the SCORE Network. However, instructors are still welcome to use these materials if they are so inclined.
Introduction
The Dakar Rally is an annual off-road endurance event that typically runs for roughly two weeks and covers thousands of kilometers across challenging terrain; the most recent rally took place in Saudi Arabia. Participants, including motorcyclists, drivers, and truckers, compete in various categories and face extreme conditions such as deserts, mountains, and dunes, making it one of the toughest motorsport events in the world. For this investigation, we will look at the statistics for riders in the bike category across all 12 stages of the race. Riders can drop out or be eliminated after each stage for reasons such as mechanical failures, accidents, or injuries. If rules are violated, penalties are added to a rider's overall time, which affects the final ranking.
- In this worksheet, we will explore the data from the 2024 Dakar Rally, focusing on the bike riders' rankings and times across all 12 stages. We will fit multiple linear regression models that predict a rider's ranking from the recorded times, and examine the model summary outputs, patterns and trends, and potential outliers. We will also assess how well the models perform and carry out a nested hypothesis test to choose the best model.
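For reference, a nested hypothesis test of this kind is commonly carried out as a partial F-test. The notation below is introduced here for illustration and is not taken from the worksheet: a reduced model with $q$ predictors is nested in a full model with $p > q$ predictors, both include an intercept and are fit to the same $n$ observations, and $\mathrm{RSS}_R$ and $\mathrm{RSS}_F$ are their residual sums of squares.

$$
F = \frac{(\mathrm{RSS}_R - \mathrm{RSS}_F)/(p - q)}{\mathrm{RSS}_F/(n - p - 1)}
$$

Under the null hypothesis that the extra $p - q$ coefficients are all zero, and under the usual linear-model assumptions, $F$ follows an $F$ distribution with $p - q$ and $n - p - 1$ degrees of freedom. A large value of $F$ suggests that the additional predictors in the full model explain more than chance alone would.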
Data
The data frame contains results from the 2024 Dakar Rally, held in Saudi Arabia, for competitors in the bike category (referred to as drivers in the data) across all 12 stages of the race. It has a total of 1584 observations with 16 variables. Because drivers can drop out or be eliminated after each stage for reasons such as mechanical failures, accidents, injuries, or exceeding time limits, the field shrinks as the race goes on: the race started with 142 drivers, and by the 12th stage only 103 remained.
Data: Variable Descriptions
Variable | Description |
---|---|
Rank | The ranking of the driver in the competition |
Driver_Number | The number assigned to the driver in the competition |
Team | The team to which the driver belongs |
Country | The country of origin of the driver |
Driver | The name of the driver |
Hours | The hours component of the time |
Minutes | The minutes component of the time |
Seconds | The seconds component of the time |
Variation | The variation in time is the difference in time between drivers in their specific ranks |
Variation_Hours | The hours component of the variation in time |
Variation_Minutes | The minutes component of the variation in time |
Variation_Seconds | The seconds component of the variation in time |
Penalty_Hours | The hours component of the penalty time |
Penalty_Minutes | The minutes component of the penalty time |
Penalty_Seconds | The seconds component of the penalty time |
Stage | The stage number of the competition (0-12 stages) |
Image_URL | The URL of the image associated with the player/driver/Experience Level |
Download data: dakarRally_bikes_data.csv
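
Because each time is split into hours, minutes, and seconds columns, a first step before modeling is usually to combine them into a single numeric value. The following is a minimal Python sketch of that step using only the standard library. It assumes the CSV uses the column names from the table above and that the time components are stored as whole numbers. The sample rows are made up for illustration and are not real race results, and the helper names (`total_seconds`, `load_rows`, `drivers_per_stage`) are hypothetical.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; these are NOT real Dakar results.
# The column names follow the variable table (an assumption about the CSV).
SAMPLE_CSV = """Rank,Driver_Number,Hours,Minutes,Seconds,Stage
1,101,4,30,15,1
2,102,4,31,0,1
3,103,4,45,30,1
1,102,3,59,59,2
2,101,4,0,1,2
"""


def total_seconds(row):
    """Combine the Hours, Minutes, and Seconds columns into seconds."""
    return (
        int(row["Hours"]) * 3600
        + int(row["Minutes"]) * 60
        + int(row["Seconds"])
    )


def load_rows(handle):
    """Read CSV rows and attach a Total_Seconds field to each one."""
    rows = []
    for row in csv.DictReader(handle):
        row["Total_Seconds"] = total_seconds(row)
        rows.append(row)
    return rows


def drivers_per_stage(rows):
    """Count how many drivers have a recorded result in each stage."""
    return Counter(int(row["Stage"]) for row in rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = drivers_per_stage(rows)
for stage in sorted(counts):
    print(f"Stage {stage}: {counts[stage]} drivers")
```

To try this on the downloaded file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open("dakarRally_bikes_data.csv", newline="")`. The per-stage counts give a quick check of how the field shrinks as drivers drop out.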