library(tidyverse)
library(tidymodels)
fish <- read_csv("data/fish.csv")
Modelling fish
For this application exercise, we will work with data on fish. The dataset we will use, called `fish`, is on two common fish species in fish market sales.
The data dictionary is below:
variable | description |
---|---|
species | Species name of fish |
weight | Weight, in grams |
length_vertical | Vertical length, in cm |
length_diagonal | Diagonal length, in cm |
length_cross | Cross length, in cm |
height | Height, in cm |
width | Diagonal width, in cm |
Visualizing the model
We’re going to investigate the relationship between the weights and heights of fish.
- Demo: Create an appropriate plot to investigate this relationship. Add appropriate labels to the plot.
# add code here
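One possible version of this plot is sketched below. It assumes the data sit at `data/fish.csv`, as in the setup code at the top, and that weight is treated as the response (y-axis). The object name `p_scatter` and the label wording are choices made for this sketch.

```r
library(tidyverse)

fish <- read_csv("data/fish.csv")

# Scatterplot with height as the predictor (x) and weight as the response (y)
p_scatter <- ggplot(fish, aes(x = height, y = weight)) +
  geom_point() +
  labs(
    title = "Weights vs. heights of fish",
    x = "Height (cm)",
    y = "Weight (g)"
  )

p_scatter
```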
Your turn (5 minutes):
If you were to draw a straight line to best represent the relationship between the heights and weights of fish, where would it go? Why?
Add response here.
Now, let R draw the line for you. Refer to the documentation at https://ggplot2.tidyverse.org/reference/geom_smooth.html. Specifically, refer to the `method` section.
# add code here
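A minimal sketch of letting R draw the line, assuming the same `data/fish.csv` file as in the setup code. Here `method = "lm"` asks `geom_smooth()` for a least-squares straight line; the name `p_line` is only illustrative.

```r
library(tidyverse)

fish <- read_csv("data/fish.csv")

p_line <- ggplot(fish, aes(x = height, y = weight)) +
  geom_point() +
  # method = "lm" draws a straight line fit by least squares;
  # the shaded band is the default confidence band (se = TRUE)
  geom_smooth(method = "lm") +
  labs(
    title = "Weights vs. heights of fish",
    x = "Height (cm)",
    y = "Weight (g)"
  )

p_line
```

Setting `se = FALSE` inside `geom_smooth()` would hide the band if only the line is of interest.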
- What types of questions can this plot help answer?
Add response here.
Your turn (3 minutes):
- We can use this line to make predictions. Predict what you think the weight of a fish would be with a height of 10 cm, 15 cm, and 20 cm. Which prediction is considered extrapolation?
Add response here.
- What is a residual?
Add response here.
Model fitting
- Demo: Fit a model to predict fish weights from their heights.
# add code here
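One way to fit this model with tidymodels is sketched here. It relies on `linear_reg()`, which uses the `"lm"` engine by default, and on the formula `weight ~ height` (response on the left, predictor on the right). The object name `fish_fit` is a choice made for this sketch.

```r
library(tidyverse)
library(tidymodels)

fish <- read_csv("data/fish.csv")

# Specify a linear regression and fit weight as a function of height
fish_fit <- linear_reg() |>
  fit(weight ~ height, data = fish)

fish_fit
```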
- Your turn (3 minutes): Predict what the weight of a fish would be with a height of 10 cm, 15 cm, and 20 cm using this model.
# add code here
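Predictions for specific heights could be obtained along these lines. This is a sketch that refits the model so it runs on its own; `new_fish` and `fish_preds` are illustrative names. Note that `predict()` on a tidymodels fit takes the argument `new_data` and returns a tibble with a `.pred` column.

```r
library(tidyverse)
library(tidymodels)

fish <- read_csv("data/fish.csv")

fish_fit <- linear_reg() |>
  fit(weight ~ height, data = fish)

# Heights (in cm) at which to predict weight
new_fish <- tibble(height = c(10, 15, 20))

# One predicted weight (in grams) per row of new_fish
fish_preds <- predict(fish_fit, new_data = new_fish)

bind_cols(new_fish, fish_preds)
```

Comparing these heights with `range(fish$height)` shows which of them fall outside the observed data.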
- Demo: Calculate predicted weights for all fish in the data and visualize the residuals under this model.
# add code here
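A possible approach uses `augment()`, which adds a `.pred` column and, when the outcome is present in `new_data`, a `.resid` column (observed minus predicted). The sketch below refits the model so it is runnable alone; plotting residuals against predicted values is one common choice, not the only one.

```r
library(tidyverse)
library(tidymodels)

fish <- read_csv("data/fish.csv")

fish_fit <- linear_reg() |>
  fit(weight ~ height, data = fish)

# Add predicted weights (.pred) and residuals (.resid) for every fish
fish_aug <- augment(fish_fit, new_data = fish)

# Residuals vs. predicted values; the dashed line marks a residual of zero
p_resid <- ggplot(fish_aug, aes(x = .pred, y = .resid)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(
    x = "Predicted weight (g)",
    y = "Residual (g)"
  )

p_resid
```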
Model summary
- Demo: Display the model summary including estimates for the slope and intercept along with measurements of uncertainty around them. Show how you can extract these values from the model output.
# add code here
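The summary can be displayed with `tidy()`, which returns one row per term with columns `term`, `estimate`, `std.error`, `statistic`, and `p.value`. The sketch below also shows one way to pull out individual estimates; `fish_tidy`, `intercept`, and `slope` are names chosen here for illustration.

```r
library(tidyverse)
library(tidymodels)

fish <- read_csv("data/fish.csv")

fish_fit <- linear_reg() |>
  fit(weight ~ height, data = fish)

# One row per term: "(Intercept)" and "height"
fish_tidy <- tidy(fish_fit)
fish_tidy

# Extract the estimates as plain numbers
intercept <- fish_tidy |>
  filter(term == "(Intercept)") |>
  pull(estimate)

slope <- fish_tidy |>
  filter(term == "height") |>
  pull(estimate)
```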
- Demo: Write out your model using mathematical notation.
Add response here.
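As a starting point, the general form of a fitted simple linear regression can be written as follows. The symbols $b_0$ (estimated intercept) and $b_1$ (estimated slope) are notation chosen for this sketch; their numeric values come from the model summary and are not filled in here.

```latex
\widehat{\text{weight}} = b_0 + b_1 \times \text{height}
```

The hat on weight signals a predicted value rather than an observed one.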
Correlation
We can also assess correlation between two quantitative variables.
Your turn (5 minutes):
- What is correlation? What values can correlation take?
Add response here.
- Are you good at guessing correlation? Give it a try! https://www.rossmanchance.com/applets/2021/guesscorrelation/GuessCorrelation.html
Demo: What is the correlation between heights and weights of fish?
# add code here
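The correlation could be computed with `cor()` inside `summarize()`, as in this sketch. The column name `r` and the object name `fish_cor` are illustrative. If either variable had missing values, `cor()` would return `NA` unless an argument such as `use = "complete.obs"` were supplied.

```r
library(tidyverse)

fish <- read_csv("data/fish.csv")

# Pearson correlation between height and weight (the default method)
fish_cor <- fish |>
  summarize(r = cor(height, weight))

fish_cor
```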
Adding a third variable
- Demo: Does the relationship between heights and weights of fish change if we take into consideration species? Plot two separate straight lines for the Bream and Roach species.
# add code here
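One way to draw a separate line per species is to map `species` to `color`, so that `geom_smooth()` fits a line within each group. This is a sketch under the same data assumptions as the setup code; `p_species` is an illustrative name.

```r
library(tidyverse)

fish <- read_csv("data/fish.csv")

# Mapping color to species groups both the points and the fitted lines
p_species <- ggplot(fish, aes(x = height, y = weight, color = species)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(
    title = "Weights vs. heights of fish, by species",
    x = "Height (cm)",
    y = "Weight (g)",
    color = "Species"
  )

p_species
```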
Fitting other models
- Demo: We can fit more models than just a straight line. Change the code below to read `method = "loess"`. What is different from the plot created before?
# add code here
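With the method changed, the plot might look like the sketch below. It assumes the starting code was a scatterplot with a smoother, which is a guess about the original chunk; `p_loess` is an illustrative name. `method = "loess"` fits a locally weighted smooth curve instead of a single straight line.

```r
library(tidyverse)

fish <- read_csv("data/fish.csv")

p_loess <- ggplot(fish, aes(x = height, y = weight)) +
  geom_point() +
  # loess follows local patterns in the data, so the curve can bend
  geom_smooth(method = "loess") +
  labs(
    title = "Weights vs. heights of fish",
    x = "Height (cm)",
    y = "Weight (g)"
  )

p_loess
```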