### Learning objectives

In today’s lab you will practice the following concepts from today’s class:

• Interpreting linear regression coefficients of numeric covariates
• Interpreting linear regression coefficients of categorical variables
• Applying the “2 standard error rule” to construct approximate 95% confidence intervals for regression coefficients
• Using the confint command to construct confidence intervals for regression coefficients
• Using pairs plots to diagnose collinearity
• Using the update command to update a linear regression model object
• Diagnosing violations of linear model assumptions using plot

library(tidyverse)
library(knitr)

Cars93 <- as_tibble(MASS::Cars93)
# If you want to experiment with the ggpairs command,
# you'll want to run the following code:
# install.packages("GGally")
# library(GGally)

### Linear regression with Cars93 data

(a) Use the lm() function to regress Price on: EngineSize, Origin, MPG.highway, MPG.city and Horsepower.

# Edit me
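
One possible way to set up the regression in part (a) is sketched below. It assumes the data are read straight from `MASS::Cars93`; the object name `price_fit` is illustrative and not prescribed by the lab.

```r
# Cars93 ships with the MASS package
cars <- MASS::Cars93

# `price_fit` is a hypothetical name chosen for this sketch
price_fit <- lm(Price ~ EngineSize + Origin + MPG.highway + MPG.city + Horsepower,
                data = cars)

# Print estimates, standard errors, t values and p-values
summary(price_fit)
```

Because `Origin` is a factor with levels `USA` and `non-USA`, `lm()` encodes it as a single indicator column, which is why the output contains a coefficient named `Originnon-USA`.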

(b) Use the kable() command to produce a nicely formatted coefficients table. Ensure that values are rounded to an appropriate number of decimal places.

# Edit me
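
A minimal sketch of a formatted coefficients table for part (b) follows. It refits the model so the block runs on its own; `price_fit` and `coef_table` are illustrative names, and the per-column rounding is only one reasonable choice.

```r
library(knitr)

cars <- MASS::Cars93
price_fit <- lm(Price ~ EngineSize + Origin + MPG.highway + MPG.city + Horsepower,
                data = cars)

# Matrix with columns: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(price_fit))

# `digits` may be a vector with one entry per column;
# here the p-value column keeps more decimals than the others
kable(coef_table, digits = c(2, 2, 2, 4))
```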

(c) Interpret the coefficient of Originnon-USA. Is it statistically significant?

# Edit me

(d) Interpret the coefficient of MPG.highway. Is it statistically significant?

# Edit me

(d) Use the “2 standard error rule” to construct an approximate 95% confidence interval for the coefficient of MPG.highway. Compare this to the 95% CI obtained by using the confint command.

# Edit me
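
The “2 standard error rule” takes the interval estimate ± 2 × (standard error). The sketch below computes it next to the `confint` interval so the two can be compared; the object names (`price_fit`, `approx_ci`, `exact_ci`) are illustrative.

```r
cars <- MASS::Cars93
price_fit <- lm(Price ~ EngineSize + Origin + MPG.highway + MPG.city + Horsepower,
                data = cars)

coefs <- coef(summary(price_fit))
est <- coefs["MPG.highway", "Estimate"]
se  <- coefs["MPG.highway", "Std. Error"]

# Approximate 95% CI: estimate plus or minus two standard errors
approx_ci <- c(lower = est - 2 * se, upper = est + 2 * se)

# CI from confint(), which uses the t quantile for the residual degrees of freedom
exact_ci <- confint(price_fit, parm = "MPG.highway", level = 0.95)

approx_ci
exact_ci
```

With 93 observations and 6 coefficients there are 87 residual degrees of freedom, so the t multiplier used by `confint` is about 1.99. The two intervals should therefore nearly coincide, with the 2 standard error interval very slightly wider.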

(e) Run the pairs command on the following set of variables: EngineSize, MPG.highway, MPG.city and Horsepower. Display the pairwise correlations in the plot using the panel.cor function defined below. Do you observe any collinearities?

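# Panel function for pairs(): prints the absolute correlation of x and y,
# with text size scaled by the strength of the correlation.
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...)
{
  # Temporarily switch to a unit coordinate system; restore it on exit
  usr <- par("usr"); on.exit(par(usr = usr))
  par(usr = c(0, 1, 0, 1))
  r <- abs(cor(x, y))
  # Format r alongside a reference value for consistent digits, then keep only r
  txt <- format(c(r, 0.123456789), digits = digits)[1]
  txt <- paste0(prefix, txt)
  if(missing(cex.cor)) cex.cor <- 0.4/strwidth(txt)
  # Never draw the label smaller than cex = 1
  text(0.5, 0.5, txt, cex = pmax(1, cex.cor * r))
}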

# Edit me
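
A rough sketch for part (e) is given below. It prints the correlation matrix as a numeric companion to the scatterplot matrix. Placing the correlations in the lower panels is an assumption made for this sketch, and the code falls back to a plain `pairs()` call if `panel.cor` has not been defined in the session. The names `vars` and `cor_mat` are illustrative.

```r
cars <- MASS::Cars93
vars <- cars[, c("EngineSize", "MPG.highway", "MPG.city", "Horsepower")]

# Pairwise correlations; values near 1 or -1 suggest collinearity
cor_mat <- cor(vars)
round(cor_mat, 2)

# Scatterplot matrix; the choice of lower.panel is an assumption of this sketch
if (exists("panel.cor")) {
  pairs(vars, lower.panel = panel.cor)
} else {
  pairs(vars)
}
```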

(f) Use the update command to update your regression model to exclude EngineSize and MPG.city. Display the resulting coefficients table nicely using the kable() command.

# Edit me
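
One way to drop terms with `update` is sketched here. In the update formula, `.` stands for "whatever was there before", so `. ~ . - EngineSize - MPG.city` keeps the response and removes the two named predictors. The names `price_fit` and `reduced_fit` are illustrative.

```r
cars <- MASS::Cars93
price_fit <- lm(Price ~ EngineSize + Origin + MPG.highway + MPG.city + Horsepower,
                data = cars)

# Same response, same data, minus EngineSize and MPG.city
reduced_fit <- update(price_fit, . ~ . - EngineSize - MPG.city)

# Confirm which terms remain, then inspect the coefficients matrix
formula(reduced_fit)
coef(summary(reduced_fit))
```

The coefficients matrix returned by `coef(summary(reduced_fit))` can be passed to `kable()` for display.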

(g) Does the coefficient of MPG.highway change much from the original model? Calculate a 95% confidence interval for it in the updated model and compare it to the interval you computed for MPG.highway in part (d). Does the CI change much? Explain.

# Edit me

(h) Run the plot command on the linear model you constructed in part (f). Do you notice any issues?

# Edit me
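
A minimal sketch of the diagnostic plots for part (h) follows. For self-containment it fits the reduced model directly, which is assumed to be equivalent to the updated model from part (f); `reduced_fit` is an illustrative name.

```r
cars <- MASS::Cars93
reduced_fit <- lm(Price ~ Origin + MPG.highway + Horsepower, data = cars)

# Arrange the four default diagnostic plots in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))

# By default plot() on an lm object draws: Residuals vs Fitted, Normal Q-Q,
# Scale-Location, and Residuals vs Leverage
plot(reduced_fit)

# Restore the previous plotting layout
par(old_par)
```

Curvature or a funnel shape in the Residuals vs Fitted panel, heavy tails in the Q-Q plot, or points with large leverage would each point to a different violated assumption.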