---
title: "Lab 3"
author: "Your Name Here"
output: html_document
---

##### Remember to change the `author: ` field on this Rmd file to your own name.

For the first two problems we'll use the Cars93 data set from the MASS library.

```{r}
library(tidyverse)
Cars93 <- MASS::Cars93
```

#### 1. Manipulating data frames

There are certain situations where we want to transform right-skewed data before analysing it. Taking the log of right-skewed data often helps to make it more normally distributed.

Here are histograms of the `MPG.highway` and `MPG.city` variables.

```{r, fig.width = 12}
qplot(MPG.city, data = Cars93, bins = 10)
qplot(MPG.highway, data = Cars93, bins = 10)
```

**(a)** Do the city and highway gas-mileage figures appear to have right-skewed distributions?

> Your answer: Yes. Most of the mass is concentrated near low MPG values, and there's a long right tail indicating a small proportion of cars that have very high MPG.

**(b)** Use the `mutate()` and `log()` functions to create a new data frame called `Cars93.log` that has `MPG.highway` and `MPG.city` replaced with `log(MPG.highway)` and `log(MPG.city)`.

```{r}
# Overwrite both MPG columns with their natural logs
Cars93.log <- mutate(Cars93,
                     MPG.highway = log(MPG.highway),
                     MPG.city = log(MPG.city))
```

**(c)** Run the histogram commands again, this time using your new `Cars93.log` dataset instead of `Cars93`.

```{r, fig.width = 12}
qplot(MPG.city, data = Cars93.log, bins = 10)
qplot(MPG.highway, data = Cars93.log, bins = 10)
```

**(d)** Do the distributions appear less skewed than before?

> Your answer: The MPG highway distribution does look more symmetric.

#### 2. Table function

**(a)** Use the `table()` function to tabulate the data by `DriveTrain` and `Origin`.

```{r}
table(Cars93$DriveTrain, Cars93$Origin)
```

**(b)** Repeat part **(a)**, this time using the `count()` function.

```{r}
Cars93 %>%
  count(DriveTrain, Origin)
```

**(c)** Does it look like foreign car manufacturers had different Drivetrain production preferences compared to US manufacturers?

> Your answer: The counts for each Drivetrain category are nearly the same for US and non-US manufacturers. The table suggests that they had similar Drivetrain production preferences.

#### 3. Functions, lists, and if-else practice

**(a)** Write a function called `isPassingGrade` whose input `x` is a number, and which returns `FALSE` if `x` is lower than 50 and `TRUE` otherwise.

```{r}
isPassingGrade <- function(x) {
  x >= 50
}
```

**(b)** Write a function called `sendMessage` whose input `x` is a number, and which prints `Congratulations` if `isPassingGrade(x)` is `TRUE` and prints `Oh no!` if `isPassingGrade(x)` is `FALSE`.

```{r}
sendMessage <- function(x) {
  if (isPassingGrade(x)) {
    print("Congratulations")
  } else {
    print("Oh no!")
  }
}

# Here's another way of accomplishing the same thing
sendMessage2 <- function(x) print(ifelse(isPassingGrade(x), "Congratulations", "Oh no!"))
```

**(c)** Write a function called `gradeSummary` whose input `x` is a number. Your function will return a list with two elements, named `letter.grade` and `passed`.

The letter grade will be `"A"` if `x` is at least `90`. The letter grade will be `"B"` if `x` is at least `80` but lower than `90`. The letter grade will be `"F"` if `x` is lower than `80`.

If the student's letter grade is an A or B, `passed` should be `TRUE`; `passed` should be `FALSE` otherwise.

```{r}
gradeSummary <- function(x) {
  # Conditions are checked in order, so the second branch covers 80 <= x < 90
  if (x >= 90) {
    letter.grade <- "A"
    passed <- TRUE
  } else if (x >= 80) {
    letter.grade <- "B"
    passed <- TRUE
  } else {
    letter.grade <- "F"
    passed <- FALSE
  }
  list(letter.grade = letter.grade, passed = passed)
}

gradeSummary(91)
gradeSummary(62)
```

To check if your function works, try the following cases:

`x = 91` should return

```{r, echo = FALSE}
list(letter.grade = "A", passed = TRUE)
```

`x = 62` should return

```{r, echo = FALSE}
list(letter.grade = "F", passed = FALSE)
```
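
The `gradeSummary` function in part **(c)** works on one number at a time, because `if` expects a single `TRUE` or `FALSE`. As an optional aside, here is a minimal sketch of a vectorized variant that applies the same cutoffs with base R's `cut()`. The name `letterGrades` is made up for illustration and is not part of the lab.

```{r}
# Hypothetical vectorized variant of gradeSummary(); not required for the lab.
letterGrades <- function(x) {
  # right = FALSE gives the bins [-Inf, 80), [80, 90), [90, Inf),
  # so a score of exactly 80 is a "B" and exactly 90 is an "A"
  letter.grade <- as.character(
    cut(x, breaks = c(-Inf, 80, 90, Inf), labels = c("F", "B", "A"), right = FALSE)
  )
  list(letter.grade = letter.grade, passed = letter.grade != "F")
}

# Same test cases as above, plus a score that sits on a cutoff
letterGrades(c(91, 62, 80))
```

Each element of the returned list is now a vector with one entry per input score, so the result for a single score matches what `gradeSummary` returns for that score.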