Data Science with R

Ex-1-1: Data Wrangling

1 Filter

1.1 Exercise 1

Objective: Filter the mtcars dataset for cars that have a manual transmission (am == 1; in mtcars, 0 is automatic and 1 is manual) and weigh more than 3,000 lbs (wt > 3, since wt is recorded in units of 1,000 lbs).

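One possible answer, sketched with dplyr (assumed to be installed). The name `heavy_am1` is illustrative only.

```r
library(dplyr)

# mtcars ships with base R; wt is recorded in units of 1,000 lbs,
# so wt > 3 means "heavier than 3,000 lbs".
heavy_am1 <- mtcars %>%
  filter(am == 1, wt > 3)

heavy_am1
```

Separating conditions with a comma inside filter() combines them with a logical AND.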

1.2 Exercise 2

Objective: Filter the iris dataset for flowers of the species setosa where the sepal length (Sepal.Length) exceeds 5 cm.

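A minimal sketch of this filter, assuming dplyr is available; `long_setosa` is a made-up name for the result.

```r
library(dplyr)

# Species is a factor; comparing it to the string "setosa" works as expected.
long_setosa <- iris %>%
  filter(Species == "setosa", Sepal.Length > 5)

long_setosa
```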

1.3 Exercise 3

Objective: Filter the diamonds dataset (from ggplot2) for diamonds with a cut of “Premium” and a carat size between 1 and 2.

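The sketch below is one way to write this, assuming the data is `ggplot2::diamonds` and that "between 1 and 2" is meant inclusively (both are assumptions, as is the name `premium_mid`).

```r
library(dplyr)

# between(x, left, right) is shorthand for x >= left & x <= right (inclusive).
premium_mid <- ggplot2::diamonds %>%
  filter(cut == "Premium", between(carat, 1, 2))

premium_mid
```

If the bounds should be exclusive, `carat > 1, carat < 2` can replace the between() call.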

1.4 Exercise 4

Objective: Filter the airquality dataset for days in June (Month == 6) where the ozone level (Ozone) exceeded 100 (ignoring NA values).

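A possible answer, assuming the built-in `airquality` data and dplyr; `june_high_ozone` is an illustrative name. No explicit NA check is needed here, because filter() keeps only rows where the condition evaluates to TRUE.

```r
library(dplyr)

# Rows where Ozone is NA make `Ozone > 100` evaluate to NA,
# and filter() drops those rows automatically.
june_high_ozone <- airquality %>%
  filter(Month == 6, Ozone > 100)

june_high_ozone
```

The result may well be empty if no June day meets the threshold; an empty data frame is still a valid answer.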

1.5 Exercise 5

Objective: Filter the ChickWeight dataset for records of chicks (Chick) numbered 1 to 5 (inclusive) at times (Time) less than or equal to 10 days.

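One way this could look, assuming the built-in `ChickWeight` data. Chick is stored as an ordered factor whose level order is not numeric, so the sketch matches on labels rather than using `<=`. Converting to a tibble first is a precaution taken here, not a requirement stated by the exercise, and `chicks_early` is a made-up name.

```r
library(dplyr)

# Match chick labels "1" to "5" explicitly; comparing the ordered factor
# with <= would follow the factor's level order, not the numeric order.
chicks_early <- as_tibble(ChickWeight) %>%
  filter(Chick %in% as.character(1:5), Time <= 10)

chicks_early
```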

2 Mutate

2.1 Exercise 1

Objective: Add a column named efficiency to the mtcars dataset that divides miles per gallon (mpg) by the number of cylinders (cyl).

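A short sketch of the new column, assuming dplyr and the built-in `mtcars` data; `mtcars_eff` is an illustrative name.

```r
library(dplyr)

# mutate() adds the new column and keeps all existing ones.
mtcars_eff <- mtcars %>%
  mutate(efficiency = mpg / cyl)

head(mtcars_eff)
```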

2.2 Exercise 2

Objective: In the iris dataset, create a new column named area that multiplies sepal length (Sepal.Length) by sepal width (Sepal.Width).

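As a rough sketch, assuming dplyr and the built-in `iris` data (the name `iris_area` is hypothetical):

```r
library(dplyr)

# The product is a rectangular approximation of sepal area, in cm^2.
iris_area <- iris %>%
  mutate(area = Sepal.Length * Sepal.Width)

head(iris_area)
```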

2.3 Exercise 3

Objective: In the diamonds dataset, calculate the price per carat (price divided by carat) and name the new column price_per_carat.

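A possible answer, assuming the data is `ggplot2::diamonds`; `diamonds_ppc` is an illustrative name.

```r
library(dplyr)

# price is in US dollars, so price_per_carat is dollars per carat.
diamonds_ppc <- ggplot2::diamonds %>%
  mutate(price_per_carat = price / carat)

head(diamonds_ppc)
```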

2.4 Exercise 4

Objective: In the airquality dataset, convert the temperature from Fahrenheit (Temp) to Celsius and name the new column TempC. The formula is C = (F - 32) * 5/9.

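The conversion formula can be applied directly inside mutate(). This sketch assumes the built-in `airquality` data; `airquality_c` is a made-up name.

```r
library(dplyr)

# Apply C = (F - 32) * 5/9 to every row of Temp.
airquality_c <- airquality %>%
  mutate(TempC = (Temp - 32) * 5 / 9)

head(airquality_c)
```

As a quick sanity check on the formula, 32 °F maps to 0 °C and 212 °F maps to 100 °C.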

3 Group summary

3.1 Exercise 1

Objective: Group the mtcars dataset by the number of cylinders (cyl) and compute the average miles per gallon (mpg) for each group.

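A minimal sketch, assuming dplyr and the built-in `mtcars` data. The result name `mpg_by_cyl` and the column name `avg_mpg` are illustrative choices.

```r
library(dplyr)

# One output row per distinct value of cyl.
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarize(avg_mpg = mean(mpg))

mpg_by_cyl
```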

3.2 Exercise 2

Objective: Group the iris dataset by flower species (Species) and calculate the average sepal length (Sepal.Length) and sepal width (Sepal.Width) for each species.

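Several summaries can be computed in a single summarize() call. The sketch assumes the built-in `iris` data; the output column names are hypothetical.

```r
library(dplyr)

sepal_by_species <- iris %>%
  group_by(Species) %>%
  summarize(
    avg_sepal_length = mean(Sepal.Length),  # mean length per species
    avg_sepal_width  = mean(Sepal.Width)    # mean width per species
  )

sepal_by_species
```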

3.3 Exercise 3

Objective: Group the diamonds dataset by cut and color and compute the median price for each combination.

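Grouping by two variables gives one row per observed combination. This sketch assumes `ggplot2::diamonds`; `.groups = "drop"` is an optional choice made here to return an ungrouped result, and the names are illustrative.

```r
library(dplyr)

median_price <- ggplot2::diamonds %>%
  group_by(cut, color) %>%
  # Drop the grouping afterwards so later steps start from a plain tibble.
  summarize(median_price = median(price), .groups = "drop")

median_price
```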

3.4 Exercise 4

Objective: Group the airquality dataset by month (Month) and compute the maximum temperature (Temp) and average ozone level (Ozone, omitting NA values) for each month.

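A possible answer, assuming the built-in `airquality` data. Unlike filter(), mean() does not skip missing values on its own, so `na.rm = TRUE` is needed for Ozone. The output names are illustrative.

```r
library(dplyr)

monthly_summary <- airquality %>%
  group_by(Month) %>%
  summarize(
    max_temp  = max(Temp),                  # Temp has no missing values
    avg_ozone = mean(Ozone, na.rm = TRUE)   # skip NA readings
  )

monthly_summary
```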

3.5 Exercise 5

Objective: Group the ChickWeight dataset by diet (Diet) and chick number (Chick). For each combination, compute the final weight (i.e., the weight at the maximum time).

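One way to pick "the weight at the maximum time" is to index weight with which.max(Time) inside summarize(). The sketch assumes the built-in `ChickWeight` data; converting to a tibble and the name `final_weights` are choices made for this example.

```r
library(dplyr)

final_weights <- as_tibble(ChickWeight) %>%
  group_by(Diet, Chick) %>%
  # which.max(Time) gives the position of the latest measurement in each group.
  summarize(final_weight = weight[which.max(Time)], .groups = "drop")

final_weights
```

Not every chick is measured up to the same day, so the final weights may refer to different time points.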

4 Use all

4.1 Exercise 1: Calculate Average MPG by Cylinder

Task: Filter the mtcars dataset to cars with more than 100 horsepower (hp > 100). Then, for these cars, calculate the average miles per gallon (mpg) for each number of cylinders (cyl).

Functions to use: filter(), group_by(), summarize()

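The steps chain together in one pipeline. This is a sketch under the assumption that the data is the built-in `mtcars`; the names `mpg_by_cyl_hp100` and `avg_mpg` are illustrative.

```r
library(dplyr)

mpg_by_cyl_hp100 <- mtcars %>%
  filter(hp > 100) %>%              # keep cars above 100 horsepower
  group_by(cyl) %>%                 # one group per cylinder count
  summarize(avg_mpg = mean(mpg))    # average mpg within each group

mpg_by_cyl_hp100
```

Filtering before grouping means the averages describe only the higher-powered cars, not the whole dataset.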

4.2 Exercise 2: Adjusted Price Calculation

Task: Filter the diamonds dataset to diamonds with an “Ideal” cut and a carat below 1. Calculate an adjusted price that is 90% of the original price. Finally, calculate the average adjusted price for each clarity level.

Functions to use: filter(), mutate(), group_by(), summarize()

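A possible pipeline using all four verbs, assuming `ggplot2::diamonds`. The column names `adjusted_price` and `avg_adjusted_price` are made up for this sketch.

```r
library(dplyr)

adjusted_by_clarity <- ggplot2::diamonds %>%
  filter(cut == "Ideal", carat < 1) %>%           # Ideal cut, under 1 carat
  mutate(adjusted_price = price * 0.9) %>%        # 90% of the original price
  group_by(clarity) %>%
  summarize(avg_adjusted_price = mean(adjusted_price))

adjusted_by_clarity
```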

4.3 Exercise 3: Compute Average Displacement per Cylinder by Gear

Task: Filter the mtcars dataset to cars with 4 or 6 cylinders. Create a new column named disp_per_cyl that calculates the engine displacement (disp) per cylinder (cyl). Then compute the average disp_per_cyl for each gear (gear) level.

Functions to use: filter(), mutate(), group_by(), summarize()

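As a sketch, assuming the built-in `mtcars` data (where disp is engine displacement in cubic inches); the result names are illustrative.

```r
library(dplyr)

disp_by_gear <- mtcars %>%
  filter(cyl %in% c(4, 6)) %>%                    # 4- or 6-cylinder cars only
  mutate(disp_per_cyl = disp / cyl) %>%           # displacement per cylinder
  group_by(gear) %>%
  summarize(avg_disp_per_cyl = mean(disp_per_cyl))

disp_by_gear
```

Using `%in%` keeps the condition compact; `cyl == 4 | cyl == 6` would select the same rows.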