Data Science with R (AECN 896-05)


Ex-1-2: Merge Datasets

Abstract
Data Wrangling

1 Merging with a single key

1.1 Exercise 1

Datasets: economics from the ggplot2 package and a fictitious dataset for financial events.

Task: Create a fictitious dataset that assigns a financial event to specific dates, then merge it into the economics dataset with left_join, using date as the key.

Load and Create Dataset:

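One possible approach, sketched under assumptions: the event table `financial_events`, its dates, and its labels are invented for illustration and are not part of the exercise materials.

```r
library(dplyr)
library(ggplot2)  # supplies the economics dataset

# Hypothetical event table: dates and labels are invented for illustration
financial_events <- tibble(
  date  = as.Date(c("1987-10-01", "2001-03-01", "2008-09-01")),
  event = c("Event A", "Event B", "Event C")
)

# Keep every row of economics; event is NA where no date matches
economics_events <- left_join(economics, financial_events, by = "date")

# Inspect only the months that picked up an event
filter(economics_events, !is.na(event))
```

Because left_join keeps all rows from the left table, the merged data has the same number of rows as economics, with event filled in only for the matching dates.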

1.2 Exercise 2

Datasets: chickwts from the datasets package.

Task: The chickwts dataset contains the weight of chicks alongside feed type. Create a fictitious dataset that provides pricing information for each feed type. Join these datasets based on the feed type.

Load Dataset:

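A minimal sketch of one way to do this. The lookup table `feed_prices` and its `price_per_kg` values are hypothetical, and left_join is an assumed choice since the task does not name a join type.

```r
library(dplyr)

# Hypothetical price lookup: one row per feed type, values invented
feed_prices <- tibble(
  feed = c("casein", "horsebean", "linseed",
           "meatmeal", "soybean", "sunflower"),
  price_per_kg = c(4.0, 1.5, 2.0, 3.0, 2.5, 2.2)
)

# feed is a factor in chickwts; convert it to character so both
# key columns share a type before joining
chick_prices <- chickwts %>%
  mutate(feed = as.character(feed)) %>%
  left_join(feed_prices, by = "feed")

head(chick_prices)
```

Each chick row picks up the price of its feed type, so the lookup table's six rows are repeated across all observations.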

1.3 Exercise 3

Datasets: PlantGrowth from the datasets package.

Task: The PlantGrowth dataset provides information about the weight of plants under different treatment conditions. Create a fictitious dataset that assigns a scientific team responsible for each treatment type. Merge these datasets based on the group column.

Load Dataset:

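One way this could look, as a sketch. The table `team_lookup` and the team names are hypothetical, and left_join is an assumed choice.

```r
library(dplyr)

# Hypothetical lookup: team names are invented for illustration
team_lookup <- tibble(
  group = c("ctrl", "trt1", "trt2"),
  team  = c("Team A", "Team B", "Team C")
)

# group is a factor in PlantGrowth; convert it to character so the
# key columns have the same type in both tables
plant_teams <- PlantGrowth %>%
  mutate(group = as.character(group)) %>%
  left_join(team_lookup, by = "group")

# Check that every treatment group maps to exactly one team
distinct(plant_teams, group, team)
```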

2 Merging with multiple keys

2.1 Exercise 1

Datasets: Two fictitious datasets, one containing student enrollment details and another containing their grades.

Task: Join the enrollment dataset with the grades dataset using student_id and semester as the key variables.

Load Dataset:

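A sketch of a two-key join. Both tables, including the `course` and `grade` columns and all values, are made up here, and left_join is an assumed choice since the task does not specify the join type.

```r
library(dplyr)

# Hypothetical enrollment records
enrollment <- tibble(
  student_id = c(1, 1, 2, 3),
  semester   = c("Fall", "Spring", "Fall", "Fall"),
  course     = c("Econ 101", "Econ 102", "Econ 101", "Stat 201")
)

# Hypothetical grades; student 3 has no grade recorded
grades <- tibble(
  student_id = c(1, 1, 2),
  semester   = c("Fall", "Spring", "Fall"),
  grade      = c("A", "B", "A")
)

# A row matches only when student_id AND semester both agree
enrollment_grades <- left_join(
  enrollment, grades,
  by = c("student_id", "semester")
)

enrollment_grades
```

Joining on student_id alone would pair each of student 1's enrollment rows with both of that student's grades; adding semester to the key keeps the match one-to-one.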

2.2 Exercise 2

Datasets: Two fictitious datasets, one listing employee details and another detailing their project assignments.

Task: Merge the employees dataset with the projects dataset using both department and role as the key variables.

Load Dataset:

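One possible version, sketched with invented data. The `employee` and `project` columns and all values are hypothetical, and left_join is an assumed choice.

```r
library(dplyr)

# Hypothetical employee details
employees <- tibble(
  employee   = c("Ana", "Ben", "Cara", "Dev"),
  department = c("Sales", "Sales", "IT", "IT"),
  role       = c("Analyst", "Manager", "Analyst", "Manager")
)

# Hypothetical assignments: one project per department-role pair,
# with no entry for Sales managers
projects <- tibble(
  department = c("Sales", "IT", "IT"),
  role       = c("Analyst", "Analyst", "Manager"),
  project    = c("Project X", "Project Y", "Project Z")
)

# Match on the combination of department and role
employee_projects <- left_join(
  employees, projects,
  by = c("department", "role")
)

employee_projects
```

In this sketch the Sales manager has no matching row in projects, so project is NA for that employee while the other rows are filled in.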

2.3 Exercise 3

Datasets: Two fictitious datasets, one containing transaction records and another with product pricing.

Task: Join the transactions dataset with the pricing dataset using both transaction_date and product_code as the key variables.

Load Dataset:

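A sketch under assumptions: the `quantity` and `unit_price` columns, the dates, and the product codes are all invented, left_join is an assumed choice, and the `revenue` column is an optional extra beyond what the task asks for.

```r
library(dplyr)

# Hypothetical transaction records
transactions <- tibble(
  transaction_date = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02")),
  product_code     = c("P1", "P2", "P1"),
  quantity         = c(2, 1, 5)
)

# Hypothetical pricing: the price of a product can differ by date
pricing <- tibble(
  transaction_date = as.Date(c("2024-01-01", "2024-01-01",
                               "2024-01-02", "2024-01-02")),
  product_code     = c("P1", "P2", "P1", "P2"),
  unit_price       = c(10, 20, 12, 20)
)

# Match each transaction to the price of that product on that date,
# then compute revenue from the joined price
sales <- transactions %>%
  left_join(pricing, by = c("transaction_date", "product_code")) %>%
  mutate(revenue = quantity * unit_price)

sales
```

Both keys are needed here because product P1 has a different price on each date; joining on product_code alone would attach both prices to every P1 transaction.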
 
