Homework 3 - Joining data
Overview: The goal of this assignment is to join two (or more) sources of data to create a single visualization, providing insight that could not be obtained from a single dataset alone. This is a core task in data analytics and visualization, as we often want to show relationships between different features of the world (though be sure to avoid the dreaded spurious correlations!)
Requirements:
Choose two or more public datasets, each with at least two variables, including at least one shared variable to join on.
As before, there are no strict restrictions on what datasets you choose, but you should aim to choose datasets that are unique, insightful, and require minimal preprocessing.
There are no restrictions on what variable(s) you choose to join on. Common choices include time (day, month, year, etc.), geography (city, county, state, country), or entity (school, company, etc.).
You may need to pre-process each dataset to ensure alignment of the joining variable.
Create a visualization involving at least 3 variables from across your datasets. The specifics of the visualization are up to you, but it should be relevant and insightful.
Include a descriptive caption explaining the context.
Cite and provide a link to the source of each dataset used. Specify which variables came from each dataset.
Aim to make your visualization polished and legible.
You should follow the standard instructions for submitting this assignment on Canvas.
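
For the requirement about pre-processing the joining variable, the following is a minimal sketch of what aligning and joining two datasets might look like. It uses only the Python standard library. The rows, the column names (state, year, value_a, value_b), and the helper functions are all made up for illustration and are not part of the assignment. It also assumes each key appears at most once in the second dataset. In practice, a library such as pandas would handle the join itself, but the key-alignment step is the same idea.

```python
def normalize_key(state, year):
    """Align the join key: trim whitespace, lowercase the name, make year an int."""
    return (str(state).strip().lower(), int(year))


def inner_join(left_rows, right_rows):
    """Inner-join two lists of dict rows on the normalized (state, year) key.

    Assumes each key appears at most once in right_rows (hypothetical data).
    """
    # Index the right-hand dataset by its normalized key for fast lookup.
    right_index = {
        normalize_key(row["state"], row["year"]): row for row in right_rows
    }

    joined = []
    for row in left_rows:
        key = normalize_key(row["state"], row["year"])
        match = right_index.get(key)
        if match is None:
            continue  # No partner in the other dataset, so the row is dropped.
        merged = {"state": key[0], "year": key[1]}
        for source in (row, match):
            for name, value in source.items():
                if name not in ("state", "year"):
                    merged[name] = value
        joined.append(merged)
    return joined


# Made-up example rows: the keys disagree in case, whitespace, and type.
dataset_a = [
    {"state": "Ohio ", "year": "2020", "value_a": 10},
    {"state": "texas", "year": "2020", "value_a": 20},
    {"state": "Utah", "year": "2021", "value_a": 30},
]
dataset_b = [
    {"state": "ohio", "year": 2020, "value_b": 1.5},
    {"state": " TEXAS", "year": 2020, "value_b": 2.5},
    {"state": "Iowa", "year": 2021, "value_b": 3.5},
]

joined = inner_join(dataset_a, dataset_b)
for row in joined:
    print(row)
```

Without the normalization step, none of these rows would match, because "Ohio " and "ohio" (or "2020" and 2020) compare as different values. It is worth checking how many rows survive the join, since unmatched rows such as Utah and Iowa here are dropped silently.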