7 Group task
By this point, you have hopefully learnt how to create different plots in R and were able to follow the instructions. For the next step in your learning, it is time to apply your newfound R skills to novel data.
First, arrange yourselves into groups. If we have a full room of 20 people, we recommend five groups of four members. This should spread the expertise while not leaving people with nothing to do. If we end up with a different number, we can tweak the group sizes during the workshop.
We have collated four data sets - some from academic articles and others from statistics agencies - for you to choose from for a final group task. We created an OSF project to host all the data and we added a code book so you know what each variable represents. The data sets come from a range of sources, so hopefully your group can find one you all find interesting. You will not need to use every single variable, so a key part of this task will be deciding what message you are trying to communicate and what type of plot would be best suited to the type of data you use.
You can find a direct link to each data set below:
Farias et al. (2019) – Atheists' and Christians' motivations to hike a pilgrimage trail. Since we have not covered data wrangling, we have included two versions of this data set: one for plotting all the motivation sub-scales side by side (Farias_2019_long.csv) and one for choosing just one or two sub-scales (Farias_2019.csv).
Dawtry et al. (2015) – Perceptions of income inequality based on people's household income, their estimated average income, how fair they perceive the current system to be, and how much they support wealth redistribution.
Road accidents – data from Glasgow City Council on the number of road accidents. You also have information on the day of the week each accident happened, the speed limit in the area where it happened, and the accident severity.
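If you are unsure which version of the Farias et al. (2019) data to use, the sketch below shows why a long format makes it easy to plot sub-scales side by side. The data here are invented for illustration, and the column names (`id`, `group`, `subscale`, `score`) are assumptions rather than the real variable names, so check the code book before adapting it.

```r
library(ggplot2)

# Invented long-format data: one row per participant per sub-scale.
# Column names and values are hypothetical; the real ones are in the code book.
long_dat <- data.frame(
  id       = rep(1:6, times = 2),
  group    = rep(rep(c("Group 1", "Group 2"), each = 3), times = 2),
  subscale = rep(c("subscale_1", "subscale_2"), each = 6),
  score    = c(3, 4, 5, 4, 5, 6, 2, 3, 3, 5, 4, 6)
)

# With the real file saved in your working directory, you would instead use
# something like:
# long_dat <- read.csv("Farias_2019_long.csv")

# Because every sub-scale lives in one column, a single facet_wrap() call
# draws one panel per sub-scale.
subscale_plot <- ggplot(long_dat, aes(x = group, y = score)) +
  geom_boxplot() +
  facet_wrap(~ subscale) +
  labs(x = "Group", y = "Score")

subscale_plot
```

If you only want one or two sub-scales, the wide file is likely simpler, since each sub-scale can be mapped directly to an axis.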
Once you decide on a data set, we would like you to create two versions of a plot:
One “good” plot to transparently communicate the findings
One “bad” plot to misleadingly or poorly communicate the findings
The idea here is that you will demonstrate what you have learnt about plotting with ggplot and the principles of visual data communication from the start of the workshop. A key part of knowing how to create effective data visualisations is understanding how data could also be presented ineffectively. For example, a bad plot might truncate the y-axis or use counterintuitive design elements, whereas a good plot might transparently display the underlying distribution of continuous data and use a colour-blind-friendly colour palette.
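To make the contrast concrete, here is a minimal sketch of a "good" and a "bad" plot built from the same data. It uses simulated data, and the names `dat`, `group`, and `score` are placeholders rather than variables from any of the workshop data sets. It is only one possible take on each category, so treat it as a starting point rather than a model answer.

```r
library(ggplot2)

# Simulated stand-in data; `group` and `score` are hypothetical names.
set.seed(42)
dat <- data.frame(
  group = rep(c("A", "B"), each = 50),
  score = c(rnorm(50, mean = 4.8, sd = 1), rnorm(50, mean = 5.1, sd = 1))
)

# "Good": show the full distribution of the continuous variable and use a
# colour-blind-friendly viridis palette.
good_plot <- ggplot(dat, aes(x = group, y = score, fill = group)) +
  geom_violin(alpha = 0.5) +
  geom_boxplot(width = 0.2) +
  scale_fill_viridis_d(option = "E") +
  labs(x = "Group", y = "Score") +
  theme_minimal() +
  theme(legend.position = "none")

# "Bad": bars of group means hide the distribution, the truncated y-axis
# exaggerates a small difference, and red/green is hard to tell apart for
# many colour-blind readers.
bad_plot <- ggplot(dat, aes(x = group, y = score, fill = group)) +
  stat_summary(fun = mean, geom = "col") +
  coord_cartesian(ylim = c(4.5, 5.5)) +
  scale_fill_manual(values = c("red", "green")) +
  labs(x = "Group", y = "Score")

good_plot
bad_plot
```

Note that `coord_cartesian()` zooms in without dropping data, which is why the bars still draw even though the axis no longer starts at zero.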
When you have finished, save your plots and have one member of your group upload your good and bad examples to this Padlet board.
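One way to save a plot as an image file for uploading is `ggsave()`. The sketch below is only an illustration: the plot uses R's built-in `mtcars` data as a placeholder, and the file name, location, and dimensions are assumptions you should change to suit your own plot.

```r
library(ggplot2)

# Placeholder plot using the built-in mtcars data; swap in your own plot object.
example_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# Hypothetical file name, written to a temporary folder so the example runs
# anywhere; in practice you would probably save to your working directory.
out_file <- file.path(tempdir(), "good_plot.png")

# Width and height are in inches by default; dpi controls the resolution.
ggsave(filename = out_file, plot = example_plot,
       width = 6, height = 4, dpi = 300)
```

Naming the plot explicitly in the `plot` argument avoids accidentally saving whichever plot was displayed last.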
Everyone then has one good vote and one bad vote, so heart your favourite from each category. The group with the most-voted good plot and the group with the most-voted bad plot will each get a super special prize! If there is a tie, Wil and James will cast the tiebreaking votes.