Due on Wednesday June 4 (week 10) at 11:59 PM
Description
Data visualization is a huge part of data storytelling. It also happens to be one of the major topics within the R user community. In this assignment, you’ll create an infographic using your personal data project, drawing inspiration from other people’s code and visualizations. You’ll learn about how people share their code (namely using GitHub) and create data art. You’ll then create your own piece, starting from the beginning (a sketch on a piece of paper) to a finished product (a visualization made in R, compiled in Canva, Illustrator, or another platform of your choosing). In the process, you’ll learn how to use ggplot and related packages to create visualizations. All components of this assignment should be written and rendered/knitted using Quarto or RMarkdown.
What is visual storytelling?
Asking an audience to read a paragraph of text is hard, but asking people to look at an infographic is way easier. You can draw people in with a well-presented, aesthetically pleasing figure, and then present some information that they might be interested in.
Some examples of infographics that incorporate icons, images, and data include:
- Information is Beautiful’s Seaweed: food, fertilizer, feed, fuel
- Information is Beautiful’s Snake Oil Supplements?
Where to find inspiration
Many examples of cool visualizations come from #tidytuesday, a hashtag/movement for R user community members to come together and learn more about how to use R tools to visualize and tell stories about data. Each week, organizers post a data set that has been cleaned up and is ready for visualization. We’ve worked with Tidy Tuesday data in class before: fisheries and ramen.
Components
Part 1. Understand the context
a. Choose 4-5 types of visualizations you might be able to make with your data.
Find examples that might work with the kind of data you have, or could work with some wrangling (e.g. you would have to calculate a sum or a mean, or otherwise summarize your data). Make notes of these visualizations (somewhere you can refer back to).
You’re not going to be using all of the types of visualizations you choose, but you need to come up with a list that allows you to try stuff out (and throw it out, if things don’t work).
Potential resources for this step (but you can and should find others):
- Yan Holtz’s From Data to Viz and R Graph Gallery to figure out what kinds of figures you could make
- Sam Csik’s “One workflow for building effective (and pretty) {ggplot2} data visualizations.” to see how you can adjust the ggplot aesthetics to make a good looking figure
- Nguyen Chi Dung’s “Infographics Using R” for a basic example of what is possible in R
- R statistics for Political Science’s “Create infographics with the Irish leader dataset in R and Canva” to see how to integrate visualizations from R with design platforms like Canva
- Deepanshu Bhalla’s “Create Infographics with R” to see how you can use images and icons
- Krzysztof Joachimiak’s compilation of packages and more that add visual interest to plots
- Elmera Azadpour et al.’s “Jazz up your ggplots!” to see what extensions to ggplot you can use to make more beautiful plots
b. Find some examples of visualizations you like, and save them somewhere you can refer back to.
Scroll through the #tidytuesday hashtag on X to see what visualizations people are making with the weekly data sets (link).
Some cool examples (not an exhaustive list):
- Ijeamaka Anyene’s collection of visualizations
- Aditya Dahiya’s visualization of National Science Foundation grants
- Steven Ponce’s visualization of racial and ethnic disparities in reproductive health research
- Ifeoma Egbogah’s visualization of the Palmer Penguins dataset
- Cédric Scherer’s visualization of the Palmer Penguins dataset (code here)
- Victor Gauto’s visualization of Dungeons & Dragons actions
- Manasseh Oduor’s visualization of attendance at different higher education institutions
- Georgios Karamanis’s visualization of Himalayan expedition data (code here)
- Dan Oehm’s visualization of Pokémon attributes (code here)
- Nicola Rennie’s visualization of word counts from The Simpsons
The most useful examples will be ones where people have shared their code on GitHub.
The best way to understand someone’s code is to run it yourself. You can do this by forking and cloning the repository to your machine (as we did in workshop), or by copy/pasting the script into your computer. If you choose the second option, you may have to do some adjustment to file paths to get the script to read in the data correctly.
Part 2. Your assignment
Requirements
Your infographic should have a clear narrative flow, demonstrating that you know how to tell a story (about yourself!) using data visualization.
All visualization components must be done in R. When you compile your visualizations into an infographic (with background colors, text boxes, titles, etc.) you can use Canva, Illustrator, PowerPoint, Slides, etc.
Your infographic should have at least 3 data visualizations (i.e. figures that show some kind of data or numbers):
You should also incorporate at least 2 add-ons from the following list:
Your final infographic should have:
-
- how long you have been collecting it
- how many observations you have
- the date of the most recent observation
In addition to incorporating these requirements, you will be graded on:
- correct, logical visualizations for the data you have (are you presenting the data accurately?)
- clear, concise communication about your data using visualization and text descriptions of patterns or messages
- interesting and compelling aesthetic presentation to tell a clear visual story across your visualizations
In your final submission, you will also include a write-up (details below).
a. Update your data entry.
Enter all the data you have.
If you get more observations as you’re completing this assignment (and you should), you need to update your figures to use your most recent data set.
b. Sketch your visualization.
Use colored pencils, highlighters, markers, etc. to represent the axes, images, text, titles, etc. of your infographic. Do not do this in code.
Incorporate elements you liked from the visualizations you found in part 1 and clearly mark where they came from (using a footnote or direct annotation).
If you want to check in with An, you can turn in this sketch (as a photo or screenshot) by the 21st of May.
This check-in is optional. However, if you choose to do the check-in, the more detail you have on your sketch, the more thorough feedback you will receive.
c. Code up your visualizations!
Create a GitHub repo for this assignment.
Clone the repo to your computer.
Organize your data and code into separate folders.
Create a .qmd or .Rmd file to code your visualizations.
Your file should have:
All code should be annotated (see An if you are unclear about what counts as annotation).
d. Arrange your visualizations using the design platform of your choosing.
Incorporate color, text, etc. as needed to make your infographic visually interesting and compelling.
There is a required check-in for this assignment due on the 28th of May so that you can get directed feedback on your progress so far.
For the required check-in, you should have the basic components of your visualizations completed (you don’t have to have things finalized with the right colors, aesthetics, or add-ons) and you should have arranged all of your visualizations into a single infographic with at least a title.
e. Write about your process
In the “write-up” section of your file, address the following points in 1-3 sentences each:
General information about design and visualization
- What patterns are you highlighting in each visualization? Why?
- Outside of the data visualization, what aesthetic choices (e.g. color, font, arrangement) did you make and why? In what way do your choices contribute to a compelling or interesting narrative?
Sources and process
- For each visualization, describe what examples inspired that visualization (cite the author)
- Describe your coding process, for example (not the only options for this prompt): did you have to clean up/summarize the data before you started? Which geoms did you start with? What geoms did you end up using, and why?
- Describe the tools you used: other people’s code? Google/StackOverflow? ChatGPT (with examples of prompts)?
Checklist
For your final submission, you only need to submit the link to GitHub repo where your materials for this specific assignment are (make sure it is public!).
Your GitHub repository should include:
Additionally, you should be committing and pushing changes throughout your completion of this assignment. Thus, you should make sure that you have at least 10 commits/pushes with informative commit messages.