Schedule for English 146 (W 2023)
Data Stories: Theory and Practice of Data-driven Narratives in the Digital Age
Some “Class Notes” pages are protected by a password (available on the course Canvas site)
Class 1 (Jan. 12, 2023) [rescheduled from Jan. 10] — Introduction to Course
- “Hans Rosling’s 200 Countries, 200 Years, 4 Minutes — The Joy of Stats” (BBC Four video, 2010)
- Also see analysis of the video by Anjali Sharma
Class 2 (Jan. 17, 2023) — Taking a First Look at Data Stories
Optional: For a wider variety of data story formats (which you can return to for ideas in future classes), see this table of data stories classified by the kinds of techniques they use. Click around to explore. The table is from Charles D. Stolper, et al., “Emerging and Recurring Data-Driven Storytelling Techniques: Analysis of a Curated Collection of Recent Stories” (2016). (Reading the article is not required.)
1. What’s a (Good) Story?
Class 3 (Jan. 19, 2023) — The Idea of Narrative
- H. Porter Abbott, The Cambridge Introduction to Narrative (2002). Read the following:
- Chapter 1: “Narrative and Life” (Alternative online source for this chapter with better quality illustrations)
- Chapter 2: “Defining Narrative”
- Paul Tough, The Inequality Machine: How College Divides Us (2021) — Read Chap. 1, “Wanting In,” on the Amazon site by clicking on the “Look Inside” free preview of the book (or clicking on the image of the book cover). (Also available to read online here through ProQuest Ebook Central via UCSB Library’s subscription.)
Class 4 (Jan. 24, 2023) — Narrative Discourse & Structure
- H. Porter Abbott, The Cambridge Introduction to Narrative (2002). Read the following:
- Chapter 4: “The Rhetoric of Narrative”
- Chapter 5: “Closure”
- Summary of Aristotle’s Poetics (c. 335 BC) on the nature of narrative (specifically, Classical Greek tragedies)
- After reading the summary, you may wish to read these selections from Aristotle’s Poetics in English translation.
- Allan Parsons, Summary: “Story (Fabula) and Plot (Sjuzet or Sjuzhet)” (2016)
- Rebecca Ray, “Narrative Structure” (2020)
- Ella Saltmarshe, “Using Story to Change Systems” (2018)
Related Materials (optional)
Students who wish to learn more about the theory and analysis of narrative may be interested in the field of “narratology.” See the following online resource for a guide:
Due on This Date: Solo Assignment 1 — Narrative Analysis
For this class, reread chapter 1 (“Wanting In”) of Paul Tough’s The Inequality Machine: How College Divides Us (2021) via the free preview (“Look Inside”) on the Amazon site. Then, considering the chapter as a “narrative,” conduct a narrative analysis of it using this Narrative Analysis Template. Submit your completed analysis as a PDF through this course’s Canvas site here. Grading Rubric
2. What’s (Good) Data?
Class 6 (Jan. 31, 2023) — The Idea of Data
- Lisa Gitelman, ed., “Raw Data” Is an Oxymoron (2013) [alternative link]. Read the following two chapters in the book. When you are done, pick out one thing that especially interests or challenges you that our class might turn to together during discussion.
- Clara Llebot Lorente and Diana Castillo, “Data Types & File Formats” (2020)
Class 7 (Feb. 2, 2023) — Data Models & Structures
- Yin Liu, “Ways of Reading, Models for Text, and the Usefulness of Dead People” (2013)
- Joshua M. Epstein, “Why Model?” (2008)
- Wikipedia, “Data Model” (quickly look at this article to get a sense of concepts)
Form project teams of 3-4 members each. (Students will be added as members to a Google “shared drive” assigned for each team that will serve as a common workspace for team activities.)
Class 8 (Feb. 7, 2023) — From Data to Big Data
- Jonathan Stuart Ward and Adam Barker, “Undefined By Data: A Survey of Big Data Definitions” (2013)
- Matthew L. Jones
- “How We Became Instrumentalists (Again): Data Positivism since World War II” (2018) (paywalled; requires UCSB institutional access; click on “View Full Page PDF” in right sidebar to download as PDF)
- “Querying the Archive: Data Mining from Apriori to PageRank” (2017)
Class 9 (Feb. 9, 2023) — Data Science Process
- Vijay Kotu and Bala Deshpande, “Data Science Process” (2019)
- [Optional: Victoria Stodden, “The Data Science Life Cycle: A Disciplined Approach to Advancing Data Science as a Science” (2020)]
Due by time of class on this date: Solo Assignment 2 — Conceptual Spreadsheets
Using a spreadsheet program (Excel or Google Sheets), prepare two spreadsheets of data according to the instructions below. Export your spreadsheets as PDFs and submit them on this course’s Canvas site here. (See examples of “easy” and “hard” spreadsheets.) Grading Rubric
- Conceptually easy data spreadsheet: Using any books, music tracks, videos, films, or similar items with familiar data values (e.g., author, genre, date, etc.) that are easily available to you, make a very small spreadsheet about those items (e.g., covering just 5 to 10 items). Each row in your spreadsheet will be the data record of one item. Columns (with labels you create at the top) will be for the kinds of data values you are recording about your items (e.g., author name, genre, length, publisher, date, gender of author, etc.). Make a decision about the purpose of the spreadsheet—i.e., what kind of pattern or meaning you might want it to allow you to discover. On the basis of that purpose, choose what kinds of data values you want your columns to record (create 4 to 10 columns with labels for such values). For example, if your items are films, do you want to record the gender of the director, or language of the film, and why?
Finally, write into an empty cell in your spreadsheet a brief explanation of the purpose of your spreadsheet, and include any thoughts you have about your choice of data values or examples. This writing should be the equivalent of about 1-2 paragraphs, or about 200-300 words. (Use word-wrap and/or merge-cells to make all the text visible in the exported PDF of the spreadsheet.)
- Conceptually difficult data spreadsheet: Follow the same instructions as above to create a data spreadsheet for a set of items that do not have obvious, standard, or familiar data values (though they may have values assigned by scholarly specialists). For example, consider traditional American quilting patterns, traditional African masks, or human feelings, which do not have the typical kind of data values that libraries or playlists use (“author,” “publisher,” etc.). What are the important data values you can think of to record about these items, and why?
The “why” is the purpose of the spreadsheet, which in an empty cell you should explain, including any thoughts you have on your choice of data values or examples, epistemological or ethical issues you encountered, etc. (about 1-2 paragraphs, or 200-300 words).
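Optional illustration: for students who find it useful to see the row-and-column structure described above in another form, the following is a minimal sketch in Python. It is not part of the assignment, which only requires a spreadsheet program. The column labels, the five example items, and the helper names `to_csv` and `count_by` are all chosen here for illustration. The sketch shows each row as the data record of one item, each column as one kind of data value, and a simple count that stands in for the "pattern" a spreadsheet's purpose might call for.

```python
import csv
import io
from collections import Counter

# Column labels: the kinds of data values recorded about each item.
# These labels are illustrative; choose your own to fit your purpose.
COLUMNS = ["title", "author", "genre", "year"]

# Each row is the data record of one item (here, five well-known books).
ROWS = [
    ["Pride and Prejudice", "Jane Austen", "novel", 1813],
    ["Frankenstein", "Mary Shelley", "novel", 1818],
    ["Moby-Dick", "Herman Melville", "novel", 1851],
    ["Leaves of Grass", "Walt Whitman", "poetry", 1855],
    ["Poems", "Emily Dickinson", "poetry", 1890],
]


def to_csv(columns, rows):
    """Return the table as CSV text, a format spreadsheet programs can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)   # header row with the column labels
    writer.writerows(rows)     # one line per item record
    return buffer.getvalue()


def count_by(columns, rows, column_name):
    """Count how many items share each value in the named column."""
    index = columns.index(column_name)
    return Counter(row[index] for row in rows)


if __name__ == "__main__":
    print(to_csv(COLUMNS, ROWS))
    print(count_by(COLUMNS, ROWS, "genre"))
```

Adding or removing a column label changes what patterns the table can reveal, which is the decision the assignment asks you to explain. For the "conceptually difficult" spreadsheet, the hard part is deciding what the column labels should be at all.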
Class 10 (Feb. 14, 2023) — Exploring & Assessing Datasets (Part 1)
In advance of this class, explore the sources for public datasets listed below. In this class and the next, teams will draw on these sources to choose a dataset as the basis for their data-narrative project.
Teams discuss the above public dataset sources and choose a short list of two best sources, and three specific datasets from those sources, that they might want to base their data-narrative project on. The best datasets for the purpose will have four properties:
(On data quality, see the readings for Class 11. For a quick preview, see this table. It may also be helpful to look ahead to the prompt for Solo Assignment 3 due in Class 13, which asks students to describe and critique their team’s final choice of a dataset.)
Best practice is for each team to start in its Google shared drive a collaboratively created document titled, for example, “Scouting Datasets” to take notes and record outcomes.
Class 11 (Feb. 16, 2023) — Exploring & Assessing Datasets (Part 2)
- Leo L. Pipino, Yang W. Lee, and Richard Y. Wang, “Data Quality Assessment” (2002)
- Table of “Dimensions of Data Quality” (from Ken So and Ben Lorica, “Data Quality Unpacked” (2021))
- Timnit Gebru et al., “Datasheets for Datasets” (2019) (read just the main article, pp. 1-12, not the appendix)
- Andrej Zwitter, “Big Data Ethics” (2014)
Continuing from the previous class, teams settle on a single public dataset (or combination of a small number of datasets) that they will use for their data-narrative project. (Teams are free to excerpt only parts of datasets and to adapt, restructure, or add to them, so long as appropriate credit is given to the original datasets.)
Best practice is for each team to create in its Google shared drive a new folder titled “Datasets,” and in that folder to create a document for notes and planning titled “Dataset Work Log.” Also, that folder can be the place to save any downloads from datasets (e.g., downloaded CSV files or spreadsheets, downloaded visualizations, etc.).
Team Coordination Weekly Planning Reports
Beginning at the end of this week in the course, each individual team member must fill out a brief Team Coordination Weekly Planning Report by the Friday of each week. Download this template for your report about what you understand to be your own duties/role on the team project in the next week (and the duties/role of your team members too). Upload the report as a file in Canvas here for this assignment. Be sure to save the file on your computer for future use. Then, during each subsequent week, append a new report based on the template to your file (but do not delete previous reports), and upload the new file in Canvas as a resubmission for this assignment.
The purpose of the Team Coordination Weekly Planning Report is to help you with planning and recording your role on your team, and it will give the instructors a “behind the scenes” look at your process. It will also bring a sustained level of accountability to the team. (For example, if there is an issue with a team member not doing their work, or if miscommunication is happening at any stage of the process, the instructors want to know sooner rather than later!). The Team Coordination Weekly Planning Report is a confidential communication from each student to the instructors.
Although there is no grade attached to these weekly planning reports, they are required each week (beginning in week 6). Also, please feel free to reach out directly over email or otherwise to the instructors about planning or other issues.
Reminder: The duties/roles you record on this sheet should be for the upcoming week, not the past week.
3. Making Data Stories
Class 12 (Feb. 21, 2023) — Telling (and Showing) Data Stories
- Michelle Scalise Sugiyama, “The Forager Oral Tradition and the Evolution of Prolonged Juvenility” (2011) — Read only the following (page numbers are for the PDF version):
- pp. 1-2
- pp. 8-14 (beginning with “To illustrate this point, consider three knowledge sets critical to success in the foraging niche….” and ending before the “Testing the Hypothesis” section)
- Martha Kang, “Exploring the 7 Different Types of Data Stories” (2015)
- FrameWorks Institute, “The Storytelling Power of Numbers” (2015)
- Alberto Cairo
For examples of many kinds of data stories, see the readings for Class 2 and also the links in this table of data stories classified by the kinds of techniques they use. The table is from Charles D. Stolper, et al., “Emerging and Recurring Data-Driven Storytelling Techniques: Analysis of a Curated Collection of Recent Stories” (2016).
Class 13 (Feb. 23, 2023) — Showing (and Telling) Data Stories
- Edward Segel and Jeffrey Heer, “Narrative Visualization: Telling Stories with Data” (2010)
- Bongshin Lee and Nathalie Henry Riche, et al., “More Than Telling a Story: Transforming Data into Visually Shared Stories” (2015)
- Stephen Few
- Cole Nussbaumer Knaflic, Storytelling with Data (2015). Read chapter 2, “Choosing an Effective Visual” (pp. 35-69) (access over the UCSB network, or from off-campus by using the campus Pulse VPN service or the campus Library Proxy Server)
Frameworking: Teams begin filling out a Framework Planning Document that prepares for making a data-narrative project. (Download this Framework Planning Document template with detailed instructions and copy it to your team workspace.) The document asks teams to imagine their audience; the purpose of their intended data narrative (and what is at stake); their key data; their primary media and form or genre; and their main distribution channel. It then asks for a “long sentence” about the dataset that is like a free-writing exercise from which a compelling data narrative can eventually be structured.
If the worksheet cannot be finished during class, teams are expected to collaborate on finishing it outside class.
Due by time of class on this date: Solo Assignment 3 — Datasheet Report for Your Dataset
Each member of a team individually creates a “datasheet” for their team’s chosen dataset. Use this Datasheet for Dataset Template as a model. (Choose only one dataset to report on if the team is working with a combination of more than one.) For the rationale and examples of datasheets, see Timnit Gebru et al., “Datasheets for Datasets” (2019). Grading Rubric
This is a solo writing assignment. Of course, teams will have already discussed their dataset together. But each team member must write a report individually without borrowing directly from anyone else’s writing. It is fine, however, to draw on collective team discussion that has already occurred so long as there is a clear footnote or endnote crediting the team (e.g., “This idea comes from our team discussion,” or, “I borrow with variation an idea that came up in our team discussion”).
Submit this datasheet report as a PDF file through the course Canvas site here.
Class 14 (Feb. 28, 2023) — Showing & Telling Data Stories: Story Maps
- Knight Lab, StoryMap.js (see some of the examples)
- See other storytelling tools from Knight Lab
Storyboarding: Using their Framework Planning Document as a starting point, teams begin storyboarding their data narrative. (See Wikipedia article on the idea and applications of storyboarding.)
Class 15 (Mar. 2, 2023) — Showing & Telling Data Stories: Timelines
- Knight Lab, Timeline.js (see some of the examples)
- Florian Kräutli, “Visualising Cultural Data: Exploring Digital Collections Through Timeline Visualisations” (dissertation, 2017) – read pp. 100-122
- Note: Specialized statistics programs such as SPSS or programming languages such as R may also be used to create visualizations if students are familiar with them. However, this course does not assume that students have such familiarity.
Class 16 (Mar. 7, 2023) — Showing & Telling Data Stories: Data Art
- Giorgia Lupi and Stefanie Posavec, “Dear Data” (website for their project and 2016 book)
- Lisa Jevbratt, 1:1 (2) (1999-2002)
- George Legrady
Continued teamwork, now pivoting from storyboarding to creating the final data narrative project.
Class 19 (Mar. 16, 2023) — Team Presentations of Data Narratives
Due on This Date: Presentations of Team Data Narrative Projects
(Submit the projects on Canvas by end of following Monday)
The data narrative that is the team project for this course can be relatively short (because of the limited time to work in UCSB’s quarter system). It should tell/show its data in a way that answers a question, makes a recommendation, or in some other way comes to a point (or to a concluding, further question). The telling/showing of that point should be compelling because it follows some of the principles of good narrative.
The main goal is to demonstrate in compact form an understanding of the basic paradigm of an effective data story: using good data to make a good story that gets people to care about information.
Content: The data narrative project should include both text and data visualizations (and other visual elements as appropriate). It should move through at least 5 logical or narrative “scenes” (where a “scene” is loosely defined to mean an identifiably separate unit of telling/showing). There should also be an “About” statement for readers, and an “Information for the Instructors” statement (see details in Grading Rubric).
Staging Location: During design and development, data narratives can be created in a team’s Google shared drive or on the platform of a visualization service such as ArcGIS StoryMaps, Tableau Public, etc.
Final Location: By default, a project’s final location will be either the team’s Google shared drive (with permissions set to publicly viewable) or an online data visualization, mapping, or similar service. Students are also free to post stories on their own blog or other websites (e.g., a WordPress.com or Reclaim Hosting site).
Intellectual Property: Projects must be careful to respect the copyright constraints and conditions of any materials they use and make publicly viewable. In regard to the copyright status of the projects themselves: a team’s data narrative should by default be put online with a declaration that it is published under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. However, teams are free to decide on a different option. For example, they can choose alternative Creative Commons options or declare traditional, restrictive copyright in the name of an individual or individuals. (By default, the “Student Work” page on the course site will include links to team projects, though students may request otherwise.)
Presentations on the team project occur in the course’s last class.
Submit this assignment no later than the end of Monday, March 20, on the course Canvas site here in a form appropriate for the nature of your project (e.g., as a document containing a URL, a PDF of the project, etc.). Remember to provide instructors with the permissions needed to view and comment on any online materials that require sharing permissions (“commenter” permissions in Google Drive, for example).
Only one member of a team needs to submit this assignment (which is a “group assignment” in Canvas).
(Due by Mar. 21, 2023) — Final Assignment
Due March 21 (by 11:59 pm): Solo Assignment 4 — Essay About Project
Each member of a team individually writes a three-page essay (approximately 900 words) that reflects critically on their team’s data narrative project. “Critically” means that the essay should identify both the strengths and problems of the specific data narrative, and possibly also those of data narratives in general.
The essay can begin with, or include, a description of the student’s team project and its essential message. But it must go beyond that to think critically about what works well and what doesn’t in the data narrative or in data narratives generally.
Conclude the essay with a paragraph offering a utopian vision of what the ideal version of the team data narrative would add if you had all the time and resources you needed.
Address the essay to a hypothetical general audience and not just “insiders” to our class who already know all the necessary context or information about your project. Include notes that cite any sources, borrowings, or quotations.
This is a solo writing assignment. Of course, teams will have already discussed their data narrative project together. But each team member must write an essay individually without borrowing directly from anyone else’s writing. It is fine, however, to draw on collective team discussion that has already occurred so long as there is a clear footnote or endnote crediting the team (e.g., “This idea comes from our team discussion,” or, “I borrow with variation an idea that came up in our team discussion”).
Submit the essay as a Word or PDF file through the course Canvas site here.
Additional Solo Grade for Participation in Team Project and in Class Discussion
The instructors will assign an additional 10% of the final grade based on their assessment of a student’s participation throughout the course in their team project (as witnessed in visible contributions to the final project or background contributions in a team’s shared drive) as well as in class discussion. Any student who participates equally in the team project and also speaks up during class discussion should be able to earn the full 10% of this grade.
A Note About Access to Reading Materials For This Course
All readings are online. Paywalled articles can be accessed over the UCSB network (or from off-campus by using the campus Pulse VPN service or the campus Library Proxy Server). You can also try to find open-access versions of paywalled materials using the Unpaywall extension for the Chrome or Firefox browsers. (Advice: It is a good idea to download materials as early as possible in case, for example, PDFs that are currently available open-access, on the open net, or through a UCSB Library digital database subscription later become inaccessible.)
Because so many readings are online (an increasingly prevalent trend in college courses), students will need to develop a method or workflow for themselves that optimizes their ability to study the materials. While everyone has their own personal preferences and technical constraints, the following guide includes suggested options for handling online materials: