English 146DS (W 2021) – Schedule

Schedule for English 146 (W 2021):
Data Stories

Introduction

Class 1 (Jan. 5, 2021)

Readings

Please view the following before the first class of the course if possible:

Class 2 (Jan. 7, 2021) — Taking a First Look at Data Stories

Readings

1. For this “taking a first look” reading assignment, quickly explore the following examples of data stories to get an overall sense of what they are and how they work. The goal is to familiarize ourselves with some recent data stories.

2. Please also go to this table of data stories classified by the kinds of techniques they use and click around to explore. The table is from Charles D. Stolper, et al., “Emerging and Recurring Data-Driven Storytelling Techniques: Analysis of a Curated Collection of Recent Stories” (2016). (Reading the article is not required.)

1. What’s a (Good) Story?

Class 3 (Jan. 12, 2021) — The Idea of Narrative

Readings

Class 4 (Jan. 14, 2021) — Narrative Discourse & Structure

Readings

Related Materials (not required)

Students who wish to learn more about the theory and analysis of narrative may be interested in the field of “narratology.” See the following online resource for a guide:

Due on This Date: Solo Assignment 1 — Narrative Analysis
5% of final grade

For this class, reread the children’s narrative by Michael Perry (with illustrations by Lee Ballard) titled Daniel’s Ride (2001). Then, using this Narrative Analysis Form, conduct an analysis of the work in which you identify or describe the narrative’s background; characters; beginning, middle, and end (or narrative arc of rising action, agon, and falling action); difference between “story (fabula)” and “plot (syuzhet)” (if any); and method of telling. Also say whether you think this is a good story, and briefly why. Submit the form as a PDF through this course’s Gauchospace site here.

2. What’s (Good) Data?

Class 5 (Jan. 19, 2021) — The Idea of Data

Readings

  • Lisa Gitelman, ed., “Raw Data” Is an Oxymoron (2013). Read the following two chapters in the book:
    • Lisa Gitelman and Virginia Jackson, “Introduction”
    • Daniel Rosenberg, “Data before the Fact”

Class 6 (Jan. 21, 2021) — From Data to Big Data

Readings

In-Class Activity

Form project teams of 3-4 members each. (Students will be added as members to a Google “shared drive” assigned for each team that will serve as a common workspace for team activities.)

Class 7 (Jan. 26, 2021) — Data Models & Structures

Readings

Class 8 (Jan. 28, 2021) — Data Formats & Datasets

Readings

  • Clara Llebot Lorente and Diana Castillo, “Data Types & File Formats” (2020)
  • Vijay Kotu and Bala Deshpande, “Data Science Process” (2019). Read the following sections:
    • “2.1.3. Data”
    • “2.2. Data Preparation,” including the following subsections:
      • “2.2.1. Data Exploration”
      • “2.2.3. Missing Values”
      • “2.2.5. Transformation”
      • “2.2.6. Outliers”
      • “2.2.7. Feature Selection”
Due on This Date: Solo Assignment 2 — Conceptual Spreadsheets
10% of final grade

Using a spreadsheet program (Excel or Google Sheets), prepare two spreadsheets of data according to the following instructions. When you are done, save your spreadsheets as PDFs and also export their data as CSV (comma-separated values) files in .csv format. Submit both the PDFs and the CSV files on this course’s Gauchospace site here. (We will go over all this in the previous class so that everyone is familiar with this assignment.)

  • Conceptually easy data spreadsheet: Using any books, music tracks, videos, films, or similar items with familiar data values (e.g., author, genre, date, etc.) that are easily available to you in your residence or on your computer or internet, make a very small spreadsheet of data about just 5 to 10 of those items. Each row in your spreadsheet will be the data record of one item. Columns (with labels you create at the top) will be for the kinds of data values you are recording about your items (e.g., author name, genre, length, publisher, date, gender of author, etc.). Make a decision about the purpose of the spreadsheet—i.e., what kind of pattern or meaning you might want it to allow you to discover. (Write in any empty cell at the bottom of your spreadsheet a sentence or two about what this purpose is.) On the basis of that purpose, choose what kinds of data values you want your columns to record (create 4 to 10 columns with labels for such values). For example, if your items are films, do you want to record the gender of the director, or language of the film, and why?
  • Conceptually difficult data spreadsheet: Using anything ready to hand in your residence or area, or that you can find on the internet, follow exactly the same instructions as above to create a data spreadsheet for a set of items that do not have obvious, pre-established, or familiar data values (though they may have values assigned by scholarly specialists). For example, consider traditional quilting patterns or traditional African masks, which do not have the typical kind of data values that libraries or playlists use (“author,” “publisher,” etc.). What are the important data values you can think of to record about these items, and why? (The “why” is the purpose of the spreadsheet, which you should write in a sentence or two in any empty cell at the bottom of your spreadsheet.)
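
For students curious about what a CSV export looks like outside a spreadsheet program, here is a minimal Python sketch (Python is not required for this assignment). It assumes a hypothetical “conceptually easy” spreadsheet of three books. The column labels and the choice of items are illustrative assumptions only, not part of the assignment.

```python
import csv
import io

# Hypothetical column labels: each is a kind of data value chosen for a purpose.
COLUMNS = ["title", "author", "year", "genre"]

# Each dict is one row, i.e., the data record of one item (illustrative items only).
RECORDS = [
    {"title": "Pride and Prejudice", "author": "Jane Austen", "year": 1813, "genre": "novel"},
    {"title": "Frankenstein", "author": "Mary Shelley", "year": 1818, "genre": "novel"},
    {"title": "Moby-Dick", "author": "Herman Melville", "year": 1851, "genre": "novel"},
]


def to_csv_text(records, columns):
    """Serialize the records as CSV text with a header row of column labels."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_csv_text(text):
    """Read CSV text back into a list of row dictionaries (all values become strings)."""
    return list(csv.DictReader(io.StringIO(text)))


csv_text = to_csv_text(RECORDS, COLUMNS)
rows = from_csv_text(csv_text)
print(csv_text)
```

Note that CSV keeps only the cell values: the year 1813 comes back as the text "1813," and any formatting or formulas in the original spreadsheet are lost. That is one reason the assignment asks for both a PDF and a CSV file.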

Class 9 (Feb. 2, 2021) — Exploring, Assessing, & Critiquing Datasets

Readings


In advance of this class, explore the sources for public datasets listed below. In this class and the next, teams will draw on these sources to choose a dataset as the basis for their data-narrative project.

Good sources of public datasets that might be the basis for a student team’s data-narrative project for this course:

  1. MEAD (Magazine of Early American Datasets)
    Representative examples:

    1. Philadelphia Migrant Landing Reports 1798-1801 Dataset
    2. York County Probate Records 1700-1800 (what people owned in Early America)
  2. U.S. Census Data (Census Bureau)
    Suggested Data “Profiles” (In each section of a “profile,” click on a table, labeled in a format like “Table: DP05,” to see the underlying data and download it in CSV format):

    1. United States Profile
    2. Los Angeles County Profile
  3. National Archives Datasets (U.S. National Archives, Open Government Initiative)
    Suggested:

    1. Amending America: Proposed Amendments to the United States Constitution, 1787 to 2014
    2. National Historical Publications and Records Commission (NHPRC) Grants, 1965-Present
    3. Social Media at the National Archives (what people view or engage with among the National Archives’ social media posts and blogs; Excel data download)
  4. Pew Research Center Datasets (datasets from the Pew Research Center) (Data downloads are in SPSS .sav format and require SPSS to work directly with the data or to export to Excel or other formats. See UCSB student access to SPSS. However, Tableau Public will open .sav files for visualization.)
    Representative example:

    1. American News Pathways June 2020 Survey (“Americans Who Mainly Get Their News on Social Media Are Less Engaged, Less Knowledgeable”) (Download dataset)
  5. World Bank Open Data
    Representative examples:

    1. Gender Statistics
    2. Education Statistics
  6. World Health Organization Datasets (Only some datasets are downloadable)
    Suggested:

    1. Maternal, Newborn, Child and Adolescent Health and Ageing (download by clicking on “Export” to Excel icon)
  7. United Nations Statistics Division, “Other UNSD Databases” (CSV download of all data; data for particular countries and issues will need to be extracted manually into a separate spreadsheet for analysis and visualization)
  8. HUD Exchange (US Department of Housing and Urban Development)
    Suggested:

    1. 2015 Estimates of Homelessness in the U.S.

Late-breaking discoveries of good dataset sources (suggested dataset sources will be added here as the instructors or students discover them):

Other sources and search portals for datasets

  1. Google Dataset Search (search results include a variety of open and for-pay dataset sources)
  2. Humanities Data (“Humanitiesdata.com seeks to help collect and disseminate information about publicly available data of particular interest to digital humanities and humanities computing”) (tagged collection of links to datasets compiled by Matthew J. Lavin)
  3. Data Is Plural — Structured Archive (list of datasets compiled by Jeremy Singer-Vine)
  4. Wikidata (“Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others.”)
    1. Read the Wikidata “Introduction”
    2. Get example datasets (visualizations and downloadable data) through the Wikidata Query Service. (The “Examples” of queries can also teach you how to build your own queries.)
In-Class Activity

Teams discuss the above public dataset sources and choose a short list of two best sources, and three specific datasets from those sources, that they might want to base their data-narrative project on. The best datasets for the purpose will have four properties:

  • They are fully public, meaning that they can be freely downloaded and then used, adapted, or shared unencumbered by intellectual-property restrictions;
  • They are on a topic the team is interested in or thinks is important for a scholarly, social, cultural, or other reason;
  • Individually or in combination they have the potential to make for a good data narrative (e.g., there is a surprise or contradiction in the data that allows a data narrative to be told in a logical form like the following, where the “but” is the equivalent of a narrative agon or tension: “The data suggests A, but it also turns out that the data shows B”);
  • The data seems “good” in the combined senses of being sufficiently valid, complete or representative, ethical, and usable (meaning well-structured, well-documented, downloadable, and free of intellectual-property constraints). (It may be helpful to look ahead to the prompt for Solo Assignment 3 due by Class 11, which asks students to describe and critique their team’s final choice of a dataset.)
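
One quick, partial check of the “complete” criterion above is to count the blank cells in each column of a downloaded CSV file. The following Python sketch assumes a hypothetical three-row extract with made-up column names and placeholder values. A real dataset would be read from the downloaded file instead, and blank cells are only one of several ways a dataset may encode missing values.

```python
import csv
import io

# Hypothetical CSV extract; column names and values are placeholders, not real data.
SAMPLE_CSV = """region,year,population
North,2010,1200
South,2010,
East,,950
"""


def count_missing(csv_text):
    """Return a dict mapping each column name to its number of blank cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            value = row[name]
            # A short row yields None; an empty cell yields an empty string.
            if value is None or value.strip() == "":
                missing[name] += 1
    return missing


missing_counts = count_missing(SAMPLE_CSV)
print(missing_counts)
```

A count like this says nothing about whether the data is valid, representative, or ethical; those questions still require reading the dataset’s documentation and thinking critically about its source.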

Best practice is for each team to start in its Google shared drive a collaboratively created document titled, for example, “Scouting Datasets” to take notes and record outcomes.

Class 10 (Feb. 4, 2021) — Exploring, Assessing, & Critiquing Datasets (continued)

Readings


Read the following works to gain an understanding of why it is important to reflect critically on datasets in regard to their source, what they include or exclude, the way they organize or form their data, and the validity of their data:

In-Class Activity

Continuing from the previous class, teams settle on a single public dataset (or combination of a small number of datasets) that they will use for their data-narrative project. (Teams are free to excerpt only parts of datasets and to adapt, restructure, or add to them, so long as appropriate credit is given to the original datasets.)

Best practice is for each team to create in its Google shared drive a new folder titled “Datasets,” and in that folder to create a document for notes and planning titled “Dataset Work Log.” Also, that folder can be the place to save any downloads from datasets (e.g., downloaded CSV files or spreadsheets, downloaded visualizations, etc.).

3. Making Data Stories

Class 11 (Feb. 9, 2021) — Telling (and Showing) Data Stories

Readings

Due on This Date: Solo Assignment 3 — Dataset Report
15% of final grade

Each member of a team individually writes a two-page report (approximately 600 words) describing and critiquing their team’s chosen dataset. (Choose only one dataset to report on if the team is working with a combination of more than one.)

Description: The report should begin by describing the dataset objectively in regard to the following factors (add others as needed):

  • What is the source(s) of the data?
  • Who collected the data and made the dataset?
  • What is the apparent authority of the data (e.g., on a scale that runs from government or university repositories at one end to the collection data of individual hobbyists on the other end)?
  • What is the original purpose and audience of the dataset?
  • How complete and representative is the data?
  • How can you use, and how must you credit, the dataset according to the terms of service of its source? (Only a summary of essentials is needed.)

Critique: The report should end by reflecting on what is good or bad about the dataset, whether in practical, sociocultural, or ethical terms.

Include notes that cite any sources, borrowings, or quotations used in the report.

This is a solo writing assignment. Of course, teams will have already discussed their dataset together. But each team member must write a report individually without borrowing directly from anyone else’s writing. It is fine, however, to draw on collective team discussion that has already occurred so long as there is a clear footnote or endnote crediting the team (e.g., “This idea comes from our team discussion,” or, “I borrow with variation an idea that came up in our team discussion”).

Submit this report as a PDF file through the course Gauchospace site here.

Class 12 (Feb. 11, 2021) — Showing (and Telling) Data Stories

Readings

In-Class Activity

Frameworking: Teams begin filling out a Framework Planning Document (download this template [TBD] and copy to your team workspace) that prepares for making a data-narrative project. The document asks teams to imagine their audience, what is at stake in their intended data narrative, their primary media channel(s), and their key data. It then asks for a “long sentence” about the dataset that is like a free-writing exercise from which a compelling data narrative can eventually be structured.

If the worksheet cannot be finished during class, teams are expected to collaborate on finishing it outside class.

Credits: This activity combines, adapts, and adds to Cole Nussbaumer Knaflic’s suggestion of a “Big Idea worksheet”; Alberto Cairo’s suggestion of a “long sentence”; and the FrameWorks Institute’s suggestions for framing policy recommendations.

Class 13 (Feb. 16, 2021) — Showing & Telling Data Stories: Story Maps

Readings

In-Class Activity

Storyboarding: Using their Framework Planning Document as a starting point, teams begin storyboarding their data narrative. (See Wikipedia article on the idea and applications of storyboarding.)

Suggested practice:

      1. Create in your team Google shared drive a Google Slides document called “Storyboard.” You can start with this template [TBD] if you wish (download and copy to your team’s shared drive).
      2. Using or adapting your “long sentence” from the Framework Planning Document, create a slide with a short title for each logical unit of the sentence. For example, imagine that your long sentence begins as follows: “During 1970-1990, X percent of national wealth was owned by people defined as belonging to the middle class, while Y percent belonged to the lower class, and Z percent was the property of the ‘rich’; but beginning around 2000 the percentages began to rebalance significantly.” The statements about X, Y, and Z, and also the important “but” statement in the long sentence, could all be separate slides with their own titles.
      3. The next step is to try to shape the “story (fabula)” of the long sentence about your data into a “narrative (syuzhet)” with a beginning, middle, and end that follows a narrative arc of rising action, point of agon or tension, and then falling action resulting in closure. The above example of the beginning of a long sentence follows a chronology that naturally seems to pivot around a “but” that is a point of narrative agon or tension. However, if chronology or any other “as found” order of your data does not make a good narrative, then there are many other ways of shaping the narrative along an arc of rising action, crisis or tension, and then falling action. For example, each of the following ways of organizing the presentation of data creates the equivalent of a narrative arc:
        • Problem indicated by the data → Suggested Action based on a specific part of the data → Outcome
        • Background sketched by the data → Opportunity identified by a part of the data → Proposal
        • Desired Outcome (i.e., “lead with the goal and not the problem”) → Suggested Action → Data on the problem
      4. Create in a separate slide a drawing of a narrative arc, and use text boxes to label its main parts (see example in the Google Slides template [TBD]). Then create boxes for the following kinds of components in your data narrative and arrange them along the arc:
        • Title of box: Data (with brief note or link to the part of your dataset needed for the following acts of “telling” and “showing”);
        • Title of box: Telling (with brief note about what you want to say in text form);
        • Title of box: Showing (with brief note or link to what you want to visualize with a graph or by other visual means).
Credits: This exercise is indebted to Cole Nussbaumer Knaflic’s ideas about storyboarding in her Storytelling with Data: Let’s Practice!, and also to Alberto Cairo’s suggestion of a “long sentence.”

Class 14 (Feb. 18, 2021) — Showing & Telling Data Stories: Timelines

Readings

In-Class Activity
  • Teams continue storyboarding their data narrative, now concentrating on the following tasks:
    • Streamlining and structuring the Data, Telling, and Showing components;
    • Detailing key components among the above.
  • Teams begin experimenting with using spreadsheet programs (Excel or Google Sheets) or visualization tools to create data visualizations for the “Showing” components on their storyboard.

Class 15 (Feb. 23, 2021) — Showing & Telling Data Stories: Data Art

Readings

In-Class Activity

Continued teamwork, now pivoting from storyboarding to creating the final data narrative project.

Class 16 (Feb. 25, 2021)

In-Class Activity

Continued teamwork on data narrative project.

Class 17 (Mar. 2, 2021)

In-Class Activity

Continued teamwork on data narrative project.

Class 18 (Mar. 4, 2021)

In-Class Activity

Continued teamwork on data narrative project.

Class 19 (Mar. 9, 2021)

In-Class Activity

Continued teamwork on data narrative project.

Team Project Presentations & Final Solo Work

Class 20 (Mar. 11, 2021) — Team Presentations of Data Narratives

Due on This Date: Team Data Narrative Projects
40% of final grade

The data narrative that is the team project for this course can be relatively short (because of the limited time to work in UCSB’s quarter system). It should tell/show its data in a way that answers a question, makes a recommendation, or in some other way comes to a point (or a concluding, further question), where the telling/showing of that point is compelling because it follows some of the principles of good narrative.

The main goal is to demonstrate in compact form an understanding of the basic paradigm of an effective data story: using good data to make a good story that gets people to care about information.

Content: The data narrative project should include both text (or voice) and data visualizations (and other visual elements as appropriate). It should move through at least 5 logical or narrative stages (where a “stage” is loosely defined to mean an identifiably separate unit of telling/showing).

Staging Location: During design and development, data narratives can be created in a team’s Google shared drive. Because some data narratives may include dynamic, interactive, or other content that requires hosting on a server, it is also possible to stage a project under development in a content-management system such as a WordPress site or elsewhere online. (Mechanics can be discussed with the instructors as needed.)

Final Location: By default, a project’s final location will be online (e.g., on the team’s Google shared drive) with permissions set to publicly viewable. Narratives that require being hosted on a server can be located instead on a student’s own hosting service (e.g., a WordPress site) or elsewhere. (Mechanics can be discussed with the instructors as needed.)

Intellectual Property: Projects must be careful to respect the copyright constraints and conditions of any materials they use and make publicly viewable. In regard to the copyright status of the projects themselves: a team’s data narrative should by default be put online with a declaration that it is published under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. However, teams are free to decide on a different option. For example, they can choose alternative Creative Commons options or declare traditional, restrictive copyright in the name of an individual or individuals. (By default, the “Student Work” page on the course site will include links to team projects, though students may request otherwise.)

Submit this assignment on the course Gauchospace site here in the form of the URL for the project.

(Due by Mar. 15, 2021) — Final Assignment

Due March 15: Solo Assignment 4 — Essay About Project
20% of final grade

Each member of a team individually writes a three-page essay (approximately 900 words) that reflects critically on their team’s data narrative project. “Critically” means that the essay should identify both the strengths and problems of the specific data narrative, and possibly also those of data narratives in general.

The essay can begin with, or include, a description of the student’s team project and its essential message. But it must go beyond that to think critically about what works well and what doesn’t in the data narrative or in data narratives generally.

Conclude the essay with a paragraph offering a utopian vision of what the ideal version of the team data narrative would add if you all had the time and the resources.

Include notes that cite any sources, borrowings, or quotations used in the essay.

This is a solo writing assignment. Of course, teams will have already discussed their data narrative project together. But each team member must write an essay individually without borrowing directly from anyone else’s writing. It is fine, however, to draw on collective team discussion that has already occurred so long as there is a clear footnote or endnote crediting the team (e.g., “This idea comes from our team discussion,” or, “I borrow with variation an idea that came up in our team discussion”).

Submit the essay as a PDF file through the course Gauchospace site here.

Additional Solo Grade for Participation in Team Project and in Class Discussion
10% of final grade

The instructors will assign an additional 10% of the final grade based on their assessment of a student’s participation throughout the course in their team project (as witnessed in visible contributions to the final project or background contributions in a team’s shared drive) as well as in class discussion. Any student who participates equally in the team project and also speaks up during class discussion should be able to earn the full 10% of this grade.

A Note About Access to Reading Materials For This Course

All readings are online. Paywalled articles can be accessed over the UCSB network (or from off-campus by using the campus Pulse VPN service or the campus Library Proxy Server). You can also try to find open-access versions of paywalled materials using the Unpaywall extension for the Chrome or Firefox browsers. (Advice: It is a good idea to download materials as early as possible in case, for example, PDFs that are currently available open-access, on the open net, or through a UCSB Library digital database subscription later become inaccessible.)

Because so many readings are online (an increasingly prevalent trend in college courses), students will need to develop a method or workflow for themselves that optimizes their ability to study the materials. While everyone has their own personal preferences and technical constraints, the following guide includes suggested options for handling online materials:

Guide to Downloading and Managing Online Readings