Practicum Instructions for English 238 (Fall 2021)
Digital Humanities: Introduction to the Field
Practicum Assignment Instructions
Course “practicums” are hands-on, small-scale exercises that ask students to experiment at a beginner’s level or conceptually with the methods and tools of the digital humanities. The goal is not technical mastery but learning enough about the technologies to think about, and through, their concepts and also to discover which methods and tools might be used in a student’s future research. In many cases, experience gained in the practicums will contribute directly to the discussion of issues in class.
Outputs from practicum assignments (such as a file, screenshot, etc.) should be uploaded to a student’s folder in the course’s Sandbox for Student Work (a private Google Drive folder for course members). The assignments are due by the time of class on various days so that they may be shown as a way to start discussion of the issues involved.
Conceptual Dataset Exercise
Using a spreadsheet program (Excel or Google Sheets), prepare two spreadsheets of data according to the instructions below. Upload the sheets to your folder in the Sandbox for Student Work.
- Conceptually easy data spreadsheet: Using any books, music tracks, videos, films, or similar items with familiar data values (e.g., author, genre, date, etc.) that are easily available to you, make a very small spreadsheet about those items (e.g., covering just 5 to 10 items). Each row in your spreadsheet will be the data record of one item. Columns (with labels you create at the top) will be for the kinds of data values you are recording about your items (e.g., author name, genre, length, publisher, date, gender of author, etc.). Make a decision about the purpose of the spreadsheet—i.e., what kind of pattern or meaning you might want it to allow you to discover. On the basis of that purpose, choose what kinds of data values you want your columns to record (create 4 to 10 columns with labels for such values). For example, if your items are films, do you want to record the gender of the director, or language of the film, and why?
Finally, write into an empty cell in your spreadsheet a brief explanation of the purpose of your spreadsheet, and include any thoughts you have about your choice of data values or examples.
- Conceptually difficult data spreadsheet: Follow the same instructions as above to create a data spreadsheet for a set of items that do not have obvious, standard, or familiar data values (though they may have values assigned by scholarly specialists). For example, consider traditional American quilting patterns or traditional African masks, which do not have the typical kind of data values that libraries or playlists use (“author,” “publisher,” etc.). What are the important data values you can think of to record about these items, and why?
The “why” is the purpose of the spreadsheet, which in an empty cell you should explain, including any thoughts you have on your choice of data values or examples, epistemological or ethical issues you encountered, etc.
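As a rough illustration of the row-and-column logic described above, the sketch below models a tiny dataset in Python. The titles and values are hypothetical placeholders rather than real items, and the column names are only one possible choice. The tally at the end stands in for the kind of pattern a spreadsheet's stated purpose might call for.

```python
from collections import Counter

# Column labels chosen for a stated purpose (hypothetical): comparing
# the language of films across directors of different genders.
COLUMNS = ["title", "director_gender", "language", "year"]

# Each dict is one row, i.e. the data record of one item.
# All values are made-up placeholders.
records = [
    {"title": "Film A", "director_gender": "woman", "language": "French", "year": 1995},
    {"title": "Film B", "director_gender": "man", "language": "English", "year": 2003},
    {"title": "Film C", "director_gender": "woman", "language": "English", "year": 2011},
]

def tally(rows, column):
    """Count how often each value appears in one column."""
    return Counter(row[column] for row in rows)

print(tally(records, "language"))
```

The same tally is much harder to define for the conceptually difficult dataset, where deciding what counts as a column is itself the interpretive work.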
Text Encoding Exercise
(A) Simple HTML Exercise (with an added conceptually challenging problem)
Step 1 (hands-on exercise): Go to the W3Schools HTML Tutorial page and look quickly at the explanations in the first 7 lessons linked in the left sidebar (from “HTML Introduction” to “HTML Paragraphs”). Also look selectively at the explanations for other lessons of interest to you. Be sure to look as well at the explanation of “HTML CSS.” (Students already familiar with HTML and CSS can skip these explanations.)
Then on the home page of the W3Schools HTML Tutorial, click on the “Try it Yourself” button to open a two-column interactive page where you can write HTML on the left, hit the “Run” button, and see the results rendered on the right. On this page, write HTML to create a simple web page with any content, images, and links you wish (subject, of course, to good taste and copyright laws). The page should include at least the following features:
- Text formatted in basic ways (as headers, bold, italics, etc.)
- Text in paragraph structures
- Text in lists
- Links
- A table
- An image
Also experiment with simple CSS (Cascading Style Sheets) to adjust the format/style of various elements on your test page. Students already familiar with HTML and CSS can conduct more advanced exercises if they wish.
When you are done, save the HTML file in your folder in the Sandbox for Student Work as follows:
- Right-click in the folder to open the menu for new kinds of documents you can create in Google Drive.
- Select “More” to show options, including “Text Editor”.
- Select “Text Editor” and then “Create New Text File”
- Paste in the HTML code you created on the W3Schools tutorial page and save it with a filename ending in “.html”. (Downloading this file from Google Drive to a local computer and clicking on it will open it and render it in a browser if you have your browser set to automatically open local .html files.)
Step 2 (conceptual exercise): Take a look at E. E. Cummings’ well-known poem “r-p-o-p-h-e-s-s-a-g-r” (1935). (If you are interested, you can read an analysis of the poem, also known as “Grasshopper,” here.) Think about what might be the best approach to encoding this in HTML. (Note that multiple spaces entered in HTML code normally render as a single space.)
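To make the whitespace problem concrete, here is a minimal Python sketch of two possible ways to keep irregular spacing when generating HTML: wrapping the text in a pre element, or turning each space into a non-breaking space entity. The function names and the sample string are hypothetical (the sample is not the text of the Cummings poem), and neither approach is offered as the “right” answer to the exercise.

```python
import html

def poem_to_pre(poem_text):
    """Wrap text in <pre>, which browsers render with spaces and line breaks kept."""
    return "<pre>" + html.escape(poem_text) + "</pre>"

def poem_to_nbsp(poem_text):
    """Replace every space with &nbsp; and every line break with <br>."""
    escaped = html.escape(poem_text)  # escape &, <, > before adding entities
    return escaped.replace(" ", "&nbsp;").replace("\n", "<br>\n")

# Hypothetical sample with irregular spacing.
sample = "a   b\n      c"
print(poem_to_pre(sample))
print(poem_to_nbsp(sample))
```

A third option, not shown, is to leave the markup alone and set the CSS white-space property on the enclosing element. Each choice encodes a different claim about whether the spacing belongs to the poem's content or to its presentation.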
(B) Imaginary TEI/XML Exercise
In this exercise, you will automatically generate a basic TEI XML encoding of Lincoln’s Gettysburg Address (Bliss copy, 1864) and then make up additional tags to encode something you think is important or of interest to you in the text.
- First download and save this text file to your computer: lincoln-gettysburg-address-bliss-copy-1864.txt. (Save as a file with a “.txt” extension.)
- Go to the Text Encoding Initiative’s OxGarage Conversion site, and follow this process:
- For “Convert from: ?”, click on “Documents”. Among the options for document formats, choose “Plain Text (.txt).”
- Then after a wait, you will see the prompt “Convert to: ?” appear to the right. From the options, choose “TEI P5 XML Document”.
- When the next set of prompts appears at the top of the screen, click on “Choose File” and select the Lincoln text file where you stored it on your computer.
- Then click on “Convert”, which will generate a TEI/XML encoded version of the Lincoln text and download it to your computer with the file extension .xml.
- Open this .xml file in a text editor (such as Notepad, Notepad++, or TextEdit) so that you are prepared to copy and paste from it.
- Next go to the XML Viewer page on the Code Beautify site.
- Copy and paste the content of the Lincoln .xml file you previously created into the “XML Input” column on the left.
- Then click the button in the center of the page for “Beautify Format” to generate a nicely formatted version of the XML in the column at the right.
- Then click on “Download” in the center of the page to download the beautified version of the Lincoln .xml file. (Or just copy-and-paste from the column on the right into a text editor like Notepad, Notepad++, or TextEdit, and save the file as a .xml file.)
- Now think about what you would like to encode in the beautified Lincoln .xml file that seems important or interesting, and how to do it. Since this is an imaginary exercise, you do not need to study the TEI guidelines for actual tags designating elements, attributes, etc. (If you are interested, though, you can get a sense of what is possible by browsing the TEI by Example Tutorials, such as the one for encoding prose.) Just make up your own tags. For example, you could make up a tag to use in the first sentence of Lincoln’s speech such as this one (a tag for the element “metaphor” that has attributes for “type” and whatever else you wish): “Four score and seven years ago <metaphor type="genealogical" subtype="patriarchal" motive="call_for_authority">our fathers</metaphor>….” Encode part or all of the text with one or two kinds of tags of this sort.
- When you are done, save your revised .xml file and upload it to your folder in the Sandbox for Student Work.
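If you want to check that your made-up tags are at least well-formed XML, a short Python sketch along the following lines may help. It uses the standard library's xml.etree.ElementTree on a small fragment built from the example tag above; the wrapper element p and the helper names are assumptions made for illustration, not part of the assignment.

```python
import xml.etree.ElementTree as ET

# Fragment based on the example tag; the <p> wrapper is an assumption.
fragment = (
    "<p>Four score and seven years ago "
    '<metaphor type="genealogical" subtype="patriarchal" '
    'motive="call_for_authority">our fathers</metaphor> …</p>'
)

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def list_tags(xml_text, tag_name):
    """Return (text, attributes) for every element with the given name."""
    root = ET.fromstring(xml_text)
    return [(el.text, dict(el.attrib)) for el in root.iter(tag_name)]

print(is_well_formed(fragment))
print(list_tags(fragment, "metaphor"))
```

Two cautions apply. Attribute values must be wrapped in straight quotes; curly “smart” quotes pasted from a word processor make the file fail to parse. Also, a full TEI file usually declares a default namespace on its root element, so in a real file the element names that ElementTree reports carry that namespace in braces, and a lookup by the bare name "metaphor" would need adjusting.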
Text Analysis Exercise
What you will need for this exercise:
- AntConc program (download)
- A stopwords list (download the following stopwords list and save it as a plain-text “.txt” file on your computer: buckley-salton.txt)
- A long document or set of documents (up to several hundred documents if you wish) that you have access to as plain text files (stored as “.txt” files). [See suggestions for sources of texts below]
Instructions for this exercise:
- Download the AntConc text-analysis program onto your computer (available for Mac, Windows, Linux). AntConc comes as a simple executable file that does not need to be “installed.” You just run the file.
- Note for Mac Users: When you try to open AntConc, there will be a security message that says the app was prevented from opening. In order to get around this, you need to click the Apple icon, go to System Preferences, then Security & Privacy, and (if you haven’t already) change your preferences to Allow apps downloaded from “App Store and identified developers.” Even if these are your settings, the system will prevent AntConc from opening (it’s not an identified developer), but there will be a caption next to the radio buttons that says something along the lines of “AntConc was prevented from opening” and a button labeled “Open Anyway.” Click Open Anyway, and you shouldn’t have any issues running the application after that.
Note: If you have trouble getting AntConc to work (e.g., on a Mac) and cannot resolve the issues, then you may want to try a different well-known program that runs wholly through a browser online: Voyant Tools. On the Voyant Tools site, you can upload your own files for analysis. Or you can experiment with two corpora of demo text files (by Shakespeare or Austen). (Click “Open” on the home screen to access the demo files).
Voyant Tools includes a whole suite of tools presented in a “dashboard” style, each of whose panes can be set for a different tool. Tools relevant to this practicum (analogous to those in AntConc) include “Corpus Terms,” “Corpus Collocates,” “Phrases,” etc. See the Voyant Tools Help page for guides to each tool.
[Screenshot: the configuration bar in each of Voyant Tools’s panes, with the cursor on the icon that opens a menu of tools.]
- Find or create a plain-text (.txt) version of a long literary work or collection of works. Possible sources:
- Alan keeps a set of demo text corpora on his DH Toychest site here.
- Study your chosen work(s) with AntConc. Then take “souvenirs” of your explorations with AntConc (ideally something interesting) in the form of screenshots of AntConc, or text files created by using AntConc’s “Save Output” function. Save your souvenirs in your folder in the Sandbox for Student Work.
- Tutorials for AntConc:
- Video and other tutorials on the AntConc home page (scroll down to the tutorials)
- Heather Froehlich, “Getting Started with AntConc” (tutorial)
- The Grammar Lab, “Antconc Walk-Through” (tutorial)
- Basic instructions for using AntConc:
- Instructions for using a “stopwords” list in AntConc (to filter out common words like “the,” “of,” etc.):
- Download the following stopwords list and save it as a plain-text “txt” file on your computer: buckley-salton.txt
- In AntConc, click on “Tool Preferences” among the tabs at the top. Then follow these steps:
- Instructions for loading one or more “reference” comparison files for use with AntConc’s “Keywords” function (to see which words are most distinctive in a text, or have the most “keyness,” compared with the reference files):
- Have a reference comparison file, or multiple files, available on your computer (they must be plain-text “.txt” files).
- In AntConc, click on “Tool Preferences” among the tabs at the top. Then follow these steps:
- Now you can start exploring the document file(s) you chose for study using AntConc. The best way to start is to count all the words in the document(s):
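For readers curious about what a word count with a stopwords filter does conceptually, here is a minimal Python sketch. It is not AntConc's own implementation: the tokenizing rule, the function names, and the assumption that a stopwords file holds one word per line are all choices made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out runs of letters (keeping apostrophes inside words)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def load_stopwords(path):
    """Read a stopwords file, assuming one word per line (an assumption about the format)."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}

def word_frequencies(text, stopwords=frozenset()):
    """Count every token that is not in the stopwords set."""
    return Counter(token for token in tokenize(text) if token not in stopwords)

# Tiny made-up sample; a real run would read one of your .txt files instead.
sample = "The cat and the hat. The cat sat."
freqs = word_frequencies(sample, stopwords={"the", "and"})
print(freqs.most_common(3))
```

Even this small sketch involves interpretive decisions (what counts as a word, whether case matters, which words are “common” enough to discard) that shape the resulting counts.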
Topic Modeling Exercise
- Read through (and, if you wish, try) the lesson plan in Shawn Graham, Ian Milligan, Scott Weingart, “Topic Modeling By Hand” (from The Historian’s Macroscope).
- Experiment with the Topic Modeling Tool, which provides a simplified, GUI front-end for the underlying MALLET topic modeling tool. (See the Topic Modeling Tool’s page linked above for download and operation instructions. Under “Optional Settings” in the program you can load a stopword list. For example, download the following stopwords list and save it as a plain-text “.txt” file on your computer: buckley-salton.txt. Or you can find stopword lists in many languages here and here.)
[Optional, more ambitious exercise: download, install, and experiment with the actual MALLET topic-modeling tool, which runs from the command line. (See The Programming Historian Tutorial “Building a Topic Model with MALLET” for instructions on installing and running MALLET).]
An ideal experiment is to topic model a relatively small collection of texts (e.g., 10 to 100 documents you have extracted as plain text and put in a folder) or a “chunked” plain-text version of a long text (e.g., a novel with separate files for each chapter). If you wish, you can use any of the ready-to-go text collections in the “Demo Corpora” section of Alan’s DH Toychest.
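One possible way to produce a “chunked” version of a long text is sketched below in Python. Note that it splits by a fixed number of words rather than by chapter, which is a simpler stand-in for the chapter-by-chapter division mentioned above. The chunk size, file prefix, and function names are hypothetical.

```python
from pathlib import Path

def chunk_words(text, chunk_size=1000):
    """Split text into consecutive chunks of at most chunk_size words."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]

def write_chunks(chunks, out_dir, prefix="chunk"):
    """Write each chunk to its own numbered .txt file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, chunk in enumerate(chunks, start=1):
        path = out / f"{prefix}_{number:03d}.txt"
        path.write_text(chunk, encoding="utf-8")
        paths.append(path)
    return paths

print(chunk_words("one two three four five", chunk_size=2))
```

The resulting folder of numbered files can then serve as the input directory for a topic modeling run. Chunk size is itself an analytical choice, since it changes what the model treats as a “document.”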
Note: If you have trouble getting the Topic Modeling Tool to work, then you may want to try the similar “Topics” tool implemented online in Voyant Tools. On the Voyant Tools site, you can upload your own files for analysis. Or you can experiment with two corpora of demo text files (by Shakespeare or Austen). (Click “Open” on the home screen to access the demo files).
Voyant Tools includes a whole suite of tools presented in a “dashboard” style, each of whose panes can be set for a different tool. The tool relevant to this practicum is “Topics.” See the Voyant Tools Help page for a guide to the tool.
[Screenshot: the configuration bar in each of Voyant Tools’s panes, with the cursor on the icon that opens a menu of tools.]
- Leave at least one souvenir of your experimentation in the form of a screenshot or spreadsheet (or other view) of topics in your folder in the Sandbox for Student Work.
Social Network Analysis Exercise
Install Gephi on your computer. Then follow the instructions below, first to take a tutorial and then to analyze a small social network of your choice:
Note: Gephi requires that Java be installed on your computer (see Gephi Requirements — “The current stable version of Gephi will only run with Java 7 or 8. On Mac OS X, Java is bundled with the application so it doesn’t have to be installed separately. On Windows and Linux, the system must be equipped with Java.” [Java download]).
- Debugging the “Cannot locate java installation in specified jdkhome” Gephi error on Windows machines: If Gephi reports “Cannot locate java installation in specified jdkhome”, then do the following:
- Go into your program folder and find the folder location of the latest version of Java on your computer. For example, C:\Program Files (x86)\Java\jre1.8.0_311
- Then open the following Gephi configuration file in a text editor: “C:\Program Files\Gephi-0.9.2\etc\gephi.conf” and change the following line so that it points to your current version of Java: jdkhome="C:\Program Files (x86)\Java\jre1.8.0_311"
Alternative if you cannot get Gephi to run: As an alternative to using Gephi, use John Ladd and Zoe LeBlanc’s Network Navigator, an online-only tool for visualizing networks from data (in the form of “edge” files representing node-to-node connections). You may in any case be interested in putting your data into Network Navigator in addition to visualizing it in Gephi (or as a preliminary to experimenting with Gephi).
- Work through the following tutorial: Adapted version of Martin Grandjean’s Gephi Tutorial of 2013 (adapted by A. Liu for Gephi 0.9.1).
Cheatsheets & other tutorials for Gephi:
- Gephi Cheatsheet (by Clement Levallois)
- Gephi Basics
- Other Gephi tutorials (see in DH Toychest)
- (You may also be interested in an article explaining the frequently used “ForceAtlas2” layout option for Gephi visualizations. The article is technical, but gives a sense of what would be involved in unlocking the “black box” of concepts behind such algorithms: Mathieu Jacomy et al., “ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software” [2014].)
- Try to understand the logic/format of the two .csv files used in Grandjean’s Gephi tutorial (one that identifies the “nodes” and the other the “edges,” or relations between nodes). Then choose a very limited work or works that would be of interest to humanities scholars (e.g., a chapter in a novel, a scene in a play or film, an hour of a Twitter timeline from a conference) and create your own nodes and edges .csv files (which can be created in a plain-text editor or exported from a spreadsheet or even a word processor). Use your .csv files in Gephi to create a visualization. (If you wish, you can create just a hypothetical set of nodes and edges “as if” you were analyzing something even though you don’t have time to do that for real at present.) (For other datasets of nodes and edges .csv files designed for use with Gephi, see Melanie Walsh’s “Sample Social Network Datasets For Teaching With Gephi (and Other Tools Like It).”)
- You may also be interested in downloading, unzipping, and opening or importing in Gephi some of the other Gephi datasets available from Wiki.Gephi.org in a variety of formats (.gexf and .gml).
- Leave at least one souvenir of your experimentation in the form of a screenshot or spreadsheet (or other view) of your network in your folder in the Sandbox for Student Work.
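As a hedged sketch of the nodes-and-edges logic described above, the following Python builds the text of two small .csv files from a list of interactions. The names and interactions are hypothetical, and the helper functions are made up for illustration. The column headers (Id and Label for nodes; Source, Target, Type, and Weight for edges) follow the convention Gephi's spreadsheet import recognizes, but check them against the tutorial's own files before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical record of who speaks to whom in one scene.
interactions = [("Ada", "Ben"), ("Ben", "Ada"), ("Ada", "Cy")]

def rows_to_csv(rows):
    """Render a list of rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()

def nodes_csv(pairs):
    """One row per distinct name, used as both Id and Label."""
    names = sorted({name for pair in pairs for name in pair})
    return rows_to_csv([["Id", "Label"]] + [[name, name] for name in names])

def edges_csv(pairs):
    """One row per unordered pair; Weight counts how often the pair occurs."""
    weights = Counter(tuple(sorted(pair)) for pair in pairs)
    rows = [["Source", "Target", "Type", "Weight"]]
    rows += [[s, t, "Undirected", w] for (s, t), w in sorted(weights.items())]
    return rows_to_csv(rows)

print(nodes_csv(interactions))
print(edges_csv(interactions))
```

Saving the two printed texts as nodes.csv and edges.csv would give files of the same general shape as those in the tutorial. Treating the edges as undirected and weighting them by frequency are modeling decisions; a directed or unweighted network of the same scene would tell a different story.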