Class Business
- Readings for this Week
- For Tuesday
- Chris Nicholson, “A Beginner’s Guide to Neural Networks and Deep Learning” (n. d.)
- Demi Ajayi, “How BERT and GPT Models Change the Game for NLP” (2020)
- Minh Hua and Rita Raley, “Playing With Unicorns: AI Dungeon and Citizen NLP” (2020) [If you can’t finish this article by this class, try to read at least the first two sections; and then finish the rest of the article for the next class.]
- For Thursday
- Emily M. Bender et al., “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” (2021)
- [Optional: If you are interested in the controversy and background behind this article, see Tom Simonite, “What Really Happened When Google Ousted Timnit Gebru” (2021)]
- Due Thursday, Nov. 10th: Large Language Models & Text-to-Image Large Models Exercise
Framing the Discussion: “AI” and Neural Networks
Nicholas Thompson & Geoffrey Hinton (“An AI Pioneer Explains the Evolution of Neural Networks,” 2019)
Nicholas Thompson: Explain what neural networks are. Explain the original insight.
Geoffrey Hinton: You have relatively simple processing elements that are very loosely models of neurons. They have connections coming in, each connection has a weight on it, and that weight can be changed through learning. And what a neuron does is take the activities on the connections times the weights, adds them all up, and then decides whether to send an output. If it gets a big enough sum, it sends an output. If the sum is negative, it doesn’t send anything. That’s about it. And all you have to do is just wire up a gazillion of those with a gazillion squared weights, and just figure out how to change the weights, and it’ll do anything. It’s just a question of how you change the weights.
NT: When did you come to understand that this was an approximate representation of how the brain works?
GH: Oh, it was always designed as that. It was designed to be like how the brain works.
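A minimal sketch in Python of the neuron Hinton describes. The function name, weights, and threshold rule here are illustrative assumptions of mine, not anything from the interview:

```python
# A single artificial neuron, per Hinton's description: multiply each
# incoming activity by its connection weight, add the products up, and
# send an output only if the sum is big enough.

def neuron_output(activities, weights, threshold=0.0):
    """Weighted sum of inputs; fires (returns the sum) only above threshold."""
    total = sum(a * w for a, w in zip(activities, weights))
    return total if total > threshold else 0.0

# Three incoming connections; learning would adjust these weights.
activities = [0.5, 1.0, 0.25]
weights = [0.8, -0.2, 0.4]
print(neuron_output(activities, weights))  # 0.3 -- positive sum, so it fires
```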
Emily M. Bender et al., “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” (2021)
Contrary to how it may seem when we observe its output, an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot. (¶ 6.1)
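Bender et al.’s “stochastic parrot” can be made concrete with a toy sketch. The bigram sampler below is my own illustration, far simpler than any real LM (which is neural rather than count-based): it stitches together word sequences using only observed statistics about which words follow which, with no reference to meaning.

```python
# Toy "stochastic parrot": generate text by sampling, at each step, a
# word that followed the current word somewhere in the training data.
# Probabilistic stitching of observed forms, nothing more.
import random
from collections import defaultdict

training_text = "the parrot repeats the words the parrot has observed".split()

follows = defaultdict(list)  # word -> all words observed to follow it
for prev, nxt in zip(training_text, training_text[1:]):
    follows[prev].append(nxt)

def parrot(start, length=6):
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break  # no observed continuation
        out.append(random.choice(options))  # frequency-weighted choice
    return " ".join(out)

print(parrot("the"))  # e.g. "the parrot repeats the words the parrot"
```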
Shallow Neural Networks (& Word Embedding)
- Chris Nicholson, “A Beginner’s Guide to Neural Networks and Deep Learning” (n. d.): neural network elements
- Sanket Doshi, “Skip-Gram: NLP Context Words Prediction Algorithm” (2019) [a minimal skip-gram sketch follows this list]
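To illustrate the skip-gram idea behind word embeddings, here is a minimal sketch of how (target, context) training pairs are extracted from text. This is my own simplification: it shows only the pair extraction, not the embedding training in which targets learn to predict their contexts.

```python
# Skip-gram pair extraction: every word within a fixed window around
# a target word becomes a context word the model must learn to predict.

def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sentence = "neural networks learn word embeddings".split()
for target, context in skipgram_pairs(sentence, window=1):
    print(target, "->", context)
# neural -> networks, networks -> neural, networks -> learn, ...
```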
Deep Learning Neural Networks
- Samuel K. Moore, David Schneider, and Eliza Strickland, “How Deep Learning Works” (2021)
- Chris Nicholson, “A Beginner’s Guide to Neural Networks and Deep Learning” (n. d.): feature hierarchy
- Adam W. Harley, “An Interactive Node-Link Visualization of Convolutional Neural Networks” (2015)
- Phillip Schmitt, “I Am Sitting In A High-Dimensional Room” (2020)
Large Language Models (LLMs)
- Alan D. Thompson, “What’s in My AI?” (2022)
- Chuan Li, “OpenAI’s GPT-3 Language Model: A Technical Overview” (2020)
- Serdar Cellat, “Fine-Tuning Transformer-Based Language Models” (2021)
- Brandon Duderstadt, Andriy Mulyar, and Ben Schmidt, “Mapping Wikipedia with BERT and UMAP” (2022), https://home.nomic.ai/visxwiki
Text-to-Image Large Models
- Jay Alammar, “The Illustrated Stable Diffusion” (2022)
- Nomic explorable map of KREA AI’s Stable Diffusion Search Engine
Thinking With / Thinking About Large Language Models
Minh Hua and Rita Raley, “Playing With Unicorns: AI Dungeon and Citizen NLP” (2020)
If a complete mode of understanding is as-yet unachievable, then evaluation is the next best thing, insofar as we take evaluation, i.e. scoring the model’s performance, to be a suitable proxy for gauging and knowing its capabilities. (link)
In this endeavor, the General Language Understanding Evaluation benchmark (GLUE), a widely-adopted collection of nine datasets designed to assess a language model’s skills on elementary language operations, remains the standard for the evaluation of GPT-2 and similar transfer learning models…. Especially striking, and central to our analysis, are two points: a model’s performance on GLUE is binary (it either succeeds in the task or it does not)…. But if the training corpus is not univocal — if there is no single voice or style, which is to say no single benchmark — because of its massive size, it is as yet unclear how best to score the model. (link)
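What this kind of “binary” per-example scoring looks like in practice can be sketched in a few lines. The labels and predictions below are invented for illustration; GLUE aggregates this sort of right-or-wrong scoring across its nine datasets.

```python
# Benchmark-style evaluation: each example is scored binarily (the
# prediction matches the gold label or it does not), and the task
# score is just the aggregate. It says nothing qualitative about how
# the model succeeded or failed. All data here is invented.

gold        = ["entailment", "contradiction", "neutral", "entailment"]
predictions = ["entailment", "neutral",       "neutral", "entailment"]

correct = sum(p == g for p, g in zip(predictions, gold))
print(f"accuracy: {correct / len(gold):.2f}")  # 0.75
```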
Our research questions, then, are these: by what means, with what critical toolbox or with which metrics, can AID [AI Dungeon], as a paradigmatic computational artifact, be qualitatively assessed, and which communities of evaluators ought to be involved in the process? (link)
AID, as an experiment with GPT-2, provides a model for how humanists might more meaningfully and synergistically contribute to the project of qualitative assessment going forward…. (link)
Our presupposition … is that it is not by itself sufficient to bring to bear on the textual output of a machine learning system the apparatus of critical judgment as it has been honed over centuries in relation to language art as a putatively human practice. What is striking even now is the extent to which humanistic evaluation in the domain of language generation is situated as a Turing decision: bot or not. We do not however need tales of unicorns to remind us that passable text is itself no longer a unicorn. And, as we will suggest, the current evaluative paradigm of benchmarking generated text samples — comparing output to the target data to assess its likeness — falls short when the source for generated samples is neither stable nor fully knowable. (link)