RESEARCH PLATFORM

Academy for Theatre and Digitality

HAMLET.AI

Completed Fellowship

Summary
HAMLET.AI is a project of two parts. The first investigates the use of generative AI technologies in performance, using Shakespeare’s Hamlet as a starting point. We are developing deepfake video and audio technologies, close-to-real-time image generation based on video inputs, and a tool that generates a script and delivers it via headsets to actors on stage. The second considers the questions and concerns that this technology raises for us aesthetically, ethically and philosophically: Do we even want to use these technologies? And will they shift our idea of what it means to be human? Or not!

Introduction

We are Wolf 359, which is something in between a narrative technology company, a theater company, and a research lab. The company was started by the two of us who are Fellows (Michael Yates Crowley, playwright, and Michael Rau, director) but includes a number of other artists that we enjoy working with, including Asa Wember, who's been visiting us in Dortmund to work on this project as well.

In our previous work, we often explore the way technology can be used to tell stories – and especially the way the technology itself tells the story. Sometimes this leads in unexpected directions (for example, telling a story through Microsoft Excel in our work TEMPING, or using email in BLOCK ASSOCIATION).

The Project: Hamlet.AI

We started this fellowship with a question about whether AI could be used to tell stories, and if so, how. That's a very broad question, so as a way of focusing our work we decided to look at using AI in a production of Hamlet, and specifically for the character of Hamlet's dead father, the ghost.

We’re looking at three broad areas: deepfakes, image and live video generation, and large language models. These are the initial points of contact for us to think about how machine learning can enhance or support the ways in which we make theatrical work – and how the theatrical work that we make could affect or change how machine learning algorithms work. We’re trying to think of this as a two-way street.

Why Hamlet?

Crowley: First, because of the friction when something very old meets something extremely new. In choosing Hamlet, we are dealing with the most traditional, most conservative, most performed play in the English theater tradition – and we wanted to see what happens when you bring it into contact with AI tools that are being developed and released in real time.

Why the ghost?

Crowley: We were interested in the concept of the "ghost in the machine" — that's a phrase that comes from thinking about the human mind, specifically, whether we humans are merely deterministic engines of neurons and muscles doing exactly what physics says we must do, or whether there is a "ghost" somehow making decisions and acting freely within this machine. Now it's more common to hear that phrase in the context of AI, and the feeling some people get when chatting with ChatGPT that they are talking to a person, or at least a "something", which we call a ghost in the machine.

Rau: I was less interested in the ghost of Hamlet’s father, and more interested in the ghost of all of the past productions. The ghost of the literature and scholarship and ideas that over centuries have built up around this piece, and seeing where and how these algorithms could start to intersect with this material.

Opposing views…

Lastly, we started this project with somewhat opposing views on whether we would succeed: Michael Rau, as a professor at Stanford, is much closer to where this technology is being developed, and more optimistic about the potential for its use in theater. Michael Yates Crowley, coming from a background in playwriting, is more skeptical about the aesthetic, ethical, and practical implications of using this technology in live theater. This conflict, and the resulting debates, are a parallel track of our project, where we are trying to be rigorous in thinking not just about what can be done, but about whether it should be done, and what the implications of using this technology are.

1. Deepfakes

Rau: I think the most accurate way to describe how we approached the deepfake research is "cautiously." We spent a significant amount of time exploring the different techniques and libraries that people were using for deepfakes, and the most common use case we found was deepfake pornography, followed closely by disinformation – and both of these creeped us out. Technologically, we were able to successfully clone ourselves and make deepfakes of both of us – but artistically, it seemed like a dead end. Theater is already a deepfake, and at least from my (Rau’s) POV, it seemed like a “hat on a hat.”

In the theatre we go on about synchronicity, and the fact that we’re breathing the same air, blah blah blah – but actually it’s the sense of latency, the speed at which a certain amount of information can be communicated because you’re in the same room. A deepfake, by contrast, is a form of misinformation and deception – and not in the way that we use deception in the theatre, where we all collectively agree to call a guy Hamlet instead of Mark. A deepfake works on a level of distrust. Those two modes are not interesting or useful for making a piece of theatre.



2. Live Video Generation and Pose Detection

How does it work?

Our image generation system takes in live video and runs it through three separate AI systems that work together. Those three systems are Text to Image, Image to Image Translation, and Image Diffusion.

Text to Image is a machine learning algorithm trained on hundreds of thousands of pairs of text and image – for example, a piece of text that says ‘dog’ paired with a picture of a dog. This means that when you type in ‘dog’, it can immediately start to generate a picture of a dog.
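
As a rough sketch of what this step looks like in code (assuming the open-source diffusers library in Python and a Stable Diffusion checkpoint – illustrative choices, not necessarily the exact stack used in the project):

# Minimal text-to-image sketch. The diffusers library and the model name
# are illustrative assumptions, not necessarily the project's actual tools.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Typing in 'dog' starts generating a picture of a dog.
image = pipe("a picture of a dog").images[0]
image.save("dog.png")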

Image to Image Translation takes one image as an input and transforms it into another. We take a single frame from the live video and put it into this system, where it is combined with the Text to Image system. So you have the image of the actor on stage – that’s the live video – and a prompt that says ‘a picture of Queen Gertrude’, and the system mixes the two. It transforms that image of an actor into Queen Gertrude.
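
A similar sketch of the image-to-image step, assuming a webcam frame grabbed with OpenCV and the diffusers img2img pipeline (again, illustrative choices):

# Image-to-image sketch: one live video frame + a text prompt -> a translated image.
# OpenCV, diffusers, and the model name are illustrative assumptions.
import cv2
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

cap = cv2.VideoCapture(0)            # the camera pointed at the stage
ok, frame = cap.read()               # grab a single frame of the actor
cap.release()

actor = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)).resize((512, 512))

# Mix the live frame with the text prompt.
gertrude = pipe(
    prompt="a picture of Queen Gertrude",
    image=actor,
    strength=0.6,
    num_inference_steps=25,
).images[0]
gertrude.save("gertrude.png")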

The last is an Image Diffusion System, an AI system that generates images by gradually reducing the noise in them. It starts with a random image that looks like static and refines it by iteration to produce a clearer and more detailed picture. Underneath the Image to Image and Text to Image systems, a diffusion model is running at the same time. If the system can’t quite recognise something from the video input, the diffusion system kicks in and generates a kind of dream – it just tries to focus, through a series of quick, successive iterations.
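
Continuing the sketch above (reusing pipe and actor), the underlying diffusion process shows up in two parameters of the assumed diffusers API: strength, which sets how much noise is layered over the input frame before denoising begins, and num_inference_steps, which sets how many denoising iterations run.

# High strength: the frame is mostly replaced by noise first, so the result
# is largely a 'dream' of the prompt. Low strength: the result stays close
# to the live frame. (Values are illustrative.)
dreamier = pipe(prompt="a picture of Queen Gertrude", image=actor,
                strength=0.9, num_inference_steps=25).images[0]
faithful = pipe(prompt="a picture of Queen Gertrude", image=actor,
                strength=0.3, num_inference_steps=25).images[0]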


What does it mean?

Dramaturgically, we’re doing something that I think is more interesting than a deepfake: we’re taking a video stream (ideally what the audience is seeing right in front of them – the ground-truth reality of the scene) and using these algorithms to translate the images of that reality into something fantastic. It’s a parallel to the way the audience imagines the actors and wraps them in a fantasy – or “suspends their disbelief”.

In terms of the greenscreen, we’re trying to isolate just the performer’s body and play around with static foreground/background elements. However, we’ve discovered that these image-to-image translation systems will find ANYTHING – a shadow, or even a single pixel – and hallucinate images from it. Those hallucinations, as opposed to the direct image-to-image translations, are almost more interesting to us. We’re wondering: is this the computer being creative?

These three systems work very quickly – we call it ‘close to real time’ – to transform the images coming from the video feed into otherworldly, fantastic dreams of what the computer sees in the video stream. While it is technically possible for us to generate images faster (with more optimized algorithms and more computing power we could get closer to 24 frames per second), we found that these translated images are so strange that you need a little bit of time for your brain to process them – seeing both the “reality” and the “AI translation” takes a second longer. Also, each time the algorithm runs it makes different decisions, so there is no continuity between images, which means it becomes distracting and strange if the images move faster.
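
As a sketch of what ‘close to real time’ implies in code – again reusing the assumed OpenCV + diffusers setup from above – the system is essentially a loop: grab a frame, translate it with a low step count, show it, repeat. Fewer inference steps means faster but stranger images; more steps means cleaner but slower ones.

# Close-to-real-time loop sketch (illustrative prompt and parameter values,
# reusing `pipe` from the earlier img2img sketch).
import cv2
import numpy as np
from PIL import Image

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    actor = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)).resize((512, 512))
    dream = pipe(prompt="the ghost of Hamlet's father", image=actor,
                 strength=0.6, num_inference_steps=10).images[0]
    cv2.imshow("translation", cv2.cvtColor(np.array(dream), cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) & 0xFF == ord("q"):   # press q to stop
        break
cap.release()
cv2.destroyAllWindows()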

3. Using Large Language Models to generate Hamlet scripts for “live delivery”

This system ‘delivers’ AI-generated scripts via headsets to actors as they stand on stage, for them to speak out loud.

And then, once we’ve generated a script and run it through a text-to-speech system, we’re left with individual files of a “computer voice” reading the script.
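
A minimal sketch of that pipeline as described above, assuming an OpenAI-style chat API for the script and the pyttsx3 library for the “computer voice” – both illustrative choices, not necessarily the project’s actual tools:

# Generate a short Hamlet-style scene with an LLM, then render each line as an
# audio file that could be cued into an actor's headset. The model name and the
# TTS engine are assumptions for illustration.
from openai import OpenAI
import pyttsx3

client = OpenAI()  # expects OPENAI_API_KEY in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{
        "role": "user",
        "content": "Write a short new scene between Hamlet and the Ghost. "
                   "Format every line as 'CHARACTER: spoken text'.",
    }],
)
script = response.choices[0].message.content

engine = pyttsx3.init()
for i, line in enumerate(l for l in script.splitlines() if ":" in l):
    character, text = line.split(":", 1)
    # One 'computer voice' file per line, ready to be played into a headset.
    engine.save_to_file(text.strip(), f"{i:03d}_{character.strip()}.wav")
    engine.runAndWait()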


What does it mean?

Rau: For centuries, our assumption about the performance practice of making a piece of theatre has been to say: there is a playwright who writes a script, and that is a set text. Actors memorise and rehearse it, setting and solidifying the choices that the playwright made. Those choices and that play are then locked and presented to an audience, possibly again and again. The goal of that process is consistency – finding new ways to do the same thing again and again in a way that feels new.

What an LLM offers is flexibility: a way in which the theatrical form could change, in rehearsal or in performance. That’s a different way of thinking about and rehearsing a play than we’re used to. Certainly things like improv comedy exist, and that’s closer to what this form is, but improv works on a series of oftentimes simplistic tricks, and I’m interested in thinking about how this can be responsive to a moment, to a night, to an audience’s dreams or desires, or an actor’s dreams or desires. That does end up shifting the hierarchies, moving into something that feels more like a network or a web. I don’t think all theatre needs to be made this way – I’m not calling for the Death of the Author – I’m just interested in new ways to make theatre. And this ability to change things on the fly, and to do new things with text without requiring memorisation and rehearsal time – that’s what I think is interesting.

The future of theatre and AI

Rau: For me, the term technology means a way of doing things. You can have a technology of writing implements, or of prayer. So really what we’re talking about is: what are the ways we’ll want to do theatre in the future? Every time we think about adding technology to a performance, I ask myself: what is the thing that must be kept about this artform? What is essential, and what must be discarded? I think we’re going to do that with AI – we’re going to ask these questions, take what’s useful for us, and then we’re going to leave a whole bunch of it behind, too. We should celebrate theatre’s ability to say, more than any other artform, that this must be kept and this must be left behind.

Crowley: I think one of the reasons we make theater is that we’re actually interested in the strange messiness of human beings, and we don’t make shows that incorporate the latest flashy technology just to “show off” – we’re interested in using technology to illustrate a specific feeling, idea or emotion (as in our previous work: the strange alienation of being a temp, or the subtle awkwardness of communal listserv emails). So in investigating this technology, we’ve been on the hunt for the human side of it. And for us, right now – especially since we seem to be on the cusp of incorporating these algorithms far more into our daily lives – the most interesting human-focused story is that particular tension between one character who wants to rush headlong into the future, and one who wants to hold back.



