A Generative Graphic Novel Prototype
Here's How We Turned An Old Nautical Tale Into a Multimedia Experience Using AI Tools
In mid-December, we published our first generative graphic novel. Here’s a breakdown of the process, what we learned and how we will apply these learnings to the future.
You can read the entire 130-page work here.
Don’t forget to sign up for a Library Key here.
Text Selection
The text we chose for our first transformation was The Rime of the Ancient Mariner by Samuel Taylor Coleridge. It's got a dedicated following of English lit nerds, but its at-times obscure verbiage hasn't exactly made it a pop culture phenomenon. More on how we turned that into an advantage later. But if you can get past the language, it's just a ripper of a tale of a sea voyage gone awry.
We chose Rime for a few reasons:
It’s a personal favorite of mine. Fun fact: years ago, I spent part of my winnings from a successful night at the tables in Vegas on a dinner at Bouchon and a tattoo of my favorite line from the ballad.
It’s a poem. Our AI plays well with poetry because each stanza contains a single depiction or plot point. So Hapaxes, our natural language processor, knows when to stop reading and produce a prompt.
It’s psychedelic. Coleridge was tripping throughout the writing of Rime, and it shows. The descent into madness parallels nicely with the decidedly twisted imagery coming out of these early text-to-image generators.
Production Process
The popsicle sticks and bubblegum pipeline we built for this first production was a tug-of-war between human and machine. Here are the steps we took, some observations along the way and a breakdown of the ratio between man and machine hours.
Language Refresher: So yeah, the language of Rime is not easy, which as it happens has little to do with English in the 1790s. Coleridge was zooted out of his mind on a drug called laudanum at the time. Laudanum is a tincture containing roughly 10% opium. The poem was first published in 1798. In 1817, mounting criticism about the archaic language pushed Coleridge to add margin notes, explaining wtf was going on at any given moment. To help modern readers even more, we substituted some words with newer versions while maintaining the poetry of the piece. For instance, we changed words like kirk to church. That kind of thing. All human : no machine.
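As a rough sketch of the kind of substitution pass we ran (the word list below is illustrative, not our actual glossary):

```python
import re

# Illustrative archaic-to-modern glossary; the real list was hand-curated
GLOSSARY = {
    "kirk": "church",
    "eftsoons": "at once",
    "wherefore": "why",
}

def modernize(text, glossary=GLOSSARY):
    """Swap archaic words for modern ones, preserving leading capitalization."""
    def swap(match):
        word = match.group(0)
        replacement = glossary[word.lower()]
        return replacement.capitalize() if word[0].isupper() else replacement
    # Word-boundary match so "kirk" doesn't fire inside longer words
    pattern = re.compile(r"\b(" + "|".join(glossary) + r")\b", re.IGNORECASE)
    return pattern.sub(swap, text)

print(modernize("The Mariner walked to the kirk."))
# → The Mariner walked to the church.
```

The trick is doing this mechanically without touching meter or rhyme, which is why the actual pass stayed 100% human-reviewed.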
1/2 Prompt Generation | Action: After selecting Rime, we first ran Hapaxes over the poem to produce a collection of 143 prompts, one for each stanza. Each is a literal plot descriptor. We’re keeping most of the code to ourselves for now, but we’ve included a Python building block below. No human : all machine.
import re

def read_song(song_file):
    """Read the full text of the poem from disk."""
    with open(song_file, 'r') as f:
        return f.read()

def split_stanzas(song_text):
    """Stanzas are separated by one or more blank lines."""
    return re.split(r'\n\n+', song_text)

def print_prompts(stanzas):
    """Emit each stanza as the seed text for a prompt."""
    for i, stanza in enumerate(stanzas):
        print(f"Prompt for stanza {i+1}:")
        print(stanza)

song_text = read_song('song.txt')
stanzas = split_stanzas(song_text)
print_prompts(stanzas)
2/2 Prompt Generation | Style: The poem as a whole was analyzed to gather sentiment. Sad? Happy? Righteous? Despairing? We chose 16 different sentiments and tied each to a stylistic configuration: artistic choices that match the tone of the work. Hapaxes then made its decision. Once a style was established, it remained consistent throughout the book, except for minor tweaks if the image generation was way off. 20% human : 80% machine.
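In spirit, the sentiment-to-style step works something like the sketch below. The sentiment labels and style strings here are made up for illustration; Hapaxes' actual tables cover all 16 sentiments and stay private.

```python
# Hypothetical sentiment-to-style table (illustrative only)
STYLES = {
    "despairing": "desaturated palette, heavy shadow, woodcut linework",
    "righteous": "high contrast, golden light, dramatic low angle",
    "serene": "soft watercolor wash, wide establishing shot",
}

def style_prompt(plot_prompt, sentiment):
    """Append the style configuration matched to the work's sentiment."""
    style = STYLES.get(sentiment, "neutral ink illustration")
    return f"{plot_prompt}, {style}"

print(style_prompt("a lone sailor shoots an albatross", "despairing"))
```

Because the style string is chosen once per book and merely concatenated onto every plot prompt, consistency across 143 images comes almost for free.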
Image finalization: Everything from getting humans to look human to actualizing the style required improving on each establishing shot. Some images required 20-30 tweaks to get right. 90% human : 10% machine.
Layout: Each image was set to a 16:9 aspect ratio. We then dropped each image into a Keynote and layered in the text. Then we brought on graphic designer Dersu Rhodes to whip the renderscape into shape with a proper layout. This was BY FAR the most work-intensive part of the process. 100% human : 0% machine. FML.
Chapter Intros: Each intro was produced by taking the 1817 margin notes and running them through Davinci. The result was succinct overviews matched with custom icons, like this:
Soundscape: We used an AI voice from Speechify’s library to read the poem out loud. We decided on a cantankerous old English dude. If there is one litmus test for whether AI narration is there yet or not, it’s asking it to read poetry. The timing is just way out of whack. So we cut the audio up using Audacity and then layered in sound effects. 70% human : 30% machine.
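The cutting itself was done by hand in Audacity, but the basic operation (slicing the narration at chosen timestamps to fix the pacing) can be sketched with Python's standard-library wave module. This is a minimal sketch, not our actual tooling:

```python
import wave

def slice_wav(src_path, dst_path, start_s, end_s):
    """Copy the [start_s, end_s) window of a WAV file to a new file."""
    with wave.open(src_path, 'rb') as src:
        params = src.getparams()
        rate = src.getframerate()
        # Seek to the start timestamp and read only the window we want
        src.setpos(int(start_s * rate))
        frames = src.readframes(int((end_s - start_s) * rate))
    with wave.open(dst_path, 'wb') as dst:
        dst.setparams(params)  # header frame count is patched on close
        dst.writeframes(frames)
```

Each slice can then be re-sequenced with longer pauses between stanzas, which is most of what fixing AI poetry narration comes down to.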
eBook functionality: We partnered with ebook publisher FlipHTML5 to create the page-flipping experience. They have a sea of options to choose from in order to get a real skeuomorphic book experience going.
NFTs: Using the image manipulation function in FlipHTML5, we dropped in hyperlinked “NFT Key” buttons that led to a collection of our favorite images on OpenSea. Each NFT gives users access to the Lore Machine Library, for life!
Observations
Machines are superficial: From a prompt generation standpoint, the AI is really good at interpreting literal action. Organic emotion, abstract thought, meta-narratives, and stream of consciousness… not so much.
Framing: Human decisions about how to re-contextualize certain prompts to get the best result are crucial. The human imagination, when it comes to angles, layout, and metaphorical thinking, reigns supreme. Nothing beats human conviction, it seems.
Text is nimble: Text is light and therefore very easy on the CPU. Hapaxes was able to run through the corpus in under 3 minutes.
Perfection has its price: Getting each image to match our imagination took time. On average, each image took between 10 and 30 minutes to finalize.
Avatars: The hardest aspect of the image creation process was consistent characters. Ultimately, we hacked it by feeding pre-existing imagery to the prompt. But there seems to be a big open design space to improve upon here.
Unexpected Audiences: We didn’t exactly make Rime for anyone in particular. Total reads hit 72k nonetheless, and some fascinating potential markets and prospective partners emerged: musicians wanting something more than the music video format, people living with learning disabilities, book publishers sitting on old IP, and filmmakers tired of waiting years to see their visions become reality.
The Future
Making Rime gave us a lot of ideas. Here are a few of them:
Animated Panels: First order of business for the next book is producing select animated panels that can create ‘worlds in worlds’.
Character builds: Developing a design philosophy for each character before starting on other aspects of the storytelling process will go a long way toward smoothing out the production pipeline.
Featuring You!: Allowing readers to drop their own visage into the action is coming down the pike.
Strategic Partnerships: We see a huge opportunity to partner with book publishers and filmmakers to take advantage of latent IP.
More than anything else though, we’re excited to discover the ways in which these new tools can elevate human storytelling. The future is wonderful, the future is terrifying. Let’s ride.