The Quest to Rule Scene Density
In Which a Merry Band of Storytellers Equip an LLM with Eyes
Lore Machine is on a quest to build the planet's most powerful story visualization system. To this end, we're on 10 side quests at all times. One of these is maximizing Scene Density.
Scene Density is the number of words it takes to generate one image. It's a crucial calculation:
Scene Density = StoryText Word Count / Total Images Generated
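As a quick worked example of the formula, here is a minimal Python sketch. The function name `scene_density` is ours for illustration, and counting words with a plain whitespace split is an assumption. The 3,000-word, four-image figures echo the early results described later in this post.

```python
def scene_density(story_text: str, images_generated: int) -> float:
    """Words of storytext per generated image (a lower number means denser visuals)."""
    if images_generated <= 0:
        raise ValueError("need at least one generated image")
    word_count = len(story_text.split())  # assumption: whitespace-delimited words
    return word_count / images_generated


story = "word " * 3000                # stand-in for a 3,000-word story
print(scene_density(story, 4))        # 750.0 words per image
print(scene_density(story, 15))       # 200.0 words per image
```

Note that the number itself drops as the visuals get denser: four images from 3,000 words is 750 words per image, while fifteen images is 200.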
The relationship between a comprehensive visual narrative, like a film or graphic novel, and its constituent generated scenes is akin to that between a digital image and its pixels. When it comes to Scene Density, more is better: the fewer words it takes to generate an image, the denser the visuals.
In utilitarian terms, more Scene Density gives you more multimedia blocks to work with when building an immersive story experience.
Here's where Lore Machine started with Scene Density, where we're at and where we're going.
Humble Beginnings
When Lore Machine lurched into gear this March, Scene Density was an afterthought.
We initially rigged Lore Machine to generate an image every time our LLM detected one of three story occurrences:
A character performed an action
A location changed
A physical object impacted its surroundings
Lots of these happen in every story, so lots of images would come out the other side, or so we thought. A perfect plan, or so it seemed.
But alas, no. We drastically overestimated our LLM's ability to detect these carefully orchestrated scene triggers. As a result, swathes of beautiful storytext were going unvisualized. Three-thousand-word masterpieces were yielding four images on the regular.
We needed a plan.
Round Two: Sight!
Our LLM was failing to see crucial scene triggers, resulting in suboptimal image generation and much unrest in the Discord. We needed to give our LLM eyes.
To begin, we told the LLM when to look and how often. Working backward from anecdotal user input, we established an ideal Scene Density of 200. In other words, we demanded the LLM see one image's worth of scene details every 200 words before proceeding further through a storytext.
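One way to picture the 200-word cadence is a simple windowing pass over the storytext. The sketch below is an assumption about the mechanics, not a description of Lore Machine's actual pipeline: `chunk_story` and `WORDS_PER_IMAGE` are hypothetical names, and words are split on whitespace.

```python
from typing import Iterator

WORDS_PER_IMAGE = 200  # the target Scene Density named in the post


def chunk_story(story_text: str, size: int = WORDS_PER_IMAGE) -> Iterator[str]:
    """Yield consecutive windows of at most `size` words."""
    if size <= 0:
        raise ValueError("size must be positive")
    words = story_text.split()
    for start in range(0, len(words), size):
        yield " ".join(words[start:start + size])


story = " ".join(f"w{i}" for i in range(3000))  # stand-in for a 3,000-word story
chunks = list(chunk_story(story))
print(len(chunks))  # 15
```

Each window would then be handed to the LLM with the instruction to return one image's worth of scene details, so a 3,000-word story gets fifteen looks instead of however many triggers the model happens to notice.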
Opening our LLM's eyes to these scene details every 200 words began to work. The detected scene properties are then merged with libraries of metadata from each respective story using an assembler algorithm. The outcome is an explosion in the volume of modularly built prompts, which turbocharges Scene Density and multimedia output for all users.
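The post does not spell out the assembler algorithm, so what follows is only a hypothetical sketch of what merging detected scene properties with a story's metadata library could look like. Every name here (`SceneProperties`, `StoryMetadata`, `assemble_prompt`), the fields, the prompt layout and the sample story details are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SceneProperties:
    """Details detected in one window of storytext (hypothetical fields)."""
    characters: list[str]
    location: str
    action: str


@dataclass
class StoryMetadata:
    """Per-story library of reusable descriptions (hypothetical structure)."""
    art_style: str
    character_looks: dict[str, str] = field(default_factory=dict)
    location_looks: dict[str, str] = field(default_factory=dict)


def assemble_prompt(scene: SceneProperties, meta: StoryMetadata) -> str:
    """Build one image prompt from modular parts.

    Falls back to the bare name when the library has no entry for it.
    """
    cast = ", ".join(meta.character_looks.get(c, c) for c in scene.characters)
    place = meta.location_looks.get(scene.location, scene.location)
    return f"{meta.art_style}. {cast}. {scene.action}. Setting: {place}."


# Made-up sample data, purely for demonstration.
meta = StoryMetadata(
    art_style="ink-wash graphic novel",
    character_looks={"Mira": "Mira, a silver-haired cartographer"},
    location_looks={"harbor": "a fog-bound harbor at dawn"},
)
scene = SceneProperties(characters=["Mira", "Tobin"], location="harbor", action="unrolling a map")
print(assemble_prompt(scene, meta))
```

Because the style and character descriptions live in the metadata library rather than in each detected scene, every new window of storytext only has to supply the parts that changed, which is what makes the prompts modular.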
Here are some early results:
Scene Density All The Way Down
We are dangling our legs over the edge of a future in which writers are the commanders of content. This is a future in which the originators of our civilization's stories can conjure their imagination into reality in an evening.
Lore Machine is here to help writers tap into these emergent story visualization powers. By cranking aspects of our system like Scene Density, we provide high-resolution visual representations of storytext. This volume of media means creative options, and that composability gives writers the surface area upon which to heighten individual expression and maybe help us all understand each other a little more.
If this gets you so excited you can't sleep at night