I don't know how but I'm writing!

Idea: Enriching Dungeons & Dragons with Generative Art

Happy New Year!

Like everyone who doesn't live under a rock, I've been fascinated with the recent advancements in AI. It seems like overnight, ChatGPT and friends brought on a sudden avalanche of sensational headlines, and the flow hasn't stopped with the new year.

What finally convinced me to board the hype train, though, was Midjourney, a Discord bot which feeds commands into an AI/ML algorithm that generates images. From day 1 the user experience and the quality of the images were very good. My friends added the bot to our Discord server and quickly set about using up all our free-trial images.

At first, we were trying to break the system, seeing how far it would go and what the limits were. Then we made ourselves avatars for various sites, fed it inside jokes, etc. By this time most people were already bored, but then somebody got the brilliant idea of creating portraits of our Dungeons & Dragons characters. I was extremely impressed with the quality of the images and made several scenes with my own character, a Hexblade Warlock. It was really fun, but eventually I ran out of trial images and had to stop.

I've written about how I play Dungeons & Dragons before (Ed.: Not on this blog, stay tuned...). It is a very non-standard setup. Basically, we have about twelve people entering and exiting a rotating set of different storylines. Anyone can volunteer to be a DM, and campaigns last anywhere from one session to ten. There is a loosely built world and a few recurring NPCs, but for the most part anything goes.

Given such a chaotic mess, it's somewhat hard to develop any narrative identity, but I've found my niche as "The guy who tries weird stuff." For example, one of my campaigns was set inside a "non-Euclidean" dungeon which had impossible geometries specifically designed to trick the party. Another one of my campaigns introduced survival mechanics, requiring the party to keep track of food, water, and weather - a common feature of many other campaigns but something we usually ignore. These have had varying levels of success; making custom D&D rules is very, very difficult to do without breaking the game. And as I learned with my survival campaign, if your party does not like the rules or you don't properly challenge them, they will simply ignore the rules. So I have been wondering - how do I top my previous campaigns without delving further into arcane custom rules?

At this point the astute reader knows exactly where I'm going with this - I would like to use Midjourney to generate D&D images, possibly live, as the events of the session are happening. The idea of the party walking into a new area and getting to see a visual representation of the brand-new area they have helped narrate is extremely compelling. This is a perfect application for Midjourney. We already play over Discord, and everyone is in the channel and watching the chat. Visuals are hard to come by, and up until now have mostly been absent, with DMs relying on verbal descriptions of objects and characters within the world. Adding in tons of art would require either buying a pre-written campaign or hiring an artist, neither of which is very appealing.

My vision is this - whenever the DM has a block of flavor text, they would feed a description of it to Midjourney, parameterized by whatever is happening. I would say something like, "$CHARACTER approaches the Golden Idol sitting on the shrine and touches it. A blue tongue of flame bursts from the Idol's forehead, licking up towards the ceiling." Either before or during my speech I would feed Midjourney a description along the lines of /imagine $CHARACTER_DESCRIPTION in front of golden idol, blue fire surrounds idol, dark room, dungeon, scary, ... and post the upscaled result in chat for the group to see.
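To make the "parameterized" part concrete, here is a rough Python sketch of how the prompt text might be assembled ahead of time. Everything in it is my own invention for illustration: the Scene class, the CHARACTER_DESCRIPTIONS table, the build_prompt helper, and the sample character wording are all hypothetical. It only builds the string that would be pasted after the bot's command and does not talk to Midjourney or Discord at all.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """One block of flavor text, reduced to prompt-sized pieces (hypothetical)."""
    action: str                                         # what is happening in the scene
    mood_tags: list[str] = field(default_factory=list)  # setting and mood keywords


# Hypothetical lookup table: one short visual description per character,
# written once before the session so it can be reused in every scene.
CHARACTER_DESCRIPTIONS = {
    "warlock": "human warlock in scraggly hand-made half-plate armor",
}


def build_prompt(character: str, scene: Scene) -> str:
    """Fill in the $CHARACTER_DESCRIPTION slot and append the scene keywords."""
    description = CHARACTER_DESCRIPTIONS[character]
    parts = [f"{description} {scene.action}", *scene.mood_tags]
    # Drop empty pieces so the prompt never contains doubled commas.
    return ", ".join(part.strip() for part in parts if part.strip())


idol_scene = Scene(
    action="in front of golden idol, blue fire surrounds idol",
    mood_tags=["dark room", "dungeon", "scary"],
)
prompt = build_prompt("warlock", idol_scene)
print(prompt)
```

The point of keeping the character descriptions in one table is that the fiddly part (getting the armor to look right) only has to be worked out once, and each new scene just needs an action and a few mood words.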

Dynamic scenes (such as a fight between two characters) would be extremely cool but might take too long and be too difficult to generate.

The one problem with this plan is the descriptions of the characters themselves. Everyone has a mental picture of what their character looks like, based on both reference art we've taken from various places (5e Player's Handbook, Google Images, etc.) and verbal descriptions we have collected over the various sessions. For example, my character wears hand-made half-plate armor, so (by my intention as a player) it is designed to look scraggly and hacked-together. I find it very hard to believe that Midjourney could accurately capture this kind of detail on the fly, but just depicting a normal human wearing armor really wouldn't fit my character, and it would look obviously wrong. Perhaps if I really took some time to work the AI before the session I could come up with good images. However, that removes the dynamic nature of it, which is what is so appealing to me.

I believe I will purchase a subscription to Midjourney and then try to run a session where I pre-generate images during the writing and design phase. If that goes well, then we can try to do it live. Any time spent messing with the bot is time away from the game, so it needs to be smooth, but if I do it this way then hopefully I'll have enough practice making good images that I can do it quickly. That's where the magic happens.