AI for Storyboard Creation

In my past blog posts, I discussed a lot of theoretical aspects of the treatment writing process. With that knowledge, I started taking a more hands-on approach. I tried to watch every YouTube video I could find on the topic and stumbled across the channel of Nur Niaz, a commercial director from Bangkok, Thailand, who has worked with major brands. His videos significantly helped me improve my treatment skills.

I applied his tips to my latest treatment, which I wrote for my Carhartt WIP Spec project. After receiving feedback from several people, I felt I had genuinely improved compared to my previous treatments. I also used ChatGPT to help me rephrase my sentences, and it effectively maintained the core of my ideas while refining the spelling and grammar for better flow. This further enhanced my treatment.

The idea is to send the treatment to the Carhartt WIP marketing team and see if we can get some money or clothes to shoot the video. Therefore, in the next steps I will work out a clear storyboard. I will put different tools, like Previs AI, Midjourney, or Stable Diffusion, into action to see what results I can generate and whether it makes sense to use AI tools or whether it’s more efficient to hire a storyboard artist.

First Try with Stable Diffusion 

Quick Overview – Stable Diffusion is a text-to-image diffusion model developed by Stability AI in collaboration with researchers at LMU Munich and other partners. It generates images from textual descriptions using deep learning, particularly a latent diffusion model (LDM). Unlike earlier models, Stable Diffusion runs efficiently on consumer GPUs and allows users to create high-quality images by describing them in natural language.

Key Features:

  • Open-source and accessible.
  • Creates high-resolution, detailed images from prompts.
  • Works via “latent space” (compressed representation of images) for efficiency.
  • Supports various applications like art, concept design, photorealism, and more.
  • Can be fine-tuned or extended with custom models (e.g., for specific styles or subjects).
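To get a feel for why working in latent space saves so much effort, here is a rough back-of-the-envelope sketch. It assumes the Stable Diffusion v1 setup, where the autoencoder shrinks each side of the image by a factor of 8 and the latent has 4 channels; other versions may use different values, and the function names are only for illustration.

```python
def latent_shape(width, height, downsample=8, latent_channels=4):
    """Return (channels, height, width) of the latent for an image size.

    The defaults (factor 8, 4 channels) are assumptions based on
    Stable Diffusion v1 and may not hold for other model versions.
    """
    if width % downsample or height % downsample:
        raise ValueError("width and height should be multiples of the downsample factor")
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(width, height, downsample=8, latent_channels=4):
    """Compare the number of RGB pixel values with the number of latent values."""
    pixel_values = width * height * 3  # three colour channels per pixel
    channels, lat_h, lat_w = latent_shape(width, height, downsample, latent_channels)
    return pixel_values / (channels * lat_h * lat_w)


if __name__ == "__main__":
    print(latent_shape(512, 512))       # (4, 64, 64)
    print(compression_ratio(512, 512))  # 48.0
```

Under these assumptions, the model denoises roughly 48 times fewer values than it would if it worked directly on the pixels, which is a big part of why it can run on a consumer GPU.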

Since I am pretty lazy, I told ChatGPT what I wanted to generate and asked it to give me a prompt that I could then use in Stable Diffusion. The prompts looked good, and I had different options to choose from.

Input to ChatGPT – hello please give me prompts for stable diffusion. I want to create a picture of a person sitting in a football like press conference. dim lights with a lot os microphones in front of him. The person should be brown skinned and and a short buzzzcut. Dressed in streetwear

Result – 

Photorealistic Style Prompt:

A brown-skinned man with a short buzzcut sitting at a football-style press conference table, dim lighting, dramatic shadows, many microphones in front of him, wearing stylish streetwear (hoodie, sneakers, chains), serious expression, blurred background with press logos, cinematic lighting --realistic --v 5 --ar 16:9 --style photographic

Cinematic / Moody Style Prompt:

A dramatic football press conference scene, brown-skinned man with buzzcut, sitting at a table under dim lights, surrounded by microphones, urban streetwear outfit, intense atmosphere, moody lighting, gritty texture, cinematic look --ar 16:9 --style cinematic --v 5

Illustration / Stylized Version:

Illustration of a press conference setting, brown-skinned man with short buzzcut, lots of microphones in front of him, wearing streetwear (oversized hoodie, gold chain), sitting under moody, low lighting, stylized background with blurred sports logos --style illustration --ar 16:9 --v 5
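A side note on these prompts: the trailing flags for version, aspect ratio, and style follow Midjourney's parameter syntax. Stable Diffusion itself most likely just reads them as extra prompt text, so it can help to strip them out and set the image size separately. Below is a minimal sketch of that clean-up. The function names, the 768-pixel long side, and the rounding to multiples of 8 are my own assumptions, not part of any official tool.

```python
import re

# A flag starts after whitespace with "--", an en dash, or an em dash,
# followed by a letter. Blog editors often turn "--" into a single dash
# character, so all three forms are accepted here.
FLAG_START = re.compile(r"\s+(?:--|\u2013|\u2014)(?=[A-Za-z])")


def split_prompt(prompt):
    """Split a prompt into its descriptive text and a dict of trailing flags."""
    parts = FLAG_START.split(prompt)
    text = parts[0].strip()
    flags = {}
    for part in parts[1:]:
        # "ar 16:9" -> name "ar", value "16:9"; a bare flag gets an empty value
        name, _, value = part.strip().partition(" ")
        flags[name] = value.strip()
    return text, flags


def size_from_aspect(aspect, long_side=768, multiple=8):
    """Turn an aspect ratio like "16:9" into a (width, height) in pixels.

    Both sides are rounded down to a multiple of 8 (an assumption that
    suits Stable Diffusion v1-style models).
    """
    w_ratio, h_ratio = (int(n) for n in aspect.split(":"))
    scale = long_side / max(w_ratio, h_ratio)
    width = int(w_ratio * scale) // multiple * multiple
    height = int(h_ratio * scale) // multiple * multiple
    return width, height


if __name__ == "__main__":
    text, flags = split_prompt(
        "A brown-skinned man at a press conference, dim lighting --v 5 --ar 16:9"
    )
    print(text)   # A brown-skinned man at a press conference, dim lighting
    print(flags)  # {'v': '5', 'ar': '16:9'}
    print(size_from_aspect(flags["ar"]))  # (768, 432)
```

The cleaned text can then be pasted into Stable Diffusion as the prompt, with the width and height entered in whatever interface is being used.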

The result from Stable Diffusion almost nailed my initial idea for the frame.

It shows the correct mood and the core idea of the frame. Of course, it’s not how the shot would look in the final video, but as a reference it works pretty well.

With the second frame I had more issues. I had a certain picture in my head of how I wanted the frame to look, and the generated pictures didn’t give me the result I wanted.

Input to ChatGPT – 3 people standing infront of a fence at the local basketball court and watching others play. should be a frontal shot medium wide. also dressed in streetwear

Cinematic Urban Style Prompt:

Three friends in stylish streetwear facing the camera, standing in front of a fence at a neighborhood basketball court, players in the background in motion, medium wide frontal shot, soft natural lighting, urban summer vibe, detailed street textures --ar 16:9 --style cinematic --v 5

The result wasn’t really pleasing to me. I tried different prompts and also changed them myself, but nothing seemed to work. Out of frustration, I asked ChatGPT to generate the pictures directly, and I was surprised by the result.

Since the concept is still at an early stage, it makes no sense to invest time in prompting each scene when some things will still change. The next step will be sketching the storyboard by hand using simple stick-figure drawings; once that’s finalized, prompting the images could be an option.
