Using AI Image Generators in Creative Production

Published in Popul-AR
Dec 24, 2022 · 7 min read

AI image generators have gotten a lot of press lately, and for good reason: there has been a huge spike in the quality and general availability of these tools. They offer a new perspective on the arts and let us do complex tasks quickly, allowing us mere mortals to focus on what matters most: the narrative! In this article, we have gathered a list of techniques that will boost your creative work.

Midjourney prompt: image generator robots sitting at a long table, discussing the ethics of prompt moderation, oil on canvas, chiarascuro --ar 3:2 --v 4

Midjourney, Dall-E, and Stable Diffusion seem to be the most popular. Each has its own strengths and weaknesses, so it’s worth experimenting with them and getting to know how they behave. Midjourney tends to make cool artistic pieces, while Dall-E is more realistic and conservative. Stable Diffusion is free and generates images of similar quality, but lacks a web UI (unless you run it yourself).

There are plenty of options to choose from, but when I first started experimenting with the generators, it seemed like a crapshoot — no matter how much time I spent crafting my prompts, I felt like the outputs were essentially random. So the question is, how are you supposed to use an AI generator to make something with intention?

I’ll talk about some general approaches to the craft, as well as a specific use-case on a client project that caught me off guard. I came up with the solution using Dall-E’s out-painting, but I wasn’t expecting to use this technology in production so soon!

Techniques

Prompt Crafting

Most of the generators start by asking you for a prompt: a short description of what you want them to generate. Prompt libraries are great resources for increasing your vocabulary (not in general, but specifically related to AI image gen). You can browse around to see what prompts other creators are using to get specific outputs. You’ll notice some trends and common phrases like “trending on artstation”.

I searched for “eagle headed man” on Krea, and found this image of a man with the head of a bald eagle. Like a lot of the generated images, this one is pretty goofy, and maybe not exactly what I wanted, but it’s a decent starting point. I can take the prompt from this image and start building a sticky note collection of potential prompt vocabulary.

“man with head like a bald eagle”

It may seem a little silly since you need to give a prompt to the search bar (why not just run the generator?), but it lets you browse hundreds of results very quickly, as opposed to waiting a few minutes for an actual image generation.
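Once the sticky note collection grows, it can help to combine the vocabulary systematically instead of typing every prompt by hand. Here is a minimal sketch of that idea. The function names (`build_prompt`, `prompt_variations`) are hypothetical helpers made up for this illustration, and the `--ar` and `--v` suffixes follow the Midjourney parameter style; other generators will not understand them.

```python
from itertools import product


def build_prompt(subject, modifiers=(), aspect_ratio=None, version=None):
    """Join a subject and style modifiers into one comma-separated prompt.

    aspect_ratio and version are appended as Midjourney-style parameters
    (--ar, --v). Leave them as None for generators that do not use them.
    """
    parts = [subject.strip()] + [m.strip() for m in modifiers if m.strip()]
    prompt = ", ".join(parts)
    if aspect_ratio:
        prompt += f" --ar {aspect_ratio}"
    if version:
        prompt += f" --v {version}"
    return prompt


def prompt_variations(subject, mediums, styles, **params):
    """Yield one prompt for every medium/style pairing in the vocabulary."""
    for medium, style in product(mediums, styles):
        yield build_prompt(subject, [medium, style], **params)


if __name__ == "__main__":
    # Vocabulary collected from prompt libraries (illustrative values only).
    mediums = ["oil on canvas", "3D render"]
    styles = ["dramatic lighting", "trending on artstation"]
    for prompt in prompt_variations(
        "man with head like a bald eagle",
        mediums,
        styles,
        aspect_ratio="3:2",
        version=4,
    ):
        print(prompt)
```

This only produces text to paste into a generator. It does not call any service, so the results are still up to the model.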

Another good way to get a look at prompt/output pairs is to join the Midjourney Discord server. Midjourney’s image generator runs inside of Discord, in the form of a chat command. This means you can see everyone’s work-in-progress (prompts, image inputs, and outputs) on the public channels, and you can even run variations directly from their outputs.

Piggybacking on another user’s WIP, I requested a variation of one of their outputs

Image Seeds

If you already have an image that you want to use as inspiration, you can throw it into the mix. The implementations vary, but the idea is that the generator will take style and content from your input, and use it in conjunction with a prompt to generate an image. Stable Diffusion calls it img2img, Midjourney calls it “image prompting”, and Dall-E has its own dedicated editor (more on this later).

Here’s a striking comparison between an initial sketch and the final output. Of course, you might see these two images side by side and think it’s a magical, instant transition, but in reality there were several generations between the two.

Here’s another workflow, using image collages with Midjourney v4. This approach gives you greater control over the composition, and visual artists will probably be more comfortable with this technique. It feels less like pulling the lever on a prompt slot machine and more like a creative process that grants us some agency over the AI.

Photobashing with Midjourney — by Sean Simon

In-painting / Out-painting

Because the generators can be hard to control, Dall-E provides an editor that allows you to remove and replace specific areas of the image. This helps when your generated result is 95% correct, but you want to clean up the details or extend the background. You can choose to erase and regenerate sections of the image (in-painting), or ask it to generate additional sections that are outside of the original image (out-painting).

Our production use-case for Dall-E used out-painting to generate missing sections of some illustrated characters. We had some great 2D compositions, but we needed them to exist in 3D space in AR, which exposed some missing arms, legs, etc.

Unfortunately I can’t show the client work that sparked this article, so I’ll use this render from unsplash.com to illustrate the process.

3D render by avat fathiazar

Let’s say this rendered woman is exactly what we need, but we want to show her full body in the shot. Her legs are missing, so I’ll start with some out-painting to generate them.

Sizing the original image inside a full generation frame
Picking one of the 4 results
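The sizing step is just a bit of arithmetic: shrink the original so it fills only part of the frame, then leave the empty space where you want the new content to appear. Here is a rough sketch of that calculation, assuming a 1024-pixel square generation frame. The `fit_in_frame` helper and its `fill` and `anchor` parameters are made up for this illustration and are not part of any Dall-E tool.

```python
def fit_in_frame(img_w, img_h, frame=1024, fill=0.5, anchor="top"):
    """Scale an image to occupy `fill` of the frame height and position it.

    Returns (new_w, new_h, x, y): the resized dimensions and the top-left
    offset of the image inside a square frame of `frame` pixels. The image
    is centered horizontally. `anchor` decides where the empty space goes:
    "top" leaves room below (e.g. for legs), "bottom" leaves room above,
    anything else centers the image vertically.
    """
    # Never let the scaled image exceed the frame width.
    scale = min(frame * fill / img_h, frame / img_w)
    new_w = round(img_w * scale)
    new_h = round(img_h * scale)

    x = (frame - new_w) // 2
    if anchor == "top":
        y = 0
    elif anchor == "bottom":
        y = frame - new_h
    else:
        y = (frame - new_h) // 2
    return new_w, new_h, x, y


if __name__ == "__main__":
    # A portrait image placed in the top half, leaving the bottom half empty.
    print(fit_in_frame(800, 1200, anchor="top"))
```

For the example above, I wanted the missing legs generated below the character, so the original sits at the top of the frame and the lower half is left blank for the generator to fill.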

So now we have a full-length body shot of our character! Of course I could spend more time refining this, maybe with some manual edits in Photoshop, but I’ll leave it at this for this quick demo.

In the client project, I didn’t edit any of the original image (only added to it), but I’ll show a quick example of in-painting so you can see what that looks like. The basic idea is that you erase the section that you want to regenerate. There’s an eraser tool in the web editor, but it really just looks at the alpha channel, so you could do this step in Photoshop too.

Eraser action
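To make the alpha channel point concrete, here is a toy sketch of what "erasing" means at the pixel level: the color values stay put and only the alpha drops to zero inside the erased area. It uses a plain grid of RGBA tuples rather than a real image file, so treat it as an illustration of the idea. In practice you would do this in Photoshop or an imaging library, and `erase_region` is a hypothetical helper, not something from the Dall-E editor.

```python
def erase_region(pixels, left, top, right, bottom):
    """Return a copy of an RGBA pixel grid with a rectangle made transparent.

    `pixels` is a list of rows, each row a list of (r, g, b, a) tuples.
    The rectangle covers columns left..right-1 and rows top..bottom-1.
    The input grid is not modified.
    """
    erased = []
    for y, row in enumerate(pixels):
        new_row = []
        for x, (r, g, b, a) in enumerate(row):
            inside = left <= x < right and top <= y < bottom
            # Keep the color, zero the alpha: this marks the pixel for regeneration.
            new_row.append((r, g, b, 0 if inside else a))
        erased.append(new_row)
    return erased


def transparent_fraction(pixels):
    """Return the share of pixels whose alpha is 0 (the area to regenerate)."""
    total = sum(len(row) for row in pixels)
    clear = sum(1 for row in pixels for (_, _, _, a) in row if a == 0)
    return clear / total if total else 0.0


if __name__ == "__main__":
    # A tiny 4x4 opaque gray "image" with its top-left quarter erased.
    image = [[(128, 128, 128, 255)] * 4 for _ in range(4)]
    masked = erase_region(image, 0, 0, 2, 2)
    print(transparent_fraction(masked))
```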

I wanted to get a falcon head in place of the human head, but the generator gave me a companion bird, and I like it.

Going one step further (I didn’t realize this was an option until discovering it now), you can move the generation frame, give it a new prompt, and continue building out your scene.

Upscaling

If you have done anything with image generators, you probably noticed how small the images are. They are generally around 512 or 1024 pixels per side. Once you have your final image, you can pass it through an AI upscaler app to increase the size for production. There are some free Colab notebooks for upscaling, but if you have budget to spend on an app, it seems like Gigapixel AI is the industry leader.
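If you are not sure how much upscaling you need, a quick calculation helps. This sketch works out the scale factor for a given print size, then rounds up to the next 2x step. The 300 DPI default and the power-of-two rounding are my assumptions (many upscalers offer fixed steps like 2x and 4x, but check the tool you use), and both function names are made up for this example.

```python
import math


def required_upscale(src_px, print_inches, dpi=300):
    """Return the scale factor needed for src_px to cover print_inches at dpi."""
    target_px = print_inches * dpi
    return target_px / src_px


def next_power_of_two_factor(factor):
    """Round a scale factor up to 1x, 2x, 4x, 8x, and so on.

    Assumption: the upscaler only offers power-of-two steps.
    """
    if factor <= 1:
        return 1
    return 2 ** math.ceil(math.log2(factor))


if __name__ == "__main__":
    # A 1024 px wide generation printed 10 inches wide at 300 DPI.
    factor = required_upscale(1024, 10)
    print(factor, next_power_of_two_factor(factor))
```

In that example the image needs to be roughly three times larger, so a 4x upscale would cover it with some room to crop.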

Upscaling isn’t a super exciting topic, but it’s a necessary step in a lot of cases, so I didn’t want to leave it out.

The Future of Digital Art

There are a lot of interesting conversation topics around what this new tech will mean for artists. I’m sure there will be many copyright lawsuits and some folks will have to pivot their careers to keep up with the new impending paradigm of AI in creative fields. It’s going to be a rough road, but I’m excited to explore the new potential of these tools.

The web interface of Dall-E and the awkward Midjourney Discord chat commands are obviously not ideal in terms of production workflows. The good news is that we are starting to see AI integrations into apps like Photoshop and Blender, so artists can work directly with the generators in the apps they already know.

Dream Textures: Stable Diffusion built into Blender

Dream Textures looks super promising. It integrates Stable Diffusion directly into Blender, and it can do a bunch of cool stuff:

  • Create textures, concept art, background assets, and more with a simple text prompt.
  • Use the ‘Seamless’ option to create textures that tile perfectly with no visible seam.
  • Quickly create variations on an existing texture.
  • Re-style animations with the Cycles render pass.
  • Run the models on your machine to iterate without slowdowns from a service.

Continue the Discussion

Have you used AI generators in your professional workflow? Join our community on the Lab and share your thoughts!

Author: Josh Beckwith
Contributor: Laszlo Arnould
