The other day, Joe had the strangest dream: he found himself in an elevator, and when the doors opened, he was met with an odd trio: a nun flanked by two doctors. The imagery was unsettling, to say the least. Naturally, we decided to recreate this bizarre scene. Here's how we turned it into a "talking head", an AI-generated video.
Follow us on TikTok for this kind of content (@orbitae_films)
First, Joe took to paper and drew a schematic of his dream's scene. Then, Alex used Midjourney to breathe life into the sketch. How do you do that, you may ask? Well, you upload your drawing to the Discord chat, which gives you a link. Use this link in your prompt and add a description of what you imagine. Remember to be as clear as possible: at a minimum, be specific about the style, lighting and aspect ratio.
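To make the prompt recipe concrete, here is a minimal sketch in Python that assembles the text you would paste after `/imagine prompt:` in Discord. The helper name `build_imagine_prompt`, the example URL, and the style and lighting wording are all made up for illustration and are not the prompt we actually used; `--ar` is Midjourney's aspect ratio parameter.

```python
def build_imagine_prompt(image_url, description, style, lighting, aspect_ratio):
    """Assemble the text pasted after `/imagine prompt:` (hypothetical helper)."""
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url should be the link Discord gave you for the uploaded sketch")
    # The image link goes first, then the description, then the parameters.
    parts = [p.strip() for p in (description, style, lighting) if p.strip()]
    return f"{image_url} {', '.join(parts)} --ar {aspect_ratio}"


prompt = build_imagine_prompt(
    image_url="https://example.com/elevator-sketch.png",  # placeholder link
    description="a nun flanked by two doctors standing in an open elevator",
    style="photorealistic film still",      # illustrative wording
    lighting="cold fluorescent lighting",   # illustrative wording
    aspect_ratio="9:16",
)
print(prompt)
```

Keeping the pieces separate like this makes it easier to change one thing at a time (say, only the lighting) when you refine the prompt.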
Pro Tip: Include specific details in your drawings. Below, you can see how the AI incorporated the yellow cross from our original sketch into the generated images, although it didn't quite capture that the other characters were doctors.
From initial sketch to first image attempts, and then to coherent, realistic elements.
The initial images Midjourney produced were nothing short of uncanny, teetering between the original sketch and a disturbing realism. So, what then? Well, you iterate, iterate… and what? Yes, iterate again. This is an often-overlooked aspect of AI tools: they require patience, time, and an incredible ability to live with frustration. But anyway, the key is to keep modifying and refining your prompts as you go, until the output looks like what you wanted.
Once the image was fine-tuned, we wanted our nun to say something creepy. So, we turned to ChatGPT to craft a compelling short monologue about AI becoming the new omnipotent God (an idea the AI really liked, by the way). After some back-and-forth with the tool, we edited and perfected the text on our side until it was ready for the next stage.
This is where D-ID comes into play. This software tool, mostly used for creating corporate videos and the like, uses AI to generate high-quality, realistic human avatars that talk (the lip-sync is pretty accurate, and they even blink and move their hands and head!). But we did not want to do that. We wanted to challenge the tool. Unsurprisingly, it wasn't easy to find a more nuanced and creepy voice for our nun... But after listening to each and every one of them in all of their styles, we found one! Then we were ready: the image was uploaded, the script written and the voice selected. Time to click “generate”. In mere minutes, our nun was inviting us to join a new AI cult. A dash of atmospheric music for the vibe and voilà: the product of our AI-aided labor was complete.
In the end, within a day, we were able to bring Joe’s nightmare to life, thanks to sketches, patience, AI and a somewhat bizarre imagination.
Pro tip #2: Given its intended audience, D-ID primarily features corporate-sounding voices, some of which can be quite AI-y. If you’re looking to make a more creative piece, you have two options. As D-ID also lets you upload a specific voice, you can either use Eleven Labs (see zombie test), a voice generator tool that creates incredibly realistic voices, or record yourself, an actor, or a friend reciting the script (see fortune teller test).
Zombie test:
Because the shot is closer to the face, the tool was able to produce better, more realistic lip-sync.
Fortune teller test:
As she has a veil and a crown, you can definitely see where the AI cut the image. So, this would be something to avoid.