A few weeks ago, we published an article about Sora, OpenAI's new text-to-video generator, wondering if (and how) it might revolutionize filmmaking. The tool was subsequently tested by several artists and filmmakers, whose short films gained traction and stirred up curiosity, but also doubts, in the minds of many creatives.
But is AI as easy as it seems?
The short answer is: no, as we'll see with Air Head, a short film made with Sora.
As of today, AI is a tool that generates content (images, videos, text). While anyone can quickly produce an image of a bunny under a rainbow in Dalí's style, the standout AI art comes from a new breed of artists who invest time mastering these ever-evolving tools. They experiment with prompts, iterate repeatedly, build new workflows, and try new approaches. More often than not, the output is then refined or post-produced with "traditional" tools to make the final result cohesive.
"It’s not as easy as just a magic trick: type something in and get exactly what you were hoping for," says Sydney Leeder, producer at Shy Kids, about Sora.
THE AIR HEAD VIDEO CASE
A case in point is Air Head, the Sora short film that went viral. Created by the Toronto-based group Shy Kids, it features a man named Sonny with a yellow balloon for a head. The film's concept caught the attention of thousands, partly because it was promoted as a showcase of Sora's imaginative content generation. And, yes, it is amazing.
Today, it’s often cited as a prime example of “what AI can do in video”, but is it really only AI?
Again, the short answer is: no.
First and foremost, the filmmakers at Shy Kids were the ones who came up with this great idea. To make it a reality, they had to test various prompts and generate many iterations of each scene to find a few that worked.
In an in-depth interview with FXGuide, Patrick Cederberg, Shy Kids' animation and post-production director, discussed their experience using Sora. He noted that hundreds of generations were produced, saying, "my math is bad, but I would guess probably 300:1 in terms of the amount of source material to what ended up in the final."
He also explained that, on average, rendering a 3 to 20-second clip took around 10 to 20 minutes. While Sora can render up to 720p, they chose to work "at 480 for speed and then up-res using Topaz", another AI tool that upscales video resolution.
Despite Sora's capabilities, the generated scenes also required extensive post-production work. The team struggled to keep the balloon's color and shape consistent across scenes, and had to remove unwanted artifacts, such as faces embedded in the balloon.
"What you end up seeing took work, time, and human hands to get it semi-consistent, through curation, scriptwriting, editing, voiceover, music, sound design, color correction... all the usual post-production stuff", Cederberg explains in the BTS video.
So, while the technology enabled the filmmakers to generate surreal short clips quickly (which is very exciting), it still required manual intervention to achieve their complete vision.
This shows that tools like Sora aren't a magic bullet for seamless and original art. Instead, they complement traditional techniques and artists. As Sydney Leeder noted, "using Sora definitely opens up a lot more possibilities, especially with indie film crews working on low-budget projects".