Whilst road testing some AI video generation tools (round-up coming later this week), I decided to set myself the challenge of creating a movie trailer using only AI tools.
AI-generated videos are currently dominated by a few genres (sci-fi, animation, nature doc), so I decided to go in a different direction and make a trailer for an imagined alternative sequel to Stanley Kubrick’s 1980 psychological horror classic, The Shining.
I banned myself from directly referencing Kubrick, The Shining or any of the original actors in my prompts, but allowed myself to use the name of the hotel, both in the title of this fictitious sequel and in the ChatGPT prompt which generated the script (“write a voiceover for a short trailer for a film called Return to The Overlook”).
I pasted the narration into ElevenLabs and chose the creepiest premade voice available (Ethan).
As well as the narration, ChatGPT proposed music, sound effects and individual shots, some of which I used as inspiration.
For the backing track, I asked Suno to create “an instrumental backing track to a horror film, starting with disconcerting strings and building to a frenzied crescendo”.
I generated most of the visuals using Midjourney, which I then animated using Runway.
I tried using ChatGPT to generate image prompts but wasn’t happy with the resulting images so wrote them myself. I included “1970s cinematic shot” in most of the prompts (detailed below) to create a consistent aesthetic.
I generated the credit screens using Ideogram, which is better at accurately rendering text than other AI image generators.
I edited the clips together using Kapwing and added a few sound effects from its library and other royalty-free sound effects libraries.
The finished result is below (warning: contains some AI-generated blood).
Some reflections
I am not an experienced video editor. I was surprised how quick and easy it was to use these tools to create something serviceable (albeit horribly derivative).
These tools are going to get easier to use and the quality and duration of the output is going to improve (👋 Sora).
The copyright challenges of generative AI are glaringly apparent. I would be willing to bet 50p that the script for The Shining’s original trailer and stills from the film were in ChatGPT and Midjourney’s respective training sets.
I got better results by using separate tools (e.g. generating images in Midjourney rather than in Runway). It’s a smart move from Adobe to provide access to third-party models from within Premiere Pro.
Some of the realism of the Midjourney images was lost when Runway re-rendered them as animated video (e.g. the mountains in the first shot, the boy on his trike and the faces of the girls in the hallway).
Until Sora arrives, it’s best to steer clear of close-ups of faces (someone should have told the producers of Next Stop Paris 😱).
Runway refused to animate a few of the Midjourney-generated images: the blood-filled bathtub, the screaming child and an image I didn’t end up using (below) of “a deranged man looking through a missing splintered panel in a white wooden door with gritted teeth and a wild look in his eyes”; I suspect because it looked too much like Jack Nicholson…
Midjourney struggled to generate a convincing image of a key in a lock 🤷
Tools used
Script - ChatGPT
Voiceover - ElevenLabs
Soundtrack - Suno
Imagery - Midjourney & Ideogram
Video animation - Runway
Video editing - Kapwing
Sound effects - Kapwing, Freesound & Pixabay
Prompts used to generate the images
“1970s cinematic shot of an American plain, snow capped mountains on the horizon, a road cutting across the plain with a classic white car driving along it”
“1970s cinematic shot of a remote elevated grand American hotel lodge, cloud-covered mountain behind, pine trees in foreground”
“disconcerting tracking shot along the dimly lit corridor of a 1970s rural American hotel. balloons lie on the floor of the hallway”
“disconcerting 1970s cinematic tracking shot following a five year old boy in a red top and blue dungarees pedalling a blue tricycle along the corridor of a 1970s hotel with pale floral wallpaper and a brown carpet”
“disconcerting 1970s cinematic shot of two twin pale 10 year old girls in matching blue dresses with a white ribbon around the waist, white knee-length socks, black Mary Jane shoes, holding hands in a corridor with pale floral wallpaper and a brown carpet”
“1970s cinematic shot of a classic typewriter with a single sheet of cream paper in it and one line of typed text on a desk in a 1970s American hotel lounge”
“1970s cinematic shot of a locked white wooden door in a dimly lit 1970s American hotel”
“1970s cinematic shot of an axe breaking through a white wooden door in a dimly lit 1970s American hotel”
“1970s cinematic shot of a mint green fitted bathtub overflowing with blood”
“1970s close up cinematic shot of a 8 year old boy in a red top and blue dungarees screaming in terror with his hands on his cheeks in a dimly lit 1970s American hotel”