Meta has announced Movie Gen, a new artificial intelligence model that generates video and audio from text prompts. Positioned as a competitor to OpenAI’s Sora, Movie Gen can create videos from user descriptions and generate accompanying audio. The company stated that it can also produce personalised videos that use real photos of individuals to depict them in various scenarios. Generated videos can be further enhanced or edited using text inputs. However, unlike its Llama series of AI models, Meta is unlikely to release Movie Gen for open use by developers, Reuters reported.
Meta Movie Gen: What it is and how it works
In a research paper detailing the new AI model, Meta explained that the Movie Gen model has been trained for both text-to-image and text-to-video tasks. When prompted, it generates several coloured images, each serving as a frame for the video.
Meta stated that Movie Gen can produce high-definition (1080p) videos of up to 16 seconds at 16 frames per second (FPS). Within these limits, the model can generate videos at varying resolutions, durations and aspect ratios. The company noted that the model has learned real-world visuals by “watching” videos and can reason about object motion, camera motion, subject-object interaction, and more.
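To put the reported limits in perspective, the short Python sketch below works out how many frames a maximum-length clip contains. The duration and frame rate are the figures Meta cited; the 1920x1080 frame size is an assumption about what “1080p” means here, and the function and variable names are purely illustrative rather than anything from Meta’s paper.

```python
# Figures cited by Meta for Movie Gen's video output.
MAX_DURATION_S = 16  # maximum clip length, in seconds
FRAME_RATE_FPS = 16  # frames per second

# Assumption: "1080p" is taken to mean 1920x1080 (16:9).
# The model is said to support other resolutions and aspect ratios.
WIDTH, HEIGHT = 1920, 1080


def frame_count(duration_s: int, fps: int) -> int:
    """Return the number of frames in a clip of the given length."""
    return duration_s * fps


total_frames = frame_count(MAX_DURATION_S, FRAME_RATE_FPS)
pixels_per_frame = WIDTH * HEIGHT

print(f"Frames in a full-length clip: {total_frames}")  # 256
print(f"Pixels per frame (assumed 1920x1080): {pixels_per_frame}")
```

In other words, a full-length 16-second clip at 16 FPS amounts to 256 generated frames.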
For audio generation, Meta said that Movie Gen can produce matching audio using video-to-audio and text-to-audio techniques. The company claims it can generate 48kHz audio with cinematic sound effects and music synchronised to the video input. While the model’s video generation is limited to clips of up to 16 seconds, it can create “long-form coherent audio for videos up to several minutes long.”
Meta Movie Gen: Notable features
Meta stated that the Movie Gen model has been trained to condition on both text and images, enabling it to generate videos featuring a selected person from a real image. The company said that the video will preserve the person’s identity, while the actions will follow the user’s prompt.
Additionally, the model has video editing capabilities for both generated content and real videos. The company claimed that Movie Gen can perform “precise and imaginative edits” to a provided video based on the user’s description. In a preview shown by the company, the model changed the background of a video and added new elements to the main subject.
First Published: Oct 07 2024 | 12:52 PM IST