Meta works on Emu Video and Emu Edit: Generative AI tricks for GIFs, photos and 4-second videos
Meta has announced through a blog post that it is working on new research into “controlled image editing based solely on text instructions and a method for text-to-video generation based on diffusion models”.
In simpler terms, Meta wants to bring generative AI tools to Facebook and Instagram. The two projects in development are called Emu Video and Emu Edit.
What is Emu Video?
This tool, as the name suggests, is for generating video. Meta describes it as “a simple method for text-to-video generation based on diffusion models”. Emu Video should respond to a variety of inputs: text only, image only, and both text and image. The process is split into two steps, Meta clarifies: first, generating images conditioned on a text prompt, and then generating video conditioned on both the text and the generated image.
“Our state-of-the-art approach is simple to implement and uses just two diffusion models to generate 512x512 four-second-long videos at 16 frames per second”, Meta adds.
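For intuition, here is a minimal sketch of what that factorized, two-step pipeline could look like in code. Everything below is hypothetical: Meta has not published an Emu Video API, so the class names, method signatures, and the `emu_video` helper are illustrative stand-ins for the two diffusion models the post describes.

```python
# Hypothetical sketch of Emu Video's two-step, "factorized" generation as the
# blog post describes it. None of these classes is a real Meta API; the stub
# bodies just return dummy data so the control flow is runnable.

from dataclasses import dataclass
from typing import List, Optional

Image = List[List[int]]  # placeholder type standing in for a 512x512 image


@dataclass
class Video:
    frames: List[Image]  # 4 seconds at 16 fps -> 64 frames
    fps: int = 16


class TextToImageDiffusion:
    """Stands in for the first diffusion model (text -> image)."""
    def generate(self, prompt: str) -> Image:
        return [[0] * 512 for _ in range(512)]  # dummy 512x512 image


class ImageToVideoDiffusion:
    """Stands in for the second diffusion model (text + image -> video)."""
    def generate(self, prompt: str, image: Image) -> Video:
        return Video(frames=[image] * 64)  # dummy 64-frame clip


def emu_video(prompt: str, image: Optional[Image] = None) -> Video:
    # Step 1: for text-only input, first generate an image from the prompt.
    if image is None:
        image = TextToImageDiffusion().generate(prompt)
    # Step 2: generate video conditioned on both the text and that image.
    return ImageToVideoDiffusion().generate(prompt, image)


clip = emu_video("a sloth surfing at sunset")
print(len(clip.frames), clip.fps)  # 64 frames at 16 fps -> 4 seconds
```

The appeal of this factorization, as the post frames it, is that the same two models cover all three input modes: a text-only prompt goes through both steps, while an image (with or without text) can skip straight to the animation step.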
What is Emu Edit?
This one should allow “precise image editing” via recognition and generation tasks. As Meta notes, editing with generative AI is often a multi-step process, not a single task.
“Emu Edit is capable of free-form editing through instructions, encompassing tasks such as local and global editing, removing and adding a background, color and geometry transformations, detection and segmentation, and more. Current methods often lean towards either over-modifying or under-performing on various editing tasks. We argue that the primary objective shouldn’t just be about producing a ‘believable’ image. Instead, the model should focus on precisely altering only the pixels relevant to the edit request. Unlike many generative AI models today, Emu Edit precisely follows instructions, ensuring that pixels in the input image unrelated to the instructions remain untouched. For instance, when adding the text ‘Aloha!’ to a baseball cap, the cap itself should remain unchanged”, says the Meta team.
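To make the “only the relevant pixels” property concrete, here is a toy illustration. This is not Meta’s method: Emu Edit is a learned model trained on recognition and generation tasks, not an explicit masking pipeline. The sketch below (with an assumed `relevance_mask` and NumPy arrays standing in for images) only demonstrates the guarantee the team describes: output pixels outside the edited region stay identical to the input.

```python
# Illustrative only: this explicit-mask composite is NOT how Emu Edit works
# internally. It just demonstrates the property the Meta team describes --
# pixels unrelated to the instruction remain untouched.

import numpy as np


def apply_precise_edit(image: np.ndarray,
                       edited: np.ndarray,
                       relevance_mask: np.ndarray) -> np.ndarray:
    """Composite an edited image over the original.

    relevance_mask is True only where the instruction applies (say, the
    patch of the cap where 'Aloha!' is written). Everywhere else the
    output is guaranteed to equal the input."""
    return np.where(relevance_mask[..., None], edited, image)


# Tiny demo with random arrays standing in for real images.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, (512, 512, 3), dtype=np.uint8)
candidate = rng.integers(0, 256, (512, 512, 3), dtype=np.uint8)
mask = np.zeros((512, 512), dtype=bool)
mask[100:160, 200:320] = True  # pretend this is where the text goes

result = apply_precise_edit(original, candidate, mask)
assert np.array_equal(result[~mask], original[~mask])  # untouched outside mask
```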
The potential use cases
The road ahead is definitely AI-driven for Meta.
“Although this work is purely fundamental research right now, the potential use cases are clearly evident. Imagine generating your own animated stickers or clever GIFs on the fly to send in the group chat rather than having to search for the perfect media for your reply. Or editing your own photos and images, no technical skills required. Or adding some extra oomph to your Instagram posts by animating static photos. Or generating something entirely new”, the blog post concludes.