Meta’s new AI assistant is fueled by your public Facebook and Instagram posts
At Connect 2023, Meta rolled out a slew of AI editing tools and features, introducing AI stickers and image editing capabilities like restyle and backdrop. The showstopper, however, was the unveiling of Meta's new AI assistant, set to join WhatsApp, Messenger, and Instagram in the coming days. But the method used to train this assistant might not sit well with everyone.
According to Reuters, Meta used public Facebook and Instagram posts to train portions of its new Meta AI virtual assistant. In an interview, the company's top policy executive assured Reuters that they excluded private posts shared exclusively with family and friends to respect consumers' privacy.
Nick Clegg, Meta's President of Global Affairs, said that private chats on messaging services were also excluded from the training data, and that Meta took steps to filter private details from the public datasets it used. Clegg said that Meta "tried to exclude datasets that have a heavy preponderance of personal information"; LinkedIn, for instance, was deliberately omitted due to privacy concerns. He added that the "vast majority" of the data Meta used for training was publicly available.
Meta developed the assistant using a custom model based on the Llama 2 large language model, publicly released in July, and a new model named Emu, designed for generating images in response to text prompts. The product is set to produce text, audio, and imagery, accessing real-time information through a partnership with Microsoft's Bing search engine.
Public Facebook and Instagram posts, containing both text and photos, played a role in training Meta AI. Emu focused on image generation, while chat functions were based on Llama 2, enhanced with publicly available and annotated datasets. Clegg said that safety restrictions were implemented to prevent the creation of photo-realistic images of public figures.
Addressing concerns about copyrighted materials, Clegg said he expects litigation over whether creative content is covered by the existing fair use doctrine. Meta believes it is, but Clegg acknowledged that the question may be settled in court.
Clegg's remarks come amidst criticism directed at tech companies, including Meta, OpenAI, and Google, for using internet-scraped information without proper authorization to train their AI models. These models ingest massive amounts of data to summarize information and generate imagery.