Meta releases an AI generator that creates music and audio from text
Recently, Meta and Microsoft joined forces to introduce Llama 2, a next-generation large language model. Mark Zuckerberg's company has also been working on several generative AI tools for Instagram, including one that helps identify AI-generated content. Such a tool might be more needed than we thought, as Meta has now introduced its latest project.
In a blog post, Meta introduced its latest AI tool, AudioCraft, which, according to the company, generates high-quality, realistic audio and music from text. Meta says the tool could help, for example, "a small business owner add a soundtrack to their latest video ad on Instagram with ease."
So this might also mean no more browsing through different songs for hours before uploading a Reel. You might just have to write down what type of music you need, and the AI tool will generate it. Not sure how artists would feel about that, though.
AudioCraft has not yet rolled out on any of Meta's social media platforms, but it may be only a matter of time before the AI tool becomes another feature we use daily. For now, Meta is releasing AudioCraft as open-source code. The company says the goal is to let researchers and practitioners train their own models on their own datasets and to help advance the field of AI-generated audio and music.
The tech giant also says it is excited to see the creative results people will produce with its method. You can already listen to hundreds of samples the AI generated, ranging from '80s disco and jazz instrumentals to, for example, a man speaking while a crowd cheers in the background.
AudioCraft is a collection of three models: MusicGen, AudioGen, and an improved version of EnCodec. MusicGen is a text-to-music generation model. It was trained on around 400,000 recordings, together with their text descriptions and metadata, amounting to 20,000 hours of music owned by Meta or licensed specifically for this purpose.
Image credit: Meta
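For anyone who wants to try the open-source release, the sketch below shows a minimal MusicGen session in Python, following the usage examples published in the facebookresearch/audiocraft repository at the time of writing. The checkpoint name ("facebook/musicgen-small") and parameters are taken from those examples and may change in future releases.

```python
# Minimal MusicGen sketch, based on the examples in the
# facebookresearch/audiocraft repository (pip install audiocraft).
# Checkpoint names and defaults may differ across releases.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load a pretrained checkpoint; the "small" variant keeps memory needs modest.
model = MusicGen.get_pretrained("facebook/musicgen-small")

# Ask for roughly 8 seconds of audio per prompt.
model.set_generation_params(duration=8)

descriptions = ["80s disco with a driving bassline", "calm jazz instrumental"]

# generate() returns a batch of waveforms, one per text description.
wavs = model.generate(descriptions)

for i, wav in enumerate(wavs):
    # Writes clip_0.wav, clip_1.wav, ... with loudness normalization.
    audio_write(f"clip_{i}", wav.cpu(), model.sample_rate, strategy="loudness")
```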
AudioGen is an AI model capable of text-to-audio generation. Given a written description of an acoustic scene, it can produce realistic environmental sounds that match the description, complete with complex scene context and lifelike recording conditions. The improved EnCodec decoder, the third piece of the collection, enables higher-quality generation with fewer artifacts.
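AudioGen is exposed through the same interface in the audiocraft package, so a sound-effects sketch looks nearly identical. Again, the checkpoint name ("facebook/audiogen-medium") is the one listed in the repository and may change over time.

```python
# AudioGen shares MusicGen's API surface in the audiocraft package.
from audiocraft.models import AudioGen
from audiocraft.data.audio import audio_write

model = AudioGen.get_pretrained("facebook/audiogen-medium")
model.set_generation_params(duration=5)  # about 5 seconds per sample

# Environmental-sound prompts rather than musical ones.
wavs = model.generate(["a man speaking while a crowd cheers in the background"])

audio_write("scene_0", wavs[0].cpu(), model.sample_rate, strategy="loudness")
```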
According to Meta, "responsible innovation can't happen in isolation." The tech giant also says that its models' training datasets lack diversity, especially in terms of music styles and language. By sharing the code for AudioCraft, Meta aims to enable other researchers to test new methods to reduce bias and misuse in generative models.