Meta is aggressively ramping up its artificial intelligence efforts in a bid to catch up to rivals like Google, Microsoft, and OpenAI. The social media giant has introduced a new text-to-image model called CM3leon that it claims achieves state-of-the-art performance for generating images from text prompts. But it’s not yet available for testing or commercial use.
CM3leon marks a breakthrough for Meta’s AI capabilities. The model can not only generate high-fidelity images from text descriptions, but also write coherent captions for existing images. This lays the groundwork for more advanced image understanding models in the future.
Meta is leveraging its formidable data science team and computing infrastructure to advance state-of-the-art models like CM3leon. While diffusion-based generators like Midjourney have grabbed headlines, Meta is betting on an autoregressive transformer architecture (the same family of technology behind ChatGPT). The company claims CM3leon needs five times less training compute than comparable transformer-based methods.
In head-to-head comparisons, CM3leon appears to handle complex objects and constraints in text prompts better than models like OpenAI’s DALL-E 2, and even Midjourney. Images shared by Meta show that its new text-to-image generator can accurately represent human anatomy (no more spaghetti hands) and can even render legible text (no more random words in AI images).
CM3leon also provides advanced features that could let users create more accurate representations of their ideas: text-to-image, image-to-image, structure-guided image editing, object-to-image, segmentation-to-image, and super-resolution upscaling. Together, these are features otherwise available only in Stable Diffusion with ControlNet.
Rumors of a new LLM
Meta is also reportedly planning to release a commercial version of its LLaMA large language model to outside developers, according to sources cited by the Financial Times. If true, this would allow startups and enterprises to build custom applications powered by Meta’s AI, putting the social media behemoth in direct competition with ChatGPT (OpenAI-Microsoft), Bard (Google), and Claude 2 (Anthropic-Google).
Meta’s focus seems to be pivoting strongly towards AI across all its apps, even though the company says it remains heavily focused on its metaverse projects. Earlier this year, the company set up a dedicated generative AI unit led by Chief Product Officer Chris Cox. Meta is also working on AI tools that generate better-targeted ads.
By opening up key models like LLaMA, whose weights leaked online and which is widely regarded as one of the most capable openly available LLMs, Meta aims to catalyze innovation from developers worldwide to improve the technology. This contrasts with the closed-off approach of competitors like OpenAI. However, monetization of Meta’s models remains a possibility down the line.
The flurry of AI activity comes as Meta struggles with sinking stock value and controversies around privacy and misinformation stemming from activity on Facebook, which remains the company’s biggest platform. Meta CEO Mark Zuckerberg believes that this heavy investment in generative AI aligns with the company’s vision for the metaverse and could open up new revenue streams.
Meta also recently launched Threads, a Twitter clone that is seeing rapid user growth, outpacing the growth OpenAI saw after the launch of ChatGPT. Meta has also proven adept at taking key elements of earlier technologies, improving them, and building successful products that nearly beat competitors on their own turf.
With new models like CM3leon showing promising performance, Meta seems determined to aggressively pursue AI to reshape its future, after leaving investors unimpressed with its metaverse endeavors. The race to lead generative AI just got a new runner.