- Meta uses public photos and text from Instagram and Facebook to train an AI text-to-image generator.
- Meta executive Chris Cox told Bloomberg Tech Summit that the company “doesn't train on private matters.”
- The chief product officer's comments come as Big Tech companies race to acquire data to train AI models.
Big Tech companies are vying for AI training data, and Meta appears to have a significant advantage over its rivals: the photos and text shared publicly on Instagram and Facebook.
Chris Cox, Meta's chief product officer, said Thursday at the Bloomberg Tech Summit that the company uses publicly available photos and text from its platforms to train a text-to-image generative model called Emu.
“We don't train on things that are private. We don't train on things that people share with their friends. We train on things that are public,” he said.
Because Instagram has a lot of photos "of art, fashion, culture," and not just images of people, Meta's text-to-image model can produce "really amazing quality images," Cox added.
Users can create images on Meta AI by entering a prompt that starts with the word "imagine," and the tool generates four images in response, according to the website.
For an AI model to be effective, it must be trained on data, and where that data comes from is contentious: there is currently no reliable way to prevent copyrighted content from being scraped from the internet and used to train large language models.
The US Copyright Office has been studying the issue since early last year and is considering amendments to copyright law to address it.
One way companies are trying to acquire data is by partnering with other firms. OpenAI, for example, has struck licensing deals with several media outlets to use their content in developing its models.
Meta also considered acquiring publisher Simon & Schuster to get more data to train its models, The New York Times reported last month.
In addition to raw datasets, companies train models with "feedback loops": data collected from past interactions and outputs is analyzed to improve future performance, and algorithms signal the model when it makes an error so it can learn from the mistake.
Meta CEO Mark Zuckerberg told The Verge last month that feedback loops are “more valuable” than any “upfront corpus.”
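The feedback-loop idea described above can be sketched in miniature: a model's mistakes on past outputs become new training signal. Below is a minimal, hypothetical illustration in Python; the model, the interaction log, and the perceptron-style update rule are all invented for the example and are not Meta's actual system.

```python
# Minimal sketch of a training feedback loop (hypothetical example, not
# Meta's actual system): the model makes a prediction, feedback from past
# interactions flags errors, and each error becomes a training signal.

def predict(weights, features):
    """Score a feature vector; a positive score means class 1."""
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0

def feedback_loop(weights, interactions, lr=0.1):
    """Update weights from (features, user_label) pairs collected from
    past interactions -- the 'feedback loop'."""
    for features, user_label in interactions:
        prediction = predict(weights, features)
        error = user_label - prediction  # nonzero only when the model was wrong
        if error != 0:
            # Perceptron-style update: nudge weights toward the feedback.
            weights = [w + lr * error * x for w, x in zip(weights, features)]
    return weights

# Simulated interaction log: feature vectors plus the label users implied.
log = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1)]
trained = feedback_loop([0.0, 0.0], log)
```

Only the first interaction triggers an update here; after it, the model already agrees with the remaining feedback, which is the sense in which the loop "learns from errors" rather than from every example.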
Meta did not immediately respond to Business Insider's request for comment outside of normal business hours.