{"id":2634,"date":"2024-08-05T10:30:00","date_gmt":"2024-08-05T10:30:00","guid":{"rendered":"http:\/\/mobiledave.me\/?p=2634"},"modified":"2024-08-07T00:21:18","modified_gmt":"2024-08-07T00:21:18","slug":"its-practically-impossible-to-run-a-big-ai-company-ethically","status":"publish","type":"post","link":"http:\/\/mobiledave.me\/index.php\/2024\/08\/05\/its-practically-impossible-to-run-a-big-ai-company-ethically\/","title":{"rendered":"It\u2019s practically impossible to run a big AI company ethically"},"content":{"rendered":"
Anthropic was supposed to be the good AI company. The ethical one. The safe one.

It was supposed to be different from OpenAI, the maker of ChatGPT. In fact, all of Anthropic's founders once worked at OpenAI but quit in part because of differences over safety culture there, and moved to spin up their own company that would build AI more responsibly.

Yet lately, Anthropic has been in the headlines for less noble reasons: It's pushing back on a landmark California bill to regulate AI. It's taking money from Google and Amazon in a way that's drawing antitrust scrutiny. And it's being accused of aggressively scraping data from websites without permission, harming their performance.

What's going on?

The best clue might come from a 2022 paper written by the Anthropic team back when their startup was just a year old. They warned that the incentives in the AI industry – think profit and prestige – will push companies to "deploy large generative models despite high uncertainty about the full extent of what these models are capable of." They argued that, if we want safe AI, the industry's underlying incentive structure needs to change.

Well, at three years old, Anthropic is now the age of a toddler, and it's experiencing many of the same growing pains that afflicted its older sibling OpenAI. In some ways, they're the same tensions that have plagued all Silicon Valley tech startups that start out with a "don't be evil" philosophy. Now, though, the tensions are turbocharged.

An AI company may want to build safe systems, but in such a hype-filled industry, it faces enormous pressure to be first out of the gate. The company needs to pull in investors to supply the gargantuan sums of money needed to build top AI models, and to do that, it needs to satisfy them by showing a path to huge profits. Oh, and the stakes – should the tech go wrong – are much higher than with almost any previous technology.

So a company like Anthropic has to wrestle with deep internal contradictions, and ultimately faces an existential question: Is it even possible to run an AI company that advances the state of the art while also truly prioritizing ethics and safety?

"I don't think it's possible," futurist Amy Webb, the CEO of the Future Today Institute, told me a few months ago.

If even high-minded Anthropic is becoming an object lesson in that impossibility, it's time to consider another option: The government needs to step in and change the incentive structure of the whole industry.

The incentive to keep building and deploying AI models

Anthropic has always billed itself as a safety-first company. Its leaders say they take catastrophic or existential risks from AI very seriously. CEO Dario Amodei has testified before senators, making the case that AI models powerful enough to "create large-scale destruction" and upset the international balance of power could come into being as early as 2025. (Disclosure: One of Anthropic's early investors is James McClave, whose BEMC Foundation helps fund Future Perfect.)

So you might expect that Anthropic would be cheering a bill introduced by California state Sen. Scott Wiener (D-San Francisco), the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act, also known as SB 1047. That legislation would require companies training the most advanced and expensive AI models to conduct safety testing and maintain the ability to pull the plug on the models if a safety incident occurs.

But Anthropic is lobbying to water down the bill. It wants to scrap the idea that the government should enforce safety standards before a catastrophe occurs. "Instead of deciding what measures companies should take to prevent catastrophes (which are still hypothetical and where the ecosystem is still iterating to determine best practices)," the company urges, "focus the bill on holding companies responsible for causing actual catastrophes."

In other words, take no action until something has already gone terribly wrong.

In some ways, Anthropic seems to be acting like any for-profit company would to protect its interests. Anthropic has not only economic incentives – to maximize profit, to offer partners like Amazon a return on investment, and to keep raising billions to build more advanced models – but also a prestige incentive to keep releasing more advanced models so it can maintain a reputation as a cutting-edge AI company.

This comes as a major disappointment to safety-focused groups, which expected Anthropic to welcome – not fight – more oversight and accountability.

"Anthropic is trying to gut the proposed state regulator and prevent enforcement until after a catastrophe has occurred – that's like banning the FDA from requiring clinical trials," Max Tegmark, president of the Future of Life Institute, told me.

The US has enforceable safety standards in industries ranging from pharma to aviation. Yet tech lobbyists continue to resist such regulations for their own products. Just as social media companies did years ago, they make voluntary commitments to safety to placate those concerned about risks, then fight tooth and nail to stop those commitments from being turned into law.

In what he called "a cynical procedural move," Tegmark noted that Anthropic has also introduced amendments to the bill that touch on the remit of every committee in the legislature, thereby giving each committee another opportunity to kill it. "This is straight out of Big Tech's playbook," he said.

An Anthropic spokesperson told me that the current version of the bill "could blunt America's competitive edge in AI development" and that the company wants to "refocus the bill on frontier AI safety and away from approaches that aren't adaptable enough for a rapidly evolving technology."

The incentive to gobble up everyone's data

Here's another tension at the heart of AI development: Companies need to hoover up reams and reams of high-quality text from books and websites in order to train their systems. But that text is created by human beings, and human beings generally do not like having their work used without their consent.

All major AI companies scrape publicly available data to use in training, a practice they argue is legally protected under fair use. But scraping is controversial, and it's being challenged in court. Famous authors like Jonathan Franzen and media companies like the New York Times have sued OpenAI for copyright infringement, saying that the AI company lifted their writing without permission. This is the kind of legal battle that could end up remaking copyright law, with ramifications for all AI companies. (Disclosure: Vox Media is one of several publishers that has signed partnership agreements with OpenAI. Our reporting remains editorially independent.)

What's more, data scraping violates some websites' terms of service. YouTube says that training an AI model using the platform's videos or transcripts is a violation of the site's terms. Yet that's exactly what Anthropic has done, according to a recent investigation by Proof News.

Web publishers and content creators are angry. Matt Barrie, chief executive of Freelancer.com, a platform that connects freelancers with clients, said Anthropic is "the most aggressive scraper by far," swarming the site even after being told to stop. "We had to block them because they don't obey the rules of the internet. This is egregious scraping [that] makes the site slower for everyone operating on it and ultimately affects our revenue."

Dave Farina, the host of the popular YouTube science show Professor Dave Explains, told Proof News that "the sheer principle of it" is what upsets him. Some 140 of his videos were lifted as part of the dataset that Anthropic used for training. "If you're profiting off of work that I've done [to build a product] that will put me out of work, or people like me out of work, then there needs to be a conversation on the table about compensation or some kind of regulation," he said.

Why would Anthropic take the risk of using lifted data from, say, YouTube, when the platform has explicitly forbidden it and copyright infringement is such a hot topic right now?

Because AI companies need ever more high-quality data to continue boosting their models' performance. Using synthetic data, which is created by algorithms, doesn't look promising. Research shows that letting ChatGPT eat its own tail leads to bizarre, unusable output. (One writer coined a term for it: "Habsburg AI," after the European royal house that famously devolved over generations of inbreeding.) What's needed is fresh data created by actual humans, but it's becoming harder and harder to harvest that.
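
That degradation is easy to see in miniature. The sketch below is only a toy illustration, not the setup of the research cited above: it fits a simple one-dimensional model to some data, samples a new "training set" from the fit, and repeats. A rare mode in the original data all but vanishes after a single synthetic generation and never comes back.

```python
# Toy sketch of model collapse: each generation is "trained" only on samples
# drawn from the previous generation's fitted model. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# "Human" data: mostly centered at 0, plus a rare mode near +8.
data = np.concatenate([rng.normal(0, 1, 9500), rng.normal(8, 0.5, 500)])

for generation in range(6):
    rare_mass = np.mean(data > 6)  # how much of the rare mode survives?
    print(f"generation {generation}: share of data above 6 = {rare_mass:.4f}")
    mu, sigma = data.mean(), data.std()      # fit a single Gaussian to the current data
    data = rng.normal(mu, sigma, data.size)  # the next generation sees only synthetic samples
```

The actual studies use large language models rather than Gaussians, but the mechanism they describe is the same: whatever one generation underrepresents becomes even scarcer in the next.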

Publishers are blocking web crawlers, putting up paywalls, or updating their terms of service to bar AI companies from using their data as training fodder. A new study from the MIT-affiliated Data Provenance Initiative looked at three of the major datasets – each containing millions of books, articles, videos, and other scraped web data – that are used for training AI. It turns out that 25 percent of the highest-quality data in these datasets is now restricted. The authors call it "an emerging crisis of consent." Some AI companies, like OpenAI, have begun to respond in part by striking licensing deals with media outlets, including Vox. But that may only get them so far, given how much remains officially off-limits.
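
The main blocking mechanism here is robots.txt, a plain-text file at the root of a site that lists which crawlers may fetch which pages, and which well-behaved crawlers consult before downloading anything. Here is a minimal sketch of that check in Python; the user-agent string and URL are placeholders, not identifiers any particular company actually uses.

```python
# Sketch of a crawler that consults a site's robots.txt before fetching a page.
# "ExampleAIBot" and the URL below are placeholders, not any real crawler or site.
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleAIBot"

def allowed_to_fetch(page_url: str) -> bool:
    """Return True only if the site's robots.txt permits USER_AGENT to fetch page_url."""
    parts = urlparse(page_url)
    robots = RobotFileParser()
    robots.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    robots.read()  # download and parse the site's robots.txt
    return robots.can_fetch(USER_AGENT, page_url)

print(allowed_to_fetch("https://example.com/some-article"))
```

Honoring that check is entirely voluntary, which is exactly the publishers' complaint: a crawler that ignores robots.txt and the terms of service faces no technical barrier, only legal and reputational risk.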

AI companies could theoretically accept the limits to advancement that come with restricting their training data to what can be ethically sourced, but then they wouldn't stay competitive. So companies like Anthropic are incentivized to go to more extreme lengths to get the data they need, even if that means taking dubious action.

Anthropic acknowledges that it trained its chatbot, Claude, using the Pile, a dataset that includes subtitles from 173,536 YouTube videos. When I asked how it justifies this use, an Anthropic spokesperson told me, "With regard to the dataset at issue in The Pile, we did not crawl YouTube to create that dataset nor did we create that dataset at all." (That echoes what Anthropic has previously told Proof News: "[W]e'd have to refer you to The Pile authors.")

The implication is that because Anthropic didn't make the dataset, it's fine for them to use it. But it seems unfair to shift all the responsibility onto the Pile authors – a nonprofit group that aimed to create an open source dataset researchers could study – if Anthropic used YouTube's data in a manner that violates the platform's terms.

"Companies should probably do their own due diligence. They're using this for commercial purposes," said Shayne Longpre, lead author on the Data Provenance Initiative study. He contrasted that with the Pile's creators and the many academics who have used the dataset to conduct research. "Academic purposes are clearly distinct from commercial purposes and are likely to have different norms."

The incentive to rake in as much cash as possible

To build a cutting-edge AI model these days, you need a ton of computing power – and that's incredibly expensive. To gather the hundreds of millions of dollars needed, AI companies have to partner with tech giants.

That's why OpenAI, initially founded as a nonprofit, had to create a for-profit arm and partner with Microsoft. And it's why Anthropic ended up taking multibillion-dollar investments from Amazon and Google.

Deals like these always come with risks. The tech giants want to see a quick return on their investments and maximize profit. To keep them happy, the AI companies may feel pressure to deploy an advanced AI model even if they're not sure it's safe.

The partnerships also raise the specter of monopolies – the concentration of economic power. Anthropic's investments from Google and Amazon led to a probe by the Federal Trade Commission and are now drawing antitrust scrutiny in the UK, where a consumer regulatory agency is investigating whether there's been a "relevant merger situation" that could result in a "substantial lessening of competition."

An Anthropic spokesperson said the company intends to cooperate with the agency and give it a full picture of the investments. "We are an independent company and none of our strategic partnerships or investor relationships diminish the independence of our corporate governance or our freedom to partner with others," the spokesperson said.

Recent experience, though, suggests that AI companies' unique governance structures may not be enough to prevent the worst.

Unlike OpenAI, Anthropic has never given either Google or Amazon a seat on its board or any observation rights over it. But, very much like OpenAI, Anthropic is relying on an unusual corporate governance structure of its own design. OpenAI initially created a board whose idealistic mission was to safeguard humanity's best interests, not please stockholders. Anthropic has created an experimental governance structure, the Long-Term Benefit Trust, a group of people without financial interest in the company who will ultimately have majority control over it, as they'll be empowered to elect and remove three of its five corporate directors. (This authority will phase in as the company hits certain milestones.)

But there are limits to the idealism of the Trust: It must "ensure that Anthropic responsibly balances the financial interests of stockholders with the interests of those affected by Anthropic's conduct and our public benefit purpose." Plus, Anthropic says, "we have also designed a series of 'failsafe' provisions that allow changes to the Trust and its powers without the consent of the Trustees if sufficiently large supermajorities of the stockholders agree."

And if we learned anything from last year's OpenAI boardroom coup, it's that governance structures can and do change. When the OpenAI board tried to safeguard humanity by ousting CEO Sam Altman, it faced fierce pushback. In a matter of days, Altman clawed his way back into his old role, the board members who'd fired him were out, and the makeup of the board changed in Altman's favor. What's more, OpenAI gave Microsoft an observer seat on the board, which allowed it to access confidential information and perhaps apply pressure at board meetings. Only when that raised (you guessed it) antitrust scrutiny did Microsoft give up the seat.

"I think it showed that the board does not have the teeth one might have hoped it had," Carroll Wainwright, who quit OpenAI this year, told me. "It made me question how well the board can hold the organization accountable."

That's why he and several others published a proposal demanding that AI companies grant them "a right to warn about advanced artificial intelligence." Per the proposal: "AI companies have strong financial incentives to avoid effective oversight, and we do not believe bespoke structures of corporate governance are sufficient to change this."

It sounds a lot like what another figure in AI told Vox last year: "I am pretty skeptical of things that relate to corporate governance because I think the incentives of corporations are horrendously warped, including ours." Those are the words of Jack Clark, the policy chief at Anthropic.

If AI companies won't fix it, who will?

The Anthropic team had it right originally, back when they published that paper in 2022: The pressures of the market are just too brutal. Private AI companies do not have the motivation to change that, so the government needs to change the underlying incentive structure within which all these companies operate.

When I asked Webb, the futurist, what a better AI business ecosystem could look like, she said it would include a mix of carrots and sticks: positive incentives, like tax breaks for companies that prove they're upholding the highest safety standards; and negative incentives, like regulation that would fine companies if they deploy biased algorithms.

With AI regulation at a standstill at the federal level – plus a looming election – it's falling to states to pass new laws. The California bill, if it passes, would be one piece of that puzzle.

Civil society also has a role to play. If publishers and content creators are not happy about having their work used as training fodder, they can fight back. If tech workers are worried about what they see at AI companies, they can blow the whistle. AI can generate a whole lot on our behalf, but resistance to its own problematic deployment is something we have to generate ourselves.
