Sadly the NYT lawsuit, however well-intentioned and justified the Times might believe it to be, will only further muddy the waters and inhibit AI’s evolution. Good for the status quo, yes, but is it good for the future of the publishing industry in particular and society in general?
A ragbag of outraged authors and other creatives have been suing AI companies this past year, alleging copyright theft and IP infringement, but without any significant progress. Legal arguments have been rebuffed by irritated judges, who at best have told the litigants to go away and rethink their strategies.
That’s often the case with ambulance-chasing class actions that invite anyone and his aggrieved dog to sign up to a game of legal hoopla, frequently with the flimsiest of legal arguments and a whole lot of finger-crossing.
Throw legal spaghetti at the wall and hope some of it sticks. And that, of course, is the principle behind no-up-front-fees class actions. Get as many people on board with as many class actions as possible, and it takes just one big success to cover all the losses.
Great for lawyers. Great for John Grisham’s publishers!
But 2023 was not a good year for the lawyers backing the AI-is-the-bad-guy brigade. Either the law firms were as bad as their soundbite claims sounded, or the law itself does not give authors and other creatives the protection they seem to believe it does.
The law, of course, already protects works from being willfully reproduced and distributed. It may not be perfect, but it’s adequate, if outdated.
But that’s not what the big AI companies stand accused of, and we should be absolutely clear that reports of individual authors abusing AI to copy and redistribute protected works, or to ride on the names of established authors, describe just that: abuse.
It’s no more the fault of the AI company than a drunk driver killing someone in a car accident is the fault of the motor company.
Much of the nonsense litigation being farmed out by the 2023 class-action ambulance chasers is of that nature. One action even contends that AI is a threat to authors’ careers and that this is in some way illegal.
Argued one law firm earlier this year: “GPT-3.5 and GPT-4 are not just an infringement of authors’ rights; whether they aim to or not, models such as this will eliminate ‘author’ as a viable career path.”
But amid all the smoke and mirrors legal spaghetti-throwing is a claim that certainly bears close scrutiny in the courts: Have the AI companies broken the law by using copyrighted content to “train” their AI models, and are they liable to compensate authors and publishers for that training?
That is at the heart of the lawsuit filed in federal court this week by the New York Times against OpenAI and Microsoft, and the final ruling, probably by the US Supreme Court once the losing side appeals the earlier decisions, may cause immense damage to the future of the gen-AI industry that will in turn harm the publishing industry.
The New York Times asserts that the AI large language models operated by OpenAI and Microsoft “were built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more.”
Further, submits the NYT, the defendants “seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment.”
The NYT’s legal submission includes “scores of examples” where AI “recites Times content verbatim, closely summarizes it, and mimics its expressive style.”
It is further asserted that the Bing search engine “copies and categorizes The Times’s online content, to generate responses that contain verbatim excerpts and detailed summaries of Times articles that are significantly longer and more detailed than those returned by traditional search engines.”
Now I’m no lawyer, but I’m struggling to envisage how “responses that contain verbatim excerpts and detailed summaries of Times articles that are significantly longer and more detailed than those returned by traditional search engines” is necessarily unlawful.
The argument itself recognises that verbatim excerpts and summaries are the norm for “traditional search engines”, and we must logically assume, absent any reference to previous case law, that the NYT has not been too bothered by this.
But this has to be more than just about how long an excerpt or summary can be within the bounds of existing fair use law. And of course, it is. It’s about permission and compensation, and as I’ve argued many times in my essays on AI, this is the one area where the publishing industry does have grounds for being extremely pissed off.
But the NYT’s heavy-handed legal action will likely not have a happy ending for any party.
The NYT argues that by using NYT content “without The Times’s permission or authorization (the LLMs) undermine and damage The Times’s relationship with its readers and deprive The Times of subscription, licensing, advertising, and affiliate revenue.”
The legal brief goes into detail about how Microsoft and OpenAI have used the success of ChatGPT to add billions to their company values. No, make that a trillion. And it argues that New York Times content played a key role in those valuations.
“Microsoft’s deployment of Times-trained LLMs throughout its product line helped boost its market capitalization by a trillion dollars in the past year alone. And OpenAI’s release of ChatGPT has driven its valuation to as high as $90 billion.”
Be serious. Yes, the AI elements have played a part in boosting market capitalisation (as ever, apologies for reverting to British English when not quoting US texts), but so have a lot of other things. And clearly the NYT is getting free advertising from the published AI responses if they are crediting the NYT for the material they quote or summarise. This is not a one-way street.
In the legal summary (not created by AI), it is acknowledged that the NYT attempted to negotiate with the defendants to seek compensation for NYT material used, but that the defendants argued their use was within the realms of “fair use”.
“Defendants insist that their conduct is protected as ‘fair use’ because their unlicensed use of copyrighted content to train GenAI models serves a new ‘transformative’ purpose.”
The NYT of course disagrees. And from here we move to the law courts.
The NYT lawyers are not inclined to frivolous litigation and will have carefully considered their arguments and the likely counter-arguments, yet still believe they have a case. On the other side, the OpenAI and Microsoft lawyers have clearly decided that existing legal interpretations of fair use put the law firmly on their side.
Looking over the NYT submission as a layman, I’d suggest there are clear instances where the NYT might be on to something, and that the fair use rule has likely been pushed beyond its intended limits. But ultimately a court – or rather, several courts culminating in the Supreme Court – will decide. Which means a final decision is many years in the future.
Which is semi-good news for everyone, because it means AI companies can, for now, continue to develop their models, although they will be wary of a future ruling against them and their ilk that may one day land them with crippling compensation and damages claims.
But the road to that decision is bumpy, with hidden twists and turns.
Just this month OpenAI landed a three-year, multi-million-euro deal with German publisher Axel Springer that allows OpenAI to use content for training purposes from Axel Springer publications including Politico, Business Insider, Bild and Die Welt, and also to summarise that content in its ChatGPT responses.
CNBC called the deal “unprecedented”, clearly unaware that Associated Press signed a deal with OpenAI back in July of this year.
And this is where things get interesting, because the NYT has clearly stated in its submission to the court that, “For months, The Times has attempted to reach a negotiated agreement with Defendants, in accordance with its history of working productively with large technology platforms to permit the use of its content in new digital products (including the news products developed by Google, Meta, and Apple). These negotiations have not led to a resolution.”
The problem here being that OpenAI has already come to agreement with two other major publishing entities, which raises the question of which of the NYT’s demands the defendants found unacceptable.
That will be even more of an issue if OpenAI comes to an agreement with Apple before this NYT court case gets heard.
We might reasonably assume that German law differs from US law, and that this might explain the deal with Axel Springer, but it would not explain the deal with the New York-based AP.
And while for me that’s purely an issue of piqued curiosity, for the courts it may be decisive.
But as above, the final decision is some years away, unless one of the class actions already underway reaches a final decision sooner.
Meanwhile publishers will continue to use AI, some much more than others, and many will continue to hide behind mostly soft-soap sentiments about how important authors and other creatives are to them, so they can sit on the fence and avoid difficult decisions.
And sadly the NYT lawsuit, however well-intentioned and justified the Times might believe it to be, will only further muddy the waters and inhibit AI’s evolution.
Good for the status quo, yes, but is it good for the future of the publishing industry in particular and society in general?
Clearly not. The advantages of AI are clearer with each passing day, across every walk of life.
The AI genie cannot and will not be put back in the bottle.