IP laws already exist. They exist to protect intellectual property, not to protect creator careers. Creator careers are carved by providing content the consumer desires. There are no job guarantees, for authors or anyone else. And the courts are not the place to try to protect author career paths.


The Bookseller reports that the UK’s Society of Authors has welcomed legal action US authors have taken against OpenAI over alleged copyright infringement, but this case is not just about IP protection.

The San Francisco Federal Court class action case alleges OpenAI’s ChatGPT and Google’s BERT were trained using thousands of books scraped from self-publishing platform Smashwords, now owned by Draft2Digital.

The key issue here appears to be permission, as the AI training process supposedly involves copying Smashwords-hosted texts to train on – “the training dataset” – and then deleting the copy.

And straight away we are into legal grey areas. Per The Bookseller’s report by Lauren Brown, Nicola Solomon, CEO of the Society of Authors, stated, “We are very pleased to see that authors are suing OpenAI over their works being used as training materials for ChatGPT” (having) “long been concerned at the wholesale copying of authors’ and illustrators’ work to train large language models”.

This issue of “training” is one I dealt with a week ago.

But the issue here is alleged copyright infringement, so let’s stick with that.

Per Brown, Solomon explained that “AI learns by accessing content and copying it briefly before deleting it, after which the system will remember what it has learned. If the copying takes place without the permission of the copyright holder, she added, it is an infringement of copyright under UK law even though the copy is held only briefly by the AI before being deleted.”

Assuming US law concurs, there is a potential legal liability here regarding copying, but the issue of “training” remains unresolved. Clearly the complaint turns on this brief copying exercise, and there is no broader allegation that the AI bots copied the original texts into new works that were then passed off as original creations.

And while this copying may or may not be happening as here described, the evidence presented so far is ambiguous, to say the least.

Evidence cited includes ChatGPT, when prompted, being able to accurately summarise a number of books found on the Smashwords platform, “something only possible if ChatGPT was trained on plaintiffs’ copyrighted works,” according to the lawyer.

But of course, that’s not the “only possible” explanation – the bot may have found a summary elsewhere, or put together a summary from disparate sources including reviews and product pages (one assumes if the book is on Smashwords it is being distributed to numerous outlets) which the AI bots will have access to at some level.

My reading of the case thus far is that the Smashwords scraping allegations are broadly not without substance, pending an outright denial by the bot owners. But without a clear admission from those owners, pinpointing a bot-generated summary to one single source may prove a challenge in itself, let alone the wider copyright issues being raised.

Here’s the thing: Type “book summaries” into Amazon and countless published books will appear that purport to summarise well-known published books, and that go well beyond the 400-word ChatGPT summary being evidenced in court.

These books appear to rely on “fair use” clauses in copyright law.

Let me share one example:

Do you wish you could read great books faster? The Great Books Fast series is here to help!

Grab your copy of this summary of Homo Deus: A Brief History of Tomorrow by Yuval Noah Harari now and get the author’s valuable insights faster while saving yourself hours of reading time!

“This in-depth, chapter by chapter summary allows you to quickly read the highlights and key points of Homo Deus: A Brief History of Tomorrow by Yuval Noah Harari in minutes instead of hours. Published February 2017, Homo Deus takes New York Times Bestselling author Yuval Noah Harari’s previous work Sapiens even further, discussing in detail humanity’s future and our quest to turn humans into gods.”

Clearly the original book is still in copyright, its author is a major best-seller, and the book is backed by a major publisher.

Type in “Summaries of Stephen King books” and get countless summaries of the horror-master’s works, all openly sold on Amazon, some for a decade or more.

Bottom line is I can legally write and publish a summary of every book I have ever read, and no-one is going to scream IP infringement, regardless of whether I paid for the book, borrowed it from a library, bought it second-hand or found it left on a train. And of course in the latter two instances there is no question of the author and publisher being remunerated for my use of that book.

All the summaries referenced above were no doubt produced by humans, not bots, and that ultimately, beyond the alleged copy-and-delete process, is what this legal case boils down to.

This is not about copyright infringement per se, but about AI doing what humans are already doing, only much faster.

Yes, that’s unnerving. Early chess computers were unpopular because the computer responded with a move before the human players could take their finger off the button. Unnerving? Downright scary if you’re a serious chess player. Chess computer manufacturers got round this by creating an artificial delay in the computer’s response.

Perhaps if ChatGPT and co delivered their results a bit more hesitantly it might worry authors and other creatives less, and allow us to focus on the big question of whether copyright infringement has taken place.

Clearly, summarising a book is not copyright infringement, and if the case comes down to whether ChatGPT or any other bot had permission to scrape Smashwords, then this is a weak legal issue and is solely about permission from the platform.

This is a slapped-wrist, sent-to-bed-without-supper case, not a case on which the future of the publishing industry pivots. Yet that is what many parties are trying to imply.

The law firm Joseph Saveri LLP makes clear there is a wider agenda here that really has no place in the law courts, and this is what disturbs me so much about this and similar alleged copyright cases.

IP infringement is a crime. Period. Laws are already in place to protect IP. Individual authors and small publishers might not have the wherewithal to take legal action against a company that owns an AI bot, of course, and that is where class actions like this come into their own. But they need to stick to the legal remit, not play to the audience.

Per Saveri, “it’s critical that we recognise and protect the rights of authors such as these against unlawful theft and fraud.”

Go for it, Joseph. Prove it happened, let the Judge deal the punishment, and let’s move on.

But Saveri’s agenda is just beginning.

“GPT-3.5 and GPT-4 are not just an infringement of authors’ rights; whether they aim to or not, models such as this will eliminate ‘author’ as a viable career path. This case represents a larger fight for preserving ownership rights for all artists and other creators.”

No. Ownership rights are already preserved in existing law. If existing laws are insufficient, that is for the lawmakers to address, not the courts.

In what country are there legal grounds for objecting to AI because it will – Saveri’s words – “eliminate ‘author’ as a viable career path”?

With that one meaningless gesture, Saveri totally undermines the legitimate case that may exist regarding scraping Smashwords, and turns this into a Luddite battle for the return to steam engines and quill pens.

Yes, AI is a threat to author jobs. Of course it is. In the same way as email was a threat to the postman’s job and the motor car a threat to the horse-drawn carriage industry, or TV a threat to Hollywood.

But society evolves by embracing new technologies and new opportunities, not clinging to the past because someone’s job depends on it. If that were the case we would still be lighting our homes with candles, doing our international business via sailing ship, and you certainly would not be reading this on a screen.

IP laws already exist. They exist to protect intellectual property, not to protect creator careers. Creator careers are carved by providing content the consumer desires. There are no job guarantees, for authors or anyone else.

And the courts are not the place to try to protect author career paths. Joseph Saveri should stick to its primary remit, and challenge the AI companies on perceived breaches of existing law.