Lately we’ve been seeing lawsuits popping up claiming that, by training algorithms on their content, the resulting AI has somehow infringed the copyright of the original works.
Authors Sarah Silverman, Christopher Golden, and Richard Kadrey are suing OpenAI and Meta over claims of copyright infringement.
The suits allege, among other things, that OpenAI’s ChatGPT and Meta’s LLaMA were trained on illegally acquired datasets containing their works.
I have sympathy for this position. On a visceral level, I can see how the idea that an AI capable of generating “original” ideas might have used their content to achieve its current form of pseudo-sentience could be disturbing.
I actually had one acquaintance on Twitter/X refer to what AI is doing as “evil.” While I understand that extreme reactions are the natural currency of social media, I’m equally convinced that once you try to assign moral values to technology, you’ve already lost the argument.
But no matter how you slice it, the belief that AI training on existing content is equivalent to someone “stealing your stuff” is wrong. And I believe it’s wrong on three levels. So let’s discuss it here!
- Inspiration starts with imitation.
New technology is always going to be built on the knowledge base of existing content. And as creatives we all stand on the shoulders of the giants who came before us. Since the first days we put pigment onto cave walls, humans have copied the work of those who inspired them or found new ways to push their medium forward. Whether you’re writing a book, drawing, or creating music, on some level you’re remixing your past and moving the state of the art forward from there.
Some of the best advice I ever got as a young author was to start writing a book by trying to copy something you love. A few paragraphs in, you’d find yourself creating something completely new. It’s hard not to see AI as doing the same thing. As strange as it is to think that a machine can be “inspired” by anything, there’s no doubt that this is at least part of what’s going on here.
I’ve seen people argue that the fact that it’s being done by a machine makes it uniquely different. And that may be true. After all, someone grinding wheat with a mortar and pestle is different from a stone wheel doing the work for you. But in the end it’s all grist for the mill.
- Accessibility of Information is a Fundamental Good.
Information, the old saying goes, wants to be free. And that’s true enough that large corporations have been gobbling down information from everywhere and everyone they can.
Facebook is primarily a trap for information, giving us only the barest bits of value by allowing us to share that surveilled data with our friends.
Meanwhile, despite the machines that sit on our desks being fully capable of capturing, recreating, and distributing that information with no loss in quality, those same corporations have spent a tremendous amount of time and money finding ways to gate, guard, and monetize the information they possess and create.
And yet a tremendous amount of energy has been spent over the last half century transitioning all our media onto the web as quickly and completely as humanly possible, precisely because we have (or at least many of us have) accepted that widespread access to information is something we desperately want.
- Copying vs. experiencing.
The current lawsuits are attempting to blur the line between what we consider “copying” and the ability of large language models to “ingest” information when being created.
The models themselves aren’t copying the information directly but adding it to their experience. They are, in their own indefatigable way, “reading” the books they ingest, just as image-generation models are “viewing” the pictures that feed into their overall intelligence.
You can claim that the experience is different because it is being had by a machine and not a human. But that claim isn’t based on any existing law. And a machine being able to simulate an experience does not mean your material was used in a way that differs from what was intended.
We could, of course, ultimately decide to extend copyright beyond its original intent even further than we have over the last century. But I firmly believe that if we do, we won’t only be stunting the growth of what may be the greatest increase in raw creativity in human history; we’ll also be handing corporations the tools necessary to secure a de facto creative monopoly, utterly and completely locking down any competition from individual creators.
It’s this last point that really is the cherry on top of the argument that what AI is doing doesn’t actually break any of our existing laws.
No matter how you slice it, there is no doubt that the idea of a simulated experience being had by an artificial being is shocking when you first encounter it. That a machine is not only able to digest a book or other media, but to convert that into a general-purpose description of having that experience was, until recently, the stuff of science fiction. But now it’s real, it’s here, and it’s going to change the world of creativity in ways we’re only beginning to imagine.
And that’s where the copyright argument collapses, in my opinion. Up until now you could control (or at least attempt to control) access to the material contained within the book, but even as a creator you’ve had no rights to the reaction to the experience of reading that material.
The whole concept takes the idea of authorial intent to an absurd degree.
If these kinds of cases ultimately succeed in setting new precedents, I’d argue the dangerous knock-on effects on competition, innovation, and free expression could far outweigh any supposed harms from current AI training methods.
Look at it this way: if creators can control how anyone experiences and interprets their works on a conceptual level, that gives corporations unprecedented power to suppress certain points of view and applications.
Effectively, it risks handing copyright holders monopoly power over broad applications of the core ideas contained within the works they manage, stifling future iterations. And we’ve seen attempts to stretch copyright claims to absurd lengths every time new technological applications emerge.
This could choke progress and creativity, as we’ve seen historically when monopolistic power goes unchecked. Access inspires new thinking; overzealous control crushes it.
Progress requires reshaping systems thoughtfully to nurture innovation, not letting entrenched interests use legal technicalities to exclude competition. If we cede that ground, the negative implications for the free exchange of ideas and our collective advancement could be immense.
Like it or not (and I can assure you as a creator that no one does), you don’t get to control the way anyone, or anything, experiences your content.
We stand at an inflection point: AI now begins to mimic our creative capacity, but its full potential requires judicious pruning, not heavy-handed restrictions.
Attempting to control how creative works are processed and applied, whether by humans or machines, cuts against the norms that have driven progress for centuries. Access powers innovation. Overcontrol leads to stagnation.
The nature of creativity won’t disappear, but it is changing. The future beckons: terrifying and unexplored, as always. Rather than clinging to comfortable conventions, we must reconsider what drives creativity and how we get to express it.
The challenge is in shaping systems that allow both human ingenuity and machine learning to thrive in parallel. This calls for nuance, not reactionary lawsuits.
As I said in a previous piece, when it comes to artificial intelligence’s ability to independently produce genuinely inventive, inspiring, and artistic content, we’re still safe… for now. But that day is coming. It may be months or it may be years, but either way it’s only a matter of time.